1
Sajek MP, Bilodeau DY, Beer MA, Horton E, Miyamoto Y, Velle KB, Eckmann L, Fritz-Laylin L, Rissland OS, Mukherjee N. Evolutionary dynamics of polyadenylation signals and their recognition strategies in protists. Genome Res 2024; 34:1570-1581. [PMID: 39327029 PMCID: PMC11529991 DOI: 10.1101/gr.279526.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 09/11/2024] [Indexed: 09/28/2024]
Abstract
The poly(A) signal, together with auxiliary elements, directs cleavage of a pre-mRNA and thus determines the 3' end of the mature transcript. In many species, including humans, the poly(A) signal is an AAUAAA hexamer, but we recently found that the deeply branching eukaryote Giardia lamblia uses a distinct hexamer (AGURAA) and lacks any known auxiliary elements. Our discovery prompted us to explore the evolutionary dynamics of poly(A) signals and auxiliary elements in the eukaryotic kingdom. We use direct RNA sequencing to determine poly(A) signals for four protists within the Metamonada clade (which also contains G. lamblia) and two outgroup protists. These experiments reveal that the AAUAAA hexamer serves as the poly(A) signal in at least four different eukaryotic clades, indicating that it is likely the ancestral signal, whereas the unusual Giardia version is derived. We find that the use and relative strengths of auxiliary elements are also plastic; in fact, within Metamonada, species like G. lamblia make use of a previously unrecognized auxiliary element where nucleotides flanking the poly(A) signal itself specify genuine cleavage sites. Thus, despite the fundamental nature of pre-mRNA cleavage for the expression of all protein-coding genes, the motifs controlling this process are dynamic on evolutionary timescales, providing motivation for future biochemical and structural studies as well as new therapeutic angles to target eukaryotic pathogens.
Affiliation(s)
- Marcin P Sajek
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - Institute of Human Genetics, Polish Academy of Sciences, 60-479 Poznan, Poland
- Danielle Y Bilodeau
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
- Michael A Beer
  - Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
  - McKusick-Nathans Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
- Emma Horton
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
- Yukiko Miyamoto
  - Department of Medicine, University of California San Diego, La Jolla, California 92093, USA
- Katrina B Velle
  - Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003, USA
- Lars Eckmann
  - Department of Medicine, University of California San Diego, La Jolla, California 92093, USA
- Lillian Fritz-Laylin
  - Department of Biology, University of Massachusetts, Amherst, Massachusetts 01003, USA
- Olivia S Rissland
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
- Neelanjan Mukherjee
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
2
Wang Y, Zhang H, Zhang Z, Hua B, Liu J, Miao M. Source leaves are regulated by sink strengths through non-coding RNAs and alternative polyadenylation in cucumber (Cucumis sativus L.). BMC Plant Biology 2024; 24:812. [PMID: 39198785 PMCID: PMC11360537 DOI: 10.1186/s12870-024-05416-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Accepted: 07/12/2024] [Indexed: 09/01/2024]
Abstract
BACKGROUND The yield of major crops is generally limited by sink capacity and source strength. Cucumber is a typical raffinose family oligosaccharide (RFO)-transporting crop. Non-coding RNAs and alternative polyadenylation (APA) play important roles in regulating plant growth. However, their roles in sink‒source regulation have not been demonstrated in RFO-translocating species. RESULTS Here, whole-transcriptome sequencing was applied to compare cucumber leaves under different sink strengths, that is, non-fruit-carrying leaves (NFNLs) and fruit-carrying leaves (FNLs) at the 12th node from the bottom. In total, 1101 differentially expressed (DE) mRNAs, 79 DE long non-coding RNAs (lncRNAs) and 23 DE miRNAs were identified, which were enriched in photosynthesis, energy production and conversion, plant hormone signal transduction, starch and carbohydrate metabolism and protein synthesis pathways. Potential co-expression networks (DE lncRNAs-DE mRNAs and DE miRNAs-DE mRNAs) and competing endogenous RNA (ceRNA) regulation models (DE lncRNAs-DE miRNAs-DE mRNAs) associated with sink‒source allocation were constructed. Furthermore, 37 and 48 DE genes, which were enriched in the MAPK signaling and plant hormone signal transduction pathways, showed differential APA, and SPS (CsaV3_2G033300), GBSS1 (CsaV3_5G001560), ERS1 (CsaV3_7G029600), PNO1 (CsaV3_3G003950) and Myb (CsaV3_3G022290) may be regulated by both ncRNAs and APA between FNLs and NFNLs, suggesting that ncRNAs and APA are involved in regulating gene expression in cucumber sink‒source carbon partitioning. CONCLUSIONS These results reveal a comprehensive network among mRNAs, ncRNAs, and APA in cucumber sink‒source relationships. Our findings also provide valuable information for further research on the molecular mechanisms by which ncRNAs and APA could enhance cucumber yield.
Affiliation(s)
- Yudan Wang
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou, 225009, China
- Huimin Zhang
  - Jiangsu Yanjiang Institute of Agricultural Sciences, Nantong, 226541, China
- Zhiping Zhang
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou, 225009, China
- Bing Hua
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou, 225009, China
- Jiexia Liu
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou, 225009, China
- Minmin Miao
  - College of Horticulture and Landscape Architecture, Yangzhou University, Yangzhou, 225009, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou, 225009, China
  - Key Laboratory of Plant Functional Genomics of the Ministry of Education/Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding, Yangzhou University, Yangzhou, 225009, China
3
Murari E, Meadows D, Cuda N, Mangone M. A comprehensive analysis of 3'UTRs in Caenorhabditis elegans. Nucleic Acids Res 2024; 52:7523-7538. [PMID: 38917330 PMCID: PMC11260456 DOI: 10.1093/nar/gkae543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 04/29/2024] [Accepted: 06/11/2024] [Indexed: 06/27/2024]
Abstract
3'Untranslated regions (3'UTRs) are essential portions of genes containing elements necessary for pre-mRNA 3'end processing and are involved in post-transcriptional gene regulation. Despite their importance, they remain poorly characterized in eukaryotes. Here, we have used a multi-pronged approach to extract and curate 3'UTR data from 11533 publicly available datasets, corresponding to the entire collection of Caenorhabditis elegans transcriptomes stored in the NCBI repository from 2009 to 2023. We have also performed high throughput cloning pipelines to identify and validate rare 3'UTR isoforms and incorporated and manually curated 3'UTR isoforms from previously published datasets. This updated C. elegans 3'UTRome (v3) is the most comprehensive resource in any metazoan to date, covering 97.4% of the 20362 experimentally validated protein-coding genes with refined and updated 3'UTR boundaries for 23489 3'UTR isoforms. We also used this novel dataset to identify and characterize sequence elements involved in pre-mRNA 3'end processing and update miRNA target predictions. This resource provides important insights into the 3'UTR formation, function, and regulation in eukaryotes.
Affiliation(s)
- Emma Murari
  - The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA
  - School of Life Sciences, Arizona State University, 427 E Tyler Mall, Tempe, AZ, USA
- Dalton Meadows
  - The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA
  - School of Life Sciences, Arizona State University, 427 E Tyler Mall, Tempe, AZ, USA
- Nicholas Cuda
  - The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA
  - School of Life Sciences, Arizona State University, 427 E Tyler Mall, Tempe, AZ, USA
- Marco Mangone
  - The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA
4
Peleke FF, Zumkeller SM, Gültas M, Schmitt A, Szymański J. Deep learning the cis-regulatory code for gene expression in selected model plants. Nat Commun 2024; 15:3488. [PMID: 38664394 PMCID: PMC11045779 DOI: 10.1038/s41467-024-47744-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 04/09/2024] [Indexed: 04/28/2024]
Abstract
Elucidating the relationship between non-coding regulatory element sequences and gene expression is crucial for understanding gene regulation and genetic variation. We explored this link with the training of interpretable deep learning models predicting gene expression profiles from gene flanking regions of the plant species Arabidopsis thaliana, Solanum lycopersicum, Sorghum bicolor, and Zea mays. With over 80% accuracy, our models enabled predictive feature selection, highlighting e.g. the significant role of UTR regions in determining gene expression levels. The models demonstrated remarkable cross-species performance, effectively identifying both conserved and species-specific regulatory sequence features and their predictive power for gene expression. We illustrated the application of our approach by revealing causal links between genetic variation and gene expression changes across fourteen tomato genomes. Lastly, our models efficiently predicted genotype-specific expression of key functional gene groups, exemplified by underscoring known phenotypic and metabolic differences between Solanum lycopersicum and its wild, drought-resistant relative, Solanum pennellii.
Affiliation(s)
- Fritz Forbang Peleke
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT, Gatersleben, Germany
- Simon Maria Zumkeller
  - Institute of Bio- and Geosciences, IBG-4: Bioinformatics, Forschungszentrum Jülich, D-52428, Jülich, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-Universität Düsseldorf, 40225, Düsseldorf, Germany
- Mehmet Gültas
  - Faculty of Agriculture, South Westphalia University of Applied Sciences, Soest, 59494, Germany
- Armin Schmitt
  - Breeding Informatics Group, University of Göttingen, Göttingen, 37075, Germany
  - Center of Integrated Breeding Research (CiBreed), Göttingen, 37075, Germany
- Jędrzej Szymański
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, D-06466 Seeland, OT, Gatersleben, Germany
  - Institute of Bio- and Geosciences, IBG-4: Bioinformatics, Forschungszentrum Jülich, D-52428, Jülich, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-Universität Düsseldorf, 40225, Düsseldorf, Germany
5
Niederau PA, Eglé P, Willig S, Parsons J, Hoernstein SNW, Decker EL, Reski R. Multifactorial analysis of terminator performance on heterologous gene expression in Physcomitrella. Plant Cell Reports 2024; 43:43. [PMID: 38246952 PMCID: PMC10800305 DOI: 10.1007/s00299-023-03088-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Accepted: 11/02/2023] [Indexed: 01/23/2024]
Abstract
KEY MESSAGE Characterization of Physcomitrella 3'UTRs across different promoters yields endogenous single and double terminators for usage in molecular pharming. The production of recombinant proteins for health applications accounts for a large share of the biopharmaceutical market. While many drugs are produced in microbial and mammalian systems, plants are gaining attention as expression hosts to produce eukaryotic proteins. In particular, the good manufacturing practice (GMP)-compliant moss Physcomitrella (Physcomitrium patens) has outstanding features, such as excellent genetic amenability, reproducible bioreactor cultivation, and humanized protein glycosylation patterns. In this study, we selected and characterized novel terminators for their effects on heterologous gene expression. The Physcomitrella genome contains 53,346 unique 3'UTRs (untranslated regions) of which 7964 transcripts contain at least one intron. Over 91% of 3'UTRs exhibit more than one polyadenylation site, indicating the prevalence of alternative polyadenylation in Physcomitrella. Out of all 3'UTRs, 14 terminator candidates were selected and characterized via transient Dual-Luciferase assays, yielding a collection of endogenous terminators performing as well as the established heterologous terminators CaMV35S, AtHSP90, and NOS. High-performing candidates were selected for testing as double terminators, which affect reporter levels depending on terminator identity and positioning. Testing of 3'UTRs among the different promoters NOS, CaMV35S, and PpActin5 showed an increase of more than 1000-fold between promoters PpActin5 and NOS, whereas terminators increased reporter levels by less than tenfold, demonstrating the stronger effect of promoters as compared to terminators. Among selected terminator attributes, the number of polyadenylation sites as well as polyadenylation signals were found to influence terminator performance the most. Our results improve the biotechnology platform Physcomitrella and further our understanding of how terminators influence gene expression in plants in general.
Affiliation(s)
- Pauline Eglé
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg, Germany
- Sandro Willig
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg, Germany
- Juliana Parsons
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg, Germany
- Eva L Decker
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg, Germany
- Ralf Reski
  - Plant Biotechnology, Faculty of Biology, University of Freiburg, Freiburg, Germany
  - Signalling Research Centre BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
6
Jian H, Sun H, Liu R, Zhang W, Shang L, Wang J, Khassanov V, Lyu D. Construction of drought stress regulation networks in potato based on SMRT and RNA sequencing data. BMC Plant Biology 2022; 22:381. [PMID: 35909124 PMCID: PMC9341072 DOI: 10.1186/s12870-022-03758-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/16/2022] [Accepted: 07/08/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND Potato (Solanum tuberosum) is the fourth most important food crop in the world and plays an important role in food security. Drought stress has a significantly negative impact on potato growth and production. Several publications have addressed drought stress in potato, and this research adds to that knowledge. RESULTS In this study, next-generation sequencing (NGS) and single-molecule real-time (SMRT) sequencing technology were used to study the transcription profiles in potato in response to drought stress simulated with 20% PEG6000. Leaves of the variety "Désirée" from in vitro plantlets, sampled at six time points from 0 to 48 hours after drought stress, were used for NGS and SMRT sequencing. According to the sequencing data, a total of 12,798 differentially expressed genes (DEGs) were identified across the six time points. The real-time (RT)-PCR results are significantly correlated with the sequencing data, confirming the accuracy of the sequencing data. Gene ontology and KEGG analysis show that these DEGs participate in the response to drought stress through galactose metabolism, fatty acid metabolism, plant-pathogen interaction, glutathione metabolism and other pathways. Through the analysis of alternative splicing of 66,888 transcripts, the functional pathways of these transcripts were enriched, and 51,098 transcripts were newly discovered from alternative splicing events and 47,994 transcripts were functionally annotated. Moreover, 3445 lncRNAs were predicted and enrichment analysis of corresponding target genes was also performed. Additionally, alternative polyadenylation was analyzed by TADIS, and 26,153 poly(A) sites from 13,010 genes were detected in the Iso-Seq data. CONCLUSION Our research greatly enhances potato drought-induced gene annotations and provides transcriptome-wide insights into the molecular basis of potato drought resistance.
Affiliation(s)
- Hongju Jian
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
  - State Cultivation Base of Crop Stress Biology for Southern Mountainous Land of Southwest University, Chongqing, 400715 China
  - Chongqing Key Laboratory of Biology and Genetic Breeding for Tuber and Root Crops, Chongqing, 400715 China
- Haonan Sun
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
- Rongrong Liu
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
- Wenzhe Zhang
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
- Lina Shang
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
- Jichun Wang
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
  - State Cultivation Base of Crop Stress Biology for Southern Mountainous Land of Southwest University, Chongqing, 400715 China
  - Chongqing Key Laboratory of Biology and Genetic Breeding for Tuber and Root Crops, Chongqing, 400715 China
- Vadim Khassanov
  - S. Seifullin Kazakh Agrotechnical University, Zhenis Avenue, 010011 Astana, Republic of Kazakhstan
- Dianqiu Lyu
  - College of Agronomy and Biotechnology, Southwest University, Chongqing, 400715 China
  - State Cultivation Base of Crop Stress Biology for Southern Mountainous Land of Southwest University, Chongqing, 400715 China
  - Chongqing Key Laboratory of Biology and Genetic Breeding for Tuber and Root Crops, Chongqing, 400715 China
7
Bilodeau DY, Sheridan RM, Balan B, Jex AR, Rissland OS. Precise gene models using long-read sequencing reveal a unique poly(A) signal in Giardia lamblia. RNA (New York, N.Y.) 2022; 28:668-682. [PMID: 35110372 PMCID: PMC9014877 DOI: 10.1261/rna.078793.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Accepted: 01/17/2022] [Indexed: 06/14/2023]
Abstract
During pre-mRNA processing, the poly(A) signal is recognized by a protein complex that ensures precise cleavage and polyadenylation of the nascent transcript. The location of this cleavage event establishes the length and sequence of the 3' UTR of an mRNA, thus determining much of its post-transcriptional fate. Using long-read sequencing, we characterize the polyadenylation signal and related sequences surrounding Giardia lamblia cleavage sites for over 2600 genes. We find that G. lamblia uses an AGURAA poly(A) signal, which differs from the mammalian AAUAAA. We also describe how G. lamblia lacks common auxiliary elements found in other eukaryotes, along with the proteins that recognize them. Further, we identify 133 genes with evidence of alternative polyadenylation. These results suggest that despite pared-down cleavage and polyadenylation machinery, 3' end formation still appears to be an important regulatory step for gene expression in G. lamblia.
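As an illustration of what a degenerate signal such as AGURAA means in practice, the following is a minimal Python sketch that expands IUPAC codes (R is a purine, A or G) into a regular expression and scans the sequence upstream of a cleavage site. The function names and the two example sequences are hypothetical and are not taken from the paper; the authors' actual pipeline is not described in the abstract.

```python
import re

# IUPAC RNA codes used here; R is a purine (A or G), Y a pyrimidine, N any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}

def motif_to_regex(motif):
    """Translate an IUPAC motif such as AGURAA into a compiled regex.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    return re.compile("(?=(" + "".join(IUPAC[base] for base in motif) + "))")

def find_signals(upstream, motif):
    """Return (distance, hexamer) pairs for every motif hit.

    `upstream` is assumed to end at the cleavage site, so `distance` is the
    number of nucleotides from the start of the hit to the cleavage site.
    """
    pattern = motif_to_regex(motif)
    return [(len(upstream) - m.start(), m.group(1))
            for m in pattern.finditer(upstream)]

# Hypothetical sequences in the RNA alphabet, each ending at a cleavage site.
giardia_like = "GCCUGAGUAAACUGCCUCGC"
human_like = "CCUGAAUAAAGCUGCAUCUGUCACUGC"

print(find_signals(giardia_like, "AGURAA"))  # [(15, 'AGUAAA')]
print(find_signals(human_like, "AAUAAA"))    # [(23, 'AAUAAA')]
```

Note that AGURAA covers both AGUAAA and AGUGAA but not the canonical AAUAAA, which is why the two signals can be told apart by a simple scan like this one.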
Affiliation(s)
- Danielle Y Bilodeau
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
- Ryan M Sheridan
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
- Balu Balan
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Melbourne, VIC 3052, Australia
- Aaron R Jex
  - Population Health and Immunity Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Melbourne, VIC 3052, Australia
  - Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC 3052, Australia
- Olivia S Rissland
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
  - RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, Colorado 80045, USA
8
Schärfen L, Zigackova D, Reimer KA, Stark MR, Slat VA, Francoeur NJ, Wells ML, Zhou L, Blackshear PJ, Neugebauer KM, Rader SD. Identification of Alternative Polyadenylation in Cyanidioschyzon merolae Through Long-Read Sequencing of mRNA. Front Genet 2022; 12:818697. [PMID: 35154260 PMCID: PMC8831791 DOI: 10.3389/fgene.2021.818697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 12/22/2021] [Indexed: 12/04/2022]
Abstract
Alternative polyadenylation (APA) is widespread among metazoans and has been shown to have important impacts on mRNA stability and protein expression. Beyond a handful of well-studied organisms, however, its existence and consequences have not been well investigated. We therefore turned to the deep-branching red alga, Cyanidioschyzon merolae, to study the biology of polyadenylation in an organism highly diverged from humans and yeast. C. merolae is an acidothermophilic alga that lives in volcanic hot springs. It has a highly reduced genome (16.5 Mbp) and has lost all but 27 of its introns and much of its splicing machinery, suggesting that it has been under substantial pressure to simplify its RNA processing pathways. We used long-read sequencing to assess the key features of C. merolae mRNAs, including splicing status and polyadenylation cleavage site (PAS) usage. Splicing appears to be less efficient in C. merolae compared with yeast, flies, and mammalian cells. A high proportion of transcripts (63%) have at least two distinct PASs, and 34% appear to utilize three or more sites. The apparent polyadenylation signal UAAA is used in more than 90% of cases, in cells grown in either rich media or limiting nitrogen. Our documentation of APA for the first time in this non-model organism highlights the conservation and likely biological importance of this regulatory step in gene expression.
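Percentages like "at least two distinct PASs" depend on how nearby read ends are grouped into sites. The Python sketch below shows one simple way such a tally could be made. It assumes that read 3' ends within a fixed window of the previous end belong to the same site; the 10-nt window, the function names, and the toy data are all illustrative assumptions, not the authors' method.

```python
def cluster_sites(positions, window=10):
    """Group read 3' end positions into cleavage sites.

    A new site starts whenever the gap to the previous read end exceeds
    `window` nucleotides (an assumed, illustrative threshold).
    """
    sites = []
    for pos in sorted(positions):
        if sites and pos - sites[-1][-1] <= window:
            sites[-1].append(pos)
        else:
            sites.append([pos])
    return sites

def apa_fractions(read_ends_by_gene, window=10):
    """Count sites per gene and the fractions with >= 2 and >= 3 sites."""
    counts = {gene: len(cluster_sites(ends, window))
              for gene, ends in read_ends_by_gene.items()}
    total = len(counts)
    at_least_two = sum(c >= 2 for c in counts.values()) / total
    at_least_three = sum(c >= 3 for c in counts.values()) / total
    return counts, at_least_two, at_least_three

# Toy data: gene name -> genomic positions of read 3' ends (hypothetical).
toy = {
    "geneA": [100, 102, 105],
    "geneB": [200, 203, 260, 262],
    "geneC": [50, 120, 121, 300],
    "geneD": [10, 11, 40, 90],
}

counts, frac2, frac3 = apa_fractions(toy)
print(counts)        # {'geneA': 1, 'geneB': 2, 'geneC': 3, 'geneD': 3}
print(frac2, frac3)  # 0.75 0.5
```

The choice of window directly changes the reported fractions, so any comparison across studies would need the clustering rule to be stated alongside the numbers.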
Affiliation(s)
- Leonard Schärfen
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
- Dagmar Zigackova
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
- Kirsten A. Reimer
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
- Martha R. Stark
  - Department of Chemistry, University of Northern British Columbia, Prince George, BC, Canada
- Viktor A. Slat
  - Department of Chemistry, University of Northern British Columbia, Prince George, BC, Canada
- Nancy J. Francoeur
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Melissa L. Wells
  - The Signal Transduction Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, United States
- Lecong Zhou
  - Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, United States
- Perry J. Blackshear
  - The Signal Transduction Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, United States
- Karla M. Neugebauer
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
- Stephen D. Rader
  - Department of Chemistry, University of Northern British Columbia, Prince George, BC, Canada
9
Thieffry A, Vigh ML, Bornholdt J, Ivanov M, Brodersen P, Sandelin A. Characterization of Arabidopsis thaliana Promoter Bidirectionality and Antisense RNAs by Inactivation of Nuclear RNA Decay Pathways. The Plant Cell 2020; 32:1845-1867. [PMID: 32213639 PMCID: PMC7268790 DOI: 10.1105/tpc.19.00815] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 10/17/2019] [Revised: 02/03/2020] [Accepted: 03/20/2020] [Indexed: 05/20/2023]
Abstract
In animals, RNA polymerase II initiates transcription bidirectionally from gene promoters to produce pre-mRNAs on the forward strand and promoter upstream transcripts (PROMPTs) on the reverse strand. PROMPTs are degraded by the nuclear exosome. Previous studies based on nascent RNA approaches concluded that Arabidopsis (Arabidopsis thaliana) does not produce PROMPTs. Here, we used steady-state RNA sequencing in mutants defective in nuclear RNA decay including the exosome to reassess the existence of Arabidopsis PROMPTs. While they are rare, we identified ∼100 cases of exosome-sensitive PROMPTs in Arabidopsis. Such PROMPTs are sources of small interfering RNAs in exosome-deficient mutants, perhaps explaining why plants have evolved mechanisms to suppress PROMPTs. In addition, we found ∼200 long, unspliced and exosome-sensitive antisense RNAs that arise from transcription start sites within parts of the genome encoding 3'-untranslated regions on the sense strand. The previously characterized noncoding RNA that regulates expression of the key seed dormancy regulator, DELAY OF GERMINATION1, is a typical representative of this class of RNAs. Transcription factor genes are overrepresented among loci with exosome-sensitive antisense RNAs, suggesting a potential for widespread control of gene expression via this class of noncoding RNAs. Lastly, we assess the use of alternative promoters in Arabidopsis and compare the accuracy of existing TSS annotations.
Affiliation(s)
- Axel Thieffry
  - Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- Maria Louisa Vigh
  - Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- Jette Bornholdt
  - Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- Maxim Ivanov
  - Department of Plant and Environmental Sciences, University of Copenhagen, DK-1871 Frederiksberg C, Denmark
- Peter Brodersen
  - Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- Albin Sandelin
  - Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
  - Biotech Research and Innovation Centre, University of Copenhagen, DK-2200 Copenhagen N, Denmark
10
Xie L, Teng K, Tan P, Chao Y, Li Y, Guo W, Han L. PacBio single-molecule long-read sequencing shed new light on the transcripts and splice isoforms of the perennial ryegrass. Mol Genet Genomics 2020; 295:475-489. [PMID: 31894400 DOI: 10.1007/s00438-019-01635-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/04/2019] [Accepted: 12/06/2019] [Indexed: 10/25/2022]
Abstract
Perennial ryegrass (Lolium perenne), one of the most widely used forage and cool-season turfgrasses worldwide, has a breeding history of more than 100 years. However, the current draft genome annotation and transcriptome characterization are incomplete, mainly because of the enormous difficulty in obtaining full-length transcripts. To explore the complete structure of the mRNA and improve the current draft genome, we performed PacBio single-molecule long-read sequencing for full-length transcriptome sequencing in perennial ryegrass. We generated 29,175 high-confidence non-redundant transcripts from 15,893 genetic loci, among which more than 66.88% of transcripts and 24.99% of genetic loci were not previously annotated in the current reference genome. The re-annotated 18,327 transcripts enriched the reference transcriptome. In particular, 6709 alternative splicing events and 23,789 alternative polyadenylation sites were detected, providing a comprehensive landscape of the post-transcriptional regulation network. Furthermore, we identified 218 long non-coding RNAs and 478 fusion genes. Finally, the transcriptional regulation mechanism of perennial ryegrass in response to drought stress based on the newly updated reference transcriptome sequences was explored, providing new information on the underlying transcriptional regulation network. Taken together, we analyzed the full-length transcriptome of perennial ryegrass by PacBio single-molecule long-read sequencing. These results improve our understanding of the perennial ryegrass transcriptome and refine the annotation of the reference genome.
Affiliation(s)
- Lijuan Xie
  - School of Applied Chemistry and Biotechnology, Shenzhen Polytechnic, Shenzhen, 518055, China
- Ke Teng
  - College of Grassland Science, Beijing Forestry University, Beijing, 100083, China
  - Beijing Research and Development Center for Grass and Environment, Beijing Academy of Agriculture and Forestry Sciences, Beijing, 100097, China
- Penghui Tan
  - College of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Yuehui Chao
  - College of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Yinruizhi Li
  - College of Grassland Science, Beijing Forestry University, Beijing, 100083, China
- Weier Guo
  - Department of Plant Biology, University of California, Davis, Davis, CA, 95616, USA
- Liebao Han
  - College of Grassland Science, Beijing Forestry University, Beijing, 100083, China
|
11
|
Genome-Wide Profiling of Polyadenylation Events in Maize Using High-Throughput Transcriptomic Sequences. G3-GENES GENOMES GENETICS 2019; 9:2749-2760. [PMID: 31239292 PMCID: PMC6686930 DOI: 10.1534/g3.119.400196] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/22/2022]
Abstract
Polyadenylation is an essential post-transcriptional modification of eukaryotic transcripts that plays a critical role in transcript stability, localization, transport, and translational efficiency. About 70% of genes in plants contain alternative polyadenylation (APA) sites. Despite the availability of a vast amount of sequencing data, a comprehensive map of polyadenylation events in maize is not yet available. Here, 9.48 billion RNA-Seq reads were analyzed to characterize 95,345 Poly(A) Clusters (PAC) in 23,705 (51%) maize genes. Of these, 76% were APA genes. However, most APA genes (55%) expressed a dominant PAC rather than favoring multiple PACs equally. The lincRNA genes with PACs were significantly longer than the genes without any PAC, and about 48% of genes had APA sites. Heterogeneity was observed in 52% of the PACs, supporting the imprecise nature of the polyadenylation process. Genomic distribution revealed that the majority of the PACs (78%) were located in the genic regions. In contrast to previous studies, a large number of PACs were observed in the intergenic (n = 21,264), 5′-UTR (735), CDS (2,542), and intronic regions (12,841). CDSs and introns with PACs were longer than those without PACs, whereas intergenic PACs were more often associated with transcripts that lacked annotated 3′-UTRs. Nucleotide composition around PACs demonstrated AT-richness, and the common upstream motif was AAUAAA, which is consistent with other plants. According to this study, only 2,830 genes still maintained the use of the AAUAAA motif. These large-scale data provide useful insights into the regulation of gene expression and could be used as evidence to validate the annotation of transcript ends.
|
12
|
Characterization of mRNA polyadenylation in the apicomplexa. PLoS One 2018; 13:e0203317. [PMID: 30161237 PMCID: PMC6117058 DOI: 10.1371/journal.pone.0203317] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/05/2018] [Accepted: 08/18/2018] [Indexed: 11/19/2022]
Abstract
Messenger RNA polyadenylation is a universal aspect of gene expression in eukaryotes. In well-established model organisms, this process is mediated by a conserved complex of 15–20 subunits. To better understand this process in apicomplexans, a group of unicellular parasites that causes serious disease in humans and livestock, a computational and high-throughput sequencing study of the polyadenylation complex and poly(A) sites in several species was conducted. BLAST-based searches for orthologs of the human polyadenylation complex yielded clear matches to only two, poly(A) polymerase and CPSF73, of the 19 proteins used as queries in this analysis. As the human subunits that recognize the AAUAAA polyadenylation signal (PAS) were not immediately obvious, a computational analysis of sequences adjacent to experimentally determined apicomplexan poly(A) sites was conducted. The results of this study showed that there exists in apicomplexans an A-rich region that corresponds in position to the AAUAAA PAS. The set of experimentally determined sites in one species, Sarcocystis neurona, was further analyzed to evaluate the extent and significance of alternative poly(A) site choice in this organism. The results showed that almost 80% of S. neurona genes possess more than one poly(A) site, and that more than 780 sites showed differential usage in the two developmental stages studied (extracellular merozoites and intracellular schizonts). These sites affected more than 450 genes, and included a disproportionate number of genes that encode membrane transporters and ribosomal proteins. Taken together, these results reveal that apicomplexan species seem to possess a poly(A) signal analogous to AAUAAA even though genes that may encode obvious counterparts of the AAUAAA-recognizing proteins are absent in these organisms. They also indicate that, as is the case in other eukaryotes, alternative polyadenylation is a widespread phenomenon in S. neurona that has the potential to impact growth and development.
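As a rough illustration of the kind of computational analysis described above (scanning the sequence adjacent to mapped poly(A) sites for an enriched signal), the sketch below counts k-mers in a window upstream of each cleavage site. It is not the authors' pipeline: the function name `upstream_kmer_counts`, the default window of 10 to 35 nt upstream, and the assumption that every input sequence ends exactly at its cleavage site are all hypothetical choices made for the example.

```python
from collections import Counter


def upstream_kmer_counts(sequences, k=6, window=(10, 35)):
    """Count k-mers in a window upstream of each poly(A) site.

    Assumption (hypothetical): every sequence ends exactly at the
    cleavage site, so the window is measured back from the 3' end.
    `window` is (near, far): nucleotides upstream where the window
    ends and starts, respectively.
    """
    near, far = window
    counts = Counter()
    for seq in sequences:
        # Work in RNA alphabet so DNA and RNA inputs are comparable.
        seq = seq.upper().replace("T", "U")
        end = len(seq) - near
        if end <= 0:
            continue  # sequence too short to contain the window
        start = max(0, len(seq) - far)
        region = seq[start:end]
        for i in range(len(region) - k + 1):
            counts[region[i:i + k]] += 1
    return counts


# Toy input: one DNA sequence with AATAAA placed 15-21 nt upstream.
toy = ["G" * 20 + "AATAAA" + "C" * 15]
top = upstream_kmer_counts(toy).most_common(3)
```

On real data, the most frequent k-mers in the window would be compared against a background (for example, shuffled or randomly sampled genomic windows) before calling any of them a signal.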
|
13
|
Bell SA, Shen C, Brown A, Hunt AG. Experimental Genome-Wide Determination of RNA Polyadenylation in Chlamydomonas reinhardtii. PLoS One 2016; 11:e0146107. [PMID: 26730730 PMCID: PMC4701671 DOI: 10.1371/journal.pone.0146107] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 09/14/2015] [Accepted: 12/14/2015] [Indexed: 11/20/2022]
Abstract
The polyadenylation of RNA is a near-universal feature of RNA metabolism in eukaryotes. This process has been studied in the model alga Chlamydomonas reinhardtii using low-throughput (gene-by-gene) and high-throughput (transcriptome sequencing) approaches that recovered poly(A)-containing sequence tags which revealed interesting features of this critical process in Chlamydomonas. In this study, RNA polyadenylation has been studied using the so-called Poly(A) Tag Sequencing (PAT-Seq) approach. Specifically, PAT-Seq was used to study poly(A) site choice in cultures grown in four different media types—Tris-Phosphate (TP), Tris-Phosphate-Acetate (TAP), High-Salt (HS), and High-Salt-Acetate (HAS). The results indicate that: 1. As reported before, the motif UGUAA is the primary, and perhaps sole, cis-element that guides mRNA polyadenylation in the nucleus; 2. The scope of alternative polyadenylation events with the potential to change the coding sequences of mRNAs is limited; 3. Changes in poly(A) site choice in cultures grown in the different media types are very few in number and do not affect protein-coding potential; 4. Organellar polyadenylation is considerable and affects primarily ribosomal RNAs in the chloroplast and mitochondria; and 5. Organellar RNA polyadenylation is a dynamic process that is affected by the different media types used for cell growth.
Affiliation(s)
- Stephen A. Bell: Department of Plant and Soil Sciences, University of Kentucky, Lexington, Kentucky, United States of America
- Chi Shen: Division of Computer Science, Kentucky State University, Frankfort, Kentucky, United States of America
- Alishea Brown: Division of Computer Science, Kentucky State University, Frankfort, Kentucky, United States of America
- Arthur G. Hunt: Department of Plant and Soil Sciences, University of Kentucky, Lexington, Kentucky, United States of America
|
14
|
Li XQ. Comparative analysis of the base compositions of the pre-mRNA 3' cleaved-off region and the mRNA 3' untranslated region relative to the genomic base composition in animals and plants. PLoS One 2014; 9:e99928. [PMID: 24941005 PMCID: PMC4062462 DOI: 10.1371/journal.pone.0099928] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/20/2013] [Accepted: 05/20/2014] [Indexed: 12/26/2022]
Abstract
The precursor messenger RNA (pre-mRNA) three-prime cleaved-off region (3′COR) and the mRNA three-prime untranslated region (3′UTR) play critical roles in regulating gene expression. The differences in base composition between these regions and the corresponding genomes are still largely uncharacterized in animals and plants. In this study, the base compositions of non-redundant 3′CORs and 3′UTRs were compared with the corresponding whole genomes of eleven animals, four dicotyledonous plants, and three monocotyledonous (cereal) plants. Among the four bases (A, C, G, and U for adenine, cytosine, guanine, and uracil, respectively), U (which corresponds to T, for thymine, in DNA) was the most frequent, A the second most frequent, G the third most frequent, and C the least frequent in most of the species in both the 3′COR and 3′UTR regions. In comparison with the whole genomes, in both regions the U content was usually the most overrepresented (particularly in the monocotyledonous plants), and the C content was the most underrepresented. The order obtained for the species groups, when ranked from high to low according to the U contents in the 3′COR and 3′UTR was as follows: dicotyledonous plants, monocotyledonous plants, non-mammal animals, and mammals. In contrast, the genomic T content was highest in dicotyledonous plants, lowest in monocotyledonous plants, and intermediate in animals. These results suggest the following: 1) there is a mechanism operating in both animals and plants which is biased toward U and against C in the 3′COR and 3′UTR; 2) the 3′UTR and 3′COR, as functional units, minimized the difference between dicotyledonous and monocotyledonous plants, while the dicotyledonous and monocotyledonous genomes evolved into two extreme groups in terms of base composition.
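The comparison described in this abstract (base composition of a region relative to the whole genome) can be sketched as a ratio of base fractions. The following is a minimal illustration under stated assumptions, not the method used in the paper: the helper names `base_fractions` and `enrichment` are hypothetical, DNA T is mapped to U so that regions and genomes share one alphabet, and ambiguous bases such as N are ignored.

```python
from collections import Counter

BASES = "ACGU"


def base_fractions(seqs):
    """Return the fraction of each base across a set of sequences.

    T is converted to U so DNA (genome) and RNA (3'UTR, 3'COR)
    inputs are comparable; non-ACGU symbols are skipped.
    """
    counts = Counter()
    for s in seqs:
        counts.update(b for b in s.upper().replace("T", "U") if b in BASES)
    total = sum(counts.values())
    if total == 0:
        return {b: 0.0 for b in BASES}
    return {b: counts[b] / total for b in BASES}


def enrichment(region, genome):
    """Ratio of region to genome fraction per base.

    Values above 1 mean the base is overrepresented in the region;
    values below 1 mean it is underrepresented.
    """
    return {b: region[b] / genome[b] for b in BASES if genome[b] > 0}


# Toy data only: a U-rich "3'UTR" set against a uniform "genome".
utr = base_fractions(["UUUA", "UCGA"])
genome = base_fractions(["ACGT"])
ratios = enrichment(utr, genome)
```

With these toy inputs, U comes out overrepresented and C underrepresented relative to the genome, which mirrors the direction of the bias the abstract reports but says nothing about its real magnitude.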
Affiliation(s)
- Xiu-Qing Li: Potato Research Centre, Agriculture and Agri-Food Canada, Fredericton, New Brunswick, Canada
|