1
Zhao Z, Chen Y, Zou X, Lin L, Zhou X, Cheng X, Yang G, Xu Q, Gong L, Li L, Ni T. Pan-cancer transcriptome analysis reveals widespread regulation through alternative tandem transcription initiation. Science Advances 2024; 10:eadl5606. [PMID: 38985880 PMCID: PMC11235174 DOI: 10.1126/sciadv.adl5606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 06/05/2024] [Indexed: 07/12/2024]
Abstract
Abnormal transcription initiation from alternative first exon has been reported to promote tumorigenesis. However, the prevalence and impact of gene expression regulation mediated by alternative tandem transcription initiation were mostly unknown in cancer. Here, we developed a robust computational method to analyze alternative tandem transcription start site (TSS) usage from standard RNA sequencing data. Applying this method to pan-cancer RNA sequencing datasets, we observed widespread dysregulation of tandem TSS usage in tumors, many of which were independent of changes in overall expression level or alternative first exon usage. We showed that the dynamics of tandem TSS usage was associated with epigenomic modulation. We found that significant 5' untranslated region shortening of gene TIMM13 contributed to increased protein production, and up-regulation of TIMM13 by CRISPR-mediated transcriptional activation promoted proliferation and migration of lung cancer cells. Our findings suggest that dysregulated tandem TSS usage represents an additional layer of cancer-associated transcriptome alterations.
Affiliation(s)
- Zhaozhao Zhao
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
  - MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200438, China
- Yu Chen
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Xudong Zou
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Limin Lin
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Xiaolan Zhou
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Xiaomeng Cheng
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Guangrui Yang
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Qiushi Xu
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Lihai Gong
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Lei Li
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Ting Ni
  - State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010070, China
2
Alfonso-Gonzalez C, Hilgers V. (Alternative) transcription start sites as regulators of RNA processing. Trends Cell Biol 2024:S0962-8924(24)00033-3. [PMID: 38531762 DOI: 10.1016/j.tcb.2024.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/20/2024] [Accepted: 02/23/2024] [Indexed: 03/28/2024]
Abstract
Alternative transcription start site usage (ATSS) is a widespread regulatory strategy that enables genes to choose between multiple genomic loci for initiating transcription. This mechanism is tightly controlled during development and is often altered in disease states. In this review, we examine the growing evidence highlighting a role for transcription start sites (TSSs) in the regulation of mRNA isoform selection during and after transcription. We discuss how the choice of transcription initiation sites influences RNA processing and the importance of this crosstalk for cell identity and organism function. We also speculate on possible mechanisms underlying the integration of transcriptional and post-transcriptional processes.
Affiliation(s)
- Carlos Alfonso-Gonzalez
  - Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany; Faculty of Biology, Albert Ludwigs University, 79104 Freiburg, Germany; International Max Planck Research School for Molecular and Cellular Biology (IMPRS-MCB), 79108 Freiburg, Germany
- Valérie Hilgers
  - Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany
3
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. RNA (New York, N.Y.) 2023; 29:1839-1855. [PMID: 37816550 PMCID: PMC10653393 DOI: 10.1261/rna.079849.123] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/12/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Affiliation(s)
- Sam Bryce-Smith
  - Department of Neuromuscular Diseases, UCL Queen Square Motor Neuron Disease Centre, UCL Queen Square Institute of Neurology, UCL, London WC1N 3BG, United Kingdom
- Dominik Burri
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Matthew R Gazzara
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Christina J Herrmann
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Weronika Danecka
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Christina M Fitzsimmons
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
- Yuk Kei Wan
  - Genome Institute of Singapore, Buona Vista, Singapore 138672
  - Yong Loo Lin School of Medicine, National University of Singapore, Kent Ridge, Singapore 119228
- Farica Zhuang
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Mervin M Fansler
  - Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, New York 10065, USA
  - Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, New York 10065, USA
- José M Fernández
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Meritxell Ferret
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Asier Gonzalez-Uriarte
  - Life Sciences Department, Barcelona Supercomputing Center, 08034 Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES), 28029 Madrid, Spain
- Samuel Haynes
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3FF, United Kingdom
- Chelsea Herdman
  - Department of Neurobiology, University of Utah, Salt Lake City, Utah 84132, USA
- Alexander Kanitz
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Maria Katsantoni
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Federico Marini
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, 55118 Mainz, Germany
- Euan McDonnel
  - Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, Leeds LS2 9NL, United Kingdom
- Ben Nicolet
  - Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, 1066 CX Amsterdam, The Netherlands
  - Oncode Institute, 3521 AL Utrecht, The Netherlands
- Chi-Lam Poon
  - Graduate School of Medical Sciences, Weill Cornell Medicine, New York, New York 10065, USA
- Gregor Rot
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
  - Institute of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
- Leonard Schärfen
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Pin-Jou Wu
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72076 Tübingen, Germany
- Yoseop Yoon
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California 92617, USA
- Yoseph Barash
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Mihaela Zavolan
  - Biozentrum, University of Basel, 4056 Basel, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
4
Carrion SA, Michal JJ, Jiang Z. Alternative Transcripts Diversify Genome Function for Phenome Relevance to Health and Diseases. Genes (Basel) 2023; 14:2051. [PMID: 38002994 PMCID: PMC10671453 DOI: 10.3390/genes14112051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/26/2023]
Abstract
Manipulation using alternative exon splicing (AES), alternative transcription start (ATS), and alternative polyadenylation (APA) sites are key to transcript diversity underlying health and disease. All three are pervasive in organisms, present in at least 50% of human protein-coding genes. In fact, ATS and APA site use has the highest impact on protein identity, with their ability to alter which first and last exons are utilized as well as impacting stability and translation efficiency. These RNA variants have been shown to be highly specific, both in tissue type and stage, with demonstrated importance to cell proliferation, differentiation and the transition from fetal to adult cells. While alternative exon splicing has a limited effect on protein identity, its ubiquity highlights the importance of these minor alterations, which can alter other features such as localization. The three processes are also highly interwoven, with overlapping, complementary, and competing factors, RNA polymerase II and its CTD (C-terminal domain) chief among them. Their role in development means dysregulation leads to a wide variety of disorders and cancers, with some forms of disease disproportionately affected by specific mechanisms (AES, ATS, or APA). Challenges associated with the genome-wide profiling of RNA variants and their potential solutions are also discussed in this review.
Affiliation(s)
- Zhihua Jiang
  - Department of Animal Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA 99164-7620, USA; (S.A.C.); (J.J.M.)
5
Bryce-Smith S, Burri D, Gazzara MR, Herrmann CJ, Danecka W, Fitzsimmons CM, Wan YK, Zhuang F, Fansler MM, Fernández JM, Ferret M, Gonzalez-Uriarte A, Haynes S, Herdman C, Kanitz A, Katsantoni M, Marini F, McDonnel E, Nicolet B, Poon CL, Rot G, Schärfen L, Wu PJ, Yoon Y, Barash Y, Zavolan M. Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data. bioRxiv: the preprint server for biology 2023:2023.06.23.546284. [PMID: 37425672 PMCID: PMC10327023 DOI: 10.1101/2023.06.23.546284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, and limitations and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for seamless extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies. Furthermore, the containers and reproducible workflows generated in the course of this project can be seamlessly deployed and extended in the future to evaluate new methods or datasets.
Affiliation(s)
- Sam Bryce-Smith
  - UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK
- Dominik Burri
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Matthew R. Gazzara
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
- Christina J. Herrmann
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Weronika Danecka
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom
- Christina M. Fitzsimmons
  - Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA
- Yuk Kei Wan
  - Genome Institute of Singapore, Buona Vista, Singapore
  - National University of Singapore, Kent Ridge, Singapore
- Farica Zhuang
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA
- Mervin M. Fansler
  - Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, NY, USA
  - Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, NY, USA
- José M. Fernández
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
- Meritxell Ferret
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
- Asier Gonzalez-Uriarte
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Spanish National Bioinformatics Institute (INB/ELIXIR-ES)
- Samuel Haynes
  - Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom
- Alexander Kanitz
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Maria Katsantoni
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Federico Marini
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, Germany
- Euan McDonnel
  - Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, United Kingdom
- Ben Nicolet
  - Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, and Oncode Institute, Amsterdam, The Netherlands
- Gregor Rot
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Institute of Molecular Life Sciences, Zurich, Switzerland
- Leonard Schärfen
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven CT, USA
- Pin-Jou Wu
  - Center for Plant Molecular Biology (ZMBP), University of Tübingen, Germany
- Yoseop Yoon
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California, USA
- Yoseph Barash
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA
  - Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA
- Mihaela Zavolan
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
6
Vlasenok M, Margasyuk S, Pervouchine DD. Transcriptome sequencing suggests that pre-mRNA splicing counteracts widespread intronic cleavage and polyadenylation. NAR Genom Bioinform 2023; 5:lqad051. [PMID: 37260513 PMCID: PMC10227441 DOI: 10.1093/nargab/lqad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Revised: 05/09/2023] [Accepted: 05/17/2023] [Indexed: 06/02/2023]
Abstract
Alternative splicing (AS) and alternative polyadenylation (APA) are two crucial steps in the post-transcriptional regulation of eukaryotic gene expression. Protocols capturing and sequencing RNA 3'-ends have uncovered widespread intronic polyadenylation (IPA) in normal and disease conditions, where it is currently attributed to stochastic variations in the pre-mRNA processing. Here, we took advantage of the massive amount of RNA-seq data generated by the Genotype Tissue Expression project (GTEx) to simultaneously identify and match tissue-specific expression of intronic polyadenylation sites with tissue-specific splicing. A combination of computational methods including the analysis of short reads with non-templated adenines revealed that APA events are more abundant in introns than in exons. While the rate of IPA in composite terminal exons and skipped terminal exons expectedly correlates with splicing, we observed a considerable fraction of IPA events that lack AS support and attributed them to spliced polyadenylated introns (SPI). We hypothesize that SPIs represent transient byproducts of a dynamic coupling between APA and AS, in which the spliceosome removes the intron while it is being cleaved and polyadenylated. These findings indicate that cotranscriptional pre-mRNA splicing could serve as a rescue mechanism to suppress premature transcription termination at intronic polyadenylation sites.
Affiliation(s)
- Maria Vlasenok
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar 30, Moscow 121205, Russia
- Sergey Margasyuk
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar 30, Moscow 121205, Russia
- Dmitri D Pervouchine
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Bolshoy Bulvar 30, Moscow 121205, Russia
7
Ye W, Lian Q, Ye C, Wu X. A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq. Genomics, Proteomics & Bioinformatics 2022:S1672-0229(22)00121-8. [PMID: 36167284 PMCID: PMC10372920 DOI: 10.1016/j.gpb.2022.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/19/2022] [Revised: 08/17/2022] [Accepted: 09/19/2022] [Indexed: 05/08/2023]
Abstract
Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. Additionally, we highlighted outstanding challenges and opportunities using new machine learning and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3' untranslated region, tissue-specific, cross-species, and single-cell pA prediction.
Affiliation(s)
- Wenbin Ye
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Qiwei Lian
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
- Congting Ye
  - Key Laboratory of the Coastal and Wetland Ecosystems, Ministry of Education, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
- Xiaohui Wu
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
8
Song P, Zhou S, Qi X, Jiao Y, Gong Y, Zhao J, Yang H, Qian Z, Qian J, Tang L. RNA modification writers influence tumor microenvironment in gastric cancer and prospects of targeted drug therapy. J Bioinform Comput Biol 2022; 20:2250004. [PMID: 35287562 DOI: 10.1142/s0219720022500044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Background: RNA adenosine modifications are crucial for regulating RNA levels. N6-methyladenosine (m6A), N1-methyladenosine (m1A), adenosine-to-inosine RNA editing, and alternative polyadenylation (APA) are four major RNA modification types. Methods: We evaluated the altered mRNA expression profiles of 27 RNA modification enzymes and compared the differences in tumor microenvironment (TME) and clinical prognosis between two RNA modification patterns using unsupervised clustering. Then, we constructed a scoring system, WM_score, and quantified the RNA modifications in patients with gastric cancer (GC), associating WM_score with TME, clinical outcomes, and effectiveness of targeted therapies. Results: RNA adenosine modifications strongly correlated with TME and could predict the degree of TME cell infiltration, genetic variation, and clinical prognosis. Two modification patterns were identified according to high and low WM_scores. Tumors in the WM_score-high subgroup were closely linked with survival advantage, CD4+ T-cell infiltration, high tumor mutation burden, and cell cycle signaling pathways, whereas those in the WM_score-low subgroup showed strong infiltration of inflammatory cells and poor survival. Regarding the immunotherapy response, a high WM_score showed a significant correlation with PD-L1 expression, predicting the effect of PD-L1 blockade therapy. Conclusion: The WM_scoring system could facilitate scoring and prediction of GC prognosis.
Affiliation(s)
- Peng Song
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Sheng Zhou
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Xiaoyang Qi
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Yuwen Jiao
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Yu Gong
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Jie Zhao
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Haojun Yang
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Zhifen Qian
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Jun Qian
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
- Liming Tang
  - Department of Gastrointestinal Surgery, The Affiliated Changzhou, No. 2 People's Hospital of Nanjing Medical University, Changzhou 213000, Jiangsu Province, P. R. China
9
Fiszbein A, McGurk M, Calvo-Roitberg E, Kim G, Burge CB, Pai AA. Widespread occurrence of hybrid internal-terminal exons in human transcriptomes. Science Advances 2022; 8:eabk1752. [PMID: 35044812 PMCID: PMC8769537 DOI: 10.1126/sciadv.abk1752] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/10/2021] [Accepted: 11/23/2021] [Indexed: 06/12/2023]
Abstract
Messenger RNA isoform differences are predominantly driven by alternative first, internal, and last exons. Despite the importance of classifying exons to understand isoform structure, few tools examine isoform-specific exon usage. We recently observed that alternative transcription start sites often arise near internal exons, often creating “hybrid” first/internal exons. To systematically detect hybrid exons, we built the hybrid-internal-terminal (HIT) pipeline to classify exons depending on their isoform-specific usage. On the basis of splice junction reads in RNA sequencing data and probabilistic modeling, the HIT index identified thousands of previously misclassified hybrid first-internal and internal-last exons. Hybrid exons are enriched in long genes and genes involved in RNA splicing and have longer flanking introns and strong splice sites. Their usage varies considerably across human tissues. By developing the first method to classify exons according to isoform contexts, our findings document the occurrence of hybrid exons, a common quirk of the human transcriptome.
Affiliation(s)
- Ana Fiszbein
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biology, Boston University, Boston, MA, USA
| | - Michael McGurk
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | | | - GyeungYun Kim
- Department of Biology, Boston University, Boston, MA, USA
| | - Christopher B. Burge
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Athma A. Pai
- RNA Therapeutics Institute, University of Massachusetts Medical School, Worcester, MA, USA
|
10
|
Guerra-Almeida D, Tschoeke DA, da-Fonseca RN. Understanding small ORF diversity through a comprehensive transcription feature classification. DNA Res 2021; 28:6317669. [PMID: 34240112 PMCID: PMC8435553 DOI: 10.1093/dnares/dsab007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/22/2020] [Indexed: 11/13/2022]
Abstract
Small open reading frames (small ORFs/sORFs/smORFs) are potentially coding sequences smaller than 100 codons that have historically been considered junk DNA by gene prediction software and in annotation screening; however, the advent of next-generation sequencing has contributed to the deeper investigation of junk DNA regions and their transcription products, resulting in the emergence of smORFs as a new focus of interest in systems biology. Several smORF peptides were recently reported in noncanonical mRNAs as new players in numerous biological contexts; however, their relevance is still overlooked in coding potential analysis. Hence, this review proposes a smORF classification based on transcriptional features, discussing the most promising approaches to investigate smORFs based on their different characteristics. First, smORFs were divided into nonexpressed (intergenic) and expressed (genic) smORFs. Second, genic smORFs were classified as smORFs located in noncoding RNAs (ncRNAs) or canonical mRNAs. Finally, smORFs in ncRNAs were further subdivided into sequences located in small or long RNAs, whereas smORFs located in canonical mRNAs were subdivided into several specific classes depending on their localization along the gene. We hope that this review provides new insights into large-scale annotations and reinforces the role of smORFs as essential components of a hidden coding DNA world.
Affiliation(s)
- Diego Guerra-Almeida
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Diogo Antonio Tschoeke
- Alberto Luiz Coimbra Institute of Graduate Studies and Engineering Research (COPPE), Biomedical Engineering Program, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
| | - Rodrigo Nunes-da-Fonseca
- Institute of Biodiversity and Sustainability, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil; National Institute of Science and Technology in Molecular Entomology, Rio de Janeiro, Brazil
|
11
|
Kandhari N, Kraupner-Taylor CA, Harrison PF, Powell DR, Beilharz TH. The Detection and Bioinformatic Analysis of Alternative 3' UTR Isoforms as Potential Cancer Biomarkers. Int J Mol Sci 2021; 22:5322. [PMID: 34070203 PMCID: PMC8158509 DOI: 10.3390/ijms22105322] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/08/2021] [Revised: 05/06/2021] [Accepted: 05/06/2021] [Indexed: 12/17/2022]
Abstract
Alternative transcript cleavage and polyadenylation is linked to cancer cell transformation, proliferation and outcome. This has led researchers to develop methods to detect and bioinformatically analyse alternative polyadenylation as potential cancer biomarkers. If incorporated into standard prognostic measures such as gene expression and clinical parameters, these could advance cancer prognostic testing and possibly guide therapy. In this review, we focus on the existing methodologies, both experimental and computational, that have been applied to support the use of alternative polyadenylation as cancer biomarkers.
Affiliation(s)
- Nitika Kandhari
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Calvin A. Kraupner-Taylor
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Paul F. Harrison
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
| | - David R. Powell
- Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
| | - Traude H. Beilharz
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
|
12
|
Wang R, Tian B. APAlyzer: a bioinformatics package for analysis of alternative polyadenylation isoforms. Bioinformatics 2020; 36:3907-3909. [PMID: 32321166 DOI: 10.1093/bioinformatics/btaa266] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 01/07/2020] [Revised: 04/13/2020] [Accepted: 04/15/2020] [Indexed: 02/01/2023]
Abstract
SUMMARY: Most eukaryotic genes produce alternative polyadenylation (APA) isoforms. APA is dynamically regulated under different growth and differentiation conditions. Here, we present a bioinformatics package, named APAlyzer, for examining 3'UTR APA, intronic APA and gene expression changes using RNA-seq data and annotated polyadenylation sites in the PolyA_DB database. Using APAlyzer and data from the GTEx database, we present APA profiles across human tissues. AVAILABILITY AND IMPLEMENTATION: APAlyzer is freely available at https://bioconductor.org/packages/release/bioc/html/APAlyzer.html as an R/Bioconductor package. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ruijia Wang
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ 07103, USA; QIAGEN Digital Insights, Concord, MA 01742, USA
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ 07103, USA; Program in Gene Expression and Regulation, and Center for Systems and Computational Biology, Wistar Institute, Philadelphia, PA 19104, USA
|