1
Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK. PolyAMiner-Bulk is a deep learning-based algorithm that decodes alternative polyadenylation dynamics from bulk RNA-seq data. Cell Reports Methods 2024; 4:100707. [PMID: 38325383 PMCID: PMC10921021 DOI: 10.1016/j.crmeth.2024.100707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 04/13/2023] [Accepted: 01/11/2024] [Indexed: 02/09/2024]
Abstract
Alternative polyadenylation (APA) is a key post-transcriptional regulatory mechanism; yet, its regulation and impact on human diseases remain understudied. Existing bulk RNA sequencing (RNA-seq)-based APA methods predominantly rely on predefined annotations, severely impacting their ability to decode novel tissue- and disease-specific APA changes. Furthermore, they only account for the most proximal and distal cleavage and polyadenylation sites (C/PASs). Deconvoluting overlapping C/PASs and the inherent noisy 3' UTR coverage in bulk RNA-seq data pose additional challenges. To overcome these limitations, we introduce PolyAMiner-Bulk, an attention-based deep learning algorithm that accurately recapitulates C/PAS sequence grammar, resolves overlapping C/PASs, captures non-proximal-to-distal APA changes, and generates visualizations to illustrate APA dynamics. Evaluation on multiple datasets strongly evinces the performance merit of PolyAMiner-Bulk, accurately identifying more APA changes compared with other methods. With the growing importance of APA and the abundance of bulk RNA-seq data, PolyAMiner-Bulk establishes a robust paradigm of APA analysis.
Affiliation(s)
- Venkata Soumith Jonnakuti
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX 77030, USA; Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
- Eric J Wagner
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Mirjana Maletić-Savatić
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA
- Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX 77030, USA
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, TX 77030, USA; USDA/ARS Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA.
2
Jonnakuti VS, Wagner EJ, Maletić-Savatić M, Liu Z, Yalamanchili HK. PolyAMiner-Bulk: A Machine Learning Based Bioinformatics Algorithm to Infer and Decode Alternative Polyadenylation Dynamics from bulk RNA-seq data. bioRxiv: the preprint server for biology 2023:2023.01.23.523471. [PMID: 36747700 PMCID: PMC9900750 DOI: 10.1101/2023.01.23.523471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Abstract
More than half of human genes exercise alternative polyadenylation (APA) and generate mRNA transcripts with varying 3' untranslated regions (UTR). However, current computational approaches for identifying cleavage and polyadenylation sites (C/PASs) and quantifying 3'UTR length changes from bulk RNA-seq data fail to unravel tissue- and disease-specific APA dynamics. Here, we developed a next-generation bioinformatics algorithm and application, PolyAMiner-Bulk, that utilizes an attention-based machine learning architecture and an improved vector projection-based engine to infer differential APA dynamics accurately. When applied to earlier studies, PolyAMiner-Bulk accurately identified more than twice the number of APA changes in an RBM17 knockdown bulk RNA-seq dataset compared to current generation tools. Moreover, on a separate dataset, PolyAMiner-Bulk revealed novel APA dynamics and pathways in scleroderma pathology and identified differential APA in a gene that was identified as being involved in scleroderma pathogenesis in an independent study. Lastly, we used PolyAMiner-Bulk to analyze the RNA-seq data of post-mortem prefrontal cortexes from the ROSMAP data consortium and unraveled novel APA dynamics in Alzheimer's Disease. Our method, PolyAMiner-Bulk, creates a paradigm for future alternative polyadenylation analysis from bulk RNA-seq data.
Affiliation(s)
- Venkata Soumith Jonnakuti
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Eric J. Wagner
- Department of Biochemistry and Biophysics, University of Rochester School of Medicine and Dentistry, Rochester, NY 14642, USA
- Mirjana Maletić-Savatić
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Zhandong Liu
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
- Program in Quantitative and Computational Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, TX, 77030, USA
- Jan and Dan Duncan Neurological Research Institute, Texas Children’s Hospital, Houston, TX, 77030, USA
3
Fahmi NA, Ahmed KT, Chang JW, Nassereddeen H, Fan D, Yong J, Zhang W. APA-Scan: detection and visualization of 3'-UTR alternative polyadenylation with RNA-seq and 3'-end-seq data. BMC Bioinformatics 2022; 23:396. [PMID: 36171568 PMCID: PMC9520800 DOI: 10.1186/s12859-022-04939-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 09/16/2022] [Indexed: 11/26/2022]
Abstract
Background: The eukaryotic genome is capable of producing multiple isoforms from a gene by alternative polyadenylation (APA) during pre-mRNA processing. APA in the 3′-untranslated region (3′-UTR) of mRNA produces transcripts with shorter or longer 3′-UTRs. Often, the 3′-UTR serves as a binding platform for microRNAs and RNA-binding proteins, which affect the fate of the mRNA transcript. Thus, 3′-UTR APA is known to modulate translation and provides a means to regulate gene expression at the post-transcriptional level. Current bioinformatics pipelines have limited capability in profiling 3′-UTR APA events due to incomplete annotations and low analytical resolution: widely available bioinformatics pipelines do not reference actionable polyadenylation (cleavage) sites but simulate 3′-UTR APA using only RNA-seq read coverage, causing false positive identifications. To overcome these limitations, we developed APA-Scan, a robust program that identifies 3′-UTR APA events and visualizes the RNA-seq short-read coverage with gene annotations.
Methods: APA-Scan utilizes either predicted or experimentally validated actionable polyadenylation signals as a reference for polyadenylation sites and calculates the quantity of long and short 3′-UTR transcripts in the RNA-seq data. APA-Scan works in three major steps: (i) calculate the read coverage of the 3′-UTR regions of genes; (ii) identify the potential APA sites and evaluate the significance of the events between two biological conditions; (iii) plot user-specified events with the 3′-UTR annotation and read coverage on the 3′-UTR regions. APA-Scan is implemented in Python3. Source code and a comprehensive user’s manual are freely available at https://github.com/compbiolabucf/APA-Scan.
Results: APA-Scan was applied to both simulated and real RNA-seq datasets and compared with two widely used baselines, DaPars and APAtrap. In simulation, APA-Scan significantly improved the accuracy of 3′-UTR APA identification compared to the baselines. The performance of APA-Scan was also validated by 3′-end-seq data and qPCR on mouse embryonic fibroblast cells. The experiments confirm that APA-Scan can detect unannotated 3′-UTR APA events and improve genome annotation.
Conclusion: APA-Scan is a comprehensive computational pipeline to detect transcriptome-wide 3′-UTR APA events. The pipeline integrates both RNA-seq and 3′-end-seq data and can efficiently identify significant events with high-resolution short-read coverage plots.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04939-w.
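The first two steps described in the Methods above (read coverage over a 3′-UTR, then a significance test between two conditions at a candidate site) can be sketched roughly as follows. This is a minimal illustration and not APA-Scan's actual code: the function names `mean_coverage`, `apa_contingency` and `chi_square_2x2`, the toy coverage vectors, and the choice of a chi-squared test on a 2x2 table are all assumptions made for the example, since the abstract does not state which statistical test is applied.

```python
import math
from typing import Sequence, Tuple

Table = Tuple[Tuple[float, float], Tuple[float, float]]


def mean_coverage(cov: Sequence[float], start: int, end: int) -> float:
    """Mean per-base read depth over the half-open interval [start, end)."""
    region = cov[start:end]
    return sum(region) / len(region) if region else 0.0


def apa_contingency(cov_a: Sequence[float], cov_b: Sequence[float], site: int) -> Table:
    """Build a 2x2 table of (upstream, downstream) mean coverage per condition.

    `site` is a hypothetical candidate cleavage position inside the 3'-UTR,
    given as an offset from the 3'-UTR start. Upstream coverage comes from
    both short and long isoforms; downstream coverage from the long one only.
    """
    return (
        (mean_coverage(cov_a, 0, site), mean_coverage(cov_a, site, len(cov_a))),
        (mean_coverage(cov_b, 0, site), mean_coverage(cov_b, site, len(cov_b))),
    )


def chi_square_2x2(table: Table) -> Tuple[float, float]:
    """Pearson chi-squared statistic and p-value (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if n == 0 or 0 in margins:
        return 0.0, 1.0  # degenerate table: no evidence of a change
    stat = n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy per-base coverage over a 200-nt 3'-UTR with a candidate site at offset 100.
control = [40.0] * 100 + [38.0] * 100    # coverage stays flat: mostly long isoform
treatment = [40.0] * 100 + [10.0] * 100  # coverage drops: 3'-UTR shortening

stat, p_value = chi_square_2x2(apa_contingency(control, treatment, site=100))
print(f"chi2={stat:.2f}, p={p_value:.4f}")
```

In this sketch a drop in downstream coverage in one condition shifts the table away from independence, which is what the test picks up. A real pipeline would work from aligned reads, handle replicates, and correct for multiple testing across genes.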
Affiliation(s)
- Naima Ahmed Fahmi
- Department of Computer Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL, 32816, USA
- Khandakar Tanvir Ahmed
- Department of Computer Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL, 32816, USA
- Jae-Woong Chang
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, 420 Washington Ave. S.E., Minneapolis, MN, 55455, USA
- Heba Nassereddeen
- Department of Computer Engineering, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL, 32816, USA
- Deliang Fan
- School of Electrical, Computer and Energy Engineering, Arizona State University, 650 E Tyler Mall, Tempe, AZ, 85287, USA
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, 420 Washington Ave. S.E., Minneapolis, MN, 55455, USA.
- Wei Zhang
- Department of Computer Science, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL, 32816, USA.
4
Ye W, Lian Q, Ye C, Wu X. A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq. Genomics, Proteomics & Bioinformatics 2022:S1672-0229(22)00121-8. [PMID: 36167284 PMCID: PMC10372920 DOI: 10.1016/j.gpb.2022.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/19/2022] [Revised: 08/17/2022] [Accepted: 09/19/2022] [Indexed: 05/08/2023]
Abstract
Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. Additionally, we highlighted outstanding challenges and opportunities using new machine learning and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3' untranslated region, tissue-specific, cross-species, and single-cell pA prediction.
Affiliation(s)
- Wenbin Ye
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Qiwei Lian
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
- Congting Ye
- Key Laboratory of the Coastal and Wetland Ecosystems, Ministry of Education, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
- Xiaohui Wu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China.
5
Nachtigall PG, Bovolenta LA. Computational Detection of MicroRNA Targets. Methods Mol Biol 2022; 2257:187-209. [PMID: 34432280 DOI: 10.1007/978-1-0716-1170-8_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
Abstract
MicroRNAs (miRNAs) are small noncoding RNAs that are recognized as posttranscriptional regulators of gene expression. These molecules have been shown to play important roles in several cellular processes. MiRNAs act on their target by guiding the RISC complex and binding to the mRNA molecule. Thus, it is recognized that the function of a miRNA is determined by the function of its target(s). By using high-throughput methodologies, novel miRNAs are being identified, but their functions remain uncharted. Target validation is crucial to properly understand the specific role of a miRNA in a cellular pathway. However, molecular techniques for experimental validation of miRNA-target interactions are expensive, time-consuming, and laborious, and may not accurately infer true interactions. Thus, accurate miRNA target predictions are helpful to understand the functions of miRNAs. There are several algorithms proposed for target prediction and databases containing miRNA-target information. However, these available computational tools for prediction still generate a large number of false positives and fail to detect a considerable number of true targets, which indicates the necessity of highly confident approaches to identify bona fide miRNA-target interactions. This chapter focuses on tools and strategies used for miRNA target prediction, by providing practical insights and outlooks.
Affiliation(s)
- Pedro Gabriel Nachtigall
- Laboratório Especial de Toxinologia Aplicada, CeTICS, Instituto Butantan, São Paulo, SP, Brazil.
- Luiz Augusto Bovolenta
- Department of Morphology, Institute of Biosciences of Botucatu (IBB), São Paulo State University (UNESP), Botucatu, Brazil
6
Dharmalingam P, Mahalingam R, Yalamanchili HK, Weng T, Karmouty-Quintana H, Guha A, A Thandavarayan R. Emerging roles of alternative cleavage and polyadenylation (APA) in human disease. J Cell Physiol 2021; 237:149-160. [PMID: 34378793 DOI: 10.1002/jcp.30549] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 06/05/2021] [Revised: 07/13/2021] [Accepted: 07/20/2021] [Indexed: 12/11/2022]
Abstract
In the messenger RNA (mRNA) maturation process, the 3'-end of pre-mRNA is cleaved and a poly(A) sequence is added; this is an important determinant of mRNA stability and its cellular functions. More than 60%-70% of human genes have three or more polyadenylation sites and can be cleaved at different sites, generating mRNA transcripts of varying lengths. This phenomenon is termed alternative cleavage and polyadenylation (APA), and it plays a role in key biological processes like gene regulation, cell proliferation, and senescence, and also in various human diseases. Loss of regulatory microRNA binding sites and interactions with RNA-binding proteins leading to APA are largely investigated in human diseases. However, the functions of the core APA machinery and related factors during disease conditions remain largely unknown. In this review, we discuss the roles of polyadenylation machinery in relation to brain disease, cardiac failure, pulmonary fibrosis, cancer, infectious conditions, and other human diseases. Collectively, we believe this review will be a useful avenue for understanding the emerging role of APA in the pathobiology of various human diseases.
Affiliation(s)
- Prakash Dharmalingam
- Department of Biochemistry, Saveetha Dental College & Hospitals, Saveetha Institute of Medical & Technical Sciences, Saveetha University, Chennai, India
- Rajasekaran Mahalingam
- Laboratory of Neuroimmunology, Department of Symptom Research, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
- Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, USA; Department of Pediatrics - Neurology, Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, Texas, USA; Department of Pediatrics, USDA/ARS Children's Nutrition Research Center, Baylor College of Medicine, Houston, Texas, USA
- Tingting Weng
- Department of Biochemistry and Molecular Biology & Divisions of Critical Care, Pulmonary and Sleep Medicine, Department of Internal Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Harry Karmouty-Quintana
- Department of Biochemistry and Molecular Biology & Divisions of Critical Care, Pulmonary and Sleep Medicine, Department of Internal Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, Texas, USA
- Ashrith Guha
- Department of Cardiology, Houston Methodist DeBakey Heart & Vascular Center, Houston, Texas, USA
7
Kandhari N, Kraupner-Taylor CA, Harrison PF, Powell DR, Beilharz TH. The Detection and Bioinformatic Analysis of Alternative 3' UTR Isoforms as Potential Cancer Biomarkers. Int J Mol Sci 2021; 22:5322. [PMID: 34070203 PMCID: PMC8158509 DOI: 10.3390/ijms22105322] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/08/2021] [Revised: 05/06/2021] [Accepted: 05/06/2021] [Indexed: 12/17/2022]
Abstract
Alternative transcript cleavage and polyadenylation is linked to cancer cell transformation, proliferation and outcome. This has led researchers to develop methods to detect and bioinformatically analyse alternative polyadenylation as potential cancer biomarkers. If incorporated into standard prognostic measures such as gene expression and clinical parameters, these could advance cancer prognostic testing and possibly guide therapy. In this review, we focus on the existing methodologies, both experimental and computational, that have been applied to support the use of alternative polyadenylation as cancer biomarkers.
Affiliation(s)
- Nitika Kandhari
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Calvin A. Kraupner-Taylor
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Paul F. Harrison
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
- Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
- David R. Powell
- Monash Bioinformatics Platform, Monash University, Melbourne, VIC 3800, Australia
- Traude H. Beilharz
- Development and Stem Cells Program, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, VIC 3800, Australia
8
Alternative Polyadenylation: a new frontier in post transcriptional regulation. Biomark Res 2020; 8:67. [PMID: 33292571 PMCID: PMC7690165 DOI: 10.1186/s40364-020-00249-6] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 09/10/2020] [Accepted: 11/16/2020] [Indexed: 12/13/2022]
Abstract
Polyadenylation at specific sites of pre-messenger RNA (pre-mRNA) and termination of downstream transcription are signaled by sequence motifs such as AAUAAA and its auxiliary elements. Alternative polyadenylation (APA) is an important post-transcriptional regulatory mechanism that processes RNA products depending on specific sequence signals in the 3'-untranslated region (3'-UTR). APA processing can generate several mRNA isoforms from a single gene, which may have different biological functions on their target gene. As a result, cellular genomic stability, proliferation capability, and transformation feasibility could all be affected. Furthermore, APA modulation regulates disease initiation and progression. APA status could potentially act as a biomarker for disease diagnosis, severity stratification, and prognosis forecast. While advances in modern high-throughput technologies, such as next-generation sequencing (NGS) and single-cell sequencing, have enriched our knowledge of APA, much of the biology of APA remains unknown and awaits further investigation. Herein, we review the current knowledge on APA and how its regulatory complex factors (CFI/IIm, CPSF, CSTF, and RBPs) work together to determine RNA splicing location, cell cycle velocity, microRNA processing, and oncogenesis regulation. We also discuss various APA experiment strategies and the future direction of APA research.
9
Tu M, Li Y. Profiling Alternative 3' Untranslated Regions in Sorghum using RNA-seq Data. Front Genet 2020; 11:556749. [PMID: 33193635 PMCID: PMC7649775 DOI: 10.3389/fgene.2020.556749] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/28/2020] [Accepted: 09/30/2020] [Indexed: 12/18/2022]
Abstract
Sorghum is an important crop widely used for food, feed, and fuel. Transcriptome-wide studies of 3′ untranslated regions (3′UTRs) using regular RNA-seq remain scarce in sorghum, even though transcriptomes have been characterized extensively using Illumina short-read sequencing platforms for many sorghum varieties under various conditions or developmental contexts. The 3′UTR is a critical regulatory component of genes, controlling the translation, transport, and stability of messenger RNAs. In the present study, we profiled the alternative 3′UTRs at the transcriptome level in three genetically related but phenotypically contrasting lines of sorghum: Rio, BTx406, and R9188. A total of 1,197 transcripts with alternative 3′UTRs were detected using RNA-seq data. Their categorization identified 612 high-confidence alternative 3′UTRs. Importantly, the high-confidence alternative 3′UTR genes significantly overlapped with the gene sets that are associated with RNA N6-methyladenosine (m6A) modification, suggesting a clear association between alternative 3′UTRs and m6A methylation in sorghum. Moreover, taking advantage of sorghum genetics, we provided evidence of genotype specificity of alternative 3′UTR usage. In summary, our work exemplifies transcriptome-wide profiling of alternative 3′UTRs using regular RNA-seq data in non-model crops and provides insights into alternative 3′UTRs and their genotype specificity.
Affiliation(s)
- Min Tu
- Waksman Institute of Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
- Yin Li
- Waksman Institute of Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
10
Oliveira EH, Assis AF, Speck-Hernandez CA, Duarte MJ, Passos GA. Aire Gene Influences the Length of the 3' UTR of mRNAs in Medullary Thymic Epithelial Cells. Front Immunol 2020; 11:1039. [PMID: 32547551 PMCID: PMC7270294 DOI: 10.3389/fimmu.2020.01039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/03/2019] [Accepted: 04/29/2020] [Indexed: 12/15/2022]
Abstract
Aire is a transcriptional controller in medullary thymic epithelial cells (mTECs) modulating a set of peripheral tissue antigens (PTAs) and non-PTA mRNAs as well as miRNAs. Even though miRNAs exert posttranscriptional control of mRNAs in mTECs, the composition of miRNA-mRNA networks may differ. Under reduction in Aire expression, networks exhibited greater diversity of miRNAs controlling mRNAs. Variations in the number of 3'UTR binding sites of Aire-dependent mRNAs may represent a crucial factor that influences the miRNA interaction. To test this hypothesis, we analyzed through bioinformatics the length of 3'UTRs of a large set of Aire-dependent mRNAs. The data were obtained from existing RNA-seq of mTECs of wild type or Aire-knockout (KO) mice. We used computational tools such as FastQC, STAR and HTSeq for sequence alignment and read counting, DESeq2 for differential expression, 3USS for alternative 3'UTRs and TAPAS for alternative polyadenylation sites. We identified 152 differentially expressed mRNAs between these samples, comprising those that encode PTAs as well as transcription regulators. In Aire KO mTECs, most of these mRNAs featured an increase in the length of their 3'UTRs, giving rise to additional miRNA binding sites and new miRNA controllers. Results from the in silico analysis were statistically significant and the predicted miRNA-mRNA interactions were thermodynamically stable. Even with no in vivo or in vitro experiments, they were adequate to show that lack of Aire in mTECs might favor the downregulation of PTA mRNAs and transcription regulators via miRNA control. This could unbalance the overall transcriptional activity in mTECs and thus the self-representation.
Affiliation(s)
- Ernna H. Oliveira
- Molecular Immunogenetics Group, Department of Genetics, Ribeirão Preto Medical School, University of São Paulo (USP), Ribeirão Preto, Brazil
- Cesar A. Speck-Hernandez
- Molecular Immunogenetics Group, Department of Genetics, Ribeirão Preto Medical School, University of São Paulo (USP), Ribeirão Preto, Brazil
- Max Jordan Duarte
- Molecular Immunogenetics Group, Department of Genetics, Ribeirão Preto Medical School, University of São Paulo (USP), Ribeirão Preto, Brazil
- Geraldo A. Passos
- Molecular Immunogenetics Group, Department of Genetics, Ribeirão Preto Medical School, University of São Paulo (USP), Ribeirão Preto, Brazil
- Laboratory of Genetics and Molecular Biology, Department of Basic and Oral Biology, School of Dentistry of Ribeirão Preto, USP, Ribeirão Preto, Brazil
11
Nachtigall PG, Kashiwabara AY, Durham AM. CodAn: predictive models for precise identification of coding regions in eukaryotic transcripts. Brief Bioinform 2020; 22:5847603. [PMID: 32460307 PMCID: PMC8138839 DOI: 10.1093/bib/bbaa045] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/26/2019] [Revised: 02/19/2020] [Accepted: 03/06/2020] [Indexed: 12/13/2022]
Abstract
Motivation: Characterization of the coding sequences (CDSs) is an essential step in transcriptome annotation. Incorrect identification of CDSs can lead to the prediction of non-existent proteins that can eventually compromise knowledge if databases are populated with similar incorrect predictions made in different genomes. Also, the correct identification of CDSs is important for the characterization of the untranslated regions (UTRs), which are known to be important regulators of the mRNA translation process. Considering this, we present CodAn (Coding sequence Annotator), a new approach to predict confident CDS and UTR regions in full or partial transcriptome sequences in eukaryote species.
Results: Our analysis revealed that CodAn performs confident predictions on full-length and partial transcripts with the strand sense of the CDS known or unknown. The comparative analysis showed that CodAn presents better overall performance than other approaches, mainly when considering the correct identification of the full CDS (i.e. correct identification of the start and stop codons). In this sense, CodAn is the best tool to be used in projects involving transcriptomic data.
Availability: CodAn is freely available at https://github.com/pedronachtigall/CodAn.
Contact: aland@usp.br
Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
Affiliation(s)
- Alan M Durham
- Corresponding author: Alan M. Durham, Department of Computer Science, Instituto de Matematica e Estatistica, Universidade de Sao Paulo (USP), Brazil. Tel.: +55 11 30919877; Fax: +55 11 30919877; E-mail:
12
de la Fuente L, Arzalluz-Luque Á, Tardáguila M, Del Risco H, Martí C, Tarazona S, Salguero P, Scott R, Lerma A, Alastrue-Agudo A, Bonilla P, Newman JRB, Kosugi S, McIntyre LM, Moreno-Manzano V, Conesa A. tappAS: a comprehensive computational framework for the analysis of the functional impact of differential splicing. Genome Biol 2020; 21:119. [PMID: 32423416 PMCID: PMC7236505 DOI: 10.1186/s13059-020-02028-w] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/20/2019] [Accepted: 04/23/2020] [Indexed: 12/26/2022]
Abstract
Recent advances in long-read sequencing solve inaccuracies in alternative transcript identification of full-length transcripts in short-read RNA-Seq data, which encourages the development of methods for isoform-centered functional analysis. Here, we present tappAS, the first framework to enable a comprehensive Functional Iso-Transcriptomics (FIT) analysis, which is effective at revealing the functional impact of context-specific post-transcriptional regulation. tappAS uses isoform-resolved annotation of coding and non-coding functional domains, motifs, and sites, in combination with novel analysis methods to interrogate different aspects of the functional readout of transcript variants and isoform regulation. tappAS software and documentation are available at https://app.tappas.org.
Affiliation(s)
- Lorena de la Fuente
- Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
- Present Address: Bioinformatics Unit, IIS Fundación Jiménez Díaz, Madrid, Spain
| | - Ángeles Arzalluz-Luque
- Department of Statistics and Operational Research, Polytechnical University of Valencia, Valencia, Spain
| | - Manuel Tardáguila
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
- Present Address: Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | - Héctor Del Risco
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
| | - Cristina Martí
- Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
| | - Sonia Tarazona
- Department of Statistics and Operational Research, Polytechnical University of Valencia, Valencia, Spain
| | - Pedro Salguero
- Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
| | - Raymond Scott
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
| | - Alberto Lerma
- Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
| | - Ana Alastrue-Agudo
- Present Address: Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | - Pablo Bonilla
- Present Address: Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
| | - Jeremy R B Newman
- Genetics Institute, University of Florida, Gainesville, FL, USA
- Department of Pathology, University of Florida, Gainesville, FL, USA
| | - Shunichi Kosugi
- Genetics Institute, University of Florida, Gainesville, FL, USA
- Laboratory for Statistical and Translational Genetics, Center for Integrative Medical Sciences, RIKEN, Wako, Japan
| | - Lauren M McIntyre
- Genetics Institute, University of Florida, Gainesville, FL, USA
- Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, USA
| | | | - Ana Conesa
- Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA.
- Genetics Institute, University of Florida, Gainesville, FL, USA.
| |
Collapse
|
13
|
Ye C, Lin J, Li QQ. Discovery of alternative polyadenylation dynamics from single cell types. Comput Struct Biotechnol J 2020; 18:1012-1019. [PMID: 32382395 PMCID: PMC7200215 DOI: 10.1016/j.csbj.2020.04.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/20/2020] [Revised: 04/12/2020] [Accepted: 04/14/2020] [Indexed: 12/13/2022]
Abstract
Alternative polyadenylation (APA) occurs during mRNA maturation, when a poly(A) tail is added at different locations; this increases the diversity of mRNA isoforms and contributes to the complexity of gene regulatory networks. Thanks to the development of high-throughput sequencing technologies, APA profiles of transcriptomes can now be delineated at an unprecedented pace. In particular, single-cell RNA sequencing (scRNA-seq) technologies provide opportunities to interrogate the biological details of diverse and rare cell types. Despite increasing evidence that APA is involved in cell type-specific regulation and function, efficient and specific laboratory methods for capturing poly(A) sites at single-cell resolution remain underdeveloped. In this review, we summarize existing experimental and computational methods for identifying APA dynamics from diverse single cell types. A future perspective is also provided.
Affiliation(s)
- Congting Ye
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Juncheng Lin
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Qingshun Q. Li
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
  - Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA 91766, USA

14
Venkat S, Tisdale AA, Schwarz JR, Alahmari AA, Maurer HC, Olive KP, Eng KH, Feigin ME. Alternative polyadenylation drives oncogenic gene expression in pancreatic ductal adenocarcinoma. Genome Res 2020; 30:347-360. [PMID: 32029502 PMCID: PMC7111527 DOI: 10.1101/gr.257550.119] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 09/25/2019] [Accepted: 02/04/2020] [Indexed: 01/08/2023]
Abstract
Alternative polyadenylation (APA) is a gene regulatory process that dictates mRNA 3'-UTR length, resulting in changes in mRNA stability and localization. APA is frequently disrupted in cancer and promotes tumorigenesis through altered expression of oncogenes and tumor suppressors. Pan-cancer analyses have revealed common APA events across the tumor landscape; however, little is known about tumor type-specific alterations that may uncover novel events and vulnerabilities. Here, we integrate RNA-sequencing data from the Genotype-Tissue Expression (GTEx) project and The Cancer Genome Atlas (TCGA) to comprehensively analyze APA events in 148 pancreatic ductal adenocarcinomas (PDACs). We report widespread, recurrent, and functionally relevant 3'-UTR alterations associated with gene expression changes of known and newly identified PDAC growth-promoting genes and experimentally validate the effects of these APA events on protein expression. We find enrichment for APA events in genes associated with known PDAC pathways, loss of tumor-suppressive miRNA binding sites, and increased heterogeneity in 3'-UTR forms of metabolic genes. Survival analyses reveal a subset of 3'-UTR alterations that independently characterize a poor prognostic cohort among PDAC patients. Finally, we identify and validate the casein kinase CSNK1A1 (also known as CK1alpha or CK1a) as an APA-regulated therapeutic target in PDAC. Knockdown or pharmacological inhibition of CSNK1A1 attenuates PDAC cell proliferation and clonogenic growth. Our single-cancer analysis reveals APA as an underappreciated driver of protumorigenic gene expression in PDAC via the loss of miRNA regulation.
Affiliation(s)
- Swati Venkat
  - Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
- Arwen A Tisdale
  - Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
- Johann R Schwarz
  - Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
- Abdulrahman A Alahmari
  - Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
- H Carlo Maurer
  - Klinikum rechts der Isar, II. Medizinische Klinik, Technische Universität München, 81675 Munich, Germany
- Kenneth P Olive
  - Herbert Irving Comprehensive Cancer Center, Department of Medicine, Division of Digestive and Liver Diseases, Department of Pathology and Cell Biology, Columbia University Medical Center, New York, New York 10032, USA
- Kevin H Eng
  - Department of Cancer Genetics and Genomics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
  - Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA
- Michael E Feigin
  - Department of Pharmacology and Therapeutics, Roswell Park Comprehensive Cancer Center, Buffalo, New York 14263, USA

15
Arefeen A, Liu J, Xiao X, Jiang T. TAPAS: tool for alternative polyadenylation site analysis. Bioinformatics 2019; 34:2521-2529. [PMID: 30052912 DOI: 10.1093/bioinformatics/bty110] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 04/17/2017] [Accepted: 02/22/2018] [Indexed: 01/08/2023]
Abstract
Motivation: The length of the 3' untranslated region (3' UTR) of an mRNA is essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding and translation efficiency. Moreover, correlation between diseases and the shortening (or lengthening) of 3' UTRs has been reported in the literature. This length is largely determined by the polyadenylation cleavage site in the mRNA. As alternative polyadenylation (APA) sites are common in mammalian genes, several tools have been published recently for detecting APA sites from RNA-Seq data or performing shortening/lengthening analysis. These tools consider either up to only two APA sites in a gene or only APA sites that occur in the last exon of a gene, although a gene may generally have more than two APA sites and an APA site may sometimes occur before the last exon. Furthermore, the tools are unable to integrate the analysis of shortening/lengthening events with APA site detection.
Results: We propose a new tool, called TAPAS, for detecting novel APA sites from RNA-Seq data. It can deal with more than two APA sites in a gene as well as APA sites that occur before the last exon. The tool is based on an existing method for finding change points in time series data, but some filtration techniques are also adopted to remove change points that are likely false APA sites. It is then extended to identify APA sites that are expressed differently between two biological samples and genes that contain 3' UTRs with shortening/lengthening events. Our extensive experiments on simulated and real RNA-Seq data demonstrate that TAPAS outperforms the existing tools for APA site detection or shortening/lengthening analysis significantly.
Availability and implementation: https://github.com/arefeen/TAPAS.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ashraful Arefeen
  - Department of Computer Science and Engineering, University of California, Riverside, CA, USA
- Juntao Liu
  - School of Mathematics, Shandong University, Jinan, Shandong, China
- Xinshu Xiao
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA, USA
- Tao Jiang
  - Department of Computer Science and Engineering, University of California, Riverside, CA, USA
  - Institute of Integrative Genome Biology, University of California, Riverside, CA, USA
  - MOE Key Lab of Bioinformatics and Bioinformatics Division, TNLIST/Department of Computer Science and Technology, Tsinghua University, Beijing, China

16
Doulazmi M, Cros C, Dusart I, Trembleau A, Dubacq C. Alternative polyadenylation produces multiple 3' untranslated regions of odorant receptor mRNAs in mouse olfactory sensory neurons. BMC Genomics 2019; 20:577. [PMID: 31299892 PMCID: PMC6624953 DOI: 10.1186/s12864-019-5927-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/20/2019] [Accepted: 06/23/2019] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Odorant receptor genes constitute the largest gene family in mammalian genomes and this family has been extensively studied in several species, but to date far less attention has been paid to the characterization of their mRNA 3' untranslated regions (3'UTRs). Given the increasing importance of UTRs in the understanding of RNA metabolism, and the growing interest in alternative polyadenylation especially in the nervous system, we aimed at identifying the alternative isoforms of odorant receptor mRNAs generated through 3'UTR variation.
RESULTS: We implemented a dedicated pipeline using IsoSCM instead of Cufflinks to analyze RNA-Seq data from whole olfactory mucosa of adult mice and obtained an extensive description of the 3'UTR isoforms of odorant receptor mRNAs. To validate our bioinformatics approach, we exhaustively analyzed the 3'UTR isoforms produced from 2 pilot genes, using molecular approaches including northern blot and RNA ligation mediated polyadenylation test. Comparison between datasets further validated the pipeline and confirmed the alternative polyadenylation patterns of odorant receptors. Qualitative and quantitative analyses of the annotated 3' regions demonstrate that 1) Odorant receptor 3'UTRs are longer than previously described in the literature; 2) More than 77% of odorant receptor mRNAs are subject to alternative polyadenylation, hence generating at least 2 detectable 3'UTR isoforms; 3) Splicing events in 3'UTRs are restricted to a limited subset of odorant receptor genes; and 4) Comparison between male and female data shows no sex-specific differences in odorant receptor 3'UTR isoforms.
CONCLUSIONS: We demonstrated for the first time that odorant receptor genes are extensively subject to alternative polyadenylation. This ground-breaking change to the landscape of 3'UTR isoforms of Olfr mRNAs opens new avenues for investigating their respective functions, especially during the differentiation of olfactory sensory neurons.
Affiliation(s)
- Mohamed Doulazmi
  - CNRS, Institut de Biologie Paris Seine, Biological adaptation and ageing, B2A, Sorbonne Université, F-75005 Paris, France
- Cyril Cros
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France
  - Present Address: Columbia University, New York, NY 10027 USA
- Isabelle Dusart
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France
- Alain Trembleau
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France
- Caroline Dubacq
  - CNRS, INSERM, Institut de Biologie Paris Seine, Neuroscience Paris Seine, NPS, Sorbonne Université, F-75005 Paris, France

17
Chen M, Ji G, Fu H, Lin Q, Ye C, Ye W, Su Y, Wu X. A survey on identification and quantification of alternative polyadenylation sites from RNA-seq data. Brief Bioinform 2019; 21:1261-1276. [PMID: 31267126 DOI: 10.1093/bib/bbz068] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/10/2019] [Revised: 05/03/2019] [Accepted: 05/14/2019] [Indexed: 12/13/2022]
Abstract
Alternative polyadenylation (APA) has been implicated in post-transcriptional regulation through its effects on mRNA abundance, stability, localization and translation, and it contributes considerably to transcriptome diversity and gene expression regulation. RNA-seq has become a routine approach for transcriptome profiling, generating unprecedented data that could be used to identify and quantify APA site usage. A number of computational approaches for identifying APA sites and/or dynamic APA events from RNA-seq data have emerged in the literature, which provide valuable yet preliminary results that should be refined to yield credible guidelines for the scientific community. In this review, we provided a comprehensive overview of the status of currently available computational approaches. We also conducted objective benchmarking analysis using RNA-seq data sets from different species (human, mouse and Arabidopsis) and simulated data sets to present a systematic evaluation of 11 representative methods. Our benchmarking study showed that the overall performance of all tools investigated is moderate, indicating that there is still considerable scope to improve the prediction of APA sites or dynamic APA events from RNA-seq data. In particular, prediction results from individual tools differ considerably, and only a limited number of predicted APA sites or genes are common among different tools. Accordingly, we attempted to give some advice on how to assess the reliability of the obtained results. We also proposed practical recommendations on the appropriate method applicable to diverse scenarios and discussed implications and future directions relevant to profiling APA from RNA-seq data.
Affiliation(s)
- Moliang Chen
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Guoli Ji
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Hongjuan Fu
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Qianmin Lin
  - Xiang'an Hospital of Xiamen University, Xiamen 361005, China
- Congting Ye
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Wenbin Ye
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China
- Yaru Su
  - College of Mathematics and Computer Science, Fuzhou University, Fuzhou 350116, China
- Xiaohui Wu
  - Department of Automation, Xiamen University, Xiamen 361005, China
  - Xiamen Research Institute of National Center of Healthcare Big Data, Xiamen 361005, China

18
Ye C, Long Y, Ji G, Li QQ, Wu X. APAtrap: identification and quantification of alternative polyadenylation sites from RNA-seq data. Bioinformatics 2019; 34:1841-1849. [PMID: 29360928 DOI: 10.1093/bioinformatics/bty029] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 09/30/2017] [Accepted: 01/17/2018] [Indexed: 12/28/2022]
Abstract
Motivation: Alternative polyadenylation (APA) has been increasingly recognized as a crucial mechanism that contributes to transcriptome diversity and gene expression regulation. As RNA-seq has become a routine protocol for transcriptome analysis, it is of great interest to leverage such unprecedented collection of RNA-seq data by new computational methods to extract and quantify APA dynamics in these transcriptomes. However, research progress in this area has been relatively limited. Conventional methods rely on either transcript assembly to determine transcript 3' ends or annotated poly(A) sites. Moreover, they can neither identify more than two poly(A) sites in a gene nor detect dynamic APA site usage considering more than two poly(A) sites.
Results: We developed an approach called APAtrap based on the mean squared error model to identify and quantify APA sites from RNA-seq data. APAtrap is capable of identifying novel 3' UTRs and 3' UTR extensions, which contributes to locating potential poly(A) sites in previously overlooked regions and improving genome annotations. APAtrap also aims to tally all potential poly(A) sites and detect genes with differential APA site usages between conditions. Extensive comparisons of APAtrap with two other latest methods, ChangePoint and DaPars, using various RNA-seq datasets from simulation studies, human and Arabidopsis demonstrate the efficacy and flexibility of APAtrap for any organisms with an annotated genome.
Availability and implementation: Freely available for download at https://apatrap.sourceforge.io.
Contact: liqq@xmu.edu.cn or xhuister@xmu.edu.cn.
Supplementary information: Supplementary data are available at Bioinformatics online.
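To illustrate what a mean-squared-error criterion for locating a poly(A) site can look like, the following is a minimal sketch and not APAtrap's actual implementation. It assumes a single 3' UTR with per-base read coverage and exactly one candidate site, and it picks the position at which a two-level step function fits the coverage with the lowest MSE. The function name `best_split_by_mse`, the `min_len` parameter, and the toy coverage values are hypothetical.

```python
from typing import List, Sequence, Tuple


def best_split_by_mse(coverage: Sequence[float], min_len: int = 1) -> Tuple[int, float]:
    """Return (split_index, mse) for the best two-segment constant fit.

    Hypothetical helper: coverage[:split_index] is modelled by its mean and
    coverage[split_index:] by its mean; the split minimizing the overall
    mean squared error is reported as the candidate poly(A) site.
    """
    n = len(coverage)
    if n < 2 * min_len:
        raise ValueError("coverage is too short for the requested min_len")

    # Prefix sums of x and x^2 give the squared error of any segment in O(1).
    prefix: List[float] = [0.0]
    prefix_sq: List[float] = [0.0]
    for x in coverage:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def segment_sse(lo: int, hi: int) -> float:
        """Sum of squared deviations from the mean over coverage[lo:hi]."""
        total = prefix[hi] - prefix[lo]
        total_sq = prefix_sq[hi] - prefix_sq[lo]
        return total_sq - total * total / (hi - lo)

    best_index, best_mse = -1, float("inf")
    for k in range(min_len, n - min_len + 1):
        mse = (segment_sse(0, k) + segment_sse(k, n)) / n
        if mse < best_mse:
            best_index, best_mse = k, mse
    return best_index, best_mse


# Toy coverage: high upstream of the candidate site, low downstream of it.
toy_coverage = [30, 31, 29, 30, 10, 11, 9, 10]
split_index, mse = best_split_by_mse(toy_coverage)
print(split_index, mse)
```

On the toy data the drop in coverage after the fourth base is recovered as the split. Handling more than one site per gene, which the abstract emphasizes, would need a multi-segment extension that this sketch does not attempt.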
Affiliation(s)
- Congting Ye
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Yuqi Long
  - Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Guoli Ji
  - Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
- Qingshun Quinn Li
  - Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
  - Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA 91766, USA
- Xiaohui Wu
  - Department of Automation, Xiamen University, Xiamen, Fujian 361005, China

19
Kim N, Chung W, Eum HH, Lee HO, Park WY. Alternative polyadenylation of single cells delineates cell types and serves as a prognostic marker in early stage breast cancer. PLoS One 2019; 14:e0217196. [PMID: 31100099 PMCID: PMC6524824 DOI: 10.1371/journal.pone.0217196] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 07/27/2018] [Accepted: 05/08/2019] [Indexed: 12/15/2022]
Abstract
Alternative polyadenylation (APA) in 3’ untranslated regions (3’ UTR) plays an important role in regulating transcript abundance, localization, and interaction with microRNAs. Length-variation of 3’UTRs by APA contributes to efficient proliferation of cancer cells. In this study, we investigated APA in single cancer cells and tumor microenvironment cells to understand the physiological implication of APA in different cell types. We analyzed APA patterns and the expression level of genes from the 515 single-cell RNA sequencing (scRNA-seq) dataset from 11 breast cancer patients. Although the overall 3’UTR length of individual genes was distributed equally in tumor and non-tumor cells, we found a differential pattern of polyadenylation in gene sets between tumor and non-tumor cells. In addition, we found a differential pattern of APA across tumor types using scRNA-seq data from 3 glioblastoma patients and 1 renal cell carcinoma patient. In detail, 1,176 gene sets and 53 genes showed a distinct pattern of 3’UTR shortening and over-expression as signatures for five cell types including B lymphocytes, T lymphocytes, myeloid cells, stromal cells, and breast cancer cells. Functional categories of gene sets for cellular proliferation demonstrated concordant regulation of APA and gene expression specific to cell types. The expression of APA genes in breast cancer was significantly correlated with the clinical outcome of earlier stage breast cancer patients. We identified cell type-specific APA in single cells, which allows the identification of cell types based on 3’UTR length variation in combination with gene expression. Specifically, an immune-specific APA signature in breast cancer could be utilized as a prognostic marker of early stage breast cancer.
Affiliation(s)
- Nayoung Kim
  - Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
- Woosung Chung
  - Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
- Hye Hyeon Eum
  - Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
- Hae-Ock Lee
  - Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
  - Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, South Korea
  - * E-mail: (HOL); (WYP)
- Woong-Yang Park
  - Samsung Genome Institute, Samsung Medical Center, Seoul, South Korea
  - Department of Molecular Cell Biology, Sungkyunkwan University School of Medicine, Suwon, South Korea
  - Department of Health Sciences and Technology, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, South Korea
  - GENINUS Inc., Seoul, South Korea
  - * E-mail: (HOL); (WYP)

20
Harrison BJ, Park JW, Gomes C, Petruska JC, Sapio MR, Iadarola MJ, Chariker JH, Rouchka EC. Detection of Differentially Expressed Cleavage Site Intervals Within 3' Untranslated Regions Using CSI-UTR Reveals Regulated Interaction Motifs. Front Genet 2019; 10:182. [PMID: 30915105 PMCID: PMC6422928 DOI: 10.3389/fgene.2019.00182] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/18/2018] [Accepted: 02/19/2019] [Indexed: 01/08/2023]
Abstract
The length of untranslated regions at the 3' end of transcripts (3'UTRs) is regulated by alternate polyadenylation (APA). 3'UTRs contain regions that harbor binding motifs for regulatory molecules. However, the mechanisms that coordinate the 3'UTR length of specific groups of transcripts are not well-understood. We therefore developed a method, CSI-UTR, that models 3'UTR structure as tandem segments between functional alternative-polyadenylation sites (termed cleavage site intervals-CSIs). This approach facilitated (1) profiling of 3'UTR isoform expression changes and (2) statistical enrichment of putative regulatory motifs. CSI-UTR analysis is UTR-annotation independent and can interrogate legacy data generated from standard RNA-Seq libraries. CSI-UTR identified a set of CSIs in human and rodent transcriptomes. Analysis of RNA-Seq datasets from neural tissue identified differential expression events within 3'UTRs not detected by standard gene-based differential expression analyses. Further, in many instances 3'UTR and CDS from the same gene were regulated differently. This modulation of motifs for RNA-interacting molecules with potential condition-dependent and tissue-specific RNA binding partners near the polyA signal and CSI junction may play a mechanistic role in the specificity of alternative polyadenylation. Source code, CSI BED files and example datasets are available at: https://github.com/UofLBioinformatics/CSI-UTR.
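The idea of treating a 3'UTR as tandem segments between cleavage sites can be sketched as follows. This is an illustrative reading of the abstract and not the CSI-UTR code: it assumes a plus-strand transcript, half-open genomic coordinates, and per-base coverage indexed from the start of the 3'UTR. The names `Interval`, `cleavage_site_intervals`, and `mean_coverage`, and all coordinates and coverage values, are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Interval(NamedTuple):
    start: int  # inclusive genomic coordinate
    end: int    # exclusive genomic coordinate


def cleavage_site_intervals(utr_start: int, cleavage_sites: Sequence[int]) -> List[Interval]:
    """Split a plus-strand 3'UTR into tandem segments ending at each cleavage site.

    Sites at or upstream of utr_start are ignored, and duplicates are merged.
    """
    sites = sorted({site for site in cleavage_sites if site > utr_start})
    intervals: List[Interval] = []
    previous = utr_start
    for site in sites:
        intervals.append(Interval(previous, site))
        previous = site
    return intervals


def mean_coverage(coverage: Sequence[float], utr_start: int, interval: Interval) -> float:
    """Mean per-base coverage of one interval; coverage[0] is the base at utr_start."""
    window = coverage[interval.start - utr_start:interval.end - utr_start]
    return sum(window) / len(window)


# Hypothetical 3'UTR starting at position 100 with three distinct cleavage sites.
intervals = cleavage_site_intervals(100, [250, 130, 400, 130])
coverage = [20.0] * 30 + [12.0] * 120 + [4.0] * 150
per_interval = [mean_coverage(coverage, 100, iv) for iv in intervals]
print(intervals)
print(per_interval)
```

Summarizing reads per interval rather than per gene is what would let a downstream differential expression test flag a change confined to one part of the 3'UTR; the statistical test itself is outside this sketch.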
Affiliation(s)
- Benjamin J Harrison
  - Department of Biomedical Sciences, Center for Excellence in the Neurosciences, College of Osteopathic Medicine, University of New England, Biddeford, ME, United States
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
- Juw Won Park
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
  - Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Louisville, KY, United States
- Cynthia Gomes
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
- Jeffrey C Petruska
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
  - Kentucky Spinal Cord Injury Research Center, University of Louisville, Louisville, KY, United States
  - Department of Neurological Surgery, University of Louisville, Louisville, KY, United States
- Matthew R Sapio
  - Department of Perioperative Medicine, Clinical Center, National Institutes of Health, Bethesda, MD, United States
- Michael J Iadarola
  - Department of Perioperative Medicine, Clinical Center, National Institutes of Health, Bethesda, MD, United States
- Julia H Chariker
  - Department of Anatomical Sciences and Neurobiology, University of Louisville, Louisville, KY, United States
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
- Eric C Rouchka
  - Kentucky Biomedical Research Infrastructure Network Bioinformatics Core, Louisville, KY, United States
  - Department of Computer Engineering and Computer Science, Speed School of Engineering, University of Louisville, Louisville, KY, United States

21
Xiang Y, Ye Y, Zhang Z, Han L. Maximizing the Utility of Cancer Transcriptomic Data. Trends Cancer 2018; 4:823-837. [PMID: 30470304 DOI: 10.1016/j.trecan.2018.09.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/23/2018] [Revised: 09/23/2018] [Accepted: 09/24/2018] [Indexed: 12/13/2022]
Abstract
Transcriptomic profiling has been applied to large numbers of cancer samples, by large-scale consortia, including The Cancer Genome Atlas, International Cancer Genome Consortium, and Cancer Cell Line Encyclopedia. Advances in mining cancer transcriptomic data enable us to understand the endless complexity of the cancer transcriptome and thereby to discover new biomarkers and therapeutic targets. In this paper, we review computational resources for deep mining of transcriptomic data to identify, quantify, and determine the functional effects and clinical utility of transcriptomic events, including noncoding RNAs, post-transcriptional regulation, exogenous RNAs, and transcribed genetic variants. These approaches can be applied to other complex diseases, thereby greatly leveraging the impact of this work.
Affiliation(s)
- Yu Xiang
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - These authors contributed equally
- Youqiong Ye
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - These authors contributed equally
- Zhao Zhang
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Leng Han
  - Department of Biochemistry and Molecular Biology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Center for Precision Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA

22
Chang JW, Zhang W, Yeh HS, Park M, Yao C, Shi Y, Kuang R, Yong J. An integrative model for alternative polyadenylation, IntMAP, delineates mTOR-modulated endoplasmic reticulum stress response. Nucleic Acids Res 2018; 46:5996-6008. [PMID: 29733382 PMCID: PMC6158760 DOI: 10.1093/nar/gky340] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/20/2017] [Revised: 04/11/2018] [Accepted: 04/20/2018] [Indexed: 12/18/2022]
Abstract
3'-untranslated regions (UTRs) can vary through the use of alternative polyadenylation sites during pre-mRNA processing. Multiple publicly available pipelines combining high profiling technologies and bioinformatics tools have been developed to catalog changes in 3'-UTR lengths. In our recent RNA-seq experiments using cells with hyper-activated mammalian target of rapamycin (mTOR), we found that cellular mTOR activation leads to transcriptome-wide alternative polyadenylation (APA), resulting in the activation of multiple cellular pathways. Here, we developed a novel bioinformatics algorithm, IntMAP, which integrates RNA-Seq and PolyA Site (PAS)-Seq data for a comprehensive characterization of APA events. By applying IntMAP to the datasets from cells with hyper-activated mTOR, we identified novel APA events that could otherwise not be identified by either profiling method alone. Several transcription factors including Cebpg (CCAAT/enhancer binding protein gamma) were among the newly discovered APA transcripts, indicating that diverse transcriptional networks may be regulated by mTOR-coordinated APA. The prevention of APA in Cebpg using the CRISPR/cas9-mediated genome editing tool showed that mTOR-driven 3'-UTR shortening in Cebpg is critical in protecting cells from endoplasmic reticulum (ER) stress. Taken together, we present IntMAP as a new bioinformatics algorithm for APA analysis by which we expand our understanding of the physiological role of mTOR-coordinated APA events to ER stress response. IntMAP toolbox is available at http://compbio.cs.umn.edu/IntMAP/.
Affiliation(s)
- Jae-Woong Chang
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Wei Zhang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Hsin-Sung Yeh
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Meeyeon Park
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Chengguo Yao
- Department of Microbiology and Molecular Genetics, University of California School of Medicine, Irvine, CA 92697, USA
- Yongsheng Shi
- Department of Microbiology and Molecular Genetics, University of California School of Medicine, Irvine, CA 92697, USA
- Rui Kuang
- Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, USA
23
Yeh HS, Zhang W, Yong J. Analyses of alternative polyadenylation: from old school biochemistry to high-throughput technologies. BMB Rep 2017; 50:201-207. [PMID: 28148393 PMCID: PMC5437964 DOI: 10.5483/bmbrep.2017.50.4.019] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/31/2017] [Indexed: 01/08/2023]
Abstract
Alterations in usage of polyadenylation sites during transcription termination yield transcript isoforms from a gene. Recent findings of transcriptome-wide alternative polyadenylation (APA) as a molecular response to changes in biology position APA not only as a molecular event of early transcriptional termination but also as a cellular regulatory step affecting various biological pathways. With the development of high-throughput profiling technologies at a single-nucleotide level and their applications targeted to the 3'-end of mRNAs, dynamics in the landscape of mRNA 3'-ends are measurable at a global scale. In this review, methods and technologies that have been adopted to study APA events are discussed. In addition, various bioinformatics algorithms for APA isoform analysis using publicly available RNA-seq datasets are introduced. [BMB Reports 2017; 50(4): 201-207].
Affiliation(s)
- Hsin-Sung Yeh
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Wei Zhang
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minnesota 55455, USA
24
Huang Z, Teeling EC. ExUTR: a novel pipeline for large-scale prediction of 3'-UTR sequences from NGS data. BMC Genomics 2017; 18:847. [PMID: 29110697 PMCID: PMC5674806 DOI: 10.1186/s12864-017-4241-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 03/17/2017] [Accepted: 10/25/2017] [Indexed: 12/11/2022]
Abstract
Background: The three prime untranslated region (3′-UTR) is known to play a pivotal role in modulating gene expression by determining the fate of mRNA. Many crucial developmental events, such as mammalian spermatogenesis, tissue patterning, sex determination and neurogenesis, rely heavily on post-transcriptional regulation by the 3′-UTR. However, 3′-UTR biology seems to be a relatively untapped field, with only limited tools and 3′-UTR resources available. To elucidate the regulatory mechanisms of the 3′-UTR on gene expression, the 3′-UTR sequences must first be identified. Current 3′-UTR mining tools, such as GETUTR, 3USS and UTRscan, all depend on a well-annotated reference genome or curated 3′-UTR sequences, which hinders their application to the myriad non-model organisms for which genomes are not available. To address these issues, the establishment of an NGS-based, automated pipeline is urgently needed for genome-wide 3′-UTR prediction in the absence of reference genomes.
Results: Here, we propose ExUTR, a novel NGS-based pipeline to predict and retrieve 3′-UTR sequences from RNA-Seq experiments, particularly designed for non-model species lacking well-annotated genomes. This pipeline integrates cutting-edge bioinformatics tools, databases (Uniprot and UTRdb) and novel in-house Perl scripts, implementing a fully automated workflow. By taking transcriptome assemblies as inputs, this pipeline identifies 3′-UTR signals based primarily on the intrinsic features of transcripts, and outputs predicted 3′-UTR candidates together with associated annotations. In addition, ExUTR only requires minimal computational resources, which facilitates its implementation on a standard desktop computer with reasonable runtime, making it affordable to use for most laboratories. We also demonstrate the functionality and extensibility of this pipeline using publicly available RNA-Seq data from both model and non-model species, and further validate the accuracy of predicted 3′-UTRs using both well-characterized 3′-UTR resources and 3P-Seq data.
Conclusions: ExUTR is a practical and powerful workflow that enables rapid genome-wide 3′-UTR discovery from NGS data. The candidates predicted through this pipeline will further advance the study of miRNA target prediction, cis elements in 3′-UTRs and the evolution and biology of 3′-UTRs. Being independent of a well-annotated reference genome will dramatically expand its application to a much broader research area, encompassing all species for which RNA-Seq is available.
Electronic supplementary material: The online version of this article (10.1186/s12864-017-4241-1) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Zixia Huang
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
- Emma C Teeling
- UCD School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, Ireland
25
Szkop KJ, Nobeli I. Untranslated Parts of Genes Interpreted: Making Heads or Tails of High-Throughput Transcriptomic Data via Computational Methods: Computational methods to discover and quantify isoforms with alternative untranslated regions. Bioessays 2017; 39. [PMID: 29052251 DOI: 10.1002/bies.201700090] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 05/28/2017] [Revised: 09/12/2017] [Indexed: 01/07/2023]
Abstract
In this review we highlight the importance of defining the untranslated parts of transcripts, and present a number of computational approaches for the discovery and quantification of alternative transcription start and poly-adenylation events in high-throughput transcriptomic data. The fate of eukaryotic transcripts is closely linked to their untranslated regions, which are determined by the position at which transcription starts and ends at a genomic locus. Although the extent of alternative transcription starts and alternative poly-adenylation sites has been revealed by sequencing methods focused on the ends of transcripts, the application of these methods is not yet widely adopted by the community. We suggest that computational methods applied to standard high-throughput technologies are a useful, albeit less accurate, alternative to the expertise-demanding 5' and 3' sequencing and they are the only option for analysing legacy transcriptomic data. We review these methods here, focusing on technical challenges and arguing for the need to include better normalization of the data and more appropriate statistical models of the expected variation in the signal.
Affiliation(s)
- Krzysztof J Szkop
- Institute of Structural and Molecular Biology, Department of Biological Sciences Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
- Irene Nobeli
- Institute of Structural and Molecular Biology, Department of Biological Sciences Birkbeck, University of London, Malet Street, London WC1E 7HX, UK
26
Grassi E, Mariella E, Lembo A, Molineris I, Provero P. Roar: detecting alternative polyadenylation with standard mRNA sequencing libraries. BMC Bioinformatics 2016; 17:423. [PMID: 27756200 PMCID: PMC5069797 DOI: 10.1186/s12859-016-1254-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 03/30/2016] [Accepted: 09/08/2016] [Indexed: 02/04/2023]
Abstract
BACKGROUND: Post-transcriptional regulation is a complex mechanism that plays a central role in defining multiple cellular identities starting from a common genome. Modifications in the length of 3' UTRs have been found to play an important role in this context, since alternative 3' UTRs could lead to differences, for example, in regulation by microRNAs and in cellular localization of the transcripts, thus altering their fate.
RESULTS: We propose a strategy to identify the genes undergoing regulation of 3' UTR length using RNA sequencing data obtained from standard libraries, making it widely applicable to data originally obtained to perform classical differential expression analyses. We decided to exploit previously annotated APA sites from public databases, in contrast with other recently proposed approaches in which the location of the APA site is inferred from the data together with the relative abundance of the isoforms. We demonstrate the reliability of our method by comparing it to the results of other methods based on microarrays or on specific RNA-seq libraries, and show that using APA site databases results in higher sensitivity compared to a de novo site prediction approach.
CONCLUSIONS: We implemented the algorithm in a Bioconductor package to facilitate its broad usage in the scientific community. The ability of this approach to detect shortening from libraries with a number of reads comparable to that needed for differential expression analyses makes it useful for investigating whether alternative polyadenylation is relevant in a certain biological process without requiring specific experimental assays.
Affiliation(s)
- Elena Grassi
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Elisa Mariella
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Antonio Lembo
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Ivan Molineris
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Paolo Provero
- Department of Molecular Biotechnology and Health Sciences, Molecular Biotechnology Center, Via Nizza 52, Torino, 10126, Italy
- Center for Translational Genomics and Bioinformatics, San Raffaele Scientific Institute, Via Olgettina 60, Milan, 20132, Italy
27
|
Erson-Bensan AE, Can T. Alternative Polyadenylation: Another Foe in Cancer. Mol Cancer Res 2016; 14:507-17. [PMID: 27075335 DOI: 10.1158/1541-7786.mcr-15-0489] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 12/16/2015] [Accepted: 03/30/2016] [Indexed: 11/16/2022]
Abstract
Advancements in sequencing and transcriptome analysis methods have led to seminal discoveries that have begun to unravel the complexity of cancer. These studies are paving the way toward the development of improved diagnostics, prognostic predictions, and targeted treatment options. However, it is clear that pieces of the cancer puzzle are still missing. In an effort to have a more comprehensive understanding of the development and progression of cancer, we have come to appreciate the value of the noncoding regions of our genomes, partly due to the discovery of miRNAs and their significance in gene regulation. Interestingly, the miRNA-mRNA interactions are not solely dependent on variations in miRNA levels. Instead, the majority of genes harbor multiple polyadenylation signals on their 3' UTRs (untranslated regions) that can be differentially selected on the basis of the physiologic state of cells, resulting in alternative 3' UTR isoforms. Deregulation of alternative polyadenylation (APA) is of increasing interest in cancer research, because APA generates mRNA 3' UTR isoforms with potentially different stabilities, subcellular localizations, translation efficiencies, and functions. This review focuses on the link between APA and cancer and discusses the mechanisms as well as the tools available for investigating APA events in cancer. Overall, detection of deregulated APA-generated isoforms in cancer may implicate some proto-oncogene activation cases of unknown causes and may help the discovery of novel cases, thus contributing to a better understanding of molecular mechanisms of cancer. Mol Cancer Res; 14(6); 507-17. ©2016 AACR.
Affiliation(s)
- Ayse Elif Erson-Bensan
- Department of Biological Sciences, Middle East Technical University (METU) (ODTU), Ankara, Turkey
- Tolga Can
- Department of Computer Engineering, Middle East Technical University (METU) (ODTU), Ankara, Turkey