1
Liu T, Gu J, Li C, Guo M, Yuan L, Lv Q, Qin C, Du M, Chu H, Liu H, Zhang Z. Alternative polyadenylation-related genetic variants contribute to bladder cancer risk. J Biomed Res 2023; 37:405-417. [PMID: 37936490 PMCID: PMC10687529 DOI: 10.7555/jbr.37.20230063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 05/26/2023] [Accepted: 05/28/2023] [Indexed: 11/09/2023] Open
Abstract
Aberrant alternative polyadenylation (APA) events play an important role in cancers, but little is known about whether APA-related genetic variants contribute to bladder cancer susceptibility. A previous genome-wide association study performed APA quantitative trait loci (apaQTL) analyses in bladder cancer and identified 17 955 single nucleotide polymorphisms (SNPs). We found that the genes whose APA was affected by apaQTL-associated SNPs were closely correlated with cancer signaling pathways, high mutational burden, and immune infiltration. Association analysis showed that the apaQTL-associated SNPs rs34402449 C>A, rs2683524 C>T, and rs11540872 C>G were significantly associated with susceptibility to bladder cancer (rs34402449: OR = 1.355, 95% confidence interval [CI]: 1.159-1.583, P = 1.33 × 10⁻⁴; rs2683524: OR = 1.378, 95% CI: 1.164-1.632, P = 2.03 × 10⁻⁴; rs11540872: OR = 1.472, 95% CI: 1.193-1.815, P = 3.06 × 10⁻⁴). Cumulative effect analysis showed that the number of risk genotypes and smoking status were significantly associated with an increased risk of bladder cancer (P_trend = 2.87 × 10⁻¹²). We found that PRR13, which showed the most significant effect on cell proliferation in bladder cancer cell lines, was more highly expressed in bladder cancer tissues than in adjacent normal tissues. Moreover, the rs2683524 T allele was correlated with shorter 3' untranslated regions of PRR13 and increased PRR13 expression levels. Collectively, our findings provide informative apaQTL resources and insights into the regulatory mechanisms linking apaQTL-associated variants to bladder cancer risk.
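The reported odds ratios, confidence intervals, and P values can be cross-checked against one another. The following is a minimal sketch, assuming the intervals are Wald intervals that are symmetric on the log-odds scale (the abstract does not state the model used); the function name `wald_from_ci` is hypothetical and is not taken from the paper.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), the Wald z statistic and a two-sided p value
    from an odds ratio and its 95% CI (assumed symmetric on the log scale)."""
    log_or = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as reported in the abstract: (OR, CI lower, CI upper).
reported = {
    "rs34402449": (1.355, 1.159, 1.583),
    "rs2683524": (1.378, 1.164, 1.632),
    "rs11540872": (1.472, 1.193, 1.815),
}

results = {snp: wald_from_ci(*vals) for snp, vals in reported.items()}
for snp, (se, z, p) in results.items():
    print(f"{snp}: SE(log OR)={se:.4f}, z={z:.2f}, p={p:.2e}")
```

Under that assumption the recomputed P values should land close to the reported ones; small differences would be expected from rounding of the published interval bounds.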
Affiliation(s)
- Ting Liu
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Jingjing Gu
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Chuning Li
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Mengfan Guo
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Lin Yuan
- Department of Urology, Jiangsu Province Hospital of Traditional Chinese Medicine, Nanjing, Jiangsu 210029, China
- Qiang Lv
- Department of Urology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210029, China
- Chao Qin
- Department of Urology, the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu 210029, China
- Mulong Du
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Haiyan Chu
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Hanting Liu
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Zhengdong Zhang
- Department of Environmental Genomics, Jiangsu Key Laboratory of Cancer Biomarkers, Prevention and Treatment, Collaborative Innovation Center for Cancer Personalized Medicine, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
- Department of Genetic Toxicology, the Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu 211166, China
2
Kang B, Yang Y, Hu K, Ruan X, Liu YL, Lee P, Lee J, Wang J, Zhang X. Infernape uncovers cell type-specific and spatially resolved alternative polyadenylation in the brain. Genome Res 2023; 33:1774-1787. [PMID: 37907328 PMCID: PMC10691540 DOI: 10.1101/gr.277864.123] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/07/2023] [Accepted: 09/12/2023] [Indexed: 11/02/2023]
Abstract
Differential polyadenylation sites (PAs) critically regulate gene expression, but their cell type-specific usage and spatial distribution in the brain have not been systematically characterized. Here, we present Infernape, which infers and quantifies PA usage from single-cell and spatial transcriptomic data and show its application in the mouse brain. Infernape uncovers alternative intronic PAs and 3'-UTR lengthening during cortical neurogenesis. Progenitor-neuron comparisons in the excitatory and inhibitory neuron lineages show overlapping PA changes in embryonic brains, suggesting that the neural proliferation-differentiation axis plays a prominent role. In the adult mouse brain, we uncover cell type-specific PAs and visualize such events using spatial transcriptomic data. Over two dozen neurodevelopmental disorder-associated genes such as Csnk2a1 and Mecp2 show differential PAs during brain development. This study presents Infernape to identify PAs from scRNA-seq and spatial data, and highlights the role of alternative PAs in neuronal gene regulation.
Affiliation(s)
- Bowei Kang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Yalan Yang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Kaining Hu
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Xiangbin Ruan
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Yi-Lin Liu
- Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA
- Pinky Lee
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Jasper Lee
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Jingshu Wang
- Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA
- Xiaochang Zhang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- The Neuroscience Institute, The University of Chicago, Chicago, Illinois 60637, USA
3
Moon Y, Burri D, Zavolan M. Identification of experimentally-supported poly(A) sites in single-cell RNA-seq data with SCINPAS. NAR Genom Bioinform 2023; 5:lqad079. [PMID: 37705828 PMCID: PMC10495540 DOI: 10.1093/nargab/lqad079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 08/15/2023] [Accepted: 08/23/2023] [Indexed: 09/15/2023] Open
Abstract
Alternative polyadenylation is a main driver of transcriptome diversity in mammals, generating transcript isoforms with different 3' ends via cleavage and polyadenylation at distinct polyadenylation (poly(A)) sites. The regulation of cell type-specific poly(A) site choice is not completely resolved, and requires quantitative poly(A) site usage data across cell types. 3' end-based single-cell RNA-seq can now be broadly used to obtain such data, enabling the identification and quantification of poly(A) sites with direct experimental support. We propose SCINPAS, a computational method to identify poly(A) sites from scRNA-seq datasets. SCINPAS modifies the read deduplication step to favor the selection of distal reads and extract those with non-templated poly(A) tails. This approach improves the resolution of poly(A) site recovery relative to standard software. SCINPAS identifies poly(A) sites in genic and non-genic regions, providing complementary information relative to other tools. The workflow is modular, and the key read deduplication step is general, enabling the use of SCINPAS in other typical analyses of single cell gene expression. Taken together, we show that SCINPAS is able to identify experimentally-supported, known and novel poly(A) sites from 3' end-based single-cell RNA sequencing data.
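The two ideas named in the abstract, deduplication that favors distal reads and selection of reads with non-templated poly(A) tails, can be illustrated with a toy example. This is a rough sketch under stated assumptions and not the SCINPAS implementation: the `Read` record, the grouping key, the `min_len` threshold, and the use of a soft-clipped 3' end (given in transcript orientation) as a proxy for "non-templated" are all hypothetical simplifications.

```python
from collections import namedtuple

# Hypothetical, simplified read record; a real pipeline would parse BAM alignments.
# softclip_3p holds the unaligned 3' bases in transcript orientation (assumption).
Read = namedtuple("Read", "cell umi chrom strand start end softclip_3p")

def pick_distal(reads):
    """Keep one read per (cell, UMI, chrom, strand): the one with the most distal 3' end."""
    best = {}
    for r in reads:
        key = (r.cell, r.umi, r.chrom, r.strand)
        # On '+' a larger end is more distal; on '-' a smaller start is more distal.
        three_prime = r.end if r.strand == "+" else -r.start
        if key not in best or three_prime > best[key][0]:
            best[key] = (three_prime, r)
    return [r for _, r in best.values()]

def has_polya_tail(read, min_len=5):
    """True if the soft-clipped 3' end consists only of A's and has at least min_len bases."""
    tail = read.softclip_3p.upper()
    return len(tail) >= min_len and set(tail) == {"A"}

reads = [
    Read("c1", "u1", "chr1", "+", 100, 180, ""),
    Read("c1", "u1", "chr1", "+", 120, 200, "AAAAAAAA"),
    Read("c1", "u2", "chr1", "-", 300, 380, "AAAAAA"),
    Read("c1", "u2", "chr1", "-", 320, 400, ""),
]

kept = pick_distal(reads)
# Candidate poly(A) sites: the 3' end coordinate of each tail-supported read.
sites = [
    (r.chrom, r.strand, r.end if r.strand == "+" else r.start)
    for r in kept
    if has_polya_tail(r)
]
print(sites)
```

A conventional deduplication step would keep an arbitrary or highest-quality read per UMI, which can discard the one read that actually reaches the cleavage site; preferring the distal read is what keeps the tail evidence available in this sketch.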
Affiliation(s)
- Youngbin Moon
- Computational and Systems Biology, Biozentrum University of Basel, Spitalstrasse 41, CH-4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
- Dominik Burri
- Computational and Systems Biology, Biozentrum University of Basel, Spitalstrasse 41, CH-4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
- Mihaela Zavolan
- Computational and Systems Biology, Biozentrum University of Basel, Spitalstrasse 41, CH-4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
4
Zhang Y, Zhou R, Liu L, Chen L, Wang Y. Trackplot: A flexible toolkit for combinatorial analysis of genomic data. PLoS Comput Biol 2023; 19:e1011477. [PMID: 37669275 PMCID: PMC10503704 DOI: 10.1371/journal.pcbi.1011477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 09/15/2023] [Accepted: 08/29/2023] [Indexed: 09/07/2023] Open
Abstract
Here, we introduce Trackplot, a Python package for generating publication-quality visualizations through a programmable and interactive web-based approach. Compared with existing programs for generating sashimi plots, Trackplot offers a versatile platform for visually interpreting genomic data from a wide variety of sources, including gene annotation with functional domain mapping, isoform expression, isoform structures identified by scRNA-seq and long-read sequencing, as well as chromatin accessibility and architecture, without any preprocessing. It also offers a broad range of output file formats that satisfy the requirements of major journals. Trackplot is open-source software that is freely available on Bioconda (https://anaconda.org/bioconda/trackplot), Docker (https://hub.docker.com/r/ygidtu/trackplot), PyPI (https://pypi.org/project/trackplot/) and GitHub (https://github.com/ygidtu/trackplot), and a built-in web server for local deployment is also provided.
Affiliation(s)
- Yiming Zhang
- Department of Neurosurgery and State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Institute of Thoracic Oncology and Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Ran Zhou
- Department of Neurosurgery and State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Lunxu Liu
- Institute of Thoracic Oncology and Department of Thoracic Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan, China
- Lu Chen
- State Key Laboratory of Biotherapy, West China Second University Hospital, Sichuan University, Chengdu, Sichuan, China
- Yuan Wang
- Department of Neurosurgery and State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, China
5
Oreper D, Klaeger S, Jhunjhunwala S, Delamarre L. The peptide woods are lovely, dark and deep: Hunting for novel cancer antigens. Semin Immunol 2023; 67:101758. [PMID: 37027981 DOI: 10.1016/j.smim.2023.101758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 03/22/2023] [Accepted: 03/22/2023] [Indexed: 04/08/2023]
Abstract
Harnessing the patient's immune system to control a tumor is a proven avenue for cancer therapy. T cell therapies as well as therapeutic vaccines, which target specific antigens of interest, are being explored as treatments in conjunction with immune checkpoint blockade. For these therapies, selecting the best suited antigens is crucial. Most of the focus has thus far been on neoantigens that arise from tumor-specific somatic mutations. Although there is clear evidence that T-cell responses against mutated neoantigens are protective, the large majority of these mutations are not immunogenic. In addition, most somatic mutations are unique to each individual patient and their targeting requires the development of individualized approaches. Therefore, novel antigen types are needed to broaden the scope of such treatments. We review high throughput approaches for discovering novel tumor antigens and some of the key challenges associated with their detection, and discuss considerations when selecting tumor antigens to target in the clinic.
Affiliation(s)
- Daniel Oreper
- Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
- Susan Klaeger
- Genentech, 1 DNA Way, South San Francisco, CA 94080, USA
6
Winstanley-Zarach P, Rot G, Kuba S, Smagul A, Peffers MJ, Tew SR. Analysis of RNA Polyadenylation in Healthy and Osteoarthritic Human Articular Cartilage. Int J Mol Sci 2023; 24:6611. [PMID: 37047586 PMCID: PMC10094766 DOI: 10.3390/ijms24076611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 03/17/2023] [Accepted: 03/27/2023] [Indexed: 04/05/2023] Open
Abstract
Polyadenylation (polyA) defines the 3' boundary of a transcript's genetic information. Its position can vary, and alternative polyadenylation (APA) transcripts can exist for a gene. This causes variance in 3' regulatory domains and can affect the coding sequence if intronic events occur. The distribution of polyA sites on articular chondrocyte transcripts has not been studied, so we aimed to define their transcriptome-wide location in age-matched healthy and osteoarthritic knee articular cartilage. Total RNA was isolated from frozen tissue samples and analysed using the QuantSeq-Reverse 3' RNA sequencing approach, where each read runs 3' to 5' from within the polyA tail into the transcript and contains a distinct polyA site. Transcript expression was significantly altered between healthy and osteoarthritic samples, with enrichment for functions that were strongly associated with joint pathology. Subsequent examination of polyA site data allowed us to define the extent of site usage across all the samples. When comparing healthy and osteoarthritic samples, we found that differential use of polyadenylation sites was modest. However, in the genes affected, there was potential for the APA to have functional relevance. We have characterised the polyadenylation landscape of human knee articular chondrocytes and conclude that osteoarthritis does not elicit a widespread change in their polyadenylation site usage. This finding differentiates knee osteoarthritis from pathologies such as cancer where APA is more commonly observed.
Affiliation(s)
- Phaedra Winstanley-Zarach
- Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L7 8TX, UK
- Gregor Rot
- Institute of Molecular Life Sciences, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Swiss Institute of Bioinformatics, Amphipôle, Quartier UNIL-Sorge, 1015 Lausanne, Switzerland
- Shweta Kuba
- Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L7 8TX, UK
- School of Health and Life Sciences, National Horizons Centre, Teesside University, Darlington DL1 1HG, UK
- Aibek Smagul
- Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L7 8TX, UK
- Mandy J. Peffers
- Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L7 8TX, UK
- Simon R. Tew
- Centre for Integrated Research into Musculoskeletal Ageing (CIMA), Department of Musculoskeletal and Ageing Science, Institute of Life Course and Medical Sciences, University of Liverpool, Liverpool L7 8TX, UK
7
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Affiliation(s)
- Min Su
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
- Tao Pan
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
- Qiu-Zhen Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
- Wei-Wei Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Yi Gong
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
- Department of Immunology, Nanjing Medical University, Nanjing, 211166, China
- Gang Xu
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
- Huan-Yu Yan
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
- Si Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
- Qiao-Zhen Shi
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
- Ya Zhang
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
- Xiao He
- Department of Laboratory Medicine, Women and Children's Hospital of Chongqing Medical University, Chongqing, 401174, China
- Shi-Cai Fan
- Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110, Guangdong, China
- Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081, Heilongjiang, China
- Murray J Cairns
- School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW, 2308, Australia
- Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW, 2305, Australia
- Xi Wang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166, China
- Yong-Sheng Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199, Hainan, China
8
Ye W, Lian Q, Ye C, Wu X. A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022:S1672-0229(22)00121-8. [PMID: 36167284 PMCID: PMC10372920 DOI: 10.1016/j.gpb.2022.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/19/2022] [Revised: 08/17/2022] [Accepted: 09/19/2022] [Indexed: 05/08/2023]
Abstract
Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. Additionally, we highlighted outstanding challenges and opportunities using new machine learning and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3' untranslated region, tissue-specific, cross-species, and single-cell pA prediction.
Affiliation(s)
- Wenbin Ye
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Qiwei Lian
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Department of Automation, Xiamen University, Xiamen 361005, China
- Congting Ye
- Key Laboratory of the Coastal and Wetland Ecosystems, Ministry of Education, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
- Xiaohui Wu
- Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China