1
Dehghannasiri R, Kokot M, Starr AL, Maziarz J, Gordon T, Tan SY, Wang PL, Voskoboynik A, Musser JM, Deorowicz S, Salzman J. sc-SPLASH provides ultra-efficient reference-free discovery in barcoded single-cell sequencing. bioRxiv 2024:2024.12.24.630263. [PMID: 39763839 PMCID: PMC11703226 DOI: 10.1101/2024.12.24.630263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2025]
Abstract
Typical high-throughput single-cell RNA-sequencing (scRNA-seq) analyses are primarily conducted by (pseudo)alignment, through the lens of annotated gene models, and aimed at detecting differential gene expression. This misses diversity generated by other mechanisms that diversify the transcriptome such as splicing and V(D)J recombination, and is blind to sequences missing from imperfect reference genomes. Here, we present sc-SPLASH, a highly efficient pipeline that extends our SPLASH framework for statistics-first, reference-free discovery to barcoded scRNA-seq (10x Chromium) and spatial transcriptomics (10x Visium); we also provide its optimized module for preprocessing and k-mer counting in barcoded data, BKC, as a standalone tool. sc-SPLASH rediscovers known biology, including V(D)J recombination and cell-type-specific alternative splicing in human and trans-splicing in tunicate (Ciona), and, when applied to spatial datasets, detects sequence variation including tumor-specific somatic mutation. In sponge (Spongilla) and tunicate (Ciona), we uncover secreted repeat proteins expressed in immune-type cells and regulated during development; the sponge genes were absent from the reference assembly. sc-SPLASH provides a powerful alternative tool for exploring transcriptomes that is applicable to the breadth of life's diversity.
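As a rough illustration of what k-mer counting in barcoded data involves, the sketch below tallies k-mers separately for each cell barcode. It is not BKC's implementation: the function name `count_kmers_per_barcode`, the (barcode, sequence) input format, and the choice k = 4 are assumptions made only for this example.

```python
from collections import Counter, defaultdict

def count_kmers_per_barcode(reads, k):
    """Tally every k-mer of each read under that read's cell barcode.

    `reads` is an iterable of (barcode, sequence) pairs; this input format is
    a hypothetical simplification, not BKC's actual interface.
    """
    counts = defaultdict(Counter)
    for barcode, seq in reads:
        # Slide a window of width k along the read.
        for i in range(len(seq) - k + 1):
            counts[barcode][seq[i:i + k]] += 1
    return counts

# Toy input: two reads from one cell, one read from another.
reads = [
    ("AAAC", "ACGTAC"),
    ("AAAC", "CGTACG"),
    ("TTTG", "ACGTAC"),
]
counts = count_kmers_per_barcode(reads, k=4)
print(counts["AAAC"]["CGTA"])  # 2: the k-mer occurs once in each of the cell's two reads
```

Keeping one counter per barcode means downstream statistics can compare k-mer abundance across cells without any reference genome, which is the general idea the abstract describes.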
Affiliation(s)
- Marek Kokot
- Department of Algorithmics and Software, Silesian University of Technology, Gliwice, Poland
- Jamie Maziarz
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, 06511, USA
- Tal Gordon
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, 94305, USA
- Serena Y. Tan
- Department of Pathology, Stanford University Medical Center, Stanford, 94305, USA
- Peter L. Wang
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, USA
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
- Ayelet Voskoboynik
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, 94305, USA
- Department of Biology, Hopkins Marine Station, Stanford University, Pacific Grove, 93950, USA
- Jacob M. Musser
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, 06511, USA
- Wu Tsai Institute, Yale University, New Haven, 06510, USA
- Julia Salzman
- Department of Biomedical Data Science, Stanford University, Stanford, 94305, USA
- Department of Biochemistry, Stanford University, Stanford, 94305, USA
- Department of Statistics (by courtesy), Stanford University, Stanford, 94305, USA
- Department of Biology (by courtesy), Stanford University, Stanford, 94305, CA, USA
2
Prater KE, Lin KZ. All the single cells: Single-cell transcriptomics/epigenomics experimental design and analysis considerations for glial biologists. Glia 2024. [PMID: 39558887 DOI: 10.1002/glia.24633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Revised: 09/18/2024] [Accepted: 10/10/2024] [Indexed: 11/20/2024]
Abstract
Single-cell transcriptomics, epigenomics, and other 'omics applied at single-cell resolution can significantly advance hypotheses and understanding of glial biology. Omics technologies are revealing a large and growing number of new glial cell subtypes, defined by their gene expression profile. These subtypes have significant implications for understanding glial cell function, cell-cell communications, and glia-specific changes between homeostasis and conditions such as neurological disease. For many, the training in how to analyze, interpret, and understand these large datasets has been through reading and understanding literature from other fields like biostatistics. Here, we provide a primer for glial biologists on experimental design and analysis of single-cell RNA-seq datasets. Our goal is to further the understanding of why decisions are made about datasets and to enhance biologists' ability to interpret and critique their work and the work of others. We review the steps involved in single-cell analysis with a focus on decision points and particular notes for glia. The goal of this primer is to ensure that single-cell 'omics experiments continue to advance glial biology in a rigorous and replicable way.
Affiliation(s)
- Katherine E Prater
- Department of Neurology, School of Medicine, University of Washington, Seattle, Washington, USA
- Kevin Z Lin
- Department of Biostatistics, University of Washington, Seattle, Washington, USA
3
Bi X, Zhu S, Liu F, Wu X. Dynamics of alternative polyadenylation in single root cells of Arabidopsis thaliana. Front Plant Sci 2024; 15:1437118. [PMID: 39372861 PMCID: PMC11449893 DOI: 10.3389/fpls.2024.1437118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Accepted: 09/02/2024] [Indexed: 10/08/2024]
Abstract
Introduction: Single-cell RNA-seq (scRNA-seq) technologies have been widely used to reveal the diversity and complexity of cells, and pioneering scRNA-seq studies in plants have emerged since 2019. However, existing plant scRNA-seq studies have focused only on the regulation of gene expression. As an essential post-transcriptional mechanism for regulating gene expression, alternative polyadenylation (APA) generates diverse mRNA isoforms with distinct 3' ends through the selective use of different polyadenylation sites in a gene. APA plays important roles in regulating multiple developmental processes in plants, such as flowering time and stress response.
Methods: In this study, we developed a pipeline to identify and integrate APA sites from different scRNA-seq data and analyze APA dynamics in single cells. First, high-confidence poly(A) sites in single root cells were identified and quantified. Second, three kinds of APA markers were identified for exploring APA dynamics in single cells: differentially expressed poly(A) sites based on APA site expression, APA markers based on APA usage, and APA switching genes based on 3' UTR (untranslated region) length change. Moreover, cell type annotations of single root cells were refined by integrating both the APA information and the gene expression profile.
Results: We comprehensively compiled a single-cell APA atlas from five scRNA-seq studies, covering over 150,000 cells spanning four major tissue branches, twelve cell types, and three developmental stages. Moreover, we quantified dynamic APA usage in single cells and identified APA markers across tissues and cell types. Further, we integrated complementary information from gene expression and APA profiles to annotate cell types and reveal subtle differences between cell types.
Discussion: This study reveals that APA provides an additional layer of information for determining cell identity and provides a landscape of APA dynamics during Arabidopsis root development.
Affiliation(s)
- Xingyu Bi
- Cancer Institute, Suzhou Medical College, Soochow University, Suzhou, China
- Sheng Zhu
- Operational Technology Research and Evaluation Center, China Nuclear Power Operation Technology Corporation, Ltd, Wuhan, China
- Fei Liu
- Cancer Institute, Suzhou Medical College, Soochow University, Suzhou, China
- Xiaohui Wu
- Cancer Institute, Suzhou Medical College, Soochow University, Suzhou, China
- Jiangsu Key Laboratory of Infection and Immunity, Soochow University, Suzhou, China
4
Kowalski MH, Wessels HH, Linder J, Dalgarno C, Mascio I, Choudhary S, Hartman A, Hao Y, Kundaje A, Satija R. Multiplexed single-cell characterization of alternative polyadenylation regulators. Cell 2024; 187:4408-4425.e23. [PMID: 38925112 DOI: 10.1016/j.cell.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 03/12/2024] [Accepted: 06/05/2024] [Indexed: 06/28/2024]
Abstract
Most mammalian genes have multiple polyA sites, representing a substantial source of transcript diversity regulated by the cleavage and polyadenylation (CPA) machinery. To better understand how these proteins govern polyA site choice, we introduce CPA-Perturb-seq, a multiplexed perturbation screen dataset of 42 CPA regulators with a 3' scRNA-seq readout that enables transcriptome-wide inference of polyA site usage. We develop a framework to detect perturbation-dependent changes in polyadenylation and characterize modules of co-regulated polyA sites. We find groups of intronic polyA sites regulated by distinct components of the nuclear RNA life cycle, including elongation, splicing, termination, and surveillance. We train and validate a deep neural network (APARENT-Perturb) for tandem polyA site usage, delineating a cis-regulatory code that predicts perturbation response and reveals interactions between regulatory complexes. Our work highlights the potential for multiplexed single-cell perturbation screens to further our understanding of post-transcriptional regulation.
Affiliation(s)
- Madeline H Kowalski
- New York Genome Center, New York, NY, USA; Center for Genomics and Systems Biology, New York University, New York, NY, USA; New York University Grossman School of Medicine, New York, NY, USA
- Hans-Hermann Wessels
- New York Genome Center, New York, NY, USA; Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Johannes Linder
- Department of Genetics, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
- Isabella Mascio
- New York Genome Center, New York, NY, USA; Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Saket Choudhary
- New York Genome Center, New York, NY, USA; Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Yuhan Hao
- New York Genome Center, New York, NY, USA; Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
- Rahul Satija
- New York Genome Center, New York, NY, USA; Center for Genomics and Systems Biology, New York University, New York, NY, USA; New York University Grossman School of Medicine, New York, NY, USA
5
Zhao Z, Chen Y, Zou X, Lin L, Zhou X, Cheng X, Yang G, Xu Q, Gong L, Li L, Ni T. Pan-cancer transcriptome analysis reveals widespread regulation through alternative tandem transcription initiation. Sci Adv 2024; 10:eadl5606. [PMID: 38985880 PMCID: PMC11235174 DOI: 10.1126/sciadv.adl5606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Accepted: 06/05/2024] [Indexed: 07/12/2024]
Abstract
Abnormal transcription initiation from alternative first exons has been reported to promote tumorigenesis. However, the prevalence and impact of gene expression regulation mediated by alternative tandem transcription initiation were mostly unknown in cancer. Here, we developed a robust computational method to analyze alternative tandem transcription start site (TSS) usage from standard RNA sequencing data. Applying this method to pan-cancer RNA sequencing datasets, we observed widespread dysregulation of tandem TSS usage in tumors, much of which was independent of changes in overall expression level or alternative first exon usage. We showed that the dynamics of tandem TSS usage were associated with epigenomic modulation. We found that significant 5' untranslated region shortening of the gene TIMM13 contributed to increased protein production, and up-regulation of TIMM13 by CRISPR-mediated transcriptional activation promoted proliferation and migration of lung cancer cells. Our findings suggest that dysregulated tandem TSS usage represents an additional layer of cancer-associated transcriptome alterations.
Affiliation(s)
- Zhaozhao Zhao
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200438, China
- Yu Chen
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Xudong Zou
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Limin Lin
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Xiaolan Zhou
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Xiaomeng Cheng
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Guangrui Yang
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Qiushi Xu
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- Lihai Gong
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Lei Li
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Ting Ni
- State Key Laboratory of Genetic Engineering, National Clinical Research Center for Aging and Medicine, Huashan Hospital, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Center for Evolutionary Biology, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences, Fudan University, Shanghai 200438, China
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010070, China
6
Wu Q, Gu Z, Shang B, Wan D, Zhang Q, Zhang X, Xie P, Cheng S, Zhang W, Zhang K. Circulating tumor cell clustering modulates RNA splicing and polyadenylation to facilitate metastasis. Cancer Lett 2024; 588:216757. [PMID: 38417668 DOI: 10.1016/j.canlet.2024.216757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Revised: 02/07/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024]
Abstract
Circulating tumor cell (CTC) clusters exhibit significantly higher metastatic potential compared to single CTCs. However, the underlying mechanism behind this phenomenon remains unclear, and the role of posttranscriptional RNA regulation in CTC clusters has not been explored. Here, we conducted a comparative analysis of alternative splicing (AS) and alternative polyadenylation (APA) profiles between single CTCs and CTC clusters. We identified 994 and 836 AS events in single CTCs and CTC clusters, respectively, with ∼20% of AS events showing differential regulation between the two cell types. A key event in this differential splicing was observed in SRSF6, which disrupted AS profiles and contributed to the increased malignancy of CTC clusters. Regarding APA, we found a global lengthening of 3' UTRs in CTC clusters compared to single CTCs. This alteration was primarily governed by 14 core APA factors, particularly PPP1CA. The modified APA profiles facilitated the cell cycle progression of CTC clusters and indicated their reduced susceptibility to oxidative stress. Further investigation revealed that the proportion of H2AFY mRNA with a long rather than a short 3' UTR was higher in CTC clusters than in single CTCs. The AU-rich elements (AREs) within the long 3' UTR of H2AFY mRNA enhance mRNA stability and translation activity, thereby promoting cell proliferation and invasion, which potentially facilitates the establishment and rapid formation of metastatic tumors mediated by CTC clusters. These findings provide new insights into the mechanisms driving CTC cluster metastasis.
Affiliation(s)
- Quanyou Wu
- Division of Abdominal Cancer, Department of Medical Oncology, Cancer Center and Laboratory of Molecular Targeted Therapy in Oncology, West China Hospital, Sichuan University, Chengdu, Sichuan Province, 610041, China; State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Zhaoru Gu
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Bingqing Shang
- Department of Urology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Duo Wan
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Qi Zhang
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Xiaoli Zhang
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Peipei Xie
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China
- Shujun Cheng
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
- Wen Zhang
- Department of Immunology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
- Kaitai Zhang
- State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
7
Kang B, Yang Y, Hu K, Ruan X, Liu YL, Lee P, Lee J, Wang J, Zhang X. Infernape uncovers cell type-specific and spatially resolved alternative polyadenylation in the brain. Genome Res 2023; 33:1774-1787. [PMID: 37907328 PMCID: PMC10691540 DOI: 10.1101/gr.277864.123] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2023] [Accepted: 09/12/2023] [Indexed: 11/02/2023]
Abstract
Differential polyadenylation sites (PAs) critically regulate gene expression, but their cell type-specific usage and spatial distribution in the brain have not been systematically characterized. Here, we present Infernape, which infers and quantifies PA usage from single-cell and spatial transcriptomic data and show its application in the mouse brain. Infernape uncovers alternative intronic PAs and 3'-UTR lengthening during cortical neurogenesis. Progenitor-neuron comparisons in the excitatory and inhibitory neuron lineages show overlapping PA changes in embryonic brains, suggesting that the neural proliferation-differentiation axis plays a prominent role. In the adult mouse brain, we uncover cell type-specific PAs and visualize such events using spatial transcriptomic data. Over two dozen neurodevelopmental disorder-associated genes such as Csnk2a1 and Mecp2 show differential PAs during brain development. This study presents Infernape to identify PAs from scRNA-seq and spatial data, and highlights the role of alternative PAs in neuronal gene regulation.
Affiliation(s)
- Bowei Kang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Yalan Yang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Kaining Hu
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Xiangbin Ruan
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Yi-Lin Liu
- Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA
- Pinky Lee
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Jasper Lee
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- Jingshu Wang
- Department of Statistics, The University of Chicago, Chicago, Illinois 60637, USA
- Xiaochang Zhang
- Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
- The Neuroscience Institute, The University of Chicago, Chicago, Illinois 60637, USA
8
Moon Y, Burri D, Zavolan M. Identification of experimentally-supported poly(A) sites in single-cell RNA-seq data with SCINPAS. NAR Genom Bioinform 2023; 5:lqad079. [PMID: 37705828 PMCID: PMC10495540 DOI: 10.1093/nargab/lqad079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 08/15/2023] [Accepted: 08/23/2023] [Indexed: 09/15/2023]
Abstract
Alternative polyadenylation is a main driver of transcriptome diversity in mammals, generating transcript isoforms with different 3' ends via cleavage and polyadenylation at distinct polyadenylation (poly(A)) sites. The regulation of cell type-specific poly(A) site choice is not completely resolved, and requires quantitative poly(A) site usage data across cell types. 3' end-based single-cell RNA-seq can now be broadly used to obtain such data, enabling the identification and quantification of poly(A) sites with direct experimental support. We propose SCINPAS, a computational method to identify poly(A) sites from scRNA-seq datasets. SCINPAS modifies the read deduplication step to favor the selection of distal reads and extract those with non-templated poly(A) tails. This approach improves the resolution of poly(A) site recovery relative to standard software. SCINPAS identifies poly(A) sites in genic and non-genic regions, providing complementary information relative to other tools. The workflow is modular, and the key read deduplication step is general, enabling the use of SCINPAS in other typical analyses of single cell gene expression. Taken together, we show that SCINPAS is able to identify experimentally-supported, known and novel poly(A) sites from 3' end-based single-cell RNA sequencing data.
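The idea of a deduplication step that favors distal reads can be sketched as follows. This is a minimal illustration, not SCINPAS's code: the `Read` record, the grouping key, and the strand-aware definition of "distal" (the 3' end furthest downstream along the transcript, with all reads assumed to lie on one chromosome) are assumptions made for this example.

```python
from typing import NamedTuple

class Read(NamedTuple):
    barcode: str   # cell barcode
    umi: str       # unique molecular identifier
    strand: str    # "+" or "-"
    end3: int      # genomic coordinate of the read's 3' end

def dedup_keep_distal(reads):
    """Keep one read per (barcode, UMI, strand) group: the most distal one."""
    best = {}
    for r in reads:
        key = (r.barcode, r.umi, r.strand)
        kept = best.get(key)
        if kept is None:
            best[key] = r
        elif r.strand == "+" and r.end3 > kept.end3:
            # On the plus strand, downstream means a larger coordinate.
            best[key] = r
        elif r.strand == "-" and r.end3 < kept.end3:
            # On the minus strand, downstream means a smaller coordinate.
            best[key] = r
    return list(best.values())

reads = [
    Read("C1", "U1", "+", 100),
    Read("C1", "U1", "+", 150),
    Read("C1", "U2", "-", 500),
    Read("C1", "U2", "-", 420),
]
deduped = dedup_keep_distal(reads)
```

A conventional deduplicator would keep an arbitrary or highest-quality representative per UMI; choosing the most distal read instead keeps the copy most likely to reach the cleavage site.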
Affiliation(s)
- Youngbin Moon
- Computational and Systems Biology, Biozentrum University of Basel, Spitalstrasse 41, CH-4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
- Dominik Burri
- Computational and Systems Biology, Biozentrum University of Basel, Spitalstrasse 41, CH-4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
- Mihaela Zavolan
- Computational and Systems Biology, Biozentrum University of Basel, Spitalstrasse 41, CH-4056 Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
9
Cao J, Kuyumcu-Martinez MN. Alternative polyadenylation regulation in cardiac development and cardiovascular disease. Cardiovasc Res 2023; 119:1324-1335. [PMID: 36657944 PMCID: PMC10262186 DOI: 10.1093/cvr/cvad014] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/08/2022] [Revised: 11/01/2022] [Accepted: 11/28/2022] [Indexed: 01/21/2023]
Abstract
Cleavage and polyadenylation of pre-mRNAs is a necessary step for gene expression and function. The majority of human genes exhibit multiple polyadenylation sites, which can be alternatively used to generate different mRNA isoforms from a single gene. Alternative polyadenylation (APA) of pre-mRNAs is important for the proteome and transcriptome landscape. APA is tightly regulated during development and contributes to tissue-specific gene regulation. Mis-regulation of APA is linked to a wide range of pathological conditions. APA-mediated gene regulation in the heart is emerging as a new area of research. Here, we will discuss the impact of APA on gene regulation during heart development and in cardiovascular diseases. First, we will briefly review how APA impacts gene regulation and discuss molecular mechanisms that control APA. Then, we will address APA regulation during heart development and its dysregulation in cardiovascular diseases. Finally, we will discuss pre-mRNA targeting strategies to correct aberrant APA patterns of essential genes for the treatment or prevention of cardiovascular diseases. The RNA field is blooming due to advancements in RNA-based technologies. RNA-based vaccines and therapies are becoming the new line of effective and safe approaches for the treatment and prevention of human diseases. Overall, this review will be influential for understanding gene regulation at the RNA level via APA in the heart and will help design RNA-based tools for the treatment of cardiovascular diseases in the future.
Affiliation(s)
- Jun Cao
- Faculty of Environment and Life, Beijing University of Technology, Xueyuan Road, Haidian District, Beijing 100124, PR China
- Muge N Kuyumcu-Martinez
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX 77573, USA
- Department of Neurobiology, University of Texas Medical Branch, Galveston, TX 77555, USA
- Institute for Translational Sciences, University of Texas Medical Branch, 301 University Blvd, Galveston, TX 77573, USA
10
Ji G, Tang Q, Zhu S, Zhu J, Ye P, Xia S, Wu X. stAPAminer: Mining Spatial Patterns of Alternative Polyadenylation for Spatially Resolved Transcriptomic Studies. Genomics Proteomics Bioinformatics 2023; 21:601-618. [PMID: 36669641 PMCID: PMC10787175 DOI: 10.1016/j.gpb.2023.01.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 12/07/2022] [Accepted: 01/08/2023] [Indexed: 01/19/2023]
Abstract
Alternative polyadenylation (APA) contributes to transcriptome complexity and gene expression regulation and has been implicated in various cellular processes and diseases. Single-cell RNA sequencing (scRNA-seq) has enabled the profiling of APA at the single-cell level; however, the spatial information of cells is not preserved in scRNA-seq. Alternatively, spatial transcriptomics (ST) technologies provide opportunities to decipher the spatial context of the transcriptomic landscape. Pioneering studies have revealed potential spatially variable genes and/or splice isoforms; however, the pattern of APA usage in spatial contexts remains unappreciated. In this study, we developed a toolkit called stAPAminer for mining spatial patterns of APA from spatially barcoded ST data. APA sites were identified and quantified from the ST data. In particular, an imputation model based on the k-nearest neighbors algorithm was designed to recover APA signals, and then APA genes with spatial patterns of APA usage variation were identified. By analyzing well-established ST data of the mouse olfactory bulb (MOB), we presented a detailed view of spatial APA usage across morphological layers of the MOB. We compiled a comprehensive list of genes with spatial APA dynamics and obtained several major spatial expression patterns that represent spatial APA dynamics in different morphological layers. By extending this analysis to two additional replicates of the MOB ST data, we observed that the spatial APA patterns of several genes were reproducible among replicates. stAPAminer employs the power of ST to explore the transcriptional atlas of spatial APA patterns with spatial resolution. This toolkit is available at https://github.com/BMILAB/stAPAminer and https://ngdc.cncb.ac.cn/biocode/tools/BT007320.
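To make the k-nearest-neighbors imputation idea concrete, the sketch below fills in a missing APA usage value for a spot from its nearest observed spots. It is only an illustration under stated assumptions, not stAPAminer's model: the function `knn_impute`, the use of spatial Euclidean distance to define neighbors, the unweighted mean, and k = 2 are all choices made for this example.

```python
import math

def knn_impute(coords, values, k=2):
    """Fill missing entries (None) with the mean of the k nearest observed spots."""
    observed = [i for i, v in enumerate(values) if v is not None]
    imputed = list(values)  # copy, so the input list is left unchanged
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Rank observed spots by Euclidean distance to spot i and keep the k closest.
        nearest = sorted(observed, key=lambda j: math.dist(coords[i], coords[j]))[:k]
        if nearest:
            imputed[i] = sum(values[j] for j in nearest) / len(nearest)
    return imputed

# Toy data: four spots with (x, y) positions; the second spot has no APA usage value.
coords = [(0, 0), (1, 0), (0, 1), (5, 5)]
usage = [0.2, None, 0.4, 0.9]
result = knn_impute(coords, usage, k=2)
print(result)
```

Here the missing spot borrows from its two closest neighbors (usage 0.2 and 0.4) rather than the distant spot, so sparse per-spot signals are smoothed using local context.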
Affiliation(s)
- Guoli Ji
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
- Qi Tang
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
- Sheng Zhu
  - Department of Automation, Xiamen University, Xiamen 361005, China
- Junyi Zhu
  - Institute of Neuroscience, Soochow University, Suzhou 215000, China
- Pengchao Ye
  - Department of Automation, Xiamen University, Xiamen 361005, China
- Shuting Xia
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Institute of Neuroscience, Soochow University, Suzhou 215000, China
- Xiaohui Wu
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
|
11
|
Kowalski MH, Wessels HH, Linder J, Choudhary S, Hartman A, Hao Y, Mascio I, Dalgarno C, Kundaje A, Satija R. CPA-Perturb-seq: Multiplexed single-cell characterization of alternative polyadenylation regulators. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.09.527751. [PMID: 36798324 PMCID: PMC9934614 DOI: 10.1101/2023.02.09.527751] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/18/2023]
Abstract
Most mammalian genes have multiple polyA sites, representing a substantial source of transcript diversity that is governed by the cleavage and polyadenylation (CPA) regulatory machinery. To better understand how these proteins govern polyA site choice we introduce CPA-Perturb-seq, a multiplexed perturbation screen dataset of 42 known CPA regulators with a 3' scRNA-seq readout that enables transcriptome-wide inference of polyA site usage. We develop a statistical framework to specifically identify perturbation-dependent changes in intronic and tandem polyadenylation, and discover modules of co-regulated polyA sites exhibiting distinct functional properties. By training a multi-task deep neural network (APARENT-Perturb) on our dataset, we delineate a cis-regulatory code that predicts responsiveness to perturbation and reveals interactions between distinct regulatory complexes. Finally, we leverage our framework to re-analyze published scRNA-seq datasets, identifying new regulators that affect the relative abundance of alternatively polyadenylated transcripts, and characterizing extensive cellular heterogeneity in 3' UTR length amongst antibody-producing cells. Our work highlights the potential for multiplexed single-cell perturbation screens to further our understanding of post-transcriptional regulation in vitro and in vivo.
Affiliation(s)
- Madeline H. Kowalski
  - New York Genome Center, New York, NY, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
  - New York University Grossman School of Medicine, New York, NY, USA
- Hans-Hermann Wessels
  - New York Genome Center, New York, NY, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Johannes Linder
  - Department of Genetics, Stanford University, Stanford USA
  - Department of Computer Science, Stanford University, Stanford USA
- Saket Choudhary
  - New York Genome Center, New York, NY, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Yuhan Hao
  - New York Genome Center, New York, NY, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Isabella Mascio
  - New York Genome Center, New York, NY, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
- Anshul Kundaje
  - Department of Genetics, Stanford University, Stanford USA
  - Department of Computer Science, Stanford University, Stanford USA
- Rahul Satija
  - New York Genome Center, New York, NY, USA
  - Center for Genomics and Systems Biology, New York University, New York, NY, USA
  - New York University Grossman School of Medicine, New York, NY, USA
|
12
|
Laczik M, Erdős E, Ozgyin L, Hevessy Z, Csősz É, Kalló G, Nagy T, Barta E, Póliska S, Szatmári I, Bálint BL. Extensive proteome and functional genomic profiling of variability between genetically identical human B-lymphoblastoid cells. Sci Data 2022; 9:763. [PMID: 36496436 PMCID: PMC9741606 DOI: 10.1038/s41597-022-01871-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 11/22/2022] [Indexed: 12/13/2022] Open
Abstract
In life-science research isogenic B-lymphoblastoid cell lines (LCLs) are widely known and preferred for their genetic stability - they are often used for studying mutations for example, where genetic stability is crucial. We have shown previously that phenotypic variability can be observed in isogenic B-lymphoblastoid cell lines. Isogenic LCLs present well-defined phenotypic differences on various levels, for example on the gene expression level or the chromatin level. Based on our investigations, the phenotypic variability of the isogenic LCLs is accompanied by certain genetic variation too. We have developed a compendium of LCL datasets that present the phenotypic and genetic variability of five isogenic LCLs from a multiomic perspective. In this paper, we present additional datasets generated with Next Generation Sequencing techniques to provide genomic and transcriptomic profiles (WGS, RNA-seq, single cell RNA-seq), protein-DNA interactions (ChIP-seq), together with mass spectrometry and flow cytometry datasets to monitor the changes in the proteome. We are sharing these datasets with the scientific community according to the FAIR principles for further investigations.
Affiliation(s)
- Miklós Laczik
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Edina Erdős
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Lilla Ozgyin
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Zsuzsanna Hevessy
  - Department of Laboratory Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Éva Csősz
  - Proteomics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Gergő Kalló
  - Proteomics Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Tibor Nagy
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
  - Department of Genetics and Genomics, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert út 4, Gödöllő, H-2100 Hungary
- Endre Barta
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
  - Department of Genetics and Genomics, Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Szent-Györgyi Albert út 4, Gödöllő, H-2100 Hungary
- Szilárd Póliska
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- István Szatmári
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
  - Faculty of Pharmacy, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
- Bálint László Bálint
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Egyetem tér 1., H-4032 Hungary
  - Department of Bioinformatics, Semmelweis University, Budapest, Tűzoltó utca 7-9., H-1094 Hungary
|
13
|
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Affiliation(s)
- Min Su
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Tao Pan
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Qiu-Zhen Chen
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Wei-Wei Zhou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
- Yi Gong
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
  - Department of Immunology, Nanjing Medical University, Nanjing, 211166 China
- Gang Xu
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Huan-Yu Yan
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Si Li
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Qiao-Zhen Shi
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Ya Zhang
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Xiao He
  - Department of Laboratory Medicine, Women and Children’s Hospital of Chongqing Medical University, Chongqing, 401174 China
- Shi-Cai Fan
  - Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110 Guangdong China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
- Murray J. Cairns
  - School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW 2308 Australia
  - Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Xi Wang
  - State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Yong-Sheng Li
  - College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
|
14
|
Meyer E, Chaung K, Dehghannasiri R, Salzman J. ReadZS detects cell type-specific and developmentally regulated RNA processing programs in single-cell RNA-seq. Genome Biol 2022; 23:226. [PMID: 36284317 PMCID: PMC9594907 DOI: 10.1186/s13059-022-02795-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Accepted: 10/13/2022] [Indexed: 11/13/2022] Open
Abstract
RNA processing, including splicing and alternative polyadenylation, is crucial to gene function and regulation, but methods to detect RNA processing from single-cell RNA sequencing data are limited by reliance on pre-existing annotations, peak calling heuristics, and collapsing measurements by cell type. We introduce ReadZS, an annotation-free statistical approach to identify regulated RNA processing in single cells. ReadZS discovers cell type-specific RNA processing in human lung and conserved, developmentally regulated RNA processing in mammalian spermatogenesis, including global 3' UTR shortening in human spermatogenesis. ReadZS also discovers global 3' UTR lengthening in Arabidopsis development, highlighting the usefulness of this method in under-annotated transcriptomes.
Affiliation(s)
- Elisabeth Meyer
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Kaitlin Chaung
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Roozbeh Dehghannasiri
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Julia Salzman
  - Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
  - Department of Statistics (by courtesy), Stanford University, Stanford, CA, 94305, USA
|
15
|
Junaid M, Lee A, Kim J, Park TJ, Lim SB. Transcriptional Heterogeneity of Cellular Senescence in Cancer. Mol Cells 2022; 45:610-619. [PMID: 35983702 PMCID: PMC9448649 DOI: 10.14348/molcells.2022.0036] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/06/2022] [Revised: 06/02/2022] [Accepted: 06/11/2022] [Indexed: 11/27/2022] Open
Abstract
Cellular senescence plays a paradoxical role in tumorigenesis through the expression of diverse senescence-associated (SA) secretory phenotypes (SASPs). The heterogeneity of SA gene expression in cancer cells not only promotes cancer stemness but also protects these cells from chemotherapy. Despite the potential correlation between cancer and SA biomarkers, many transcriptional changes across distinct cell populations remain largely unknown. During the past decade, single-cell RNA sequencing (scRNA-seq) technologies have emerged as powerful experimental and analytical tools to dissect such diverse senescence-derived transcriptional changes. Here, we review the recent sequencing efforts that successfully characterized scRNA-seq data obtained from diverse cancer cells and elucidated the role of senescent cells in tumor malignancy. We further highlight the functional implications of SA genes expressed specifically in cancer and stromal cell populations in the tumor microenvironment. Translational research leveraging scRNA-seq profiling of SA genes will facilitate the identification of novel expression patterns underlying cancer susceptibility, providing new therapeutic opportunities in the era of precision medicine.
Affiliation(s)
- Muhammad Junaid
  - Department of Biochemistry and Molecular Biology, Ajou University School of Medicine, Suwon 16499, Korea
  - Department of Biomedical Sciences, Ajou University Graduate School, Suwon 16499, Korea
- Aejin Lee
  - Department of Biochemistry and Molecular Biology, Ajou University School of Medicine, Suwon 16499, Korea
- Jaehyung Kim
  - Department of Biochemistry and Molecular Biology, Ajou University School of Medicine, Suwon 16499, Korea
- Tae Jun Park
  - Department of Biochemistry and Molecular Biology, Ajou University School of Medicine, Suwon 16499, Korea
  - Department of Biomedical Sciences, Ajou University Graduate School, Suwon 16499, Korea
- Su Bin Lim
  - Department of Biochemistry and Molecular Biology, Ajou University School of Medicine, Suwon 16499, Korea
  - Department of Biomedical Sciences, Ajou University Graduate School, Suwon 16499, Korea
|
16
|
Ye W, Lian Q, Ye C, Wu X. A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022:S1672-0229(22)00121-8. [PMID: 36167284 PMCID: PMC10372920 DOI: 10.1016/j.gpb.2022.09.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/19/2022] [Revised: 08/17/2022] [Accepted: 09/19/2022] [Indexed: 05/08/2023]
Abstract
Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. Identification of poly(A) sites (pAs) on a genome-wide scale is a critical step toward understanding the underlying mechanism of APA-mediated gene regulation. A number of established computational tools have been proposed to predict pAs from diverse genomic data. Here we provided an exhaustive overview of computational approaches for predicting pAs from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Particularly, we examined several representative tools using bulk RNA-seq and scRNA-seq data from peripheral blood mononuclear cells and put forward operable suggestions on how to assess the reliability of pAs predicted by different tools. We also proposed practical guidelines on choosing appropriate methods applicable to diverse scenarios. Moreover, we discussed in depth the challenges in improving the performance of pA prediction and benchmarking different methods. Additionally, we highlighted outstanding challenges and opportunities using new machine learning and integrative multi-omics techniques, and provided our perspective on how computational methodologies might evolve in the future for non-3' untranslated region, tissue-specific, cross-species, and single-cell pA prediction.
Affiliation(s)
- Wenbin Ye
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
- Qiwei Lian
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China; Department of Automation, Xiamen University, Xiamen 361005, China
- Congting Ye
  - Key Laboratory of the Coastal and Wetland Ecosystems, Ministry of Education, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
- Xiaohui Wu
  - Pasteurien College, Suzhou Medical College of Soochow University, Soochow University, Suzhou 215000, China
|
17
|
Lee S, Chen YC, Gillen AE, Taliaferro JM, Deplancke B, Li H, Lai EC. Diverse cell-specific patterns of alternative polyadenylation in Drosophila. Nat Commun 2022; 13:5372. [PMID: 36100597 PMCID: PMC9470587 DOI: 10.1038/s41467-022-32305-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 02/07/2022] [Accepted: 07/24/2022] [Indexed: 11/17/2022] Open
Abstract
Most genes in higher eukaryotes express isoforms with distinct 3' untranslated regions (3' UTRs), generated by alternative polyadenylation (APA). Since 3' UTRs are predominant locations of post-transcriptional regulation, APA can render such programs conditional, and can also alter protein sequences via alternative last exon (ALE) isoforms. We previously used 3'-sequencing from diverse Drosophila samples to define multiple tissue-specific APA landscapes. Here, we exploit comprehensive single nucleus RNA-sequencing data (Fly Cell Atlas) to elucidate cell-type expression of 3' UTRs across >250 adult Drosophila cell types. We reveal the cellular bases of multiple tissue-specific APA/ALE programs, such as 3' UTR lengthening in differentiated neurons and 3' UTR shortening in spermatocytes and spermatids. We trace dynamic 3' UTR patterns across cell lineages, including in the male germline, and discover new APA patterns in the intestinal stem cell lineage. Finally, we correlate expression of RNA binding proteins (RBPs), miRNAs and global levels of cleavage and polyadenylation (CPA) factors in several cell types that exhibit characteristic APA landscapes, yielding candidate regulators of transcriptome complexity. These analyses provide a comprehensive foundation for future investigations of mechanisms and biological impacts of alternative 3' isoforms across the major cell types of this widely-studied model organism.
Affiliation(s)
- Seungjae Lee
  - Developmental Biology Program, Sloan Kettering Institute, 1275 York Ave, Box 252, New York, NY, 10065, USA
- Yen-Chung Chen
  - Department of Biology, New York University, New York, NY, 10013, USA
- Austin E Gillen
  - Division of Hematology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Rocky Mountain Regional VA Medical Center, Aurora, CO, USA
  - RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- J Matthew Taliaferro
  - RNA Bioscience Initiative, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
  - Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Bart Deplancke
  - Laboratory of Systems Biology and Genetics, Institute of Bio-engineering & Global Health Institute, School of Life Sciences, EPFL, CH-1015, Lausanne, Switzerland
- Hongjie Li
  - Huffington Center on Aging, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Eric C Lai
  - Developmental Biology Program, Sloan Kettering Institute, 1275 York Ave, Box 252, New York, NY, 10065, USA
|
18
|
scAPAmod: Profiling Alternative Polyadenylation Modalities in Single Cells from Single-Cell RNA-Seq Data. Int J Mol Sci 2022; 23:ijms23158123. [PMID: 35897701 PMCID: PMC9329739 DOI: 10.3390/ijms23158123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Revised: 07/01/2022] [Accepted: 07/21/2022] [Indexed: 11/17/2022] Open
Abstract
Alternative polyadenylation (APA) is a key layer of gene expression regulation, and APA choice is finely modulated in cells. Advances in single-cell RNA-seq (scRNA-seq) have provided unprecedented opportunities to study APA in cell populations. However, existing studies that investigated APA in single cells were either confined to a few cells or focused on profiling APA dynamics between cell types or identifying APA sites. The diversity and pattern of APA usages on a genomic scale in single cells remains unappreciated. Here, we proposed an analysis framework based on a Gaussian mixture model, scAPAmod, to identify patterns of APA usage from homogeneous or heterogeneous cell populations at the single-cell level. We systematically evaluated the performance of scAPAmod using simulated data and scRNA-seq data. The results show that scAPAmod can accurately identify different patterns of APA usages at the single-cell level. We analyzed the dynamic changes in the pattern of APA usage using scAPAmod in different cell differentiation and developmental stages during mouse spermatogenesis and found that even the same gene has different patterns of APA usages in different differentiation stages. The preference of patterns of usages of APA sites in different genomic regions was also analyzed. We found that patterns of APA usages of the same gene in 3′ UTRs (3′ untranslated region) and non-3′ UTRs are different. Moreover, we analyzed cell-type-specific APA usage patterns and changes in patterns of APA usages across cell types. Different from the conventional analysis of single-cell heterogeneity based on gene expression profiling, this study profiled the heterogeneous pattern of APA isoforms, which contributes to revealing the heterogeneity of single-cell gene expression with higher resolution.
|
19
|
Wei L, Lai EC. Regulation of the Alternative Neural Transcriptome by ELAV/Hu RNA Binding Proteins. Front Genet 2022; 13:848626. [PMID: 35281806 PMCID: PMC8904962 DOI: 10.3389/fgene.2022.848626] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/04/2022] [Accepted: 02/01/2022] [Indexed: 11/30/2022] Open
Abstract
The process of alternative polyadenylation (APA) generates multiple 3' UTR isoforms for a given locus, which can alter regulatory capacity and on occasion change coding potential. APA was initially characterized for a few genes, but in the past decade, has been found to be the rule for metazoan genes. While numerous differences in APA profiles have been catalogued across genetic conditions, perturbations, and diseases, our knowledge of APA mechanisms and biology is far from complete. In this review, we highlight recent findings regarding the role of the conserved ELAV/Hu family of RNA binding proteins (RBPs) in generating the broad landscape of lengthened 3' UTRs that is characteristic of neurons. We relate this to their established roles in alternative splicing, and summarize ongoing directions that will further elucidate the molecular strategies for neural APA, the in vivo functions of ELAV/Hu RBPs, and the phenotypic consequences of these regulatory paradigms in neurons.
Affiliation(s)
- Lu Wei
  - Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
- Eric C. Lai
  - Developmental Biology Program, Sloan Kettering Institute, New York, NY, United States
|
20
|
Implications of Poly(A) Tail Processing in Repeat Expansion Diseases. Cells 2022; 11:cells11040677. [PMID: 35203324 PMCID: PMC8870147 DOI: 10.3390/cells11040677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Revised: 02/11/2022] [Accepted: 02/13/2022] [Indexed: 11/21/2022] Open
Abstract
Repeat expansion diseases are a group of more than 40 disorders that affect mainly the nervous and/or muscular system and include myotonic dystrophies, Huntington’s disease, and fragile X syndrome. The mutation-driven expanded repeat tract occurs in specific genes and is composed of tri- to dodeca-nucleotide-long units. Mutant mRNA is a pathogenic factor or important contributor to the disease and has great potential as a therapeutic target. Although repeat expansion diseases are quite well known, there are limited studies concerning polyadenylation events for implicated transcripts that could have profound effects on transcript stability, localization, and translation efficiency. In this review, we briefly present polyadenylation and alternative polyadenylation (APA) mechanisms and discuss their role in the pathogenesis of selected diseases. We also discuss several methods for poly(A) tail measurement (both transcript-specific and transcriptome-wide analyses) and APA site identification—the further development and use of which may contribute to a better understanding of the correlation between APA events and repeat expansion diseases. Finally, we point out some future perspectives on the research into repeat expansion diseases, as well as APA studies.
|
21
|
Dynamic alternative polyadenylation during iPSC differentiation into cardiomyocytes. Comput Struct Biotechnol J 2022; 20:5859-5869. [DOI: 10.1016/j.csbj.2022.10.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Revised: 10/18/2022] [Accepted: 10/18/2022] [Indexed: 11/20/2022] Open
|
22
|
|
23
|
Zhu S, Lian Q, Ye W, Qin W, Wu Z, Ji G, Wu X. scAPAdb: a comprehensive database of alternative polyadenylation at single-cell resolution. Nucleic Acids Res 2021; 50:D365-D370. [PMID: 34508354 PMCID: PMC8728153 DOI: 10.1093/nar/gkab795] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/08/2021] [Revised: 08/26/2021] [Accepted: 09/02/2021] [Indexed: 01/08/2023] Open
Abstract
Alternative polyadenylation (APA) is a widespread regulatory mechanism of transcript diversification in eukaryotes, which is increasingly recognized as an important layer for eukaryotic gene expression. Recent studies based on single-cell RNA-seq (scRNA-seq) have revealed cell-to-cell heterogeneity in APA usage and APA dynamics across different cell types in various tissues, biological processes and diseases. However, currently available APA databases were all collected from bulk 3′-seq and/or RNA-seq data, and no existing database has provided APA information at single-cell resolution. Here, we present a user-friendly database called scAPAdb (http://www.bmibig.cn/scAPAdb), which provides a comprehensive and manually curated atlas of poly(A) sites, APA events and poly(A) signals at the single-cell level. Currently, scAPAdb collects APA information from > 360 scRNA-seq experiments, covering six species including human, mouse and several other plant species. scAPAdb also provides batch download of data, and users can query the database through a variety of keywords such as gene identifier, gene function and accession number. scAPAdb would be a valuable and extendable resource for the study of cell-to-cell heterogeneity in APA isoform usages and APA-mediated gene regulation at the single-cell level under diverse cell types, tissues and species.
Affiliation(s)
- Sheng Zhu
- Pasteurien College, Soochow University, Suzhou, Jiangsu 215000, China; Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
| | - Qiwei Lian
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
| | - Wenbin Ye
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
| | - Wei Qin
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
| | - Zhe Wu
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen, Fujian 361005, China
| | - Xiaohui Wu
- Pasteurien College, Soochow University, Suzhou, Jiangsu 215000, China
| |
|
24
|
Dharmalingam P, Mahalingam R, Yalamanchili HK, Weng T, Karmouty-Quintana H, Guha A, A Thandavarayan R. Emerging roles of alternative cleavage and polyadenylation (APA) in human disease. J Cell Physiol 2021; 237:149-160. [PMID: 34378793 DOI: 10.1002/jcp.30549] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/05/2021] [Revised: 07/13/2021] [Accepted: 07/20/2021] [Indexed: 12/11/2022]
Abstract
In the messenger RNA (mRNA) maturation process, the 3'-end of the pre-mRNA is cleaved and a poly(A) sequence is added; this is an important determinant of mRNA stability and cellular function. More than 60%-70% of human genes have three or more polyadenylation (APA) sites and can be cleaved at different sites, generating mRNA transcripts of varying lengths. This phenomenon is termed alternative cleavage and polyadenylation (APA), and it plays a role in key biological processes such as gene regulation, cell proliferation and senescence, as well as in various human diseases. Loss of regulatory microRNA binding sites and interactions with RNA-binding proteins leading to APA have been widely investigated in human diseases. However, the functions of the core APA machinery and related factors during disease conditions remain largely unknown. In this review, we discuss the roles of the polyadenylation machinery in relation to brain disease, cardiac failure, pulmonary fibrosis, cancer, infectious conditions, and other human diseases. Collectively, we believe this review will be a useful avenue for understanding the emerging role of APA in the pathobiology of various human diseases.
Affiliation(s)
- Prakash Dharmalingam
- Department of Biochemistry, Saveetha Dental College & Hospitals, Saveetha Institute of Medical & Technical Sciences, Saveetha University, Chennai, India
| | - Rajasekaran Mahalingam
- Laboratory of Neuroimmunology, Department of Symptom Research, The University of Texas MD Anderson Cancer Center, Houston, Texas, USA
| | - Hari Krishna Yalamanchili
- Department of Pediatrics, Baylor College of Medicine, Houston, Texas, USA; Department of Pediatrics - Neurology, Jan and Dan Duncan Neurological Research Institute, Texas Children's Hospital, Houston, Texas, USA; Department of Pediatrics, USDA/ARS Children's Nutrition Research Center, Baylor College of Medicine, Houston, Texas, USA
| | - Tingting Weng
- Department of Biochemistry and Molecular Biology & Divisions of Critical Care, Pulmonary and Sleep Medicine, Department of Internal Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Harry Karmouty-Quintana
- Department of Biochemistry and Molecular Biology & Divisions of Critical Care, Pulmonary and Sleep Medicine, Department of Internal Medicine, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, Texas, USA
| | - Ashrith Guha
- Department of Cardiology, Houston Methodist DeBakey Heart & Vascular Center, Houston, Texas, USA
| | | |
|
25
|
Li WV, Zheng D, Wang R, Tian B. MAAPER: model-based analysis of alternative polyadenylation using 3' end-linked reads. Genome Biol 2021; 22:222. [PMID: 34376236 PMCID: PMC8356463 DOI: 10.1186/s13059-021-02429-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/03/2021] [Accepted: 07/01/2021] [Indexed: 12/20/2022] Open
Abstract
Most eukaryotic genes express alternative polyadenylation (APA) isoforms. A growing number of RNA sequencing methods, especially those used for single-cell transcriptome analysis, generate reads close to the polyadenylation site (PAS), termed nearSite reads, hence inherently containing information about APA isoform abundance. Here, we present a probabilistic model-based method named MAAPER to utilize nearSite reads for APA analysis. MAAPER predicts PASs with high accuracy and sensitivity and examines different types of APA events with robust statistics. We show MAAPER's performance with both bulk and single-cell data and its applicability in unpaired or paired experimental designs.
Affiliation(s)
- Wei Vivian Li
- Department of Biostatistics and Epidemiology, Rutgers School of Public Health, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
| | - Dinghai Zheng
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
| | - Ruijia Wang
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, NJ, 07103, USA; Program in Gene Expression and Regulation, and Center for Systems and Computational Biology, The Wistar Institute, Philadelphia, PA, 19104, USA
| |
|