1
Chen Y, Shen J, Ling C, Liang Z, Huang S, Lin W, Qin Y, Meng L, Luo Y. Exploring the role of CD8+ T cells in clear renal cell carcinoma metastasis. FEBS Open Bio 2024; 14:1205-1217. [PMID: 38872260 PMCID: PMC11216920 DOI: 10.1002/2211-5463.13819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 04/04/2024] [Accepted: 05/03/2024] [Indexed: 06/15/2024] Open
Abstract
Clear cell renal cell carcinoma (ccRCC) accounts for approximately 75-80% of all patients with renal cell carcinoma. Despite its prevalence, little is known regarding the key components involved in ccRCC metastasis. In this study, scRNA-seq analysis was employed to classify CD8+ T cells into four sub-clusters based on their genetic profiles and immunofluorescence experiments were used to validate two key clusters. Through gene set enrichment analysis, these newly identified sub-clusters were found to exhibit distinct biological characteristics. Notably, TYMP, TOP2A, CHI3L2, CDKN3, CENPM, and RZH2 were highly expressed in these sub-clusters, indicating a correlation with poor prognosis. Among these sub-clusters, CD8+ T cells (MT-ND4) were identified as potentially playing a critical role in mediating ccRCC metastasis. These results contribute to our understanding of CD8+ T cell heterogeneity in ccRCC and shed light on the mechanisms underlying the loss of immune response against cancer.
Affiliation(s)
- Yuanhong Chen
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Department of Pathogenic Biology and Immunology, Youjiang Medical University for Nationalities, Baise, China
- Jiajia Shen
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Caixia Ling
- Modern Industrial College of Biomedicine and Great Health, Youjiang Medical University for Nationalities, Baise, China
- Zhengfang Liang
- Department of Urinary Surgery, The Affiliated Hospital of Youjiang Medical University for Nationalities, Baise, China
- Shaoang Huang
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Wenxian Lin
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Department of Interventional Oncology, Affiliated Hospital of Youjiang Medical College for Nationalities, Baise, China
- Yujuan Qin
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Lingzhang Meng
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Institute of Cardiovascular Sciences, Guangxi Academy of Medical Sciences, Nanning, China
- Yanhong Luo
- Center for Systemic Inflammation Research (CSIR), School of Preclinical Medicine, Youjiang Medical University for Nationalities, Baise, China
- Modern Industrial College of Biomedicine and Great Health, Youjiang Medical University for Nationalities, Baise, China
2
Bilous M, Hérault L, Gabriel AA, Teleman M, Gfeller D. Building and analyzing metacells in single-cell genomics data. Mol Syst Biol 2024; 20:744-766. [PMID: 38811801 PMCID: PMC11220014 DOI: 10.1038/s44320-024-00045-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 05/03/2024] [Accepted: 05/08/2024] [Indexed: 05/31/2024] Open
Abstract
The advent of high-throughput single-cell genomics technologies has fundamentally transformed biological sciences. Currently, millions of cells from complex biological tissues can be phenotypically profiled across multiple modalities. The scaling of computational methods to analyze and visualize such data is a constant challenge, and tools need to be regularly updated, if not redesigned, to cope with ever-growing numbers of cells. Over the last few years, metacells have been introduced to reduce the size and complexity of single-cell genomics data while preserving biologically relevant information and improving interpretability. Here, we review recent studies that capitalize on the concept of metacells, along with the many variants in nomenclature that have been used. We further outline how and when metacells should (or should not) be used to analyze single-cell genomics data and what should be considered when analyzing such data at the metacell level. To facilitate the exploration of metacells, we provide a comprehensive tutorial on the construction and analysis of metacells from single-cell RNA-seq data (https://github.com/GfellerLab/MetacellAnalysisTutorial) as well as a fully integrated pipeline to rapidly build, visualize and evaluate metacells with different methods (https://github.com/GfellerLab/MetacellAnalysisToolkit).
Affiliation(s)
- Mariia Bilous
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, 1011, Lausanne, Switzerland
- Agora Cancer Research Centre, 1011, Lausanne, Switzerland
- Swiss Cancer Center Leman (SCCL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), 1015, Lausanne, Switzerland
- Léonard Hérault
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, 1011, Lausanne, Switzerland
- Agora Cancer Research Centre, 1011, Lausanne, Switzerland
- Swiss Cancer Center Leman (SCCL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), 1015, Lausanne, Switzerland
- Aurélie Ag Gabriel
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, 1011, Lausanne, Switzerland
- Agora Cancer Research Centre, 1011, Lausanne, Switzerland
- Swiss Cancer Center Leman (SCCL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), 1015, Lausanne, Switzerland
- Matei Teleman
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, 1011, Lausanne, Switzerland
- Agora Cancer Research Centre, 1011, Lausanne, Switzerland
- Swiss Cancer Center Leman (SCCL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB), 1015, Lausanne, Switzerland
- David Gfeller
- Department of Oncology, Ludwig Institute for Cancer Research Lausanne, University of Lausanne, 1011, Lausanne, Switzerland.
- Agora Cancer Research Centre, 1011, Lausanne, Switzerland.
- Swiss Cancer Center Leman (SCCL), Lausanne, Switzerland.
- Swiss Institute of Bioinformatics (SIB), 1015, Lausanne, Switzerland.
3
Kim H, Chang W, Chae SJ, Park JE, Seo M, Kim JK. scLENS: data-driven signal detection for unbiased scRNA-seq data analysis. Nat Commun 2024; 15:3575. [PMID: 38678050 DOI: 10.1038/s41467-024-47884-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 04/14/2024] [Indexed: 04/29/2024] Open
Abstract
High dimensionality and noise have limited the new biological insights that can be discovered in scRNA-seq data. While dimensionality reduction tools have been developed to extract biological signals from the data, they often require manual determination of signal dimension, introducing user bias. Furthermore, a common data preprocessing method, log normalization, can unintentionally distort signals in the data. Here, we develop scLENS, a dimensionality reduction tool that circumvents the long-standing issues of signal distortion and manual input. Specifically, we identify the primary cause of signal distortion during log normalization and effectively address it by uniformizing cell vector lengths with L2 normalization. Furthermore, we utilize random matrix theory-based noise filtering and a signal robustness test to enable data-driven determination of the threshold for signal dimensions. Our method outperforms 11 widely used dimensionality reduction tools and performs particularly well for challenging scRNA-seq datasets with high sparsity and variability. To facilitate the use of scLENS, we provide a user-friendly package that automates accurate signal detection of scRNA-seq data without manual time-consuming tuning.
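The preprocessing idea named in this abstract, log normalization followed by L2 normalization so that every cell vector has the same length, can be illustrated with a minimal sketch. This is not the scLENS implementation: the function names, the scale factor of 10,000, and the toy count matrix are assumptions made only for illustration.

```python
import math

def log_normalize(counts, scale=10_000.0):
    """Scale each cell (row) by its library size, then apply log(1 + x).

    The scale factor is a common convention and an assumption here.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell stays all-zero.
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(scale * c / total) for c in cell])
    return normalized

def l2_normalize(rows):
    """Rescale each row to unit Euclidean length (zero rows are left as-is)."""
    result = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row))
        result.append([v / norm for v in row] if norm > 0 else list(row))
    return result

# Toy cells-by-genes count matrix (hypothetical data).
counts = [[10, 0, 5], [100, 0, 50], [0, 3, 0]]
normalized = l2_normalize(log_normalize(counts))
lengths = [math.sqrt(sum(v * v for v in row)) for row in normalized]
```

After the second step every non-empty cell vector has length 1, which is the "uniformized" property the abstract refers to. The noise filtering and robustness test described in the abstract are not covered by this sketch.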
Affiliation(s)
- Hyun Kim
- Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea
- Won Chang
- Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH, 45221, USA
- Seok Joo Chae
- Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea
- Department of Mathematical Sciences, KAIST, Daejeon, 34141, Republic of Korea
- Jong-Eun Park
- Graduate School of Medical Science and Engineering, KAIST, Daejeon, 34141, Republic of Korea
- Minseok Seo
- Department of Computer and Information Science, Korea University, Sejong, 30019, Republic of Korea
- Jae Kyoung Kim
- Biomedical Mathematics Group, Pioneer Research Center for Mathematical and Computational Sciences, Institute for Basic Science, Daejeon, 34126, Republic of Korea.
- Department of Mathematical Sciences, KAIST, Daejeon, 34141, Republic of Korea.
4
Chen Y, Wu W, Jin C, Cui J, Diao Y, Wang R, Xu R, Yao Z, Li X. Integrating Single-Cell RNA-Seq and Bulk RNA-Seq Data to Explore the Key Role of Fatty Acid Metabolism in Breast Cancer. Int J Mol Sci 2023; 24:13209. [PMID: 37686016 PMCID: PMC10487665 DOI: 10.3390/ijms241713209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Revised: 08/18/2023] [Accepted: 08/22/2023] [Indexed: 09/10/2023] Open
Abstract
Cancer immune escape is associated with the metabolic reprogramming of the various infiltrating cells in the tumor microenvironment (TME), and combining metabolic targets with immunotherapy shows great promise for improving clinical outcomes. Among all metabolic processes, lipid metabolism, especially fatty acid metabolism (FAM), plays a major role in cancer cell survival, migration, and proliferation. However, the mechanisms and functions of FAM in the tumor immune microenvironment remain poorly understood. We screened 309 fatty acid metabolism-related genes (FMGs) for differential expression, identifying 121 differentially expressed genes. Univariate Cox regression models in The Cancer Genome Atlas (TCGA) database were then utilized to identify the 15 FMGs associated with overall survival. We systematically evaluated the correlation between FMGs' modification patterns and the TME, prognosis, and immunotherapy. The FMGsScore was constructed to quantify the FMG modification patterns using principal component analysis. Three clusters based on FMGs were demonstrated in breast cancer, with three patterns of distinct immune cell infiltration and biological behavior. An FMGsScore signature was constructed to reveal that patients with a low FMGsScore had higher immune checkpoint expression, higher immune checkpoint inhibitor (ICI) scores, increased immune microenvironment infiltration, better survival advantage, and were more sensitive to immunotherapy than those with a high FMGsScore. Finally, the expression and function of the signature key gene NDUFAB1 were examined by in vitro experiments. This study demonstrates the substantial impact of FMGs on the immune microenvironment of breast cancer, and that FMGsScores can be used to guide the prediction of immunotherapy efficacy in breast cancer patients. In in vitro experiments, knockdown of the NDUFAB1 gene resulted in reduced proliferation and migration of MCF-7 and MDA-MB-231 cell lines.
Affiliation(s)
- Xiaofeng Li
- Department of Epidemiology and Health Statistics, Dalian Medical University, Dalian 116044, China
5
Leary JR, Xu Y, Morrison AB, Jin C, Shen EC, Kuhlers PC, Su Y, Rashid NU, Yeh JJ, Peng XL. Sub-Cluster Identification through Semi-Supervised Optimization of Rare-Cell Silhouettes (SCISSORS) in single-cell RNA-sequencing. Bioinformatics 2023; 39:btad449. [PMID: 37498558 PMCID: PMC10412410 DOI: 10.1093/bioinformatics/btad449] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/02/2022] [Revised: 03/30/2023] [Accepted: 07/25/2023] [Indexed: 07/28/2023] Open
Abstract
MOTIVATION Single-cell RNA-sequencing (scRNA-seq) has enabled the molecular profiling of thousands to millions of cells simultaneously in biologically heterogeneous samples. Currently, the common practice in scRNA-seq is to determine cell type labels through unsupervised clustering and the examination of cluster-specific genes. However, even small differences in analysis and parameter choices can greatly alter clustering results and thus strongly influence which cell types are identified. Existing methods largely focus on determining the optimal number of robust clusters, which can be problematic for identifying cells of extremely low abundance due to their subtle contributions toward overall patterns of gene expression. RESULTS Here, we present a carefully designed framework, SCISSORS, which accurately profiles subclusters within broad cluster(s) for the identification of rare cell types in scRNA-seq data. SCISSORS employs silhouette scoring for the estimation of heterogeneity of clusters and reveals rare cells in heterogeneous clusters by a multi-step semi-supervised reclustering process. Additionally, SCISSORS provides a method for the identification of marker genes of high specificity to the cell type. SCISSORS is wrapped around the popular Seurat R package and can be easily integrated into existing Seurat pipelines. AVAILABILITY AND IMPLEMENTATION SCISSORS, including source code and vignettes, is freely available at https://github.com/jr-leary7/SCISSORS.
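The silhouette score that this abstract relies on can be computed from pairwise distances alone. The sketch below shows only the standard per-point silhouette, s = (b - a) / max(a, b), where a is the mean distance to the point's own cluster and b is the smallest mean distance to any other cluster. It is not the SCISSORS reclustering procedure, and the function name and toy coordinates are hypothetical.

```python
import math

def silhouette_scores(points, labels):
    """Per-point silhouette scores using Euclidean distance.

    Assumes at least two distinct cluster labels are present.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

# Two well-separated toy groups (hypothetical 2D embeddings of cells).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_scores(points, [0, 0, 1, 1])
bad = silhouette_scores(points, [0, 1, 0, 1])
```

Scores close to 1 indicate a compact, well-separated cluster, while low or negative scores flag a cluster whose members may belong elsewhere, which is the kind of heterogeneity signal the abstract describes.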
Affiliation(s)
- Jack R Leary
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
- Yi Xu
- Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Ashley B Morrison
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Chong Jin
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Emily C Shen
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Peyton C Kuhlers
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Ye Su
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Naim U Rashid
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Jen Jen Yeh
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Department of Surgery, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Xianlu Laura Peng
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
- Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States
6
Pan Y, Landis JT, Moorad R, Wu D, Marron JS, Dittmer DP. The Poisson distribution model fits UMI-based single-cell RNA-sequencing data. BMC Bioinformatics 2023; 24:256. [PMID: 37330471 PMCID: PMC10276395 DOI: 10.1186/s12859-023-05349-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 05/24/2023] [Indexed: 06/19/2023] Open
Abstract
BACKGROUND Modeling of single cell RNA-sequencing (scRNA-seq) data remains challenging due to a high percentage of zeros and data heterogeneity, so improved modeling has strong potential to benefit many downstream data analyses. The existing zero-inflated or over-dispersed models are based on aggregations at either the gene or the cell level. However, they typically lose accuracy due to a too crude aggregation at those two levels. RESULTS We avoid the crude approximations entailed by such aggregation through proposing an independent Poisson distribution (IPD) particularly at each individual entry in the scRNA-seq data matrix. This approach naturally and intuitively models the large number of zeros as matrix entries with a very small Poisson parameter. The critical challenge of cell clustering is approached via a novel data representation as Departures from a simple homogeneous IPD (DIPD) to capture the per-gene-per-cell intrinsic heterogeneity generated by cell clusters. Our experiments using real data and crafted experiments show that using DIPD as a data representation for scRNA-seq data can uncover novel cell subtypes that are missed or can only be found by careful parameter tuning using conventional methods. CONCLUSIONS This new method has multiple advantages, including (1) no need for prior feature selection or manual optimization of hyperparameters; (2) flexibility to combine with and improve upon other methods, such as Seurat. Another novel contribution is the use of crafted experiments as part of the validation of our newly developed DIPD-based clustering pipeline. This new clustering pipeline is implemented in the R (CRAN) package scpoisson.
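The statement that zeros arise naturally from entries with a very small Poisson parameter follows directly from the Poisson probability mass function, since P(count = 0) = exp(-lambda). The short sketch below only illustrates that relationship; the rates are hypothetical values, and nothing here reproduces the DIPD representation or the scpoisson package.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing count k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical per-entry rates, chosen only for illustration.
for lam in (0.01, 0.1, 1.0):
    print(f"lambda={lam}: P(count = 0) = {poisson_pmf(0, lam):.3f}")
```

With a rate of 0.1, for example, the probability of a zero is exp(-0.1), roughly 0.905, so a matrix dominated by small rates is expected to be mostly zeros without any separate zero-inflation term.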
Affiliation(s)
- Yue Pan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Justin T Landis
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Razia Moorad
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Di Wu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Adams School of Dentistry, University of North Carolina at Chapel Hill, Chapel Hill, USA
- J S Marron
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, USA
- Dirk P Dittmer
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, USA.
- Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, USA.
7
Miranda AMA, Janbandhu V, Maatz H, Kanemaru K, Cranley J, Teichmann SA, Hübner N, Schneider MD, Harvey RP, Noseda M. Single-cell transcriptomics for the assessment of cardiac disease. Nat Rev Cardiol 2023; 20:289-308. [PMID: 36539452 DOI: 10.1038/s41569-022-00805-7] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Accepted: 11/03/2022] [Indexed: 12/24/2022]
Abstract
Cardiovascular disease is the leading cause of death globally. An advanced understanding of cardiovascular disease mechanisms is required to improve therapeutic strategies and patient risk stratification. State-of-the-art, large-scale, single-cell and single-nucleus transcriptomics facilitate the exploration of the cardiac cellular landscape at an unprecedented level, beyond its descriptive features, and can further our understanding of the mechanisms of disease and guide functional studies. In this Review, we provide an overview of the technical challenges in the experimental design of single-cell and single-nucleus transcriptomics studies, as well as a discussion of the type of inferences that can be made from the data derived from these studies. Furthermore, we describe novel findings derived from transcriptomics studies for each major cardiac cell type in both health and disease, and from development to adulthood. This Review also provides a guide to interpreting the exhaustive list of newly identified cardiac cell types and states, and highlights the consensus and discordances in annotation, indicating an urgent need for standardization. We describe advanced applications such as integration of single-cell data with spatial transcriptomics to map genes and cells on tissue and define cellular microenvironments that regulate homeostasis and disease progression. Finally, we discuss current and future translational and clinical implications of novel transcriptomics approaches, and provide an outlook of how these technologies will change the way we diagnose and treat heart disease.
Affiliation(s)
- Vaibhao Janbandhu
- Victor Chang Cardiac Research Institute, Sydney, NSW, Australia
- School of Clinical Medicine, Faculty of Medicine, UNSW Sydney, Sydney, NSW, Australia
- Henrike Maatz
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Kazumasa Kanemaru
- Cellular Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- James Cranley
- Cellular Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Sarah A Teichmann
- Cellular Genetics Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Department of Physics, Cavendish Laboratory, University of Cambridge, Cambridge, UK
- Norbert Hübner
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Charité-Universitätsmedizin Berlin, Berlin, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site Berlin, Berlin, Germany
- Richard P Harvey
- Victor Chang Cardiac Research Institute, Sydney, NSW, Australia
- School of Clinical Medicine, Faculty of Medicine, UNSW Sydney, Sydney, NSW, Australia
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW, Australia
- Michela Noseda
- National Heart and Lung Institute, Imperial College London, London, UK.
8
Pu J, Wang B, Liu X, Chen L, Li SC. SMURF: embedding single-cell RNA-seq data with matrix factorization preserving self-consistency. Brief Bioinform 2023; 24:7008800. [PMID: 36715274 DOI: 10.1093/bib/bbad026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 12/17/2022] [Accepted: 01/09/2023] [Indexed: 01/31/2023] Open
Abstract
Advances in single-cell RNA-sequencing (scRNA-seq) shed light on cell-specific transcriptomic studies of cell development, complex diseases and cancers. Nevertheless, scRNA-seq techniques suffer from 'dropout' events, and imputation tools have been proposed to address the resulting sparsity. Here, rather than imputation, we propose a tool, SMURF, to extract low-dimensional embeddings of cells and genes using matrix factorization with a mixture of Poisson-Gamma divergence as the objective while preserving self-consistency. SMURF exhibits feasible cell subpopulation discovery efficacy with the obtained cell embeddings on replicated in silico and eight wet lab scRNA-seq datasets with ground truth cell types. Furthermore, SMURF can reduce the cell embedding to a 1D-oval space to recover the time course of the cell cycle. SMURF can also serve as an imputation tool; the in silico data assessment shows that SMURF has the most robust gene expression recovery power, with low root mean square error and high Pearson correlation. Moreover, SMURF recovers the gene distribution for the WM989 Drop-seq data. SMURF is available at https://github.com/deepomicslab/SMURF.
Affiliation(s)
- Juhua Pu
- State Key Laboratory of Software Development Environment, Beihang University, Beijing, China
- Beihang Hangzhou Innovation Institute Yuhang, Xixi Octagon City, Yuhang District, Hangzhou 310023, China
- Bingchen Wang
- State Key Laboratory of Software Development Environment, Beihang University, Beijing, China
- Beihang Hangzhou Innovation Institute Yuhang, Xixi Octagon City, Yuhang District, Hangzhou 310023, China
- Xingwu Liu
- School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning, China
- Lingxi Chen
- Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China
- Shuai Cheng Li
- Department of Computer Science, City University of Hong Kong, 83 Tat Chee Ave, Kowloon Tong, Hong Kong, China
9
Liu Y, Li HD, Xu Y, Liu YW, Peng X, Wang J. IsoCell: An Approach to Enhance Single Cell Clustering by Integrating Isoform-Level Expression Through Orthogonal Projection. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:465-475. [PMID: 35100120 DOI: 10.1109/tcbb.2022.3147193] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/04/2023]
Abstract
Single cell RNA sequencing (scRNA-seq) provides a powerful approach for profiling transcriptomes at single cell resolution. An essential application of scRNA-seq is the discovery of cell types with the aid of clustering analysis. Currently, existing single cell clustering methods are exclusively based on gene-level expression data, without considering alternative splicing information. It has been shown that alternative splicing has an important influence on biological processes such as cell differentiation and cell cycle. We therefore hypothesize that adding information about alternative splicing may help enhance single cell clustering. This motivates us to develop a way to integrate isoform-level expression and gene-level expression. We report an approach to enhance single cell clustering by integrating isoform-level expression through orthogonal projection. First, we construct an orthogonal projection matrix based on gene expression data. Second, isoforms are projected to the gene space to remove the redundant information between them. Third, isoform selection is performed based on the residual of the projected expression and the selected isoforms are combined with gene expression data for subsequent clustering. We applied our method to sixteen scRNA-seq datasets. We find that alternative splicing contains differential information among cell types and can be integrated to enhance single cell clustering. Compared with using only gene-level expression data, the integration of isoform-level expression leads to better clustering performances for most of the datasets. The integration of isoform-level expression also has potential in the detection of novel cell subgroups. Our study shows that integrating isoform and gene-level expression is a promising way to improve single cell clustering. The IsoCell R package is freely available at both Github (https://github.com/genemine/IsoCell) and Zenodo (https://zenodo.org/record/4395707).
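The projection step outlined in this abstract can be sketched with plain linear algebra: build an orthonormal basis for the space spanned by gene expression vectors, project an isoform vector onto it, and keep the residual as the part not explained by gene-level expression. This is only an illustration under stated assumptions, not the IsoCell R package: it treats each gene or isoform as a vector of expression values across cells, uses Gram-Schmidt instead of an explicit projection matrix, and all names and numbers are hypothetical.

```python
import math

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt orthonormalization; near-dependent vectors are skipped."""
    basis = []
    for vector in vectors:
        w = list(vector)
        for q in basis:
            coeff = sum(a * b for a, b in zip(w, q))
            w = [a - coeff * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:
            basis.append([a / norm for a in w])
    return basis

def residual_after_projection(vector, basis):
    """Subtract the orthogonal projection of vector onto span(basis)."""
    projection = [0.0] * len(vector)
    for q in basis:
        coeff = sum(a * b for a, b in zip(vector, q))
        projection = [p + coeff * b for p, b in zip(projection, q)]
    return [a - p for a, p in zip(vector, projection)]

def norm(vector):
    return math.sqrt(sum(a * a for a in vector))

# Toy data: two genes measured across four cells (hypothetical values).
genes = [[1.0, 2.0, 0.0, 1.0], [0.0, 1.0, 3.0, 1.0]]
basis = orthonormal_basis(genes)

# An isoform that is a linear combination of the genes carries no extra signal.
redundant_isoform = [2.0, 5.0, 3.0, 3.0]
# An isoform orthogonal to both genes is entirely new information.
novel_isoform = [-3.0, 0.0, -1.0, 3.0]

redundant_residual = norm(residual_after_projection(redundant_isoform, basis))
novel_residual = norm(residual_after_projection(novel_isoform, basis))
```

Ranking isoforms by residual size is one plausible reading of the "isoform selection based on the residual" step; the selection criterion actually used by the authors is not stated in the abstract.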
10
Umu SU, Rapp Vander-Elst K, Karlsen VT, Chouliara M, Bækkevold ES, Jahnsen FL, Domanska D. Cellsnake: a user-friendly tool for single-cell RNA sequencing analysis. Gigascience 2022; 12:giad091. [PMID: 37889009 PMCID: PMC10603768 DOI: 10.1093/gigascience/giad091] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/01/2023] [Revised: 08/25/2023] [Accepted: 10/05/2023] [Indexed: 10/28/2023] Open
Abstract
BACKGROUND Single-cell RNA sequencing (scRNA-seq) provides high-resolution transcriptome data to understand the heterogeneity of cell populations at the single-cell level. The analysis of scRNA-seq data requires the utilization of numerous computational tools. However, nonexpert users usually experience installation issues, a lack of critical functionality or batch analysis modes, and the steep learning curves of existing pipelines. RESULTS We have developed cellsnake, a comprehensive, reproducible, and accessible single-cell data analysis workflow, to overcome these problems. Cellsnake offers advanced features for standard users and facilitates downstream analyses in both R and Python environments. It is also designed for easy integration into existing workflows, allowing for rapid analyses of multiple samples. CONCLUSION As an open-source tool, cellsnake is accessible through Bioconda, PyPi, Docker, and GitHub, making it a cost-effective and user-friendly option for researchers. By using cellsnake, researchers can streamline the analysis of scRNA-seq data and gain insights into the complex biology of single cells.
Affiliation(s)
- Sinan U Umu
- Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo 0372, Norway
- Victoria T Karlsen
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Manto Chouliara
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Espen Sønderaal Bækkevold
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Institute of Oral Biology, University of Oslo, Oslo 0372, Norway
- Frode Lars Jahnsen
- Department of Pathology, Institute of Clinical Medicine, University of Oslo, Oslo 0372, Norway
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Diana Domanska
- Department of Pathology, Oslo University Hospital-Rikshospitalet, Oslo 0372, Norway
- Department of Microbiology, University of Oslo, Rikshospitalet, Oslo 0372, Norway
11
Mikolajewicz N, Gacesa R, Aguilera-Uribe M, Brown KR, Moffat J, Han H. Multi-level cellular and functional annotation of single-cell transcriptomes using scPipeline. Commun Biol 2022; 5:1142. [PMID: 36307536 PMCID: PMC9616830 DOI: 10.1038/s42003-022-04093-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/13/2022] [Accepted: 10/11/2022] [Indexed: 11/08/2022] Open
Abstract
Single-cell RNA-sequencing (scRNA-seq) offers functional insight into complex biology, allowing for the interrogation of cellular populations and gene expression programs at single-cell resolution. Here, we introduce scPipeline, a single-cell data analysis toolbox that builds on existing methods and offers modular workflows for multi-level cellular annotation and user-friendly analysis reports. Advances to scRNA-seq annotation include: (i) co-dependency index (CDI)-based differential expression, (ii) cluster resolution optimization using a marker-specificity criterion, (iii) marker-based cell-type annotation with Miko scoring, and (iv) gene program discovery using scale-free shared nearest neighbor network (SSN) analysis. Both unsupervised and supervised procedures were validated using a diverse collection of scRNA-seq datasets, and illustrative examples of cellular transcriptomic annotation of developmental and immunological scRNA-seq atlases are provided herein. Overall, scPipeline offers a flexible computational framework for in-depth scRNA-seq analysis.
Affiliation(s)
- Nicholas Mikolajewicz
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Rafael Gacesa
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Magali Aguilera-Uribe
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Kevin R Brown
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Jason Moffat
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Institute for Biomedical Engineering, University of Toronto, Toronto, ON, Canada
- Hong Han
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
|
12
|
Zeng Y, Wei Z, Zhong F, Pan Z, Lu Y, Yang Y. A parameter-free deep embedded clustering method for single-cell RNA-seq data. Brief Bioinform 2022; 23:6582003. [PMID: 35524494 DOI: 10.1093/bib/bbac172] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/21/2021] [Revised: 03/25/2022] [Accepted: 04/18/2022] [Indexed: 11/12/2022]
Abstract
Clustering analysis is widely used in single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) data to discover cell heterogeneity and cell states. While many clustering methods have been developed for scRNA-seq analysis, most of them require the user to provide the number of clusters. However, it is not easy to know the exact number of cell types in advance, and a choice based on experience is not always reliable. Here, we have developed ADClust, an automatic deep embedding clustering method for scRNA-seq data, which can accurately cluster cells without requiring a predefined number of clusters. Specifically, ADClust first obtains a low-dimensional representation through a pre-trained autoencoder and uses the representation to cluster cells into initial micro-clusters. The micro-clusters are then compared pairwise with a statistical test, and similar micro-clusters are merged into larger clusters. According to the clustering, cell representations are updated so that each cell is pulled toward the centers of its assigned cluster and of similar clusters, while cells are separated to keep distances between clusters. This is accomplished by jointly optimizing the carefully designed clustering and autoencoder loss functions. This merging process continues until convergence. ADClust was tested on 11 real scRNA-seq datasets and was shown to outperform existing methods in terms of both clustering performance and the accuracy of the determined number of clusters. More importantly, our model provides high speed and scalability for large datasets.
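The merge step described in this abstract can be illustrated with a minimal sketch. It is not the ADClust implementation: the autoencoder and the joint loss are omitted, the embedding `X` and the initial micro-cluster labels are assumed to be given, and a simple separation score (centroid gap divided by pooled spread along the connecting axis) stands in for the statistical test, which the abstract does not specify. The function names and the threshold of 4.0 are hypothetical.

```python
import numpy as np

def separation(a, b):
    """Centroid gap along the axis joining two clusters, in units of pooled spread.

    Stand-in for the statistical test mentioned in the abstract (assumption).
    """
    direction = b.mean(axis=0) - a.mean(axis=0)
    gap = np.linalg.norm(direction)
    if gap == 0.0:
        return 0.0
    direction = direction / gap
    pa, pb = a @ direction, b @ direction          # 1-D projections
    pooled = np.sqrt((pa.var() + pb.var()) / 2.0)  # pooled standard deviation
    return gap / max(pooled, 1e-12)

def merge_micro_clusters(X, labels, threshold=4.0):
    """Greedily merge the least-separated pair until every pair exceeds threshold."""
    labels = np.asarray(labels).copy()
    while True:
        ids = np.unique(labels)
        best = None
        for i, u in enumerate(ids):
            for v in ids[i + 1:]:
                s = separation(X[labels == u], X[labels == v])
                if s < threshold and (best is None or s < best[0]):
                    best = (s, u, v)
        if best is None:          # no remaining pair is similar enough
            return labels
        _, u, v = best
        labels[labels == v] = u   # merge v into u

# Synthetic demo: two well-separated blobs, each over-split into two micro-clusters.
rng = np.random.default_rng(0)
blob_a = rng.normal(size=(200, 2))
blob_b = rng.normal(size=(200, 2)) + np.array([20.0, 0.0])
X = np.vstack([blob_a, blob_b])
micro = np.concatenate([
    (blob_a[:, 0] >= 0).astype(int),          # micro-clusters 0 and 1
    2 + (blob_b[:, 0] >= 20.0).astype(int),   # micro-clusters 2 and 3
])
merged = merge_micro_clusters(X, micro)
print("clusters after merging:", len(np.unique(merged)))
```

Starting from more micro-clusters than true groups and merging downward is what removes the need to fix the number of clusters in advance; the final count depends on the merge criterion rather than on a user-supplied value.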
Affiliation(s)
- Yuansong Zeng
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Zhuoyi Wei
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Fengqi Zhong
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Zixiang Pan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Yutong Lu
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
  - Key Laboratory of Machine Intelligence and Advanced Computing (MOE), Guangzhou 510000, China
|
13
|
Yao J, Zhang Y, Li M, Sun Z, Liu T, Zhao M, Li Z. Single-Cell RNA-Seq Reveals the Promoting Role of Ferroptosis Tendency During Lung Adenocarcinoma EMT Progression. Front Cell Dev Biol 2022; 9:822315. [PMID: 35127731 PMCID: PMC8810644 DOI: 10.3389/fcell.2021.822315] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/25/2021] [Accepted: 12/30/2021] [Indexed: 01/31/2023]
Abstract
Epithelial-mesenchymal transition (EMT) and ferroptosis are two important processes in biology. In tumor cells, they are intimately linked. We used single-cell RNA sequencing to investigate the regulatory connection between EMT and ferroptosis tendency in lung adenocarcinoma (LUAD) epithelial cells. We used Seurat to construct the expression matrix from the GEO dataset GSE131907 and to extract epithelial cells. We found a positive correlation between EMT and ferroptosis tendency. We then used SCENIC to analyze differentially activated transcription factors and constructed a directed molecular regulatory network by causal inference. Some ferroptosis markers (GPX4, SCP2, CAV1) were found to have strong regulatory effects on EMT. Cell communication networks constructed with iTALK implied that Ferro_High_EMT_High cells have higher expression of SDC1 and SDC4 and activation of the LGALS9-HAVCR2 pathway. Deconvolution of bulk sequencing data with CIBERSORTx showed that the co-occurrence of ferroptosis tendency and EMT may lead to tumor metastasis and non-response to immunotherapy. Our findings showed a strong correlation between ferroptosis tendency and EMT. Ferroptosis may have a promoting effect on EMT. High propensities for ferroptosis and EMT may lead to poor prognosis and non-response to immunotherapy.
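One simple way to quantify a per-cell "tendency" and test whether two tendencies move together is sketched below. This is an illustration only, not the authors' Seurat-based workflow: the scoring rule (mean z-scored expression of a gene set), the function names, the EMT gene list (VIM, FN1, CDH2), and the expression matrix are all assumptions, and the data are synthetic rather than taken from GSE131907. Only GPX4, SCP2, and CAV1 come from the abstract.

```python
import numpy as np

def signature_score(expr, genes, signature):
    """Mean z-scored expression of the signature genes, one value per cell.

    expr: array of shape (n_cells, n_genes); genes: list of gene names.
    """
    idx = [genes.index(g) for g in signature if g in genes]
    sub = expr[:, idx]
    sd = sub.std(axis=0)
    z = (sub - sub.mean(axis=0)) / np.where(sd > 0, sd, 1.0)
    return z.mean(axis=1)

def pearson(x, y):
    """Pearson correlation coefficient between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])

# Synthetic expression: one shared latent factor drives all six genes (assumption).
genes = ["GPX4", "SCP2", "CAV1", "VIM", "FN1", "CDH2"]
rng = np.random.default_rng(0)
latent = rng.normal(size=300)
expr = latent[:, None] + 0.5 * rng.normal(size=(300, len(genes)))

ferro = signature_score(expr, genes, ["GPX4", "SCP2", "CAV1"])  # markers named in the abstract
emt = signature_score(expr, genes, ["VIM", "FN1", "CDH2"])      # illustrative EMT markers
print("Pearson r between ferroptosis and EMT scores:", round(pearson(ferro, emt), 3))
```

A positive coefficient on real data would indicate only that the two scores co-vary across cells; it does not by itself establish the direction of regulation, which the abstract addresses separately through causal inference.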
Affiliation(s)
- Jiaxi Yao
  - Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
  - Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Yuchong Zhang
  - Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
- Mengling Li
  - Department of Clinical Epidemiology and Center of Evidence-Based Medicine, The First Hospital of China Medical University, Shenyang, China
- Zuyu Sun
  - Department of Urology, The First Hospital of China Medical University, Shenyang, China
- Tao Liu
  - Department of Urology, The First Hospital of China Medical University, Shenyang, China
  - *Correspondence: Tao Liu; Mingfang Zhao; Zhi Li
- Mingfang Zhao
  - Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
  - *Correspondence: Tao Liu; Mingfang Zhao; Zhi Li
- Zhi Li
  - Department of Medical Oncology, The First Hospital of China Medical University, Shenyang, China
  - *Correspondence: Tao Liu; Mingfang Zhao; Zhi Li
|