1
Das Adhikari S, Yang J, Wang J, Cui Y. Recent advances in spatially variable gene detection in spatial transcriptomics. Comput Struct Biotechnol J 2024; 23:883-891. [PMID: 38370977 PMCID: PMC10869304 DOI: 10.1016/j.csbj.2024.01.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 01/22/2024] [Accepted: 01/22/2024] [Indexed: 02/20/2024] Open
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
| | - Jiaxin Yang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
2
Nie W, Yu Y, Wang X, Wang R, Li SC. Spatially Informed Graph Structure Learning Extracts Insights from Spatial Transcriptomics. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024:e2403572. [PMID: 39382177 DOI: 10.1002/advs.202403572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 08/04/2024] [Indexed: 10/10/2024]
Abstract
Embeddings derived from cell graphs hold significant potential for exploring spatial transcriptomics (ST) datasets. Nevertheless, existing methodologies rely on a graph structure defined by spatial proximity, which inadequately represents the diversity inherent in cell-cell interactions (CCIs). This study introduces STAGUE, an innovative framework that concurrently learns a cell graph structure and a low-dimensional embedding from ST data. STAGUE employs graph structure learning to parameterize and refine a cell graph adjacency matrix, enabling the generation of learnable graph views for effective contrastive learning. The derived embeddings and cell graph improve spatial clustering accuracy and facilitate the discovery of novel CCIs. Experimental benchmarks across 86 real and simulated ST datasets show that STAGUE outperforms 15 comparison methods in clustering performance. Additionally, STAGUE delineates the heterogeneity in human breast cancer tissues, revealing the activation of epithelial-to-mesenchymal transition and PI3K/AKT signaling in specific sub-regions. Furthermore, STAGUE identifies CCIs with greater alignment to established biological knowledge than those ascertained by existing graph autoencoder-based methods. STAGUE also reveals the regulatory genes that participate in these CCIs, including those enriched in neuropeptide signaling and receptor tyrosine kinase signaling pathways, thereby providing insights into the underlying biological processes.
Affiliation(s)
- Wan Nie
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| | - Yingying Yu
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| | - Xueying Wang
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- City University of Hong Kong (Dongguan), Dongguan, 523000, China
| | - Ruohan Wang
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| | - Shuai Cheng Li
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
3
Yan G, Hua SH, Li JJ. Categorization of 33 computational methods to detect spatially variable genes from spatially resolved transcriptomics data. ARXIV 2024:arXiv:2405.18779v4. [PMID: 38855546 PMCID: PMC11160866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 33 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.
Affiliation(s)
- Guanao Yan
- Department of Statistics, University of California, Los Angeles, CA 90095-1554
| | - Shuo Harper Hua
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, CA 90095-1554
- Department of Human Genetics, University of California, Los Angeles, CA 90095-7088
- Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766
- Department of Biostatistics, University of California, Los Angeles, CA 90095-1772
- Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA 02138
4
Li Y, Zhou X, Chen R, Zhang X, Cao H. STAREG: Statistical replicability analysis of high throughput experiments with applications to spatial transcriptomic studies. PLoS Genet 2024; 20:e1011423. [PMID: 39361716 PMCID: PMC11478871 DOI: 10.1371/journal.pgen.1011423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 10/15/2024] [Accepted: 09/10/2024] [Indexed: 10/05/2024] Open
Abstract
Replicable signals from different yet conceptually related studies provide stronger scientific evidence and more powerful inference. We introduce STAREG, a statistical method for replicability analysis of high throughput experiments, and apply it to analyze spatial transcriptomic studies. STAREG uses summary statistics from multiple studies of high throughput experiments and models the joint distribution of p-values, accounting for the heterogeneity of different studies. It effectively controls the false discovery rate (FDR) and has higher power by information borrowing. Moreover, it provides different rankings of important genes. With the EM algorithm in combination with the pool-adjacent-violators algorithm (PAVA), STAREG is scalable to datasets with millions of genes without any tuning parameters. Analyzing two pairs of spatially resolved transcriptomic datasets, we are able to make biological discoveries that otherwise cannot be obtained by using existing methods.
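For readers unfamiliar with PAVA, the following is a minimal, generic sketch of pool-adjacent-violators for a weighted non-decreasing least-squares fit. It is not STAREG's implementation and says nothing about how the abstract's EM step uses it; the function name `pava_non_decreasing` is hypothetical and chosen only for illustration.

```python
def pava_non_decreasing(values, weights=None):
    """Weighted least-squares non-decreasing fit via pool-adjacent-violators.

    Generic textbook sketch (assumption: not the STAREG code).
    """
    n = len(values)
    if weights is None:
        weights = [1.0] * n
    # Each block holds [weighted mean, total weight, number of pooled points].
    blocks = []
    for value, weight in zip(values, weights):
        blocks.append([float(value), float(weight), 1])
        # Pool backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean_b, weight_b, count_b = blocks.pop()
            mean_a, weight_a, count_a = blocks.pop()
            total = weight_a + weight_b
            pooled_mean = (mean_a * weight_a + mean_b * weight_b) / total
            blocks.append([pooled_mean, total, count_a + count_b])
    # Expand each block back to one fitted value per input point.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

Each input point is appended once and each merge removes a block, so the pass is linear in the number of points, which is consistent with the scalability the abstract attributes to the EM-plus-PAVA combination.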
Affiliation(s)
- Yan Li
- School of Computer Science and Technology, Changchun University of Science and Technology, Changchun, Jilin, China
- School of Mathematics, Jilin University, Changchun, Jilin, China
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Rui Chen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Xianyang Zhang
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
| | - Hongyuan Cao
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
5
Shiau C, Cao J, Gong D, Gregory MT, Caldwell NJ, Yin X, Cho JW, Wang PL, Su J, Wang S, Reeves JW, Kim TK, Kim Y, Guo JA, Lester NA, Bae JW, Zhao R, Schurman N, Barth JL, Ganci ML, Weissleder R, Jacks T, Qadan M, Hong TS, Wo JY, Roberts H, Beechem JM, Castillo CFD, Mino-Kenudson M, Ting DT, Hemberg M, Hwang WL. Spatially resolved analysis of pancreatic cancer identifies therapy-associated remodeling of the tumor microenvironment. Nat Genet 2024:10.1038/s41588-024-01890-9. [PMID: 39227743 DOI: 10.1038/s41588-024-01890-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2023] [Accepted: 07/30/2024] [Indexed: 09/05/2024]
Abstract
In combination with cell-intrinsic properties, interactions in the tumor microenvironment modulate therapeutic response. We leveraged single-cell spatial transcriptomics to dissect the remodeling of multicellular neighborhoods and cell-cell interactions in human pancreatic cancer associated with neoadjuvant chemotherapy and radiotherapy. We developed spatially constrained optimal transport interaction analysis (SCOTIA), an optimal transport model with a cost function that includes both spatial distance and ligand-receptor gene expression. Our results uncovered a marked change in ligand-receptor interactions between cancer-associated fibroblasts and malignant cells in response to treatment, which was supported by orthogonal datasets, including an ex vivo tumoroid coculture system. We identified enrichment in interleukin-6 family signaling that functionally confers resistance to chemotherapy. Overall, this study demonstrates that characterization of the tumor microenvironment using single-cell spatial transcriptomics allows for the identification of molecular interactions that may play a role in the emergence of therapeutic resistance and offers a spatially based analysis framework that can be broadly applied to other contexts.
Affiliation(s)
- Carina Shiau
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jingyi Cao
- Gene Lay Institute of Immunology and Inflammation, Brigham and Women's Hospital, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Dennis Gong
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard-MIT Health Sciences and Technology Program, Cambridge, MA, USA
| | | | - Nicholas J Caldwell
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Xunqin Yin
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jae-Won Cho
- Gene Lay Institute of Immunology and Inflammation, Brigham and Women's Hospital, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Peter L Wang
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Jennifer Su
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Steven Wang
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | | | | | | | - Jimmy A Guo
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biological and Biomedical Sciences Program, Harvard Medical School, Boston, MA, USA
| | - Nicole A Lester
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Jung Woo Bae
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Ryan Zhao
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Jamie L Barth
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Maria L Ganci
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Ralph Weissleder
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Tyler Jacks
- Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Motaz Qadan
- Department of Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Theodore S Hong
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Jennifer Y Wo
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Hannah Roberts
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | | | | | - Mari Mino-Kenudson
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - David T Ting
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
- Department of Medical Oncology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
| | - Martin Hemberg
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Gene Lay Institute of Immunology and Inflammation, Brigham and Women's Hospital, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
| | - William L Hwang
- Center for Systems Biology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
- Department of Radiation Oncology, Massachusetts General Hospital, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
- Center for Cancer Research, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
6
Wu D, Datta S. TWCOM: an R package for inference of cell-cell communication on spatially resolved transcriptomics data. BIOINFORMATICS ADVANCES 2024; 4:vbae101. [PMID: 39040219 PMCID: PMC11262461 DOI: 10.1093/bioadv/vbae101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 05/16/2024] [Accepted: 07/15/2024] [Indexed: 07/24/2024]
Abstract
Summary: The inference of cell-cell communication is important, as it unveils the intricate cellular behaviors at the molecular level, providing crucial insights essential for understanding complex biological processes and informing targeted interventions in various pathological contexts. Here, we present TWCOM, an R package that implements a Tweedie distribution-based model for accurate cell-cell communication inference. Operating under a generalized additive model framework, TWCOM adeptly handles both single-cell resolution and spot-based spatially resolved transcriptomics data, providing a versatile tool for robust biological sample analysis. Availability and implementation: The R package TWCOM is available at https://github.com/dongyuanwu/TWCOM. Comprehensive documentation is included with the package.
Affiliation(s)
- Dongyuan Wu
- Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
| | - Susmita Datta
- Department of Biostatistics, University of Florida, Gainesville, FL 32603, United States
7
Liao L, Martin PCN, Kim H, Panahandeh S, Won KJ. Data enhancement in the age of spatial biology. Adv Cancer Res 2024; 163:39-70. [PMID: 39271267 DOI: 10.1016/bs.acr.2024.06.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
Unveiling the intricate interplay of cells in their native environment lies at the heart of understanding fundamental biological processes and unraveling disease mechanisms, particularly in complex diseases like cancer. Spatial transcriptomics (ST) offers a revolutionary lens into the spatial organization of gene expression within tissues, empowering researchers to study both cell heterogeneity and microenvironments in health and disease. However, current ST technologies often face limitations in either resolution or the number of genes profiled simultaneously. Integrating ST data with complementary sources, such as single-cell transcriptomics and detailed tissue staining images, presents a powerful solution to overcome these limitations. This review delves into the computational approaches driving the integration of spatial transcriptomics with other data types. By illuminating the key challenges and outlining the current algorithmic solutions, we aim to highlight the immense potential of these methods to revolutionize our understanding of cancer biology.
Affiliation(s)
- Linbu Liao
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Denmark; Samuel Oschin Cancer Center, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Patrick C N Martin
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Hyobin Kim
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Sanaz Panahandeh
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Kyoung Jae Won
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States.
8
Yuan X, Ma Y, Gao R, Cui S, Wang Y, Fa B, Ma S, Wei T, Ma S, Yu Z. HEARTSVG: a fast and accurate method for identifying spatially variable genes in large-scale spatial transcriptomics. Nat Commun 2024; 15:5700. [PMID: 38972896 PMCID: PMC11228050 DOI: 10.1038/s41467-024-49846-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 06/19/2024] [Indexed: 07/09/2024] Open
Abstract
Identifying spatially variable genes (SVGs) is crucial for understanding the spatiotemporal characteristics of diseases and tissue structures, posing a distinctive challenge in spatial transcriptomics research. We propose HEARTSVG, a distribution-free, test-based method for fast and accurate identification of spatially variable genes in large-scale spatial transcriptomic data. Extensive simulations demonstrate that HEARTSVG outperforms state-of-the-art methods with higher F1 scores (average F1 score = 0.948), improved computational efficiency, scalability, and reduced false positives (FPs). Through analysis of twelve real datasets from various spatial transcriptomic technologies, HEARTSVG identifies a greater number of biologically significant SVGs (average AUC = 0.792) than other comparative methods without prespecifying spatial patterns. Furthermore, by clustering SVGs, we uncover two distinct tumor spatial domains characterized by unique spatial expression patterns, spatial-temporal locations, and biological functions in human colorectal cancer data, unraveling the complexity of tumors.
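The F1 score quoted in this abstract is the harmonic mean of precision and recall. As a small illustration of how such a score could be computed for a set of called SVGs against a simulated ground truth, here is a sketch; the function name `svg_f1_score` and the gene labels are hypothetical, and the exact evaluation protocol used in the paper is not stated in the abstract.

```python
def svg_f1_score(called_genes, true_genes):
    """F1 score of a called gene set against a ground-truth gene set.

    Illustrative helper only (assumption: not the HEARTSVG evaluation code).
    """
    called, truth = set(called_genes), set(true_genes)
    true_positives = len(called & truth)
    false_positives = len(called - truth)
    false_negatives = len(truth - called)
    if true_positives == 0:
        # Precision and recall are both zero (or undefined); report 0.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: two of three calls are correct and one true SVG is missed.
example_score = svg_f1_score(["g1", "g2", "g3"], ["g1", "g2", "g4"])
```

Because F1 penalizes both false positives and missed genes, a method can only score near 1 by keeping both error types low, which is why it is paired with the false-positive claim in the abstract.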
Affiliation(s)
- Xin Yuan
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
| | - Yanran Ma
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Ruitian Gao
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Shuya Cui
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China
| | - Yifan Wang
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Botao Fa
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Xi'an Jiaotong University, Xi'an, Shanxi, China
| | - Shiyang Ma
- Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Ting Wei
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Shuangge Ma
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China.
- Department of Biostatistics, Yale University, New Haven, USA.
| | - Zhangsheng Yu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
- SJTU-Yale Joint Center for Biostatistics and Data Science Organization, Shanghai Jiao Tong University, Shanghai, China.
- Clinical Research Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
- Center for Biomedical Data Science, Translational Science Institute, Shanghai Jiao Tong University School of Medicine, Shanghai, China.
9
Qian J, Bao H, Shao X, Fang Y, Liao J, Chen Z, Li C, Guo W, Hu Y, Li A, Yao Y, Fan X, Cheng Y. Simulating multiple variability in spatially resolved transcriptomics with scCube. Nat Commun 2024; 15:5021. [PMID: 38866768 PMCID: PMC11169532 DOI: 10.1038/s41467-024-49445-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 06/03/2024] [Indexed: 06/14/2024] Open
Abstract
A pressing challenge in spatially resolved transcriptomics (SRT) is to benchmark the computational methods. A widely-used approach involves utilizing simulated data. However, biases exist in terms of the currently available simulated SRT data, which seriously affects the accuracy of method evaluation and validation. Herein, we present scCube (https://github.com/ZJUFanLab/scCube), a Python package for independent, reproducible, and technology-diverse simulation of SRT data. scCube not only enables the preservation of spatial expression patterns of genes in reference-based simulations, but also generates simulated data with different spatial variability (covering the spatial pattern type, the resolution, the spot arrangement, the targeted gene type, and the tissue slice dimension, etc.) in reference-free simulations. We comprehensively benchmark scCube with existing single-cell or SRT simulators, and demonstrate the utility of scCube in benchmarking spot deconvolution, gene imputation, and resolution enhancement methods in detail through three applications.
Affiliation(s)
- Jingyang Qian
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Hudong Bao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Xin Shao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Yin Fang
- College of Computer Science and Technology, Zhejiang University, Hangzhou, 310013, China
| | - Jie Liao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Zhuo Chen
- College of Computer Science and Technology, Zhejiang University, Hangzhou, 310013, China
| | - Chengyu Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Wenbo Guo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Yining Hu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Anyao Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Yue Yao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Xiaohui Fan
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
| | - Yiyu Cheng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China.
10
Duo H, Li Y, Lan Y, Tao J, Yang Q, Xiao Y, Sun J, Li L, Nie X, Zhang X, Liang G, Liu M, Hao Y, Li B. Systematic evaluation with practical guidelines for single-cell and spatially resolved transcriptomics data simulation under multiple scenarios. Genome Biol 2024; 25:145. [PMID: 38831386 PMCID: PMC11149245 DOI: 10.1186/s13059-024-03290-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 05/28/2024] [Indexed: 06/05/2024] Open
Abstract
BACKGROUND: Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have led to groundbreaking advancements in life sciences. To develop bioinformatics tools for scRNA-seq and SRT data and perform unbiased benchmarks, data simulation has been widely adopted by providing explicit ground truth and generating customized datasets. However, the performance of simulation methods under multiple scenarios has not been comprehensively assessed, making it challenging to choose suitable methods without practical guidelines. RESULTS: We systematically evaluated 49 simulation methods developed for scRNA-seq and/or SRT data in terms of accuracy, functionality, scalability, and usability using 152 reference datasets derived from 24 platforms. SRTsim, scDesign3, ZINB-WaVE, and scDesign2 have the best accuracy performance across various platforms. Unexpectedly, some methods tailored to scRNA-seq data have potential compatibility for simulating SRT data. Lun, SPARSim, and scDesign3-tree outperform other methods under corresponding simulation scenarios. Phenopath, Lun, Simple, and MFA yield high scalability scores but they cannot generate realistic simulated data. Users should consider the trade-offs between method accuracy and scalability (or functionality) when making decisions. Additionally, execution errors are mainly caused by failed parameter estimations and appearance of missing or infinite values in calculations. We provide practical guidelines for method selection, a standard pipeline Simpipe (https://github.com/duohongrui/simpipe; https://doi.org/10.5281/zenodo.11178409), and an online tool Simsite (https://www.ciblab.net/software/simshiny/) for data simulation. CONCLUSIONS: No method performs best on all criteria, thus a good-yet-not-the-best method is recommended if it solves problems effectively and reasonably. Our comprehensive work provides crucial insights for developers on modeling gene expression data and fosters the simulation process for users.
Affiliation(s)
- Hongrui Duo
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
| | - Yinghong Li
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, People's Republic of China
| | - Yang Lan
- Institute of Pathology and Southwest Cancer Center, Southwest Hospital, Army Medical University, Chongqing, 400038, People's Republic of China
| | - Jingxin Tao
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
| | - Qingxia Yang
- Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, People's Republic of China
| | - Yingxue Xiao
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
| | - Jing Sun
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
| | - Lei Li
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
| | - Xiner Nie
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, People's Republic of China
| | - Xiaoxi Zhang
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China
| | - Guizhao Liang
- Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, People's Republic of China
| | - Mingwei Liu
- Key Laboratory of Clinical Laboratory Diagnostics, College of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, People's Republic of China
| | - Youjin Hao
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
| | - Bo Li
- College of Life Sciences, Chongqing Normal University, Chongqing, 401331, People's Republic of China.
|
11
|
Sang-aram C, Browaeys R, Seurinck R, Saeys Y. Spotless, a reproducible pipeline for benchmarking cell type deconvolution in spatial transcriptomics. eLife 2024; 12:RP88431. [PMID: 38787371 PMCID: PMC11126312 DOI: 10.7554/elife.88431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024] Open
Abstract
Spatial transcriptomics (ST) technologies allow the profiling of the transcriptome of cells while keeping their spatial context. Since most commercial untargeted ST technologies do not yet operate at single-cell resolution, computational methods such as deconvolution are often used to infer the cell type composition of each sequenced spot. We benchmarked 11 deconvolution methods using 63 silver standards, 3 gold standards, and 2 case studies on liver and melanoma tissues. We developed a simulation engine called synthspot to generate silver standards from single-cell RNA-sequencing data, while gold standards are generated by pooling single cells from targeted ST data. We evaluated methods based on their performance, stability across different reference datasets, and scalability. We found that cell2location and RCTD are the top-performing methods, but surprisingly, a simple regression model outperforms almost half of the dedicated spatial deconvolution methods. Furthermore, we observe that the performance of all methods significantly decreased in datasets with highly abundant or rare cell types. Our results are reproducible in a Nextflow pipeline, which also allows users to generate synthetic data, run deconvolution methods and optionally benchmark them on their dataset (https://github.com/saeyslab/spotless-benchmark).
Affiliation(s)
- Chananchida Sang-aram
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Robin Browaeys
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Ruth Seurinck
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Yvan Saeys
- Data Mining and Modelling for Biomedicine, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
|
12
|
Yu S, Li WV. spVC for the detection and interpretation of spatial gene expression variation. Genome Biol 2024; 25:103. [PMID: 38641849 PMCID: PMC11027374 DOI: 10.1186/s13059-024-03245-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Accepted: 04/10/2024] [Indexed: 04/21/2024] Open
Abstract
Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC's accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.
Affiliation(s)
- Shan Yu
- Department of Statistics, University of Virginia, Charlottesville, 22903, VA, USA.
| | - Wei Vivian Li
- Department of Statistics, University of California, Riverside, 92521, CA, USA.
|
13
|
Li R, Chen X, Yang X. Navigating the landscapes of spatial transcriptomics: How computational methods guide the way. Wiley Interdiscip Rev RNA 2024; 15:e1839. [PMID: 38527900 DOI: 10.1002/wrna.1839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 02/24/2024] [Accepted: 03/04/2024] [Indexed: 03/27/2024]
Abstract
Spatially resolved transcriptomics has been dramatically transforming biological and medical research in various fields. It enables transcriptome profiling at single-cell, multi-cellular, or sub-cellular resolution, while retaining the information of geometric localizations of cells in complex tissues. The coupling of cell spatial information and its molecular characteristics generates a novel multi-modal high-throughput data source, which poses new challenges for the development of analytical methods for data-mining. Spatial transcriptomic data are often highly complex, noisy, and biased, presenting a series of difficulties, many unresolved, for data analysis and generation of biological insights. In addition, to keep pace with the ever-evolving spatial transcriptomic experimental technologies, the existing analytical theories and tools need to be updated and reformed accordingly. In this review, we provide an overview and discussion of the current computational approaches for mining of spatial transcriptomics data. Future directions and perspectives of methodology design are proposed to stimulate further discussions and advances in new analytical models and algorithms. This article is categorized under: RNA Methods > RNA Analyses in Cells; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Export and Localization > RNA Localization.
Affiliation(s)
- Runze Li
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Xu Chen
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Xuerui Yang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
|
14
|
Sun ED, Ma R, Navarro Negredo P, Brunet A, Zou J. TISSUE: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses. Nat Methods 2024; 21:444-454. [PMID: 38347138 DOI: 10.1038/s41592-024-02184-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 01/12/2024] [Indexed: 02/27/2024]
Abstract
Whole-transcriptome spatial profiling of genes at single-cell resolution remains a challenge. To address this limitation, spatial gene expression prediction methods have been developed to infer the spatial expression of unmeasured transcripts, but the quality of these predictions can vary greatly. Here we present Transcript Imputation with Spatial Single-cell Uncertainty Estimation (TISSUE) as a general framework for estimating uncertainty for spatial gene expression predictions and providing uncertainty-aware methods for downstream inference. Leveraging conformal inference, TISSUE provides well-calibrated prediction intervals for predicted expression values across 11 benchmark datasets. Moreover, it consistently reduces the false discovery rate for differential gene expression analysis, improves clustering and visualization of predicted spatial transcriptomics and improves the performance of supervised learning models trained on predicted gene expression profiles. Applying TISSUE to a MERFISH spatial transcriptomics dataset of the adult mouse subventricular zone, we identified subtypes within the neural stem cell lineage and developed subtype-specific regional classifiers.
Affiliation(s)
- Eric D Sun
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Rong Ma
- Department of Statistics, Stanford University, Stanford, CA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | | | - Anne Brunet
- Department of Genetics, Stanford University, Stanford, CA, USA
- Wu Tsai Neurosciences Institute, Stanford University, Stanford, CA, USA
- Glenn Center for the Biology of Aging, Stanford University, Stanford, CA, USA
| | - James Zou
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
|
15
|
Yuan M, Wan H, Wang Z, Guo Q, Deng M. SPANN: annotating single-cell resolution spatial transcriptome data with scRNA-seq data. Brief Bioinform 2024; 25:bbad533. [PMID: 38279647 PMCID: PMC10818138 DOI: 10.1093/bib/bbad533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 11/13/2023] [Accepted: 12/19/2023] [Indexed: 01/28/2024] Open
Abstract
MOTIVATION The rapid development of spatial transcriptome technologies has enabled researchers to acquire single-cell-level spatial data at an affordable price. However, computational analysis tools, such as annotation tools, tailored for these data are still lacking. Recently, many computational frameworks have emerged to integrate single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics datasets. While some frameworks can utilize well-annotated scRNA-seq data to annotate spatial expression patterns, they overlook critical aspects. First, existing tools do not explicitly consider cell type mapping when aligning the two modalities. Second, current frameworks lack the capability to detect novel cells, which remains a key interest for biologists. RESULTS To address these problems, we propose an annotation method for spatial transcriptome data called SPANN. The main tasks of SPANN are to transfer cell-type labels from well-annotated scRNA-seq data to newly generated single-cell resolution spatial transcriptome data and discover novel cells from spatial data. The major innovations of SPANN come from two aspects: SPANN automatically detects novel cells from unseen cell types while maintaining high annotation accuracy over known cell types. SPANN finds a mapping between spatial transcriptome samples and RNA data prototypes and thus conducts cell-type-level alignment. Comprehensive experiments using datasets from various spatial platforms demonstrate SPANN's capabilities in annotating known cell types and discovering novel cell states within complex tissue contexts. AVAILABILITY The source code of SPANN can be accessed at https://github.com/ddb-qiwang/SPANN-torch. CONTACT dengmh@math.pku.edu.cn.
Affiliation(s)
- Musu Yuan
- Center for Quantitative Biology, Peking University, Yiheyuan Road, 100871, Beijing, China
| | - Hui Wan
- School of Mathematical Sciences, Peking University, Yiheyuan Road, 100871, Beijing, China
| | - Zihao Wang
- Biomedical Interdisciplinary Research Center, Peking University, Yiheyuan Road, 100871, Beijing, China
| | - Qirui Guo
- Center for Quantitative Biology, Peking University, Yiheyuan Road, 100871, Beijing, China
| | - Minghua Deng
- Center for Quantitative Biology, Peking University, Yiheyuan Road, 100871, Beijing, China
- School of Mathematical Sciences, Peking University, Yiheyuan Road, 100871, Beijing, China
- Center for Statistical Science, Peking University, Yiheyuan Road, 100871, Beijing, China
- Biomedical Interdisciplinary Research Center, Peking University, Yiheyuan Road, 100871, Beijing, China
|
16
|
Kiessling P, Kuppe C. Spatial multi-omics: novel tools to study the complexity of cardiovascular diseases. Genome Med 2024; 16:14. [PMID: 38238823 PMCID: PMC10795303 DOI: 10.1186/s13073-024-01282-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 01/02/2024] [Indexed: 01/22/2024] Open
Abstract
Spatial multi-omic studies have emerged as a promising approach to comprehensively analyze cells in tissues, enabling the joint analysis of multiple data modalities like transcriptome, epigenome, proteome, and metabolome in parallel or even the same tissue section. This review focuses on the recent advancements in spatial multi-omics technologies, including novel data modalities and computational approaches. We discuss the advancements in low-resolution and high-resolution spatial multi-omics methods which can resolve up to 10,000 individual molecules at subcellular level. By applying and integrating these techniques, researchers have recently gained valuable insights into the molecular circuits and mechanisms which govern cell biology along the cardiovascular disease spectrum. We provide an overview of current data analysis approaches, with a focus on data integration of multi-omic datasets, highlighting strengths and weaknesses of various computational pipelines. These tools play a crucial role in analyzing and interpreting spatial multi-omics datasets, facilitating the discovery of new findings, and enhancing translational cardiovascular research. Despite nontrivial challenges, such as the need for standardization of experimental setups, data analysis, and improved computational tools, the application of spatial multi-omics holds tremendous potential in revolutionizing our understanding of human disease processes and the identification of novel biomarkers and therapeutic targets. Exciting opportunities lie ahead for the spatial multi-omics field and will likely contribute to the advancement of personalized medicine for cardiovascular diseases.
Affiliation(s)
- Paul Kiessling
- Department of Nephrology, Rheumatology, and Clinical Immunology, University Hospital RWTH Aachen, Aachen, Germany
| | - Christoph Kuppe
- Department of Nephrology, Rheumatology, and Clinical Immunology, University Hospital RWTH Aachen, Aachen, Germany.
|
17
|
Liang Q, Huang Y, He S, Chen K. Pathway centric analysis for single-cell RNA-seq and spatial transcriptomics data with GSDensity. Nat Commun 2023; 14:8416. [PMID: 38110427 PMCID: PMC10728201 DOI: 10.1038/s41467-023-44206-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 12/04/2023] [Indexed: 12/20/2023] Open
Abstract
Advances in single-cell technology have enabled molecular dissection of heterogeneous biospecimens at unprecedented scales and resolutions. Cluster-centric approaches are widely applied in analyzing single-cell data; however, they have limited power in dissecting and interpreting highly heterogeneous, dynamically evolving data. Here, we present GSDensity, a graph-modeling approach that allows users to obtain pathway-centric interpretation and dissection of single-cell and spatial transcriptomics (ST) data without performing clustering. Using pathway gene sets, we show that GSDensity can accurately detect biologically distinct cells and reveal novel cell-pathway associations ignored by existing methods. Moreover, GSDensity, combined with trajectory analysis, can identify curated pathways that are active at various stages of mouse brain development. Finally, GSDensity can identify spatially relevant pathways in mouse brains and human tumors including those following high-order organizational patterns in the ST data. Particularly, we create a pan-cancer ST map revealing spatially relevant and recurrently active pathways across six different tumor types.
Affiliation(s)
- Qingnan Liang
- Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, TX, USA
| | - Yuefan Huang
- Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, TX, USA
| | - Shan He
- Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, TX, USA
| | - Ken Chen
- Department of Bioinformatics and Computational Biology, UT MD Anderson Cancer Center, Houston, TX, USA.
|
18
|
Li Z, Patel ZM, Song D, Yan G, Li JJ, Pinello L. Benchmarking computational methods to identify spatially variable genes and peaks. bioRxiv 2023:2023.12.02.569717. [PMID: 38076922 PMCID: PMC10705556 DOI: 10.1101/2023.12.02.569717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2023]
Abstract
Spatially resolved transcriptomics offers unprecedented insight by enabling the profiling of gene expression within the intact spatial context of cells, effectively adding a new and essential dimension to data interpretation. To efficiently detect spatial structure of interest, an essential step in analyzing such data involves identifying spatially variable genes. Despite researchers having developed several computational methods to accomplish this task, the lack of a comprehensive benchmark evaluating their performance remains a considerable gap in the field. Here, we present a systematic evaluation of 14 methods using 60 simulated datasets generated by four different simulation strategies, 12 real-world transcriptomics, and three spatial ATAC-seq datasets. We find that spatialDE2 consistently outperforms the other benchmarked methods, and Moran's I achieves competitive performance in different experimental settings. Moreover, our results reveal that more specialized algorithms are needed to identify spatially variable peaks.
Affiliation(s)
- Zhijian Li
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Zain M. Patel
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
| | - Dongyuan Song
- Interdepartmental Program of Bioinformatics, University of California, Los Angeles, CA, USA
| | - Guanao Yan
- Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
| | - Jingyi Jessica Li
- Department of Statistics and Data Science, University of California, Los Angeles, CA, USA
| | - Luca Pinello
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
|
19
|
Adhikari SD, Yang J, Wang J, Cui Y. A selective review of recent developments in spatially variable gene detection for spatial transcriptomics. arXiv 2023:arXiv:2311.13801v1. [PMID: 38045476 PMCID: PMC10690303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2023]
Abstract
With the emergence of advanced spatial transcriptomic technologies, there has been a surge in research papers dedicated to analyzing spatial transcriptomics data, resulting in significant contributions to our understanding of biology. The initial stage of downstream analysis of spatial transcriptomic data has centered on identifying spatially variable genes (SVGs) or genes expressed with specific spatial patterns across the tissue. SVG detection is an important task since many downstream analyses depend on these selected SVGs. Over the past few years, a plethora of new methods have been proposed for the detection of SVGs, accompanied by numerous innovative concepts and discussions. This article provides a selective review of methods and their practical implementations, offering valuable insights into the current literature in this field.
Affiliation(s)
- Sikta Das Adhikari
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jiaxin Yang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
| | - Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
|
20
|
Luo X, Liu Z, Xu R. Adult tissue-specific stem cell interaction: novel technologies and research advances. Front Cell Dev Biol 2023; 11:1220694. [PMID: 37808078 PMCID: PMC10551553 DOI: 10.3389/fcell.2023.1220694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Accepted: 09/11/2023] [Indexed: 10/10/2023] Open
Abstract
Adult tissue-specific stem cells play a dominant role in tissue homeostasis and regeneration. Various in vivo markers of adult tissue-specific stem cells have been increasingly reported by lineage tracing in genetic mouse models, indicating that differentiation of the marked cells is crucial during homeostasis and regeneration. How adult tissue-specific stem cells with indicated markers contact the adjacent lineage with indicated markers remains an important question to study. Novel methods bring future findings. Recent advances in lineage tracing, synthetic receptor systems, proximity labeling, and transcriptomics have enabled easier and more accurate cell behavior visualization and qualitative and quantitative analysis of cell-cell interactions than ever before. These technological innovations have prompted researchers to re-evaluate previous experimental results, providing increasingly compelling experimental results for understanding the mechanisms of cell-cell interactions. This review aimed to describe the recent methodological advances in the dual enzyme lineage tracing system, the synthetic receptor system, proximity labeling, single-cell RNA sequencing and spatial transcriptomics in the study of adult tissue-specific stem cell interactions. An enhanced understanding of the mechanisms of adult tissue-specific stem cell interaction is important for tissue regeneration and maintenance of homeostasis in organisms.
Affiliation(s)
| | | | - Ruoshi Xu
- State Key Laboratory of Oral Diseases, National Center for Stomatology, National Clinical Research Center for Oral Diseases, Department of Cariology and Endodontics, West China Hospital of Stomatology, Sichuan University, Chengdu, China
|
21
|
Shi X, Zhu J, Long Y, Liang C. Identifying spatial domains of spatially resolved transcriptomics via multi-view graph convolutional networks. Brief Bioinform 2023; 24:bbad278. [PMID: 37544658 DOI: 10.1093/bib/bbad278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 06/27/2023] [Accepted: 07/14/2023] [Indexed: 08/08/2023] Open
Abstract
MOTIVATION Recent advances in spatially resolved transcriptomics (ST) technologies enable the measurement of gene expression profiles while preserving cellular spatial context. Linking gene expression of cells with their spatial distribution is essential for better understanding of tissue microenvironment and biological progress. However, effectively combining gene expression data with spatial information to identify spatial domains remains challenging. RESULTS To deal with the above issue, in this paper, we propose a novel unsupervised learning framework named STMGCN for identifying spatial domains using multi-view graph convolution networks (MGCNs). Specifically, to fully exploit spatial information, we first construct multiple neighbor graphs (views) with different similarity measures based on the spatial coordinates. Then, STMGCN learns multiple view-specific embeddings by combining gene expressions with each neighbor graph through graph convolution networks. Finally, to capture the importance of different graphs, we further introduce an attention mechanism to adaptively fuse view-specific embeddings and thus derive the final spot embedding. STMGCN allows for the effective utilization of spatial context to enhance the expressive power of the latent embeddings with multiple graph convolutions. We apply STMGCN on two simulation datasets and five real spatial transcriptomics datasets with different resolutions across distinct platforms. The experimental results demonstrate that STMGCN obtains competitive results in spatial domain identification compared with five state-of-the-art methods, including spatial and non-spatial alternatives. Besides, STMGCN can detect spatially variable genes with enriched expression patterns in the identified domains. Overall, STMGCN is a powerful and efficient computational framework for identifying spatial domains in spatial transcriptomics data.
Affiliation(s)
- Xuejing Shi
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Juntong Zhu
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
| | - Yahui Long
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), 8A Biomedical Grove, 138648, Singapore
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China
|
22
|
Li H, Zhang Z, Squires M, Chen X, Zhang X. scMultiSim: simulation of single cell multi-omics and spatial data guided by gene regulatory networks and cell-cell interactions. Research Square 2023:rs.3.rs-3301625. [PMID: 37790516 PMCID: PMC10543280 DOI: 10.21203/rs.3.rs-3301625/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Simulated single-cell data is essential for designing and evaluating computational methods in the absence of experimental ground truth. Existing simulators typically focus on modeling one or two specific biological factors or mechanisms that affect the output data, which limits their capacity to simulate the complexity and multi-modality in real data. Here, we present scMultiSim, an in silico simulator that generates multi-modal single-cell data, including gene expression, chromatin accessibility, RNA velocity, and spatial cell locations while accounting for the relationships between modalities. scMultiSim jointly models various biological factors that affect the output data, including cell identity, within-cell gene regulatory networks (GRNs), cell-cell interactions (CCIs), and chromatin accessibility, while also incorporating technical noise. Moreover, it allows users to adjust each factor's effect easily. We validated scMultiSim's simulated biological effects and demonstrated its applications by benchmarking a wide range of computational tasks, including multi-modal and multi-batch data integration, RNA velocity estimation, GRN inference and CCI inference using spatially resolved gene expression data, many of which were not benchmarked before due to the lack of proper tools. Compared to existing simulators, scMultiSim can benchmark a much broader range of existing computational problems and even new potential tasks.
Affiliation(s)
- Hechen Li
- Georgia Institute of Technology, Atlanta, USA
| | - Ziqi Zhang
- Georgia Institute of Technology, Atlanta, USA
| | | | - Xi Chen
- Southern University of Science and Technology, Shenzhen, China
|
23
|
Charitakis N, Salim A, Piers AT, Watt KI, Porrello ER, Elliott DA, Ramialison M. Disparities in spatially variable gene calling highlight the need for benchmarking spatial transcriptomics methods. Genome Biol 2023; 24:209. [PMID: 37723583 PMCID: PMC10506280 DOI: 10.1186/s13059-023-03045-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/21/2022] [Accepted: 08/21/2023] [Indexed: 09/20/2023] Open
Abstract
Identifying spatially variable genes (SVGs) is a key step in the analysis of spatially resolved transcriptomics data. SVGs provide biological insights by defining transcriptomic differences within tissues, which was previously unachievable using RNA-sequencing technologies. However, the growing set of published tools designed to define SVG sets currently lacks benchmarking methods to accurately assess performance. This study compares results of 6 purpose-built packages for SVG identification across 9 public and 5 simulated datasets and highlights discrepancies between results. Additional tools for generation of simulated data and development of benchmarking methods are required to improve methods for identifying SVGs.
Affiliation(s)
- Natalie Charitakis
- Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Department of Paediatrics, University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia
- Novo Nordisk Foundation Center for Stem Cell Medicine, Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
| | - Agus Salim
- Melbourne School of Population and Global Health, University of Melbourne, Bouverie St, Carlton, VIC, 3053, Australia
- School of Mathematics and Statistics, University of Melbourne, Swanston Street, Parkville, VIC, 3010, Australia
| | - Adam T Piers
- Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Novo Nordisk Foundation Center for Stem Cell Medicine, Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Melbourne Centre for Cardiovascular Genomics and Regenerative Medicine, The Royal Children's Hospital, Melbourne, VIC, 3052, Australia
| | - Kevin I Watt
- Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Novo Nordisk Foundation Center for Stem Cell Medicine, Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Department of Anatomy and Physiology, University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia
- Department of Diabetes, Monash University, Alfred Centre, Commercial Road, Melbourne, VIC, 3004, Australia
| | - Enzo R Porrello
- Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Department of Paediatrics, University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia
- Novo Nordisk Foundation Center for Stem Cell Medicine, Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Department of Anatomy and Physiology, University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia
| | - David A Elliott
- Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Department of Paediatrics, University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia
- Novo Nordisk Foundation Center for Stem Cell Medicine, Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
| | - Mirana Ramialison
- Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
- Department of Paediatrics, University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia
- Novo Nordisk Foundation Center for Stem Cell Medicine, Murdoch Children's Research Institute, Royal Children's Hospital, Flemington Road, Parkville, VIC, 3052, Australia
| |
|