1
Li HS, Tan YT, Zhang XF. Enhancing spatial domain detection in spatial transcriptomics with EnSDD. Commun Biol 2024; 7:1358. [PMID: 39433947 PMCID: PMC11494180 DOI: 10.1038/s42003-024-07001-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 10/01/2024] [Indexed: 10/23/2024] Open
Abstract
Advancements in spatial transcriptomics have transformed our understanding of organ function and tissue microenvironment. However, accurately identifying spatial domains to depict genome heterogeneity and cellular interactions remains a challenge. In this study, we propose EnSDD (Ensemble-learning for Spatial Domain Detection), a method that ingeniously integrates eight state-of-the-art spatial domain detection methods to automatically identify spatial domains. A key innovation of EnSDD is its dynamic weighting mechanism within the ensemble learning process, which optimizes the contribution of each base model and provides a performance evaluation metric without the need for ground truth data. By leveraging the spatial domains identified through EnSDD, we incorporate the detection of domain-specific spatially variable genes and the spatial distribution of cell types, thereby providing deeper insights into tissue heterogeneity. We validate EnSDD across diverse spatial transcriptomics datasets from various tissue organizational structures. Our results demonstrate that EnSDD significantly enhances spatial domain identification accuracy, identifies genes with spatial expression patterns, and reveals domain-specific cell type enrichment patterns, offering invaluable insights into tissue spatial heterogeneity and regionalization.
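The abstract does not spell out how the base methods are weighted, so the following is only a minimal sketch of the general idea of a weighted ensemble of clusterings. It assumes each base method returns one label per spot, builds a co-association matrix per method, and uses a hypothetical weighting rule (weights inversely proportional to each method's squared distance from the current consensus). The function names `coassociation` and `weighted_consensus` are illustrative and are not taken from EnSDD.

```python
import numpy as np

def coassociation(labels):
    """Return the n x n matrix whose (i, j) entry is 1.0 when spots i and j share a label."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def weighted_consensus(label_sets, n_iter=20):
    """Combine several base clusterings into one consensus similarity matrix.

    Assumption (not from the paper): a method's weight is proportional to the
    inverse of its squared Frobenius distance from the current consensus.
    """
    mats = np.stack([coassociation(l) for l in label_sets])  # shape (m, n, n)
    m = mats.shape[0]
    weights = np.full(m, 1.0 / m)  # start from equal weights
    for _ in range(n_iter):
        consensus = np.tensordot(weights, mats, axes=1)  # weighted average, shape (n, n)
        dist = ((mats - consensus) ** 2).sum(axis=(1, 2))
        inv = 1.0 / (dist + 1e-8)  # small constant avoids division by zero
        weights = inv / inv.sum()
    consensus = np.tensordot(weights, mats, axes=1)
    return consensus, weights
```

Because co-association matrices compare pairs of spots rather than label values, the sketch is unaffected by base methods numbering their clusters differently. The consensus matrix could then be passed to any similarity-based clustering routine to obtain final domains.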
Affiliation(s)
- Hui-Sheng Li
  - School of Mathematical Sciences, Zhejiang University of Technology, Hangzhou, 310023, China
- Yu-Ting Tan
  - School of Mathematics and Statistics, and Hubei Key Lab-Math. Sci., Central China Normal University, Wuhan, 430079, China
- Xiao-Fei Zhang
  - School of Mathematics and Statistics, and Hubei Key Lab-Math. Sci., Central China Normal University, Wuhan, 430079, China
  - Key Laboratory of Nonlinear Analysis & Applications (Ministry of Education), Central China Normal University, Wuhan, 430079, China
2
Nie W, Yu Y, Wang X, Wang R, Li SC. Spatially Informed Graph Structure Learning Extracts Insights from Spatial Transcriptomics. Adv Sci (Weinh) 2024:e2403572. [PMID: 39382177 DOI: 10.1002/advs.202403572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 08/04/2024] [Indexed: 10/10/2024]
Abstract
Embeddings derived from cell graphs hold significant potential for exploring spatial transcriptomics (ST) datasets. Nevertheless, existing methodologies rely on a graph structure defined by spatial proximity, which inadequately represents the diversity inherent in cell-cell interactions (CCIs). This study introduces STAGUE, an innovative framework that concurrently learns a cell graph structure and a low-dimensional embedding from ST data. STAGUE employs graph structure learning to parameterize and refine a cell graph adjacency matrix, enabling the generation of learnable graph views for effective contrastive learning. The derived embeddings and cell graph improve spatial clustering accuracy and facilitate the discovery of novel CCIs. Experimental benchmarks across 86 real and simulated ST datasets show that STAGUE outperforms 15 comparison methods in clustering performance. Additionally, STAGUE delineates the heterogeneity in human breast cancer tissues, revealing the activation of epithelial-to-mesenchymal transition and PI3K/AKT signaling in specific sub-regions. Furthermore, STAGUE identifies CCIs with greater alignment to established biological knowledge than those ascertained by existing graph autoencoder-based methods. STAGUE also reveals the regulatory genes that participate in these CCIs, including those enriched in neuropeptide signaling and receptor tyrosine kinase signaling pathways, thereby providing insights into the underlying biological processes.
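For context, the spatial-proximity graph that the abstract describes as the usual starting point can be sketched as a k-nearest-neighbour adjacency matrix over spot coordinates. This illustrates only that baseline construction, not STAGUE's learned graph. The function name `knn_adjacency` and the choice to symmetrize the result are assumptions made for the example.

```python
import numpy as np

def knn_adjacency(coords, k):
    """Build a symmetric 0/1 adjacency matrix linking each spot to its k nearest spatial neighbours."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise squared Euclidean distances via broadcasting: (n, 1, d) - (1, n, d)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a spot is never its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]  # indices of the k closest spots per row
    adj = np.zeros(d2.shape, dtype=int)
    rows = np.repeat(np.arange(len(coords)), k)
    adj[rows, nbrs.ravel()] = 1
    return np.maximum(adj, adj.T)  # keep an edge if either endpoint selected it
```

A graph built this way encodes only physical distance, which is the limitation the abstract raises: two adjacent spots are linked whether or not their cells interact.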
Affiliation(s)
- Wan Nie
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Yingying Yu
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Xueying Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
  - City University of Hong Kong (Dongguan), Dongguan, 523000, China
- Ruohan Wang
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
- Shuai Cheng Li
  - Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
3
Cui X, Chen X, Li Z, Gao Z, Chen S, Jiang R. Discrete latent embedding of single-cell chromatin accessibility sequencing data for uncovering cell heterogeneity. Nat Comput Sci 2024; 4:346-359. [PMID: 38730185 DOI: 10.1038/s43588-024-00625-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Accepted: 04/05/2024] [Indexed: 05/12/2024]
Abstract
Single-cell epigenomic data have been growing continuously at an unprecedented pace, but their characteristics such as high dimensionality and sparsity pose substantial challenges to downstream analysis. Although deep learning models, especially variational autoencoders, have been widely used to capture low-dimensional feature embeddings, the prevalent Gaussian assumption somewhat disagrees with real data, and these models tend to struggle to incorporate reference information from abundant cell atlases. Here we propose CASTLE, a deep generative model based on the vector-quantized variational autoencoder framework to extract discrete latent embeddings that interpretably characterize single-cell chromatin accessibility sequencing data. We validate the performance and robustness of CASTLE for accurate cell-type identification and reasonable visualization compared with state-of-the-art methods. We demonstrate the advantages of CASTLE for effective incorporation of existing massive reference datasets in a weakly supervised or supervised manner. We further demonstrate CASTLE's capacity for intuitively distilling cell-type-specific feature spectra that unveil cell heterogeneity and biological implications quantitatively.
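The discrete latent embedding in a vector-quantized variational autoencoder rests on a nearest-codebook lookup, which can be sketched as below. This is a generic illustration of that one step, not CASTLE's implementation: the function name `quantize` and the array shapes are assumptions, and the training-side pieces (encoder, decoder, codebook and commitment losses) are omitted.

```python
import numpy as np

def quantize(z, codebook):
    """Map each row of z (n, d) to its nearest codebook vector (k, d) under Euclidean distance.

    Returns the chosen codebook indices (the discrete codes) and the
    corresponding quantized vectors.
    """
    z = np.asarray(z, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Squared distances via broadcasting: (n, 1, d) - (1, k, d) -> (n, k)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]
```

Each cell's continuous encoder output is thereby replaced by one of a finite set of learned vectors, so no Gaussian prior on the latent space is needed.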
Affiliation(s)
- Xuejian Cui
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Xiaoyang Chen
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Zhen Li
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Zijing Gao
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Shengquan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, China
- Rui Jiang
  - Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
4
Lei L, Han K, Wang Z, Shi C, Wang Z, Dai R, Zhang Z, Wang M, Guo Q. Attention-guided variational graph autoencoders reveal heterogeneity in spatial transcriptomics. Brief Bioinform 2024; 25:bbae173. [PMID: 38627939 PMCID: PMC11021349 DOI: 10.1093/bib/bbae173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 03/03/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024] Open
Abstract
The latest breakthroughs in spatially resolved transcriptomics technology offer comprehensive opportunities to delve into gene expression patterns within the tissue microenvironment. However, the precise identification of spatial domains within tissues remains challenging. In this study, we introduce AttentionVGAE (AVGN), which integrates slice images, spatial information and raw gene expression while calibrating low-quality gene expression. By combining the variational graph autoencoder with multi-head attention blocks (MHA blocks), AVGN captures spatial relationships in tissue gene expression, adaptively focusing on key features and alleviating the need for prior knowledge of cluster numbers, thereby achieving superior clustering performance. Particularly, AVGN attempts to balance the model's attention focus on local and global structures by utilizing MHA blocks, an aspect that current graph neural networks have not extensively addressed. Benchmark testing demonstrates its significant efficacy in elucidating tissue anatomy and interpreting tumor heterogeneity, indicating its potential in advancing spatial transcriptomics research and understanding complex biological phenomena.
Affiliation(s)
- Lixin Lei
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Kaitai Han
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Zijun Wang
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Chaojing Shi
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Zhenghui Wang
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Ruoyan Dai
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Zhiwei Zhang
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Mengqiu Wang
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Qianjin Guo
  - Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
5
Tang S, Cui X, Wang R, Li S, Li S, Huang X, Chen S. scCASE: accurate and interpretable enhancement for single-cell chromatin accessibility sequencing data. Nat Commun 2024; 15:1629. [PMID: 38388573 PMCID: PMC10884038 DOI: 10.1038/s41467-024-46045-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 02/12/2024] [Indexed: 02/24/2024] Open
Abstract
Single-cell chromatin accessibility sequencing (scCAS) has emerged as a valuable tool for interrogating and elucidating epigenomic heterogeneity and gene regulation. However, scCAS data inherently suffers from limitations such as high sparsity and dimensionality, which pose significant challenges for downstream analyses. Although several methods are proposed to enhance scCAS data, there are still challenges and limitations that hinder the effectiveness of these methods. Here, we propose scCASE, a scCAS data enhancement method based on non-negative matrix factorization which incorporates an iteratively updating cell-to-cell similarity matrix. Through comprehensive experiments on multiple datasets, we demonstrate the advantages of scCASE over existing methods for scCAS data enhancement. The interpretable cell type-specific peaks identified by scCASE can provide valuable biological insights into cell subpopulations. Moreover, to leverage the large compendia of available omics data as a reference, we further expand scCASE to scCASER, which enables the incorporation of external reference data to improve enhancement performance.
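As background for the factorization the abstract mentions, plain non-negative matrix factorization with multiplicative updates can be sketched as follows. This is the standard Frobenius-norm variant only: it deliberately omits the iteratively updated cell-to-cell similarity matrix that distinguishes scCASE, and the function name `nmf`, its parameters, and the cells-by-peaks orientation of `X` are assumptions made for the example.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative matrix X (cells x peaks) as W @ H with W, H >= 0.

    Uses the classic multiplicative update rules for the squared
    Frobenius reconstruction error.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.random((n, rank)) + eps  # cell loadings
    H = rng.random((rank, p)) + eps  # peak loadings
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)  # update H with W fixed
        W *= (X @ H.T) / (W @ H @ H.T + eps)  # update W with H fixed
    return W, H
```

In an enhancement setting, the low-rank product `W @ H` would serve as a denoised version of the sparse input, and the rows of `H` give non-negative peak loadings that can be inspected per component.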
Affiliation(s)
- Songming Tang
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
- Xuejian Cui
  - MOE Key Laboratory of Bioinformatics and Bioinformatics Division of BNRIST, Department of Automation, Tsinghua University, 100084, Beijing, China
- Rongxiang Wang
  - Department of Computer Science, University of Virginia, Charlottesville, VA, 22903, USA
- Sijie Li
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
- Siyu Li
  - School of Statistics and Data Science, Nankai University, Tianjin, 300071, China
- Xin Huang
  - Beijing Key Laboratory for Radiobiology, Department of Radiation Biology, Beijing Institute of Radiation Medicine, 100850, Beijing, China
- Shengquan Chen
  - School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
6
Liang Y, Shi G, Cai R, Yuan Y, Xie Z, Yu L, Huang Y, Shi Q, Wang L, Li J, Tang Z. PROST: quantitative identification of spatially variable genes and domain detection in spatial transcriptomics. Nat Commun 2024; 15:600. [PMID: 38238417 PMCID: PMC10796707 DOI: 10.1038/s41467-024-44835-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Accepted: 12/19/2023] [Indexed: 01/22/2024] Open
Abstract
Computational methods have been proposed to leverage spatially resolved transcriptomic data, pinpointing genes with spatial expression patterns and delineating tissue domains. However, existing approaches fall short in uniformly quantifying spatially variable genes (SVGs). Moreover, from a methodological viewpoint, while SVGs are naturally associated with depicting spatial domains, they are technically dissociated in most methods. Here, we present a framework (PROST) for the quantitative recognition of spatial transcriptomic patterns, consisting of (i) quantitatively characterizing spatial variations in gene expression patterns through the PROST Index; and (ii) unsupervised clustering of spatial domains via a self-attention mechanism. We demonstrate that PROST performs superior SVG identification and domain segmentation with various spatial resolutions, from multicellular to cellular levels. Importantly, PROST Index can be applied to prioritize spatial expression variations, facilitating the exploration of biological insights. Together, our study provides a flexible and robust framework for analyzing diverse spatial transcriptomic data.
Affiliation(s)
- Yuchen Liang
  - School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Guowei Shi
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Runlin Cai
  - School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Yuchen Yuan
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Ziying Xie
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Long Yu
  - School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Yingjian Huang
  - School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Qian Shi
  - School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Lizhe Wang
  - School of Computer Science, China University of Geosciences, Wuhan, 430078, China
- Jun Li
  - School of Computer Science, China University of Geosciences, Wuhan, 430078, China
- Zhonghui Tang
  - Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China