201
Wang Y, Liu Z, Ma X. MNMST: topology of cell networks leverages identification of spatial domains from spatial transcriptomics data. Genome Biol 2024; 25:133. [PMID: 38783355 PMCID: PMC11112797 DOI: 10.1186/s13059-024-03272-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/02/2023] [Accepted: 05/09/2024] [Indexed: 05/25/2024] Open
Abstract
Advances in spatial transcriptomics provide an unprecedented opportunity to reveal the structure and function of biology systems. However, current algorithms fail to address the heterogeneity and interpretability of spatial transcriptomics data. Here, we present a multi-layer network model for identifying spatial domains in spatial transcriptomics data with joint learning. We demonstrate that spatial domains can be precisely characterized and discriminated by the topological structure of cell networks, facilitating identification and interpretability of spatial domains, which outperforms state-of-the-art baselines. Furthermore, we prove that network model offers an effective and efficient strategy for integrative analysis of spatial transcriptomics data from various platforms.
Affiliation(s)
- Yu Wang
  - School of Computer Science and Technology, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China
  - Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China
- Zaiyi Liu
  - Department of Radiology, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, 106 Zhongshan Er Road, Guangzhou, 510080, Guangdong, China
  - Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, 106 Zhongshan Er Road, Guangzhou, 510080, Guangdong, China
- Xiaoke Ma
  - School of Computer Science and Technology, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China
  - Key Laboratory of Smart Human-Computer Interaction and Wearable Technology of Shaanxi Province, Xidian University, No.2 South Taibai Road, Xi'an, 710071, Shaanxi, China
202
Zhang L, Liang S, Wan L. A multi-view graph contrastive learning framework for deciphering spatially resolved transcriptomics data. Brief Bioinform 2024; 25:bbae255. [PMID: 38801701 PMCID: PMC11129769 DOI: 10.1093/bib/bbae255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 01/27/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024] Open
Abstract
Spatially resolved transcriptomics data are being used in a revolutionary way to decipher the spatial pattern of gene expression and the spatial architecture of cell types. Much work has been done to exploit the genomic spatial architectures of cells. Such work is based on the common assumption that gene expression profiles of spatially adjacent spots are more similar than those of more distant spots. However, related work might not consider the nonlocal spatial co-expression dependency, which can better characterize the tissue architectures. Therefore, we propose MuCoST, a Multi-view graph Contrastive learning framework for deciphering complex Spatially resolved Transcriptomic architectures with dual scale structural dependency. To achieve this, we employ spot dependency augmentation by fusing gene expression correlation and spatial location proximity, thereby enabling MuCoST to model both nonlocal spatial co-expression dependency and spatially adjacent dependency. We benchmark MuCoST on four datasets, and we compare it with other state-of-the-art spatial domain identification methods. We demonstrate that MuCoST achieves the highest accuracy on spatial domain identification from various datasets. In particular, MuCoST accurately deciphers subtle biological textures and elaborates the variation of spatially functional patterns.
Affiliation(s)
- Lei Zhang
  - Department of Control Science and Engineering, Tongji University, No. 4800 Cao’an Road, 201804, Shanghai, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Lane 55, Chuanhe Road, 201210, Shanghai, China
- Shu Liang
  - Department of Control Science and Engineering, Tongji University, No. 4800 Cao’an Road, 201804, Shanghai, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Lane 55, Chuanhe Road, 201210, Shanghai, China
- Lin Wan
  - Academy of Mathematics and Systems Science, Chinese Academy of Sciences, No. 55 Zhongguancun East Road, 100190, Beijing, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, 19A Yuquan Road, 100049, Beijing, China
203
Si Z, Li H, Shang W, Zhao Y, Kong L, Long C, Zuo Y, Feng Z. SpaNCMG: improving spatial domains identification of spatial transcriptomics using neighborhood-complementary mixed-view graph convolutional network. Brief Bioinform 2024; 25:bbae259. [PMID: 38811360 PMCID: PMC11136618 DOI: 10.1093/bib/bbae259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 05/10/2024] [Accepted: 05/16/2024] [Indexed: 05/31/2024] Open
Abstract
The advancement of spatial transcriptomics (ST) technology contributes to a more profound comprehension of the spatial properties of gene expression within tissues. However, due to challenges of high dimensionality, pronounced noise and dynamic limitations in ST data, the integration of gene expression and spatial information to accurately identify spatial domains remains challenging. This paper proposes a SpaNCMG algorithm for the purpose of achieving precise spatial domain description and localization based on a neighborhood-complementary mixed-view graph convolutional network. The algorithm enables better adaptation to ST data at different resolutions by integrating the local information from KNN and the global structure from r-radius into a complementary neighborhood graph. It also introduces an attention mechanism to achieve adaptive fusion of different reconstructed expressions, and utilizes KPCA method for dimensionality reduction. The application of SpaNCMG on five datasets from four sequencing platforms demonstrates superior performance to eight existing advanced methods. Specifically, the algorithm achieved highest ARI accuracies of 0.63 and 0.52 on the datasets of the human dorsolateral prefrontal cortex and mouse somatosensory cortex, respectively. It accurately identified the spatial locations of marker genes in the mouse olfactory bulb tissue and inferred the biological functions of different regions. When handling larger datasets such as mouse embryos, the SpaNCMG not only identified the main tissue structures but also explored unlabeled domains. Overall, the good generalization ability and scalability of SpaNCMG make it an outstanding tool for understanding tissue structure and disease mechanisms. Our codes are available at https://github.com/ZhihaoSi/SpaNCMG.
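The idea of combining a KNN graph with an r-radius graph can be illustrated with a minimal sketch. It assumes the "complementary neighborhood graph" is the plain union of the two edge sets, which the abstract does not confirm; the function names and the parameters `k` and `r` are hypothetical and are not taken from the SpaNCMG code.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance matrix between all spots."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def knn_adjacency(coords, k):
    """Symmetric adjacency linking each spot to its k nearest spots."""
    dist = pairwise_distances(coords)
    np.fill_diagonal(dist, np.inf)  # a spot is never its own neighbor
    n = dist.shape[0]
    nearest = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = True
    return adj | adj.T


def radius_adjacency(coords, r):
    """Symmetric adjacency linking every pair of spots within distance r."""
    adj = pairwise_distances(coords) <= r
    np.fill_diagonal(adj, False)
    return adj


def complementary_graph(coords, k, r):
    """Assumed combination rule: union of the KNN and radius edge sets."""
    return knn_adjacency(coords, k) | radius_adjacency(coords, r)


# Toy spot coordinates (hypothetical): three nearby spots and one isolated spot.
spots = [[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [10.0, 0.0]]
graph = complementary_graph(spots, k=1, r=2.6)
```

In this toy layout the isolated spot is connected only through the KNN graph, while the edge between the first and third spots comes only from the radius graph, which shows how the two constructions can complement each other.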
Affiliation(s)
- Zhihao Si
  - College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Hanshuang Li
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Wenjing Shang
  - College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Yanan Zhao
  - College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Lingjiao Kong
  - College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Chunshen Long
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Yongchun Zuo
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Zhenxing Feng
  - College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
204
Wang T, Shu H, Hu J, Wang Y, Chen J, Peng J, Shang X. Accurately deciphering spatial domains for spatially resolved transcriptomics with stCluster. Brief Bioinform 2024; 25:bbae329. [PMID: 38975895 PMCID: PMC11771244 DOI: 10.1093/bib/bbae329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Revised: 06/16/2024] [Accepted: 06/24/2024] [Indexed: 07/09/2024] Open
Abstract
Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organizations. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning technology to obtain discriminative representations capable of recognizing spatially coherent patterns. Through jointly optimizing multiple tasks, stCluster further fine-tunes the representations to be able to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise the spatial gene expression patterns and enhance the spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.
Affiliation(s)
- Tao Wang
  - School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
  - Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Han Shu
  - School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
  - Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Jialu Hu
  - School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
  - Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Yongtian Wang
  - School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
  - Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Jing Chen
  - School of Computer Science and Engineering, Xi'an University of Technology, No.5 South Jinhua Rd., Xi'an 710048, China
- Jiajie Peng
  - School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
  - Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Xuequn Shang
  - School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
  - Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
205
Wang L, Hu Y, Xiao K, Zhang C, Shi Q, Chen L. Multi-modal domain adaptation for revealing spatial functional landscape from spatially resolved transcriptomics. Brief Bioinform 2024; 25:bbae257. [PMID: 38819253 PMCID: PMC11141295 DOI: 10.1093/bib/bbae257] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/16/2024] [Revised: 04/13/2024] [Accepted: 05/15/2024] [Indexed: 06/01/2024] Open
Abstract
Spatially resolved transcriptomics (SRT) has emerged as a powerful tool for investigating gene expression in spatial contexts, providing insights into the molecular mechanisms underlying organ development and disease pathology. However, the expression sparsity poses a computational challenge to integrate other modalities (e.g. histological images and spatial locations) that are simultaneously captured in SRT datasets for spatial clustering and variation analyses. In this study, to meet such a challenge, we propose multi-modal domain adaption for spatial transcriptomics (stMDA), a novel multi-modal unsupervised domain adaptation method, which integrates gene expression and other modalities to reveal the spatial functional landscape. Specifically, stMDA first learns the modality-specific representations from spatial multi-modal data using multiple neural network architectures and then aligns the spatial distributions across modal representations to integrate these multi-modal representations, thus facilitating the integration of global and spatially local information and improving the consistency of clustering assignments. Our results demonstrate that stMDA outperforms existing methods in identifying spatial domains across diverse platforms and species. Furthermore, stMDA excels in identifying spatially variable genes with high prognostic potential in cancer tissues. In conclusion, stMDA as a new tool of multi-modal data integration provides a powerful and flexible framework for analyzing SRT datasets, thereby advancing our understanding of intricate biological systems.
Affiliation(s)
- Lequn Wang
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
  - University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
- Yaofeng Hu
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
- Kai Xiao
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
  - University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
- Chuanchao Zhang
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
- Qianqian Shi
  - Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, Hubei Province, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, Hubei Province, China
- Luonan Chen
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
  - University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
206
Baul S, Tanvir Ahmed K, Jiang Q, Wang G, Li Q, Yong J, Zhang W. Integrating spatial transcriptomics and bulk RNA-seq: predicting gene expression with enhanced resolution through graph attention networks. Brief Bioinform 2024; 25:bbae316. [PMID: 38960406 PMCID: PMC11221891 DOI: 10.1093/bib/bbae316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 06/04/2024] [Accepted: 06/17/2024] [Indexed: 07/05/2024] Open
Abstract
Spatial transcriptomics data play a crucial role in cancer research, providing a nuanced understanding of the spatial organization of gene expression within tumor tissues. Unraveling the spatial dynamics of gene expression can unveil key insights into tumor heterogeneity and aid in identifying potential therapeutic targets. However, in many large-scale cancer studies, spatial transcriptomics data are limited, with bulk RNA-seq and corresponding Whole Slide Image (WSI) data being more common (e.g. TCGA project). To address this gap, there is a critical need to develop methodologies that can estimate gene expression at near-cell (spot) level resolution from existing WSI and bulk RNA-seq data. This approach is essential for reanalyzing expansive cohort studies and uncovering novel biomarkers that have been overlooked in the initial assessments. In this study, we present STGAT (Spatial Transcriptomics Graph Attention Network), a novel approach leveraging Graph Attention Networks (GAT) to discern spatial dependencies among spots. Trained on spatial transcriptomics data, STGAT is designed to estimate gene expression profiles at spot-level resolution and predict whether each spot represents tumor or non-tumor tissue, especially in patient samples where only WSI and bulk RNA-seq data are available. Comprehensive tests on two breast cancer spatial transcriptomics datasets demonstrated that STGAT outperformed existing methods in accurately predicting gene expression. Further analyses using the TCGA breast cancer dataset revealed that gene expression estimated from tumor-only spots (predicted by STGAT) provides more accurate molecular signatures for breast cancer sub-type and tumor stage prediction, and also leading to improved patient survival and disease-free analysis. Availability: Code is available at https://github.com/compbiolabucf/STGAT.
Affiliation(s)
- Sudipto Baul
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Khandakar Tanvir Ahmed
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Qibing Jiang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Guangyu Wang
  - Houston Methodist Research Institute, Weill Cornell Medical College, Houston, TX 77030, United States
- Qian Li
  - Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, United States
- Jeongsik Yong
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, United States
- Wei Zhang
  - Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
207
Li S, Gai K, Dong K, Zhang Y, Zhang S. High-density generation of spatial transcriptomics with STAGE. Nucleic Acids Res 2024; 52:4843-4856. [PMID: 38647109 PMCID: PMC11109953 DOI: 10.1093/nar/gkae294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Revised: 03/06/2024] [Accepted: 04/06/2024] [Indexed: 04/25/2024] Open
Abstract
Spatial transcriptome technologies have enabled the measurement of gene expression while maintaining spatial location information for deciphering the spatial heterogeneity of biological tissues. However, they were heavily limited by the sparse spatial resolution and low data quality. To this end, we develop a spatial location-supervised auto-encoder generator STAGE for generating high-density spatial transcriptomics (ST). STAGE takes advantage of the customized supervised auto-encoder to learn continuous patterns of gene expression in space and generate high-resolution expressions for given spatial coordinates. STAGE can improve the low quality of spatial transcriptome data and smooth the generated manifold of gene expression through the de-noising function on the latent codes of the auto-encoder. Applications to four ST datasets, STAGE has shown better recovery performance for down-sampled data than existing methods, revealed significant tissue structure specificity, and enabled robust identification of spatially informative genes and patterns. In addition, STAGE can be extended to three-dimensional (3D) stacked ST data for generating gene expression at any position between consecutive sections for shaping high-density 3D ST configuration.
Affiliation(s)
- Shang Li
  - NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kuo Gai
  - NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kangning Dong
  - NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yiyang Zhang
  - School of Software, Yunnan University, Kunming 650091, China
- Shihua Zhang
  - NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
208
Aminu M, Zhu B, Vokes N, Chen H, Hong L, Li J, Fujimoto J, Yang Y, Wang T, Wang B, Poteete A, Nilsson MB, Le X, Tina C, Jaffray D, Navin N, Byers LA, Gibbons D, Heymach J, Chen K, Cheng C, Zhang J, Wu J. CoCo-ST: Comparing and Contrasting Spatial Transcriptomics data sets using graph contrastive learning. Research Square 2024:rs.3.rs-4359834. [PMID: 38826463 PMCID: PMC11142361 DOI: 10.21203/rs.3.rs-4359834/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Traditional feature dimension reduction methods have been widely used to uncover biological patterns or structures within individual spatial transcriptomics data. However, these methods are designed to yield feature representations that emphasize patterns or structures with dominant high variance, such as the normal tissue spatial pattern in a precancer setting. Consequently, they may inadvertently overlook patterns of interest that are potentially masked by these high-variance structures. Herein we present our graph contrastive feature representation method called CoCo-ST (Comparing and Contrasting Spatial Transcriptomics) to overcome this limitation. By incorporating a background data set representing normal tissue, this approach enhances the identification of interesting patterns in a target data set representing precancerous tissue. Simultaneously, it mitigates the influence of dominant common patterns shared by the background and target data sets. This enables discerning biologically relevant features crucial for capturing tissue-specific patterns, a capability we showcased through the analysis of serial mouse precancerous lung tissue samples.
Affiliation(s)
- Muhammad Aminu
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - These authors contributed equally: Muhammad Aminu, Bo Zhu, Natalie Vokes
- Bo Zhu
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - These authors contributed equally: Muhammad Aminu, Bo Zhu, Natalie Vokes
- Natalie Vokes
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - These authors contributed equally: Muhammad Aminu, Bo Zhu, Natalie Vokes
- Hong Chen
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Lingzhi Hong
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Jianrong Li
  - Department of Medicine, Institution of Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Junya Fujimoto
  - Clinical Research Center, Hiroshima University, Hiroshima, Japan
- Yuqui Yang
  - Department of Public Health, UT Southwestern Medical Center, Dallas, TX, USA
- Tao Wang
  - Department of Public Health, UT Southwestern Medical Center, Dallas, TX, USA
- Bo Wang
  - Department of Medical Biophysics, University of Toronto, Ontario, Canada
- Alissa Poteete
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Monique B. Nilsson
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Xiuning Le
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Cascone Tina
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- David Jaffray
  - Office of the Chief Technology and Digital Officer, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Nick Navin
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Lauren A. Byers
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Don Gibbons
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- John Heymach
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Ken Chen
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Chao Cheng
  - Department of Medicine, Institution of Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Jianjun Zhang
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Co-senior authors: Jianjun Zhang, Jia Wu
- Jia Wu
  - Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Department of Thoracic/Head and Neck Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - Co-senior authors: Jianjun Zhang, Jia Wu
209
Ospina OE, Soupir AC, Manjarres-Betancur R, Gonzalez-Calderon G, Yu X, Fridley BL. Differential gene expression analysis of spatial transcriptomic experiments using spatial mixed models. Sci Rep 2024; 14:10967. [PMID: 38744956 PMCID: PMC11094014 DOI: 10.1038/s41598-024-61758-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 05/09/2024] [Indexed: 05/16/2024] Open
Abstract
Spatial transcriptomics (ST) assays represent a revolution in how the architecture of tissues is studied by allowing for the exploration of cells in their spatial context. A common element in the analysis is delineating tissue domains or "niches" followed by detecting differentially expressed genes to infer the biological identity of the tissue domains or cell types. However, many studies approach differential expression analysis by using statistical approaches often applied in the analysis of non-spatial scRNA data (e.g., two-sample t-tests, Wilcoxon's rank sum test), hence neglecting the spatial dependency observed in ST data. In this study, we show that applying linear mixed models with spatial correlation structures using spatial random effects effectively accounts for the spatial autocorrelation and reduces inflation of type-I error rate observed in non-spatial based differential expression testing. We also show that spatial linear models with an exponential correlation structure provide a better fit to the ST data as compared to non-spatial models, particularly for spatially resolved technologies that quantify expression at finer scales (i.e., single-cell resolution).
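An exponential correlation structure of the kind mentioned above can be written as corr(d) = exp(-d / range), where d is the distance between two spots. The sketch below builds that matrix and uses it in a generalized least squares fit of a fixed effect. It is a simplified illustration, not the authors' mixed-model implementation: the range parameter is assumed known instead of being estimated, there is no nugget term, and all names and toy values are hypothetical.

```python
import numpy as np


def exponential_correlation(coords, range_param):
    """Correlation matrix with entries exp(-distance / range_param)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-dist / range_param)


def gls_fit(X, y, corr):
    """Generalized least squares: solve (X' V^-1 X) beta = X' V^-1 y."""
    v_inv_x = np.linalg.solve(corr, X)
    v_inv_y = np.linalg.solve(corr, y)
    return np.linalg.solve(X.T @ v_inv_x, X.T @ v_inv_y)


# Toy data (hypothetical): four spots, two per tissue domain.
coords = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])  # intercept, domain
true_beta = np.array([2.0, 0.5])
y = X @ true_beta  # noise-free expression, so the fit should recover true_beta

corr = exponential_correlation(coords, range_param=1.0)
beta_hat = gls_fit(X, y, corr)
```

The second coefficient plays the role of the between-domain expression difference. With real data the correlation between nearby spots changes its standard error relative to an ordinary two-sample test, which is the effect the abstract attributes to ignoring spatial dependency.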
Affiliation(s)
- Oscar E Ospina
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Alex C Soupir
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Xiaoqing Yu
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Brooke L Fridley
  - Department of Biostatistics & Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
  - Biostatistics and Epidemiology Core, Division of Health Services & Outcomes Research, Children's Mercy, Kansas City, MO, USA
210
Schmidt M, Avagyan S, Reiche K, Binder H, Loeffler-Wirth H. A Spatial Transcriptomics Browser for Discovering Gene Expression Landscapes across Microscopic Tissue Sections. Curr Issues Mol Biol 2024; 46:4701-4720. [PMID: 38785552 PMCID: PMC11119626 DOI: 10.3390/cimb46050284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 04/30/2024] [Accepted: 05/03/2024] [Indexed: 05/25/2024] Open
Abstract
A crucial feature of life is its spatial organization and compartmentalization on the molecular, cellular, and tissue levels. Spatial transcriptomics (ST) technology has opened a new chapter of the sequencing revolution, emerging rapidly with transformative effects across biology. This technique produces extensive and complex sequencing data, raising the need for computational methods for their comprehensive analysis and interpretation. We developed the ST browser web tool for the interactive discovery of ST images, focusing on different functional aspects such as single gene expression, the expression of functional gene sets, as well as the inspection of the spatial patterns of cell-cell interactions. As a unique feature, our tool applies self-organizing map (SOM) machine learning to the ST data. Our SOM data portrayal method generates individual gene expression landscapes for each spot in the ST image, enabling its downstream analysis with high resolution. The performance of the spatial browser is demonstrated by disentangling the intra-tumoral heterogeneity of melanoma and the microarchitecture of the mouse brain. The integration of machine-learning-based SOM portrayal into an interactive ST analysis environment opens novel perspectives for the comprehensive knowledge mining of the organization and interactions of cellular ecosystems.
Affiliation(s)
- Maria Schmidt
- Interdisciplinary Centre for Bioinformatics (IZBI), Leipzig University, Härtelstr. 16-18, 04107 Leipzig, Germany; (M.S.); (H.B.)
| | - Susanna Avagyan
- Armenian Bioinformatics Institute, 3/6 Nelson Stepanyan Str., Yerevan 0062, Armenia
| | - Kristin Reiche
- Department of Diagnostics, Fraunhofer Institute for Cell Therapy and Immunology (IZI), Perlickstrasse 1, 04103 Leipzig, Germany
- Institute for Clinical Immunology, University Hospital of Leipzig, 04103 Leipzig, Germany
| | - Hans Binder
- Interdisciplinary Centre for Bioinformatics (IZBI), Leipzig University, Härtelstr. 16-18, 04107 Leipzig, Germany; (M.S.); (H.B.)
- Armenian Bioinformatics Institute, 3/6 Nelson Stepanyan Str., Yerevan 0062, Armenia
| | - Henry Loeffler-Wirth
- Interdisciplinary Centre for Bioinformatics (IZBI), Leipzig University, Härtelstr. 16-18, 04107 Leipzig, Germany; (M.S.); (H.B.)
| |
|
211
|
Li J, Wang Y, Raina MA, Xu C, Su L, Guo Q, Ma Q, Wang J, Xu D. scBSP: A fast and accurate tool for identifying spatially variable genes from spatial transcriptomic data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.06.592851. [PMID: 38765956 PMCID: PMC11100755 DOI: 10.1101/2024.05.06.592851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/22/2024]
Abstract
Spatially resolved transcriptomics has enabled the inference of gene expression patterns within two- and three-dimensional space, while introducing computational challenges due to growing spatial resolutions and sparse expression. Here, we introduce scBSP, an open-source, versatile, and user-friendly package designed for identifying spatially variable genes (SVGs) in large-scale spatial transcriptomics. scBSP implements sparse matrix operations to significantly increase computational efficiency in both computation time and memory usage, processing high-definition spatial transcriptomics data for 19,950 genes on 181,367 spots within 10 seconds. Applied to diverse sequencing data and simulations, scBSP efficiently identifies spatially variable genes, demonstrating fast computational speed and consistency across various sequencing techniques and spatial resolutions for both two- and three-dimensional data with up to millions of cells. On a sample with hundreds of thousands of spots, scBSP identifies SVGs accurately in seconds on a typical desktop computer.
|
212
|
Ramirez Flores RO, Schäfer PSL, Küchenhoff L, Saez-Rodriguez J. Complementing Cell Taxonomies with a Multicellular Analysis of Tissues. Physiology (Bethesda) 2024; 39:0. [PMID: 38319138 DOI: 10.1152/physiol.00001.2024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2024] [Accepted: 01/31/2024] [Indexed: 02/07/2024] Open
Abstract
The application of single-cell molecular profiling coupled with spatial technologies has enabled charting of cellular heterogeneity in reference tissues and in disease. This new wave of molecular data has highlighted the expected diversity of single-cell dynamics upon shared external cues and spatial organizations. However, little is known about the relationship between single-cell heterogeneity and the emergence and maintenance of robust multicellular processes in developed tissues and its role in (patho)physiology. Here, we present emerging computational modeling strategies that use increasingly available large-scale cross-condition single-cell and spatial datasets to study multicellular organization in tissues and complement cell taxonomies. This perspective should enable us to better understand how cells within tissues collectively process information and adapt synchronized responses in disease contexts and to bridge the gap between structural changes and functions in tissues.
Affiliation(s)
- Ricardo Omar Ramirez Flores
- Faculty of Medicine, Heidelberg University and Institute for Computational Biomedicine, Heidelberg University Hospital, Heidelberg, Germany
| | - Philipp Sven Lars Schäfer
- Faculty of Medicine, Heidelberg University and Institute for Computational Biomedicine, Heidelberg University Hospital, Heidelberg, Germany
| | - Leonie Küchenhoff
- Faculty of Medicine, Heidelberg University and Institute for Computational Biomedicine, Heidelberg University Hospital, Heidelberg, Germany
| | - Julio Saez-Rodriguez
- Faculty of Medicine, Heidelberg University and Institute for Computational Biomedicine, Heidelberg University Hospital, Heidelberg, Germany
| |
|
213
|
Budhkar A, Tang Z, Liu X, Zhang X, Su J, Song Q. xSiGra: Explainable model for single-cell spatial data elucidation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.27.591458. [PMID: 38746321 PMCID: PMC11092461 DOI: 10.1101/2024.04.27.591458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/16/2024]
Abstract
Recent advancements in spatial imaging technologies have revolutionized the acquisition of high-resolution multi-channel images, gene expressions, and spatial locations at the single-cell level. Our study introduces xSiGra, an interpretable graph-based AI model, designed to elucidate interpretable features of identified spatial cell types, by harnessing multi-modal features from spatial imaging technologies. By constructing a spatial cellular graph with immunohistology images and gene expression as node attributes, xSiGra employs hybrid graph transformer models to delineate spatial cell types. Additionally, xSiGra integrates a novel variant of the Grad-CAM component to uncover interpretable features, including pivotal genes and cells for various cell types, thereby facilitating deeper biological insights from spatial data. Through rigorous benchmarking against existing methods, xSiGra demonstrates superior performance across diverse spatial imaging datasets. Application of xSiGra to a lung tumor slice unveils the importance scores of cells, illustrating that the activity of a cell is not determined solely by the cell itself but is also affected by neighboring cells. Moreover, leveraging the identified interpretable genes, xSiGra reveals an endothelial cell subset interacting with tumor cells, indicating heterogeneous underlying mechanisms within the complex cellular communications.
|
214
|
Wang W, Zheng S, Shin SC, Yuan GC. Characterizing Spatially Continuous Variations in Tissue Microenvironment through Niche Trajectory Analysis. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.23.590827. [PMID: 38712255 PMCID: PMC11071437 DOI: 10.1101/2024.04.23.590827] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]
Abstract
Recent technological developments have made it possible to map the spatial organization of a tissue at the single-cell resolution. However, computational methods for analyzing spatially continuous variations in tissue microenvironment are still lacking. Here we present ONTraC as a strategy that constructs niche trajectories using a graph neural network-based modeling framework. Our benchmark analysis shows that ONTraC performs more favorably than existing methods for reconstructing spatial trajectories. Applications of ONTraC to public spatial transcriptomics datasets successfully recapitulated the underlying anatomical structure, and further enabled detection of tissue microenvironment-dependent changes in gene regulatory networks and cell-cell interaction activities during embryonic development. Taken together, ONTraC provides a useful and generally applicable tool for the systematic characterization of the structural and functional organization of tissue microenvironments.
Affiliation(s)
- Wen Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Shiwei Zheng
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sujung Crystal Shin
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
|
215
|
Daly AC, Cambuli F, Äijö T, Lötstedt B, Marjanovic N, Kuksenko O, Smith-Erb M, Fernandez S, Domovic D, Van Wittenberghe N, Drokhlyansky E, Griffin GK, Phatnani H, Bonneau R, Regev A, Vickovic S. Tissue and cellular spatiotemporal dynamics in colon aging. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.22.590125. [PMID: 38712088 PMCID: PMC11071407 DOI: 10.1101/2024.04.22.590125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]
Abstract
Tissue structure and molecular circuitry in the colon can be profoundly impacted by systemic age-related effects, but many of the underlying molecular cues remain unclear. Here, we built a cellular and spatial atlas of the colon across three anatomical regions and 11 age groups, encompassing ~1,500 mouse gut tissues profiled by spatial transcriptomics and ~400,000 single nucleus RNA-seq profiles. We developed a new computational framework, cSplotch, which learns a hierarchical Bayesian model of spatially resolved cellular expression associated with age, tissue region, and sex, by leveraging histological features to share information across tissue samples and data modalities. Using this model, we identified cellular and molecular gradients along the adult colonic tract and across the main crypt axis, and multicellular programs associated with aging in the large intestine. Our multi-modal framework for the investigation of cell and tissue organization can aid in the understanding of cellular roles in tissue-level pathology.
Affiliation(s)
- Aidan C. Daly
- New York Genome Center, New York, NY, USA
- Center for Computational Biology, Flatiron Institute, New York, NY, USA
| | | | - Tarmo Äijö
- Center for Computational Biology, Flatiron Institute, New York, NY, USA
| | - Britta Lötstedt
- New York Genome Center, New York, NY, USA
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
| | - Nemanja Marjanovic
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Olena Kuksenko
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Columbia University Irving Medical Center, New York, NY, USA
| | | | | | | | | | - Eugene Drokhlyansky
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Gabriel K Griffin
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Pathology, Brigham and Women’s Hospital, Boston, MA, USA
| | - Hemali Phatnani
- New York Genome Center, New York, NY, USA
- Department of Neurology, Columbia University Irving Medical Center, New York, NY, USA
| | - Richard Bonneau
- Center for Computational Biology, Flatiron Institute, New York, NY, USA
- Center for Data Science, New York University, New York, NY, USA
- Current address: Genentech, 1 DNA Way, South San Francisco, CA, USA
| | - Aviv Regev
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Current address: Genentech, 1 DNA Way, South San Francisco, CA, USA
| | - Sanja Vickovic
- New York Genome Center, New York, NY, USA
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Engineering and Herbert Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Beijer Laboratory for Gene and Neuro Research, Uppsala University, Uppsala, Sweden
| |
|
216
|
Chakrabarti A, Ni Y, Mallick BK. Joint Bayesian estimation of cell dependence and gene associations in spatially resolved transcriptomic data. Sci Rep 2024; 14:9516. [PMID: 38664448 PMCID: PMC11045727 DOI: 10.1038/s41598-024-60002-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2023] [Accepted: 04/17/2024] [Indexed: 04/28/2024] Open
Abstract
Recent technologies such as spatial transcriptomics enable the measurement of gene expression at the single-cell level along with the spatial locations of these cells in the tissue. Spatial clustering of the cells provides valuable insights into the functional organization of the tissue. However, most such clustering methods involve some dimension reduction that leads to a loss of the inherent dependency structure among genes at any spatial location in the tissue. This discards valuable information on gene co-expression patterns and may also impair spatial clustering performance. In spatial transcriptomics, the matrix-variate gene expression data, along with spatial coordinates of the single cells, provides information on both gene expression dependencies and cell spatial dependencies through its row and column covariances. In this work, we propose a joint Bayesian approach to simultaneously estimate these gene and spatial cell correlations. These estimates provide data summaries for downstream analyses. We illustrate our method with simulations and analysis of several real spatial transcriptomic datasets. Our work elucidates gene co-expression networks as well as clear spatial clustering patterns of the cells. Furthermore, our analysis reveals that downstream spatial-differential analysis may aid in the discovery of unknown cell types from known marker genes.
Affiliation(s)
- Arhit Chakrabarti
- Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
| | - Yang Ni
- Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
| | - Bani K Mallick
- Department of Statistics, Texas A&M University, College Station, TX, 77843, USA
| |
|
217
|
Yang J, Jiang X, Jin KW, Shin S, Li Q. Bayesian hidden mark interaction model for detecting spatially variable genes in imaging-based spatially resolved transcriptomics data. Front Genet 2024; 15:1356709. [PMID: 38725485 PMCID: PMC11079231 DOI: 10.3389/fgene.2024.1356709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2023] [Accepted: 04/08/2024] [Indexed: 05/12/2024] Open
Abstract
Recent technology breakthroughs in spatially resolved transcriptomics (SRT) have enabled the comprehensive molecular characterization of cells whilst preserving their spatial and gene expression contexts. One of the fundamental questions in analyzing SRT data is the identification of spatially variable genes whose expressions display spatially correlated patterns. Existing approaches are built upon either the Gaussian process-based model, which relies on ad hoc kernels, or the energy-based Ising model, which requires gene expression to be measured on a lattice grid. To overcome these potential limitations, we developed a generalized energy-based framework to model gene expression measured from imaging-based SRT platforms, accommodating the irregular spatial distribution of measured cells. Our Bayesian model applies a zero-inflated negative binomial mixture model to dichotomize the raw count data, reducing noise. Additionally, we incorporate a geostatistical mark interaction model with a generalized energy function, where the interaction parameter is used to identify the spatial pattern. Auxiliary variable MCMC algorithms were employed to sample from the posterior distribution with an intractable normalizing constant. We demonstrated the strength of our method on both simulated and real data. Our simulation study showed that our method captured various spatial patterns with high accuracy; moreover, analysis of a seqFISH dataset and a STARmap dataset established that our proposed method is able to identify genes with novel and strong spatial patterns.
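To make the zero-inflated negative binomial (ZINB) ingredient named in the abstract more concrete, the sketch below evaluates a ZINB probability mass function using the common mean/dispersion parameterization. It shows only the distribution itself; the way the paper's mixture model assigns counts to expression states, and its MCMC sampler, are not reproduced. All function and parameter names are assumptions made for this example.

```python
import math

def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean mu > 0 and dispersion parameter size > 0."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(k, mu, size, pi_zero):
    """ZINB pmf: a point mass pi_zero at zero mixed with a negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base

def prob_structural_zero(mu, size, pi_zero):
    """Probability that an observed zero comes from the zero-inflation component."""
    return pi_zero / zinb_pmf(0, mu, size, pi_zero)

# Hypothetical parameter values, chosen only for illustration.
p_zero = zinb_pmf(0, mu=2.0, size=1.5, pi_zero=0.3)
p_struct = prob_structural_zero(mu=2.0, size=1.5, pi_zero=0.3)
```

The extra mass at zero is what lets such a model separate dropout-like zeros from genuinely low expression, which is one reason ZINB models are often used for sparse count data.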
Affiliation(s)
- Jie Yang
- Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, United States
| | - Xi Jiang
- Department of Statistics and Data Science, Southern Methodist University, Dallas, TX, United States
| | - Kevin Wang Jin
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States
| | - Sunyoung Shin
- Department of Mathematics, Pohang University of Science and Technology, Pohang, Republic of Korea
| | - Qiwei Li
- Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, United States
| |
|
218
|
Wang C, Acosta D, McNutt M, Bian J, Ma A, Fu H, Ma Q. A Single-cell and Spatial RNA-seq Database for Alzheimer's Disease (ssREAD). BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.09.08.556944. [PMID: 37745592 PMCID: PMC10515769 DOI: 10.1101/2023.09.08.556944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/26/2023]
Abstract
Alzheimer's Disease (AD) pathology has been increasingly explored through single-cell and single-nucleus RNA-sequencing (scRNA-seq & snRNA-seq) and spatial transcriptomics (ST). However, the surge in data demands a comprehensive, user-friendly repository. Addressing this, we introduce a single-cell and spatial RNA-seq database for Alzheimer's disease (ssREAD). It offers a broader spectrum of AD-related datasets, an optimized analytical pipeline, and improved usability. The database encompasses 1,053 samples (277 integrated datasets) from 67 AD-related scRNA-seq & snRNA-seq studies, totaling 7,332,202 cells. Additionally, it archives 381 ST datasets from 18 human and mouse brain studies. Each dataset is annotated with details such as species, gender, brain region, disease/control status, age, and AD Braak stages. ssREAD also provides an analysis suite for cell clustering, identification of differentially expressed and spatially variable genes, cell-type-specific marker genes and regulons, and spot deconvolution for integrative analysis. ssREAD is freely available at https://bmblx.bmi.osumc.edu/ssread/.
Affiliation(s)
- Cankun Wang
- Department of Biomedical Informatics, The Ohio State University, OH 43210, USA
| | - Diana Acosta
- Department of Neuroscience, The Ohio State University, OH 43210, USA
| | - Megan McNutt
- Department of Biomedical Informatics, The Ohio State University, OH 43210, USA
| | - Jiang Bian
- Department of Health Outcomes & Biomedical Informatics, University of Florida, FL 32606, USA
| | - Anjun Ma
- Department of Biomedical Informatics, The Ohio State University, OH 43210, USA
| | - Hongjun Fu
- Department of Neuroscience, The Ohio State University, OH 43210, USA
- Chronic Brain Injury Program, The Ohio State University, OH 43210, USA
| | - Qin Ma
- Department of Biomedical Informatics, The Ohio State University, OH 43210, USA
| |
|
219
|
Tian J, Bai X, Quek C. Single-Cell Informatics for Tumor Microenvironment and Immunotherapy. Int J Mol Sci 2024; 25:4485. [PMID: 38674070 PMCID: PMC11050520 DOI: 10.3390/ijms25084485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2024] [Revised: 04/12/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open
Abstract
Cancer comprises malignant cells surrounded by the tumor microenvironment (TME), a dynamic ecosystem composed of heterogeneous cell populations that exert unique influences on tumor development. The immune community within the TME plays a substantial role in tumorigenesis and tumor evolution. The innate and adaptive immune cells "talk" to the tumor through ligand-receptor interactions and signaling molecules, forming a complex communication network to influence the cellular and molecular basis of cancer. Such intricate intratumoral immune composition and interactions foster the application of immunotherapies, which empower the immune system against cancer to elicit durable long-term responses in cancer patients. Single-cell technologies have allowed for the dissection and characterization of the TME to an unprecedented level, while recent advancements in bioinformatics tools have expanded the horizon and depth of high-dimensional single-cell data analysis. This review will unravel the intertwined networks between malignancy and immunity, explore the utilization of computational tools for a deeper understanding of tumor-immune communications, and discuss the application of these approaches to aid in diagnosis or treatment decision making in the clinical setting, as well as the current challenges faced by the researchers with their potential future improvements.
Affiliation(s)
| | | | - Camelia Quek
- Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia; (J.T.); (X.B.)
| |
|
220
|
Yu S, Li WV. spVC for the detection and interpretation of spatial gene expression variation. Genome Biol 2024; 25:103. [PMID: 38641849 PMCID: PMC11027374 DOI: 10.1186/s13059-024-03245-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2023] [Accepted: 04/10/2024] [Indexed: 04/21/2024] Open
Abstract
Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC's accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.
Affiliation(s)
- Shan Yu
- Department of Statistics, University of Virginia, Charlottesville, 22903, VA, USA.
| | - Wei Vivian Li
- Department of Statistics, University of California, Riverside, 92521, CA, USA.
| |
|
221
|
Bhuva DD, Tan CW, Salim A, Marceaux C, Pickering MA, Chen J, Kharbanda M, Jin X, Liu N, Feher K, Putri G, Tilley WD, Hickey TE, Asselin-Labat ML, Phipson B, Davis MJ. Library size confounds biology in spatial transcriptomics data. Genome Biol 2024; 25:99. [PMID: 38637899 PMCID: PMC11025268 DOI: 10.1186/s13059-024-03241-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2023] [Accepted: 04/09/2024] [Indexed: 04/20/2024] Open
Abstract
Spatial molecular data have transformed the study of disease microenvironments, though larger datasets pose an analytical challenge, prompting the direct adoption of single-cell RNA-sequencing tools, including normalization methods. Here, we demonstrate that library size is associated with tissue structure and that normalizing these effects out using commonly applied scRNA-seq normalization methods will negatively affect spatial domain identification. Spatial data should not be specifically corrected for library size prior to analysis, and algorithms designed for scRNA-seq data should be adopted with caution.
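A small sketch can show why library-size correction may erase spatial signal. Library size is the total count per spot, and a generic total-count scaling divides each spot by that total. The code below is an illustrative assumption, not one of the specific normalization methods evaluated in the paper; the function names, the `target` value, and the toy count matrix are hypothetical.

```python
def library_sizes(counts):
    """Total count per spot (rows are spots, columns are genes)."""
    return [sum(spot) for spot in counts]

def scale_to_library_size(counts, target=10_000):
    """Rescale every spot so its counts sum to `target` (generic total-count scaling)."""
    sizes = library_sizes(counts)
    return [
        [target * c / s if s else 0.0 for c in spot]
        for spot, s in zip(counts, sizes)
    ]

# Toy data: spot 0 has ten times the total counts of spot 1,
# but the same relative gene composition.
counts = [
    [90, 10, 0],
    [9, 1, 0],
    [0, 5, 5],
]
sizes = library_sizes(counts)
scaled = scale_to_library_size(counts)
```

After scaling, spots 0 and 1 become identical even though their raw totals differ tenfold. If that difference in total counts reflects tissue structure, as the abstract reports, the scaling removes information that spatial domain identification could otherwise use.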
Affiliation(s)
- Dharmesh D Bhuva
- South Australian immunoGENomics Cancer Institute (SAiGENCI), Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia.
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia.
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia.
| | - Chin Wee Tan
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
- The University of Queensland Fraser Institute, The University of Queensland, Woolloongabba, QLD, 4102, Australia
| | - Agus Salim
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Melbourne School of Population and Global Health, School of Mathematics and Statistics, The University of Melbourne, Melbourne, VIC, 3010, Australia
| | - Claire Marceaux
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
- Personalised Oncology Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
| | - Marie A Pickering
- Dame Roma Mitchell Cancer Research Laboratories, Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
| | - Jinjin Chen
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Malvika Kharbanda
- South Australian immunoGENomics Cancer Institute (SAiGENCI), Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Xinyi Jin
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Ning Liu
- South Australian immunoGENomics Cancer Institute (SAiGENCI), Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Kristen Feher
- South Australian immunoGENomics Cancer Institute (SAiGENCI), Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Givanna Putri
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Wayne D Tilley
- Dame Roma Mitchell Cancer Research Laboratories, Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
| | - Theresa E Hickey
- Dame Roma Mitchell Cancer Research Laboratories, Adelaide Medical School, The University of Adelaide, Adelaide, SA, Australia
| | - Marie-Liesse Asselin-Labat
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
- Personalised Oncology Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
| | - Belinda Phipson
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| | - Melissa J Davis
- South Australian immunoGENomics Cancer Institute (SAiGENCI), Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA, 5005, Australia
- Division of Bioinformatics, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, 3052, Australia
- Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
- The University of Queensland Fraser Institute, The University of Queensland, Woolloongabba, QLD, 4102, Australia
- Department of Clinical Pathology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
| |
|
222
|
Lu Y, Chen QM, An L. SPADE: spatial deconvolution for domain specific cell-type estimation. Commun Biol 2024; 7:469. [PMID: 38632414 PMCID: PMC11024133 DOI: 10.1038/s42003-024-06172-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2023] [Accepted: 04/10/2024] [Indexed: 04/19/2024] Open
Abstract
Understanding gene expression in different cell types within their spatial context is a key goal in genomics research. SPADE (SPAtial DEconvolution), our proposed method, addresses this by integrating spatial patterns into the analysis of cell type composition. This approach uses a combination of single-cell RNA sequencing, spatial transcriptomics, and histological data to accurately estimate the proportions of cell types in various locations. Our analyses of synthetic data have demonstrated SPADE's capability to discern cell type-specific spatial patterns effectively. When applied to real-life datasets, SPADE provides insights into cellular dynamics and the composition of tumor tissues. This enhances our comprehension of complex biological systems and aids in exploring cellular diversity. SPADE represents a significant advancement in deciphering spatial gene expression patterns, offering a powerful tool for the detailed investigation of cell types in spatial transcriptomics.
Affiliation(s)
- Yingying Lu
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA
| | - Qin M Chen
- College of Pharmacy, University of Arizona, Tucson, AZ, 85721, USA
| | - Lingling An
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA.
- Department of Biosystems Engineering, University of Arizona, Tucson, AZ, 85721, USA.
- Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, 85721, USA.
|
223
|
Samadi Z, Askary A. Spatial motifs reveal patterns in cellular architecture of complex tissues. bioRxiv 2024:2024.04.08.588586. [PMID: 38645046 PMCID: PMC11030378 DOI: 10.1101/2024.04.08.588586] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 04/23/2024]
Abstract
Spatial organization of cells is crucial to both proper physiological function of tissues and pathological conditions like cancer. Recent advances in spatial transcriptomics have enabled joint profiling of gene expression and spatial context of the cells. The outcome is an information rich map of the tissue where individual cells, or small regions, can be labeled based on their gene expression state. While spatial transcriptomics excels in its capacity to profile numerous genes within the same sample, most existing methods for analysis of spatial data only examine distribution of one or two labels at a time. These approaches overlook the potential for identifying higher-order associations between cell types - associations that can play a pivotal role in understanding development and function of complex tissues. In this context, we introduce a novel method for detecting motifs in spatial neighborhood graphs. Each motif represents a spatial arrangement of cell types that occurs in the tissue more frequently than expected by chance. To identify spatial motifs, we developed an algorithm for uniform sampling of paths from neighborhood graphs and combined it with a motif finding algorithm on graphs inspired by previous methods for finding motifs in DNA sequences. Using synthetic data with known ground truth, we show that our method can identify spatial motifs with high accuracy and sensitivity. Applied to spatial maps of mouse retinal bipolar cells and hypothalamic preoptic region, our method reveals previously unrecognized patterns in cell type arrangements. In some cases, cells within these spatial patterns differ in their gene expression from other cells of the same type, providing insights into the functional significance of the spatial motifs. These results suggest that our method can illuminate the substantial complexity of neural tissues, provide novel insight even in well studied models, and generate experimentally testable hypotheses.
Affiliation(s)
- Zainalabedin Samadi
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, 90095, CA, USA
| | - Amjad Askary
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, 90095, CA, USA
|
224
|
Zhou Y, He W, Hou W, Zhu Y. Pianno: a probabilistic framework automating semantic annotation for spatial transcriptomics. Nat Commun 2024; 15:2848. [PMID: 38565531 PMCID: PMC11271244 DOI: 10.1038/s41467-024-47152-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 03/20/2024] [Indexed: 04/04/2024] Open
Abstract
Spatial transcriptomics has revolutionized the study of gene expression within tissues, while preserving spatial context. However, annotating spatial spots' biological identity remains a challenge. To tackle this, we introduce Pianno, a Bayesian framework automating structural semantics annotation based on marker genes. Comprehensive evaluations underscore Pianno's remarkable prowess in precisely annotating a wide array of spatial semantics, ranging from diverse anatomical structures to intricate tumor microenvironments, as well as in estimating cell type distributions, across data generated from various spatial transcriptomics platforms. Furthermore, Pianno, in conjunction with clustering approaches, uncovers a region- and species-specific excitatory neuron subtype in the deep layer 3 of the human neocortex, shedding light on cellular evolution in the human neocortex. Overall, Pianno equips researchers with a robust and efficient tool for annotating diverse biological structures, offering new perspectives on spatial transcriptomics data.
Affiliation(s)
- Yuqiu Zhou
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
| | - Wei He
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
| | - Weizhen Hou
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
| | - Ying Zhu
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China.
|
225
|
Hashemi Gheinani A, Kim J, You S, Adam RM. Bioinformatics in urology - molecular characterization of pathophysiology and response to treatment. Nat Rev Urol 2024; 21:214-242. [PMID: 37604982 DOI: 10.1038/s41585-023-00805-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/13/2023] [Indexed: 08/23/2023]
Abstract
The application of bioinformatics has revolutionized the practice of medicine in the past 20 years. From early studies that uncovered subtypes of cancer to broad efforts spearheaded by the Cancer Genome Atlas initiative, the use of bioinformatics strategies to analyse high-dimensional data has provided unprecedented insights into the molecular basis of disease. In addition to the identification of disease subtypes - which enables risk stratification - informatics analysis has facilitated the identification of novel risk factors and drivers of disease, biomarkers of progression and treatment response, as well as possibilities for drug repurposing or repositioning; moreover, bioinformatics has guided research towards precision and personalized medicine. Implementation of specific computational approaches such as artificial intelligence, machine learning and molecular subtyping has yet to become widespread in urology clinical practice for reasons of cost, disruption of clinical workflow and need for prospective validation of informatics approaches in independent patient cohorts. Solving these challenges might accelerate routine integration of bioinformatics into clinical settings.
Affiliation(s)
- Ali Hashemi Gheinani
- Department of Urology, Boston Children's Hospital, Boston, MA, USA
- Department of Surgery, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Urology, Inselspital, Bern, Switzerland
- Department for BioMedical Research, University of Bern, Bern, Switzerland
| | - Jina Kim
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Sungyong You
- Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Rosalyn M Adam
- Department of Urology, Boston Children's Hospital, Boston, MA, USA.
- Department of Surgery, Harvard Medical School, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
226
|
Wu L, Jin W, Yu H, Liu B. Modulating autophagy to treat diseases: A revisited review on in silico methods. J Adv Res 2024; 58:175-191. [PMID: 37192730 PMCID: PMC10982871 DOI: 10.1016/j.jare.2023.05.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/30/2022] [Revised: 05/05/2023] [Accepted: 05/09/2023] [Indexed: 05/18/2023] Open
Abstract
BACKGROUND Autophagy refers to the conserved cellular catabolic process relevant to lysosome activity and plays a vital role in maintaining the dynamic equilibrium of intracellular matter by degrading harmful and abnormally accumulated cellular components. Accumulating evidence has recently revealed that dysregulation of autophagy by genetic and exogenous interventions may disrupt cellular homeostasis in human diseases. In silico approaches as powerful aids to experiments have also been extensively reported to play their critical roles in the storage, prediction, and analysis of massive amounts of experimental data. Thus, modulating autophagy to treat diseases by in silico methods would be anticipated. AIM OF REVIEW Here, we focus on summarizing the updated in silico approaches including databases, systems biology network approaches, omics-based analyses, mathematical models, and artificial intelligence (AI) methods that sought to modulate autophagy for potential therapeutic purposes, which will provide a new insight into more promising therapeutic strategies. KEY SCIENTIFIC CONCEPTS OF REVIEW Autophagy-related databases are the data basis of the in silico method, storing a large amount of information about DNA, RNA, proteins, small molecules and diseases. The systems biology approach is a method to systematically study the interrelationships among biological processes including autophagy from a macroscopic perspective. Omics-based analyses are based on high-throughput data to analyze gene expression at different levels of biological processes involving autophagy. Mathematical models are visualization methods to describe the dynamic process of autophagy, and their accuracy is related to the selection of parameters. AI methods use big data related to autophagy to predict autophagy targets, design targeted small molecules, and classify diverse human diseases for potential therapeutic applications.
Affiliation(s)
- Lifeng Wu
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Wenke Jin
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China
| | - Haiyang Yu
- State Key Laboratory of Component-based Chinese Medicine, Tianjin University of Traditional Chinese Medicine, Tianjin 301617, China; Haihe Laboratory of Modern Chinese Medicine, Tianjin 301617, China.
| | - Bo Liu
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu 610041, China.
|
227
|
Zeng Y, Luo M, Shangguan N, Shi P, Feng J, Xu J, Chen K, Lu Y, Yu W, Yang Y. Deciphering cell types by integrating scATAC-seq data with genome sequences. Nat Comput Sci 2024; 4:285-298. [PMID: 38600256 DOI: 10.1038/s43588-024-00622-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/01/2023] [Accepted: 03/18/2024] [Indexed: 04/12/2024]
Abstract
The single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) technology provides insight into gene regulation and epigenetic heterogeneity at single-cell resolution, but cell annotation from scATAC-seq remains challenging due to high dimensionality and extreme sparsity within the data. Existing cell annotation methods mostly focus on the cell peak matrix without fully utilizing the underlying genomic sequence. Here we propose a method, SANGO, for accurate single-cell annotation by integrating genome sequences around the accessibility peaks within scATAC data. The genome sequences of peaks are encoded into low-dimensional embeddings, and then iteratively used to reconstruct the peak statistics of cells through a fully connected network. The learned weights are considered as regulatory modes to represent cells, and utilized to align the query cells and the annotated cells in the reference data through a graph transformer network for cell annotations. SANGO was demonstrated to consistently outperform competing methods on 55 paired scATAC-seq datasets across samples, platforms and tissues. SANGO was also shown to be able to detect unknown tumor cells through attention edge weights learned by the graph transformer. Moreover, from the annotated cells, we found cell-type-specific peaks that provide functional insights/biological signals through expression enrichment analysis, cis-regulatory chromatin interaction analysis and motif enrichment analysis.
Affiliation(s)
- Yuansong Zeng
- School of Big Data and Software Engineering, Chongqing University, Chongqing, China
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Mai Luo
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Ningyuan Shangguan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Peiyu Shi
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Junxi Feng
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Jin Xu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Ken Chen
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yutong Lu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Weijiang Yu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
- Key Laboratory of Machine Intelligence and Advanced Computing (MOE), Guangzhou, China.
|
228
|
Yuan Z, Zhao F, Lin S, Zhao Y, Yao J, Cui Y, Zhang XY, Zhao Y. Benchmarking spatial clustering methods with spatially resolved transcriptomics data. Nat Methods 2024; 21:712-722. [PMID: 38491270 DOI: 10.1038/s41592-024-02215-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 02/16/2024] [Indexed: 03/18/2024]
Abstract
Spatial clustering, which shares an analogy with single-cell clustering, has expanded the scope of tissue physiology studies from cell-centroid to structure-centroid with spatially resolved transcriptomics (SRT) data. Computational methods have undergone remarkable development in recent years, but a comprehensive benchmark study is still lacking. Here we present a benchmark study of 13 computational methods on 34 SRT data (7 datasets). The performance was evaluated on the basis of accuracy, spatial continuity, marker genes detection, scalability, and robustness. We found existing methods were complementary in terms of their performance and functionality, and we provide guidance for selecting appropriate methods for given scenarios. On testing 22 additional challenging datasets, we identified challenges in identifying noncontinuous spatial domains and limitations of existing methods, highlighting their inadequacies in handling recent large-scale tasks. Furthermore, with 145 simulated data, we examined the robustness of these methods against four different factors, and assessed the impact of pre- and postprocessing approaches. Our study offers a comprehensive evaluation of existing spatial clustering methods with SRT data, paving the way for future advancements in this rapidly evolving field.
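The abstract above lists accuracy as one evaluation axis but does not name a metric. As an illustration only, the sketch below computes the adjusted Rand index (ARI), a widely used score for comparing a predicted clustering with reference labels. Treating ARI as the accuracy measure is an assumption made here, not something the abstract states, and the function name and toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same spots.

    Assumption: ARI stands in for the benchmark's "accuracy" axis;
    the abstract does not say which metric was used.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # zero or one spot: the labelings trivially agree

    # Pairs of spots grouped together in both, in truth only, in prediction only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = in_true * in_pred / total_pairs  # agreement expected by chance
    maximum = (in_true + in_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (both - expected) / (maximum - expected)


# Toy example: four spots, two annotated domains (hypothetical labels).
truth = ["layer1", "layer1", "layer2", "layer2"]
print(adjusted_rand_index(truth, ["a", "a", "b", "b"]))  # 1.0, names are irrelevant
print(adjusted_rand_index(truth, ["a", "b", "a", "b"]))  # -0.5, worse than chance
```

ARI is invariant to how clusters are named, so methods that number their domains differently can still be compared against the same annotation.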
Affiliation(s)
- Zhiyuan Yuan
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China.
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China.
| | - Fangyuan Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Senlin Lin
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Yu Zhao
- Tencent AI Lab, Shenzhen, China
| | | | - Yan Cui
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
| | - Xiao-Yong Zhang
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
| | - Yi Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
- University of Chinese Academy of Sciences, Beijing, China.
|
229
|
Lyu Y, Wu C, Sun W, Li Z. Regional analysis to delineate intrasample heterogeneity with RegionalST. Bioinformatics 2024; 40:btae186. [PMID: 38579257 PMCID: PMC11026142 DOI: 10.1093/bioinformatics/btae186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 03/06/2024] [Accepted: 04/03/2024] [Indexed: 04/07/2024] Open
Abstract
MOTIVATION Spatial transcriptomics has greatly contributed to our understanding of spatial and intra-sample heterogeneity, which could be crucial for deciphering the molecular basis of human diseases. Intra-tumor heterogeneity, for example, may be associated with cancer treatment responses. However, the lack of computational tools for exploiting cross-regional information and the limited spatial resolution of current technologies present major obstacles to elucidating tissue heterogeneity. RESULTS To address these challenges, we introduce RegionalST, an efficient computational method that enables users to quantify cell type mixture and interactions, identify sub-regions of interest, and perform cross-region cell type-specific differential analysis for the first time. Our simulations and real data applications demonstrate that RegionalST is an efficient tool for visualizing and analyzing diverse spatial transcriptomics data, thereby enabling accurate and flexible exploration of tissue heterogeneity. Overall, RegionalST provides a one-stop destination for researchers seeking to delve deeper into the intricacies of spatial transcriptomics data. AVAILABILITY AND IMPLEMENTATION The implementation of our method is available as an open-source R/Bioconductor package with a user-friendly manual available at https://bioconductor.org/packages/release/bioc/html/RegionalST.html.
Affiliation(s)
- Yue Lyu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
- Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States
| | - Chong Wu
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
| | - Wei Sun
- Biostatistics Program, Public Health Science Division, Fred Hutchinson Cancer Center, Seattle, WA 98109, United States
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27516, United States
- Department of Biostatistics, University of Washington, Seattle, WA 98195, United States
| | - Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, United States
|
230
|
Lei L, Han K, Wang Z, Shi C, Wang Z, Dai R, Zhang Z, Wang M, Guo Q. Attention-guided variational graph autoencoders reveal heterogeneity in spatial transcriptomics. Brief Bioinform 2024; 25:bbae173. [PMID: 38627939 PMCID: PMC11021349 DOI: 10.1093/bib/bbae173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2023] [Revised: 03/03/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024] Open
Abstract
The latest breakthroughs in spatially resolved transcriptomics technology offer comprehensive opportunities to delve into gene expression patterns within the tissue microenvironment. However, the precise identification of spatial domains within tissues remains challenging. In this study, we introduce AttentionVGAE (AVGN), which integrates slice images, spatial information and raw gene expression while calibrating low-quality gene expression. By combining the variational graph autoencoder with multi-head attention blocks (MHA blocks), AVGN captures spatial relationships in tissue gene expression, adaptively focusing on key features and alleviating the need for prior knowledge of cluster numbers, thereby achieving superior clustering performance. Particularly, AVGN attempts to balance the model's attention focus on local and global structures by utilizing MHA blocks, an aspect that current graph neural networks have not extensively addressed. Benchmark testing demonstrates its significant efficacy in elucidating tissue anatomy and interpreting tumor heterogeneity, indicating its potential in advancing spatial transcriptomics research and understanding complex biological phenomena.
Affiliation(s)
- Lixin Lei
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Kaitai Han
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Zijun Wang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Chaojing Shi
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Zhenghui Wang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Ruoyan Dai
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Zhiwei Zhang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Mengqiu Wang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
| | - Qianjin Guo
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
|
231
|
Zhai Y, Chen L, Deng M. scBOL: a universal cell type identification framework for single-cell and spatial transcriptomics data. Brief Bioinform 2024; 25:bbae188. [PMID: 38678389 PMCID: PMC11056022 DOI: 10.1093/bib/bbae188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 03/11/2024] [Accepted: 04/14/2024] [Indexed: 04/30/2024] Open
Abstract
MOTIVATION Over the past decade, single-cell transcriptomic technologies have experienced remarkable advancements, enabling the simultaneous profiling of gene expressions across thousands of individual cells. Cell type identification plays an essential role in exploring tissue heterogeneity and characterizing cell state differences. With more and more well-annotated reference data becoming available, massive automatic identification methods have sprung up to simplify the annotation process on unlabeled target data by transferring the cell type knowledge. However, in practice, the target data often include some novel cell types that are not in the reference data. Most existing works usually classify these private cells as one generic 'unassigned' group and learn the features of known and novel cell types in a coupled way. They are susceptible to the potential batch effects and fail to explore the fine-grained semantic knowledge of novel cell types, thus hurting the model's discrimination ability. Additionally, emerging spatial transcriptomic technologies, such as in situ hybridization, sequencing and multiplexed imaging, present a novel challenge to current cell type identification strategies that predominantly neglect spatial organization. Consequently, it is imperative to develop a versatile method that can proficiently annotate single-cell transcriptomics data, encompassing both spatial and non-spatial dimensions. RESULTS To address these issues, we propose a new, challenging yet realistic task called universal cell type identification for single-cell and spatial transcriptomics data. In this task, we aim to give semantic labels to target cells from known cell types and cluster labels to those from novel ones. To tackle this problem, instead of designing a suboptimal two-stage approach, we propose an end-to-end algorithm called scBOL from the perspective of Bipartite prototype alignment. 
Firstly, we identify the mutual nearest clusters in reference and target data as their potential common cell types. On this basis, we mine the cycle-consistent semantic anchor cells to build the intrinsic structure association between the two datasets. Secondly, we design a neighbor-aware prototypical learning paradigm to strengthen the inter-cluster separability and intra-cluster compactness within each dataset, thereby inspiring the discriminative feature representations. Thirdly, driven by the semantic-aware prototypical learning framework, we can align the known cell types and separate the private cell types from them among reference and target data. Such an algorithm can be seamlessly applied to various data types modeled by different foundation models that can generate the embedding features for cells. Specifically, for non-spatial single-cell transcriptomics data, we use the autoencoder neural network to learn latent low-dimensional cell representations, and for spatial single-cell transcriptomics data, we apply the graph convolution network to capture molecular and spatial similarities of cells jointly. Extensive results on our carefully designed evaluation benchmarks demonstrate the superiority of scBOL over various state-of-the-art cell type identification methods. To our knowledge, we are the pioneers in presenting this pragmatic annotation task, as well as in devising a comprehensive algorithmic framework aimed at resolving this challenge across varied types of single-cell data. Finally, scBOL is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/aimeeyaoyao/scBOL.
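The first step described above, pairing mutual nearest clusters between reference and target data, can be sketched roughly as follows. This is a minimal illustration, not scBOL's implementation: it assumes each cluster is summarized by a centroid in a shared embedding space and that similarity is plain Euclidean distance. The function name and toy centroids are hypothetical.

```python
from math import dist


def mutual_nearest_clusters(ref_centroids, tgt_centroids):
    """Return (reference, target) cluster pairs that are each other's nearest.

    Both arguments map a cluster id to a centroid (a sequence of floats) and
    must be non-empty. Assumption: Euclidean distance between centroids; the
    abstract does not specify the distance or the cluster summary.
    """
    def nearest(src, dst):
        # For every cluster in src, the id of the closest cluster in dst.
        return {s: min(dst, key=lambda d: dist(src[s], dst[d])) for s in src}

    ref_to_tgt = nearest(ref_centroids, tgt_centroids)
    tgt_to_ref = nearest(tgt_centroids, ref_centroids)
    # Keep a pair only when the nearest-neighbor relation holds in both directions.
    return sorted((r, t) for r, t in ref_to_tgt.items() if tgt_to_ref[t] == r)


# Toy example: two annotated reference types, three unlabeled target clusters.
reference = {"B cell": (0.0, 0.0), "T cell": (5.0, 5.0)}
target = {0: (0.2, 0.1), 1: (5.1, 4.8), 2: (20.0, 20.0)}
print(mutual_nearest_clusters(reference, target))
# [('B cell', 0), ('T cell', 1)]
```

In this toy case target cluster 2 has no mutual partner, which is the kind of cluster the abstract would treat as a candidate novel (private) cell type.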
Affiliation(s)
- Yuyao Zhai
- School of Mathematical Sciences, Peking University, Beijing, China
| | - Liang Chen
- Huawei Technologies Co., Ltd., Beijing, China
| | - Minghua Deng
- School of Mathematical Sciences, Peking University, Beijing, China
- Center for Statistical Science, Peking University, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing, China
|
232
|
Guo X, Ning J, Chen Y, Liu G, Zhao L, Fan Y, Sun S. Recent advances in differential expression analysis for single-cell RNA-seq and spatially resolved transcriptomic studies. Brief Funct Genomics 2024; 23:95-109. [PMID: 37022699 DOI: 10.1093/bfgp/elad011] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/08/2022] [Revised: 12/09/2022] [Accepted: 03/10/2023] [Indexed: 04/07/2023] Open
Abstract
Differential expression (DE) analysis is a necessary step in the analysis of single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data. Unlike traditional bulk RNA-seq, DE analysis for scRNA-seq or SRT data has unique characteristics that may contribute to the difficulty of detecting DE genes. However, the plethora of DE tools that work with various assumptions makes it difficult to choose an appropriate one. Furthermore, a comprehensive review on detecting DE genes for scRNA-seq data or SRT data from multi-condition, multi-sample experimental designs is lacking. To bridge such a gap, here, we first focus on the challenges of DE detection, then highlight potential opportunities that facilitate further progress in scRNA-seq or SRT analysis, and finally provide insights and guidance in selecting appropriate DE tools or developing new computational DE methods.
Affiliation(s)
- Xiya Guo
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
| | - Jin Ning
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
| | - Yuanze Chen
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
| | - Guoliang Liu
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
| | - Liyan Zhao
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
| | - Yue Fan
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
| | - Shiquan Sun
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
|
233
|
Duan B, Chen S, Cheng X, Liu Q. Multi-slice spatial transcriptome domain analysis with SpaDo. Genome Biol 2024; 25:73. [PMID: 38504325 PMCID: PMC10949687 DOI: 10.1186/s13059-024-03213-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 03/08/2024] [Indexed: 03/21/2024] Open
Abstract
With the rapid advancements in spatial transcriptome sequencing, multiple tissue slices are now available, enabling the integration and interpretation of spatial cellular landscapes. Herein, we introduce SpaDo, a tool for multi-slice spatial domain analysis, including modules for multi-slice spatial domain detection, reference-based annotation, and multiple slice clustering at both single-cell and spot resolutions. We demonstrate SpaDo's effectiveness with over 40 multi-slice spatial transcriptome datasets from 7 sequencing platforms. Our findings highlight SpaDo's potential to reveal novel biological insights in multi-slice spatial transcriptomes.
Affiliation(s)
- Bin Duan
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
| | - Shaoqi Chen
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
| | - Xiaojie Cheng
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
| | - Qi Liu
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
|
234
|
Ding J, Liu R, Wen H, Tang W, Li Z, Venegas J, Su R, Molho D, Jin W, Wang Y, Lu Q, Li L, Zuo W, Chang Y, Xie Y, Tang J. DANCE: a deep learning library and benchmark platform for single-cell analysis. Genome Biol 2024; 25:72. [PMID: 38504331 PMCID: PMC10949782 DOI: 10.1186/s13059-024-03211-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/07/2023] [Accepted: 03/05/2024] [Indexed: 03/21/2024] Open
Abstract
DANCE is the first standard, generic, and extensible benchmark platform for accessing and evaluating computational methods across the spectrum of benchmark datasets for numerous single-cell analysis tasks. Currently, DANCE supports 3 modules and 8 popular tasks with 32 state-of-the-art methods on 21 benchmark datasets. Users can reproduce the results of supported algorithms across major benchmark datasets with minimal effort, for example by running a single command line. In addition, DANCE provides an ecosystem of deep learning architectures and tools that help researchers develop their own models. DANCE is an open-source Python package that welcomes all kinds of contributions.
Affiliation(s)
- Jiayuan Ding
- Department of Computer Science and Engineering, Michigan State University, East Lansing, USA.
| | - Renming Liu
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, USA
| | - Hongzhi Wen
- Department of Computer Science and Engineering, Michigan State University, East Lansing, USA
| | - Wenzhuo Tang
- Department of Statistics and Probability, Michigan State University, East Lansing, USA
| | - Zhaoheng Li
- Department of Biostatistics, University of Washington, Seattle, USA
| | - Julian Venegas
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, USA
| | - Runze Su
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, USA
- Department of Statistics and Probability, Michigan State University, East Lansing, USA
| | - Dylan Molho
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, USA
| | - Wei Jin
- Department of Computer Science and Engineering, Michigan State University, East Lansing, USA
| | - Yixin Wang
- Department of Bioengineering, Stanford University, Palo Alto, USA
| | - Qiaolin Lu
- School of Artificial Intelligence, Jilin University, Jilin, China
| | - Lingxiao Li
- Department of Computer Science, Boston University, Boston, USA
| | - Wangyang Zuo
- Department of Computer Science, Zhejiang University of Technology, Zhejiang, China
| | - Yi Chang
- School of Artificial Intelligence, Jilin University, Jilin, China
| | - Yuying Xie
- Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, USA.
- Department of Statistics and Probability, Michigan State University, East Lansing, USA.
| | - Jiliang Tang
- Department of Computer Science and Engineering, Michigan State University, East Lansing, USA.
|
235
|
Jing Z, Zhu Q, Li L, Xie Y, Wu X, Fang Q, Yang B, Dai B, Xu X, Pan H, Bai Y. Spaco: A comprehensive tool for coloring spatial data at single-cell resolution. Patterns (N Y) 2024; 5:100915. [PMID: 38487801 PMCID: PMC10935509 DOI: 10.1016/j.patter.2023.100915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 12/11/2023] [Accepted: 12/18/2023] [Indexed: 03/17/2024]
Abstract
Understanding tissue architecture and niche-specific microenvironments in spatially resolved transcriptomics (SRT) requires in situ annotation and labeling of cells. Effective spatial visualization of these data demands appropriate colorization of numerous cell types. However, current colorization frameworks often inadequately account for the spatial relationships between cell types. This results in perceptual ambiguity between neighboring cells of biologically distinct types, particularly in complex environments such as brain or tumor. To address this, we introduce Spaco, a potent tool for spatially aware colorization. Spaco utilizes the Degree of Interlacement metric to construct a weighted graph that evaluates the spatial relationships among different cell types, refining color assignments. Furthermore, Spaco incorporates an adaptive palette selection approach to amplify chromatic distinctions. When benchmarked on four diverse datasets, Spaco outperforms existing solutions, capturing complex spatial relationships and boosting visual clarity. Spaco ensures broad accessibility by accommodating color vision deficiency and offering openly accessible code in both Python and R.
Affiliation(s)
- Zehua Jing
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310012, China
| | | | - Linxuan Li
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Shenzhen 518083, China
| | - Yue Xie
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Shenzhen 518083, China
| | - Xinchao Wu
- BGI Research, Hangzhou 310012, China
- School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
| | - Qi Fang
- BGI Research, Shenzhen 518083, China
| | - Bolin Yang
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- BGI Research, Hangzhou 310012, China
| | - Baojun Dai
- BGI Research, Hangzhou 310012, China
- School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
| | - Xun Xu
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Guangdong Provincial Key Laboratory of Genome Read and Write, BGI Research, Shenzhen 518083, China
- BGI Research, Shenzhen 518083, China
| | - Hailin Pan
- BGI Research, Hangzhou 310012, China
- BGI Research, Shenzhen 518083, China
| | - Yinqi Bai
- BGI Research, Hangzhou 310012, China
- BGI Research, Shenzhen 518083, China
|
236
|
Li R, Chen X, Yang X. Navigating the landscapes of spatial transcriptomics: How computational methods guide the way. Wiley Interdiscip Rev RNA 2024; 15:e1839. [PMID: 38527900 DOI: 10.1002/wrna.1839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 02/24/2024] [Accepted: 03/04/2024] [Indexed: 03/27/2024]
Abstract
Spatially resolved transcriptomics has been dramatically transforming biological and medical research in various fields. It enables transcriptome profiling at single-cell, multi-cellular, or sub-cellular resolution, while retaining the information of geometric localizations of cells in complex tissues. The coupling of cell spatial information and its molecular characteristics generates a novel multi-modal high-throughput data source, which poses new challenges for the development of analytical methods for data-mining. Spatial transcriptomic data are often highly complex, noisy, and biased, presenting a series of difficulties, many unresolved, for data analysis and generation of biological insights. In addition, to keep pace with the ever-evolving spatial transcriptomic experimental technologies, the existing analytical theories and tools need to be updated and reformed accordingly. In this review, we provide an overview and discussion of the current computational approaches for mining of spatial transcriptomics data. Future directions and perspectives of methodology design are proposed to stimulate further discussions and advances in new analytical models and algorithms. This article is categorized under: RNA Methods > RNA Analyses in Cells; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Export and Localization > RNA Localization.
Affiliation(s)
- Runze Li
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Xu Chen
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
| | - Xuerui Yang
- MOE Key Laboratory of Bioinformatics, Center for Synthetic & Systems Biology, School of Life Sciences, Tsinghua University, Beijing, China
|
237
|
Danishuddin, Khan S, Kim JJ. Spatial transcriptomics data and analytical methods: An updated perspective. Drug Discov Today 2024; 29:103889. [PMID: 38244672 DOI: 10.1016/j.drudis.2024.103889] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/31/2023] [Revised: 01/01/2024] [Accepted: 01/15/2024] [Indexed: 01/22/2024]
Abstract
Spatial transcriptomics (ST) is a newly emerging field that integrates high-resolution imaging and transcriptomic data to enable the high-throughput analysis of the spatial localization of transcripts in diverse biological systems. The rapid progress in this field necessitates the development of innovative computational methods to effectively tackle the distinct challenges posed by the analysis of ST data. These platforms, integrating AI techniques, offer a promising avenue for understanding disease mechanisms and expediting drug discovery. Despite significant advances in the development of ST data analysis techniques, there is an ongoing need to enhance these models for increased biological relevance. In this review, we briefly discuss the ST-related databases and current deep-learning-based models for spatial transcriptome data analyses and highlight their roles and future perspectives in biomedical applications.
Affiliation(s)
- Danishuddin
- Department of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 38541, Korea.
| | - Shawez Khan
- National Center for Cancer Immune Therapy (CCIT-DK), Department of Oncology, Copenhagen University Hospital, Herlev, Denmark
| | - Jong Joo Kim
- Department of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 38541, Korea.
|
238
|
Ruitenberg MJ, Nguyen QH. Cellular neighborhood analysis in spatial omics reveals new tissue domains and cell subtypes. Nat Genet 2024; 56:362-364. [PMID: 38413724 DOI: 10.1038/s41588-023-01646-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Affiliation(s)
- Marc J Ruitenberg
- School of Biomedical Science, Faculty of Medicine, The University of Queensland, Brisbane, Australia
| | - Quan H Nguyen
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia.
- QIMR Berghofer Medical Research Institute, Brisbane, Australia.
|
239
|
Lin S, Zhao F, Wu Z, Yao J, Zhao Y, Yuan Z. Streamlining spatial omics data analysis with Pysodb. Nat Protoc 2024; 19:831-895. [PMID: 38135744 DOI: 10.1038/s41596-023-00925-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 04/04/2023] [Accepted: 10/02/2023] [Indexed: 12/24/2023]
Abstract
Advances in spatial omics technologies have improved the understanding of cellular organization in tissues, leading to the generation of complex and heterogeneous data and prompting the development of specialized tools for managing, loading and visualizing spatial omics data. The Spatial Omics Database (SODB) was established to offer a unified format for data storage and interactive visualization modules. Here we detail the use of Pysodb, a Python-based tool designed to enable the efficient exploration and loading of spatial datasets from SODB within a Python environment. We present seven case studies using Pysodb, detailing the interaction with various computational methods, ensuring reproducibility of experimental data and facilitating the integration of new data and alternative applications in SODB. The approach offers a reference for method developers by outlining label and metadata availability in representative spatial data that can be loaded by Pysodb. The tool is supplemented by a website (https://protocols-pysodb.readthedocs.io/) with detailed information for benchmarking analysis, and allows method developers to focus on computational models by facilitating data processing. This protocol is designed for researchers with limited experience in computational biology. Depending on the dataset complexity, the protocol typically requires ~12 h to complete.
Affiliation(s)
- Senlin Lin
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Fangyuan Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | | | | | - Yi Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
- University of Chinese Academy of Sciences, Beijing, China.
| | - Zhiyuan Yuan
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China.
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China.
|
240
|
Singhal V, Chou N, Lee J, Yue Y, Liu J, Chock WK, Lin L, Chang YC, Teo EML, Aow J, Lee HK, Chen KH, Prabhakar S. BANKSY unifies cell typing and tissue domain segmentation for scalable spatial omics data analysis. Nat Genet 2024; 56:431-441. [PMID: 38413725 PMCID: PMC10937399 DOI: 10.1038/s41588-024-01664-3] [Citation(s) in RCA: 49] [Impact Index Per Article: 49.0] [Received: 04/03/2023] [Accepted: 01/16/2024] [Indexed: 02/29/2024]
Abstract
Spatial omics data are clustered to define both cell types and tissue domains. We present Building Aggregates with a Neighborhood Kernel and Spatial Yardstick (BANKSY), an algorithm that unifies these two spatial clustering problems by embedding cells in a product space of their own and the local neighborhood transcriptome, representing cell state and microenvironment, respectively. BANKSY's spatial feature augmentation strategy improved performance on both tasks when tested on diverse RNA (imaging, sequencing) and protein (imaging) datasets. BANKSY revealed unexpected niche-dependent cell states in the mouse brain and outperformed competing methods on domain segmentation and cell typing benchmarks. BANKSY can also be used for quality control of spatial transcriptomics data and for spatially aware batch effect correction. Importantly, it is substantially faster and more scalable than existing methods, enabling the processing of millions of cell datasets. In summary, BANKSY provides an accurate, biologically motivated, scalable and versatile framework for analyzing spatially resolved omics data.
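The product-space idea in this abstract can be illustrated with a minimal sketch. It is not the BANKSY implementation: the function name, the k-nearest-neighbor definition of the neighborhood, the plain mean used as the neighborhood summary, and the mixing weight `lam` are all assumptions made for illustration. Each cell's own expression vector is concatenated with the mean expression of its spatial neighbors, so that a downstream clustering step sees both cell state and microenvironment.

```python
import math


def neighbor_augmented_embedding(expr, coords, k=4, lam=0.2):
    """Concatenate each cell's expression with its neighbors' mean expression.

    expr   : list of per-cell expression vectors (one list of floats per cell)
    coords : list of per-cell spatial positions, e.g. (x, y) tuples
    k      : number of nearest spatial neighbors (hypothetical default)
    lam    : weight of the neighborhood block, between 0 and 1 (hypothetical)
    Requires at least two cells.
    """
    n = len(expr)
    augmented = []
    for i in range(n):
        # Rank all other cells by Euclidean distance to cell i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )
        nbrs = others[:k]
        n_genes = len(expr[i])
        # Mean expression of the neighborhood, gene by gene.
        nbr_mean = [
            sum(expr[j][g] for j in nbrs) / len(nbrs) for g in range(n_genes)
        ]
        # Weighted concatenation of the cell's own block and its neighborhood block.
        own = [math.sqrt(1.0 - lam) * v for v in expr[i]]
        env = [math.sqrt(lam) * v for v in nbr_mean]
        augmented.append(own + env)
    return augmented
```

Under these assumptions, a small `lam` keeps the embedding close to ordinary cell typing, while a large `lam` emphasizes the neighborhood and pushes the result toward domain-like groupings.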
Affiliation(s)
- Vipul Singhal
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Nigel Chou
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Joseph Lee
- Faculty of Science, National University of Singapore, Singapore, Republic of Singapore
| | - Yifei Yue
- Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore, Republic of Singapore
| | - Jinyue Liu
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Wan Kee Chock
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Li Lin
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | | | | | - Jonathan Aow
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
| | - Hwee Kuan Lee
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- School of Computing, National University of Singapore, Singapore, Republic of Singapore
- Singapore Eye Research Institute, Singapore, Republic of Singapore
- International Research Laboratory on Artificial Intelligence, Singapore, Republic of Singapore
- School of Biological Sciences, Nanyang Technological University, Singapore, Republic of Singapore
- Singapore Institute for Clinical Sciences, Agency for Science, Technology and Research, Singapore, Republic of Singapore
| | - Kok Hao Chen
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore.
| | - Shyam Prabhakar
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore.
- Population and Global Health, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Republic of Singapore.
- Cancer Science Institute of Singapore, National University of Singapore, Singapore, Republic of Singapore.
|
241
|
Liao K, Xiang Y, Huang F, Huang M, Xu W, Lin Y, Liao P, Wang Z, Yang L, Tian X, Chen D, Wang Z, Liu S, Zhuang Z. Spatial and single-nucleus transcriptomics decoding the molecular landscape and cellular organization of avian optic tectum. iScience 2024; 27:109009. [PMID: 38333704 PMCID: PMC10850779 DOI: 10.1016/j.isci.2024.109009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2023] [Revised: 12/19/2023] [Accepted: 01/22/2024] [Indexed: 02/10/2024] Open
Abstract
The avian optic tectum (OT) has been studied for its diverse functions, yet a comprehensive molecular landscape at the cellular level has been lacking. In this study, we applied spatial transcriptome sequencing and single-nucleus RNA sequencing (snRNA-seq) to explore the cellular organization and molecular characteristics of the avian OT from two species: Columba livia and Taeniopygia guttata. We identified precise layer structures and provided comprehensive layer-specific signatures of avian OT. Furthermore, we elucidated diverse functions in different layers, with the stratum griseum periventriculare (SGP) potentially playing a key role in advanced functions of OT, like fear response and associative learning. We characterized detailed neuronal subtypes and identified a population of FOXG1+ excitatory neurons, resembling those found in the mouse neocortex, potentially involved in neocortex-related functions and expansion of avian OT. These findings could contribute to our understanding of the architecture of OT, shedding light on visual perception and multifunctional association.
Affiliation(s)
- Kuo Liao
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- BGI Research, Hangzhou 310030, China
| | - Ya Xiang
- BGI Research, Hangzhou 310030, China
- College of Life Sciences, Northwest University, Xi’an 710069, China
| | - Fubaoqian Huang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- BGI Research, Hangzhou 310030, China
| | - Maolin Huang
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Wenbo Xu
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Youning Lin
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
| | - Pingfang Liao
- BGI Research, Hangzhou 310030, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zishi Wang
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Lin Yang
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Xinmao Tian
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Duoyuan Chen
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
| | - Zhenlong Wang
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Shiping Liu
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
| | - Zhenkun Zhuang
- BGI Research, Hangzhou 310030, China
- BGI Research, Shenzhen 518083, China
|
242
|
Yao J, Yu J, Caffo B, Page SC, Martinowich K, Hicks SC. Spatial domain detection using contrastive self-supervised learning for spatial multi-omics technologies. bioRxiv 2024:2024.02.02.578662. [PMID: 38352580 PMCID: PMC10862910 DOI: 10.1101/2024.02.02.578662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
Recent advances in spatially-resolved single-omics and multi-omics technologies have led to the emergence of computational tools to detect or predict spatial domains. Additionally, histological images and immunofluorescence (IF) staining of proteins and cell types provide multiple perspectives and a more complete understanding of tissue architecture. Here, we introduce Proust, a scalable tool to predict discrete domains using spatial multi-omics data by combining the low-dimensional representation of biological profiles based on graph-based contrastive self-supervised learning. Our scalable method integrates multiple data modalities, such as RNA, protein, and H&E images, and predicts spatial domains within tissue samples. Through the integration of multiple modalities, Proust consistently demonstrates enhanced accuracy in detecting spatial domains, as evidenced across various benchmark datasets and technological platforms.
Affiliation(s)
- Jianing Yao
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, MD, USA
| | - Jinglun Yu
- Department of Electrical and Computer Engineering, Johns Hopkins University, MD, USA
| | - Brian Caffo
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, MD, USA
| | - Stephanie C. Page
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
| | - Keri Martinowich
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, MD, USA
- The Solomon H. Snyder Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, MD, USA
| | - Stephanie C. Hicks
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
- Malone Center for Engineering in Healthcare, Johns Hopkins University, MD, USA
|
243
|
Hu Y, Rong J, Xu Y, Xie R, Peng J, Gao L, Tan K. Unsupervised and supervised discovery of tissue cellular neighborhoods from cell phenotypes. Nat Methods 2024; 21:267-278. [PMID: 38191930 PMCID: PMC10864185 DOI: 10.1038/s41592-023-02124-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 03/07/2023] [Accepted: 11/08/2023] [Indexed: 01/10/2024]
Abstract
It is poorly understood how different cells in a tissue organize themselves to support tissue functions. We describe the CytoCommunity algorithm for the identification of tissue cellular neighborhoods (TCNs) based on cell phenotypes and their spatial distributions. CytoCommunity learns a mapping directly from the cell phenotype space to the TCN space using a graph neural network model without intermediate clustering of cell embeddings. By leveraging graph pooling, CytoCommunity enables de novo identification of condition-specific and predictive TCNs under the supervision of sample labels. Using several types of spatial omics data, we demonstrate that CytoCommunity can identify TCNs of variable sizes with substantial improvement over existing methods. By analyzing risk-stratified colorectal and breast cancer data, CytoCommunity revealed new granulocyte-enriched and cancer-associated fibroblast-enriched TCNs specific to high-risk tumors and altered interactions between neoplastic and immune or stromal cells within and between TCNs. CytoCommunity can perform unsupervised and supervised analyses of spatial omics maps and enable the discovery of condition-specific cell-cell communication patterns across spatial scales.
Affiliation(s)
- Yuxuan Hu
- School of Computer Science and Technology, Xidian University, Xi'an, China.
| | - Jiazhen Rong
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Yafei Xu
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Runzhi Xie
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Jacqueline Peng
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Kai Tan
- Division of Oncology and Center for Childhood Cancer Research, Children's Hospital of Philadelphia, Philadelphia, PA, USA.
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
|
244
|
Cai P, Robinson MD, Tiberi S. DESpace: spatially variable gene detection via differential expression testing of spatial clusters. Bioinformatics 2024; 40:btae027. [PMID: 38243704 PMCID: PMC10868334 DOI: 10.1093/bioinformatics/btae027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 12/23/2023] [Accepted: 01/15/2024] [Indexed: 01/21/2024] Open
Abstract
MOTIVATION: Spatially resolved transcriptomics (SRT) enables scientists to investigate spatial context of mRNA abundance, including identifying spatially variable genes (SVGs), i.e. genes whose expression varies across the tissue. Although several methods have been proposed for this task, native SVG tools cannot jointly model biological replicates, or identify the key areas of the tissue affected by spatial variability. RESULTS: Here, we introduce DESpace, a framework, based on an original application of existing methods, to discover SVGs. In particular, our approach inputs all types of SRT data, summarizes spatial information via spatial clusters, and identifies spatially variable genes by performing differential gene expression testing between clusters. Furthermore, our framework can identify (and test) the main cluster of the tissue affected by spatial variability; this allows scientists to investigate spatial expression changes in specific areas of interest. Additionally, DESpace enables joint modeling of multiple samples (i.e. biological replicates); compared to inference based on individual samples, this approach increases statistical power, and targets SVGs with consistent spatial patterns across replicates. Overall, in our benchmarks, DESpace displays good true positive rates, controls for false positive and false discovery rates, and is computationally efficient. AVAILABILITY AND IMPLEMENTATION: DESpace is freely distributed as a Bioconductor R package at https://bioconductor.org/packages/DESpace.
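The cluster-based testing idea in this abstract can be sketched in a few lines. This is a simplified stand-in, not DESpace itself, which is an R package with its own differential expression machinery. Here a plain one-way ANOVA F statistic is swapped in to score how strongly a single gene's expression differs between spatial clusters; the function name and the toy inputs are hypothetical.

```python
import math
from collections import defaultdict


def cluster_f_statistic(values, clusters):
    """One-way ANOVA F statistic for one gene across spatial clusters.

    values   : expression of the gene in each spot or cell
    clusters : spatial cluster label of each spot or cell (same length)
    Requires at least two clusters and more observations than clusters.
    """
    groups = defaultdict(list)
    for value, label in zip(values, clusters):
        groups[label].append(value)

    n = len(values)
    k = len(groups)
    grand_mean = sum(values) / n

    # Variation of the cluster means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Variation of observations around their own cluster mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    if ss_within == 0:
        # Perfect separation (or a constant gene): avoid dividing by zero.
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A gene whose expression tracks the cluster labels gets a large statistic, and a gene with the same distribution in every cluster gets a value near zero. Turning the statistic into a p-value, and handling count data or multiple samples, would need a proper statistical model, which this sketch deliberately leaves out.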
Affiliation(s)
- Peiying Cai
  - Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich 8057, Switzerland
- Mark D Robinson
  - Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich 8057, Switzerland
- Simone Tiberi
  - Department of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Zurich 8057, Switzerland
  - Department of Statistical Sciences, University of Bologna, Bologna 40126, Italy
245
Yan S, Guo Y, Lin L, Zhang W. Breaks for Precision Medicine in Cancer: Development and Prospects of Spatiotemporal Transcriptomics. Cancer Biother Radiopharm 2024; 39:35-45. [PMID: 38181185 DOI: 10.1089/cbr.2023.0116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2024]
Abstract
With the development of the social economy and the deepening understanding of cancer, cancer has become a significant cause of death, threatening human health. Although researchers have made rapid progress in cancer treatment strategies in recent years, the overall survival of cancer patients is still not optimistic. Therefore, it is essential to reveal the spatial pattern of gene expression, spatial heterogeneity of cell populations, microenvironment interactions, and other aspects of cancer. Spatiotemporal transcriptomics can help analyze the mechanism of cancer occurrence and development, greatly help precise cancer treatment, and improve clinical prognosis. Here, we review the integration strategies of single-cell RNA sequencing and spatial transcriptomics data, summarize the recent advances in spatiotemporal transcriptomics in cancer studies, and discuss the combined application of spatial multiomics, which provides new directions and strategies for the precise treatment and clinical prognosis of cancer.
Affiliation(s)
- Shiqi Yan
  - Department of Laboratory Medicine, The Third Xiangya Hospital, Central South University, Changsha, Hunan, People's Republic of China
  - Department of Laboratory Medicine, Xiangya School of Medicine, Central South University, Changsha, Hunan, People's Republic of China
- Yilin Guo
  - Department of Laboratory Medicine, The Third Xiangya Hospital, Central South University, Changsha, Hunan, People's Republic of China
  - Department of Laboratory Medicine, Xiangya School of Medicine, Central South University, Changsha, Hunan, People's Republic of China
- Lizhong Lin
  - Department of Clinical Laboratory, The First People's Hospital of Changde City, Changde, Hunan, People's Republic of China
- Wenling Zhang
  - Department of Laboratory Medicine, The Third Xiangya Hospital, Central South University, Changsha, Hunan, People's Republic of China
  - Department of Laboratory Medicine, Xiangya School of Medicine, Central South University, Changsha, Hunan, People's Republic of China
246
Zhang C, Gao J, Chen HY, Kong L, Cao G, Guo X, Liu W, Ren B, Wei DQ. STGIC: A graph and image convolution-based method for spatial transcriptomic clustering. PLoS Comput Biol 2024; 20:e1011935. [PMID: 38416785 PMCID: PMC10927115 DOI: 10.1371/journal.pcbi.1011935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 03/11/2024] [Accepted: 02/20/2024] [Indexed: 03/01/2024]
Abstract
Spatial transcriptomic (ST) clustering employs spatial and transcription information to group spots spatially coherent and transcriptionally similar together into the same spatial domain. Graph convolution network (GCN) and graph attention network (GAT), fed with spatial coordinates derived adjacency and transcription profile derived feature matrix are often used to solve the problem. Our proposed method STGIC (spatial transcriptomic clustering with graph and image convolution) is designed for techniques with regular lattices on chips. It utilizes an adaptive graph convolution (AGC) to get high quality pseudo-labels and then resorts to dilated convolution framework (DCF) for virtual image converted from gene expression information and spatial coordinates of spots. The dilation rates and kernel sizes are set appropriately and updating of weight values in the kernels is made to be subject to the spatial distance from the position of corresponding elements to kernel centers so that feature extraction of each spot is better guided by spatial distance to neighbor spots. Self-supervision realized by Kullback-Leibler (KL) divergence, spatial continuity loss and cross entropy calculated among spots with high confidence pseudo-labels make up the training objective of DCF. STGIC attains state-of-the-art (SOTA) clustering performance on the benchmark dataset of 10x Visium human dorsolateral prefrontal cortex (DLPFC). Besides, it's capable of depicting fine structures of other tissues from other species as well as guiding the identification of marker genes. Also, STGIC is expandable to Stereo-seq data with high spatial resolution.
Affiliation(s)
- Chen Zhang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Junhui Gao
  - Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hong-Yu Chen
  - College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, China
- Lingxin Kong
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Guangshuo Cao
  - State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang
- Xiangyu Guo
  - Smart-Health Initiative, King Abdullah University of Science and Technology, Jeddah, Saudi Arabia
- Wei Liu
  - Marine Science and Technology College, Zhejiang Ocean University, Zhoushan, China
- Bin Ren
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Dong-Qing Wei
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
247
Yuan M, Wan H, Wang Z, Guo Q, Deng M. SPANN: annotating single-cell resolution spatial transcriptome data with scRNA-seq data. Brief Bioinform 2024; 25:bbad533. [PMID: 38279647 PMCID: PMC10818138 DOI: 10.1093/bib/bbad533] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 11/13/2023] [Accepted: 12/19/2023] [Indexed: 01/28/2024]
Abstract
MOTIVATION: The rapid development of spatial transcriptome technologies has enabled researchers to acquire single-cell-level spatial data at an affordable price. However, computational analysis tools, such as annotation tools, tailored for these data are still lacking. Recently, many computational frameworks have emerged to integrate single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics datasets. While some frameworks can utilize well-annotated scRNA-seq data to annotate spatial expression patterns, they overlook critical aspects. First, existing tools do not explicitly consider cell type mapping when aligning the two modalities. Second, current frameworks lack the capability to detect novel cells, which remains a key interest for biologists.
RESULTS: To address these problems, we propose an annotation method for spatial transcriptome data called SPANN. The main tasks of SPANN are to transfer cell-type labels from well-annotated scRNA-seq data to newly generated single-cell resolution spatial transcriptome data and discover novel cells from spatial data. The major innovations of SPANN come from two aspects: (i) SPANN automatically detects novel cells from unseen cell types while maintaining high annotation accuracy over known cell types; (ii) SPANN finds a mapping between spatial transcriptome samples and RNA data prototypes and thus conducts cell-type-level alignment. Comprehensive experiments using datasets from various spatial platforms demonstrate SPANN's capabilities in annotating known cell types and discovering novel cell states within complex tissue contexts.
AVAILABILITY: The source code of SPANN can be accessed at https://github.com/ddb-qiwang/SPANN-torch.
CONTACT: dengmh@math.pku.edu.cn.
Affiliation(s)
- Musu Yuan
  - Center for Quantitative Biology, Peking University, Yiheyuan Road, 100871, Beijing, China
- Hui Wan
  - School of Mathematical Sciences, Peking University, Yiheyuan Road, 100871, Beijing, China
- Zihao Wang
  - Biomedical Interdisciplinary Research Center, Peking University, Yiheyuan Road, 100871, Beijing, China
- Qirui Guo
  - Center for Quantitative Biology, Peking University, Yiheyuan Road, 100871, Beijing, China
- Minghua Deng
  - Center for Quantitative Biology, Peking University, Yiheyuan Road, 100871, Beijing, China
  - School of Mathematical Sciences, Peking University, Yiheyuan Road, 100871, Beijing, China
  - Center for Statistical Science, Peking University, Yiheyuan Road, 100871, Beijing, China
  - Biomedical Interdisciplinary Research Center, Peking University, Yiheyuan Road, 100871, Beijing, China
248
Zahedi R, Ghamsari R, Argha A, Macphillamy C, Beheshti A, Alizadehsani R, Lovell NH, Lotfollahi M, Alinejad-Rokny H. Deep learning in spatially resolved transcriptomics: a comprehensive technical view. Brief Bioinform 2024; 25:bbae082. [PMID: 38483255 PMCID: PMC10939360 DOI: 10.1093/bib/bbae082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 12/22/2024] [Accepted: 02/13/2024] [Indexed: 03/17/2024]
Abstract
Spatially resolved transcriptomics (SRT) is a pioneering method for simultaneously studying morphological contexts and gene expression at single-cell precision. Data emerging from SRT are multifaceted, presenting researchers with intricate gene expression matrices, precise spatial details and comprehensive histology visuals. Such rich and intricate datasets, unfortunately, render many conventional methods like traditional machine learning and statistical models ineffective. The unique challenges posed by the specialized nature of SRT data have led the scientific community to explore more sophisticated analytical avenues. Recent trends indicate an increasing reliance on deep learning algorithms, especially in areas such as spatial clustering, identification of spatially variable genes and data alignment tasks. In this manuscript, we provide a rigorous critique of these advanced deep learning methodologies, probing into their merits, limitations and avenues for further refinement. Our in-depth analysis underscores that while the recent innovations in deep learning tailored for SRT have been promising, there remains a substantial potential for enhancement. A crucial area that demands attention is the development of models that can incorporate intricate biological nuances, such as phylogeny-aware processing or in-depth analysis of minuscule histology image segments. Furthermore, addressing challenges like the elimination of batch effects, perfecting data normalization techniques and countering the overdispersion and zero inflation patterns seen in gene expression is pivotal. To support the broader scientific community in their SRT endeavors, we have meticulously assembled a comprehensive directory of readily accessible SRT databases, hoping to serve as a foundation for future research initiatives.
Affiliation(s)
- Roxana Zahedi
  - UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Reza Ghamsari
  - UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
- Ahmadreza Argha
  - The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
  - Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
- Callum Macphillamy
  - School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, 5371, Australia
- Amin Beheshti
  - School of Computing, Macquarie University, Sydney, 2109, Australia
- Roohallah Alizadehsani
  - Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Waurn Ponds, Melbourne, VIC, 3216, Australia
- Nigel H Lovell
  - The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
  - Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
- Mohammad Lotfollahi
  - Computational Health Center, Helmholtz Munich, Germany
  - Wellcome Sanger Institute, Cambridge, UK
- Hamid Alinejad-Rokny
  - UNSW BioMedical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, 2052, NSW, Australia
  - Tyree Institute of Health Engineering (IHealthE), UNSW Sydney, 2052, NSW, Australia
249
Tao Y, Sun X, Wang F. BiGATAE: a bipartite graph attention auto-encoder enhancing spatial domain identification from single-slice to multi-slices. Brief Bioinform 2024; 25:bbae045. [PMID: 38385877 PMCID: PMC10883416 DOI: 10.1093/bib/bbae045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 01/10/2024] [Accepted: 01/23/2024] [Indexed: 02/23/2024]
Abstract
Recent advancements in spatial transcriptomics technology have revolutionized our ability to comprehensively characterize gene expression patterns within the tissue microenvironment, enabling us to grasp their functional significance in a spatial context. One key field of research in spatial transcriptomics is the identification of spatial domains, which refers to distinct regions within the tissue where specific gene expression patterns are observed. Diverse methodologies have been proposed, each with its unique characteristics. As the availability of spatial transcriptomics data continues to expand, there is a growing need for methods that can integrate information from multiple slices to discover spatial domains. To extend the applicability of existing single-slice analysis methods to multi-slice clustering, we introduce BiGATAE (Bipartite Graph Attention Auto Encoder) that leverages gene expression information from adjacent tissue slices to enhance spatial transcriptomics data. BiGATAE comprises two steps: aligning slices to generate an adjacency matrix for different spots in consecutive slices and constructing a bipartite graph. Subsequently, it utilizes a graph attention network to integrate information across different slices. Then it can seamlessly integrate with pre-existing techniques. To evaluate the performance of BiGATAE, we conducted benchmarking analyses on three different datasets. The experimental results demonstrate that for existing single-slice clustering methods, the integration of BiGATAE significantly enhances their performance. Moreover, single-slice clustering methods integrated with BiGATAE outperform methods specifically designed for multi-slice integration. These results underscore the proficiency of BiGATAE in facilitating information transfer across multiple slices and its capacity to broaden the applicability and sustainability of pre-existing methods.
Affiliation(s)
- Yuhao Tao
  - Shanghai Key Lab of Intelligent Information Processing, Handan Street, 200433 Shanghai, China
  - School of Computer Science and Technology, Fudan University, Handan Street, 200433 Shanghai, China
- Xiaoang Sun
  - Shanghai Key Lab of Intelligent Information Processing, Handan Street, 200433 Shanghai, China
  - School of Computer Science and Technology, Fudan University, Handan Street, 200433 Shanghai, China
- Fei Wang
  - Shanghai Key Lab of Intelligent Information Processing, Handan Street, 200433 Shanghai, China
  - School of Computer Science and Technology, Fudan University, Handan Street, 200433 Shanghai, China
250
Zhao C, Xu Z, Wang X, Tao S, MacDonald WA, He K, Poholek AC, Chen K, Huang H, Chen W. Innovative super-resolution in spatial transcriptomics: a transformer model exploiting histology images and spatial gene expression. Brief Bioinform 2024; 25:bbae052. [PMID: 38436557 PMCID: PMC10939304 DOI: 10.1093/bib/bbae052] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 11/21/2023] [Revised: 01/26/2024] [Accepted: 01/27/2024] [Indexed: 03/05/2024]
Abstract
Spatial transcriptomics technologies have shed light on the complexities of tissue structures by accurately mapping spatial microenvironments. Nonetheless, a myriad of methods, especially those utilized in platforms like Visium, often relinquish spatial details owing to intrinsic resolution limitations. In response, we introduce TransformerST, an innovative, unsupervised model anchored in the Transformer architecture, which operates independently of references, thereby ensuring cost-efficiency by circumventing the need for single-cell RNA sequencing. TransformerST not only elevates Visium data from a multicellular level to a single-cell granularity but also showcases adaptability across diverse spatial transcriptomics platforms. By employing a vision transformer-based encoder, it discerns latent image-gene expression co-representations and is further enhanced by spatial correlations, derived from an adaptive graph Transformer module. The sophisticated cross-scale graph network, utilized in super-resolution, significantly boosts the model's accuracy, unveiling complex structure-functional relationships within histology images. Empirical evaluations validate its adeptness in revealing tissue subtleties at the single-cell scale. Crucially, TransformerST adeptly navigates through image-gene co-representation, maximizing the synergistic utility of gene expression and histology images, thereby emerging as a pioneering tool in spatial transcriptomics. It not only enhances resolution to a single-cell level but also introduces a novel approach that optimally utilizes histology images alongside gene expression, providing a refined lens for investigating spatial transcriptomics.
Affiliation(s)
- Chongyue Zhao
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
- Zhongli Xu
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
  - School of Medicine, Tsinghua University, Beijing, 100084, Beijing, China
- Xinjun Wang
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, 10065, New York, USA
- Shiyue Tao
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, 15261, Pennsylvania, USA
- William A MacDonald
  - Health Sciences Sequencing Core at UPMC Children’s Hospital of Pittsburgh, Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
- Kun He
  - Division of Pediatric Rheumatology, Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
- Amanda C Poholek
  - Division of Pediatric Rheumatology, Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
  - Department of Immunology, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
  - Health Sciences Sequencing Core at UPMC Children’s Hospital of Pittsburgh, Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
- Kong Chen
  - Department of Medicine, University of Pittsburgh, Pittsburgh, 15213, Pennsylvania, USA
- Heng Huang
  - Department of Computer Science, University of Maryland, College Park, 20742, Maryland, USA
- Wei Chen
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, 15224, Pennsylvania, USA
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, 15261, Pennsylvania, USA
|