1
Zhang C, Wang L, Shi Q. Computational modeling for deciphering tissue microenvironment heterogeneity from spatially resolved transcriptomics. Comput Struct Biotechnol J 2024; 23:2109-2115. [PMID: 38800634 PMCID: PMC11126885 DOI: 10.1016/j.csbj.2024.05.028] [Received: 03/30/2024] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 05/29/2024] Open
Abstract
Spatial transcriptomics techniques measure gene expression while retaining spatial location information, enabling in situ studies of tissue architecture and the progression of pathological processes. These techniques generate vast amounts of omics data, which calls for computational methods that can reveal the underlying heterogeneity of the tissue microenvironment. The two main directions in spatial transcriptomics data analysis are spatial domain detection, which identifies spatial functional regions, and spatial deconvolution, which resolves the distribution of cell types in spatial transcriptomics data by integrating single-cell transcriptomics data. Many computational methods have been proposed in both directions. This article groups them into three types: machine learning-based methods, probabilistic model-based methods, and deep learning-based methods. It lists and discusses representative algorithms of each type along with their advantages and disadvantages, and describes the datasets and evaluation metrics used to assess them, to help researchers select methods suited to their research needs. Finally, drawing on the latest technological developments and the strengths and weaknesses of current algorithms, the article outlines future directions for computational method development.
Affiliation(s)
- Chuanchao Zhang
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, Hangzhou 310024; University of Chinese Academy of Sciences, China
- Lequn Wang
  - State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Qianqian Shi
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, Wuhan 430070, Hubei, China
2
Benjamin K, Bhandari A, Kepple JD, Qi R, Shang Z, Xing Y, An Y, Zhang N, Hou Y, Crockford TL, McCallion O, Issa F, Hester J, Tillmann U, Harrington HA, Bull KR. Multiscale topology classifies cells in subcellular spatial transcriptomics. Nature 2024:10.1038/s41586-024-07563-1. [PMID: 38898271 DOI: 10.1038/s41586-024-07563-1] [Received: 03/06/2023] [Accepted: 05/14/2024] [Indexed: 06/21/2024]
Abstract
Spatial transcriptomics measures in situ gene expression at millions of locations within a tissue, hitherto with some trade-off between transcriptome depth, spatial resolution and sample size. Although integration of image-based segmentation has enabled impactful work in this context, it is limited by imaging quality and tissue heterogeneity. By contrast, recent array-based technologies offer the ability to measure the entire transcriptome at subcellular resolution across large samples. Presently, there exist no approaches for cell type identification that directly leverage this information to annotate individual cells. Here we propose a multiscale approach to automatically classify cell types at this subcellular level, using both transcriptomic information and spatial context. We showcase this on both targeted and whole-transcriptome spatial platforms, improving cell classification and morphology for human kidney tissue and pinpointing individual sparsely distributed renal mouse immune cells without reliance on image data. By integrating these predictions into a topological pipeline based on multiparameter persistent homology, we identify cell spatial relationships characteristic of a mouse model of lupus nephritis, which we validate experimentally by immunofluorescence. The proposed framework readily generalizes to new platforms, providing a comprehensive pipeline bridging different levels of biological organization from genes through to tissues.
Affiliation(s)
- Aneesha Bhandari
  - Centre for Human Genetics, University of Oxford, Oxford, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Jessica D Kepple
  - Centre for Human Genetics, University of Oxford, Oxford, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Rui Qi
  - Centre for Human Genetics, University of Oxford, Oxford, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Chinese Academy of Medical Sciences Oxford Institute, University of Oxford, Oxford, UK
- Zhouchun Shang
  - BGI Research, Riga, Latvia
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Yanan Xing
  - BGI Research, Riga, Latvia
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Tanya L Crockford
  - Centre for Human Genetics, University of Oxford, Oxford, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Oliver McCallion
  - Translational Research Immunology Group, Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
- Fadi Issa
  - Translational Research Immunology Group, Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
- Joanna Hester
  - Translational Research Immunology Group, Nuffield Department of Surgical Sciences, University of Oxford, Oxford, UK
- Ulrike Tillmann
  - Mathematical Institute, University of Oxford, Oxford, UK
  - Isaac Newton Institute for Mathematical Sciences, University of Cambridge, Cambridge, UK
- Heather A Harrington
  - Mathematical Institute, University of Oxford, Oxford, UK
  - Centre for Human Genetics, University of Oxford, Oxford, UK
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Centre for Systems Biology, Dresden, Dresden, Germany
  - Faculty of Mathematics, Technische Universität Dresden, Dresden, Germany
- Katherine R Bull
  - Centre for Human Genetics, University of Oxford, Oxford, UK
  - Nuffield Department of Medicine, University of Oxford, Oxford, UK
  - Chinese Academy of Medical Sciences Oxford Institute, University of Oxford, Oxford, UK
3
Swain AK, Pandit V, Sharma J, Yadav P. SpatialPrompt: spatially aware scalable and accurate tool for spot deconvolution and domain identification in spatial transcriptomics. Commun Biol 2024; 7:639. [PMID: 38796505 PMCID: PMC11127982 DOI: 10.1038/s42003-024-06349-5] [Received: 02/05/2024] [Accepted: 05/17/2024] [Indexed: 05/28/2024] Open
Abstract
Efficient mapping of cell types in situ remains a major challenge in spatial transcriptomics. Most spot deconvolution tools ignore spatial coordinate information and run extremely slowly on large datasets. Here, we introduce SpatialPrompt, a spatially aware and scalable tool for spot deconvolution and domain identification. SpatialPrompt integrates gene expression, spatial location, and a single-cell RNA sequencing (scRNA-seq) dataset as a reference to accurately infer the cell-type proportions of spatial spots. SpatialPrompt uses non-negative ridge regression and a graph neural network to efficiently capture local microenvironment information. Our extensive benchmarking analysis on Visium, Slide-seq, and MERFISH datasets demonstrated the superior performance of SpatialPrompt over 15 existing tools. On a mouse hippocampus dataset, SpatialPrompt achieves spot deconvolution and domain identification within 2 minutes for 50,000 spots. Overall, domain identification using SpatialPrompt was 44 to 150 times faster than existing methods. We built a database housing more than 40 curated scRNA-seq datasets for seamless integration with SpatialPrompt for spot deconvolution.
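As a rough illustration of the non-negative ridge regression idea named in the abstract above, the following Python sketch deconvolves one synthetic spot into cell-type proportions. It is not SpatialPrompt's implementation: the function names `nonneg_ridge` and `proportions`, the toy signature matrix, the penalty `lam`, and the projected-gradient solver are all assumptions chosen for illustration, and the graph neural network component is omitted entirely.

```python
import numpy as np


def nonneg_ridge(signatures, spot, lam=1e-3, n_iter=500):
    """Minimise ||spot - signatures @ w||^2 + lam * ||w||^2 subject to w >= 0.

    signatures: (genes x cell_types) reference expression matrix (hypothetical).
    spot:       (genes,) expression vector of one spatial spot.
    Solved with plain projected gradient descent (an illustrative choice).
    """
    n_types = signatures.shape[1]
    gram = signatures.T @ signatures + lam * np.eye(n_types)
    target = signatures.T @ spot
    # The objective's gradient is 2 * (gram @ w - target); a step of
    # 1 / (2 * largest eigenvalue) on it equals the update written below.
    step = 1.0 / np.linalg.eigvalsh(gram)[-1]
    w = np.zeros(n_types)
    for _ in range(n_iter):
        half_grad = gram @ w - target
        # Gradient step followed by projection onto the non-negative orthant.
        w = np.maximum(w - step * half_grad, 0.0)
    return w


def proportions(weights):
    """Rescale non-negative weights so they sum to one."""
    total = weights.sum()
    return weights / total if total > 0 else weights


# Toy reference: 4 genes x 3 cell types (made-up numbers, not real data).
signatures = np.array([
    [10.0, 0.0, 1.0],
    [0.0, 8.0, 1.0],
    [1.0, 1.0, 9.0],
    [2.0, 2.0, 2.0],
])
true_mix = np.array([0.6, 0.4, 0.0])
spot = signatures @ true_mix  # noiseless synthetic spot

estimated = proportions(nonneg_ridge(signatures, spot))
print(np.round(estimated, 3))
```

The ridge penalty keeps the solution stable when reference signatures are nearly collinear, and the projection step enforces the non-negativity that a cell-type proportion requires. A real pipeline would add noise handling, gene selection, and the use of spatial neighbours.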
Affiliation(s)
- Asish Kumar Swain
  - Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- Vrushali Pandit
  - Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- Jyoti Sharma
  - Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- Pankaj Yadav
  - Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
  - School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
4
Khatri R, Machart P, Bonn S. DISSECT: deep semi-supervised consistency regularization for accurate cell type fraction and gene expression estimation. Genome Biol 2024; 25:112. [PMID: 38689377 PMCID: PMC11061925 DOI: 10.1186/s13059-024-03251-5] [Received: 07/10/2023] [Accepted: 04/17/2024] [Indexed: 05/02/2024] Open
Abstract
Cell deconvolution is the estimation of cell type fractions and cell type-specific gene expression from mixed data. An unmet challenge in cell deconvolution is the scarcity of realistic training data and the domain shift often observed in synthetic training data. Here, we show that two novel deep neural networks with simultaneous consistency regularization of the target and training domains significantly improve deconvolution performance. Our algorithm, DISSECT, outperforms competing algorithms in cell fraction and gene expression estimation by up to 14 percentage points. DISSECT can be easily adapted to other biomedical data types, as exemplified by our proteomic deconvolution experiments.
Affiliation(s)
- Robin Khatri
  - Institute of Medical Systems Biology, Center for Molecular Neurobiology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Pierre Machart
  - Institute of Medical Systems Biology, Center for Molecular Neurobiology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Stefan Bonn
  - Institute of Medical Systems Biology, Center for Molecular Neurobiology, Center for Biomedical AI, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
5
Danishuddin, Khan S, Kim JJ. Spatial transcriptomics data and analytical methods: An updated perspective. Drug Discov Today 2024; 29:103889. [PMID: 38244672 DOI: 10.1016/j.drudis.2024.103889] [Received: 10/31/2023] [Revised: 01/01/2024] [Accepted: 01/15/2024] [Indexed: 01/22/2024]
Abstract
Spatial transcriptomics (ST) is a newly emerging field that integrates high-resolution imaging and transcriptomic data to enable the high-throughput analysis of the spatial localization of transcripts in diverse biological systems. The rapid progress in this field necessitates the development of innovative computational methods to effectively tackle the distinct challenges posed by the analysis of ST data. These platforms, integrating AI techniques, offer a promising avenue for understanding disease mechanisms and expediting drug discovery. Despite significant advances in the development of ST data analysis techniques, there is an ongoing need to enhance these models for increased biological relevance. In this review, we briefly discuss the ST-related databases and current deep-learning-based models for spatial transcriptome data analyses and highlight their roles and future perspectives in biomedical applications.
Affiliation(s)
- Danishuddin
  - Department of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 38541, Korea
- Shawez Khan
  - National Center for Cancer Immune Therapy (CCIT-DK), Department of Oncology, Copenhagen University Hospital, Herlev, Denmark
- Jong Joo Kim
  - Department of Biotechnology, Yeungnam University, Gyeongsan, Gyeongbuk 38541, Korea
6
Shimonov S, Cunningham JM, Talmon R, Aizenbud L, Desai SJ, Rimm D, Schalper K, Kluger H, Kluger Y. SORBET: Automated cell-neighborhood analysis of spatial transcriptomics or proteomics for interpretable sample classification via GNN. bioRxiv 2024:2023.12.30.573739. [PMID: 38260586 PMCID: PMC10802254 DOI: 10.1101/2023.12.30.573739] [Indexed: 01/24/2024]
Abstract
Spatially resolved transcriptomics or proteomics data have the potential to contribute fundamental insights into the mechanisms underlying physiologic and pathological processes. However, analysis of these data capable of relating spatial information, multiplexed markers, and their observed phenotypes remains technically challenging. To analyze these relationships, we developed SORBET, a deep learning framework that leverages recent advances in graph neural networks (GNN). We apply SORBET to predict tissue phenotypes, such as response to immunotherapy, across different disease processes and different technologies including both spatial proteomics and transcriptomics methods. Our results show that SORBET accurately learns biologically meaningful relationships across distinct tissue structures and data acquisition methods. Furthermore, we demonstrate that SORBET facilitates understanding of the spatially-resolved biological mechanisms underlying the inferred phenotypes. In sum, our method facilitates mapping between the rich spatial and marker information acquired from spatial 'omics technologies to emergent biological phenotypes. Moreover, we provide novel techniques for identifying the biological processes that comprise the predicted phenotypes.