1
Liu Y, Yang C. Computational methods for alignment and integration of spatially resolved transcriptomics data. Comput Struct Biotechnol J 2024; 23:1094-1105. [PMID: 38495555; PMCID: PMC10940867; DOI: 10.1016/j.csbj.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 03/02/2024] [Accepted: 03/04/2024] [Indexed: 03/19/2024] Open
Abstract
Most of the complex biological regulatory activities occur in three dimensions (3D). To better analyze biological processes, it is essential not only to decipher the molecular information of numerous cells but also to understand how their spatial contexts influence their behavior. With the development of spatially resolved transcriptomics (SRT) technologies, SRT datasets are being generated to simultaneously characterize gene expression and spatial arrangement information within tissues, organs or organisms. To fully leverage spatial information, the focus extends beyond individual two-dimensional (2D) slices. Two tasks, known as slice alignment and data integration, have been introduced to establish correlations between multiple slices, enhancing the effectiveness of downstream tasks. Numerous related methods have now been developed. In this review, we first elucidate the details and principles behind several representative methods. Then we report the testing results of these methods on various SRT datasets, and assess their performance in representative downstream tasks. Insights into the strengths and weaknesses of each method and the reasons behind their performance are discussed. Finally, we provide an outlook on future developments. The code and details of the experiments are publicly available at https://github.com/YangLabHKUST/SRT_alignment_and_integration.
Affiliation(s)
- Yuyao Liu
- Department of Automation, School of Information Science and Technology, Tsinghua University, Beijing, China
- Can Yang
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China
2
Wang N, Hong W, Wu Y, Chen Z, Bai M, Wang W, Zhu J. Next-generation spatial transcriptomics: unleashing the power to gear up translational oncology. MedComm (Beijing) 2024; 5:e765. [PMID: 39376738; PMCID: PMC11456678; DOI: 10.1002/mco2.765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Revised: 08/30/2024] [Accepted: 09/03/2024] [Indexed: 10/09/2024] Open
Abstract
Growing advances in spatial transcriptomics (ST) represent a new frontier with unprecedented influence on translational oncology. In the last few years, these advances have driven systematic experimental design, broader and deeper analytical scope, and the continual development of thorough bioinformatics approaches. However, harnessing the power of spatial biology and streamlining an array of ST tools to achieve designated research goals is fundamental and requires real-world experience. We present a systematic review that updates the technical scope of ST across its different underlying principles in chronological order, highlighting the ST techniques most widely adopted in the community. We also review current progress in bioinformatic tools and propose a pipelined workflow with a toolbox for ST data exploration. With particular interest in the tumor microenvironment, where ST is broadly used, we summarize up-to-date progress made with ST-based technologies by describing studies, categorized as either mechanistic elucidation or biomarker profiling (translational oncology), across multiple cancer types and how they deployed ST in their research. This updated review offers guidance and forward-looking viewpoints, supported by the many high-resolution ST tools being used to disentangle biological questions that may lead to clinical significance in the future.
Affiliation(s)
- Nan Wang
- Cosmos Wisdom Biotech Co. Ltd, Hangzhou, China
- Weifeng Hong
- Department of Radiation Oncology, Zhejiang Cancer Hospital, Hangzhou, China
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, China
- Zhejiang Key Laboratory of Radiation Oncology, Hangzhou, China
- Yixing Wu
- Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
- Zhe‐Sheng Chen
- Department of Pharmaceutical Sciences, College of Pharmacy and Health Sciences, Institute for Biotechnology, St. John's University, Queens, New York, USA
- Minghua Bai
- Department of Radiation Oncology, Zhejiang Cancer Hospital, Hangzhou, China
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, China
- Zhejiang Key Laboratory of Radiation Oncology, Hangzhou, China
- Ji Zhu
- Department of Radiation Oncology, Zhejiang Cancer Hospital, Hangzhou, China
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, China
- Zhejiang Key Laboratory of Radiation Oncology, Hangzhou, China
3
Sun X, Zhang W, Li W, Yu N, Zhang D, Zou Q, Dong Q, Zhang X, Liu Z, Yuan Z, Gao R. SpaGRA: graph augmentation facilitates domain identification for spatially resolved transcriptomics. J Genet Genomics 2024:S1673-8527(24)00253-4. [PMID: 39362628; DOI: 10.1016/j.jgg.2024.09.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Revised: 09/07/2024] [Accepted: 09/22/2024] [Indexed: 10/05/2024]
Abstract
Recent advances in spatially resolved transcriptomics (SRT) have provided new opportunities for characterizing spatial structures of various tissues. Graph-based geometric deep learning has gained widespread adoption for spatial domain identification tasks. Currently, most methods define adjacency relations between cells or spots by their spatial distance in SRT data, which overlooks key biological interactions such as gene expression similarities and leads to inaccuracies in spatial domain identification. To tackle this challenge, we propose a novel method, SpaGRA (https://github.com/sunxue-yy/SpaGRA), for automatic multi-relationship construction based on graph augmentation. SpaGRA uses spatial distance as prior knowledge and dynamically adjusts edge weights with multi-head graph attention networks (GATs). This helps SpaGRA to uncover diverse node relationships and enhance message passing in geometric contrastive learning. Additionally, SpaGRA uses these multi-view relationships to construct negative samples, addressing sampling bias posed by random selection. Experimental results show that SpaGRA demonstrates superior domain identification performance on multiple datasets generated from different protocols. Using SpaGRA, we analyzed the functional regions in the mouse hypothalamus, identified key genes related to heart development in mouse embryos, and observed cancer-associated fibroblasts enveloping cancer cells in the latest Visium HD data. Overall, SpaGRA can effectively characterize spatial structures across diverse SRT datasets.
Affiliation(s)
- Xue Sun
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Wei Zhang
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Wenrui Li
- MOE Key Lab of Bioinformatics and Bioinformatics Division of BNRIST, Department of Automation, Tsinghua University, Beijing 100084, China
- Na Yu
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Daoliang Zhang
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Qi Zou
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Qiongye Dong
- Institute of Precision Medicine, Peking University Shenzhen Hospital, Shenzhen, Guangdong 518036, China
- Xianglin Zhang
- Department of Clinical Laboratory, the Second Hospital, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250033, China
- Zhiping Liu
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
- Zhiyuan Yuan
- Institute of Science and Technology for Brain-Inspired Intelligence, Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai 200433, China.
- Rui Gao
- Center of Intelligent Medicine, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China.
4
Hua Y, Zhang Y, Guo Z, Bian S, Zhang Y. ImSpiRE: image feature-aided spatial resolution enhancement method. Sci China Life Sci 2024:10.1007/s11427-023-2636-9. [PMID: 39327391; DOI: 10.1007/s11427-023-2636-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 05/31/2024] [Indexed: 09/28/2024]
Abstract
The resolution of most spatially resolved transcriptomic technologies usually cannot attain the single-cell level, limiting their applications in biological discoveries. Here, we introduce ImSpiRE, an image feature-aided spatial resolution enhancement method for in situ capturing spatial transcriptome. Taking the information stored in histological images, ImSpiRE solves an optimal transport problem to redistribute the expression profiles of spots to construct new transcriptional profiles with enhanced resolution, together with extending the gene expression profiles into unmeasured regions. Applications to multiple datasets confirm that ImSpiRE can enhance spatial resolution to the subspot level while contributing to the discovery of tissue domains, signaling communication patterns, and spatiotemporal characterization.
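The optimal-transport idea in the abstract above can be illustrated with a small sketch. This is not ImSpiRE's implementation: the cost here is plain squared Euclidean distance between spot and subspot coordinates (the abstract indicates that image features are also used), the solver is a generic entropy-regularised Sinkhorn iteration, and the names `sinkhorn`, `redistribute`, and `reg` are hypothetical.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.1, n_iter=500):
    """Entropy-regularised transport plan between histograms a (n,) and b (m,)."""
    K = np.exp(-cost / reg)           # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)             # scale columns towards marginal b
        u = a / (K @ v)               # scale rows towards marginal a
    return u[:, None] * K * v[None, :]

def redistribute(expr, spot_xy, sub_xy, reg=0.1):
    """Spread spot-level expression (n_spots, n_genes) onto subspots.

    Assumption: the cost is squared Euclidean distance only, and the
    spots and subspots do not all share a single location.
    """
    cost = ((spot_xy[:, None, :] - sub_xy[None, :, :]) ** 2).sum(axis=-1)
    cost = cost / cost.max()          # keep exp(-cost / reg) well scaled
    a = np.full(len(spot_xy), 1.0 / len(spot_xy))
    b = np.full(len(sub_xy), 1.0 / len(sub_xy))
    plan = sinkhorn(a, b, cost, reg)
    # Normalise each column so every subspot is a weighted average of spots.
    weights = plan / plan.sum(axis=0, keepdims=True)
    return weights.T @ expr           # (n_subspots, n_genes)

# Two spots, four subspots along a line, two genes.
spot_xy = np.array([[0.0, 0.0], [1.0, 0.0]])
sub_xy = np.array([[0.0, 0.0], [0.25, 0.0], [0.75, 0.0], [1.0, 0.0]])
expr = np.array([[10.0, 0.0], [0.0, 10.0]])
sub_expr = redistribute(expr, spot_xy, sub_xy)
```

In this toy setting each subspot profile is a convex combination of the spot profiles, weighted towards the nearer spot; a smaller `reg` makes the assignment sharper.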
Affiliation(s)
- Yuwei Hua
- State Key Laboratory of Cardiovascular Diseases and Medical Innovation Center, Institute for Regenerative Medicine, Department of Neurosurgery, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Yizhi Zhang
- State Key Laboratory of Cardiovascular Diseases and Medical Innovation Center, Institute for Regenerative Medicine, Department of Neurosurgery, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Zhenming Guo
- State Key Laboratory of Cardiovascular Diseases and Medical Innovation Center, Institute for Regenerative Medicine, Department of Neurosurgery, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shan Bian
- State Key Laboratory of Cardiovascular Diseases and Medical Innovation Center, Institute for Regenerative Medicine, Department of Neurosurgery, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Yong Zhang
- State Key Laboratory of Cardiovascular Diseases and Medical Innovation Center, Institute for Regenerative Medicine, Department of Neurosurgery, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, Frontier Science Center for Stem Cell Research, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
5
Liu T, Li K, Wang Y, Li H, Zhao H. Evaluating the Utilities of Foundation Models in Single-cell Data Analysis. bioRxiv 2024:2023.09.08.555192. [PMID: 38464157; PMCID: PMC10925156; DOI: 10.1101/2023.09.08.555192] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 03/12/2024]
Abstract
Foundation Models (FMs) have made significant strides in both industrial and scientific domains. In this paper, we evaluate the performance of FMs for single-cell sequencing data analysis through comprehensive experiments across eight downstream tasks pertinent to single-cell data. Overall, considering model performance and user accessibility among ten single-cell FMs, the top FMs include scGPT, Geneformer, and CellPLM. However, by comparing these FMs with task-specific methods, we found that single-cell FMs may not consistently outperform task-specific methods in all tasks, which challenges the necessity of developing foundation models for single-cell analysis. In addition, we evaluated the effects of hyper-parameters, initial settings, and stability for training single-cell FMs based on a proposed scEval framework, and provide guidelines for pre-training and fine-tuning to enhance the performance of single-cell FMs. Our work summarizes the current state of single-cell FMs, points to their constraints and avenues for future development, and offers a freely available evaluation pipeline to benchmark new models and improve method development.
6
Liu L, Chen A, Li Y, Mulder J, Heyn H, Xu X. Spatiotemporal omics for biology and medicine. Cell 2024; 187:4488-4519. [PMID: 39178830; DOI: 10.1016/j.cell.2024.07.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 07/05/2024] [Accepted: 07/23/2024] [Indexed: 08/26/2024]
Abstract
The completion of the Human Genome Project has provided a foundational blueprint for understanding human life. Nonetheless, understanding the intricate mechanisms through which our genetic blueprint is involved in disease or orchestrates development across temporal and spatial dimensions remains a profound scientific challenge. Recent breakthroughs in cellular omics technologies have paved new pathways for understanding the regulation of genomic elements and the relationship between gene expression, cellular functions, and cell fate determination. The advent of spatial omics technologies, encompassing both imaging and sequencing-based methodologies, has enabled a comprehensive understanding of biological processes from a cellular ecosystem perspective. This review offers an updated overview of how spatial omics has advanced our understanding of the translation of genetic information into cellular heterogeneity and tissue structural organization and their dynamic changes over time. It emphasizes the discovery of various biological phenomena, related to organ functionality, embryogenesis, species evolution, and the pathogenesis of diseases.
Affiliation(s)
- Ao Chen
- BGI Research, Shenzhen 518083, China
- Jan Mulder
- Department of Neuroscience, Karolinska Institute, Stockholm, Sweden
- Holger Heyn
- Centro Nacional de Análisis Genómico (CNAG), Barcelona, Spain
- Xun Xu
- BGI Research, Hangzhou 310030, China; BGI Research, Shenzhen 518083, China.
7
Ruan Z, Zhou W, Liu H, Wei J, Pan Y, Yan C, Wei X, Xiang W, Yan C, Chen S, Liu J. Precise detection of cell-type-specific domains in spatial transcriptomics. Cell Rep Methods 2024; 4:100841. [PMID: 39127046; PMCID: PMC11384096; DOI: 10.1016/j.crmeth.2024.100841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 06/17/2024] [Accepted: 07/17/2024] [Indexed: 08/12/2024]
Abstract
Cell-type-specific domains are the anatomical domains in spatially resolved transcriptome (SRT) tissues where particular cell types are enriched coincidentally. It is challenging to use existing computational methods to detect specific domains with low-proportion cell types, which are partly overlapped with or even inside other cell-type-specific domains. Here, we propose De-spot, which synthesizes segmentation and deconvolution as an ensemble to generate cell-type patterns, detect low-proportion cell-type-specific domains, and display these domains intuitively. Experimental evaluation showed that De-spot enabled us to discover the co-localizations between cancer-associated fibroblasts and immune-related cells that indicate potential tumor microenvironment (TME) domains in given slices, which were obscured by previous computational methods. We further elucidated the identified domains and found that Srgn may be a critical TME marker in SRT slices. By deciphering T cell-specific domains in breast cancer tissues, De-spot also revealed that the proportions of exhausted T cells were significantly increased in invasive vs. ductal carcinoma.
Affiliation(s)
- Zhihan Ruan
- Centre for Bioinformatics and Intelligent Medicine, College of Computer Science, Nankai University, Tianjin 300350, China
- Weijun Zhou
- Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China
- Hong Liu
- The Second Surgical Department of Breast Cancer, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute & Hospital, Tianjin 300060, China
- Jinmao Wei
- Centre for Bioinformatics and Intelligent Medicine, College of Computer Science, Nankai University, Tianjin 300350, China
- Yichen Pan
- Centre for Bioinformatics and Intelligent Medicine, College of Computer Science, Nankai University, Tianjin 300350, China
- Chaoyang Yan
- Centre for Bioinformatics and Intelligent Medicine, College of Computer Science, Nankai University, Tianjin 300350, China
- Xiaoyi Wei
- Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai 519000, China
- Wenting Xiang
- Centre for Bioinformatics and Intelligent Medicine, College of Computer Science, Nankai University, Tianjin 300350, China
- Chengwei Yan
- Centre for Bioinformatics and Intelligent Medicine, College of Computer Science, Nankai University, Tianjin 300350, China
- Shengquan Chen
- School of Mathematical Sciences, Nankai University, Tianjin 300350, China
- Jian Liu
- State Key Laboratory of Medicinal Chemical Biology, College of Computer Science, Nankai University, Tianjin 300350, China.
8
Hu Y, Xie M, Li Y, Rao M, Shen W, Luo C, Qin H, Baek J, Zhou XM. Benchmarking clustering, alignment, and integration methods for spatial transcriptomics. Genome Biol 2024; 25:212. [PMID: 39123269; PMCID: PMC11312151; DOI: 10.1186/s13059-024-03361-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 07/30/2024] [Indexed: 08/12/2024] Open
Abstract
BACKGROUND Spatial transcriptomics (ST) is advancing our understanding of complex tissues and organisms. However, building a robust clustering algorithm to define spatially coherent regions in a single tissue slice and aligning or integrating multiple tissue slices originating from diverse sources for essential downstream analyses remains challenging. Numerous clustering, alignment, and integration methods have been specifically designed for ST data by leveraging its spatial information. The absence of comprehensive benchmark studies complicates the selection of methods and future method development. RESULTS In this study, we systematically benchmark a variety of state-of-the-art algorithms with a wide range of real and simulated datasets of varying sizes, technologies, species, and complexity. We analyze the strengths and weaknesses of each method using diverse quantitative and qualitative metrics and analyses, including eight metrics for spatial clustering accuracy and contiguity, uniform manifold approximation and projection visualization, layer-wise and spot-to-spot alignment accuracy, and 3D reconstruction, which are designed to assess method performance as well as data quality. The code used for evaluation is available on our GitHub. Additionally, we provide online notebook tutorials and documentation to facilitate the reproduction of all benchmarking results and to support the study of new methods and new datasets. CONCLUSIONS Our analyses lead to comprehensive recommendations that cover multiple aspects, helping users to select optimal tools for their specific needs and guide future method development.
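The abstract above refers to metrics for spatial clustering accuracy without naming them. As an illustration only, the sketch below computes the adjusted Rand index (ARI), a widely used measure of agreement between predicted domain labels and annotated ones. Whether ARI is among the eight metrics in this benchmark is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """ARI between two labelings of the same spots (any hashable labels)."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Contingency counts and their marginals.
    pairs = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in rows.values())
    sum_b = sum(comb(c, 2) for c in cols.values())
    expected = sum_a * sum_b / comb(n, 2)   # agreement expected by chance
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:               # degenerate: both labelings trivial
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

# Label names are arbitrary: a relabelled but identical partition scores 1.0.
score = adjusted_rand_index(["L1", "L1", "L2", "L2"], [7, 7, 3, 3])
```

ARI is invariant to how clusters are named, which matters here because clustering methods return arbitrary cluster ids rather than anatomical layer names.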
Affiliation(s)
- Yunfei Hu
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Manfei Xie
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Yikang Li
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Mingxing Rao
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Wenjun Shen
- Department of Bioinformatics, Shantou University Medical College, 515041, Shantou, China
- Can Luo
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Haoran Qin
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Jihoon Baek
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Xin Maizie Zhou
- Department of Computer Science, Vanderbilt University, 37235, Nashville, USA.
- Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA.
9
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Sci China Life Sci 2024:10.1007/s11427-023-2561-0. [PMID: 39060615; DOI: 10.1007/s11427-023-2561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have advanced rapidly. These technologies have expanded horizontally to include the single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.
- Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
- Chen Li
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Tieliu Shi
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China.
10
Zhang Y, Yu Z, Wong KC, Li X. Unraveling Spatial Domain Characterization in Spatially Resolved Transcriptomics with Robust Graph Contrastive Clustering. Bioinformatics 2024; 40:btae451. [PMID: 39012523; PMCID: PMC11272174; DOI: 10.1093/bioinformatics/btae451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 06/12/2024] [Accepted: 07/12/2024] [Indexed: 07/17/2024] Open
Abstract
MOTIVATION Spatial transcriptomics can quantify gene expression and its spatial distribution in tissues, thus revealing molecular mechanisms of cellular interactions underlying tissue heterogeneity, tissue regeneration, and spatially localized disease mechanisms. However, existing spatial clustering methods often fail to exploit the full potential of spatial information, resulting in inaccurate identification of spatial domains. RESULTS In this paper, we develop a deep graph contrastive clustering framework, stDGCC, that accurately uncovers underlying spatial domains via explicitly modeling spatial information and gene expression profiles from spatial transcriptomics data. The stDGCC framework proposes a spatially informed graph node embedding model to preserve the topological information of spots and to learn the informative and discriminative characterization of spatial transcriptomics data through self-supervised contrastive learning. By simultaneously optimizing the contrastive learning loss, reconstruction loss, and Kullback-Leibler (KL) divergence loss, stDGCC achieves joint optimization of feature learning and topology structure preservation in an end-to-end manner. We validate the effectiveness of stDGCC on various spatial transcriptomics datasets acquired from different platforms, each with varying spatial resolutions. Our extensive experiments demonstrate the superiority of stDGCC over various state-of-the-art clustering methods in accurately identifying cellular-level biological structures. AVAILABILITY Code and data are available from https://github.com/TimE9527/stDGCC and https://figshare.com/projects/stDGCC/186525. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
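The joint objective described in the abstract above can be written schematically as a weighted sum of its three terms. The weights $\alpha$ and $\beta$, and the specific form of the KL term (a target distribution $P$ against soft cluster assignments $Q$, as is common in deep clustering), are assumptions made for illustration; the abstract does not state how stDGCC weights or defines the terms.

```latex
\mathcal{L}_{\text{total}}
  = \mathcal{L}_{\text{con}}
  + \alpha\,\mathcal{L}_{\text{rec}}
  + \beta\,\mathcal{L}_{\text{KL}},
\qquad
\mathcal{L}_{\text{KL}}
  = \mathrm{KL}(P \,\|\, Q)
  = \sum_{i}\sum_{k} p_{ik}\,\log\frac{p_{ik}}{q_{ik}}
```

Here $i$ indexes spots and $k$ indexes clusters. Minimising all three terms together is what the abstract calls end-to-end joint optimization of feature learning and topology preservation.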
Affiliation(s)
- Yingxi Zhang
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Zhuohan Yu
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong 999077, Hong Kong SAR
- Xiangtao Li
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
11
Ma Y, Zhou X. Accurate and efficient integrative reference-informed spatial domain detection for spatial transcriptomics. Nat Methods 2024; 21:1231-1244. [PMID: 38844627 DOI: 10.1038/s41592-024-02284-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 04/18/2024] [Indexed: 06/23/2024]
Abstract
Spatially resolved transcriptomics (SRT) studies are becoming increasingly common and large, offering unprecedented opportunities in mapping complex tissue structures and functions. Here we present integrative and reference-informed tissue segmentation (IRIS), a computational method designed to characterize tissue spatial organization in SRT studies through accurately and efficiently detecting spatial domains. IRIS uniquely leverages single-cell RNA sequencing data for reference-informed detection of biologically interpretable spatial domains, integrating multiple SRT slices while explicitly considering correlations both within and across slices. We demonstrate the advantages of IRIS through in-depth analysis of six SRT datasets encompassing diverse technologies, tissues, species and resolutions. In these applications, IRIS achieves substantial accuracy gains (39-1,083%) and speed improvements (4.6-666.0×) in moderate-sized datasets, while representing the only method applicable for large datasets including Stereo-seq and 10x Xenium. As a result, IRIS reveals intricate brain structures, uncovers tumor microenvironment heterogeneity and detects structural changes in diabetes-affected testis, all with exceptional speed and accuracy.
Affiliation(s)
- Ying Ma
- Department of Biostatistics, Brown University, Providence, RI, USA
- Center for Computational Molecular Biology, Brown University, Providence, RI, USA
- Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
12
Lin S, Cui Y, Zhao F, Yang Z, Song J, Yao J, Zhao Y, Qian BZ, Zhao Y, Yuan Z. Complete spatially resolved gene expression is not necessary for identifying spatial domains. Cell Genomics 2024; 4:100565. [PMID: 38781966 PMCID: PMC11228956 DOI: 10.1016/j.xgen.2024.100565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 02/29/2024] [Accepted: 04/30/2024] [Indexed: 05/25/2024]
Abstract
Spatially resolved transcriptomics (SRT) technologies have revolutionized the study of tissue organization. We introduce a graph convolutional network with an attention and positive emphasis mechanism, termed BINARY, relying exclusively on binarized SRT data to accurately delineate spatial domains. BINARY outperforms existing methods across various SRT data types while using significantly less input information. Our study suggests that precise gene expression quantification may not always be essential, inspiring further exploration of the broader applications of spatially resolved binarized gene expression data.
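To make "binarized SRT data" concrete, the sketch below converts a small spot-by-gene count matrix into a detected/not-detected matrix. The threshold (any count above zero counts as detected) and the function name `binarize_counts` are assumptions for illustration; the abstract does not state how BINARY itself binarizes its input.

```python
import numpy as np


def binarize_counts(counts: np.ndarray) -> np.ndarray:
    """Return 1 where a gene is detected in a spot (count > 0), else 0.

    The zero threshold is an illustrative assumption, not BINARY's documented rule.
    """
    return (counts > 0).astype(np.uint8)


# Toy matrix: rows are spots, columns are genes.
counts = np.array([
    [0, 3, 12],
    [5, 0, 0],
    [0, 0, 1],
])
binary = binarize_counts(counts)
```

After this step the magnitude of expression is discarded: a gene with 12 counts and a gene with 1 count in the same spot are treated identically, which is the sense in which the method uses "significantly less input information".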
Affiliation(s)
- Senlin Lin
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Yan Cui
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Fangyuan Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Zhidong Yang
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
- Yu Zhao
- AI Lab, Tencent, Shenzhen, China
- Bin-Zhi Qian
- Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, The Human Phenome Institute, Zhangjiang-Fudan International Innovation Center, Fudan University, Shanghai, China
- Yi Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
- Zhiyuan Yuan
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
13
Yu Y, He Y, Xie Z. Accurate Identification of Spatial Domain by Incorporating Global Spatial Proximity and Local Expression Proximity. Biomolecules 2024; 14:674. [PMID: 38927077 PMCID: PMC11201407 DOI: 10.3390/biom14060674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/01/2024] [Accepted: 06/07/2024] [Indexed: 06/28/2024] Open
Abstract
Accurate identification of spatial domains is essential in the analysis of spatial transcriptomics data in order to elucidate tissue microenvironments and biological functions. However, existing methods only perform domain segmentation based on local or global spatial relationships between spots, resulting in an underutilization of spatial information. To this end, we propose SECE, a deep learning-based method that captures both local and global relationships among spots and aggregates their information using expression similarity and spatial similarity. We benchmarked SECE against eight state-of-the-art methods on six real spatial transcriptomics datasets spanning four different platforms. SECE consistently outperformed other methods in spatial domain identification accuracy. Moreover, SECE produced spatial embeddings that exhibited clearer patterns in low-dimensional visualizations and facilitated a more accurate trajectory inference.
Affiliation(s)
- Yuanyuan Yu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Yao He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Center for Precision Medicine, Sun Yat-sen University, Guangzhou 510080, China
14
Jiang X, Wang S, Guo L, Zhu B, Wen Z, Jia L, Xu L, Xiao G, Li Q. iIMPACT: integrating image and molecular profiles for spatial transcriptomics analysis. Genome Biol 2024; 25:147. [PMID: 38844966 DOI: 10.1186/s13059-024-03289-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 05/23/2024] [Indexed: 07/04/2024] Open
Abstract
Current clustering analysis of spatial transcriptomics data primarily relies on molecular information and fails to fully exploit the morphological features present in histology images, leading to compromised accuracy and interpretability. To overcome these limitations, we have developed a multi-stage statistical method called iIMPACT. It identifies and defines histology-based spatial domains based on AI-reconstructed histology images and spatial context of gene expression measurements, and detects domain-specific differentially expressed genes. Through multiple case studies, we demonstrate iIMPACT outperforms existing methods in accuracy and interpretability and provides insights into the cellular spatial organization and landscape of functional genes within spatial transcriptomics data.
Affiliation(s)
- Xi Jiang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Statistics and Data Science, Southern Methodist University, Dallas, TX, USA
- Shidan Wang
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Lei Guo
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Bencong Zhu
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, USA
- Zhuoyu Wen
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Liwei Jia
- Department of Pathology, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Lin Xu
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Guanghua Xiao
- Quantitative Biomedical Research Center, Peter O'Donnell Jr. School of Public Health, The University of Texas Southwestern Medical Center, Dallas, TX, USA
- Qiwei Li
- Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, TX, USA
15
Swain AK, Pandit V, Sharma J, Yadav P. SpatialPrompt: spatially aware scalable and accurate tool for spot deconvolution and domain identification in spatial transcriptomics. Commun Biol 2024; 7:639. [PMID: 38796505 PMCID: PMC11127982 DOI: 10.1038/s42003-024-06349-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 05/17/2024] [Indexed: 05/28/2024] Open
Abstract
Efficient mapping of cell types in situ remains a major challenge in spatial transcriptomics. Most spot deconvolution tools ignore spatial coordinate information and run extremely slowly on large datasets. Here, we introduce SpatialPrompt, a spatially aware and scalable tool for spot deconvolution and domain identification. SpatialPrompt integrates gene expression, spatial location, and a single-cell RNA sequencing (scRNA-seq) dataset as reference to accurately infer cell-type proportions of spatial spots. SpatialPrompt uses non-negative ridge regression and a graph neural network to efficiently capture local microenvironment information. Our extensive benchmarking analysis on Visium, Slide-seq, and MERFISH datasets demonstrated superior performance of SpatialPrompt over 15 existing tools. On a mouse hippocampus dataset, SpatialPrompt achieves spot deconvolution and domain identification within 2 minutes for 50,000 spots. Overall, domain identification using SpatialPrompt was 44 to 150 times faster than existing methods. We built a database housing more than 40 curated scRNA-seq datasets for seamless integration with SpatialPrompt for spot deconvolution.
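As a rough illustration of the non-negative ridge regression step mentioned in this abstract, the sketch below estimates cell-type proportions for a single spot from a reference signature matrix. It is not SpatialPrompt's implementation: the function name `nonneg_ridge`, the penalty value, the final normalization to proportions, and the simulated data are all assumptions. It relies on the standard identity that a ridge penalty can be folded into an ordinary non-negative least-squares problem by appending `sqrt(lam) * I` rows to the design matrix, solved here with `scipy.optimize.nnls`.

```python
import numpy as np
from scipy.optimize import nnls


def nonneg_ridge(signatures: np.ndarray, spot_expr: np.ndarray, lam: float = 1.0) -> np.ndarray:
    """Solve min_w ||S w - y||^2 + lam * ||w||^2 subject to w >= 0.

    signatures: genes x cell_types reference matrix (e.g. from scRNA-seq).
    spot_expr:  expression vector of one spot (length = genes).
    Returns weights rescaled to sum to 1 (an illustrative choice).
    """
    n_types = signatures.shape[1]
    # Augmenting the system turns the ridge penalty into extra least-squares rows.
    design = np.vstack([signatures, np.sqrt(lam) * np.eye(n_types)])
    target = np.concatenate([spot_expr, np.zeros(n_types)])
    weights, _ = nnls(design, target)
    total = weights.sum()
    return weights / total if total > 0 else weights


# Simulated, noise-free example: 50 genes, 3 hypothetical cell types.
rng = np.random.default_rng(0)
signatures = rng.gamma(2.0, 1.0, size=(50, 3))
true_props = np.array([0.6, 0.3, 0.1])
spot = signatures @ true_props
est = nonneg_ridge(signatures, spot, lam=0.01)
```

With a small penalty and noise-free input, `est` should land close to `true_props`; larger values of `lam` shrink the raw weights toward zero before normalization, which trades some bias for stability when signatures are correlated.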
Affiliation(s)
- Asish Kumar Swain
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- Vrushali Pandit
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- Jyoti Sharma
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- Pankaj Yadav
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
- School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
16
Lu Y, Chen QM, An L. SPADE: spatial deconvolution for domain specific cell-type estimation. Commun Biol 2024; 7:469. [PMID: 38632414 PMCID: PMC11024133 DOI: 10.1038/s42003-024-06172-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 04/10/2024] [Indexed: 04/19/2024] Open
Abstract
Understanding gene expression in different cell types within their spatial context is a key goal in genomics research. SPADE (SPAtial DEconvolution), our proposed method, addresses this by integrating spatial patterns into the analysis of cell type composition. This approach uses a combination of single-cell RNA sequencing, spatial transcriptomics, and histological data to accurately estimate the proportions of cell types in various locations. Our analyses of synthetic data have demonstrated SPADE's capability to discern cell type-specific spatial patterns effectively. When applied to real-life datasets, SPADE provides insights into cellular dynamics and the composition of tumor tissues. This enhances our comprehension of complex biological systems and aids in exploring cellular diversity. SPADE represents a significant advancement in deciphering spatial gene expression patterns, offering a powerful tool for the detailed investigation of cell types in spatial transcriptomics.
Affiliation(s)
- Yingying Lu
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA
- Qin M Chen
- College of Pharmacy, University of Arizona, Tucson, AZ, 85721, USA
- Lingling An
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA
- Department of Biosystems Engineering, University of Arizona, Tucson, AZ, 85721, USA
- Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, 85721, USA
17
Yuan Z, Zhao F, Lin S, Zhao Y, Yao J, Cui Y, Zhang XY, Zhao Y. Benchmarking spatial clustering methods with spatially resolved transcriptomics data. Nat Methods 2024; 21:712-722. [PMID: 38491270 DOI: 10.1038/s41592-024-02215-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Accepted: 02/16/2024] [Indexed: 03/18/2024]
Abstract
Spatial clustering, which shares an analogy with single-cell clustering, has expanded the scope of tissue physiology studies from cell-centroid to structure-centroid with spatially resolved transcriptomics (SRT) data. Computational methods have undergone remarkable development in recent years, but a comprehensive benchmark study is still lacking. Here we present a benchmark study of 13 computational methods on 34 SRT data (7 datasets). The performance was evaluated on the basis of accuracy, spatial continuity, marker gene detection, scalability, and robustness. We found existing methods were complementary in terms of their performance and functionality, and we provide guidance for selecting appropriate methods for given scenarios. On testing 22 additional challenging datasets, we identified challenges in identifying noncontinuous spatial domains and limitations of existing methods, highlighting their inadequacies in handling recent large-scale tasks. Furthermore, with 145 simulated datasets, we examined the robustness of these methods against four different factors, and assessed the impact of pre- and postprocessing approaches. Our study offers a comprehensive evaluation of existing spatial clustering methods with SRT data, paving the way for future advancements in this rapidly evolving field.
Affiliation(s)
- Zhiyuan Yuan
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Fangyuan Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Senlin Lin
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Yu Zhao
- Tencent AI Lab, Shenzhen, China
- Yan Cui
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
- Xiao-Yong Zhang
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Yi Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
18
Lin S, Zhao F, Wu Z, Yao J, Zhao Y, Yuan Z. Streamlining spatial omics data analysis with Pysodb. Nat Protoc 2024; 19:831-895. [PMID: 38135744 DOI: 10.1038/s41596-023-00925-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Accepted: 10/02/2023] [Indexed: 12/24/2023]
Abstract
Advances in spatial omics technologies have improved the understanding of cellular organization in tissues, leading to the generation of complex and heterogeneous data and prompting the development of specialized tools for managing, loading and visualizing spatial omics data. The Spatial Omics Database (SODB) was established to offer a unified format for data storage and interactive visualization modules. Here we detail the use of Pysodb, a Python-based tool designed to enable the efficient exploration and loading of spatial datasets from SODB within a Python environment. We present seven case studies using Pysodb, detailing the interaction with various computational methods, ensuring reproducibility of experimental data and facilitating the integration of new data and alternative applications in SODB. The approach offers a reference for method developers by outlining label and metadata availability in representative spatial data that can be loaded by Pysodb. The tool is supplemented by a website (https://protocols-pysodb.readthedocs.io/) with detailed information for benchmarking analysis, and allows method developers to focus on computational models by facilitating data processing. This protocol is designed for researchers with limited experience in computational biology. Depending on the dataset complexity, the protocol typically requires ~12 h to complete.
Affiliation(s)
- Senlin Lin
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Fangyuan Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Yi Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Zhiyuan Yuan
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
19
Yang S, Zhou X. SRT-Server: powering the analysis of spatial transcriptomic data. Genome Med 2024; 16:18. [PMID: 38279156 PMCID: PMC10811909 DOI: 10.1186/s13073-024-01288-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 01/15/2024] [Indexed: 01/28/2024] Open
Abstract
BACKGROUND: Spatially resolved transcriptomics (SRT) encompasses a rapidly developing set of technologies that enable the measurement of gene expression in tissue while retaining spatial localization information. SRT technologies and the enabled SRT studies have provided unprecedented insights into the structural and functional underpinnings of complex tissues. As SRT technologies have advanced and an increasing number of SRT studies have emerged, numerous sophisticated statistical and computational methods have been developed to facilitate the analysis and interpretation of SRT data. However, despite the growing popularity of SRT studies and the widespread availability of SRT analysis methods, analysis of large-scale and complex SRT datasets remains challenging and not easily accessible to researchers with limited statistical and computational backgrounds. RESULTS: Here, we present SRT-Server, the first webserver designed to carry out comprehensive SRT analyses for a wide variety of SRT technologies while requiring minimal prior computational knowledge. Implemented with cutting-edge web development technologies, SRT-Server is user-friendly and features multiple analytic modules that can perform a range of SRT analyses. With a flowchart-style interface, these different analytic modules on the SRT-Server can be dragged into the main panel and connected to each other to create custom analytic pipelines. SRT-Server then automatically executes the desired analyses, generates corresponding figures, and outputs results, all without requiring prior programming knowledge. We demonstrate the advantages of SRT-Server through three case studies utilizing SRT data collected from two common platforms, highlighting its versatility and value to researchers with varying analytic expertise. CONCLUSIONS: Overall, SRT-Server presents a user-friendly, efficient, effective, secure, and expandable solution for SRT data analysis, opening new doors for researchers in the field. SRT-Server is freely available at https://spatialtranscriptomicsanalysis.com/.
Affiliation(s)
- Sheng Yang
- Department of Biostatistics, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing, Jiangsu, 211166, China
- Xiang Zhou
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
20
Liang Y, Shi G, Cai R, Yuan Y, Xie Z, Yu L, Huang Y, Shi Q, Wang L, Li J, Tang Z. PROST: quantitative identification of spatially variable genes and domain detection in spatial transcriptomics. Nat Commun 2024; 15:600. [PMID: 38238417 PMCID: PMC10796707 DOI: 10.1038/s41467-024-44835-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Accepted: 12/19/2023] [Indexed: 01/22/2024] Open
Abstract
Computational methods have been proposed to leverage spatially resolved transcriptomic data, pinpointing genes with spatial expression patterns and delineating tissue domains. However, existing approaches fall short in uniformly quantifying spatially variable genes (SVGs). Moreover, from a methodological viewpoint, while SVGs are naturally associated with depicting spatial domains, they are technically dissociated in most methods. Here, we present a framework (PROST) for the quantitative recognition of spatial transcriptomic patterns, consisting of (i) quantitatively characterizing spatial variations in gene expression patterns through the PROST Index; and (ii) unsupervised clustering of spatial domains via a self-attention mechanism. We demonstrate that PROST performs superior SVG identification and domain segmentation with various spatial resolutions, from multicellular to cellular levels. Importantly, PROST Index can be applied to prioritize spatial expression variations, facilitating the exploration of biological insights. Together, our study provides a flexible and robust framework for analyzing diverse spatial transcriptomic data.
Affiliation(s)
- Yuchen Liang
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Guowei Shi
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Runlin Cai
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Yuchen Yuan
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Ziying Xie
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
- Long Yu
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Yingjian Huang
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Qian Shi
- School of Geography and Planning, Sun Yat-sen University, Guangzhou, 510275, China
- Lizhe Wang
- School of Computer Science, China University of Geosciences, Wuhan, 430078, China
- Jun Li
- School of Computer Science, China University of Geosciences, Wuhan, 430078, China
- Zhonghui Tang
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
21
Yuan Z. MENDER: fast and scalable tissue structure identification in spatial omics data. Nat Commun 2024; 15:207. [PMID: 38182575 PMCID: PMC10770058 DOI: 10.1038/s41467-023-44367-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 12/11/2023] [Indexed: 01/07/2024] Open
Abstract
Tissue structure identification is a crucial task in spatial omics data analysis, for which increasingly complex models, such as Graph Neural Networks and Bayesian networks, are employed. However, whether increased model complexity can effectively lead to improved performance is a notable question in the field. Inspired by the consistent observation of cellular neighborhood structures across various spatial technologies, we propose Multi-range cEll coNtext DEciphereR (MENDER), for tissue structure identification. Applied to datasets of 3 brain regions and a whole-brain atlas, MENDER, with biology-driven design, offers substantial improvements over modern complex models while automatically aligning labels across slices, despite using much less running time than the second-fastest. MENDER's identification power allows the uncovering of previously overlooked spatial domains that exhibit strong associations with brain aging. MENDER's scalability makes it freely applicable to a million-level brain spatial atlas. MENDER's discriminative power enables the differentiation of breast cancer patient subtypes obscured by single-cell analysis.
Affiliation(s)
- Zhiyuan Yuan
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, 200433, China
22
Li J, Wang J, Lin Z. SGCAST: symmetric graph convolutional auto-encoder for scalable and accurate study of spatial transcriptomics. Brief Bioinform 2023; 25:bbad490. [PMID: 38171928 PMCID: PMC10782917 DOI: 10.1093/bib/bbad490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Revised: 08/02/2023] [Accepted: 12/07/2023] [Indexed: 01/05/2024] Open
Abstract
Recent advances in spatial transcriptomics (ST) have enabled comprehensive profiling of gene expression with spatial information in the context of the tissue microenvironment. However, with the improvements in the resolution and scale of ST data, deciphering spatial domains precisely while ensuring efficiency and scalability is still challenging. Here, we develop SGCAST, an efficient auto-encoder framework to identify spatial domains. SGCAST adopts a symmetric graph convolutional auto-encoder to learn aggregated latent embeddings by integrating the gene expression similarity and the proximity of the spatial spots. This framework enables a mini-batch training strategy, which makes SGCAST memory-efficient and scalable to high-resolution spatial transcriptomic data with a large number of spots. SGCAST improves the overall accuracy of spatial domain identification on benchmarking data. We also validated the performance of SGCAST on ST datasets at various scales across multiple platforms. Our study illustrates the superior capacity of SGCAST in analyzing spatial transcriptomic data.
Affiliation(s)
- Jinzhao Li
- Department of Statistics, The Chinese University of Hong Kong, Sha Tin, Hong Kong, China
- Jiong Wang
- School of Science and Engineering, The Chinese University of Hong Kong (Shenzhen), Shenzhen, 518172, China
- Zhixiang Lin
- Department of Statistics, The Chinese University of Hong Kong, Sha Tin, Hong Kong, China
23
Guo T, Yuan Z, Pan Y, Wang J, Chen F, Zhang MQ, Li X. SPIRAL: integrating and aligning spatially resolved transcriptomics data across different experiments, conditions, and technologies. Genome Biol 2023; 24:241. [PMID: 37864231 PMCID: PMC10590036 DOI: 10.1186/s13059-023-03078-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 09/29/2023] [Indexed: 10/22/2023] Open
Abstract
Properly integrating spatially resolved transcriptomics (SRT) generated from different batches into a unified gene-spatial coordinate system could enable the construction of a comprehensive spatial transcriptome atlas. Here, we propose SPIRAL, consisting of two consecutive modules: SPIRAL-integration, with graph domain adaptation-based data integration, and SPIRAL-alignment, with cluster-aware optimal transport-based coordination alignment. We verify SPIRAL with both synthetic and real SRT datasets. By encoding spatial correlations to gene expressions, SPIRAL-integration surpasses state-of-the-art methods in both batch effect removal and joint spatial domain identification. By aligning spots cluster-wise, SPIRAL-alignment achieves more accurate coordinate alignments than existing methods.
Affiliation(s)
- Tiantian Guo
  - School of Software Engineering, Beijing Jiaotong University, Beijing, 100044, China
  - MOE Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic & Systems Biology, BNRist, Department of Automation, Tsinghua University, Beijing, 100084, China
- Zhiyuan Yuan
  - Institute of Science and Technology for Brain-Inspired Intelligence, Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Yan Pan
  - School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Jiakang Wang
  - School of Software Engineering, Beijing Jiaotong University, Beijing, 100044, China
- Fengling Chen
  - Center for Stem Cell Biology and Regenerative Medicine, MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua-Peking Center for Life Sciences, Tsinghua University, Beijing, 100084, China
- Michael Q Zhang
  - Department of Biological Sciences, Center for Systems Biology, The University of Texas, Richardson, TX, 75080-3021, USA.
- Xiangyu Li
  - School of Software Engineering, Beijing Jiaotong University, Beijing, 100044, China.
|
24
|
Lu Y, Chen Q, An L. SPADE: Spatial Deconvolution for Domain Specific Cell-type Estimation. bioRxiv 2023:2023.04.14.536924. [PMID: 37131788 PMCID: PMC10153127 DOI: 10.1101/2023.04.14.536924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
The advent of spatial transcriptomics technology has allowed for the acquisition of gene expression profiles with multi-cellular resolution in a spatially resolved manner, presenting a new milestone in the field of genomics. However, the aggregate gene expression from heterogeneous cell types obtained by these technologies poses a significant challenge for a comprehensive delineation of cell type-specific spatial patterns. Here, we propose SPADE (SPAtial DEconvolution), an in-silico method designed to address this challenge by incorporating spatial patterns during cell type decomposition. SPADE utilizes a combination of single-cell RNA sequencing data, spatial location information, and histological information to computationally estimate the proportion of cell types present at each spatial location. In our study, we showcased the effectiveness of SPADE by conducting analyses on synthetic data. Our results indicated that SPADE was able to successfully identify cell type-specific spatial patterns that were not previously identified by existing deconvolution methods. Furthermore, we applied SPADE to a real-world dataset analyzing the developmental chicken heart, where we observed that SPADE was able to accurately capture the intricate processes of cellular differentiation and morphogenesis within the heart. Specifically, we were able to reliably estimate changes in cell type compositions over time, which is a critical aspect of understanding the underlying mechanisms of complex biological systems. These findings underscore the potential of SPADE as a valuable tool for analyzing complex biological systems and shedding light on their underlying mechanisms. Taken together, our results suggest that SPADE represents a significant advancement in the field of spatial transcriptomics, providing a powerful tool for characterizing complex spatial gene expression patterns in heterogeneous tissues.
|
25
|
Zhu J, Shang L, Zhou X. SRTsim: spatial pattern preserving simulations for spatially resolved transcriptomics. Genome Biol 2023; 24:39. [PMID: 36869394 PMCID: PMC9983268 DOI: 10.1186/s13059-023-02879-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 21.0] [Received: 06/25/2022] [Accepted: 02/16/2023] [Indexed: 03/05/2023] Open
Abstract
Spatially resolved transcriptomics (SRT)-specific computational methods are often developed, tested, validated, and evaluated in silico using simulated data. Unfortunately, existing simulated SRT data are often poorly documented, hard to reproduce, or unrealistic. Single-cell simulators are not directly applicable for SRT simulation as they cannot incorporate spatial information. We present SRTsim, an SRT-specific simulator for scalable, reproducible, and realistic SRT simulations. SRTsim not only maintains various expression characteristics of SRT data but also preserves spatial patterns. We illustrate the benefits of SRTsim in benchmarking methods for spatial clustering, spatial expression pattern detection, and cell-cell communication identification.
Affiliation(s)
- Jiaqiang Zhu
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Lulu Shang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.
|
26
|
Zhang X, Liu W, Song F, Liu J. iSC.MEB: an R package for multi-sample spatial clustering analysis of spatial transcriptomics data. Bioinformatics Advances 2023; 3:vbad019. [PMID: 36845201 PMCID: PMC9945056 DOI: 10.1093/bioadv/vbad019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 12/24/2022] [Accepted: 02/16/2023] [Indexed: 02/19/2023]
Abstract
Summary: Emerging spatially resolved transcriptomics (SRT) technologies are powerful in measuring gene expression profiles while retaining tissue spatial localization information and typically provide data from multiple tissue sections. We have previously developed the tool SC.MEB, an empirical Bayes approach for SRT data analysis using a hidden Markov random field. Here, we introduce an extension to SC.MEB, denoted as integrated spatial clustering with hidden Markov random field using empirical Bayes (iSC.MEB), that permits users to simultaneously estimate the batch effect and perform spatial clustering for low-dimensional representations of multiple SRT datasets. We demonstrate that iSC.MEB can provide accurate cell/domain detection results using two SRT datasets.
Availability and implementation: iSC.MEB is implemented in an open-source R package, and source code is freely available at https://github.com/XiaoZhangryy/iSC.MEB. Documentation and vignettes are provided on our package website (https://xiaozhangryy.github.io/iSC.MEB/index.html).
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Xiao Zhang
  - Centre for Quantitative Medicine Health Services & Systems Research, Duke-NUS Medical School, 169857 Singapore, Singapore
- Wei Liu
  - Centre for Quantitative Medicine Health Services & Systems Research, Duke-NUS Medical School, 169857 Singapore, Singapore
- Fangda Song
  - School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen 518172, Guangdong, China
- Jin Liu
  - To whom correspondence should be addressed.
|
27
|
Jeon H, Xie J, Jeon Y, Jung KJ, Gupta A, Chang W, Chung D. Statistical Power Analysis for Designing Bulk, Single-Cell, and Spatial Transcriptomics Experiments: Review, Tutorial, and Perspectives. Biomolecules 2023; 13:221. [PMID: 36830591 PMCID: PMC9952882 DOI: 10.3390/biom13020221] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/26/2022] [Revised: 01/20/2023] [Accepted: 01/21/2023] [Indexed: 01/26/2023] Open
Abstract
Gene expression profiling technologies have been used in various applications such as cancer biology. The development of gene expression profiling has expanded the scope of target discovery in transcriptomic studies, and each technology produces data with distinct characteristics. In order to guarantee biologically meaningful findings using transcriptomic experiments, it is important to consider various experimental factors in a systematic way through statistical power analysis. In this paper, we review and discuss the power analysis for three types of gene expression profiling technologies from a practical standpoint, including bulk RNA-seq, single-cell RNA-seq, and high-throughput spatial transcriptomics. Specifically, we describe the existing power analysis tools for each research objective for each of the bulk RNA-seq and scRNA-seq experiments, along with recommendations. On the other hand, since there are no power analysis tools for high-throughput spatial transcriptomics at this point, we instead investigate the factors that can influence power analysis.
Affiliation(s)
- Hyeongseon Jeon
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
- Juan Xie
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH 43210, USA
- Yeseul Jeon
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics and Data Science, Yonsei University, Seoul 03722, Republic of Korea
  - Department of Applied Statistics, Yonsei University, Seoul 03722, Republic of Korea
- Kyeong Joo Jung
  - Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, USA
- Arkobrato Gupta
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH 43210, USA
- Won Chang
  - Division of Statistics and Data Science, University of Cincinnati, Cincinnati, OH 45221, USA
- Dongjun Chung
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Ph.D. Program in Biostatistics, The Ohio State University, Columbus, OH 43210, USA
|
28
|
Liu W, Liao X, Luo Z, Yang Y, Lau MC, Jiao Y, Shi X, Zhai W, Ji H, Yeong J, Liu J. Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST. Nat Commun 2023; 14:296. [PMID: 36653349 PMCID: PMC9849443 DOI: 10.1038/s41467-023-35947-w] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 08/05/2022] [Accepted: 01/09/2023] [Indexed: 01/19/2023] Open
Abstract
Spatially resolved transcriptomics involves a set of emerging technologies that enable the transcriptomic profiling of tissues with the physical location of expressions. Although a variety of methods have been developed for data integration, most of them are for single-cell RNA-seq datasets without consideration of spatial information. Thus, methods that can integrate spatial transcriptomics data from multiple tissue slides, possibly from multiple individuals, are needed. Here, we present PRECAST, a data integration method for multiple spatial transcriptomics datasets with complex batch effects and/or biological effects between slides. PRECAST unifies spatial factor analysis simultaneously with spatial clustering and embedding alignment, while requiring only partially shared cell/domain clusters across datasets. Using both simulated and four real datasets, we show improved cell/domain detection with outstanding visualization, and the estimated aligned embeddings and cell/domain labels facilitate many downstream analyses. We demonstrate that PRECAST is computationally scalable and applicable to spatial transcriptomics datasets from different platforms.
Affiliation(s)
- Wei Liu
  - Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Xu Liao
  - Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Ziye Luo
  - Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore, Singapore
  - School of Statistics, Renmin University, Beijing, China
- Yi Yang
  - Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore, Singapore
- Mai Chan Lau
  - Institute of Molecular and Cell Biology (IMCB), Agency of Science, Technology and Research (A*STAR), Singapore, Singapore
- Yuling Jiao
  - School of Mathematics and Statistics, Wuhan University, Wuhan, China
- Xingjie Shi
  - Academy of Statistics and Interdisciplinary Sciences, East China Normal University, Shanghai, China
- Weiwei Zhai
  - Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Hongkai Ji
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Joe Yeong
  - Institute of Molecular and Cell Biology (IMCB), Agency of Science, Technology and Research (A*STAR), Singapore, Singapore
  - Department of Anatomical Pathology, Singapore General Hospital, Singapore, Singapore
- Jin Liu
  - Centre for Quantitative Medicine, Health Services & Systems Research, Duke-NUS Medical School, Singapore, Singapore.
  - School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen, China.
|
29
|
Shakola F, Palejev D, Ivanov I. A Framework for Comparison and Assessment of Synthetic RNA-Seq Data. Genes (Basel) 2022; 13:2362. [PMID: 36553629 PMCID: PMC9778097 DOI: 10.3390/genes13122362] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/09/2022] [Revised: 12/05/2022] [Accepted: 12/06/2022] [Indexed: 12/16/2022] Open
Abstract
The ever-growing number of methods for the generation of synthetic bulk and single cell RNA-seq data have multiple and diverse applications. They are often aimed at benchmarking bioinformatics algorithms for purposes such as sample classification, differential expression analysis, correlation and network studies and the optimization of data integration and normalization techniques. Here, we propose a general framework to compare synthetically generated RNA-seq data and select a data-generating tool that is suitable for a set of specific study goals. As there are multiple methods for synthetic RNA-seq data generation, researchers can use the proposed framework to make an informed choice of an RNA-seq data simulation algorithm and software that are best suited for their specific scientific questions of interest.
Affiliation(s)
- Felitsiya Shakola
  - GATE Institute, Sofia University, 125 Tsarigradsko Shosse, Bl. 2, 1113 Sofia, Bulgaria
- Dean Palejev
  - Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, Acad. G. Bonchev St., Bl. 8, 1113 Sofia, Bulgaria
- Ivan Ivanov
  - Department of Veterinary Physiology and Pharmacology, Texas A&M University, College Station, TX 77843, USA
|
30
|
Shang L, Zhou X. Spatially aware dimension reduction for spatial transcriptomics. Nat Commun 2022; 13:7203. [PMID: 36418351 PMCID: PMC9684472 DOI: 10.1038/s41467-022-34879-1] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Received: 03/10/2022] [Accepted: 11/10/2022] [Indexed: 11/27/2022] Open
Abstract
Spatial transcriptomics are a collection of genomic technologies that have enabled transcriptomic profiling on tissues with spatial localization information. Analyzing spatial transcriptomic data is computationally challenging, as the data collected from various spatial transcriptomic technologies are often noisy and display substantial spatial correlation across tissue locations. Here, we develop a spatially-aware dimension reduction method, SpatialPCA, that can extract a low dimensional representation of the spatial transcriptomics data with biological signal and preserved spatial correlation structure, thus unlocking many existing computational tools previously developed in single-cell RNAseq studies for tailored analysis of spatial transcriptomics. We illustrate the benefits of SpatialPCA for spatial domain detection and explore its utility for trajectory inference on the tissue and for high-resolution spatial map construction. In the real data applications, SpatialPCA identifies key molecular and immunological signatures in a detected tumor surrounding microenvironment, including a tertiary lymphoid structure that shapes the gradual transcriptomic transition during tumorigenesis and metastasis. In addition, SpatialPCA detects the past neuronal developmental history that underlies the current transcriptomic landscape across tissue locations in the cortex.
Affiliation(s)
- Lulu Shang
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.
|
31
|
Yu Q, Jiang M, Wu L. Spatial transcriptomics technology in cancer research. Front Oncol 2022; 12:1019111. [PMID: 36313703 PMCID: PMC9606570 DOI: 10.3389/fonc.2022.1019111] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 08/14/2022] [Accepted: 09/21/2022] [Indexed: 08/25/2023] Open
Abstract
In recent years, spatial transcriptomics (ST) technologies have developed rapidly and have been widely used in constructing spatial tissue atlases and characterizing spatiotemporal heterogeneity of cancers. Currently, ST has been used to profile spatial heterogeneity in multiple cancer types. In addition, ST is beneficial for identifying and comprehensively understanding special spatial areas such as the tumor interface and tertiary lymphoid structures (TLSs), which exhibit unique tumor microenvironments (TMEs). Therefore, ST has also shown great potential to improve pathological diagnosis and identify novel prognostic factors in cancer. This review presents recent advances in and prospects for applications of ST technologies in cancer research, as well as the challenges.
Affiliation(s)
- Qichao Yu
  - Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Miaomiao Jiang
  - Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China
- Liang Wu
  - Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China
|