1
Zhang C, Wang L, Shi Q. Computational modeling for deciphering tissue microenvironment heterogeneity from spatially resolved transcriptomics. Comput Struct Biotechnol J 2024; 23:2109-2115. [PMID: 38800634 PMCID: PMC11126885 DOI: 10.1016/j.csbj.2024.05.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 05/29/2024] Open
Abstract
Spatial transcriptomics techniques, while measuring gene expression, retain spatial location information, aiding in situ studies of organismal tissue architecture and the progression of pathological processes. These techniques generate vast amounts of omics data, necessitating the development of computational methods to reveal the underlying tissue microenvironment heterogeneity. The main directions in spatial transcriptomics data analysis are spatial domain detection and spatial deconvolution, which can identify spatial functional regions and parse the distribution of cell types in spatial transcriptomics data by integrating single-cell transcriptomics data. In these two research directions, many computational methods have been successively proposed. This article will categorize them into three types: machine learning-based methods, probabilistic models-based methods, and deep learning-based methods. It will list and discuss the representative algorithms of each type along with their advantages and disadvantages and describe the datasets and evaluation metrics used to assess these computational methods, facilitating researchers in selecting suitable computational methods according to their research needs. Finally, combining the latest technological developments and the advantages and disadvantages of current algorithms, this article will look forward to the future directions of computational method development.
Affiliation(s)
- Chuanchao Zhang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, Hangzhou 310024; University of Chinese Academy of Sciences, China
- Lequn Wang
- State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Qianqian Shi
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, Wuhan 430070, Hubei, China
2
Liu Y, Yang C. Computational methods for alignment and integration of spatially resolved transcriptomics data. Comput Struct Biotechnol J 2024; 23:1094-1105. [PMID: 38495555 PMCID: PMC10940867 DOI: 10.1016/j.csbj.2024.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Revised: 03/02/2024] [Accepted: 03/04/2024] [Indexed: 03/19/2024] Open
Abstract
Most of the complex biological regulatory activities occur in three dimensions (3D). To better analyze biological processes, it is essential not only to decipher the molecular information of numerous cells but also to understand how their spatial contexts influence their behavior. With the development of spatially resolved transcriptomics (SRT) technologies, SRT datasets are being generated to simultaneously characterize gene expression and spatial arrangement information within tissues, organs or organisms. To fully leverage spatial information, the focus extends beyond individual two-dimensional (2D) slices. Two tasks known as slices alignment and data integration have been introduced to establish correlations between multiple slices, enhancing the effectiveness of downstream tasks. Currently, numerous related methods have been developed. In this review, we first elucidate the details and principles behind several representative methods. Then we report the testing results of these methods on various SRT datasets, and assess their performance in representative downstream tasks. Insights into the strengths and weaknesses of each method and the reasons behind their performance are discussed. Finally, we provide an outlook on future developments. The codes and details of experiments are now publicly available at https://github.com/YangLabHKUST/SRT_alignment_and_integration.
Affiliation(s)
- Yuyao Liu
- Department of Automation, School of Information Science and Technology, Tsinghua University, Beijing, China
- Can Yang
- Department of Mathematics, The Hong Kong University of Science and Technology, Hong Kong, China
3
Tian T, Zhang J, Lin X, Wei Z, Hakonarson H. Dependency-aware deep generative models for multitasking analysis of spatial omics data. Nat Methods 2024; 21:1501-1513. [PMID: 38783067 DOI: 10.1038/s41592-024-02257-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2023] [Accepted: 03/25/2024] [Indexed: 05/25/2024]
Abstract
Spatially resolved transcriptomics (SRT) technologies have significantly advanced biomedical research, but their data analysis remains challenging due to the discrete nature of the data and the high levels of noise, compounded by complex spatial dependencies. Here, we propose spaVAE, a dependency-aware, deep generative spatial variational autoencoder model that probabilistically characterizes count data while capturing spatial correlations. spaVAE introduces a hybrid embedding combining a Gaussian process prior with a Gaussian prior to explicitly capture spatial correlations among spots. It then optimizes the parameters of deep neural networks to approximate the distributions underlying the SRT data. With the approximated distributions, spaVAE can contribute to several analytical tasks that are essential for SRT data analysis, including dimensionality reduction, visualization, clustering, batch integration, denoising, differential expression, spatial interpolation, resolution enhancement and identification of spatially variable genes. Moreover, we have extended spaVAE to spaPeakVAE and spaMultiVAE to characterize spatial ATAC-seq (assay for transposase-accessible chromatin using sequencing) data and spatial multi-omics data, respectively.
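To make the idea of a Gaussian process prior over spots concrete, the following minimal sketch builds a covariance matrix from spot coordinates, so that nearby spots receive strongly correlated latent values. The abstract does not name a kernel; the RBF kernel, the `length_scale` parameter, and the function name `rbf_kernel` are assumptions for illustration and are not taken from spaVAE.

```python
from math import dist, exp


def rbf_kernel(coords, length_scale=1.0):
    """Return K with K[i][j] = exp(-||x_i - x_j||^2 / (2 * length_scale^2)).

    Assumption: an RBF kernel stands in for whatever kernel the method uses.
    """
    return [
        [exp(-dist(a, b) ** 2 / (2 * length_scale ** 2)) for b in coords]
        for a in coords
    ]


# Three hypothetical spots: two adjacent, one far away.
spots = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
K = rbf_kernel(spots)

# Adjacent spots are strongly correlated under the prior; distant ones are not.
print(K[0][1] > K[0][2])
```

A larger `length_scale` would spread correlation over longer distances, which is the usual knob for how smooth the spatial component is assumed to be.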
Affiliation(s)
- Tian Tian
- School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, Hubei, China
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Jie Zhang
- National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, Jiangsu, China
- Xiang Lin
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
- Zhi Wei
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA.
- Hakon Hakonarson
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
4
Affiliation(s)
- Vipul Singhal
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore, Singapore, Republic of Singapore.
- Nigel Chou
- Spatial and Single Cell Systems Domain, Genome Institute of Singapore, Singapore, Republic of Singapore
5
Zhang YZ, Imoto S. Genome analysis through image processing with deep learning models. J Hum Genet 2024:10.1038/s10038-024-01275-0. [PMID: 39085457 DOI: 10.1038/s10038-024-01275-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 07/08/2024] [Accepted: 07/08/2024] [Indexed: 08/02/2024]
Abstract
Genomic sequences are traditionally represented as strings of characters: A (adenine), C (cytosine), G (guanine), and T (thymine). However, an alternative approach involves depicting sequence-related information through image representations, such as Chaos Game Representation (CGR) and read pileup images. With rapid advancements in deep learning (DL) methods within computer vision and natural language processing, there is growing interest in applying image-based DL methods to genomic sequence analysis. These methods involve encoding genomic information as images or integrating spatial information from images into the analytical process. In this review, we summarize three typical applications that use image processing with DL models for genome analysis. We examine the utilization and advantages of these image-based approaches.
Affiliation(s)
- Yao-Zhong Zhang
- Division of Health Medical Intelligence, Human Genome Center, the Institute of Medical Science, the University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan.
- Seiya Imoto
- Division of Health Medical Intelligence, Human Genome Center, the Institute of Medical Science, the University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Japan.
6
Zhang B, Zhang S, Zhang S. Whole brain alignment of spatial transcriptomics between humans and mice with BrainAlign. Nat Commun 2024; 15:6302. [PMID: 39080277 PMCID: PMC11289418 DOI: 10.1038/s41467-024-50608-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2024] [Accepted: 07/10/2024] [Indexed: 08/02/2024] Open
Abstract
The increasing utilization of mouse models in human neuroscience research places higher demands on computational methods to translate findings from the mouse brain to the human one. In this study, we develop BrainAlign, a self-supervised learning approach, for the whole brain alignment of spatial transcriptomics (ST) between humans and mice. BrainAlign encodes spots and genes simultaneously in two separated shared embedding spaces by a heterogeneous graph neural network. We demonstrate that BrainAlign could integrate cross-species spots into the embedding space and reveal the conserved brain regions supported by ST information, which facilitates the detection of homologous regions between humans and mice. Genomic analysis further presents gene expression connections between humans and mice and reveals similar expression patterns for marker genes. Moreover, BrainAlign can accurately map spatially similar homologous regions or clusters onto a unified spatial structural domain while preserving their relative positions.
Affiliation(s)
- Biao Zhang
- School of Mathematical Sciences, Fudan University, Shanghai, China
- Shuqin Zhang
- School of Mathematical Sciences, Fudan University, Shanghai, China.
- Key Laboratory of Mathematics for Nonlinear Science, Fudan University, Ministry of Education, Shanghai, China.
- Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai, China.
- Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China.
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, China.
7
Budhkar A, Tang Z, Liu X, Zhang X, Su J, Song Q. xSiGra: explainable model for single-cell spatial data elucidation. Brief Bioinform 2024; 25:bbae388. [PMID: 39120644 PMCID: PMC11312371 DOI: 10.1093/bib/bbae388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 06/22/2024] [Accepted: 07/23/2024] [Indexed: 08/10/2024] Open
Abstract
Recent advancements in spatial imaging technologies have revolutionized the acquisition of high-resolution multichannel images, gene expressions, and spatial locations at the single-cell level. Our study introduces xSiGra, an interpretable graph-based AI model, designed to elucidate interpretable features of identified spatial cell types, by harnessing multimodal features from spatial imaging technologies. By constructing a spatial cellular graph with immunohistology images and gene expression as node attributes, xSiGra employs hybrid graph transformer models to delineate spatial cell types. Additionally, xSiGra integrates a novel variant of the gradient-weighted class activation mapping component to uncover interpretable features, including pivotal genes and cells for various cell types, thereby facilitating deeper biological insights from spatial data. Through rigorous benchmarking against existing methods, xSiGra demonstrates superior performance across diverse spatial imaging datasets. Application of xSiGra on a lung tumor slice unveils the importance score of cells, illustrating that the activity of a cell is not solely determined by the cell itself but is also impacted by neighboring cells. Moreover, leveraging the identified interpretable genes, xSiGra reveals an endothelial cell subset interacting with tumor cells, indicating its heterogeneous underlying mechanisms within complex cellular interactions.
Affiliation(s)
- Aishwarya Budhkar
- Luddy School of Informatics, Computing, and Engineering, Indiana University Bloomington, 107 S Indiana Ave, Bloomington, IN 47405, United States
- Ziyang Tang
- Department of Computer and Information Technology, Purdue University, 610 Purdue Mall, West Lafayette, IN 47907, United States
- Xiang Liu
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, 340 W 10th St, Indianapolis, IN 46202, United States
- Xuhong Zhang
- Luddy School of Informatics, Computing, and Engineering, Indiana University Bloomington, 107 S Indiana Ave, Bloomington, IN 47405, United States
- Jing Su
- Department of Biostatistics and Health Data Science, Indiana University School of Medicine, 340 W 10th St, Indianapolis, IN 46202, United States
- Gerontology and Geriatric Medicine, Wake Forest School of Medicine, 475 Vine St, Winston-Salem, NC 27101, United States
- Qianqian Song
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32611, United States
8
Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Science China Life Sciences 2024:10.1007/s11427-023-2561-0. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how the genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have experienced rapid advancements in recent years. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China.
- Jing-Dong J Han
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China.
- Qi Liu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China.
- Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
- Chen Li
- Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China.
- Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Tieliu Shi
- Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China.
- Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China.
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China.
9
Chang Z, Xu Y, Dong X, Gao Y, Wang C. Single-cell and spatial multiomic inference of gene regulatory networks using SCRIPro. Bioinformatics 2024; 40:btae466. [PMID: 39024032 PMCID: PMC11288411 DOI: 10.1093/bioinformatics/btae466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 06/05/2024] [Accepted: 07/17/2024] [Indexed: 07/20/2024] Open
Abstract
MOTIVATION: The burgeoning generation of single-cell or spatial multiomic data allows for the characterization of gene regulation networks (GRNs) at an unprecedented resolution. However, the accurate reconstruction of GRNs from sparse and noisy single-cell or spatial multiomic data remains challenging.
RESULTS: Here, we present SCRIPro, a comprehensive computational framework that robustly infers GRNs for both single-cell and spatial multi-omics data. SCRIPro first improves sample coverage through a density clustering approach based on multiomic and spatial similarities. Additionally, SCRIPro scans transcriptional regulator (TR) importance by performing chromatin reconstruction and in silico deletion analyses using a comprehensive reference covering 1,292 human and 994 mouse TRs. Finally, SCRIPro combines TR-target importance scores derived from multiomic data with TR-target expression levels to ensure precise GRN reconstruction. We benchmarked SCRIPro on various datasets, including single-cell multiomic data from human B-cell lymphoma, mouse hair follicle development, Stereo-seq of mouse embryos, and Spatial-ATAC-RNA from mouse brain. SCRIPro outperforms existing motif-based methods and accurately reconstructs cell type-specific, stage-specific, and region-specific GRNs. Overall, SCRIPro emerges as a streamlined and fast method capable of reconstructing TR activities and GRNs for both single-cell and spatial multi-omic data.
AVAILABILITY: SCRIPro is available at https://github.com/wanglabtongji/SCRIPro.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhanhe Chang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Yunfan Xu
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- Xin Dong
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- Yawei Gao
- Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Chenfei Wang
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- National Key Laboratory of Autonomous Intelligent Unmanned Systems, Tongji University, Shanghai 200120, China
- Frontier Science Center for Intelligent Autonomous Systems, Tongji University, Shanghai 200120, China
10
Jackson KC, Booeshaghi AS, Gálvez-Merchán Á, Moses L, Chari T, Kim A, Pachter L. Identification of spatial homogeneous regions in tissues with concordex. bioRxiv 2024:2023.06.28.546949. [PMID: 39071320 PMCID: PMC11275758 DOI: 10.1101/2023.06.28.546949] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/30/2024]
Abstract
Spatial homogeneous regions (SHRs) in tissues are domains that are homogeneous with respect to cell type composition. We present a method for identifying SHRs using spatial transcriptomics data, and demonstrate that it is efficient and effective at finding SHRs for a wide variety of tissue types. The method is implemented in a tool called concordex, which relies on analysis of k-nearest-neighbor (kNN) graphs. The concordex tool is also useful for analysis of non-spatial transcriptomics data, and can elucidate the extent of concordance between partitions of cells derived from clustering algorithms, and transcriptomic similarity as represented in kNN graphs.
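One way to read "homogeneous with respect to cell type composition" in terms of a kNN graph is to summarize, for each cell, the label mix of its k nearest spatial neighbors. The sketch below does this by brute force on toy data. It is not the concordex implementation; the function name `knn_label_composition`, the Euclidean metric, and the toy coordinates are assumptions made for illustration.

```python
from collections import Counter
from math import dist


def knn_label_composition(coords, labels, k):
    """For each cell, return the fraction of its k nearest neighbors per label."""
    classes = sorted(set(labels))
    rows = []
    for i, point in enumerate(coords):
        # Brute-force kNN: rank every other cell by Euclidean distance.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: dist(point, coords[j]),
        )
        counts = Counter(labels[j] for j in others[:k])
        rows.append([counts[c] / k for c in classes])
    return classes, rows


# Toy data: two well-separated groups of three cells each.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ["A", "A", "A", "B", "B", "B"]
classes, rows = knn_label_composition(coords, labels, k=2)

for label, row in zip(labels, rows):
    print(label, dict(zip(classes, row)))
```

Cells whose rows look alike share a similar neighborhood composition, so clustering these rows (rather than raw expression) is one plausible route to regions of the kind the abstract describes.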
Affiliation(s)
- Kayla C Jackson
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- A Sina Booeshaghi
- Department of Bioengineering, University of California, Berkeley, CA, USA
- Lambda Moses
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Tara Chari
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Lior Pachter
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
11
Zhang Y, Yu Z, Wong KC, Li X. Unraveling Spatial Domain Characterization in Spatially Resolved Transcriptomics with Robust Graph Contrastive Clustering. Bioinformatics 2024; 40:btae451. [PMID: 39012523 PMCID: PMC11272174 DOI: 10.1093/bioinformatics/btae451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Revised: 06/12/2024] [Accepted: 07/12/2024] [Indexed: 07/17/2024] Open
Abstract
MOTIVATION: Spatial transcriptomics can quantify gene expression and its spatial distribution in tissues, thus revealing molecular mechanisms of cellular interactions underlying tissue heterogeneity, tissue regeneration, and spatially localized disease mechanisms. However, existing spatial clustering methods often fail to exploit the full potential of spatial information, resulting in inaccurate identification of spatial domains.
RESULTS: In this paper, we develop a deep graph contrastive clustering framework, stDGCC, that accurately uncovers underlying spatial domains via explicitly modeling spatial information and gene expression profiles from spatial transcriptomics data. The stDGCC framework proposes a spatially informed graph node embedding model to preserve the topological information of spots and to learn the informative and discriminative characterization of spatial transcriptomics data through self-supervised contrastive learning. By simultaneously optimizing the contrastive learning loss, reconstruction loss, and Kullback-Leibler (KL) divergence loss, stDGCC achieves joint optimization of feature learning and topology structure preservation in an end-to-end manner. We validate the effectiveness of stDGCC on various spatial transcriptomics datasets acquired from different platforms, each with varying spatial resolutions. Our extensive experiments demonstrate the superiority of stDGCC over various state-of-the-art clustering methods in accurately identifying cellular-level biological structures.
AVAILABILITY: Code and data are available from https://github.com/TimE9527/stDGCC and https://figshare.com/projects/stDGCC/186525.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yingxi Zhang
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Zhuohan Yu
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong 999077, Hong Kong SAR
| | - Xiangtao Li
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| |
Collapse
|
12
|
Liu Y, Chen J, Lin C, Ke R. Multiplexed in situ RNA imaging by combFISH. Anal Bioanal Chem 2024; 416:3765-3774. [PMID: 38775954 DOI: 10.1007/s00216-024-05327-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Revised: 04/28/2024] [Accepted: 04/30/2024] [Indexed: 06/18/2024]
Abstract
Multiplexed in situ RNA imaging offers new opportunities for gene expression profiling by providing high-throughput spatial information. In this work, we present a cyclic combinatorial fluorescent in situ hybridization (combFISH) assay to achieve multiplexed detection of RNA in cell cultures and tissues. Specifically, multiplexing is achieved through cyclic interrogation of barcode sequences on the rolling circle amplicons generated from the padlock probe assay by using sets of combinatorial detection probes. Theoretically, combFISH can detect 64 genes in three hybridization cycles by combinatorial barcoding using 12 fluorescently labeled detection probes. Our method eliminates sequencing-by-ligation (SBL) chemistry in the in situ sequencing protocol and directly uses RNA as targets for ligation, making it more straightforward. We showed that our method works in fresh-frozen and formalin-fixed paraffin-embedded tissue sections. With its straightforward protocols, we expect our method to be adopted by the scientific community and extended to clinical settings.
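The capacity figure quoted in the combFISH abstract (64 genes, three cycles, 12 detection probes) can be checked with a short calculation. The sketch below assumes four distinguishable detection probes (colors) per hybridization cycle, an assumption chosen only because 4 x 3 = 12 probes and 4^3 = 64 barcodes match the abstract's numbers; the actual probe design may differ, and the names `barcode_capacity` and `build_codebook` are hypothetical.

```python
from itertools import product


def barcode_capacity(n_colors: int, n_cycles: int) -> int:
    """Number of distinct barcodes when each cycle reads one of n_colors signals."""
    return n_colors ** n_cycles


def build_codebook(colors, n_cycles):
    """Enumerate every color sequence of length n_cycles (one candidate barcode each)."""
    return list(product(colors, repeat=n_cycles))


# Assumed layout (not stated in the abstract): 4 colors per cycle, 3 cycles.
colors = ["c1", "c2", "c3", "c4"]
n_cycles = 3

codebook = build_codebook(colors, n_cycles)
n_probes = len(colors) * n_cycles  # one labeled detection probe per color per cycle

print(barcode_capacity(len(colors), n_cycles), "barcodes from", n_probes, "probes")
```

Under this assumption the number of labeled probes grows linearly with the number of cycles while the number of addressable genes grows exponentially, which is the point of combinatorial barcoding.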
Affiliation(s)
- Yanxiu Liu
  - School of Medicine, Huaqiao University, Xiamen, 361021, Fujian, China
  - School of Biomedical Sciences, Huaqiao University, Xiamen, 361021, Fujian, China
- Jiayu Chen
  - School of Medicine, Huaqiao University, Xiamen, 361021, Fujian, China
  - School of Biomedical Sciences, Huaqiao University, Xiamen, 361021, Fujian, China
- Chen Lin
  - School of Medicine, Huaqiao University, Xiamen, 361021, Fujian, China.
  - School of Biomedical Sciences, Huaqiao University, Xiamen, 361021, Fujian, China.
- Rongqin Ke
  - School of Medicine, Huaqiao University, Xiamen, 361021, Fujian, China.
  - School of Biomedical Sciences, Huaqiao University, Xiamen, 361021, Fujian, China.
|
13
|
Ma Y, Zhou X. Accurate and efficient integrative reference-informed spatial domain detection for spatial transcriptomics. Nat Methods 2024; 21:1231-1244. [PMID: 38844627 DOI: 10.1038/s41592-024-02284-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 04/18/2024] [Indexed: 06/23/2024]
Abstract
Spatially resolved transcriptomics (SRT) studies are becoming increasingly common and large, offering unprecedented opportunities in mapping complex tissue structures and functions. Here we present integrative and reference-informed tissue segmentation (IRIS), a computational method designed to characterize tissue spatial organization in SRT studies through accurately and efficiently detecting spatial domains. IRIS uniquely leverages single-cell RNA sequencing data for reference-informed detection of biologically interpretable spatial domains, integrating multiple SRT slices while explicitly considering correlations both within and across slices. We demonstrate the advantages of IRIS through in-depth analysis of six SRT datasets encompassing diverse technologies, tissues, species and resolutions. In these applications, IRIS achieves substantial accuracy gains (39-1,083%) and speed improvements (4.6-666.0×) in moderate-sized datasets, while representing the only method applicable for large datasets including Stereo-seq and 10x Xenium. As a result, IRIS reveals intricate brain structures, uncovers tumor microenvironment heterogeneity and detects structural changes in diabetes-affected testis, all with exceptional speed and accuracy.
Affiliation(s)
- Ying Ma
  - Department of Biostatistics, Brown University, Providence, RI, USA
  - Center for Computational Molecular Biology, Brown University, Providence, RI, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA.
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA.
|
14
|
Chen R, Nie P, Wang J, Wang GZ. Deciphering brain cellular and behavioral mechanisms: Insights from single-cell and spatial RNA sequencing. Wiley Interdisciplinary Reviews. RNA 2024; 15:e1865. [PMID: 38972934 DOI: 10.1002/wrna.1865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 05/05/2024] [Accepted: 05/14/2024] [Indexed: 07/09/2024]
Abstract
The brain is a complex computing system composed of a multitude of interacting neurons. The computational outputs of this system determine the behavior and perception of every individual. Each brain cell expresses thousands of genes that dictate the cell's function and physiological properties. Therefore, deciphering the molecular expression of each cell is of great significance for understanding its characteristics and role in brain function. Additionally, the positional information of each cell can provide crucial insights into their involvement in local brain circuits. In this review, we briefly overview the principles of single-cell RNA sequencing and spatial transcriptomics, the potential issues and challenges in their data processing, and their applications in brain research. We further outline several promising directions in neuroscience that could be integrated with single-cell RNA sequencing, including neurodevelopment, the identification of novel brain microstructures, cognition and behavior, neuronal cell positioning, molecules and cells related to advanced brain functions, sleep-wake cycles/circadian rhythms, and computational modeling of brain function. We believe that the deep integration of these directions with single-cell and spatial RNA sequencing can contribute significantly to understanding the roles of individual cells or cell types in these specific functions, thereby making important contributions to addressing critical questions in those fields. This article is categorized under: RNA Evolution and Genomics > Computational Analyses of RNA; RNA in Disease and Development > RNA in Development; RNA in Disease and Development > RNA in Disease.
Affiliation(s)
- Renrui Chen
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Pengxing Nie
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Jing Wang
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Guang-Zhong Wang
  - CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
|
15
|
Si Y, Zou J, Gao Y, Chuai G, Liu Q, Chen L. Foundation models in molecular biology. Biophysics Reports 2024; 10:135-151. [PMID: 39027316 PMCID: PMC11252241 DOI: 10.52601/bpr.2024.240006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 03/04/2024] [Indexed: 07/20/2024] Open
Abstract
Determining correlations between molecules at various levels is an important topic in molecular biology. Large language models have demonstrated a remarkable ability to capture correlations from large amounts of data in the field of natural language processing as well as image generation, and correlations captured from data using large language models can also be applicable to solving a wide range of specific tasks, hence large language models are also referred to as foundation models. The massive amount of data that exists in the field of molecular biology provides an excellent basis for the development of foundation models, and the recent emergence of foundation models in the field of molecular biology has really pushed the entire field forward. We summarize the foundation models developed based on RNA sequence data, DNA sequence data, protein sequence data, single-cell transcriptome data, and spatial transcriptome data respectively, and further discuss the research directions for the development of foundation models in molecular biology.
Affiliation(s)
- Yunda Si
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
- Jiawei Zou
  - Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China
- Yicheng Gao
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
- Guohui Chuai
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
- Qi Liu
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai 201804, China
- Luonan Chen
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
  - Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai 200031, China
|
16
|
Broadbent C, Song T, Kuang R. Deciphering high-order structures in spatial transcriptomes with graph-guided Tucker decomposition. Bioinformatics 2024; 40:i529-i538. [PMID: 38940176 PMCID: PMC11256919 DOI: 10.1093/bioinformatics/btae245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
Spatial transcriptome (ST) profiling can reveal cells' structural organizations and functional roles in tissues. However, deciphering the spatial context of gene expressions in ST data is a challenge: the high-order structure hiding in whole transcriptome space over 2D/3D spatial coordinates requires modeling and detection of interpretable high-order elements and components for further functional analysis and interpretation. This paper presents a new method, GraphTucker, a graph-regularized Tucker tensor decomposition for learning high-order factorization in ST data. GraphTucker is based on a nonnegative Tucker decomposition algorithm regularized by a high-order graph that captures spatial relation among spots and functional relation among genes. In the experiments on several Visium and Stereo-seq datasets, the novelty and advantage of modeling multiway multilinear relationships among the components in Tucker decomposition are demonstrated as opposed to the Canonical Polyadic Decomposition and conventional matrix factorization models by evaluation of detecting spatial components of gene modules, clustering spatial coefficients for tissue segmentation and imputing complete spatial transcriptomes. The results of visualization show strong evidence that GraphTucker detects more interpretable spatial components in the context of the spatial domains in the tissues. AVAILABILITY AND IMPLEMENTATION: https://github.com/kuanglab/GraphTucker.
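For readers unfamiliar with the Tucker model that GraphTucker builds on, the following sketch shows only the reconstruction step of a plain three-way nonnegative Tucker factorization. It is not the GraphTucker algorithm: the graph regularizer and the fitting procedure are omitted, and the tensor sizes, ranks, and the function name `tucker_reconstruct` are hypothetical choices for illustration.

```python
import numpy as np


def tucker_reconstruct(core, factors):
    """Rebuild a 3-way tensor from a core tensor and three factor matrices."""
    a, b, c = factors
    # X[i, j, k] = sum over p, q, r of core[p, q, r] * a[i, p] * b[j, q] * c[k, r]
    return np.einsum("pqr,ip,jq,kr->ijk", core, a, b, c)


rng = np.random.default_rng(0)

# Hypothetical sizes: a 6 x 5 grid of spots with 20 genes, multilinear ranks (3, 2, 4).
shape, ranks = (6, 5, 20), (3, 2, 4)
core = rng.random(ranks)                                       # nonnegative core tensor
factors = [rng.random((n, r)) for n, r in zip(shape, ranks)]   # nonnegative factors

x_hat = tucker_reconstruct(core, factors)
print(x_hat.shape)
```

The dense core is what lets every combination of spatial and gene components interact. Canonical Polyadic Decomposition corresponds to the special case in which the core is superdiagonal with equal ranks in every mode, which is the contrast the abstract draws.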
Affiliation(s)
- Charles Broadbent
  - Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, 55455, United States
- Tianci Song
  - Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, 55455, United States
- Rui Kuang
  - Department of Computer Science and Engineering, University of Minnesota Twin Cities, Minneapolis, MN, 55455, United States
|
17
|
Long Y, Ang KS, Sethi R, Liao S, Heng Y, van Olst L, Ye S, Zhong C, Xu H, Zhang D, Kwok I, Husna N, Jian M, Ng LG, Chen A, Gascoigne NRJ, Gate D, Fan R, Xu X, Chen J. Deciphering spatial domains from spatial multi-omics with SpatialGlue. Nat Methods 2024:10.1038/s41592-024-02316-4. [PMID: 38907114 DOI: 10.1038/s41592-024-02316-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 05/20/2024] [Indexed: 06/23/2024]
Abstract
Advances in spatial omics technologies now allow multiple types of data to be acquired from the same tissue slice. To realize the full potential of such data, we need spatially informed methods for data integration. Here, we introduce SpatialGlue, a graph neural network model with a dual-attention mechanism that deciphers spatial domains by intra-omics integration of spatial location and omics measurement followed by cross-omics integration. We demonstrated SpatialGlue on data acquired from different tissue types using different technologies, including spatial epigenome-transcriptome and transcriptome-proteome modalities. Compared to other methods, SpatialGlue captured more anatomical details and more accurately resolved spatial domains such as the cortex layers of the brain. Our method also identified cell types like spleen macrophage subsets located at three different zones that were not available in the original data annotations. SpatialGlue scales well with data size and can be used to integrate three modalities. Our spatial multi-omics analysis tool combines the information from complementary omics modalities to obtain a holistic view of cellular and tissue properties.
Affiliation(s)
- Yahui Long
  - Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Kok Siong Ang
  - Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Raman Sethi
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Sha Liao
  - BGI-Shenzhen, Shenzhen, China
  - BGI Research-Southwest, BGI, Chongqing, China
- Yang Heng
  - BGI-Shenzhen, Shenzhen, China
  - BGI Research-Southwest, BGI, Chongqing, China
- Lynn van Olst
  - The Ken & Ruth Davee Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Shuchen Ye
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Chengwei Zhong
  - Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Hang Xu
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Di Zhang
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Immanuel Kwok
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Nazihah Husna
  - Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - Immunology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
  - Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Min Jian
  - BGI-Shenzhen, Shenzhen, China
  - BGI Research Asia-Pacific, BGI, Singapore, Singapore
- Lai Guan Ng
  - Shanghai Immune Therapy Institute, Shanghai Jiao Tong University School of Medicine Affiliated Renji Hospital, Shanghai, China
- Ao Chen
  - BGI-Shenzhen, Shenzhen, China
  - BGI Research-Southwest, BGI, Chongqing, China
  - JFL-BGI STOmics Center, Jinfeng Laboratory, Chongqing, China
- Nicholas R J Gascoigne
  - Immunology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
  - Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
  - Cancer Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- David Gate
  - The Ken & Ruth Davee Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Rong Fan
  - Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Xun Xu
  - BGI-Shenzhen, Shenzhen, China
- Jinmiao Chen
  - Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore.
  - Immunology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
  - Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
  - Center for Computational Biology and Program in Cancer and Stem Cell Biology, Duke-NUS Medical School, Singapore, Singapore.
|
18
|
Jin Y, Zuo Y, Li G, Liu W, Pan Y, Fan T, Fu X, Yao X, Peng Y. Advances in spatial transcriptomics and its applications in cancer research. Mol Cancer 2024; 23:129. [PMID: 38902727 PMCID: PMC11188176 DOI: 10.1186/s12943-024-02040-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2024] [Accepted: 06/10/2024] [Indexed: 06/22/2024] Open
Abstract
Malignant tumors have increasing morbidity and high mortality, and their occurrence and development is a complicated process. The development of sequencing technologies has enabled us to gain a better understanding of the underlying genetic and molecular mechanisms in tumors. In recent years, spatial transcriptomics sequencing technologies have developed rapidly and allow the quantification and illustration of gene expression in the spatial context of tissues. Compared with traditional transcriptomics technologies, spatial transcriptomics technologies not only detect gene expression levels in cells, but also reveal the spatial location of genes within tissues, the cell composition of biological tissues, and the interactions between cells. Here we summarize the development of spatial transcriptomics technologies, spatial transcriptomics tools, and their applications in cancer research. We also discuss the limitations and challenges of current spatial transcriptomics approaches, as well as future development and prospects.
Affiliation(s)
- Yang Jin
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Yuanli Zuo
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Gang Li
  - Department of Thoracic Surgery, The Public Health Clinical Center of Chengdu, Chengdu, 610061, China
- Wenrong Liu
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Yitong Pan
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Ting Fan
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Xin Fu
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China
- Xiaojun Yao
  - Department of Thoracic Surgery, The Public Health Clinical Center of Chengdu, Chengdu, 610061, China.
- Yong Peng
  - Laboratory of Molecular Oncology, Frontiers Science Center for Disease-related Molecular Network, State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, 610041, China.
  - Frontier Medical Center, Tianfu Jincheng Laboratory, Chengdu, 610212, China.
|
19
|
Li Y, Zhang J, Gao X, Zhang QC. Tissue module discovery in single-cell-resolution spatial transcriptomics data via cell-cell interaction-aware cell embedding. Cell Syst 2024; 15:578-592.e7. [PMID: 38823396 DOI: 10.1016/j.cels.2024.05.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 01/08/2024] [Accepted: 05/07/2024] [Indexed: 06/03/2024]
Abstract
Computational methods are desired for single-cell-resolution spatial transcriptomics (ST) data analysis to uncover spatial organization principles for how individual cells exert tissue-specific functions. Here, we present ST data analysis via interaction-aware cell embedding (SPACE), a deep-learning method for cell-type identification and tissue module discovery from single-cell-resolution ST data by learning a cell representation that captures its gene expression profile and interactions with its spatial neighbors. SPACE identified spatially informed cell subtypes defined by their special spatial distribution patterns and distinct proximal-interacting cell types. SPACE also automatically discovered "cell communities", that is, tissue modules with discernible boundaries and a uniform spatial distribution of constituent cell types. For each cell community, SPACE outputs a characteristic proximal cell-cell interaction network associated with physiological processes, which can be used to refine ligand-receptor-based intercellular signaling analyses. We envision that SPACE can be used in large-scale ST projects to understand how proximal cell-cell interactions contribute to emergent biological functions within cell communities. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Yuzhe Li
  - MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China; Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Jinsong Zhang
  - MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China; Tsinghua-Peking Center for Life Sciences, Beijing 100084, China; Shanghai Qi Zhi Institute, Shanghai 200030, China
- Xin Gao
  - Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia; BioMap, Beijing 100086, China.
- Qiangfeng Cliff Zhang
  - MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology & Frontier Research Center for Biological Structure, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China; Tsinghua-Peking Center for Life Sciences, Beijing 100084, China.
|
20
|
Lee AJ, Yao S, Lusk N, Ng L, Kunst M, Zeng H, Tasic B, Abbasi-Asl R. Data-driven fine-grained region discovery in the mouse brain with transformers. bioRxiv: The Preprint Server for Biology 2024:2024.05.05.592608. [PMID: 38766132 PMCID: PMC11100623 DOI: 10.1101/2024.05.05.592608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Technologies such as spatial transcriptomics offer unique opportunities to define the spatial organization of the mouse brain. We developed an unsupervised training scheme and novel transformer-based deep learning architecture to detect spatial domains across the whole mouse brain using spatial transcriptomics data. Our model learns local representations of molecular and cellular statistical patterns which can be clustered to identify spatial domains within the brain from coarse to fine-grained. Discovered domains are spatially regular, even with several hundreds of spatial clusters. They are also consistent with existing anatomical ontologies such as the Allen Mouse Brain Common Coordinate Framework version 3 (CCFv3) and can be visually interpreted at the cell type or transcript level. We demonstrate our method can be used to identify previously uncatalogued subregions, such as in the midbrain, where we uncover gradients of inhibitory neuron complexity and abundance. Notably, these subregions cannot be discovered using other methods. We apply our method to a separate multi-animal whole-brain spatial transcriptomic dataset and show that our method can also robustly integrate spatial domains across animals.
Affiliation(s)
- Alex J. Lee
  - University of California, San Francisco
  - Weill Institute for Neurosciences
- Lydia Ng
  - Allen Institute for Brain Science
- Reza Abbasi-Asl
  - University of California, San Francisco
  - Weill Institute for Neurosciences
|
21
|
Zuo C, Xia J, Chen L. Dissecting tumor microenvironment from spatially resolved transcriptomics data by heterogeneous graph learning. Nat Commun 2024; 15:5057. [PMID: 38871687 DOI: 10.1038/s41467-024-49171-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Accepted: 05/22/2024] [Indexed: 06/15/2024] Open
Abstract
Spatially resolved transcriptomics (SRT) has enabled precise dissection of the tumor microenvironment (TME) by analyzing its intracellular molecular networks and intercellular cell-cell communication (CCC). However, the lack of computational exploration of the complicated relations between cells, genes, and histological regions severely limits the ability to interpret the complex structure of the TME. Here, we introduce stKeep, a heterogeneous graph (HG) learning method that integrates multimodality and gene-gene interactions to unravel the TME from SRT data. stKeep leverages HG to learn both cell-modules and gene-modules by incorporating features of diverse nodes including genes, cells, and histological regions, which allows for identifying finer cell-states within the TME and cell-state-specific gene-gene relations, respectively. Furthermore, stKeep employs HG to infer CCC for each cell, while ensuring that learned CCC patterns are comparable across different cell-states through contrastive learning. In various cancer samples, stKeep outperforms other tools in dissecting the TME, such as detecting bi-potent basal populations, neoplastic myoepithelial cells, and metastatic cells distributed within the tumor or leading-edge regions. Notably, stKeep identifies key transcription factors, ligands, and receptors relevant to disease progression, which are further validated by the functional and survival analysis of independent clinical data, thereby highlighting its clinical prognostic and immunotherapy applications.
Affiliation(s)
- Chunman Zuo
  - Institute of Artificial Intelligence, Shanghai Engineering Research Center of Industrial Big Data and Intelligent System, Donghua University, Shanghai, 201620, China.
  - Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130022, China.
- Junjie Xia
  - Institute of Artificial Intelligence, Shanghai Engineering Research Center of Industrial Big Data and Intelligent System, Donghua University, Shanghai, 201620, China
  - Department of Applied Mathematics, Donghua University, Shanghai, 201620, China
- Luonan Chen
  - Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai, 200031, China.
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, 310024, China.
  - West China Biomedical Big Data Center, Med-X center for informatics, West China Hospital, Sichuan University, Chengdu, 610041, China.
|
22
|
Qian J, Bao H, Shao X, Fang Y, Liao J, Chen Z, Li C, Guo W, Hu Y, Li A, Yao Y, Fan X, Cheng Y. Simulating multiple variability in spatially resolved transcriptomics with scCube. Nat Commun 2024; 15:5021. [PMID: 38866768 PMCID: PMC11169532 DOI: 10.1038/s41467-024-49445-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 06/03/2024] [Indexed: 06/14/2024] Open
Abstract
A pressing challenge in spatially resolved transcriptomics (SRT) is to benchmark the computational methods. A widely used approach involves utilizing simulated data. However, biases exist in the currently available simulated SRT data, which seriously affect the accuracy of method evaluation and validation. Herein, we present scCube (https://github.com/ZJUFanLab/scCube), a Python package for independent, reproducible, and technology-diverse simulation of SRT data. scCube not only enables the preservation of spatial expression patterns of genes in reference-based simulations, but also generates simulated data with different spatial variability (covering the spatial pattern type, the resolution, the spot arrangement, the targeted gene type, and the tissue slice dimension, etc.) in reference-free simulations. We comprehensively benchmark scCube with existing single-cell or SRT simulators, and demonstrate the utility of scCube in benchmarking spot deconvolution, gene imputation, and resolution enhancement methods in detail through three applications.
Affiliation(s)
- Jingyang Qian
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Hudong Bao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Xin Shao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Yin Fang
- College of Computer Science and Technology, Zhejiang University, Hangzhou, 310013, China
| | - Jie Liao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Zhuo Chen
- College of Computer Science and Technology, Zhejiang University, Hangzhou, 310013, China
| | - Chengyu Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Wenbo Guo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Yining Hu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Anyao Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Yue Yao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China
| | - Xiaohui Fan
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China.
- Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China.
| | - Yiyu Cheng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China.
- National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, 314100, Jiaxing, China.
| |
|
23
|
Lin S, Cui Y, Zhao F, Yang Z, Song J, Yao J, Zhao Y, Qian BZ, Zhao Y, Yuan Z. Complete spatially resolved gene expression is not necessary for identifying spatial domains. Cell Genomics 2024; 4:100565. [PMID: 38781966 PMCID: PMC11228956 DOI: 10.1016/j.xgen.2024.100565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Revised: 02/29/2024] [Accepted: 04/30/2024] [Indexed: 05/25/2024]
Abstract
Spatially resolved transcriptomics (SRT) technologies have revolutionized the study of tissue organization. We introduce a graph convolutional network with an attention and positive emphasis mechanism, termed BINARY, relying exclusively on binarized SRT data to accurately delineate spatial domains. BINARY outperforms existing methods across various SRT data types while using significantly less input information. Our study suggests that precise gene expression quantification may not always be essential, inspiring further exploration of the broader applications of spatially resolved binarized gene expression data.
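To make the idea of binarized SRT input concrete, the following is a minimal sketch of turning a spot-by-gene count matrix into a presence/absence matrix. The abstract does not state the binarization rule BINARY uses; the zero threshold, the function name `binarize_counts`, and the toy matrix are assumptions made purely for illustration.

```python
from typing import List


def binarize_counts(counts: List[List[int]], threshold: int = 0) -> List[List[int]]:
    """Map each count to 1 if it exceeds `threshold`, else 0.

    The threshold of 0 (detected vs. not detected) is an assumed rule,
    not one taken from the BINARY paper.
    """
    return [[1 if value > threshold else 0 for value in row] for row in counts]


# Hypothetical spot-by-gene count matrix: 3 spots (rows) x 4 genes (columns).
toy_counts = [
    [0, 5, 12, 0],
    [3, 0, 0, 1],
    [0, 0, 7, 0],
]

binary_matrix = binarize_counts(toy_counts)
print(binary_matrix)
```

Under this assumed rule, the magnitude of expression is discarded and only the detection pattern of each gene across spots is retained, which is the kind of reduced input the abstract refers to.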
Affiliation(s)
- Senlin Lin
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Yan Cui
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
| | - Fangyuan Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Zhidong Yang
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia
| | | | - Yu Zhao
- AI Lab, Tencent, Shenzhen, China
| | - Bin-Zhi Qian
- Fudan University Shanghai Cancer Center, Department of Oncology, Shanghai Medical College, The Human Phenome Institute, Zhangjiang-Fudan International Innovation Center, Fudan University, Shanghai, China
| | - Yi Zhao
- Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China.
| | - Zhiyuan Yuan
- Institute of Science and Technology for Brain-Inspired Intelligence, MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China; Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China.
| |
|
24
|
Sun Y, Kong L, Huang J, Deng H, Bian X, Li X, Cui F, Dou L, Cao C, Zou Q, Zhang Z. A comprehensive survey of dimensionality reduction and clustering methods for single-cell and spatial transcriptomics data. Brief Funct Genomics 2024:elae023. [PMID: 38860675 DOI: 10.1093/bfgp/elae023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/29/2024] [Accepted: 05/27/2024] [Indexed: 06/12/2024] Open
Abstract
In recent years, the application of single-cell transcriptomics and spatial transcriptomics analysis techniques has become increasingly widespread. Whether dealing with single-cell transcriptomic or spatial transcriptomic data, dimensionality reduction and clustering are indispensable. Both single-cell and spatial transcriptomic data are often high-dimensional, making the analysis and visualization of such data challenging. Through dimensionality reduction, it becomes possible to visualize the data in a lower-dimensional space, allowing for the observation of relationships and differences between cell subpopulations. Clustering enables the grouping of similar cells into the same cluster, aiding in the identification of distinct cell subpopulations and revealing cellular diversity, providing guidance for downstream analyses. In this review, we systematically summarized the most widely recognized algorithms employed for the dimensionality reduction and clustering analysis of single-cell transcriptomic and spatial transcriptomic data. This endeavor provides valuable insights and ideas that can contribute to the development of novel tools in this rapidly evolving field.
Affiliation(s)
- Yidi Sun
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Lingling Kong
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Jiayi Huang
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Hongyan Deng
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Xinling Bian
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Xingfeng Li
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Feifei Cui
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| | - Lijun Dou
- Genomic Medicine Institute, Lerner Research Institute, Cleveland, OH 44106, United States
| | - Chen Cao
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 210029, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zilong Zhang
- School of Computer Science and Technology, Hainan University, Haikou 570228, China
| |
|
25
|
Blampey Q, Mulder K, Gardet M, Christodoulidis S, Dutertre CA, André F, Ginhoux F, Cournède PH. Sopa: a technology-invariant pipeline for analyses of image-based spatial omics. Nat Commun 2024; 15:4981. [PMID: 38862483 PMCID: PMC11167053 DOI: 10.1038/s41467-024-48981-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 05/21/2024] [Indexed: 06/13/2024] Open
Abstract
Spatial omics data allow in-depth analysis of tissue architectures, opening new opportunities for biological discovery. In particular, imaging techniques offer single-cell resolution, providing essential insights into cellular organization and dynamics. Yet, the complexity of such data presents analytical challenges and demands substantial computing resources. Moreover, the proliferation of diverse spatial omics technologies, such as Xenium, MERSCOPE, and CosMx in spatial transcriptomics, and MACSima and PhenoCycler in multiplex imaging, hinders the generality of existing tools. We introduce Sopa ( https://github.com/gustaveroussy/sopa ), a technology-invariant, memory-efficient pipeline with a unified visualizer for all image-based spatial omics. Built upon the universal SpatialData framework, Sopa optimizes tasks like segmentation, transcript/channel aggregation, annotation, and geometric/spatial analysis. Its output includes user-friendly web reports and visualizer files, as well as comprehensive data files for in-depth analysis. Overall, Sopa represents a significant step toward unifying spatial data analysis, enabling a more comprehensive understanding of cellular interactions and tissue organization in biological systems.
Affiliation(s)
- Quentin Blampey
- Paris-Saclay University, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), Gif-sur-Yvette, France.
- Paris-Saclay University, Gustave Roussy, Villejuif, France.
| | - Kevin Mulder
- Paris-Saclay University, Gustave Roussy, Villejuif, France
| | - Margaux Gardet
- Paris-Saclay University, Gustave Roussy, Villejuif, France
| | - Stergios Christodoulidis
- Paris-Saclay University, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), Gif-sur-Yvette, France
| | | | - Fabrice André
- Paris-Saclay University, Gustave Roussy, Villejuif, France
- Gustave Roussy, Department of Medical Oncology, Villejuif, France
| | | | - Paul-Henry Cournède
- Paris-Saclay University, CentraleSupélec, Laboratory of Mathematics and Computer Science (MICS), Gif-sur-Yvette, France.
| |
|
26
|
Yu Y, He Y, Xie Z. Accurate Identification of Spatial Domain by Incorporating Global Spatial Proximity and Local Expression Proximity. Biomolecules 2024; 14:674. [PMID: 38927077 PMCID: PMC11201407 DOI: 10.3390/biom14060674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2024] [Revised: 06/01/2024] [Accepted: 06/07/2024] [Indexed: 06/28/2024] Open
Abstract
Accurate identification of spatial domains is essential in the analysis of spatial transcriptomics data in order to elucidate tissue microenvironments and biological functions. However, existing methods only perform domain segmentation based on local or global spatial relationships between spots, resulting in an underutilization of spatial information. To this end, we propose SECE, a deep learning-based method that captures both local and global relationships among spots and aggregates their information using expression similarity and spatial similarity. We benchmarked SECE against eight state-of-the-art methods on six real spatial transcriptomics datasets spanning four different platforms. SECE consistently outperformed other methods in spatial domain identification accuracy. Moreover, SECE produced spatial embeddings that exhibited clearer patterns in low-dimensional visualizations and facilitated a more accurate trajectory inference.
Affiliation(s)
- Yuanyuan Yu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
| | - Yao He
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China
- Center for Precision Medicine, Sun Yat-sen University, Guangzhou 510080, China
| |
|
27
|
Ma Y, Liu L, Zhao Y, Hang B, Zhang Y. HyperGCN: an effective deep representation learning framework for the integrative analysis of spatial transcriptomics data. BMC Genomics 2024; 25:566. [PMID: 38840049 PMCID: PMC11155133 DOI: 10.1186/s12864-024-10469-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 05/29/2024] [Indexed: 06/07/2024] Open
Abstract
BACKGROUND: Advances in spatial transcriptomics technologies enable simultaneous profiling of gene expression and spatial locations of cells from the same tissue. Computational tools and approaches for integration of transcriptomics data and spatial context information are urgently needed to comprehensively explore the underlying structure patterns. In this manuscript, we propose HyperGCN for the integrative analysis of gene expression and spatial information profiled from the same tissue. HyperGCN enables data visualization and clustering, and facilitates downstream analysis, including domain segmentation, the characterization of marker genes for the specific domain structure and GO enrichment analysis. RESULTS: Extensive experiments are implemented on four real datasets from different tissues (including human dorsolateral prefrontal cortex, human positive breast tumors, mouse brain, mouse olfactory bulb tissue and Zebrafish melanoma) and technologies (including 10x Visium, osmFISH, seqFISH+, 10x Xenium and Stereo-seq) with different spatial resolutions. The results show that HyperGCN achieves superior clustering performance and produces good domain segmentation while identifying biologically meaningful spatial expression patterns. This study provides a flexible framework to analyze spatial transcriptomics data with high geometric complexity. CONCLUSIONS: HyperGCN is an unsupervised method based on a hypergraph-induced graph convolutional network. It assumes that there exist disjoint tissues with high geometric complexity, and models the semantic relationship of cells through a hypergraph, which better handles the high-order interactions of cells and the levels of noise in spatial transcriptomics data.
Affiliation(s)
- Yuanyuan Ma
- School of Computer Engineering, Hubei University of Arts and Science, Xiangyang, China.
- Hubei Key Laboratory of Power System Design and Test for Electrical Vehicle, Hubei University of Arts and Science, Xiangyang, China.
| | - Lifang Liu
- School of Physics and Electronic Engineering, Hubei University of Arts and Science, Xiangyang, China
| | - Yongbiao Zhao
- School of Computer Engineering, Hubei University of Arts and Science, Xiangyang, China
- School of Computer, Central China Normal University, Wuhan, China
| | - Bo Hang
- School of Computer Engineering, Hubei University of Arts and Science, Xiangyang, China
| | - Yanduo Zhang
- School of Computer Engineering, Hubei University of Arts and Science, Xiangyang, China
| |
|
28
|
Duan Z, Riffle D, Li R, Liu J, Min MR, Zhang J. Impeller: a path-based heterogeneous graph learning method for spatial transcriptomic data imputation. Bioinformatics 2024; 40:btae339. [PMID: 38806165 PMCID: PMC11256934 DOI: 10.1093/bioinformatics/btae339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 05/18/2024] [Accepted: 05/26/2024] [Indexed: 05/30/2024] Open
Abstract
MOTIVATION: Recent advances in spatial transcriptomics allow spatially resolved gene expression measurements with cellular or even sub-cellular resolution, directly characterizing the complex spatiotemporal gene expression landscape and cell-to-cell interactions in their native microenvironments. Due to technology limitations, most spatial transcriptomic technologies still yield incomplete expression measurements with excessive missing values. Therefore, gene imputation is critical to filling in missing data, enhancing resolution, and improving overall interpretability. However, existing methods either require additional matched single-cell RNA-seq data, which is rarely available, or ignore spatial proximity or expression similarity information. RESULTS: To address these issues, we introduce Impeller, a path-based heterogeneous graph learning method for spatial transcriptomic data imputation. Impeller has two characteristics that distinguish it from existing approaches. First, it builds a heterogeneous graph with two types of edges representing spatial proximity and expression similarity. Therefore, Impeller can simultaneously model smooth gene expression changes across spatial dimensions and capture similar gene expression signatures of distant cells of the same type. Moreover, Impeller incorporates both short- and long-range cell-to-cell interactions (e.g. via paracrine and endocrine signaling) by stacking multiple GNN layers. We use a learnable path operator in Impeller to avoid the over-smoothing issue of the traditional Laplacian matrices. Extensive experiments on diverse datasets from three popular platforms and two species demonstrate the superiority of Impeller over various state-of-the-art imputation methods. AVAILABILITY AND IMPLEMENTATION: The code and preprocessed data used in this study are available at https://github.com/aicb-ZhangLabs/Impeller and https://zenodo.org/records/11212604.
Affiliation(s)
- Ziheng Duan
- Department of Computer Science, University of California, Irvine, Irvine, CA 92697, United States
| | - Dylan Riffle
- Department of Computer Science, University of California, Irvine, Irvine, CA 92697, United States
| | - Ren Li
- Mathematical, Computational, and Systems Biology, University of California, Irvine, Irvine, CA 92697, United States
| | - Junhao Liu
- Department of Computer Science, University of California, Irvine, Irvine, CA 92697, United States
| | - Martin Renqiang Min
- Department of Machine Learning, NEC Labs America, Princeton, NJ 08540, United States
| | - Jing Zhang
- Department of Computer Science, University of California, Irvine, Irvine, CA 92697, United States
| |
|
29
|
Li Y, Lac L, Liu Q, Hu P. ST-CellSeg: Cell segmentation for imaging-based spatial transcriptomics using multi-scale manifold learning. PLoS Comput Biol 2024; 20:e1012254. [PMID: 38935799 PMCID: PMC11236102 DOI: 10.1371/journal.pcbi.1012254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Revised: 07/10/2024] [Accepted: 06/16/2024] [Indexed: 06/29/2024] Open
Abstract
Spatial transcriptomics has gained popularity over the past decade due to its ability to evaluate transcriptome data while preserving spatial information. Cell segmentation is a crucial step in spatial transcriptomic analysis, as it enables the avoidance of unpredictable tissue disentanglement steps. Although high-quality cell segmentation algorithms can aid in the extraction of valuable data, traditional methods are frequently non-spatial, do not account for spatial information efficiently, and perform poorly when segmenting cells of varying shapes in spatial transcriptomic data. In this study, we propose ST-CellSeg, an image-based machine learning method for spatial transcriptomics that uses manifold learning for cell segmentation and is novel in its consideration of multi-scale information. We first construct a fully connected graph which acts as a spatial transcriptomic manifold. Using multi-scale data, we then determine the low-dimensional spatial probability distribution representation for cell segmentation. Using the adjusted Rand index (ARI), normalized mutual information (NMI), and Silhouette coefficient (SC) as model performance measures, the proposed algorithm significantly outperforms baseline models on selected datasets and is computationally efficient.
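Since the adjusted Rand index (ARI) recurs as an evaluation metric across several entries in this listing, a short pure-Python sketch of the standard pair-counting formula may help. This is an illustrative implementation of the textbook definition, not code from ST-CellSeg; the function name and the toy label vectors are hypothetical.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def adjusted_rand_index(labels_true: Sequence[Hashable],
                        labels_pred: Sequence[Hashable]) -> float:
    """Adjusted Rand index computed from the contingency table of two labelings."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat the partitions as identical

    # Contingency-table cells and their row/column marginals.
    cell_counts = Counter(zip(labels_true, labels_pred))
    row_counts = Counter(labels_true)
    col_counts = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_rows = sum(comb(c, 2) for c in row_counts.values())
    sum_cols = sum(comb(c, 2) for c in col_counts.values())

    expected = sum_rows * sum_cols / total_pairs
    maximum = 0.5 * (sum_rows + sum_cols)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (sum_cells - expected) / (maximum - expected)


# Same partition with swapped label names: perfect agreement.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Every true cluster split evenly across predicted clusters: below chance.
ari_split = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(ari_same, ari_split)
```

The index is invariant to how clusters are named, equals 1 for identical partitions, and is near 0 (or negative) for agreement no better than chance, which is why it is convenient for comparing predicted spatial domains or segments against manual annotations.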
Affiliation(s)
- Youcheng Li
- Department of Biochemistry, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada
- Department of Computer Science, Western University, London, Ontario, Canada
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Leann Lac
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
- Department of Statistics, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Qian Liu
- Department of Applied Computer Science, University of Winnipeg, Winnipeg, Manitoba, Canada
| | - Pingzhao Hu
- Department of Biochemistry, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada
- Department of Computer Science, Western University, London, Ontario, Canada
- Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
- Department of Epidemiology and Biostatistics, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada
- Department of Oncology, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada
- The Children's Health Research Institute, Lawson Health Research Institute, London, Ontario, Canada
| |
|
30
|
Wang H, Zhao J, Nie Q, Zheng C, Sun X. Dissecting Spatiotemporal Structures in Spatial Transcriptomics via Diffusion-Based Adversarial Learning. Research (Washington, D.C.) 2024; 7:0390. [PMID: 38812530 PMCID: PMC11134684 DOI: 10.34133/research.0390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 04/23/2024] [Indexed: 05/31/2024]
Abstract
Recent advancements in spatial transcriptomics (ST) technologies offer unprecedented opportunities to unveil the spatial heterogeneity of gene expression and cell states within tissues. Despite these capabilities of the ST data, accurately dissecting spatiotemporal structures (e.g., spatial domains, temporal trajectories, and functional interactions) remains challenging. Here, we introduce a computational framework, PearlST (partial differential equation [PDE]-enhanced adversarial graph autoencoder of ST), for accurate inference of spatiotemporal structures from the ST data using PDE-enhanced adversarial graph autoencoder. PearlST employs contrastive learning to extract histological image features, integrates a PDE-based diffusion model to enhance characterization of spatial features at domain boundaries, and learns the latent low-dimensional embeddings via Wasserstein adversarial regularized graph autoencoders. Comparative analyses across multiple ST datasets with varying resolutions demonstrate that PearlST outperforms existing methods in spatial clustering, trajectory inference, and pseudotime analysis. Furthermore, PearlST elucidates functional regulations of the latent features by linking intercellular ligand-receptor interactions to most contributing genes of the low-dimensional embeddings, as illustrated in a human breast cancer dataset. Overall, PearlST proves to be a powerful tool for extracting interpretable latent features and dissecting intricate spatiotemporal structures in ST data across various biological contexts.
Affiliation(s)
- Haiyun Wang
- College of Mathematics and System Sciences, Xinjiang University, Urumqi, China
| | - Jianping Zhao
- College of Mathematics and System Sciences, Xinjiang University, Urumqi, China
| | - Qing Nie
- Department of Mathematics and Department of Developmental and Cell Biology, NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, USA
| | - Chunhou Zheng
- School of Artificial Intelligence, Anhui University, Hefei, China
| | - Xiaoqiang Sun
- School of Mathematics, Sun Yat-sen University, Guangzhou, China
| |
|
31
|
Swain AK, Pandit V, Sharma J, Yadav P. SpatialPrompt: spatially aware scalable and accurate tool for spot deconvolution and domain identification in spatial transcriptomics. Commun Biol 2024; 7:639. [PMID: 38796505 PMCID: PMC11127982 DOI: 10.1038/s42003-024-06349-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Accepted: 05/17/2024] [Indexed: 05/28/2024] Open
Abstract
Efficient mapping of cell types in situ remains a major challenge in spatial transcriptomics. Most spot deconvolution tools ignore spatial coordinate information and run extremely slowly on large datasets. Here, we introduce SpatialPrompt, a spatially aware and scalable tool for spot deconvolution and domain identification. SpatialPrompt integrates gene expression, spatial location, and a single-cell RNA sequencing (scRNA-seq) dataset as reference to accurately infer cell-type proportions of spatial spots. SpatialPrompt uses non-negative ridge regression and a graph neural network to efficiently capture local microenvironment information. Our extensive benchmarking analysis on Visium, Slide-seq, and MERFISH datasets demonstrated superior performance of SpatialPrompt over 15 existing tools. On a mouse hippocampus dataset, SpatialPrompt achieves spot deconvolution and domain identification within 2 minutes for 50,000 spots. Overall, domain identification using SpatialPrompt was 44 to 150 times faster than existing methods. We built a database housing more than 40 curated scRNA-seq datasets for seamless integration with SpatialPrompt for spot deconvolution.
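As a rough illustration of how non-negative ridge regression can yield cell-type proportions for a single spot, the sketch below minimizes ||y - S w||^2 + lam * ||w||^2 subject to w >= 0 by projected gradient descent and then normalizes w to sum to one. The abstract names the technique but gives no formulation, so the objective, the solver, the function names, and the toy signature matrix are all assumptions; the graph neural network component is not represented here.

```python
from typing import List


def nonneg_ridge(signatures: List[List[float]], y: List[float],
                 lam: float = 0.01, n_iter: int = 500) -> List[float]:
    """Projected gradient descent for min_w ||y - S w||^2 + lam*||w||^2, w >= 0.

    `signatures` is a genes x cell-types matrix S (hypothetical reference
    profiles); `y` is the expression vector of one spot.
    """
    n_genes, n_types = len(signatures), len(signatures[0])
    # Upper bound on the gradient's Lipschitz constant: 2 * (||S||_F^2 + lam).
    lipschitz = 2.0 * (sum(v * v for row in signatures for v in row) + lam)
    step = 1.0 / lipschitz

    w = [0.0] * n_types
    for _ in range(n_iter):
        residual = [
            sum(signatures[g][k] * w[k] for k in range(n_types)) - y[g]
            for g in range(n_genes)
        ]
        grad = [
            2.0 * sum(signatures[g][k] * residual[g] for g in range(n_genes))
            + 2.0 * lam * w[k]
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[k] - step * grad[k]) for k in range(n_types)]
    return w


def to_proportions(weights: List[float]) -> List[float]:
    """Rescale non-negative weights so they sum to 1 (all zeros stay zeros)."""
    total = sum(weights)
    return [v / total for v in weights] if total > 0 else list(weights)


# Toy reference: 3 genes x 2 cell types (illustrative values only).
toy_signatures = [
    [10.0, 0.0],
    [0.0, 8.0],
    [2.0, 2.0],
]
# A spot mixed as 70% type A and 30% type B.
toy_spot = [7.0, 2.4, 2.0]

proportions = to_proportions(nonneg_ridge(toy_signatures, toy_spot))
print([round(p, 3) for p in proportions])
```

The ridge penalty keeps the estimate stable when reference profiles are correlated, and the non-negativity constraint keeps the weights interpretable as mixture contributions. A closed-form or active-set solver would normally replace the simple loop used here.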
Affiliation(s)
- Asish Kumar Swain
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
| | - Vrushali Pandit
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
| | - Jyoti Sharma
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India
| | - Pankaj Yadav
- Department of Bioscience & Bioengineering, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India.
- School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, Rajasthan, 342030, India.
| |
|
32
|
Zhang L, Liang S, Wan L. A multi-view graph contrastive learning framework for deciphering spatially resolved transcriptomics data. Brief Bioinform 2024; 25:bbae255. [PMID: 38801701 PMCID: PMC11129769 DOI: 10.1093/bib/bbae255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 01/27/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024] Open
Abstract
Spatially resolved transcriptomics data are being used in a revolutionary way to decipher the spatial pattern of gene expression and the spatial architecture of cell types. Much work has been done to exploit the genomic spatial architectures of cells. Such work is based on the common assumption that gene expression profiles of spatially adjacent spots are more similar than those of more distant spots. However, related work might not consider the nonlocal spatial co-expression dependency, which can better characterize the tissue architectures. Therefore, we propose MuCoST, a Multi-view graph Contrastive learning framework for deciphering complex Spatially resolved Transcriptomic architectures with dual scale structural dependency. To achieve this, we employ spot dependency augmentation by fusing gene expression correlation and spatial location proximity, thereby enabling MuCoST to model both nonlocal spatial co-expression dependency and spatially adjacent dependency. We benchmark MuCoST on four datasets, and we compare it with other state-of-the-art spatial domain identification methods. We demonstrate that MuCoST achieves the highest accuracy on spatial domain identification from various datasets. In particular, MuCoST accurately deciphers subtle biological textures and elaborates the variation of spatially functional patterns.
Affiliation(s)
- Lei Zhang
- Department of Control Science and Engineering, Tongji University, No. 4800 Cao’an Road, 201804, Shanghai, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Lane 55, Chuanhe Road, 201210, Shanghai, China
| | - Shu Liang
- Department of Control Science and Engineering, Tongji University, No. 4800 Cao’an Road, 201804, Shanghai, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University, Lane 55, Chuanhe Road, 201210, Shanghai, China
| | - Lin Wan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, No. 55 Zhongguancun East Road, 100190, Beijing, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, 19A Yuquan Road, 100049, Beijing, China
| |
|
33
|
Si Z, Li H, Shang W, Zhao Y, Kong L, Long C, Zuo Y, Feng Z. SpaNCMG: improving spatial domains identification of spatial transcriptomics using neighborhood-complementary mixed-view graph convolutional network. Brief Bioinform 2024; 25:bbae259. [PMID: 38811360 PMCID: PMC11136618 DOI: 10.1093/bib/bbae259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 05/10/2024] [Accepted: 05/16/2024] [Indexed: 05/31/2024] Open
Abstract
The advancement of spatial transcriptomics (ST) technology contributes to a more profound comprehension of the spatial properties of gene expression within tissues. However, due to challenges of high dimensionality, pronounced noise and dynamic limitations in ST data, the integration of gene expression and spatial information to accurately identify spatial domains remains challenging. This paper proposes the SpaNCMG algorithm for precise spatial domain description and localization based on a neighborhood-complementary mixed-view graph convolutional network. The algorithm adapts better to ST data at different resolutions by integrating the local information from KNN and the global structure from r-radius into a complementary neighborhood graph. It also introduces an attention mechanism to achieve adaptive fusion of different reconstructed expressions, and uses the KPCA method for dimensionality reduction. The application of SpaNCMG on five datasets from four sequencing platforms demonstrates superior performance to eight existing advanced methods. Specifically, the algorithm achieved the highest ARI values of 0.63 and 0.52 on the datasets of the human dorsolateral prefrontal cortex and mouse somatosensory cortex, respectively. It accurately identified the spatial locations of marker genes in the mouse olfactory bulb tissue and inferred the biological functions of different regions. When handling larger datasets such as mouse embryos, SpaNCMG not only identified the main tissue structures but also explored unlabeled domains. Overall, the good generalization ability and scalability of SpaNCMG make it an outstanding tool for understanding tissue structure and disease mechanisms. Our codes are available at https://github.com/ZhihaoSi/SpaNCMG.
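The combination of a KNN graph and an r-radius graph described above can be sketched in a few lines. The abstract does not specify how the two neighborhoods are merged, so taking the union of their undirected edge sets is an assumption, as are the function names, the values of `k` and `r`, and the toy coordinates.

```python
from math import dist
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def knn_edges(coords: List[Tuple[float, float]], k: int) -> Set[Edge]:
    """Undirected edges linking each spot to its k nearest spots."""
    edges: Set[Edge] = set()
    for i, point in enumerate(coords):
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: dist(point, coords[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def radius_edges(coords: List[Tuple[float, float]], r: float) -> Set[Edge]:
    """Undirected edges linking every pair of spots at most r apart."""
    return {
        (i, j)
        for i in range(len(coords))
        for j in range(i + 1, len(coords))
        if dist(coords[i], coords[j]) <= r
    }


# Hypothetical spot coordinates: three nearby spots and one isolated spot.
toy_coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]

knn_graph = knn_edges(toy_coords, k=1)
radius_graph = radius_edges(toy_coords, r=2.0)
# Assumed merge rule: union of the two edge sets.
combined_graph = knn_graph | radius_graph
print(sorted(combined_graph))
```

In this toy layout the two views complement each other: the radius graph adds the edge (0, 2) that the 1-nearest-neighbor graph misses, while the KNN graph keeps the isolated spot 3 connected even though nothing lies within its radius.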
Affiliation(s)
- Zhihao Si
- College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Hanshuang Li
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Wenjing Shang
- College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Yanan Zhao
- College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Lingjiao Kong
- College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
- Chunshen Long
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, College of Life Sciences, Inner Mongolia University, Hohhot 010070, China
- Zhenxing Feng
- College of Sciences, Inner Mongolia University of Technology, Hohhot 010051, China
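The neighborhood-complementary graph described in the SpaNCMG abstract above can be illustrated with a small sketch. It assumes, purely for illustration, that the KNN graph and the r-radius graph are merged by taking the union of their edge sets; the abstract does not state the exact combination rule. The function names (`knn_edges`, `radius_edges`, `mixed_graph`) and the toy coordinates are hypothetical and are not taken from the SpaNCMG code.

```python
import math


def knn_edges(coords, k):
    """Undirected edges linking every spot to its k nearest neighbours."""
    edges = set()
    for i, point in enumerate(coords):
        # Rank all other spots by Euclidean distance to spot i.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(point, coords[j]),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def radius_edges(coords, r):
    """Undirected edges linking every pair of spots at distance <= r."""
    edges = set()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= r:
                edges.add((i, j))
    return edges


def mixed_graph(coords, k, r):
    """Assumed combination rule: union of the KNN and radius edge sets."""
    return knn_edges(coords, k) | radius_edges(coords, r)


# Toy layout: three clustered spots and one isolated spot.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
print(sorted(mixed_graph(coords, k=1, r=2.0)))
```

In this toy layout the isolated fourth spot is connected only through its KNN edge, while the three clustered spots gain the extra radius edge (0, 2). This is one way the two views can complement each other.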
|
34
|
Wang T, Shu H, Hu J, Wang Y, Chen J, Peng J, Shang X. Accurately deciphering spatial domains for spatially resolved transcriptomics with stCluster. Brief Bioinform 2024; 25:bbae329. [PMID: 38975895 DOI: 10.1093/bib/bbae329] [Received: 10/28/2023] [Revised: 06/16/2024] [Accepted: 06/24/2024] [Indexed: 07/09/2024] Open
Abstract
Spatial transcriptomics provides valuable insights into gene expression within the native tissue context, effectively merging molecular data with spatial information to uncover intricate cellular relationships and tissue organizations. In this context, deciphering cellular spatial domains becomes essential for revealing complex cellular dynamics and tissue structures. However, current methods encounter challenges in seamlessly integrating gene expression data with spatial information, resulting in less informative representations of spots and suboptimal accuracy in spatial domain identification. We introduce stCluster, a novel method that integrates graph contrastive learning with multi-task learning to refine informative representations for spatial transcriptomic data, consequently improving spatial domain identification. stCluster first leverages graph contrastive learning technology to obtain discriminative representations capable of recognizing spatially coherent patterns. Through jointly optimizing multiple tasks, stCluster further fine-tunes the representations to be able to capture complex relationships between gene expression and spatial organization. Benchmarked against six state-of-the-art methods, the experimental results reveal its proficiency in accurately identifying complex spatial domains across various datasets and platforms, spanning tissue, organ, and embryo levels. Moreover, stCluster can effectively denoise the spatial gene expression patterns and enhance the spatial trajectory inference. The source code of stCluster is freely available at https://github.com/hannshu/stCluster.
Affiliation(s)
- Tao Wang
- School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Han Shu
- School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Jialu Hu
- School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Yongtian Wang
- School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Jing Chen
- School of Computer Science and Engineering, Xi'an University of Technology, No.5 South Jinhua rd., Xi'an 710048, China
- Jiajie Peng
- School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
- Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, 1 Dongxiang Rd., Xi'an 710072, China
|
35
|
Wang L, Hu Y, Xiao K, Zhang C, Shi Q, Chen L. Multi-modal domain adaptation for revealing spatial functional landscape from spatially resolved transcriptomics. Brief Bioinform 2024; 25:bbae257. [PMID: 38819253 PMCID: PMC11141295 DOI: 10.1093/bib/bbae257] [Received: 01/16/2024] [Revised: 04/13/2024] [Accepted: 05/15/2024] [Indexed: 06/01/2024] Open
Abstract
Spatially resolved transcriptomics (SRT) has emerged as a powerful tool for investigating gene expression in spatial contexts, providing insights into the molecular mechanisms underlying organ development and disease pathology. However, the expression sparsity poses a computational challenge to integrate other modalities (e.g. histological images and spatial locations) that are simultaneously captured in SRT datasets for spatial clustering and variation analyses. In this study, to meet such a challenge, we propose multi-modal domain adaption for spatial transcriptomics (stMDA), a novel multi-modal unsupervised domain adaptation method, which integrates gene expression and other modalities to reveal the spatial functional landscape. Specifically, stMDA first learns the modality-specific representations from spatial multi-modal data using multiple neural network architectures and then aligns the spatial distributions across modal representations to integrate these multi-modal representations, thus facilitating the integration of global and spatially local information and improving the consistency of clustering assignments. Our results demonstrate that stMDA outperforms existing methods in identifying spatial domains across diverse platforms and species. Furthermore, stMDA excels in identifying spatially variable genes with high prognostic potential in cancer tissues. In conclusion, stMDA as a new tool of multi-modal data integration provides a powerful and flexible framework for analyzing SRT datasets, thereby advancing our understanding of intricate biological systems.
Affiliation(s)
- Lequn Wang
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
- Yaofeng Hu
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
- Kai Xiao
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
- Chuanchao Zhang
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
- Qianqian Shi
- Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, Hubei Province, China
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, Hubei Province, China
- Luonan Chen
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, No. 320 Yue Yang Road, Xuhui District, Shanghai 200031, China
- University of Chinese Academy of Sciences, No. 80 Zhongguancun East Road, Haidian District, Beijing 100049, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Xiangshan Lane, Hangzhou 310024, China
|
36
|
Baul S, Tanvir Ahmed K, Jiang Q, Wang G, Li Q, Yong J, Zhang W. Integrating spatial transcriptomics and bulk RNA-seq: predicting gene expression with enhanced resolution through graph attention networks. Brief Bioinform 2024; 25:bbae316. [PMID: 38960406 PMCID: PMC11221891 DOI: 10.1093/bib/bbae316] [Received: 02/23/2024] [Revised: 06/04/2024] [Accepted: 06/17/2024] [Indexed: 07/05/2024] Open
Abstract
Spatial transcriptomics data play a crucial role in cancer research, providing a nuanced understanding of the spatial organization of gene expression within tumor tissues. Unraveling the spatial dynamics of gene expression can unveil key insights into tumor heterogeneity and aid in identifying potential therapeutic targets. However, in many large-scale cancer studies, spatial transcriptomics data are limited, with bulk RNA-seq and corresponding Whole Slide Image (WSI) data being more common (e.g. TCGA project). To address this gap, there is a critical need to develop methodologies that can estimate gene expression at near-cell (spot) level resolution from existing WSI and bulk RNA-seq data. This approach is essential for reanalyzing expansive cohort studies and uncovering novel biomarkers that were overlooked in the initial assessments. In this study, we present STGAT (Spatial Transcriptomics Graph Attention Network), a novel approach leveraging Graph Attention Networks (GAT) to discern spatial dependencies among spots. Trained on spatial transcriptomics data, STGAT is designed to estimate gene expression profiles at spot-level resolution and predict whether each spot represents tumor or non-tumor tissue, especially in patient samples where only WSI and bulk RNA-seq data are available. Comprehensive tests on two breast cancer spatial transcriptomics datasets demonstrated that STGAT outperformed existing methods in accurately predicting gene expression. Further analyses using the TCGA breast cancer dataset revealed that gene expression estimated from tumor-only spots (predicted by STGAT) provides more accurate molecular signatures for breast cancer sub-type and tumor stage prediction, and also leads to improved patient survival and disease-free analysis. Availability: Code is available at https://github.com/compbiolabucf/STGAT.
Affiliation(s)
- Sudipto Baul
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Khandakar Tanvir Ahmed
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Qibing Jiang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
- Guangyu Wang
- Houston Methodist Research Institute, Weill Cornell Medical College, Houston, TX 77030, United States
- Qian Li
- Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, United States
- Jeongsik Yong
- Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota Twin Cities, Minneapolis, MN 55455, United States
- Wei Zhang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, United States
|
37
|
Li S, Gai K, Dong K, Zhang Y, Zhang S. High-density generation of spatial transcriptomics with STAGE. Nucleic Acids Res 2024; 52:4843-4856. [PMID: 38647109 PMCID: PMC11109953 DOI: 10.1093/nar/gkae294] [Received: 01/15/2024] [Revised: 03/06/2024] [Accepted: 04/06/2024] [Indexed: 04/25/2024] Open
Abstract
Spatial transcriptome technologies have enabled the measurement of gene expression while maintaining spatial location information for deciphering the spatial heterogeneity of biological tissues. However, they are heavily limited by sparse spatial resolution and low data quality. To this end, we develop STAGE, a spatial location-supervised auto-encoder generator for generating high-density spatial transcriptomics (ST). STAGE takes advantage of a customized supervised auto-encoder to learn continuous patterns of gene expression in space and generate high-resolution expressions for given spatial coordinates. STAGE can improve the low quality of spatial transcriptome data and smooth the generated manifold of gene expression through the de-noising function on the latent codes of the auto-encoder. In applications to four ST datasets, STAGE showed better recovery performance for down-sampled data than existing methods, revealed significant tissue structure specificity, and enabled robust identification of spatially informative genes and patterns. In addition, STAGE can be extended to three-dimensional (3D) stacked ST data for generating gene expression at any position between consecutive sections for shaping high-density 3D ST configuration.
Affiliation(s)
- Shang Li
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kuo Gai
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kangning Dong
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yiyang Zhang
- School of Software, Yunnan University, Kunming 650091, China
- Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou 310024, China
|
38
|
Cao J, Li C, Cui Z, Deng S, Lei T, Liu W, Yang H, Chen P. Spatial Transcriptomics: A Powerful Tool in Disease Understanding and Drug Discovery. Theranostics 2024; 14:2946-2968. [PMID: 38773973 PMCID: PMC11103497 DOI: 10.7150/thno.95908] [Received: 03/04/2024] [Accepted: 04/25/2024] [Indexed: 05/24/2024] Open
Abstract
Recent advancements in modern science have provided robust tools for drug discovery. The rapid development of transcriptome sequencing technologies has given rise to single-cell transcriptomics and single-nucleus transcriptomics, increasing the accuracy of sequencing and accelerating the drug discovery process. With the evolution of single-cell transcriptomics, spatial transcriptomics (ST) technology has emerged as a derivative approach. Spatial transcriptomics has emerged as a hot topic in the field of omics research in recent years; it not only provides information on gene expression levels but also offers spatial information on gene expression. This technology has shown tremendous potential in research on disease understanding and drug discovery. In this article, we introduce the analytical strategies of spatial transcriptomics and review its applications in novel target discovery and drug mechanism unravelling. Moreover, we discuss the current challenges and issues in this research field that need to be addressed. In conclusion, spatial transcriptomics offers a new perspective for drug discovery.
Affiliation(s)
- Junxian Cao
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Analysis of Complex Effects of Proprietary Chinese Medicine, Hunan Provincial Key Laboratory, Yongzhou City, Hunan Province, China
- Caifeng Li
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Zhao Cui
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Shiwen Deng
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Analysis of Complex Effects of Proprietary Chinese Medicine, Hunan Provincial Key Laboratory, Yongzhou City, Hunan Province, China
- Tong Lei
- Institute of Basic Theory for Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Wei Liu
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Hongjun Yang
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Analysis of Complex Effects of Proprietary Chinese Medicine, Hunan Provincial Key Laboratory, Yongzhou City, Hunan Province, China
- Peng Chen
- Beijing Key Laboratory of Traditional Chinese Medicine Basic Research on Prevention and Treatment for Major Diseases, Experimental Research Center, China Academy of Chinese Medical Sciences, Beijing 100700, China
- Analysis of Complex Effects of Proprietary Chinese Medicine, Hunan Provincial Key Laboratory, Yongzhou City, Hunan Province, China
|
39
|
Walker CR, Angelo M. Insights and Opportunity Costs in Applying Spatial Biology to Study the Tumor Microenvironment. Cancer Discov 2024; 14:707-710. [PMID: 38587535 DOI: 10.1158/2159-8290.cd-24-0348] [Indexed: 04/09/2024]
Abstract
Summary: The recent development of high-dimensional spatial omics tools has revealed the functional importance of the tumor microenvironment in driving tumor progression. Here, we discuss practical factors to consider when designing a spatial biology cohort and offer perspectives on the future of spatial biology research.
Affiliation(s)
- Cameron R Walker
- Department of Pathology, Stanford University School of Medicine, Palo Alto, California
- Michael Angelo
- Department of Pathology, Stanford University School of Medicine, Palo Alto, California
|
40
|
Budhkar A, Tang Z, Liu X, Zhang X, Su J, Song Q. xSiGra: Explainable model for single-cell spatial data elucidation. bioRxiv 2024:2024.04.27.591458. [PMID: 38746321 PMCID: PMC11092461 DOI: 10.1101/2024.04.27.591458] [Indexed: 05/16/2024]
Abstract
Recent advancements in spatial imaging technologies have revolutionized the acquisition of high-resolution multi-channel images, gene expressions, and spatial locations at the single-cell level. Our study introduces xSiGra, an interpretable graph-based AI model designed to elucidate interpretable features of identified spatial cell types by harnessing multi-modal features from spatial imaging technologies. By constructing a spatial cellular graph with immunohistology images and gene expression as node attributes, xSiGra employs hybrid graph transformer models to delineate spatial cell types. Additionally, xSiGra integrates a novel variant of the Grad-CAM component to uncover interpretable features, including pivotal genes and cells for various cell types, thereby facilitating deeper biological insights from spatial data. Through rigorous benchmarking against existing methods, xSiGra demonstrates superior performance across diverse spatial imaging datasets. Application of xSiGra to a lung tumor slice unveils the importance scores of cells, illustrating that a cell's activity is determined not solely by the cell itself but also by neighboring cells. Moreover, leveraging the identified interpretable genes, xSiGra reveals an endothelial cell subset interacting with tumor cells, indicating heterogeneous underlying mechanisms within the complex cellular communications.
|
41
|
Wang W, Zheng S, Shin SC, Yuan GC. Characterizing Spatially Continuous Variations in Tissue Microenvironment through Niche Trajectory Analysis. bioRxiv 2024:2024.04.23.590827. [PMID: 38712255 PMCID: PMC11071437 DOI: 10.1101/2024.04.23.590827] [Indexed: 05/08/2024]
Abstract
Recent technological developments have made it possible to map the spatial organization of a tissue at the single-cell resolution. However, computational methods for analyzing spatially continuous variations in tissue microenvironment are still lacking. Here we present ONTraC as a strategy that constructs niche trajectories using a graph neural network-based modeling framework. Our benchmark analysis shows that ONTraC performs more favorably than existing methods for reconstructing spatial trajectories. Applications of ONTraC to public spatial transcriptomics datasets successfully recapitulated the underlying anatomical structure, and further enabled detection of tissue microenvironment-dependent changes in gene regulatory networks and cell-cell interaction activities during embryonic development. Taken together, ONTraC provides a useful and generally applicable tool for the systematic characterization of the structural and functional organization of tissue microenvironments.
Affiliation(s)
- Wen Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Shiwei Zheng
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Sujung Crystal Shin
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
|
42
|
吴 瀚, 高 洁. [Identifying spatial domains from spatial transcriptome by graph attention network]. Sheng Wu Yi Xue Gong Cheng Xue Za Zhi (Journal of Biomedical Engineering) 2024; 41:246-252. [PMID: 38686404 PMCID: PMC11058491 DOI: 10.7507/1001-5515.202304030] [Received: 04/13/2023] [Revised: 12/01/2023] [Indexed: 05/02/2024]
Abstract
Due to the high dimensionality and complexity of the data, the analysis of spatial transcriptome data has been a challenging problem. Meanwhile, cluster analysis is the core issue of the analysis of spatial transcriptome data. In this article, a deep learning approach is proposed based on graph attention networks for clustering analysis of spatial transcriptome data. Our method first enhances the spatial transcriptome data, then uses graph attention networks to extract features from nodes, and finally uses the Leiden algorithm for clustering analysis. Compared with the traditional non-spatial and spatial clustering methods, our method has better performance in data analysis through the clustering evaluation index. The experimental results show that the proposed method can effectively cluster spatial transcriptome data and identify different spatial domains, which provides a new tool for studying spatial transcriptome data.
Affiliation(s)
- 瀚文 吴
- Jiangnan University, Wuxi, Jiangsu 214122, P. R. China
- 洁 高
- Jiangnan University, Wuxi, Jiangsu 214122, P. R. China
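The feature-extraction step mentioned in the abstract above (graph attention over neighboring spots) can be sketched with the standard single-head graph attention layer. This is a generic textbook formulation and is not the authors' implementation: the weights `W` and `a`, the toy graph, and the function name `gat_layer` are hypothetical, the final non-linearity is omitted, and the data enhancement and Leiden clustering steps are not shown.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply the matrix W (list of rows) by the vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Single-head graph attention (generic sketch, no output activation).

    features:  list of input vectors, one per node (spot)
    neighbors: dict mapping node id -> list of neighbouring node ids
    W:         out_dim x in_dim weight matrix (hypothetical values)
    a:         attention vector of length 2 * out_dim (hypothetical values)
    Returns the aggregated features and the attention weights per node.
    """
    z = [matvec(W, h) for h in features]  # linear projection of every node
    outputs, attention = [], []
    for i in range(len(features)):
        # Attend over the node itself plus its neighbours.
        nbrs = [i] + [j for j in neighbors[i] if j != i]
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        outputs.append([
            sum(w * z[j][d] for w, j in zip(alpha, nbrs))
            for d in range(len(z[i]))
        ])
        attention.append(dict(zip(nbrs, alpha)))
    return outputs, attention


# Toy graph of three spots on a line: 0 - 1 - 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 0.25, 0.25]
embeddings, attention = gat_layer(features, neighbors, W, a)
print(embeddings)
```

Each row of attention weights sums to one, so every output vector is a weighted average of the projected features in that spot's neighborhood. A clustering algorithm such as Leiden would then run on the resulting embeddings.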
|
43
|
Lu Y, Chen QM, An L. SPADE: spatial deconvolution for domain specific cell-type estimation. Commun Biol 2024; 7:469. [PMID: 38632414 PMCID: PMC11024133 DOI: 10.1038/s42003-024-06172-y] [Received: 06/06/2023] [Accepted: 04/10/2024] [Indexed: 04/19/2024] Open
Abstract
Understanding gene expression in different cell types within their spatial context is a key goal in genomics research. SPADE (SPAtial DEconvolution), our proposed method, addresses this by integrating spatial patterns into the analysis of cell type composition. This approach uses a combination of single-cell RNA sequencing, spatial transcriptomics, and histological data to accurately estimate the proportions of cell types in various locations. Our analyses of synthetic data have demonstrated SPADE's capability to discern cell type-specific spatial patterns effectively. When applied to real-life datasets, SPADE provides insights into cellular dynamics and the composition of tumor tissues. This enhances our comprehension of complex biological systems and aids in exploring cellular diversity. SPADE represents a significant advancement in deciphering spatial gene expression patterns, offering a powerful tool for the detailed investigation of cell types in spatial transcriptomics.
Affiliation(s)
- Yingying Lu
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA
- Qin M Chen
- College of Pharmacy, University of Arizona, Tucson, AZ, 85721, USA
- Lingling An
- Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, 85721, USA.
- Department of Biosystems Engineering, University of Arizona, Tucson, AZ, 85721, USA.
- Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, 85721, USA.
|
44
|
Zhou Y, He W, Hou W, Zhu Y. Pianno: a probabilistic framework automating semantic annotation for spatial transcriptomics. Nat Commun 2024; 15:2848. [PMID: 38565531 PMCID: PMC11271244 DOI: 10.1038/s41467-024-47152-4] [Received: 09/12/2023] [Accepted: 03/20/2024] [Indexed: 04/04/2024] Open
Abstract
Spatial transcriptomics has revolutionized the study of gene expression within tissues, while preserving spatial context. However, annotating spatial spots' biological identity remains a challenge. To tackle this, we introduce Pianno, a Bayesian framework automating structural semantics annotation based on marker genes. Comprehensive evaluations underscore Pianno's remarkable prowess in precisely annotating a wide array of spatial semantics, ranging from diverse anatomical structures to intricate tumor microenvironments, as well as in estimating cell type distributions, across data generated from various spatial transcriptomics platforms. Furthermore, Pianno, in conjunction with clustering approaches, uncovers a region- and species-specific excitatory neuron subtype in the deep layer 3 of the human neocortex, shedding light on cellular evolution in the human neocortex. Overall, Pianno equips researchers with a robust and efficient tool for annotating diverse biological structures, offering new perspectives on spatial transcriptomics data.
Affiliation(s)
- Yuqiu Zhou
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
- Wei He
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
- Weizhen Hou
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
- Ying Zhu
- State Key Laboratory of Medical Neurobiology, MOE Frontiers Center for Brain Science, Institutes of Brain Science and Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China.
|
45
|
Yuan Z, Zhao F, Lin S, Zhao Y, Yao J, Cui Y, Zhang XY, Zhao Y. Benchmarking spatial clustering methods with spatially resolved transcriptomics data. Nat Methods 2024; 21:712-722. [PMID: 38491270 DOI: 10.1038/s41592-024-02215-8] [Received: 03/13/2023] [Accepted: 02/16/2024] [Indexed: 03/18/2024]
Abstract
Spatial clustering, which shares an analogy with single-cell clustering, has expanded the scope of tissue physiology studies from cell-centroid to structure-centroid with spatially resolved transcriptomics (SRT) data. Computational methods have undergone remarkable development in recent years, but a comprehensive benchmark study is still lacking. Here we present a benchmark study of 13 computational methods on 34 SRT data (7 datasets). The performance was evaluated on the basis of accuracy, spatial continuity, marker gene detection, scalability, and robustness. We found existing methods were complementary in terms of their performance and functionality, and we provide guidance for selecting appropriate methods for given scenarios. On testing 22 additional challenging datasets, we identified challenges in identifying noncontinuous spatial domains and limitations of existing methods, highlighting their inadequacies in handling recent large-scale tasks. Furthermore, with 145 simulated data, we examined the robustness of these methods against four different factors, and assessed the impact of pre- and postprocessing approaches. Our study offers a comprehensive evaluation of existing spatial clustering methods with SRT data, paving the way for future advancements in this rapidly evolving field.
Affiliation(s)
- Zhiyuan Yuan
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China.
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China.
- Fangyuan Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Senlin Lin
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Yu Zhao
- Tencent AI Lab, Shenzhen, China
- Yan Cui
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Institute of Science and Technology for Brain-Inspired Intelligence; MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence; MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto, Japan
- Xiao-Yong Zhang
- Center for Medical Research and Innovation, Shanghai Pudong Hospital, Fudan University Pudong Medical Center, Fudan University, Shanghai, China
- Yi Zhao
- Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.
- University of Chinese Academy of Sciences, Beijing, China.
46
|
Lei L, Han K, Wang Z, Shi C, Wang Z, Dai R, Zhang Z, Wang M, Guo Q. Attention-guided variational graph autoencoders reveal heterogeneity in spatial transcriptomics. Brief Bioinform 2024; 25:bbae173. [PMID: 38627939 PMCID: PMC11021349 DOI: 10.1093/bib/bbae173] [Received: 12/16/2023] [Revised: 03/03/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024] Open
Abstract
The latest breakthroughs in spatially resolved transcriptomics technology offer comprehensive opportunities to delve into gene expression patterns within the tissue microenvironment. However, the precise identification of spatial domains within tissues remains challenging. In this study, we introduce AttentionVGAE (AVGN), which integrates slice images, spatial information and raw gene expression while calibrating low-quality gene expression. By combining the variational graph autoencoder with multi-head attention blocks (MHA blocks), AVGN captures spatial relationships in tissue gene expression, adaptively focusing on key features and alleviating the need for prior knowledge of cluster numbers, thereby achieving superior clustering performance. Particularly, AVGN attempts to balance the model's attention focus on local and global structures by utilizing MHA blocks, an aspect that current graph neural networks have not extensively addressed. Benchmark testing demonstrates its significant efficacy in elucidating tissue anatomy and interpreting tumor heterogeneity, indicating its potential in advancing spatial transcriptomics research and understanding complex biological phenomena.
Affiliation(s)
- Lixin Lei
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Kaitai Han
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Zijun Wang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Chaojing Shi
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Zhenghui Wang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Ruoyan Dai
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Zhiwei Zhang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Mengqiu Wang
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
- Qianjin Guo
- Academy of Artificial Intelligence, Beijing Institute of Petrochemical Technology, Beijing 102617, China
47
|
Zhai Y, Chen L, Deng M. scBOL: a universal cell type identification framework for single-cell and spatial transcriptomics data. Brief Bioinform 2024; 25:bbae188. [PMID: 38678389 PMCID: PMC11056022 DOI: 10.1093/bib/bbae188] [Received: 01/10/2024] [Revised: 03/11/2024] [Accepted: 04/14/2024] [Indexed: 04/30/2024] Open
Abstract
MOTIVATION: Over the past decade, single-cell transcriptomic technologies have experienced remarkable advancements, enabling the simultaneous profiling of gene expressions across thousands of individual cells. Cell type identification plays an essential role in exploring tissue heterogeneity and characterizing cell state differences. With more and more well-annotated reference data becoming available, massive automatic identification methods have sprung up to simplify the annotation process on unlabeled target data by transferring the cell type knowledge. However, in practice, the target data often include some novel cell types that are not in the reference data. Most existing works usually classify these private cells as one generic 'unassigned' group and learn the features of known and novel cell types in a coupled way. They are susceptible to the potential batch effects and fail to explore the fine-grained semantic knowledge of novel cell types, thus hurting the model's discrimination ability. Additionally, emerging spatial transcriptomic technologies, such as in situ hybridization, sequencing and multiplexed imaging, present a novel challenge to current cell type identification strategies that predominantly neglect spatial organization. Consequently, it is imperative to develop a versatile method that can proficiently annotate single-cell transcriptomics data, encompassing both spatial and non-spatial dimensions.
RESULTS: To address these issues, we propose a new, challenging yet realistic task called universal cell type identification for single-cell and spatial transcriptomics data. In this task, we aim to give semantic labels to target cells from known cell types and cluster labels to those from novel ones. To tackle this problem, instead of designing a suboptimal two-stage approach, we propose an end-to-end algorithm called scBOL from the perspective of Bipartite prototype alignment. Firstly, we identify the mutual nearest clusters in reference and target data as their potential common cell types. On this basis, we mine the cycle-consistent semantic anchor cells to build the intrinsic structure association between two data. Secondly, we design a neighbor-aware prototypical learning paradigm to strengthen the inter-cluster separability and intra-cluster compactness within each data, thereby inspiring the discriminative feature representations. Thirdly, driven by the semantic-aware prototypical learning framework, we can align the known cell types and separate the private cell types from them among reference and target data. Such an algorithm can be seamlessly applied to various data types modeled by different foundation models that can generate the embedding features for cells. Specifically, for non-spatial single-cell transcriptomics data, we use the autoencoder neural network to learn latent low-dimensional cell representations, and for spatial single-cell transcriptomics data, we apply the graph convolution network to capture molecular and spatial similarities of cells jointly. Extensive results on our carefully designed evaluation benchmarks demonstrate the superiority of scBOL over various state-of-the-art cell type identification methods. To our knowledge, we are the pioneers in presenting this pragmatic annotation task, as well as in devising a comprehensive algorithmic framework aimed at resolving this challenge across varied types of single-cell data. Finally, scBOL is implemented in Python using the PyTorch machine-learning library, and it is freely available at https://github.com/aimeeyaoyao/scBOL.
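The first step described in this abstract, pairing mutual nearest clusters between reference and target data, can be illustrated with a minimal sketch. This is not scBOL's implementation: the function name `mutual_nearest_clusters`, the use of cluster centroids, and the Euclidean distance are all assumptions made for illustration only.

```python
import math


def mutual_nearest_clusters(ref_centroids, tgt_centroids):
    """Return (ref_index, tgt_index) pairs that are each other's nearest centroid.

    Hypothetical helper: centroids are plain coordinate sequences and
    closeness is Euclidean distance (both are illustrative assumptions).
    """
    def nearest(point, candidates):
        # Index of the candidate centroid closest to `point`.
        return min(range(len(candidates)),
                   key=lambda k: math.dist(point, candidates[k]))

    nearest_tgt = [nearest(r, tgt_centroids) for r in ref_centroids]
    nearest_ref = [nearest(t, ref_centroids) for t in tgt_centroids]
    # Keep a pair only when the nearest-neighbour relation holds both ways.
    return [(i, j) for i, j in enumerate(nearest_tgt) if nearest_ref[j] == i]


# Toy centroids: two reference clusters, three target clusters.
ref = [(0.0, 0.0), (10.0, 10.0)]
tgt = [(0.5, 0.0), (10.0, 9.0), (50.0, 50.0)]
pairs = mutual_nearest_clusters(ref, tgt)
```

In this toy input, the third target centroid is not the nearest neighbour of any reference centroid, so it is left unpaired; under the abstract's framing, such unpaired clusters would be the candidates for novel (private) cell types.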
Affiliation(s)
- Yuyao Zhai
- School of Mathematical Sciences, Peking University, Beijing, China
- Liang Chen
- Huawei Technologies Co., Ltd., Beijing, China
- Minghua Deng
- School of Mathematical Sciences, Peking University, Beijing, China
- Center for Statistical Science, Peking University, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing, China
48
|
Kilfeather P, Khoo JH, Wagner K, Liang H, Caiazza MC, An Y, Zhang X, Chen X, Connor-Robson N, Shang Z, Wade-Martins R. Single-cell spatial transcriptomic and translatomic profiling of dopaminergic neurons in health, aging, and disease. Cell Rep 2024; 43:113784. [PMID: 38386560 DOI: 10.1016/j.celrep.2024.113784] [Received: 04/12/2023] [Revised: 11/14/2023] [Accepted: 01/27/2024] [Indexed: 02/24/2024] Open
Abstract
The brain is spatially organized and contains unique cell types, each performing diverse functions and exhibiting differential susceptibility to neurodegeneration. This is exemplified in Parkinson's disease with the preferential loss of dopaminergic neurons of the substantia nigra pars compacta. Using a Parkinson's transgenic model, we conducted a single-cell spatial transcriptomic and dopaminergic neuron translatomic analysis of young and old mouse brains. Through the high resolving capacity of single-cell spatial transcriptomics, we provide a deep characterization of the expression features of dopaminergic neurons and 27 other cell types within their spatial context, identifying markers of healthy and aging cells, spanning Parkinson's relevant pathways. We integrate gene enrichment and genome-wide association study data to prioritize putative causative genes for disease investigation, identifying CASR as a regulator of dopaminergic calcium handling. These datasets represent the largest public resource for the investigation of spatial gene expression in brain cells in health, aging, and disease.
Affiliation(s)
- Peter Kilfeather
- Oxford Parkinson's Disease Centre and Department of Physiology, Anatomy and Genetics, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK; Kavli Institute for Nanoscience Discovery, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK; Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Katherina Wagner
- Oxford Parkinson's Disease Centre and Department of Physiology, Anatomy and Genetics, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK
- Maria Claudia Caiazza
- Oxford Parkinson's Disease Centre and Department of Physiology, Anatomy and Genetics, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK; Kavli Institute for Nanoscience Discovery, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK; Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA
- Yanru An
- BGI Research, 49276 Riga, Latvia
- Natalie Connor-Robson
- Oxford Parkinson's Disease Centre and Department of Physiology, Anatomy and Genetics, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK
- Richard Wade-Martins
- Oxford Parkinson's Disease Centre and Department of Physiology, Anatomy and Genetics, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK; Kavli Institute for Nanoscience Discovery, University of Oxford, Dorothy Crowfoot Hodgkin Building, South Parks Road, Oxford OX1 3QU, UK; Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, USA.
49
|
Guo X, Ning J, Chen Y, Liu G, Zhao L, Fan Y, Sun S. Recent advances in differential expression analysis for single-cell RNA-seq and spatially resolved transcriptomic studies. Brief Funct Genomics 2024; 23:95-109. [PMID: 37022699 DOI: 10.1093/bfgp/elad011] [Received: 07/08/2022] [Revised: 12/09/2022] [Accepted: 03/10/2023] [Indexed: 04/07/2023] Open
Abstract
Differential expression (DE) analysis is a necessary step in the analysis of single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) data. Unlike traditional bulk RNA-seq, DE analysis for scRNA-seq or SRT data has unique characteristics that may contribute to the difficulty of detecting DE genes. However, the plethora of DE tools that work with various assumptions makes it difficult to choose an appropriate one. Furthermore, a comprehensive review on detecting DE genes for scRNA-seq data or SRT data from multi-condition, multi-sample experimental designs is lacking. To bridge such a gap, here, we first focus on the challenges of DE detection, then highlight potential opportunities that facilitate further progress in scRNA-seq or SRT analysis, and finally provide insights and guidance in selecting appropriate DE tools or developing new computational DE methods.
Affiliation(s)
- Xiya Guo
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Jin Ning
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Yuanze Chen
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Guoliang Liu
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Liyan Zhao
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Yue Fan
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Shiquan Sun
- School of Public Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
- Key Laboratory of Trace Elements and Endemic Diseases, Center for Single Cell Omics and Health, Xi'an Jiaotong University, Xi'an, Shaanxi 710061, P.R. China
50
|
Duan B, Chen S, Cheng X, Liu Q. Multi-slice spatial transcriptome domain analysis with SpaDo. Genome Biol 2024; 25:73. [PMID: 38504325 PMCID: PMC10949687 DOI: 10.1186/s13059-024-03213-x] [Received: 08/14/2023] [Accepted: 03/08/2024] [Indexed: 03/21/2024] Open
Abstract
With the rapid advancements in spatial transcriptome sequencing, multiple tissue slices are now available, enabling the integration and interpretation of spatial cellular landscapes. Herein, we introduce SpaDo, a tool for multi-slice spatial domain analysis, including modules for multi-slice spatial domain detection, reference-based annotation, and multiple slice clustering at both single-cell and spot resolutions. We demonstrate SpaDo's effectiveness with over 40 multi-slice spatial transcriptome datasets from 7 sequencing platforms. Our findings highlight SpaDo's potential to reveal novel biological insights in multi-slice spatial transcriptomes.
Affiliation(s)
- Bin Duan
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.
- Shaoqi Chen
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Xiaojie Cheng
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
- Qi Liu
- State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department of Tongji Hospital, Frontier Science Center for Stem Cell Research, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China.
- Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201804, China.
- Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China.