1
Chitra U, Arnold BJ, Sarkar H, Sanno K, Ma C, Lopez-Darwin S, Raphael BJ. Mapping the topography of spatial gene expression with interpretable deep learning. Nat Methods 2025; 22:298-309. [PMID: 39849132 DOI: 10.1038/s41592-024-02503-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Accepted: 10/14/2024] [Indexed: 01/25/2025]
Abstract
Spatially resolved transcriptomics technologies provide high-throughput measurements of gene expression in a tissue slice, but the sparsity of these data complicates analysis of spatial gene expression patterns. We address this issue by deriving a topographic map of a tissue slice (analogous to a map of elevation in a landscape) using a quantity called the isodepth. Contours of constant isodepths enclose domains with distinct cell type composition, while gradients of the isodepth indicate spatial directions of maximum change in expression. We develop GASTON (gradient analysis of spatial transcriptomics organization with neural networks), an unsupervised and interpretable deep learning algorithm that simultaneously learns the isodepth, spatial gradients and piecewise linear expression functions that model both continuous gradients and discontinuous variation in gene expression. We show that GASTON accurately identifies spatial domains and marker genes across several tissues, gradients of neuronal differentiation and firing in the brain, and gradients of metabolism and immune activity in the tumor microenvironment.
Affiliation(s)
- Uthsav Chitra
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Brian J Arnold
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
- Hirak Sarkar
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Ludwig Cancer Institute, Princeton Branch, Princeton University, Princeton, NJ, USA
- Kohei Sanno
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Cong Ma
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Sereno Lopez-Darwin
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ, USA.
2
Yan G, Hua SH, Li JJ. Categorization of 34 computational methods to detect spatially variable genes from spatially resolved transcriptomics data. Nat Commun 2025; 16:1141. [PMID: 39880807 PMCID: PMC11779979 DOI: 10.1038/s41467-025-56080-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Accepted: 01/06/2025] [Indexed: 01/31/2025] Open
Abstract
In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 34 state-of-the-art methods, classifying SVGs into three categories: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.
Affiliation(s)
- Guanao Yan
  - Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095-1554, USA
- Shuo Harper Hua
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305, USA
- Jingyi Jessica Li
  - Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095-1554, USA.
  - Department of Human Genetics, University of California, Los Angeles, CA, 90095-7088, USA.
  - Department of Computational Medicine, University of California, Los Angeles, CA, 90095-1766, USA.
  - Department of Biostatistics, University of California, Los Angeles, CA, 90095-1772, USA.
  - Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA, 02138, USA.
3
Qiu X, Zhu DY, Lu Y, Yao J, Jing Z, Min KH, Cheng M, Pan H, Zuo L, King S, Fang Q, Zheng H, Wang M, Wang S, Zhang Q, Yu S, Liao S, Liu C, Wu X, Lai Y, Hao S, Zhang Z, Wu L, Zhang Y, Li M, Tu Z, Lin J, Yang Z, Li Y, Gu Y, Ellison D, Chen A, Liu L, Weissman JS, Ma J, Xu X, Liu S, Bai Y. Spatiotemporal modeling of molecular holograms. Cell 2024; 187:7351-7373.e61. [PMID: 39532097 DOI: 10.1016/j.cell.2024.10.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/09/2022] [Revised: 05/29/2024] [Accepted: 10/08/2024] [Indexed: 11/16/2024]
Abstract
Quantifying spatiotemporal dynamics during embryogenesis is crucial for understanding congenital diseases. We developed Spateo (https://github.com/aristoteleo/spateo-release), a 3D spatiotemporal modeling framework, and applied it to a 3D mouse embryogenesis atlas at E9.5 and E11.5, capturing eight million cells. Spateo enables scalable, partial, non-rigid alignment, multi-slice refinement, and mesh correction to create molecular holograms of whole embryos. It introduces digitization methods to uncover multi-level biology from subcellular to whole organ, identifying expression gradients along orthogonal axes of emergent 3D structures, e.g., secondary organizers such as midbrain-hindbrain boundary (MHB). Spateo further jointly models intercellular and intracellular interaction to dissect signaling landscapes in 3D structures, including the zona limitans intrathalamica (ZLI). Lastly, Spateo introduces "morphometric vector fields" of cell migration and integrates spatial differential geometry to unveil molecular programs underlying asymmetrical murine heart organogenesis and others, bridging macroscopic changes with molecular dynamics. Thus, Spateo enables the study of organ ecology at a molecular level in 3D space over time.
Affiliation(s)
- Xiaojie Qiu
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Basic Sciences and Engineering Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA; Stanford Cardiovascular Institute, Stanford University, Stanford, CA, USA.
- Daniel Y Zhu
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yifan Lu
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA; Basic Sciences and Engineering Initiative, Betty Irene Moore Children's Heart Center, Lucile Packard Children's Hospital, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA 94305, USA; Electronic Information School, Wuhan University, Wuhan 430072, China
- Jiajun Yao
  - BGI Research, Hangzhou 310030, China; BGI Research, Sanya 572025, China; College of Life Sciences, Northwest University, Xi'an 710069, China
- Zehua Jing
  - BGI Research, Hangzhou 310030, China; BGI Research, Sanya 572025, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Kyung Hoi Min
  - Ginkgo Bioworks, The Innovation and Design Building, Boston, MA 02210, USA
- Mengnan Cheng
  - BGI Research, Hangzhou 310030, China; BGI Research, Shenzhen 518083, China
- Lulu Zuo
  - BGI Research, Shenzhen 518083, China
- Samuel King
  - Department of Bioengineering, Stanford University School of Medicine, Stanford, CA, USA
- Qi Fang
  - BGI Research, Hangzhou 310030, China; BGI Research, Shenzhen 518083, China
- Huiwen Zheng
  - BGI Research, Hangzhou 310030, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Mingyue Wang
  - BGI Research, Hangzhou 310030, China; Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Shuai Wang
  - BGI Research, Hangzhou 310030, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Qingquan Zhang
  - Department of Medicine, Division of Cardiology, University of California, San Diego, La Jolla, CA, USA
- Sichao Yu
  - Whitehead Institute for Biomedical Research, Cambridge, MA, USA
- Sha Liao
  - BGI Research, Shenzhen 518083, China; STOmics Tech Co., Ltd, Shenzhen 518083, China; BGI Research, Chongqing 401329, China
- Chao Liu
  - BGI Research, Wuhan 430074, China
- Xinchao Wu
  - BGI Research, Hangzhou 310030, China; BGI Research, Sanya 572025, China; School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Yiwei Lai
  - BGI Research, Shenzhen 518083, China
- Zhewei Zhang
  - BGI Research, Hangzhou 310030, China; BGI Research, Sanya 572025, China; School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Liang Wu
  - BGI Research, Chongqing 401329, China
- Mei Li
  - STOmics Tech Co., Ltd, Shenzhen 518083, China
- Zhencheng Tu
  - BGI Research, Hangzhou 310030, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Jinpei Lin
  - BGI Research, Hangzhou 310030, China; BGI Research, Sanya 572025, China
- Zhuoxuan Yang
  - BGI Research, Hangzhou 310030, China; School of Life Sciences, Southern University of Science and Technology, Shenzhen 518055, China
- Ying Gu
  - BGI Research, Hangzhou 310030, China; BGI Research, Shenzhen 518083, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Ao Chen
  - BGI Research, Shenzhen 518083, China; STOmics Tech Co., Ltd, Shenzhen 518083, China; BGI Research, Chongqing 401329, China
- Longqi Liu
  - BGI Research, Hangzhou 310030, China; Shenzhen Bay Laboratory, Shenzhen 518132, China; Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen 518120, China
- Jonathan S Weissman
  - Whitehead Institute for Biomedical Research, Cambridge, MA, USA; Department of Biology and Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, USA; Koch Institute for Integrative Cancer Research at MIT, MIT, Cambridge, MA, USA
- Jiayi Ma
  - Electronic Information School, Wuhan University, Wuhan 430072, China.
- Xun Xu
  - BGI Research, Hangzhou 310030, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China; Guangdong Provincial Key Laboratory of Genome Read and Write, BGI-Shenzhen, Shenzhen 518120, China.
- Shiping Liu
  - BGI Research, Hangzhou 310030, China; Shenzhen Bay Laboratory, Shenzhen 518132, China; Shenzhen Key Laboratory of Single-Cell Omics, BGI-Shenzhen, Shenzhen 518120, China; The Guangdong-Hong Kong Joint Laboratory on Immunological and Genetic Kidney Diseases, Guangzhou, Guangdong, China.
- Yinqi Bai
  - BGI Research, Sanya 572025, China; Hainan Technology Innovation Center for Marine Biological Resources Utilization (Preparatory Period), BGI Research, Sanya 572025, China.
4
Zhang C, Wang L, Shi Q. Computational modeling for deciphering tissue microenvironment heterogeneity from spatially resolved transcriptomics. Comput Struct Biotechnol J 2024; 23:2109-2115. [PMID: 38800634 PMCID: PMC11126885 DOI: 10.1016/j.csbj.2024.05.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 05/15/2024] [Accepted: 05/16/2024] [Indexed: 05/29/2024] Open
Abstract
Spatial transcriptomics techniques, while measuring gene expression, retain spatial location information, aiding in situ studies of organismal tissue architecture and the progression of pathological processes. These techniques generate vast amounts of omics data, necessitating the development of computational methods to reveal the underlying tissue microenvironment heterogeneity. The main directions in spatial transcriptomics data analysis are spatial domain detection and spatial deconvolution, which can identify spatial functional regions and parse the distribution of cell types in spatial transcriptomics data by integrating single-cell transcriptomics data. In these two research directions, many computational methods have been successively proposed. This article will categorize them into three types: machine learning-based methods, probabilistic models-based methods, and deep learning-based methods. It will list and discuss the representative algorithms of each type along with their advantages and disadvantages and describe the datasets and evaluation metrics used to assess these computational methods, facilitating researchers in selecting suitable computational methods according to their research needs. Finally, combining the latest technological developments and the advantages and disadvantages of current algorithms, this article will look forward to the future directions of computational method development.
Affiliation(s)
- Chuanchao Zhang
  - Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, Hangzhou 310024; University of Chinese Academy of Sciences, China
- Lequn Wang
  - State Key Laboratory of Cell Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Qianqian Shi
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Engineering Technology Research Center of Agricultural Big Data, Huazhong Agricultural University, Wuhan 430070, Hubei, China
5
Yan G, Hua SH, Li JJ. Categorization of 33 computational methods to detect spatially variable genes from spatially resolved transcriptomics data. ARXIV 2024:arXiv:2405.18779v4. [PMID: 38855546 PMCID: PMC11160866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
In the analysis of spatially resolved transcriptomics data, detecting spatially variable genes (SVGs) is crucial. Numerous computational methods exist, but varying SVG definitions and methodologies lead to incomparable results. We review 33 state-of-the-art methods, categorizing SVGs into three types: overall, cell-type-specific, and spatial-domain-marker SVGs. Our review explains the intuitions underlying these methods, summarizes their applications, and categorizes the hypothesis tests they use in the trade-off between generality and specificity for SVG detection. We discuss challenges in SVG detection and propose future directions for improvement. Our review offers insights for method developers and users, advocating for category-specific benchmarking.
Affiliation(s)
- Guanao Yan
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
- Shuo Harper Hua
  - Department of Biomedical Data Science, Stanford University, Stanford, CA 94305
- Jingyi Jessica Li
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554
  - Department of Human Genetics, University of California, Los Angeles, CA 90095-7088
  - Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766
  - Department of Biostatistics, University of California, Los Angeles, CA 90095-1772
  - Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA 02138
6
Hu Y, Xie M, Li Y, Rao M, Shen W, Luo C, Qin H, Baek J, Zhou XM. Benchmarking clustering, alignment, and integration methods for spatial transcriptomics. Genome Biol 2024; 25:212. [PMID: 39123269 PMCID: PMC11312151 DOI: 10.1186/s13059-024-03361-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 07/30/2024] [Indexed: 08/12/2024] Open
Abstract
BACKGROUND: Spatial transcriptomics (ST) is advancing our understanding of complex tissues and organisms. However, building a robust clustering algorithm to define spatially coherent regions in a single tissue slice and aligning or integrating multiple tissue slices originating from diverse sources for essential downstream analyses remains challenging. Numerous clustering, alignment, and integration methods have been specifically designed for ST data by leveraging its spatial information. The absence of comprehensive benchmark studies complicates the selection of methods and future method development.
RESULTS: In this study, we systematically benchmark a variety of state-of-the-art algorithms with a wide range of real and simulated datasets of varying sizes, technologies, species, and complexity. We analyze the strengths and weaknesses of each method using diverse quantitative and qualitative metrics and analyses, including eight metrics for spatial clustering accuracy and contiguity, uniform manifold approximation and projection visualization, layer-wise and spot-to-spot alignment accuracy, and 3D reconstruction, which are designed to assess method performance as well as data quality. The code used for evaluation is available on our GitHub. Additionally, we provide online notebook tutorials and documentation to facilitate the reproduction of all benchmarking results and to support the study of new methods and new datasets.
CONCLUSIONS: Our analyses lead to comprehensive recommendations that cover multiple aspects, helping users to select optimal tools for their specific needs and guide future method development.
Affiliation(s)
- Yunfei Hu
  - Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Manfei Xie
  - Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Yikang Li
  - Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Mingxing Rao
  - Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Wenjun Shen
  - Department of Bioinformatics, Shantou University Medical College, 515041, Shantou, China
- Can Luo
  - Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA
- Haoran Qin
  - Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Jihoon Baek
  - Department of Computer Science, Vanderbilt University, 37235, Nashville, USA
- Xin Maizie Zhou
  - Department of Computer Science, Vanderbilt University, 37235, Nashville, USA.
  - Department of Biomedical Engineering, Vanderbilt University, 37235, Nashville, USA.
7
Sarkar H, Chitra U, Gold J, Raphael BJ. A count-based model for delineating cell-cell interactions in spatial transcriptomics data. Bioinformatics 2024; 40:i481-i489. [PMID: 38940134 PMCID: PMC11211854 DOI: 10.1093/bioinformatics/btae219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION: Cell-cell interactions (CCIs) consist of cells exchanging signals with themselves and neighboring cells by expressing ligand and receptor molecules and play a key role in cellular development, tissue homeostasis, and other critical biological functions. Since direct measurement of CCIs is challenging, multiple methods have been developed to infer CCIs by quantifying correlations between the gene expression of the ligands and receptors that mediate CCIs, originally from bulk RNA-sequencing data and more recently from single-cell or spatially resolved transcriptomics (SRT) data. SRT has a particular advantage over single-cell approaches, since ligand-receptor correlations can be computed between cells or spots that are physically close in the tissue. However, the transcript counts of individual ligands and receptors in SRT data are generally low, complicating the inference of CCIs from expression correlations.
RESULTS: We introduce Copulacci, a count-based model for inferring CCIs from SRT data. Copulacci uses a Gaussian copula to model dependencies between the expression of ligands and receptors from nearby spatial locations even when the transcript counts are low. On simulated data, Copulacci outperforms existing CCI inference methods based on the standard Spearman and Pearson correlation coefficients. Using several real SRT datasets, we show that Copulacci discovers biologically meaningful ligand-receptor interactions that are lowly expressed and undiscoverable by existing CCI inference methods.
AVAILABILITY AND IMPLEMENTATION: Copulacci is implemented in Python and available at https://github.com/raphael-group/copulacci.
Affiliation(s)
- Hirak Sarkar
  - Department of Computer Science, Princeton University, Princeton, NJ, 08540, United States
  - Ludwig Cancer Institute, Princeton Branch, Princeton University, Princeton, NJ, 08540, United States
- Uthsav Chitra
  - Department of Computer Science, Princeton University, Princeton, NJ, 08540, United States
- Julian Gold
  - Department of Computer Science, Princeton University, Princeton, NJ, 08540, United States
  - Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, 08540, United States
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ, 08540, United States
8
Yu S, Li WV. spVC for the detection and interpretation of spatial gene expression variation. Genome Biol 2024; 25:103. [PMID: 38641849 PMCID: PMC11027374 DOI: 10.1186/s13059-024-03245-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Accepted: 04/10/2024] [Indexed: 04/21/2024] Open
Abstract
Spatially resolved transcriptomics technologies have opened new avenues for understanding gene expression heterogeneity in spatial contexts. However, existing methods for identifying spatially variable genes often focus solely on statistical significance, limiting their ability to capture continuous expression patterns and integrate spot-level covariates. To address these challenges, we introduce spVC, a statistical method based on a generalized Poisson model. spVC seamlessly integrates constant and spatially varying effects of covariates, facilitating comprehensive exploration of gene expression variability and enhancing interpretability. Simulation and real data applications confirm spVC's accuracy in these tasks, highlighting its versatility in spatial transcriptomics analysis.
Affiliation(s)
- Shan Yu
  - Department of Statistics, University of Virginia, Charlottesville, 22903, VA, USA.
- Wei Vivian Li
  - Department of Statistics, University of California, Riverside, 92521, CA, USA.
9
Song S, Mohsin E, Zhang R, Kuznetsov A, Shen L, Grossman RL, Weber CR, Khan AA. ATAT: Automated Tissue Alignment and Traversal in Spatial Transcriptomics with Self-Supervised Learning. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.08.570839. [PMID: 38106010 PMCID: PMC10723486 DOI: 10.1101/2023.12.08.570839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Spatial transcriptomics (ST) has enhanced RNA analysis in tissue biopsies, but interpreting these data is challenging without expert input. We present Automated Tissue Alignment and Traversal (ATAT), a novel computational framework designed to enhance ST analysis in the context of multiple and complex tissue architectures and morphologies, such as those found in biopsies of the gastrointestinal tract. ATAT utilizes self-supervised contrastive learning on hematoxylin and eosin (H&E) stained images to automate the alignment and traversal of ST data. This approach addresses a critical gap in current ST analysis methodologies, which rely heavily on manual annotation and pathologist expertise to delineate regions of interest for accurate gene expression modeling. Our framework not only streamlines the alignment of multiple ST samples, but also demonstrates robustness in modeling gene expression transitions across specific regions. Additionally, we highlight the ability of ATAT to traverse complex tissue topologies in real-world cases from various individuals and conditions. Our method successfully elucidates differences in immune infiltration patterns across the intestinal wall, enabling the modeling of transcriptional changes across histological layers. We show that ATAT achieves comparable performance to the state-of-the-art method, while alleviating the burden of manual annotation and enabling alignment of tissue samples with complex morphologies.
Affiliation(s)
- Steven Song
  - Department of Computer Science, University of Chicago, IL 60637, USA
  - Interdisciplinary Scientist Training Program, University of Chicago, Chicago, IL 60637, USA
- Emaan Mohsin
  - Department of Pathology, University of Chicago, Chicago, IL 60637, USA
- Renyu Zhang
  - Department of Computer Science, University of Chicago, IL 60637, USA
- Andrey Kuznetsov
  - Department of Pathology, University of Chicago, Chicago, IL 60637, USA
- Le Shen
  - Department of Pathology, University of Chicago, Chicago, IL 60637, USA
- Robert L. Grossman
  - Department of Computer Science, University of Chicago, IL 60637, USA
  - Department of Medicine, University of Chicago, Chicago, IL 60637, USA
- Aly A. Khan
  - Department of Pathology, University of Chicago, Chicago, IL 60637, USA
  - Committee on Immunology, University of Chicago, Chicago, IL 60637, USA
  - Institute for Population and Precision Health, University of Chicago, Chicago, IL 60637, USA
  - Department of Family Medicine, University of Chicago, Chicago, IL 60637, USA
10
Chitra U, Arnold BJ, Sarkar H, Ma C, Lopez-Darwin S, Sanno K, Raphael BJ. Mapping the topography of spatial gene expression with interpretable deep learning. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.10.561757. [PMID: 37873258 PMCID: PMC10592770 DOI: 10.1101/2023.10.10.561757] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 10/25/2023]
Abstract
Spatially resolved transcriptomics technologies provide high-throughput measurements of gene expression in a tissue slice, but the sparsity of this data complicates the analysis of spatial gene expression patterns such as gene expression gradients. We address these issues by deriving a topographic map of a tissue slice (analogous to a map of elevation in a landscape) using a novel quantity called the isodepth. Contours of constant isodepth enclose spatial domains with distinct cell type composition, while gradients of the isodepth indicate spatial directions of maximum change in gene expression. We develop GASTON, an unsupervised and interpretable deep learning algorithm that simultaneously learns the isodepth, spatial gene expression gradients, and piecewise linear functions of the isodepth that model both continuous gradients and discontinuous spatial variation in the expression of individual genes. We validate GASTON by showing that it accurately identifies spatial domains and marker genes across several biological systems. In SRT data from the brain, GASTON reveals gradients of neuronal differentiation and firing, and in SRT data from a tumor sample, GASTON infers gradients of metabolic activity and epithelial-mesenchymal transition (EMT)-related gene expression in the tumor microenvironment.
Affiliation(s)
- Uthsav Chitra
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Brian J. Arnold
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
- Hirak Sarkar
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Ludwig Cancer Institute, Princeton Branch, Princeton University, Princeton, NJ, USA
- Cong Ma
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Kohei Sanno
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
  - Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
11
Velten B, Stegle O. Principles and challenges of modeling temporal and spatial omics data. Nat Methods 2023; 20:1462-1474. [PMID: 37710019 DOI: 10.1038/s41592-023-01992-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/06/2022] [Accepted: 07/31/2023] [Indexed: 09/16/2023]
Abstract
Studies with temporal or spatial resolution are crucial to understand the molecular dynamics and spatial dependencies underlying a biological process or system. With advances in high-throughput omic technologies, time- and space-resolved molecular measurements at scale are increasingly accessible, providing new opportunities to study the role of timing or structure in a wide range of biological questions. At the same time, analyses of the data being generated in the context of spatiotemporal studies entail new challenges that need to be considered, including the need to account for temporal and spatial dependencies and compare them across different scales, biological samples or conditions. In this Review, we provide an overview of common principles and challenges in the analysis of temporal and spatial omics data. We discuss statistical concepts to model temporal and spatial dependencies and highlight opportunities for adapting existing analysis methods to data with temporal and spatial dimensions.
Affiliation(s)
- Britta Velten
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany.
  - Cellular Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge, UK.
  - Centre for Organismal Studies (COS) and Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany.
- Oliver Stegle
  - Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), Heidelberg, Germany.
  - Cellular Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge, UK.
  - Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
12
Zhang D, Deng Y, Kukanja P, Agirre E, Bartosovic M, Dong M, Ma C, Ma S, Su G, Bao S, Liu Y, Xiao Y, Rosoklija GB, Dwork AJ, Mann JJ, Leong KW, Boldrini M, Wang L, Haeussler M, Raphael BJ, Kluger Y, Castelo-Branco G, Fan R. Spatial epigenome-transcriptome co-profiling of mammalian tissues. Nature 2023; 616:113-122. [PMID: 36922587 PMCID: PMC10076218 DOI: 10.1038/s41586-023-05795-1] [Citation(s) in RCA: 179] [Impact Index Per Article: 89.5] [Received: 06/06/2022] [Accepted: 02/03/2023] [Indexed: 03/17/2023]
Abstract
Emerging spatial technologies, including spatial transcriptomics and spatial epigenomics, are becoming powerful tools for profiling of cellular states in the tissue context [1-5]. However, current methods capture only one layer of omics information at a time, precluding the possibility of examining the mechanistic relationship across the central dogma of molecular biology. Here, we present two technologies for spatially resolved, genome-wide, joint profiling of the epigenome and transcriptome by cosequencing chromatin accessibility and gene expression, or histone modifications (H3K27me3, H3K27ac or H3K4me3) and gene expression on the same tissue section at near-single-cell resolution. These were applied to embryonic and juvenile mouse brain, as well as adult human brain, to map how epigenetic mechanisms control transcriptional phenotype and cell dynamics in tissue. Although highly concordant tissue features were identified by either spatial epigenome or spatial transcriptome, we also observed distinct patterns, suggesting their differential roles in defining cell states. Linking epigenome to transcriptome pixel by pixel allows the uncovering of new insights in spatial epigenetic priming, differentiation and gene regulation within the tissue architecture. These technologies are of great interest in life science and biomedical research.
Affiliation(s)
- Di Zhang
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Yanxiang Deng
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
- Yale Stem Cell Center and Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA.
- Department of Pathology and Laboratory Medicine, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.
| | - Petra Kukanja
- Laboratory of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Eneritz Agirre
- Laboratory of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Marek Bartosovic
- Laboratory of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Mingze Dong
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
| | - Cong Ma
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Sai Ma
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Graham Su
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Yale Stem Cell Center and Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA
| | - Shuozhen Bao
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
| | - Yang Liu
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA
- Yale Stem Cell Center and Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA
| | - Yang Xiao
- Department of Biomedical Engineering, Columbia University, New York, NY, USA
| | - Gorazd B Rosoklija
- Department of Psychiatry, Columbia University, New York, NY, USA
- Division of Molecular Imaging and Neuropathology, New York State Psychiatric Institute, New York, NY, USA
- Macedonian Academy of Sciences & Arts, Skopje, Republic of Macedonia
| | - Andrew J Dwork
- Department of Psychiatry, Columbia University, New York, NY, USA
- Division of Molecular Imaging and Neuropathology, New York State Psychiatric Institute, New York, NY, USA
- Macedonian Academy of Sciences & Arts, Skopje, Republic of Macedonia
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
| | - J John Mann
- Department of Psychiatry, Columbia University, New York, NY, USA
- Division of Molecular Imaging and Neuropathology, New York State Psychiatric Institute, New York, NY, USA
- Department of Radiology, Columbia University, New York, NY, USA
| | - Kam W Leong
- Department of Biomedical Engineering, Columbia University, New York, NY, USA
- Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
| | - Maura Boldrini
- Department of Psychiatry, Columbia University, New York, NY, USA
- Division of Molecular Imaging and Neuropathology, New York State Psychiatric Institute, New York, NY, USA
| | - Liya Wang
- AtlasXomics, Inc., New Haven, CT, USA
| | | | - Benjamin J Raphael
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Yuval Kluger
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA
- Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
- Applied Mathematics Program, Yale University, New Haven, CT, USA
| | - Gonçalo Castelo-Branco
- Laboratory of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden.
- Ming Wai Lau Centre for Reparative Medicine, Stockholm Node, Karolinska Institutet, Stockholm, Sweden.
| | - Rong Fan
- Department of Biomedical Engineering, Yale University, New Haven, CT, USA.
- Yale Stem Cell Center and Yale Cancer Center, Yale School of Medicine, New Haven, CT, USA.
- Department of Pathology, Yale University School of Medicine, New Haven, CT, USA.
- Human and Translational Immunology Program, Yale School of Medicine, New Haven, CT, USA.
| |
|