151
Zhang X, Wang X, Shivashankar GV, Uhler C. Graph-based autoencoder integrates spatial transcriptomics with chromatin images and identifies joint biomarkers for Alzheimer's disease. Nat Commun 2022; 13:7480. [PMID: 36463283 PMCID: PMC9719477 DOI: 10.1038/s41467-022-35233-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 04/06/2022] [Accepted: 11/23/2022] [Indexed: 12/07/2022] Open
Abstract
Tissue development and disease lead to changes in cellular organization, nuclear morphology, and gene expression, which can be jointly measured by spatial transcriptomic technologies. However, methods for jointly analyzing the different spatial data modalities in 3D are still lacking. We present a computational framework to integrate Spatial Transcriptomic data using over-parameterized graph-based Autoencoders with Chromatin Imaging data (STACI) to identify molecular and functional alterations in tissues. STACI incorporates multiple modalities in a single representation for downstream tasks, enables the prediction of spatial transcriptomic data from nuclear images in unseen tissue sections, and provides built-in batch correction of gene expression and tissue morphology through over-parameterization. We apply STACI to analyze the spatio-temporal progression of Alzheimer's disease and identify the associated nuclear morphometric and coupled gene expression features. Collectively, we demonstrate the importance of characterizing disease progression by integrating multiple data modalities and its potential for the discovery of disease biomarkers.
Affiliation(s)
- Xinyi Zhang
  - Massachusetts Institute of Technology, Cambridge, USA
  - Broad Institute of MIT and Harvard, Cambridge, USA
- Xiao Wang
  - Massachusetts Institute of Technology, Cambridge, USA
  - Broad Institute of MIT and Harvard, Cambridge, USA
- G V Shivashankar
  - ETH Zurich, Zurich, Switzerland
  - Paul Scherrer Institute, Villigen, Switzerland
- Caroline Uhler
  - Massachusetts Institute of Technology, Cambridge, USA
  - Broad Institute of MIT and Harvard, Cambridge, USA
152
Qiu Z, Li S, Luo M, Zhu S, Wang Z, Jiang Y. Detection of differentially expressed genes in spatial transcriptomics data by spatial analysis of spatial transcriptomics: A novel method based on spatial statistics. Front Neurosci 2022; 16:1086168. [PMID: 36523429 PMCID: PMC9745188 DOI: 10.3389/fnins.2022.1086168] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/01/2022] [Accepted: 11/17/2022] [Indexed: 09/10/2024] Open
Abstract
Background: Spatial transcriptomics (STs) simultaneously captures the location and amount of gene expression within a tissue section. However, current methods such as FindMarkers identify differentially expressed genes (DEGs) using classical statistics, which discard the spatial information. Materials and methods: A new method named spatial analysis of spatial transcriptomics (saSpatial) was developed to account for both the location and the amount of gene expression. saSpatial was then applied to detect DEGs both across and within cross-sections, and the DEGs it detected were compared with those detected by FindMarkers. Results: saSpatial is founded on spatial statistics. It detected DEGs in different regions of the normal brain section. In the brain with ischemic stroke, saSpatial revealed the DEGs for the ischemic core and penumbra. In addition, saSpatial characterized the genetic heterogeneity in the normal and ischemic cortex. Compared with FindMarkers, saSpatial found a larger number of valuable DEGs. Conclusion: saSpatial effectively detected DEGs in STs data. It is a simple and valuable tool that could help researchers find more valuable genes in future research.
Affiliation(s)
- Zhihua Qiu
  - Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
  - Department of Neurology, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Shaojun Li
  - Department of Neurology, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Ming Luo
  - Department of Neurology, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Shuanggen Zhu
  - Department of Neurology, People’s Hospital of Longhua, Shenzhen, China
- Zhijian Wang
  - Nanfang Hospital, Southern Medical University, Guangzhou, Guangdong, China
- Yongjun Jiang
  - Department of Neurology, The Second Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
153
Overbey EG, Das S, Cope H, Madrigal P, Andrusivova Z, Frapard S, Klotz R, Bezdan D, Gupta A, Scott RT, Park J, Chirko D, Galazka JM, Costes SV, Mason CE, Herranz R, Szewczyk NJ, Borg J, Giacomello S. Challenges and considerations for single-cell and spatially resolved transcriptomics sample collection during spaceflight. Cell Reports Methods 2022; 2:100325. [PMID: 36452864 PMCID: PMC9701605 DOI: 10.1016/j.crmeth.2022.100325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) and spatially resolved transcriptomics (SRT) have experienced rapid development in recent years. The findings of spaceflight-based scRNA-seq and SRT investigations are likely to improve our understanding of life in space and our comprehension of gene expression in various cell systems and tissue dynamics. However, compared to their Earth-based counterparts, gene expression experiments conducted in spaceflight have not experienced the same pace of development. Out of the hundreds of spaceflight gene expression datasets available, only a few used scRNA-seq and SRT. In this perspective piece, we explore the growing importance of scRNA-seq and SRT in space biology and discuss the challenges and considerations relevant to robust experimental design to enable growth of these methods in the field.
Affiliation(s)
- Eliah G. Overbey
  - Weill Cornell Medicine, New York, NY, USA
  - Institute for Computational Biomedicine, New York, NY, USA
- Saswati Das
  - Department of Biochemistry, Atal Bihari Vajpayee Institute of Medical Sciences & Dr. Ram Manohar Lohia Hospital, New Delhi, India
- Henry Cope
  - School of Medicine, University of Nottingham, Derby DE22 3DT, UK
- Pedro Madrigal
  - European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
- Zaneta Andrusivova
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Solène Frapard
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Rebecca Klotz
  - KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Daniela Bezdan
  - Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen 72076, Germany
  - NGS Competence Center Tübingen (NCCT), University of Tübingen, Tübingen, Germany
  - yuri GmbH, Meckenbeuren, Germany
- Ryan T. Scott
  - KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Jonathan M. Galazka
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Sylvain V. Costes
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Christopher E. Mason
  - Weill Cornell Medicine, New York, NY, USA
  - Institute for Computational Biomedicine, New York, NY, USA
  - The Feil Family Brain and Mind Research Institute, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, New York, NY, USA
- Raul Herranz
  - Centro de Investigaciones Biológicas Margarita Salas (CSIC), Madrid 28040, Spain
- Nathaniel J. Szewczyk
  - School of Medicine, University of Nottingham, Derby DE22 3DT, UK
  - Department of Biomedical Sciences, Heritage College of Osteopathic Medicine, Ohio University, Athens, OH 45701, USA
- Joseph Borg
  - Department of Applied Biomedical Science, Faculty of Health Sciences, University of Malta, Msida, Malta
- Stefania Giacomello
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
154
Summers HD, Wills JW, Rees P. Spatial statistics is a comprehensive tool for quantifying cell neighbor relationships and biological processes via tissue image analysis. Cell Reports Methods 2022; 2:100348. [PMID: 36452868 PMCID: PMC9701617 DOI: 10.1016/j.crmeth.2022.100348] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 06/17/2023]
Abstract
Automated microscopy and computational image analysis have transformed cell biology, providing quantitative, spatially resolved information on cells and their constituent molecules from the sub-micron to the whole-organ scale. Here we explore the application of spatial statistics to the cellular relationships within tissue microscopy data and discuss how spatial statistics offers cytometry a powerful yet underused mathematical tool set for which the required data are readily captured using standard protocols and microscopy equipment. We also highlight the often-overlooked need to carefully consider the structural heterogeneity of tissues when judging the applicability and accuracy of different statistical measures, and demonstrate how spatial analyses offer a great deal more than just basic quantification of biological variance. Ultimately, we highlight how statistical modeling can help reveal the hierarchical spatial processes that connect the properties of individual cells to the establishment of biological function.
Affiliation(s)
- Huw D. Summers
  - Department of Biomedical Engineering, Swansea University, Swansea SA1 8QQ, UK
- John W. Wills
  - Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, UK
- Paul Rees
  - Department of Biomedical Engineering, Swansea University, Swansea SA1 8QQ, UK
155
Tsuchiya T, Hori H, Ozaki H. CCPLS reveals cell-type-specific spatial dependence of transcriptomes in single cells. Bioinformatics 2022; 38:4868-4877. [PMID: 36063454 PMCID: PMC9620831 DOI: 10.1093/bioinformatics/btac599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 08/17/2022] [Accepted: 09/04/2022] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION: Cell-cell communications regulate internal cellular states, e.g. gene expression and cell functions, and play pivotal roles in normal development and disease states. Furthermore, single-cell RNA sequencing methods have revealed cell-to-cell expression variability of highly variable genes (HVGs), which is also crucial. Nevertheless, the regulation of cell-to-cell expression variability of HVGs via cell-cell communications is still largely unexplored. The recent advent of spatial transcriptome methods has linked gene expression profiles to the spatial context of single cells, which has provided opportunities to reveal those regulations. The existing computational methods extract genes with expression levels influenced by neighboring cell types. However, limitations remain in the quantitativeness and interpretability: they neither focus on HVGs nor consider the effects of multiple neighboring cell types. RESULTS: Here, we propose CCPLS (Cell-Cell communications analysis by Partial Least Square regression modeling), which is a statistical framework for identifying cell-cell communications as the effects of multiple neighboring cell types on cell-to-cell expression variability of HVGs, based on the spatial transcriptome data. For each cell type, CCPLS performs PLS regression modeling and reports coefficients as the quantitative index of the cell-cell communications. Evaluation using simulated data showed our method accurately estimated the effects of multiple neighboring cell types on HVGs. Furthermore, applications to the two real datasets demonstrate that CCPLS can extract biologically interpretable insights from the inferred cell-cell communications. AVAILABILITY AND IMPLEMENTATION: The R package is available at https://github.com/bioinfo-tsukuba/CCPLS. The data are available at https://github.com/bioinfo-tsukuba/CCPLS_paper. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Takaho Tsuchiya
  - Bioinformatics Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
- Hiroki Hori
  - Bioinformatics Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
  - Doctoral Program in Medical Sciences, Graduate School of Comprehensive Human Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
- Haruka Ozaki
  - Bioinformatics Laboratory, Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
  - Center for Artificial Intelligence Research, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan
156
Ma C, Chitra U, Zhang S, Raphael BJ. Belayer: Modeling discrete and continuous spatial variation in gene expression from spatially resolved transcriptomics. Cell Syst 2022; 13:786-797.e13. [PMID: 36265465 PMCID: PMC9814896 DOI: 10.1016/j.cels.2022.09.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/12/2022] [Revised: 07/13/2022] [Accepted: 09/06/2022] [Indexed: 01/26/2023]
Abstract
Spatially resolved transcriptomics (SRT) technologies measure gene expression at known locations in a tissue slice, enabling the identification of spatially varying genes or cell types. Current approaches for these tasks assume either that gene expression varies continuously across a tissue or that a tissue contains a small number of regions with distinct cellular composition. We propose a model for SRT data from layered tissues that includes both continuous and discrete spatial variation in expression and an algorithm, Belayer, to learn the parameters of this model. Belayer models gene expression as a piecewise linear function of the relative depth of a tissue layer with possible discontinuities at layer boundaries. We use conformal maps to model relative depth and derive a dynamic programming algorithm to infer layer boundaries and gene expression functions. Belayer accurately identifies tissue layers and biologically meaningful spatially varying genes in SRT data from the brain and skin.
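As an illustration of the idea of fitting a piecewise linear expression function with dynamic programming, the following is a minimal sketch of segmented least squares in one dimension. It is not the Belayer implementation: it assumes a scalar depth per spot is already available (the abstract derives relative depth from conformal maps), handles a single gene, and all function and parameter names (`segment_cost`, `fit_layers`, `n_layers`, `min_size`) are hypothetical.

```python
import numpy as np


def segment_cost(x, y):
    """Sum of squared residuals of the least-squares line through (x, y)."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def fit_layers(depth, expr, n_layers, min_size=2):
    """Split spots (sorted by depth) into n_layers contiguous segments,
    each with its own line, minimising the total squared error.

    Returns the interior boundary indices (into the depth-sorted spots)
    and the total error. Discontinuities at boundaries are allowed
    because each segment gets an independent slope and intercept.
    """
    order = np.argsort(depth)
    x = np.asarray(depth, dtype=float)[order]
    y = np.asarray(expr, dtype=float)[order]
    n = len(x)

    # cost[i, j]: error of a single line fitted to spots i..j-1
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + min_size, n + 1):
            cost[i, j] = segment_cost(x[i:j], y[i:j])

    # best[k, j]: minimal error covering the first j spots with k segments
    best = np.full((n_layers + 1, n + 1), np.inf)
    back = np.zeros((n_layers + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_layers + 1):
        for j in range(1, n + 1):
            cand = best[k - 1, :j] + cost[:j, j]
            i = int(np.argmin(cand))
            best[k, j], back[k, j] = cand[i], i

    # Trace back the start index of every segment, last segment first.
    starts, j = [], n
    for k in range(n_layers, 0, -1):
        j = int(back[k, j])
        starts.append(j)
    return sorted(starts)[1:], float(best[n_layers, n])


# Toy example: two layers with different slopes and a jump at depth 0.5.
depth = np.linspace(0.0, 1.0, 20)
expr = np.where(depth < 0.5, 1.0 + 2.0 * depth, 5.0 - depth)
boundaries, error = fit_layers(depth, expr, n_layers=2)
print(boundaries, error)
```

The table of segment costs makes this sketch quadratic in the number of spots, which is acceptable for a toy example but would need a more careful formulation on real SRT data.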
Affiliation(s)
- Cong Ma
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
- Uthsav Chitra
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
- Shirley Zhang
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, 35 Olden St, Princeton, NJ 08540, USA
157
Zheng Y, Chen Y, Ding X, Wong KH, Cheung E. Aquila: a spatial omics database and analysis platform. Nucleic Acids Res 2022; 51:D827-D834. [PMID: 36243967 PMCID: PMC9825501 DOI: 10.1093/nar/gkac874] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/12/2022] [Revised: 09/06/2022] [Accepted: 10/07/2022] [Indexed: 01/30/2023] Open
Abstract
Spatial omics is a rapidly evolving approach for exploring tissue microenvironment and cellular networks by integrating spatial knowledge with transcript or protein expression information. However, there is a lack of databases for users to access and analyze spatial omics data. To address this limitation, we developed Aquila, a comprehensive platform for managing and analyzing spatial omics data. Aquila contains 107 datasets from 30 diseases, including 6500+ regions of interest, and 15.7 million cells. The database covers studies from spatial transcriptome and proteome analyses, 2D and 3D experiments, and different technologies. Aquila provides visualization of spatial omics data in multiple formats such as spatial cell distribution, spatial expression and co-localization of markers. Aquila also lets users perform many basic and advanced spatial analyses on any dataset. In addition, users can submit their own spatial omics data for visualization and analysis in a safe and secure environment. Finally, Aquila can be installed as an individual app on a desktop and offers the RESTful API service for power users to access the database. Overall, Aquila provides a detailed insight into transcript and protein expression in tissues from a spatial perspective. Aquila is available at https://aquila.cheunglab.org.
Affiliation(s)
- Yimin Zheng
  - Cancer Centre, University of Macau, Taipa 999078, Macau SAR
  - Centre for Precision Medicine Research and Training, University of Macau, Taipa 999078, Macau SAR
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Taipa 999078, Macau SAR
  - Faculty of Health Sciences, University of Macau, Taipa 999078, Macau SAR
- Yitian Chen
  - Cancer Centre, University of Macau, Taipa 999078, Macau SAR
  - Centre for Precision Medicine Research and Training, University of Macau, Taipa 999078, Macau SAR
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Taipa 999078, Macau SAR
  - Faculty of Health Sciences, University of Macau, Taipa 999078, Macau SAR
- Xianting Ding
  - Institute for Personalized Medicine, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Koon Ho Wong
  - Cancer Centre, University of Macau, Taipa 999078, Macau SAR
  - Centre for Precision Medicine Research and Training, University of Macau, Taipa 999078, Macau SAR
  - MoE Frontiers Science Center for Precision Oncology, University of Macau, Taipa 999078, Macau SAR
  - Faculty of Health Sciences, University of Macau, Taipa 999078, Macau SAR
- Edwin Cheung
  - To whom correspondence should be addressed. Tel: +853 8822 4992; Fax: +853 8822 2933
158
Wesley BT, Ross ADB, Muraro D, Miao Z, Saxton S, Tomaz RA, Morell CM, Ridley K, Zacharis ED, Petrus-Reurer S, Kraiczy J, Mahbubani KT, Brown S, Garcia-Bernardo J, Alsinet C, Gaffney D, Horsfall D, Tysoe OC, Botting RA, Stephenson E, Popescu DM, MacParland S, Bader G, McGilvray ID, Ortmann D, Sampaziotis F, Saeb-Parsy K, Haniffa M, Stevens KR, Zilbauer M, Teichmann SA, Vallier L. Single-cell atlas of human liver development reveals pathways directing hepatic cell fates. Nat Cell Biol 2022; 24:1487-1498. [PMID: 36109670 DOI: 10.1038/s41556-022-00989-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 10/13/2020] [Accepted: 07/29/2022] [Indexed: 12/14/2022]
Abstract
The liver has been studied extensively due to the broad number of diseases affecting its vital functions. However, therapeutic advances have been hampered by the lack of knowledge concerning human hepatic development. Here, we addressed this limitation by describing the developmental trajectories of different cell types that make up the human liver at single-cell resolution. These transcriptomic analyses revealed that sequential cell-to-cell interactions direct functional maturation of hepatocytes, with non-parenchymal cells playing essential roles during organogenesis. We utilized this information to derive bipotential hepatoblast organoids and then exploited this model system to validate the importance of signalling pathways in hepatocyte and cholangiocyte specification. Further insights into hepatic maturation also enabled the identification of stage-specific transcription factors to improve the functionality of hepatocyte-like cells generated from human pluripotent stem cells. Thus, our study establishes a platform to investigate the basic mechanisms directing human liver development and to produce cell types for clinical applications.
Affiliation(s)
- Brandon T Wesley
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Alexander D B Ross
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
  - Department of Paediatrics, University of Cambridge, Cambridge, UK
- Daniele Muraro
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
  - Wellcome Sanger Institute, Hinxton, UK
- Zhichao Miao
  - Wellcome Sanger Institute, Hinxton, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK
- Sarah Saxton
  - Departments of Bioengineering and Pathology, University of Washington, Seattle, WA, USA
  - Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Rute A Tomaz
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Carola M Morell
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Katherine Ridley
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Paediatrics, University of Cambridge, Cambridge, UK
- Ekaterini D Zacharis
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Sandra Petrus-Reurer
  - Department of Surgery, University of Cambridge, Cambridge, UK
  - NIHR Cambridge Biomedical Research Centre, Cambridge, UK
- Judith Kraiczy
  - Department of Paediatrics, University of Cambridge, Cambridge, UK
- Stephanie Brown
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Dave Horsfall
  - Digital Institute, Newcastle University, Newcastle upon Tyne, UK
- Olivia C Tysoe
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Rachel A Botting
  - Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Emily Stephenson
  - Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Gary Bader
  - University of Toronto, Toronto, Ontario, Canada
- Ian D McGilvray
  - Multi-Organ Transplant Program, Toronto General Hospital Research Institute, Toronto, Ontario, Canada
- Daniel Ortmann
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Fotios Sampaziotis
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
- Kourosh Saeb-Parsy
  - Department of Surgery, University of Cambridge, Cambridge, UK
  - NIHR Cambridge Biomedical Research Centre, Cambridge, UK
- Muzlifah Haniffa
  - Wellcome Sanger Institute, Hinxton, UK
  - Biosciences Institute, Newcastle University, Newcastle upon Tyne, UK
  - Department of Dermatology and NIHR Newcastle Biomedical Research Centre, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- Kelly R Stevens
  - Departments of Bioengineering and Pathology, University of Washington, Seattle, WA, USA
  - Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Matthias Zilbauer
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Paediatrics, University of Cambridge, Cambridge, UK
- Sarah A Teichmann
  - Wellcome Sanger Institute, Hinxton, UK
  - Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK
- Ludovic Vallier
  - Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Department of Surgery, University of Cambridge, Cambridge, UK
159
Lin X, Gao L, Whitener N, Ahmed A, Wei Z. A model-based constrained deep learning clustering approach for spatially resolved single-cell data. Genome Res 2022; 32:1906-1917. [PMID: 36198490 PMCID: PMC9712636 DOI: 10.1101/gr.276477.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Accepted: 09/28/2022] [Indexed: 11/25/2022]
Abstract
Spatially resolved scRNA-seq (sp-scRNA-seq) technologies provide the potential to comprehensively profile gene expression patterns in tissue context. However, the development of computational methods lags behind the advances in these technologies, which limits the fulfillment of their potential. In this study, we develop a deep learning approach for clustering sp-scRNA-seq data, named Deep Spatially constrained Single-cell Clustering (DSSC). In this model, we integrate the spatial information of cells into the clustering process in two steps: (1) the spatial information is encoded by using a graphical neural network model, and (2) cell-to-cell constraints are built based on the spatial expression pattern of the marker genes and added in the model to guide the clustering process. Then, a deep embedding clustering is performed on the bottleneck layer of autoencoder by Kullback-Leibler (KL) divergence along with the learning of feature representation. DSSC is the first model that can use information from both spatial coordinates and marker genes to guide cell/spot clustering. Extensive experiments on both simulated and real data sets show that DSSC boosts clustering performance significantly compared with the state-of-the-art methods. It has robust performance across different data sets with various cell type/tissue organization and/or cell type/tissue spatial dependency. We conclude that DSSC is a promising tool for clustering sp-scRNA-seq data.
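The KL-divergence clustering step mentioned above can be illustrated with the objective commonly used in deep embedding clustering. This is a minimal NumPy sketch under the assumption that DSSC follows that standard formulation (Student's t soft assignments and a sharpened target distribution); the abstract does not confirm the exact kernel, and the graph encoder and marker-gene constraints are omitted. All function names here are hypothetical.

```python
import numpy as np


def soft_assign(z, centers, alpha=1.0):
    """Soft cluster assignments q_ij from a Student's t kernel on the
    squared distance between embedding z_i and cluster center mu_j."""
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpened targets p_ij: square q and normalise by cluster frequency,
    which emphasises confident assignments."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)


def kl_clustering_loss(q):
    """KL(P || Q) summed over cells; minimised jointly with the autoencoder."""
    p = target_distribution(q)
    return float(np.sum(p * np.log(p / q)))


# Toy bottleneck embeddings for four cells and two cluster centers.
z = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
q = soft_assign(z, centers)
print(q.argmax(axis=1), kl_clustering_loss(q))
```

In a full model the loss would be differentiated with respect to both the encoder weights and the cluster centers; the sketch only evaluates it for fixed embeddings.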
Affiliation(s)
- Xiang Lin
  - Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey 07102, USA
- Le Gao
  - Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey 07102, USA
- Nathan Whitener
  - Department of Computer Science, Wake Forest University, Winston-Salem, North Carolina 27109, USA
- Ashley Ahmed
  - Department of Chemistry and Chemical Biology and Biological Sciences, College of Arts and Sciences, Cornell University, Ithaca, New York 14853, USA
- Zhi Wei
  - Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey 07102, USA
160
Fang S, Chen B, Zhang Y, Sun H, Liu L, Liu S, Li Y, Xu X. Computational Approaches and Challenges in Spatial Transcriptomics. Genomics, Proteomics & Bioinformatics 2022:S1672-0229(22)00129-2. [PMID: 36252814 PMCID: PMC10372921 DOI: 10.1016/j.gpb.2022.10.001] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 02/06/2022] [Revised: 09/08/2022] [Accepted: 10/09/2022] [Indexed: 01/19/2023]
Abstract
The development of spatial transcriptomics (ST) technologies has transformed genetic research from a single-cell data level to a two-dimensional spatial coordinate system and facilitated the study of the composition and function of various cell subsets in different environments and organs. The large-scale data generated by these ST technologies, which contain spatial gene expression information, have elicited the need for spatially resolved approaches to meet the requirements of computational and biological data interpretation. These requirements include dealing with the explosive growth of data to determine the cell-level and gene-level expression, correcting the inner batch effect and loss of expression to improve the data quality, conducting efficient interpretation and in-depth knowledge mining both at the single-cell and tissue-wide levels, and conducting multi-omics integration analysis to provide an extensible framework toward the in-depth understanding of biological processes. However, algorithms designed specifically for ST technologies to meet these requirements are still in their infancy. Here, we review computational approaches to these problems in light of corresponding issues and challenges, and present forward-looking insights into algorithm development.
161
Zhang K, Feng W, Wang P. Identification of spatially variable genes with graph cuts. Nat Commun 2022; 13:5488. [PMID: 36123336 PMCID: PMC9485129 DOI: 10.1038/s41467-022-33182-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/15/2020] [Accepted: 09/07/2022] [Indexed: 11/08/2022] Open
Abstract
Single-cell gene expression data with positional information is critical for dissecting the mechanisms and architectures of multicellular organisms, but its potential is limited by the scalability of current data analysis strategies. Here, we present scGCO, a method based on fast optimization of hidden Markov random fields with graph cuts to identify spatially variable genes. Compared with existing methods, scGCO delivers superior performance, with a lower false-positive rate and improved specificity, and is more robust in the presence of noise. Critically, scGCO scales near-linearly with input size and achieves orders-of-magnitude better running time and memory usage than existing methods, and could represent a valuable solution as spatial transcriptomics data grow to millions of data points and beyond.
Collapse
Affiliation(s)
- Ke Zhang
- National Genomics Data Center, CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Wanwan Feng
- National Genomics Data Center, CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Peng Wang
- National Genomics Data Center, CAS Key Laboratory of Computational Biology, Bio-Med Big Data Center, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Faculty of Health Science, University of Macau, Macau, China
- Ministry of Education Frontiers Science Center for Precision Oncology, University of Macau, Macau, China
162
Cable DM, Murray E, Shanmugam V, Zhang S, Zou LS, Diao M, Chen H, Macosko EZ, Irizarry RA, Chen F. Cell type-specific inference of differential expression in spatial transcriptomics. Nat Methods 2022; 19:1076-1087. [PMID: 36050488 PMCID: PMC10463137 DOI: 10.1038/s41592-022-01575-3] [Citation(s) in RCA: 43] [Impact Index Per Article: 21.5] [Received: 01/18/2022] [Accepted: 07/15/2022] [Indexed: 12/13/2022]
Abstract
A central problem in spatial transcriptomics is detecting differentially expressed (DE) genes within cell types across tissue context. Challenges to learning DE include changing cell type composition across space and measurement pixels detecting transcripts from multiple cell types. Here, we introduce a statistical method, cell type-specific inference of differential expression (C-SIDE), that identifies cell type-specific DE in spatial transcriptomics, accounting for localization of other cell types. We model gene expression as an additive mixture across cell types of log-linear cell type-specific expression functions. C-SIDE's framework applies to many contexts: DE due to pathology, anatomical regions, cell-to-cell interactions and cellular microenvironment. Furthermore, C-SIDE enables statistical inference across multiple replicates. Simulations and validation experiments on Slide-seq, MERFISH and Visium datasets demonstrate that C-SIDE accurately identifies DE with valid uncertainty quantification. Last, we apply C-SIDE to identify plaque-dependent immune activity in Alzheimer's disease and cellular interactions between tumor and immune cells. We distribute C-SIDE within the R package spacexr (https://github.com/dmcable/spacexr).
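The additive log-linear mixture mentioned in this abstract can be illustrated with a minimal Python sketch. The function name, the argument names and the use of a single scalar covariate per pixel are assumptions made for illustration only; this is not the spacexr implementation, and it covers only the expected-count model, not the fitting or the statistical inference.

```python
import numpy as np

def expected_counts(proportions, covariate, intercepts, slopes, total_counts):
    """Expected counts of one gene per pixel under an additive log-linear mixture.

    proportions  : (pixels, cell_types) cell type proportions per pixel
    covariate    : (pixels,) explanatory variable, e.g. distance to a plaque (hypothetical)
    intercepts   : (cell_types,) baseline log expression per cell type
    slopes       : (cell_types,) cell type-specific DE coefficient
    total_counts : (pixels,) total transcript count per pixel
    """
    # Log-linear expression function for every (pixel, cell type) pair.
    log_expr = intercepts[None, :] + np.outer(covariate, slopes)
    # Additive mixture across cell types, scaled by sequencing depth.
    return total_counts * np.sum(proportions * np.exp(log_expr), axis=1)
```

In this sketch a nonzero entry of `slopes` plays the role of cell type-specific DE: it changes the contribution of one cell type with the covariate while the other cell types in the same pixel are unaffected.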
Affiliation(s)
- Dylan M Cable
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Evan Murray
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Vignesh Shanmugam
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Simon Zhang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Luli S Zou
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biostatistics, Harvard University, Boston, MA, USA
| | - Michael Diao
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Haiqi Chen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Evan Z Macosko
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
| | - Rafael A Irizarry
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biostatistics, Harvard University, Boston, MA, USA
| | - Fei Chen
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, USA
163
Kleino I, Frolovaitė P, Suomi T, Elo LL. Computational solutions for spatial transcriptomics. Comput Struct Biotechnol J 2022; 20:4870-4884. [PMID: 36147664 PMCID: PMC9464853 DOI: 10.1016/j.csbj.2022.08.043] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 04/22/2022] [Revised: 08/18/2022] [Accepted: 08/18/2022] [Indexed: 11/18/2022] Open
Abstract
Transcriptome-level expression data connected to the spatial organization of cells and molecules would allow a comprehensive understanding of how gene expression is connected to structure and function in biological systems. Spatial transcriptomics platforms may soon provide such information. However, the current platforms still lack spatial resolution, capture only a fraction of the transcriptome heterogeneity, or lack the throughput for large-scale studies. The strengths and weaknesses of current ST platforms and computational solutions need to be taken into account when planning spatial transcriptomics studies. Computational ST analysis builds on the solutions developed for single-cell RNA-sequencing data, with advancements that take into account the spatial connectedness of the transcriptomes. scRNA-seq tools are modified for spatial transcriptomics, or new solutions such as deep learning-based joint analysis of expression, spatial, and image data are developed to extract biological information from the spatially resolved transcriptomes. Computational ST analysis can reveal remarkable biological insights into spatial patterns of gene expression, cell signaling, and cell type variations in connection with cell type-specific signaling and organization in complex tissues. This review covers topics that help in choosing the platform and computational solutions for spatial transcriptomics research. We focus on the currently available ST methods and platforms and their strengths and limitations. Of the computational solutions, we provide an overview of the analysis steps and tools used in ST data analysis. The compatibility with the data types and the tools provided by the current ST analysis frameworks are summarized.
Key Words
- AOI, area of illumination
- BICCN, Brain Initiative Cell Census Network
- BOLORAMIS, barcoded oligonucleotides ligated on RNA amplified for multiplexed and parallel in situ analyses
- Baysor, Bayesian Segmentation of Spatial Transcriptomics Data
- BinSpect, Binary Spatial Extraction
- CCC, cell–cell communication
- CCI, cell–cell interactions
- CNV, copy-number variation
- Computational biology
- DSP, digital spatial profiling
- DbiT-Seq, Deterministic Barcoding in Tissue for spatial omics sequencing
- FA, factor analysis
- FFPE, formalin-fixed, paraffin-embedded
- FISH, fluorescence in situ hybridization
- FISSEQ, fluorescence in situ sequencing of RNA
- FOV, Field of view
- GRNs, gene regulation networks
- GSEA, gene set enrichment analysis
- GSVA, gene set variation analysis
- HDST, high definition spatial transcriptomics
- HMRF, hidden Markov random field
- ICG, interaction changed genes
- ISH, in situ hybridization
- ISS, in situ sequencing
- JSTA, Joint cell segmentation and cell type annotation
- KNN, k-nearest neighbor
- LCM, Laser Capture Microdissection
- LCM-seq, laser capture microdissection coupled with RNA sequencing
- LOH, loss of heterozygosity analysis
- MC, Molecular Cartography
- MERFISH, multiplexed error-robust FISH
- NMF (NNMF), Non-negative matrix factorization
- PCA, Principal Component Analysis
- PIXEL-seq, Polony (or DNA cluster)-indexed library-sequencing
- PL-lig, padlock ligation
- QC, quality control
- RNAseq, RNA sequencing
- ROI, region of interest
- SCENIC, Single-Cell rEgulatory Network Inference and Clustering
- SME, Spatial Morphological gene Expression normalization
- SPATA, SPAtial Transcriptomic Analysis
- ST Pipeline, Spatial Transcriptomics Pipeline
- ST, Spatial transcriptomics
- STARmap, spatially-resolved transcript amplicon readout mapping
- Single-cell analysis
- Spatial data analysis frameworks
- Spatial deconvolution
- Spatial transcriptomics
- TIVA, Transcriptome in Vivo Analysis
- TMA, tissue microarray
- TME, tumor micro environment
- UMAP, Uniform Manifold Approximation and Projection for Dimension Reduction
- UMI, unique molecular identifier
- ZipSeq, zipcoded sequencing
- scRNA-seq, single-cell RNA sequencing
- scvi-tools, single-cell variational inference tools
- seqFISH, sequential fluorescence in situ hybridization
- sequ-smFISH, sequential single-molecule fluorescent in situ hybridization
- smFISH, single molecule FISH
- t-SNE, t-distributed stochastic neighbor embedding
Affiliation(s)
- Iivari Kleino
- Turku Bioscience Centre, University of Turku and Åbo Akademi University Turku, Turku, Finland
| | - Paulina Frolovaitė
- Turku Bioscience Centre, University of Turku and Åbo Akademi University Turku, Turku, Finland
| | - Tomi Suomi
- Turku Bioscience Centre, University of Turku and Åbo Akademi University Turku, Turku, Finland
| | - Laura L. Elo
- Turku Bioscience Centre, University of Turku and Åbo Akademi University Turku, Turku, Finland
- Institute of Biomedicine, University of Turku, Turku, Finland
164
Bergmann S, Penfold CA, Slatery E, Siriwardena D, Drummer C, Clark S, Strawbridge SE, Kishimoto K, Vickers A, Tewary M, Kohler TN, Hollfelder F, Reik W, Sasaki E, Behr R, Boroviak TE. Spatial profiling of early primate gastrulation in utero. Nature 2022; 609:136-143. [PMID: 35709828 PMCID: PMC7614364 DOI: 10.1038/s41586-022-04953-1] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 07/16/2020] [Accepted: 06/08/2022] [Indexed: 11/09/2022]
Abstract
Gastrulation controls the emergence of cellular diversity and axis patterning in the early embryo. In mammals, this transformation is orchestrated by dynamic signalling centres at the interface of embryonic and extraembryonic tissues1-3. Elucidating the molecular framework of axis formation in vivo is fundamental for our understanding of human development4-6 and to advance stem-cell-based regenerative approaches7. Here we illuminate early gastrulation of marmoset embryos in utero using spatial transcriptomics and stem-cell-based embryo models. Gaussian process regression-based 3D transcriptomes delineate the emergence of the anterior visceral endoderm, which is hallmarked by conserved (HHEX, LEFTY2, LHX1) and primate-specific (POSTN, SDC4, FZD5) factors. WNT signalling spatially coordinates the formation of the primitive streak in the embryonic disc and is counteracted by SFRP1 and SFRP2 to sustain pluripotency in the anterior domain. Amnion specification occurs at the boundaries of the embryonic disc through ID1, ID2 and ID3 in response to BMP signalling, providing a developmental rationale for amnion differentiation of primate pluripotent stem cells (PSCs). Spatial identity mapping demonstrates that primed marmoset PSCs exhibit the highest similarity to the anterior embryonic disc, whereas naive PSCs resemble the preimplantation epiblast. Our 3D transcriptome models reveal the molecular code of lineage specification in the primate embryo and provide an in vivo reference to decipher human development.
Affiliation(s)
- Sophie Bergmann
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Jeffrey Cheah Biomedical Centre, Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK
| | - Christopher A Penfold
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Jeffrey Cheah Biomedical Centre, Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK
- Wellcome Trust-Cancer Research UK Gurdon Institute, University of Cambridge, Cambridge, UK
| | - Erin Slatery
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Jeffrey Cheah Biomedical Centre, Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK
| | - Dylan Siriwardena
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Jeffrey Cheah Biomedical Centre, Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK
| | - Charis Drummer
- Research Platform Degenerative Diseases, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- DZHK (German Center for Cardiovascular Research), partner site Göttingen, Göttingen, Germany
| | - Stephen Clark
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Epigenetics Programme, Babraham Institute, Cambridge, UK
| | - Stanley E Strawbridge
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Jeffrey Cheah Biomedical Centre, Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK
| | - Keiko Kishimoto
- Department of Marmoset Biology and Medicine, Central Institute for Experimental Animals, Kawasaki, Japan
| | - Alice Vickers
- Centre for Stem Cells and Regenerative Medicine, King's College London, Guy's Hospital, London, UK
| | - Mukul Tewary
- Centre for Stem Cells and Regenerative Medicine, King's College London, Guy's Hospital, London, UK
| | - Timo N Kohler
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | | | - Wolf Reik
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Epigenetics Programme, Babraham Institute, Cambridge, UK
| | - Erika Sasaki
- Department of Marmoset Biology and Medicine, Central Institute for Experimental Animals, Kawasaki, Japan
| | - Rüdiger Behr
- Research Platform Degenerative Diseases, German Primate Center, Leibniz Institute for Primate Research, Göttingen, Germany
- DZHK (German Center for Cardiovascular Research), partner site Göttingen, Göttingen, Germany
| | - Thorsten E Boroviak
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
- Centre for Trophoblast Research, University of Cambridge, Cambridge, UK
- Jeffrey Cheah Biomedical Centre, Wellcome Trust-Medical Research Council Stem Cell Institute, University of Cambridge, Cambridge, UK
165
Liu Q, Hsu CY, Shyr Y. Scalable and model-free detection of spatial patterns and colocalization. Genome Res 2022; 32:1736-1745. [PMID: 36223499 PMCID: PMC9528978 DOI: 10.1101/gr.276851.122] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/20/2022] [Accepted: 08/16/2022] [Indexed: 11/24/2022]
Abstract
The expeditious growth in spatial omics technologies enables the profiling of genome-wide molecular events at molecular and single-cell resolution, highlighting a need for fast and reliable methods to characterize spatial patterns. We developed SpaGene, a model-free method to discover spatial patterns rapidly in large-scale spatial omics studies. Analyzing simulation and a variety of spatially resolved transcriptomics data showed that SpaGene is more powerful and scalable than existing methods. Spatial expression patterns identified by SpaGene reconstruct unobserved tissue structures. SpaGene also successfully discovers ligand-receptor interactions through their colocalization.
Affiliation(s)
- Qi Liu
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
| | - Chih-Yuan Hsu
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
| | - Yu Shyr
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
- Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, Tennessee 37232, USA
166
Ma Y, Zhou X. Spatially informed cell-type deconvolution for spatial transcriptomics. Nat Biotechnol 2022; 40:1349-1359. [PMID: 35501392 PMCID: PMC9464662 DOI: 10.1038/s41587-022-01273-7] [Citation(s) in RCA: 121] [Impact Index Per Article: 60.5] [Received: 06/09/2021] [Accepted: 03/07/2022] [Indexed: 12/16/2022]
Abstract
Many spatially resolved transcriptomic technologies do not have single-cell resolution but measure the average gene expression for each spot from a mixture of cells of potentially heterogeneous cell types. Here, we introduce a deconvolution method, conditional autoregressive-based deconvolution (CARD), that combines cell-type-specific expression information from single-cell RNA sequencing (scRNA-seq) with correlation in cell-type composition across tissue locations. Modeling spatial correlation allows us to borrow the cell-type composition information across locations, improving accuracy of deconvolution even with a mismatched scRNA-seq reference. CARD can also impute cell-type compositions and gene expression levels at unmeasured tissue locations to enable the construction of a refined spatial tissue map with a resolution arbitrarily higher than that measured in the original study and can perform deconvolution without an scRNA-seq reference. Applications to four datasets, including a pancreatic cancer dataset, identified multiple cell types and molecular markers with distinct spatial localization that define the progression, heterogeneity and compartmentalization of pancreatic cancer.
Affiliation(s)
- Ying Ma
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, USA
167
Chang Y, He F, Wang J, Chen S, Li J, Liu J, Yu Y, Su L, Ma A, Allen C, Lin Y, Sun S, Liu B, Javier Otero J, Chung D, Fu H, Li Z, Xu D, Ma Q. Define and visualize pathological architectures of human tissues from spatially resolved transcriptomics using deep learning. Comput Struct Biotechnol J 2022; 20:4600-4617. [PMID: 36090815 PMCID: PMC9440291 DOI: 10.1016/j.csbj.2022.08.029] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/25/2022] [Revised: 08/11/2022] [Accepted: 08/12/2022] [Indexed: 11/29/2022] Open
Abstract
Spatially resolved transcriptomics provides a new way to define spatial contexts and understand the pathogenesis of complex human diseases. Although some computational frameworks can characterize spatial context via various clustering methods, the detailed spatial architectures and functional zonation often cannot be revealed and localized due to the limited capacities of associating spatial information. We present RESEPT, a deep-learning framework for characterizing and visualizing tissue architecture from spatially resolved transcriptomics. Given inputs such as gene expression or RNA velocity, RESEPT learns a three-dimensional embedding with a spatial retained graph neural network from spatial transcriptomics. The embedding is then visualized by mapping into color channels in an RGB image and segmented with a supervised convolutional neural network model. Based on a benchmark of 10x Genomics Visium spatial transcriptomics datasets on the human and mouse cortex, RESEPT infers and visualizes the tissue architecture accurately. It is noteworthy that, for the in-house AD samples, RESEPT can localize cortex layers and cell types based on pre-defined region- or cell-type-enriched genes and furthermore provide critical insights into the identification of amyloid-beta plaques in Alzheimer's disease. Interestingly, in a glioblastoma sample analysis, RESEPT distinguishes tumor-enriched, non-tumor, and regions of neuropil with infiltrating tumor cells in support of clinical and prognostic cancer applications.
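The visualization step described in this abstract, mapping a three-dimensional embedding into the color channels of an RGB image, can be sketched in Python as follows. The function name, the per-channel min-max scaling and the integer grid coordinates are assumptions made for illustration; the sketch does not reproduce RESEPT's graph neural network or its segmentation model.

```python
import numpy as np

def embedding_to_rgb(embedding, rows, cols):
    """Place a (spots, 3) embedding on a pixel grid as an 8-bit RGB image.

    embedding : (spots, 3) float array, one 3D embedding per spot
    rows, cols: (spots,) non-negative integer grid coordinates of each spot
    """
    lo, hi = embedding.min(axis=0), embedding.max(axis=0)
    # Min-max scale each embedding dimension to [0, 1]; guard against flat channels.
    scaled = (embedding - lo) / np.where(hi > lo, hi - lo, 1.0)
    # Pixels without a spot stay black.
    image = np.zeros((rows.max() + 1, cols.max() + 1, 3), dtype=np.uint8)
    image[rows, cols] = np.round(scaled * 255).astype(np.uint8)
    return image
```

Spots with similar embeddings receive similar colors, so spatially coherent tissue regions appear as contiguous color patches that an image segmentation model could then take as input.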
Affiliation(s)
- Yuzhou Chang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- The Pelotonia Institute for Immuno-oncology, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA
| | - Fei He
- School of Information Science and Technology, Northeast Normal University, Changchun, Jilin 130117, China
| | - Juexin Wang
- Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Shuo Chen
- Department of Neuroscience, The Ohio State University, Columbus, OH 43210, USA
| | - Jingyi Li
- School of Information Science and Technology, Northeast Normal University, Changchun, Jilin 130117, China
| | - Jixin Liu
- School of Mathematics, Shandong University, Jinan 250100, China
| | - Yang Yu
- School of Information Science and Technology, Northeast Normal University, Changchun, Jilin 130117, China
| | - Li Su
- Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Anjun Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- The Pelotonia Institute for Immuno-oncology, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA
| | - Carter Allen
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
| | - Yu Lin
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Shaoli Sun
- Department of Pathology, The Ohio State University, Columbus, OH 43210, USA
| | - Bingqiang Liu
- School of Mathematics, Shandong University, Jinan 250100, China
| | - José Javier Otero
- Departments of Neuroscience, Pathology, Neuropathology, The Ohio State University, Columbus, OH 43210, USA
| | - Dongjun Chung
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- The Pelotonia Institute for Immuno-oncology, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA
| | - Hongjun Fu
- Department of Neuroscience, The Ohio State University, Columbus, OH 43210, USA
| | - Zihai Li
- The Pelotonia Institute for Immuno-oncology, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA
| | - Dong Xu
- Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Qin Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- The Pelotonia Institute for Immuno-oncology, The Ohio State University Comprehensive Cancer Center, Columbus, OH 43210, USA
168
Cai Z, Lei J, Roeder K. Model-free prediction test with application to genomics data. Proc Natl Acad Sci U S A 2022; 119:e2205518119. [PMID: 35969737 PMCID: PMC9407618 DOI: 10.1073/pnas.2205518119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2022] [Accepted: 07/20/2022] [Indexed: 11/18/2022] Open
Abstract
Testing the significance of predictors in a regression model is one of the most important topics in statistics. This problem is especially difficult without any parametric assumptions on the data. This paper aims to test the null hypothesis that given confounding variables Z, X does not significantly contribute to the prediction of Y under the model-free setting, where X and Z are possibly high dimensional. We propose a general framework that first fits nonparametric machine learning regression algorithms on [Formula: see text] and [Formula: see text], then compares the prediction power of the two models. The proposed method allows us to leverage the strength of the most powerful regression algorithms developed in the modern machine learning community. The P value for the test can be easily obtained by permutation. In simulations, we find that the proposed method is more powerful compared to existing methods. The proposed method allows us to draw biologically meaningful conclusions from two gene expression data analyses without strong distributional assumptions: 1) testing the prediction power of sequencing RNA for the proteins in cellular indexing of transcriptomes and epitopes by sequencing data and 2) identification of spatially variable genes in spatially resolved transcriptomics data.
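The general idea in this abstract, comparing the held-out prediction error of a model fitted on Z alone with one fitted on Z and X, and calibrating the difference by permutation, can be sketched in Python as follows. This is a simplified illustration, not the authors' procedure: ordinary least squares stands in for the nonparametric machine learning regressors, a single train/test split is used, and permuting the rows of X is only a valid null when X is independent of Z. All function names are hypothetical.

```python
import numpy as np

def heldout_mse(features, y, train, test):
    """Fit least squares on the training rows and return the test-set MSE."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
    return float(np.mean((y[test] - design[test] @ coef) ** 2))

def prediction_gain(x, z, y, train, test):
    """Reduction in held-out MSE obtained by adding X to a model that uses Z."""
    reduced = heldout_mse(z, y, train, test)
    full = heldout_mse(np.column_stack([z, x]), y, train, test)
    return reduced - full

def permutation_pvalue(x, z, y, n_perm=200, seed=0):
    """Permutation p-value for 'X adds no predictive power beyond Z'."""
    rng = np.random.default_rng(seed)
    n = len(y)
    order = rng.permutation(n)
    train, test = order[: n // 2], order[n // 2:]
    observed = prediction_gain(x, z, y, train, test)
    # Null distribution: shuffle the rows of X so its link to Y is broken.
    null = [prediction_gain(x[rng.permutation(n)], z, y, train, test)
            for _ in range(n_perm)]
    return (1 + sum(g >= observed for g in null)) / (1 + n_perm)
```

Swapping `heldout_mse` for a flexible regressor would bring the sketch closer to the model-free setting the abstract describes, at the cost of an extra dependency.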
Affiliation(s)
- Zhanrui Cai
- Department of Statistics, Iowa State University, Ames, IA 50011
| | - Jing Lei
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213
| | - Kathryn Roeder
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213
169
Lin Y, Wang Y, Liang Y, Yu Y, Li J, Ma Q, He F, Xu D. Sampling and ranking spatial transcriptomics data embeddings to identify tissue architecture. Front Genet 2022; 13:912813. [PMID: 36035139 PMCID: PMC9411666 DOI: 10.3389/fgene.2022.912813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2022] [Accepted: 07/08/2022] [Indexed: 11/13/2022] Open
Abstract
Spatial transcriptomics is an emerging technology widely applied to the analyses of tissue architecture and corresponding biological functions. Numerous computational methods have been developed for analyzing spatial transcriptomics data. These methods generate embeddings from gene expression and spatial locations for spot clustering or tissue architecture segmentation. Although the hyperparameters used to produce an embedding can be tuned for a given training set, a fixed embedding has variable performance from case to case due to data distributions. Therefore, selecting an effective embedding for new data in advance would be useful. For this purpose, we developed an embedding evaluation method named message passing-Moran's I with maximum filtering (MP-MIM), which combines message passing-based embedding transformation with spatial autocorrelation analysis. We applied a graph convolution to aggregate spatial transcriptomics data and employed global Moran's I to measure spatial autocorrelation and select the most effective embedding to infer tissue architecture. Sixteen spatial transcriptomics samples generated from the human brain were used to validate our method. The results show that MP-MIM can accurately identify high-quality embeddings that produce a high correlation between the predicted tissue architecture and the ground truth. Overall, our study provides a novel method to select embeddings for new test data and enhance the usability of deep learning tools for spatial transcriptome analyses.
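The two ingredients named in this abstract, a message-passing aggregation over neighboring spots and global Moran's I as a measure of spatial autocorrelation, can be sketched in Python as follows. The k-nearest-neighbor graph, the mean aggregation and all function names are assumptions made for illustration; the maximum filtering step of MP-MIM is not reproduced here.

```python
import numpy as np

def knn_weights(coords, k=4):
    """Binary k-nearest-neighbor spatial weight matrix without self-loops."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)          # a spot is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    weights = np.zeros((len(coords), len(coords)))
    weights[np.arange(len(coords))[:, None], nearest] = 1.0
    return weights

def aggregate(values, weights):
    """One mean-aggregation message-passing step over each spot and its neighbors."""
    adjacency = weights + np.eye(len(weights))
    adjacency = adjacency / adjacency.sum(axis=1, keepdims=True)
    return adjacency @ values

def morans_i(values, weights):
    """Global Moran's I of a 1D signal for a given spatial weight matrix."""
    centered = values - values.mean()
    return len(values) / weights.sum() * (centered @ weights @ centered) / (centered @ centered)
```

Under these assumptions, candidate embeddings could be compared by aggregating each embedding dimension with `aggregate` and scoring it with `morans_i`; values near 1 indicate spatially smooth structure, while values near 0 indicate little spatial pattern.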
Affiliation(s)
- Yu Lin
- School of Artificial Intelligence, Jilin University, Changchun, China
- Department of Electrical Engineering and Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, United States
| | - Yan Wang
- School of Artificial Intelligence, Jilin University, Changchun, China
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
| | - Yanchun Liang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
- School of Computer Science, Zhuhai College of Science and Technology, Zhuhai, China
| | - Yang Yu
- School of Information Science and Technology, Northeast Normal University, Changchun, China
| | - Jingyi Li
- School of Information Science and Technology, Northeast Normal University, Changchun, China
| | - Qin Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, United States
| | - Fei He
- School of Information Science and Technology, Northeast Normal University, Changchun, China
| | - Dong Xu
- Department of Electrical Engineering and Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, United States
170
Zubair A, Chapple RH, Natarajan S, Wright WC, Pan M, Lee HM, Tillman H, Easton J, Geeleher P. Cell type identification in spatial transcriptomics data can be improved by leveraging cell-type-informative paired tissue images using a Bayesian probabilistic model. Nucleic Acids Res 2022; 50:e80. [PMID: 35536287 PMCID: PMC9371936 DOI: 10.1093/nar/gkac320] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/16/2021] [Revised: 04/13/2022] [Accepted: 04/21/2022] [Indexed: 11/12/2022] Open
Abstract
Spatial transcriptomics technologies have recently emerged as a powerful tool for measuring spatially resolved gene expression directly in tissue sections, revealing cell types and their dysfunction in unprecedented detail. However, spatial transcriptomics technologies are limited in their ability to separate transcriptionally similar cell types and can suffer further difficulties identifying cell types in slide regions where transcript capture is low. Here, we describe a conceptually novel methodology that can computationally integrate spatial transcriptomics data with cell-type-informative paired tissue images, obtained from, for example, the reverse side of the same tissue section, to improve inferences of tissue cell type composition in spatial transcriptomics data. The underlying statistical approach is generalizable to any spatial transcriptomics protocol where informative paired tissue images can be obtained. We demonstrate a use case leveraging cell-type-specific immunofluorescence markers obtained on mouse brain tissue sections and a use case leveraging the output of AI-annotated H&E tissue images, which we used to markedly improve the identification of clinically relevant immune cell infiltration in breast cancer tissue. Thus, combining spatial transcriptomics data with paired tissue images has the potential to improve the identification of cell types and hence to improve the applications of spatial transcriptomics that rely on accurate cell type identification.
Affiliation(s)
- Asif Zubair
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Richard H Chapple
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Sivaraman Natarajan
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - William C Wright
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Min Pan
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Hyeong-Min Lee
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Heather Tillman
- Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - John Easton
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| | - Paul Geeleher
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
| |
|
171
|
Li Z, Zhou X. BASS: multi-scale and multi-sample analysis enables accurate cell type clustering and spatial domain detection in spatial transcriptomic studies. Genome Biol 2022; 23:168. [PMID: 35927760 PMCID: PMC9351148 DOI: 10.1186/s13059-022-02734-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/31/2022] [Accepted: 07/21/2022] [Indexed: 02/08/2023] Open
Abstract
Spatial transcriptomic studies are reaching single-cell spatial resolution, with data often collected from multiple tissue sections. Here, we present a computational method, BASS, that enables multi-scale and multi-sample analysis for single-cell resolution spatial transcriptomics. BASS performs cell type clustering at the single-cell scale and spatial domain detection at the tissue regional scale, with the two tasks carried out simultaneously within a Bayesian hierarchical modeling framework. We illustrate the benefits of BASS through comprehensive simulations and applications to three datasets. The substantial power gain brought by BASS allows us to reveal accurate transcriptomic and cellular landscape in both cortex and hypothalamus.
Affiliation(s)
- Zheng Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
| |
|
172
|
Wang Y, Song B, Wang S, Chen M, Xie Y, Xiao G, Wang L, Wang T. Sprod for de-noising spatially resolved transcriptomics data based on position and image information. Nat Methods 2022; 19:950-958. [PMID: 35927477 DOI: 10.1038/s41592-022-01560-w] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 11/23/2021] [Accepted: 06/22/2022] [Indexed: 11/09/2022]
Abstract
Spatially resolved transcriptomics (SRT) provide gene expression close to, or even superior to, single-cell resolution while retaining the physical locations of sequencing and often also providing matched pathology images. However, SRT expression data suffer from high noise levels, due to the shallow coverage in each sequencing unit and the extra experimental steps required to preserve the locations of sequencing. Fortunately, such noise can be removed by leveraging information from the physical locations of sequencing, and the tissue organization reflected in corresponding pathology images. In this work, we developed Sprod, based on latent graph learning of matched location and imaging data, to impute accurate SRT gene expression. We validated Sprod comprehensively and demonstrated its advantages over previous methods for removing drop-outs in single-cell RNA-sequencing data. We showed that, after imputation by Sprod, differential expression analyses, pathway enrichment and cell-to-cell interaction inferences are more accurate. Overall, we envision de-noising by Sprod to become a key first step towards empowering SRT technologies for biomedical discoveries.
Affiliation(s)
- Yunguan Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Bing Song
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Shidan Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Mingyi Chen
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Yang Xie
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Guanghua Xiao
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Li Wang
- Department of Mathematics and Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX, USA
| | - Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Center for the Genetics of Host Defense, University of Texas Southwestern Medical Center, Dallas, TX, USA
| |
|
173
|
Cuomo ASE, Heinen T, Vagiaki D, Horta D, Marioni JC, Stegle O. CellRegMap: a statistical framework for mapping context-specific regulatory variants using scRNA-seq. Mol Syst Biol 2022; 18:e10663. [PMID: 35972065 PMCID: PMC9380406 DOI: 10.15252/msb.202110663] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/31/2021] [Revised: 06/28/2022] [Accepted: 07/01/2022] [Indexed: 11/11/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) enables characterizing the cellular heterogeneity in human tissues. Recent technological advances have enabled the first population-scale scRNA-seq studies in hundreds of individuals, making it possible to assay genetic effects at single-cell resolution. However, existing strategies to analyze these data remain based on principles established for the genetic analysis of bulk RNA-seq. In particular, current methods depend on a priori definitions of discrete cell types, and hence cannot assess allelic effects across subtle cell types and cell states. To address this, we propose the Cell Regulatory Map (CellRegMap), a statistical framework to test for and quantify genetic effects on gene expression in individual cells. CellRegMap provides a principled approach to identify and characterize genotype-context interactions of known eQTL variants using scRNA-seq data. This model-based approach resolves allelic effects across cellular contexts of different granularity, including genetic effects specific to cell subtypes and continuous cell transitions. We validate CellRegMap using simulated data and apply it to previously identified eQTL from two recent studies of differentiating iPSCs, where we uncover hundreds of eQTL displaying heterogeneity of genetic effects across cellular contexts. Finally, we identify fine-grained genetic regulation in neuronal subtypes for eQTL that are colocalized with human disease variants.
Affiliation(s)
- Anna S E Cuomo
- European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
- Present address: Garvan Institute of Medical Science, Sydney, NSW, Australia
| | - Tobias Heinen
- Division of Computational Genomics and Systems Genetics, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- European Molecular Biology Laboratory (EMBL), Genome Biology, Heidelberg, Germany
- Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
| | - Danai Vagiaki
- Division of Computational Genomics and Systems Genetics, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- European Molecular Biology Laboratory (EMBL), Genome Biology, Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
| | - Danilo Horta
- European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - John C Marioni
- European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
- Cancer Research UK Cambridge Institute, Cambridge, UK
| | - Oliver Stegle
- European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
- Wellcome Sanger Institute, Cambridge, UK
- Division of Computational Genomics and Systems Genetics, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- European Molecular Biology Laboratory (EMBL), Genome Biology, Heidelberg, Germany
| |
|
174
|
Li Y, Zhou X, Cao H. Statistical analysis of spatially resolved transcriptomic data by incorporating multiomics auxiliary information. Genetics 2022; 221:iyac095. [PMID: 35731210 PMCID: PMC9339334 DOI: 10.1093/genetics/iyac095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Accepted: 06/14/2022] [Indexed: 11/13/2022] Open
Abstract
Effective control of false discovery rate is key for multiplicity problems. Here, we consider incorporating informative covariates from external datasets in the multiple testing procedure to boost statistical power while maintaining false discovery rate control. In particular, we focus on the statistical analysis of innovative high-dimensional spatial transcriptomic data while incorporating external multiomics data that provide distinct but complementary information to the detection of spatial expression patterns. We extend OrderShapeEM, an efficient covariate-assisted multiple testing procedure that incorporates one auxiliary study, to make it permissible to incorporate multiple external omics studies, to boost statistical power of spatial expression pattern detection. Specifically, we first use a recently proposed computationally efficient statistical analysis method, spatial pattern recognition via kernels, to produce the primary test statistics for spatial transcriptomic data. Afterwards, we construct the auxiliary covariate by combining information from multiple external omics studies, such as bulk and single-cell RNA-seq data using the Cauchy combination rule. Finally, we extend and implement the integrative analysis method OrderShapeEM on the primary P-values along with auxiliary data incorporating multiomics information for efficient covariate-assisted spatial expression analysis. We conduct a series of realistic simulations to evaluate the performance of our method with known ground truth. Four case studies in mouse olfactory bulb, mouse cerebellum, human breast cancer, and human heart tissues further demonstrate the substantial power gain of our method in detecting genes with spatial expression patterns compared to existing classic approaches that do not utilize any external information.
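The Cauchy combination rule named in the abstract above can be illustrated with a short sketch. This is not the authors' implementation: the function name `cauchy_combine` and the default of equal weights are assumptions made for illustration, and the formula is the standard form of the rule (each p-value is mapped to a Cauchy variate, the variates are averaged with weights, and the tail of the standard Cauchy distribution gives the combined p-value).

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (illustrative sketch).

    p_values: p-values strictly between 0 and 1.
    weights:  optional non-negative weights; equal weights are assumed if omitted.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    if weights is None:
        weights = [1.0] * n
    total = sum(weights)
    # Weighted average of Cauchy-transformed p-values (weights normalised to sum to 1).
    t = sum((w / total) * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values))
    # Upper tail of the standard Cauchy distribution evaluated at t.
    return 0.5 - math.atan(t) / math.pi


# Example: one strong and one weak signal from two hypothetical external studies.
combined = cauchy_combine([0.01, 0.9])
```

With a single p-value the rule returns that p-value unchanged, and a small p-value from one source dominates the combined result, which is why the rule is convenient for merging evidence from several auxiliary studies into one covariate.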
Affiliation(s)
- Yan Li
- School of Mathematics, Jilin University, Changchun, Jilin 130012, China
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Hongyuan Cao
- School of Mathematics, Jilin University, Changchun, Jilin 130012, China
- Department of Statistics, Florida State University, Tallahassee, FL 32306, USA
| |
|
175
|
Buen Abad Najar CF, Burra P, Yosef N, Lareau LF. Identifying cell state-associated alternative splicing events and their coregulation. Genome Res 2022; 32:1385-1397. [PMID: 35858747 PMCID: PMC9341514 DOI: 10.1101/gr.276109.121] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/13/2021] [Accepted: 06/01/2022] [Indexed: 11/25/2022]
Abstract
Alternative splicing shapes the transcriptome and contributes to each cell's unique identity, but single-cell RNA sequencing (scRNA-seq) has struggled to capture the impact of alternative splicing. We previously showed that low recovery of mRNAs from single cells led to erroneous conclusions about the cell-to-cell variability of alternative splicing. Here, we present a method, Psix, to confidently identify splicing that changes across a landscape of single cells, using a probabilistic model that is robust against the data limitations of scRNA-seq. Its autocorrelation-inspired approach finds patterns of alternative splicing that correspond to patterns of cell identity, such as cell type or developmental stage, without the need for explicit cell clustering, labeling, or trajectory inference. Applying Psix to data that follow the trajectory of mouse brain development, we identify exons whose alternative splicing patterns cluster into modules of coregulation. We show that the exons in these modules are enriched for binding by distinct neuronal splicing factors and that their changes in splicing correspond to changes in expression of these splicing factors. Thus, Psix reveals cell type-dependent splicing patterns and the wiring of the splicing regulatory networks that control them. Our new method will enable scRNA-seq analysis to go beyond transcription to understand the roles of post-transcriptional regulation in determining cell identity.
Affiliation(s)
| | - Prakruthi Burra
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
| | - Nir Yosef
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Department of Electrical Engineering and Computer Science, University of California, Berkeley, California 94720, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, Massachusetts 02139, USA
- Chan Zuckerberg Biohub, San Francisco, California 94158, USA
| | - Liana F Lareau
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- Chan Zuckerberg Biohub, San Francisco, California 94158, USA
- Department of Bioengineering, University of California, Berkeley, California 94720, USA
| |
|
176
|
Jiang X, Xiao G, Li Q. A Bayesian modified Ising model for identifying spatially variable genes from spatial transcriptomics data. Stat Med 2022; 41:4647-4665. [DOI: 10.1002/sim.9530] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Revised: 05/13/2022] [Accepted: 07/01/2022] [Indexed: 12/13/2022]
Affiliation(s)
- Xi Jiang
- Department of Statistical Science, Southern Methodist University, Dallas, Texas, USA
- Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Guanghua Xiao
- Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, Texas, USA
| | - Qiwei Li
- Department of Mathematical Sciences, The University of Texas at Dallas, Richardson, Texas, USA
| |
|
177
|
Zeng Y, Wei Z, Yu W, Yin R, Yuan Y, Li B, Tang Z, Lu Y, Yang Y. Spatial transcriptomics prediction from histology jointly through Transformer and graph neural networks. Brief Bioinform 2022; 23:6645485. [PMID: 35849101 DOI: 10.1093/bib/bbac297] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 04/25/2022] [Revised: 06/12/2022] [Accepted: 06/29/2022] [Indexed: 12/16/2022] Open
Abstract
The rapid development of spatial transcriptomics allows the measurement of RNA abundance at a high spatial resolution, making it possible to simultaneously profile gene expression, spatial locations of cells or spots, and the corresponding hematoxylin and eosin-stained histology images. This makes it promising to predict gene expression from histology images, which are relatively easy and cheap to obtain. Several methods have been devised for this purpose, but they have not fully captured the internal relations of the 2D vision features or the spatial dependency between spots. Here, we developed Hist2ST, a deep learning-based model to predict RNA-seq expression from histology images. Around each sequenced spot, the corresponding histology image is cropped into an image patch and fed into a convolutional module to extract 2D vision features. Meanwhile, the spatial relations with the whole image and neighboring patches are captured through Transformer and graph neural network modules, respectively. These learned features are then used to predict the gene expression under a zero-inflated negative binomial distribution. To alleviate the impact of the small size of spatial transcriptomics datasets, a self-distillation mechanism is employed for efficient learning of the model. In comprehensive tests on cancer and normal datasets, Hist2ST was shown to outperform existing methods in terms of both gene expression prediction and spatial region identification. Further pathway analyses indicated that our model could preserve biological information. Thus, Hist2ST enables generating spatial transcriptomics data from histology images for elucidating molecular signatures of tissues.
Affiliation(s)
- Yuansong Zeng
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Zhuoyi Wei
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Weijiang Yu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Rui Yin
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Yuchen Yuan
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
| | - Bingling Li
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
| | - Zhonghui Tang
- Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China
| | - Yutong Lu
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Key Laboratory of Machine Intelligence and Advanced Computing (MOE), Guangzhou 510000, China
| |
|
178
|
Ren H, Walker BL, Cang Z, Nie Q. Identifying multicellular spatiotemporal organization of cells with SpaceFlow. Nat Commun 2022; 13:4076. [PMID: 35835774 PMCID: PMC9283532 DOI: 10.1038/s41467-022-31739-w] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 03/23/2022] [Accepted: 06/30/2022] [Indexed: 11/27/2022] Open
Abstract
One major challenge in analyzing spatial transcriptomic datasets is to simultaneously incorporate the cell transcriptome similarity and their spatial locations. Here, we introduce SpaceFlow, which generates spatially-consistent low-dimensional embeddings by incorporating both expression similarity and spatial information using spatially regularized deep graph networks. Based on the embedding, we introduce a pseudo-Spatiotemporal Map that integrates the pseudotime concept with spatial locations of the cells to unravel spatiotemporal patterns of cells. By comparing with multiple existing methods on several spatial transcriptomic datasets at both spot and single-cell resolutions, SpaceFlow is shown to produce a robust domain segmentation and identify biologically meaningful spatiotemporal patterns. Applications of SpaceFlow reveal evolving lineage in heart developmental data and tumor-immune interactions in human breast cancer data. Our study provides a flexible deep learning framework to incorporate spatiotemporal information in analyzing spatial transcriptomic data.
Affiliation(s)
- Honglei Ren
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, 92627, USA
| | - Benjamin L Walker
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, 92627, USA
- Department of Mathematics, University of California Irvine, Irvine, CA, 92627, USA
| | - Zixuan Cang
- Department of Mathematics, North Carolina State University, Raleigh, NC, 27695, USA
| | - Qing Nie
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, 92627, USA
- Department of Mathematics, University of California Irvine, Irvine, CA, 92627, USA
- Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, 92627, USA
| |
|
179
|
Yu J, Luo X. Identification of cell-type-specific spatially variable genes accounting for excess zeros. Bioinformatics 2022; 38:4135-4144. [PMID: 35792822 PMCID: PMC9438960 DOI: 10.1093/bioinformatics/btac457] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/29/2021] [Revised: 05/27/2022] [Accepted: 07/05/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION: Spatial transcriptomic techniques can profile gene expressions while retaining the spatial information, thus offering unprecedented opportunities to explore the relationship between gene expression and spatial locations. The spatial relationship may vary across cell types, but there is a lack of statistical methods to identify cell-type-specific spatially variable (SV) genes by simultaneously modeling excess zeros and cell-type proportions.
RESULTS: We develop a statistical approach CTSV to detect cell-type-specific SV genes. CTSV directly models spatial raw count data and considers zero-inflation as well as overdispersion using a zero-inflated negative binomial distribution. It then incorporates cell-type proportions and spatial effect functions in the zero-inflated negative binomial regression framework. The R package pscl is employed to fit the model. For robustness, a Cauchy combination rule is applied to integrate P-values from multiple choices of spatial effect functions. Simulation studies show that CTSV not only outperforms competing methods at the aggregated level but also achieves more power at the cell-type level. By analyzing pancreatic ductal adenocarcinoma spatial transcriptomic data, SV genes identified by CTSV reveal biological insights at the cell-type level.
AVAILABILITY AND IMPLEMENTATION: The R package of CTSV is available at https://bioconductor.org/packages/devel/bioc/html/CTSV.html.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
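As a rough illustration of the zero-inflated negative binomial distribution mentioned in the abstract above, the following sketch evaluates its log-probability for a single count. It is not taken from CTSV or from the R package pscl: the function name `zinb_log_pmf`, the mean/size parameterisation (`mu`, `theta`), and the structural-zero probability `pi0` are assumptions chosen for illustration.

```python
import math


def zinb_log_pmf(y, mu, theta, pi0):
    """Log-probability of count y under a zero-inflated negative binomial.

    mu:    mean of the negative binomial component (> 0)
    theta: size (inverse dispersion) of the negative binomial component (> 0)
    pi0:   probability of a structural zero (0 <= pi0 < 1)
    """
    # Negative binomial log-pmf in the mean/size parameterisation.
    nb_log = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )
    if y == 0:
        # A zero is either structural or drawn from the count component.
        return math.log(pi0 + (1.0 - pi0) * math.exp(nb_log))
    # Non-zero counts can only come from the count component.
    return math.log1p(-pi0) + nb_log


# Example: probability of observing zero when 30% of zeros are structural.
p_zero = math.exp(zinb_log_pmf(0, mu=3.0, theta=2.0, pi0=0.3))
```

In a regression setting such as the one the abstract describes, `mu` would depend on covariates (for example cell-type proportions and spatial effect functions) through a link function; that part is omitted here.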
Affiliation(s)
- Jinge Yu
- Institute of Statistics and Big Data, Renmin University of China, Beijing 100872, China
| | | |
|
180
|
Dai X, Cai L, He F. Single-cell sequencing: expansion, integration and translation. Brief Funct Genomics 2022; 21:280-295. [PMID: 35753690 DOI: 10.1093/bfgp/elac011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/05/2022] [Revised: 05/16/2022] [Accepted: 05/24/2022] [Indexed: 12/11/2022] Open
Abstract
With the rapid advancement in sequencing technologies, the concept of omics has revolutionized our understanding of cellular behaviors. Conventional omics approaches measure the averaged behavior of many cells, which can easily hide signals from a small cohort of cells and calls for techniques with enhanced resolution. Single-cell RNA sequencing, which investigates cell transcriptomics at the resolution of a single cell, has since its invention been rapidly expanded to other omics such as genomics, proteomics and metabolomics. The need for a comprehensive understanding of complex cellular behavior has led to the integration of multi-omics and single-cell sequencing data with other layers of information, such as spatial data and the CRISPR screening technique, to gain knowledge or enable innovative functionalities. The development of single-cell sequencing in both dimensions has rendered it a unique field that offers us a versatile toolbox to delineate complex diseases, including cancers.
|
181
|
Williams CG, Lee HJ, Asatsuma T, Vento-Tormo R, Haque A. An introduction to spatial transcriptomics for biomedical research. Genome Med 2022; 14:68. [PMID: 35761361 PMCID: PMC9238181 DOI: 10.1186/s13073-022-01075-1] [Citation(s) in RCA: 247] [Impact Index Per Article: 123.5] [Received: 10/25/2021] [Accepted: 06/19/2022] [Indexed: 01/04/2023] Open
Abstract
Single-cell transcriptomics (scRNA-seq) has become essential for biomedical research over the past decade, particularly in developmental biology, cancer, immunology, and neuroscience. Most commercially available scRNA-seq protocols require cells to be recovered intact and viable from tissue. This has precluded many cell types from study and largely destroys the spatial context that could otherwise inform analyses of cell identity and function. An increasing number of commercially available platforms now facilitate spatially resolved, high-dimensional assessment of gene transcription, known as 'spatial transcriptomics'. Here, we introduce different classes of method, which either record the locations of hybridized mRNA molecules in tissue, image the positions of cells themselves prior to assessment, or employ spatial arrays of mRNA probes of pre-determined location. We review sizes of tissue area that can be assessed, their spatial resolution, and the number and types of genes that can be profiled. We discuss if tissue preservation influences choice of platform, and provide guidance on whether specific platforms may be better suited to discovery screens or hypothesis testing. Finally, we introduce bioinformatic methods for analysing spatial transcriptomic data, including pre-processing, integration with existing scRNA-seq data, and inference of cell-cell interactions. Spatial -omics methods are already improving our understanding of human tissues in research, diagnostic, and therapeutic settings. To build upon these recent advancements, we provide entry-level guidance for those seeking to employ spatial transcriptomics in their own biomedical research.
Affiliation(s)
- Cameron G Williams
- Department of Microbiology and Immunology, University of Melbourne, located at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC, 3000, Australia
| | - Hyun Jae Lee
- Department of Microbiology and Immunology, University of Melbourne, located at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC, 3000, Australia
| | - Takahiro Asatsuma
- Department of Microbiology and Immunology, University of Melbourne, located at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC, 3000, Australia
| | - Roser Vento-Tormo
- Cellular Genetics Group, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Ashraful Haque
- Department of Microbiology and Immunology, University of Melbourne, located at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC, 3000, Australia
| |
|
182
|
Mahadevan AS, Long BL, Hu CW, Ryan DT, Grandel NE, Britton GL, Bustos M, Gonzalez Porras MA, Stojkova K, Ligeralde A, Son H, Shannonhouse J, Robinson JT, Warmflash A, Brey EM, Kim YS, Qutub AA. cytoNet: Spatiotemporal network analysis of cell communities. PLoS Comput Biol 2022; 18:e1009846. [PMID: 35696439 PMCID: PMC9191702 DOI: 10.1371/journal.pcbi.1009846] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/17/2021] [Accepted: 01/18/2022] [Indexed: 11/18/2022] Open
Abstract
We introduce cytoNet, a cloud-based tool to characterize cell populations from microscopy images. cytoNet quantifies spatial topology and functional relationships in cell communities using principles of network science. Capturing multicellular dynamics through graph features, cytoNet also evaluates the effect of cell-cell interactions on individual cell phenotypes. We demonstrate cytoNet’s capabilities in four case studies: 1) characterizing the temporal dynamics of neural progenitor cell communities during neural differentiation, 2) identifying communities of pain-sensing neurons in vivo, 3) capturing the effect of cell community on endothelial cell morphology, and 4) investigating the effect of laminin α4 on perivascular niches in adipose tissue. The analytical framework introduced here can be used to study the dynamics of complex cell communities in a quantitative manner, leading to a deeper understanding of environmental effects on cellular behavior. The versatile, cloud-based format of cytoNet makes the image analysis framework accessible to researchers across domains.
Affiliation(s)
- Arun S. Mahadevan
- Department of Bioengineering, University of Pennsylvania; Philadelphia, Pennsylvania, United States of America
- Department of Bioengineering, Rice University, Houston, Texas, United States of America
- Byron L. Long
  - Department of Bioengineering, Rice University, Houston, Texas, United States of America
  - Department of Biomedical Engineering, University of Texas at San Antonio, San Antonio, Texas, United States of America
  - Department of Computer Science, University of Texas at San Antonio, San Antonio, Texas, United States of America
- Chenyue W. Hu
  - Department of Bioengineering, Rice University, Houston, Texas, United States of America
- David T. Ryan
  - Department of Bioengineering, Rice University, Houston, Texas, United States of America
- Nicolas E. Grandel
  - Systems, Synthetic and Physical Biology Program, Rice University, Houston, Texas, United States of America
- George L. Britton
  - Systems, Synthetic and Physical Biology Program, Rice University, Houston, Texas, United States of America
- Marisol Bustos
  - Department of Biomedical Engineering, University of Texas at San Antonio, San Antonio, Texas, United States of America
- Maria A. Gonzalez Porras
  - Department of Biomedical Engineering, University of Texas at San Antonio, San Antonio, Texas, United States of America
- Katerina Stojkova
  - Department of Biomedical Engineering, University of Texas at San Antonio, San Antonio, Texas, United States of America
- Andrew Ligeralde
  - Biophysics Graduate Program, University of California, Berkeley, California, United States of America
- Hyeonwi Son
  - Department of Oral & Maxillofacial Surgery, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States of America
- John Shannonhouse
  - Department of Oral & Maxillofacial Surgery, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States of America
- Jacob T. Robinson
  - Department of Electrical and Computer Engineering, Rice University, Houston, Texas, United States of America
- Aryeh Warmflash
  - Systems, Synthetic and Physical Biology Program, Rice University, Houston, Texas, United States of America
  - Department of Biosciences, Rice University, Houston, Texas, United States of America
- Eric M. Brey
  - Department of Biomedical Engineering, University of Texas at San Antonio, San Antonio, Texas, United States of America
  - UTSA–UT Health Joint Graduate Group in Biomedical Engineering, San Antonio, Texas, United States of America
- Yu Shin Kim
  - Department of Oral & Maxillofacial Surgery, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States of America
  - UTSA–UT Health Joint Graduate Group in Biomedical Engineering, San Antonio, Texas, United States of America
  - Programs in Integrated Biomedical Sciences, Translational Sciences, Radiological Sciences, University of Texas Health Science Center at San Antonio, San Antonio, Texas, United States of America
- Amina A. Qutub
  - Department of Biomedical Engineering, University of Texas at San Antonio, San Antonio, Texas, United States of America
  - UTSA–UT Health Joint Graduate Group in Biomedical Engineering, San Antonio, Texas, United States of America
  - UTSA AI MATRIX Consortium, San Antonio, Texas, United States of America
|
183
|
Li Y, Stanojevic S, Garmire LX. Emerging artificial intelligence applications in Spatial Transcriptomics analysis. Comput Struct Biotechnol J 2022; 20:2895-2908. [PMID: 35765645 PMCID: PMC9201012 DOI: 10.1016/j.csbj.2022.05.056] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 03/17/2022] [Revised: 05/28/2022] [Accepted: 05/28/2022] [Indexed: 11/19/2022] Open
Abstract
Spatial transcriptomics (ST) has advanced significantly in the last few years. Such advancement comes with the urgent need for novel computational methods to handle the unique challenges of ST data analysis. Many artificial intelligence (AI) methods have been developed to utilize various machine learning and deep learning techniques for computational ST analysis. This review provides a comprehensive and up-to-date survey of current AI methods for ST analysis.
Affiliation(s)
- Yijun Li
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
- Stefan Stanojevic
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Lana X. Garmire
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
|
184
|
Ni Z, Prasad A, Chen S, Halberg RB, Arkin LM, Drolet BA, Newton MA, Kendziorski C. SpotClean adjusts for spot swapping in spatial transcriptomics data. Nat Commun 2022; 13:2971. [PMID: 35624112 PMCID: PMC9142522 DOI: 10.1038/s41467-022-30587-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/14/2021] [Accepted: 05/10/2022] [Indexed: 01/22/2023] Open
Abstract
Spatial transcriptomics is a powerful and widely used approach for profiling the gene expression landscape across a tissue with emerging applications in molecular medicine and tumor diagnostics. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind RNA. Ideally, unique molecular identifiers (UMIs) at a spot measure spot-specific expression, but this is often not the case in practice due to bleed from nearby spots, an artifact we refer to as spot swapping. To improve the power and precision of downstream analyses in spatial transcriptomics experiments, we propose SpotClean, a probabilistic model that adjusts for spot swapping to provide more accurate estimates of gene-specific UMI counts. SpotClean provides substantial improvements in marker gene analyses and in clustering, especially when tissue regions are not easily separated. As demonstrated in multiple studies of cancer, SpotClean improves tumor versus normal tissue delineation and improves tumor burden estimation thus increasing the potential for clinical and diagnostic applications of spatial transcriptomics technologies.
Grants
- R01 GM102756 NIGMS NIH HHS
- P30 CA014520 NCI NIH HHS
- P50 HD105353 NICHD NIH HHS
- UL1 TR002373 NCATS NIH HHS
- P50 CA278595 NCI NIH HHS
- NIH GM102756 (Z.N., C.K.), NIH UL1TR002373 (A.P., B.A.D.), 2020 UW-ICTR Translational Pilot Award (A.P., L.M.A., B.A.D.), NIH/NCI 1 R01 CA220004-01 (R.B.H.), 2020 Dermatology Foundation Pediatric Dermatology Career Development Award (L.M.A.), 2019 Sturge Weber Foundation Lisa's Research Award (L.M.A.), NSF 2023239-DMS (M.A.N.), NIH 1P01CA250972-01 (M.A.N.), NIH 1P50HD105353-01 (M.A.N.)
Affiliation(s)
- Zijian Ni
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
- Aman Prasad
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI, USA
- Shuyang Chen
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
- Richard B Halberg
  - Department of Medicine, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Oncology, University of Wisconsin-Madison, Madison, WI, USA
- Lisa M Arkin
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI, USA
- Beth A Drolet
  - Department of Dermatology, University of Wisconsin-Madison, Madison, WI, USA
- Michael A Newton
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Christina Kendziorski
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
|
185
|
Zheng B, Fang L. Spatially resolved transcriptomics provide a new method for cancer research. J Exp Clin Cancer Res 2022; 41:179. [PMID: 35590346 PMCID: PMC9118771 DOI: 10.1186/s13046-022-02385-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/26/2022] [Accepted: 05/06/2022] [Indexed: 12/22/2022]
Abstract
A major feature of cancer is heterogeneity, both intratumoral and intertumoral. Traditional single-cell techniques have given us a comprehensive understanding of the biological characteristics of individual tumor cells, but the lack of spatial context of the transcriptome has limited the study of cell-to-cell interaction patterns and hindered further exploration of tumor heterogeneity. In recent years, the advent of spatially resolved transcriptomics (SRT) technology has made possible the multidimensional analysis of the tumor microenvironment in the context of intact tissues. Different SRT methods are applicable to different working ranges because they rely on different working principles. In this paper, we review the advantages and disadvantages of various current SRT methods and the overall idea of applying these techniques to oncology studies, hoping to help researchers find breakthroughs. Finally, we discuss the future direction of SRT technology and the deeper investigation of the complex mechanisms of tumor development from different perspectives through multi-omics fusion, paving the way for precisely targeted tumor therapy.
Affiliation(s)
- Bowen Zheng
  - Department of Breast and Thyroid Surgery, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, 200072, People's Republic of China
- Lin Fang
  - Department of Breast and Thyroid Surgery, Shanghai Tenth People's Hospital, School of Medicine, Tongji University, Shanghai, 200072, People's Republic of China
|
186
|
Zhao P, Zhu J, Ma Y, Zhou X. Modeling zero inflation is not necessary for spatial transcriptomics. Genome Biol 2022; 23:118. [PMID: 35585605 PMCID: PMC9116027 DOI: 10.1186/s13059-022-02684-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 02/22/2022] [Accepted: 05/09/2022] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: Spatial transcriptomics are a set of new technologies that profile gene expression on tissues with spatial localization information. With technological advances, recent spatial transcriptomics data are often in the form of sparse counts with an excessive amount of zero values.

RESULTS: We perform a comprehensive analysis on 20 spatial transcriptomics datasets collected from 11 distinct technologies to characterize the distributional properties of the expression count data and understand the statistical nature of the zero values. Across datasets, we show that a substantial fraction of genes displays overdispersion and/or zero inflation that cannot be accounted for by a Poisson model, with genes displaying overdispersion substantially overlapped with genes displaying zero inflation. In addition, we find that either the Poisson or the negative binomial model is sufficient for modeling the majority of genes across most spatial transcriptomics technologies. We further show major sources of overdispersion and zero inflation in spatial transcriptomics including gene expression heterogeneity across tissue locations and spatial distribution of cell types. In particular, when we focus on a relatively homogeneous set of tissue locations or control for cell type compositions, the number of detected overdispersed and/or zero-inflated genes is substantially reduced, and a simple Poisson model is often sufficient to fit the gene expression data there.

CONCLUSIONS: Our study provides the first comprehensive evidence that excessive zeros in spatial transcriptomics are not due to zero inflation, supporting the use of count models without a zero inflation component for modeling spatial transcriptomics.
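The distinction drawn in this abstract between "many zeros" and "zero inflation" can be illustrated with a small sketch. It compares the observed fraction of zeros for one gene with the fraction expected under a Poisson model and under a negative binomial model with the same mean. This is not the authors' procedure: the function names, the method-of-moments dispersion estimate, and the toy counts are all assumptions made here for illustration.

```python
import math


def poisson_zero_prob(mean):
    """P(count == 0) under a Poisson model with the given mean."""
    return math.exp(-mean)


def nb_zero_prob(mean, dispersion):
    """P(count == 0) under a negative binomial model with
    variance = mean + dispersion * mean**2 (assumed parameterization)."""
    if dispersion <= 0:
        # With no overdispersion the negative binomial reduces to a Poisson.
        return poisson_zero_prob(mean)
    return (1.0 + dispersion * mean) ** (-1.0 / dispersion)


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate (illustrative, not a fitted MLE)."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        return 0.0
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return max(0.0, (var - mean) / mean ** 2)


def zero_summary(counts):
    """Observed zero fraction next to the Poisson and NB expectations."""
    n = len(counts)
    mean = sum(counts) / n
    phi = moment_dispersion(counts)
    return {
        "observed_zero_fraction": sum(1 for c in counts if c == 0) / n,
        "poisson_expected": poisson_zero_prob(mean),
        "nb_expected": nb_zero_prob(mean, phi),
    }


# Hypothetical counts for one gene across ten spots (not real data).
toy_counts = [0, 0, 0, 0, 0, 0, 1, 1, 2, 6]
summary = zero_summary(toy_counts)
print(summary)
```

In this toy case the observed zero fraction is well above the Poisson expectation but close to the negative binomial one, which is the kind of pattern the abstract describes: overdispersion alone can account for the zeros without a separate zero-inflation component.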
Affiliation(s)
- Peiyao Zhao
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Jiaqiang Zhu
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Ying Ma
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
|
187
|
Zeira R, Land M, Strzalkowski A, Raphael BJ. Alignment and integration of spatial transcriptomics data. Nat Methods 2022; 19:567-575. [PMID: 35577957 PMCID: PMC9334025 DOI: 10.1038/s41592-022-01459-6] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Received: 07/16/2021] [Accepted: 03/17/2022] [Indexed: 01/05/2023]
Abstract
Spatial transcriptomics (ST) measures mRNA expression across thousands of spots from a tissue slice while recording the two-dimensional (2D) coordinates of each spot. We introduce probabilistic alignment of ST experiments (PASTE), a method to align and integrate ST data from multiple adjacent tissue slices. PASTE computes pairwise alignments of slices using an optimal transport formulation that models both transcriptional similarity and physical distances between spots. PASTE further combines pairwise alignments to construct a stacked 3D alignment of a tissue. Alternatively, PASTE can integrate multiple ST slices into a single consensus slice. We show that PASTE accurately aligns spots across adjacent slices in both simulated and real ST data, demonstrating the advantages of using both transcriptional similarity and spatial information. We further show that the PASTE integrated slice improves the identification of cell types and differentially expressed genes compared with existing approaches that either analyze single ST slices or ignore spatial information.
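As a hedged illustration of what an optimal transport objective combining transcriptional similarity and physical distance can look like, one fused Gromov-Wasserstein form is sketched below. The symbols are assumptions introduced here and are not taken from the abstract: $\pi$ is the coupling between spots $i, k$ of one slice and spots $j, l$ of the adjacent slice, $c$ is an expression dissimilarity, $d$ and $d'$ are within-slice spatial distances, $g$ and $g'$ are spot weights, and $\alpha \in [0, 1]$ balances the two terms.

```latex
\min_{\pi \ge 0}\;
(1-\alpha) \sum_{i,j} c(x_i, x'_j)\, \pi_{ij}
\;+\;
\alpha \sum_{i,j,k,l} \left( d_{ik} - d'_{jl} \right)^{2} \pi_{ij}\, \pi_{kl}
\quad \text{subject to} \quad
\sum_{j} \pi_{ij} = g_i, \qquad \sum_{i} \pi_{ij} = g'_j .
```

With $\alpha = 0$ only expression similarity drives the alignment, and with $\alpha = 1$ only the preservation of spatial distances does. Intermediate values use both sources of information, which the abstract reports as an advantage.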
Affiliation(s)
- Ron Zeira
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Max Land
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
- Benjamin J Raphael
  - Department of Computer Science, Princeton University, Princeton, NJ, USA
|
188
|
Tanevski J, Flores ROR, Gabor A, Schapiro D, Saez-Rodriguez J. Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biol 2022; 23:97. [PMID: 35422018 PMCID: PMC9011939 DOI: 10.1186/s13059-022-02663-5] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Received: 10/29/2021] [Accepted: 04/01/2022] [Indexed: 12/12/2022] Open
Abstract
The advancement of highly multiplexed spatial technologies requires scalable methods that can leverage spatial information. We present MISTy, a flexible, scalable, and explainable machine learning framework for extracting relationships from any spatial omics data, from dozens to thousands of measured markers. MISTy builds multiple views focusing on different spatial or functional contexts to dissect different effects. We evaluated MISTy on in silico and breast cancer datasets measured by imaging mass cytometry and spatial transcriptomics. We estimated structural and functional interactions coming from different spatial contexts in breast cancer and demonstrated how to relate MISTy's results to clinical features.
Affiliation(s)
- Jovan Tanevski
  - Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
  - Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia
- Ricardo Omar Ramirez Flores
  - Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
- Attila Gabor
  - Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
- Denis Schapiro
  - Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
  - Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Institute of Pathology, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
- Julio Saez-Rodriguez
  - Institute for Computational Biomedicine, Faculty of Medicine, Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
  - Joint Research Centre for Computational Biomedicine (JRC-COMBINE), Faculty of Medicine, RWTH Aachen University, Aachen, Germany
|
189
|
Zhang L, Chen D, Song D, Liu X, Zhang Y, Xu X, Wang X. Clinical and translational values of spatial transcriptomics. Signal Transduct Target Ther 2022; 7:111. [PMID: 35365599 PMCID: PMC8972902 DOI: 10.1038/s41392-022-00960-w] [Citation(s) in RCA: 62] [Impact Index Per Article: 31.0] [Received: 10/26/2021] [Revised: 03/04/2022] [Accepted: 03/09/2022] [Indexed: 02/06/2023] Open
Abstract
The combination of spatial transcriptomics (ST) and single cell RNA sequencing (scRNA-seq) acts as a pivotal component to bridge the pathological phenomes of human tissues with molecular alterations, defining in situ intercellular molecular communications and knowledge on spatiotemporal molecular medicine. The present article overviews the development of ST and aims to evaluate clinical and translational values for understanding molecular pathogenesis and uncovering disease-specific biomarkers. We compare the advantages and disadvantages of sequencing- and imaging-based technologies and highlight opportunities and challenges of ST. We also describe the bioinformatics tools necessary for dissecting spatial patterns of gene expression and cellular interactions, and the potential applications of ST in human diseases for clinical practice, one of the important issues in clinical and translational medicine, including neurology, embryo development, oncology, and inflammation. Thus, clear clinical objectives, designs, optimizations of sampling procedure and protocol, repeatability of ST, as well as simplifications of analysis and interpretation are the key to translating ST from bench to clinic.
Affiliation(s)
- Linlin Zhang
  - Zhongshan Hospital, Department of Pulmonary and Critical Care Medicine, Institute for Clinical Science, Shanghai Institute of Clinical Bioinformatics, Shanghai Engineering Research for AI Technology for Cardiopulmonary Diseases, Shanghai, 200000, China
- Dongsheng Chen
  - Suzhou Institute of Systems Medicine, Suzhou, 215123, Jiangsu, China
- Dongli Song
  - Zhongshan Hospital, Department of Pulmonary and Critical Care Medicine, Institute for Clinical Science, Shanghai Institute of Clinical Bioinformatics, Shanghai Engineering Research for AI Technology for Cardiopulmonary Diseases, Shanghai, 200000, China
- Xiaoxia Liu
  - Zhongshan Hospital, Department of Pulmonary and Critical Care Medicine, Institute for Clinical Science, Shanghai Institute of Clinical Bioinformatics, Shanghai Engineering Research for AI Technology for Cardiopulmonary Diseases, Shanghai, 200000, China
- Yanan Zhang
  - Tsinghua-Berkeley Shenzhen Institute (TBSI), Tsinghua University, Shenzhen, 518055, China
- Xun Xu
  - BGI-Shenzhen, Shenzhen, 518083, China
- Xiangdong Wang
  - Zhongshan Hospital, Department of Pulmonary and Critical Care Medicine, Institute for Clinical Science, Shanghai Institute of Clinical Bioinformatics, Shanghai Engineering Research for AI Technology for Cardiopulmonary Diseases, Shanghai, 200000, China
|
190
|
Cable DM, Murray E, Zou LS, Goeva A, Macosko EZ, Chen F, Irizarry RA. Robust decomposition of cell type mixtures in spatial transcriptomics. Nat Biotechnol 2022; 40:517-526. [PMID: 33603203 PMCID: PMC8606190 DOI: 10.1038/s41587-021-00830-w] [Citation(s) in RCA: 371] [Impact Index Per Article: 185.5] [Received: 05/07/2020] [Accepted: 12/31/2020] [Indexed: 02/07/2023]
Abstract
A limitation of spatial transcriptomics technologies is that individual measurements may contain contributions from multiple cells, hindering the discovery of cell-type-specific spatial patterns of localization and expression. Here, we develop robust cell type decomposition (RCTD), a computational method that leverages cell type profiles learned from single-cell RNA-seq to decompose cell type mixtures while correcting for differences across sequencing technologies. We demonstrate the ability of RCTD to detect mixtures and identify cell types on simulated datasets. Furthermore, RCTD accurately reproduces known cell type and subtype localization patterns in Slide-seq and Visium datasets of the mouse brain. Finally, we show how RCTD's recovery of cell type localization enables the discovery of genes within a cell type whose expression depends on spatial environment. Spatial mapping of cell types with RCTD enables the spatial components of cellular identity to be defined, uncovering new principles of cellular organization in biological tissue. RCTD is publicly available as an open-source R package at https://github.com/dmcable/RCTD .
Affiliation(s)
- Dylan M. Cable
  - Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, 02139
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, 02215
- Evan Murray
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142
- Luli S. Zou
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, 02215
  - Department of Biostatistics, Harvard University, Boston, MA, 02115
- Evan Z. Macosko
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142
  - Department of Psychiatry, Massachusetts General Hospital, Boston, MA, 02114
- Fei Chen
  - Broad Institute of Harvard and MIT, Cambridge, MA, 02142
  - Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge MA 02138
- Rafael A. Irizarry
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, 02215
  - Department of Biostatistics, Harvard University, Boston, MA, 02115
|
191
|
Bergenstråhle L, He B, Bergenstråhle J, Abalo X, Mirzazadeh R, Thrane K, Ji AL, Andersson A, Larsson L, Stakenborg N, Boeckxstaens G, Khavari P, Zou J, Lundeberg J, Maaskola J. Super-resolved spatial transcriptomics by deep data fusion. Nat Biotechnol 2022; 40:476-479. [PMID: 34845373 DOI: 10.1038/s41587-021-01075-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 03/12/2020] [Accepted: 08/27/2021] [Indexed: 02/07/2023]
Abstract
Current methods for spatial transcriptomics are limited by low spatial resolution. Here we introduce a method that integrates spatial gene expression data with histological image data from the same tissue section to infer higher-resolution expression maps. Using a deep generative model, our method characterizes the transcriptome of micrometer-scale anatomical features and can predict spatial gene expression from histology images alone.
Affiliation(s)
- Ludvig Bergenstråhle
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Bryan He
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Joseph Bergenstråhle
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Xesús Abalo
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Reza Mirzazadeh
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Kim Thrane
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Andrew L Ji
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Alma Andersson
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Ludvig Larsson
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Nathalie Stakenborg
  - Department of Chronic Diseases and Metabolism, Katholieke Universiteit te Leuven, Leuven, Belgium
- Guy Boeckxstaens
  - Department of Chronic Diseases and Metabolism, Katholieke Universiteit te Leuven, Leuven, Belgium
- Paul Khavari
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- James Zou
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Joakim Lundeberg
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Jonas Maaskola
  - SciLifeLab, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
  - SciLifeLab, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
|
192
|
Abstract
Spatial transcriptomic technologies have been developed rapidly in recent years. The addition of spatial context to expression data holds the potential to revolutionize many fields in biology. However, the lack of computational tools remains a bottleneck that is preventing the broader utilization of these technologies. Recently, we have developed Giotto as a comprehensive, generally applicable, and user-friendly toolbox for spatial transcriptomic data analysis and visualization. Giotto implements a rich set of algorithms to enable robust spatial data analysis. To help users get familiar with the Giotto environment and apply it effectively in analyzing new datasets, we will describe the detailed protocols for applying Giotto without any advanced programming skills. © 2022 Wiley Periodicals LLC.
- Basic Protocol 1: Getting Giotto set up for use
- Basic Protocol 2: Pre-processing
- Basic Protocol 3: Clustering and cell-type identification
- Basic Protocol 4: Cell-type enrichment and deconvolution analyses
- Basic Protocol 5: Spatial structure analysis tools
- Basic Protocol 6: Spatial domain detection by using a hidden Markov random field model
- Support Protocol 1: Spatial proximity-associated cell-cell interactions
- Support Protocol 2: Assembly of a registered 3D Giotto object from 2D slices
Affiliation(s)
- Natalie Del Rossi
  - Department of Genetics and Genomic Sciences. Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Jiaji G. Chen
  - Section of Hematology and Medical Oncology, School of Medicine, Boston University, Boston, Massachusetts 02138, USA
- Guo-Cheng Yuan
  - Department of Genetics and Genomic Sciences. Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
  - Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Ruben Dries
  - Section of Hematology and Medical Oncology, School of Medicine, Boston University, Boston, Massachusetts 02138, USA
  - Division of Computational Biomedicine, School of Medicine, Boston University, Boston, Massachusetts 02138, USA
|
193
|
Zeng Z, Li Y, Li Y, Luo Y. Statistical and machine learning methods for spatially resolved transcriptomics data analysis. Genome Biol 2022; 23:83. [PMID: 35337374 PMCID: PMC8951701 DOI: 10.1186/s13059-022-02653-7] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Received: 10/16/2021] [Accepted: 03/15/2022] [Indexed: 01/28/2023] Open
Abstract
The recent advancement in spatial transcriptomics technology has enabled multiplexed profiling of cellular transcriptomes and spatial locations. As the capacity and efficiency of the experimental technologies continue to improve, there is an emerging need for the development of analytical approaches. Furthermore, with the continuous evolution of sequencing protocols, the underlying assumptions of current analytical methods need to be re-evaluated and adjusted to harness the increasing data complexity. To motivate and aid future model development, we herein review the recent development of statistical and machine learning methods in spatial transcriptomics, summarize useful resources, and highlight the challenges and opportunities ahead.
Affiliation(s)
- Zexian Zeng
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100084, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100084, China
  - Department of Data Sciences, Dana Farber Cancer Institute, Harvard T.H. Chan School of Public Health, Boston, MA, 02215, USA
- Yawei Li
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Yiming Li
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Yuan Luo
  - Division of Health and Biomedical Informatics, Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
  - Northwestern University Clinical and Translational Sciences Institute, Chicago, IL, 60611, USA
  - Institute for Augmented Intelligence in Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Center for Health Information Partnerships, Northwestern University, Chicago, IL, 60611, USA
|
194
|
Walker BL, Cang Z, Ren H, Bourgain-Chang E, Nie Q. Deciphering tissue structure and function using spatial transcriptomics. Commun Biol 2022; 5:220. [PMID: 35273328 PMCID: PMC8913632 DOI: 10.1038/s42003-022-03175-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 09/30/2021] [Accepted: 02/16/2022] [Indexed: 01/31/2023] Open
Abstract
The rapid development of spatial transcriptomics (ST) techniques has allowed the measurement of transcriptional levels across many genes together with the spatial positions of cells. This has led to an explosion of interest in computational methods and techniques for harnessing both spatial and transcriptional information in analysis of ST datasets. The wide diversity of approaches in aim, methodology and technology for ST provides great challenges in dissecting cellular functions in spatial contexts. Here, we synthesize and review the key problems in analysis of ST data and methods that are currently applied, while also expanding on open questions and areas of future development.
Affiliation(s)
- Benjamin L Walker
  - The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, USA
  - Department of Mathematics, University of California Irvine, Irvine, CA, USA
- Zixuan Cang
  - The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, USA
  - Department of Mathematics, University of California Irvine, Irvine, CA, USA
- Honglei Ren
  - The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, USA
  - Department of Mathematics, University of California Irvine, Irvine, CA, USA
- Qing Nie
  - The NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, CA, USA
  - Department of Mathematics, University of California Irvine, Irvine, CA, USA
  - Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
|
195
|
Abstract
The function of many biological systems, such as embryos, liver lobules, intestinal villi, and tumors, depends on the spatial organization of their cells. In the past decade, high-throughput technologies have been developed to quantify gene expression in space, and computational methods have been developed that leverage spatial gene expression data to identify genes with spatial patterns and to delineate neighborhoods within tissues. To comprehensively document spatial gene expression technologies and data-analysis methods, we present a curated review of literature on spatial transcriptomics dating back to 1987, along with a thorough analysis of trends in the field, such as usage of experimental techniques, species, tissues studied, and computational approaches used. Our Review places current methods in a historical context, and we derive insights about the field that can guide current research strategies. A companion supplement offers a more detailed look at the technologies and methods analyzed: https://pachterlab.github.io/LP_2021/ .
|
196
|
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022]
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. 
The review and the accompanied software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
Affiliation(s)
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Won-min Song
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Chen Ming
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Qian Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Xianxiao Zhou
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Peng Xu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Azra Krek
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
| | - Yonejung Yoon
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Lap Ho
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Miranda E. Orr
- Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
| | - Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
| | - Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| |
|
197
|
Li K, Yan C, Li C, Chen L, Zhao J, Zhang Z, Bao S, Sun J, Zhou M. Computational elucidation of spatial gene expression variation from spatially resolved transcriptomics data. MOLECULAR THERAPY - NUCLEIC ACIDS 2022; 27:404-411. [PMID: 35036053 PMCID: PMC8728308 DOI: 10.1016/j.omtn.2021.12.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
Recent advances in spatially resolved transcriptomics (SRT) have revolutionized biological and medical research and enabled unprecedented insight into the functional organization and cell communication of tissues and organs in situ. Identifying and elucidating gene spatial expression variation (SE analysis) is fundamental to elucidate the SRT landscape. There is an urgent need for public repositories and computational techniques of SRT data in SE analysis alongside technological breakthroughs and large-scale data generation. Increasing efforts to use in silico techniques in SE analysis have been made. However, these attempts are widely scattered among a large number of studies that are not easily accessible or comprehensible by both medical and life scientists. This study provides a survey and a summary of public resources on SE analysis in SRT studies. An updated systematic overview of state-of-the-art computational approaches and tools currently available in SE analysis are presented herein, emphasizing recent advances. Finally, the present study explores the future perspectives and challenges of in silico techniques in SE analysis. This study guides medical and life scientists to look for dedicated resources and more competent tools for characterizing spatial patterns of gene expression.
Affiliation(s)
- Ke Li
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Congcong Yan
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Chenghao Li
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Lu Chen
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Jingting Zhao
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Zicheng Zhang
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Siqi Bao
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
| | - Jie Sun
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
- Corresponding author Jie Sun, School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China.
| | - Meng Zhou
- School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China
- Corresponding author Meng Zhou, School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P. R. China.
| |
|
198
|
Obtaining spatially resolved tumor purity maps using deep multiple instance learning in a pan-cancer study. PATTERNS (NEW YORK, N.Y.) 2022; 3:100399. [PMID: 35199060 PMCID: PMC8848022 DOI: 10.1016/j.patter.2021.100399] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/03/2021] [Revised: 09/07/2021] [Accepted: 11/03/2021] [Indexed: 02/07/2023]
Abstract
Tumor purity is the percentage of cancer cells within a tissue section. Pathologists estimate tumor purity to select samples for genomic analysis by manually reading hematoxylin-eosin (H&E)-stained slides, which is tedious, time consuming, and prone to inter-observer variability. Besides, pathologists' estimates do not correlate well with genomic tumor purity values, which are inferred from genomic data and accepted as accurate for downstream analysis. We developed a deep multiple instance learning model predicting tumor purity from H&E-stained digital histopathology slides. Our model successfully predicted tumor purity in eight The Cancer Genome Atlas (TCGA) cohorts and a local Singapore cohort. The predictions were highly consistent with genomic tumor purity values. Thus, our model can be utilized to select samples for genomic analysis, which will help reduce pathologists' workload and decrease inter-observer variability. Furthermore, our model provided tumor purity maps showing the spatial variation within sections. They can help better understand the tumor microenvironment.
- MIL model successfully predicts a sample's tumor purity from histopathology slides
- MIL model learns to spatially resolve tumor purity from sample-level labels
- Tumor purity varies spatially within a sample
- Pathologists’ region selection is vital for correct percentage tumor nuclei estimation
Given some big data and coarse-level labels, extracting fine-level information is a demanding yet rewarding challenge in data science. This study develops a machine learning model utilizing big data and exploiting coarse-level labels to reveal fine-level details within the data. Although it can be applied to different data science tasks with enormous data and coarse labels, we applied it to a computational histopathology task with gigapixel histopathology slides and sample-level labels. Specifically, the model revealed spatial resolution of tumor purity within histopathology slides using only sample-level genomic tumor purity values during training. This can also be extended to other omics features, providing precious information about cancer biology and promising personalized, precision medicine. Such studies are of great clinical importance in discovering imaging biomarkers and better understanding the tumor microenvironment.
|
199
|
Vickovic S, Lötstedt B, Klughammer J, Mages S, Segerstolpe Å, Rozenblatt-Rosen O, Regev A. SM-Omics is an automated platform for high-throughput spatial multi-omics. Nat Commun 2022; 13:795. [PMID: 35145087 PMCID: PMC8831571 DOI: 10.1038/s41467-022-28445-y] [Citation(s) in RCA: 63] [Impact Index Per Article: 31.5] [Received: 10/21/2020] [Accepted: 01/24/2022] [Indexed: 12/12/2022]
Abstract
The spatial organization of cells and molecules plays a key role in tissue function in homeostasis and disease. Spatial transcriptomics has recently emerged as a key technique to capture and positionally barcode RNAs directly in tissues. Here, we advance the application of spatial transcriptomics at scale, by presenting Spatial Multi-Omics (SM-Omics) as a fully automated, high-throughput all-sequencing based platform for combined and spatially resolved transcriptomics and antibody-based protein measurements. SM-Omics uses DNA-barcoded antibodies, immunofluorescence or a combination thereof, to scale and combine spatial transcriptomics and spatial antibody-based multiplex protein detection. SM-Omics allows processing of up to 64 in situ spatial reactions or up to 96 sequencing-ready libraries, of high complexity, in a ~2 days process. We demonstrate SM-Omics in the mouse brain, spleen and colorectal cancer model, showing its broad utility as a high-throughput platform for spatial multi-omics.
Affiliation(s)
- S Vickovic
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- New York Genome Center, New York, NY, USA
- Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Solna, Sweden
| | - B Lötstedt
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - J Klughammer
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - S Mages
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Å Segerstolpe
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - O Rozenblatt-Rosen
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Genentech, 1 DNA Way, South San Francisco, CA, USA
| | - A Regev
- Klarman Cell Observatory Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Howard Hughes Medical Institute and Koch Institute for Integrative Cancer Research, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Genentech, 1 DNA Way, South San Francisco, CA, USA
| |
|
200
|
Spatial components of molecular tissue biology. Nat Biotechnol 2022; 40:308-318. [PMID: 35132261 DOI: 10.1038/s41587-021-01182-1] [Citation(s) in RCA: 109] [Impact Index Per Article: 54.5] [Received: 12/17/2020] [Accepted: 12/03/2021] [Indexed: 02/06/2023]
Abstract
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analysis of tissues and develop the requisite computational tools to address them. Developers of analytical tools need to decide on the intrinsic molecular features of each cell that need to be considered, and how cell shape and morphological features are incorporated into the analysis. Also, optimal ways to compare different tissue samples at various length scales are still being sought. Grouping these biological problems and related computational algorithms into classes across length scales, thus characterizing common issues that need to be addressed, will facilitate further progress in spatial transcriptomics and proteomics.
|