1. He S, Jin Y, Nazaret A, Shi L, Chen X, Rampersaud S, Dhillon BS, Valdez I, Friend LE, Fan JL, Park CY, Mintz RL, Lao YH, Carrera D, Fang KW, Mehdi K, Rohde M, McFaline-Figueroa JL, Blei D, Leong KW, Rudensky AY, Plitas G, Azizi E. Starfysh integrates spatial transcriptomic and histologic data to reveal heterogeneous tumor-immune hubs. Nat Biotechnol 2024:10.1038/s41587-024-02173-8. [PMID: 38514799 PMCID: PMC11415552 DOI: 10.1038/s41587-024-02173-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 02/14/2024] [Indexed: 03/23/2024]
Abstract
Spatially resolved gene expression profiling provides insight into tissue organization and cell-cell crosstalk; however, sequencing-based spatial transcriptomics (ST) lacks single-cell resolution. Current ST analysis methods require single-cell RNA sequencing data as a reference for rigorous interpretation of cell states, mostly do not use associated histology images and are not capable of inferring shared neighborhoods across multiple tissues. Here we present Starfysh, a computational toolbox using a deep generative model that incorporates archetypal analysis and any known cell type markers to characterize known or new tissue-specific cell states without a single-cell reference. Starfysh improves the characterization of spatial dynamics in complex tissues using histology images and enables the comparison of niches as spatial hubs across tissues. Integrative analysis of primary estrogen receptor (ER)-positive breast cancer, triple-negative breast cancer (TNBC) and metaplastic breast cancer (MBC) tissues led to the identification of spatial hubs with patient- and disease-specific cell type compositions and revealed metabolic reprogramming shaping immunosuppressive hubs in aggressive MBC.
Affiliation(s)
- Siyu He
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Yinuo Jin
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Achille Nazaret
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
  - Department of Computer Science, Columbia University, New York, NY, USA
- Lingting Shi
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Xueer Chen
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Sham Rampersaud
  - Pharmaceutical Sciences and Pharmacogenomics Graduate Program, University of California, San Francisco, San Francisco, CA, USA
- Bahawar S Dhillon
  - Immunology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Izabella Valdez
  - The Graduate School of Biomedical Sciences at the Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Lauren E Friend
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Joy Linyue Fan
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Cameron Y Park
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
- Rachel L Mintz
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO, USA
- Yeh-Hsing Lao
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Department of Pharmaceutical Sciences, University at Buffalo, the State University of New York, Buffalo, NY, USA
- David Carrera
  - Department of Computer Science, Columbia University, New York, NY, USA
- Kaylee W Fang
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Department of Computer Science, Columbia University, New York, NY, USA
- Kaleem Mehdi
  - Department of Computer Science, Fordham University, New York, NY, USA
- José L McFaline-Figueroa
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
- David Blei
  - Department of Computer Science, Columbia University, New York, NY, USA
  - Department of Statistics, Columbia University, New York, NY, USA
- Kam W Leong
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Alexander Y Rudensky
  - Immunology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Ludwig Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- George Plitas
  - Immunology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Howard Hughes Medical Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Ludwig Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Department of Surgery, Breast Service, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Elham Azizi
  - Department of Biomedical Engineering, Columbia University, New York, NY, USA
  - Irving Institute for Cancer Dynamics, Columbia University, New York, NY, USA
  - Department of Computer Science, Columbia University, New York, NY, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
  - Data Science Institute, Columbia University, New York, NY, USA
2. Zhu B, Wang Y, Ku LT, van Dijk D, Zhang L, Hafler DA, Zhao H. scNAT: a deep learning method for integrating paired single-cell RNA and T cell receptor sequencing profiles. Genome Biol 2023; 24:292. [PMID: 38111007 PMCID: PMC10726524 DOI: 10.1186/s13059-023-03129-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 11/27/2023] [Indexed: 12/20/2023]
Abstract
Many deep learning-based methods have been proposed to handle complex single-cell data. Deep learning approaches may also prove useful to jointly analyze single-cell RNA sequencing (scRNA-seq) and single-cell T cell receptor sequencing (scTCR-seq) data for novel discoveries. We developed scNAT, a deep learning method that integrates paired scRNA-seq and scTCR-seq data to represent data in a unified latent space for downstream analysis. We demonstrate that scNAT is capable of removing batch effects, and identifying cell clusters and a T cell migration trajectory from blood to cerebrospinal fluid in multiple sclerosis.
Affiliation(s)
- Biqing Zhu
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Yuge Wang
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, 06511, USA
- Li-Ting Ku
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, 06511, USA
- David van Dijk
  - Department of Internal Medicine, Yale School of Medicine, New Haven, CT, 06511, USA
  - Department of Computer Science, Yale University, New Haven, CT, 06511, USA
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Le Zhang
  - Department of Neuroscience, School of Medicine, Yale University, New Haven, CT, 06511, USA
  - Department of Immunobiology, School of Medicine, Yale University, New Haven, CT, 06511, USA
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- David A Hafler
  - Department of Neurology, School of Medicine, Yale University, New Haven, CT, 06511, USA
  - Department of Immunobiology, School of Medicine, Yale University, New Haven, CT, 06511, USA
  - Aligning Science Across Parkinson's (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Hongyu Zhao
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, 06511, USA
4. Paas-Oliveros E, Hernández-Lemus E, de Anda-Jáuregui G. Computational single cell oncology: state of the art. Front Genet 2023; 14:1256991. [PMID: 38028624 PMCID: PMC10663273 DOI: 10.3389/fgene.2023.1256991] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/11/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023]
Abstract
Single cell computational analysis has emerged as a powerful tool in the field of oncology, enabling researchers to decipher the complex cellular heterogeneity that characterizes cancer. By leveraging computational algorithms and bioinformatics approaches, this methodology provides insights into the underlying genetic, epigenetic and transcriptomic variations among individual cancer cells. In this paper, we present a comprehensive overview of single cell computational analysis in oncology, discussing the key computational techniques employed for data processing, analysis, and interpretation. We explore the challenges associated with single cell data, including data quality control, normalization, dimensionality reduction, clustering, and trajectory inference. Furthermore, we highlight the applications of single cell computational analysis, including the identification of novel cell states, the characterization of tumor subtypes, the discovery of biomarkers, and the prediction of therapy response. Finally, we address the future directions and potential advancements in the field, including the development of machine learning and deep learning approaches for single cell analysis. Overall, this paper aims to provide a roadmap for researchers interested in leveraging computational methods to unlock the full potential of single cell analysis in understanding cancer biology with the goal of advancing precision oncology. For this purpose, we also include a notebook that instructs on how to apply the recommended tools in the Preprocessing and Quality Control section.
Affiliation(s)
- Ernesto Paas-Oliveros
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Guillermo de Anda-Jáuregui
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - Investigadores por Mexico, Conahcyt, Mexico City, Mexico
6. Ten FW, Yuan D, Jabareen N, Phua YJ, Eils R, Lukassen S, Conrad C. resVAE ensemble: Unsupervised identification of gene sets in multi-modal single-cell sequencing data using deep ensembles. Front Cell Dev Biol 2023; 11:1091047. [PMID: 36875765 PMCID: PMC9975353 DOI: 10.3389/fcell.2023.1091047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 01/24/2023] [Indexed: 02/17/2023]
Abstract
Feature identification and manual inspection is currently still an integral part of biological data analysis in single-cell sequencing. Features such as expressed genes and open chromatin status are selectively studied in specific contexts, cell states or experimental conditions. While conventional analysis methods construct a relatively static view on gene candidates, artificial neural networks have been used to model their interactions after hierarchical gene regulatory networks. However, it is challenging to identify consistent features in this modeling process due to the inherently stochastic nature of these methods. Therefore, we propose using ensembles of autoencoders and subsequent rank aggregation to extract consensus features in a less biased manner. Here, we performed sequencing data analyses of different modalities either independently or simultaneously as well as with other analysis tools. Our resVAE ensemble method can successfully complement and find additional unbiased biological insights with minimal data processing or feature selection steps while giving a measurement of confidence, especially for models using stochastic or approximation algorithms. In addition, our method can also work with overlapping clustering identity assignment suitable for transitionary cell types or cell fates in comparison to most conventional tools.
Affiliation(s)
- Foo Wei Ten
  - Center for Digital Health, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Dongsheng Yuan
  - Center for Digital Health, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Department of Neurology with Experimental Neurology, Berlin, Germany
- Nabil Jabareen
  - Center for Digital Health, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Yin Jun Phua
  - Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
- Roland Eils
  - Center for Digital Health, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
  - Health Data Science Unit, Faculty of Medicine, University of Heidelberg, Heidelberg, Germany
- Sören Lukassen
  - Center for Digital Health, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany
- Christian Conrad
  - Center for Digital Health, Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany