1. Caron DP, Specht WL, Chen D, Wells SB, Szabo PA, Jensen IJ, Farber DL, Sims PA. Multimodal hierarchical classification of CITE-seq data delineates immune cell states across lineages and tissues. bioRxiv 2024:2023.07.06.547944. [PMID: 37461466 PMCID: PMC10350048 DOI: 10.1101/2023.07.06.547944] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/27/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) is invaluable for profiling cellular heterogeneity and dissecting transcriptional states, but transcriptomic profiles do not always delineate subsets defined by surface proteins, as in cells of the immune system. Cellular Indexing of Transcriptomes and Epitopes (CITE-seq) enables simultaneous profiling of single-cell transcriptomes and surface proteomes; however, accurate cell type annotation requires a classifier that integrates multimodal data. Here, we describe MultiModal Classifier Hierarchy (MMoCHi), a marker-based approach for classification, reconciling gene and protein expression without reliance on reference atlases. We benchmark MMoCHi using sorted T lymphocyte subsets and annotate a cross-tissue human immune cell dataset. MMoCHi outperforms leading transcriptome-based classifiers and multimodal unsupervised clustering in its ability to identify immune cell subsets that are not readily resolved and to reveal novel subset markers. MMoCHi is designed for adaptability and can integrate annotation of cell types and developmental states across diverse lineages, samples, or modalities.
Affiliation(s)
- Daniel P. Caron
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY, USA
- William L. Specht
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY, USA
- David Chen
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Steven B. Wells
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
- Peter A. Szabo
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY, USA
- Isaac J. Jensen
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY, USA
- Donna L. Farber
  - Department of Microbiology and Immunology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Surgery, Columbia University Irving Medical Center, New York, NY, USA
- Peter A. Sims
  - Department of Systems Biology, Columbia University Irving Medical Center, New York, NY, USA
  - Department of Biochemistry and Molecular Biophysics, Columbia University Irving Medical Center, New York, NY, USA
2. Feng S, Calinawan A, Pugliese P, Wang P, Ceccarelli M, Petralia F, Gosline SJC. Decomprolute is a benchmarking platform designed for multiomics-based tumor deconvolution. Cell Reports Methods 2024; 4:100708. [PMID: 38412834 PMCID: PMC10921018 DOI: 10.1016/j.crmeth.2024.100708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 10/23/2023] [Accepted: 01/18/2024] [Indexed: 02/29/2024]
Abstract
Tumor deconvolution enables the identification of diverse cell types that comprise solid tumors. To date, however, both the algorithms developed to deconvolve tumor samples and the gold-standard datasets used to assess them are geared toward the analysis of gene expression (e.g., RNA sequencing) rather than protein levels. Despite the popularity of gene expression datasets, protein levels often provide a more accurate view of rare cell types. To facilitate the use, development, and reproducibility of multiomic deconvolution algorithms, we introduce Decomprolute, a Common Workflow Language framework that leverages containerization to compare tumor deconvolution algorithms across multiomic datasets. Decomprolute incorporates the large-scale multiomic datasets produced by the Clinical Proteomic Tumor Analysis Consortium (CPTAC), which include matched mRNA expression and proteomic data from thousands of tumors across multiple cancer types, to build a fully open-source, containerized proteogenomic tumor deconvolution benchmarking platform. Decomprolute is available at http://pnnl-compbio.github.io/decomprolute.
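To make the idea of deconvolution concrete, here is a minimal sketch that assumes a bulk profile is a linear mixture of just two cell-type signatures and recovers the mixing fraction by least squares. The function name, the signatures, and the two-component restriction are illustrative assumptions; this is not Decomprolute's interface or any of the algorithms it benchmarks.

```python
def estimate_fraction(mixture, sig_a, sig_b):
    """Least-squares fraction of cell type A in a two-component mixture.

    Models mixture ~ p * sig_a + (1 - p) * sig_b and clips p to [0, 1].
    """
    diff = [a - b for a, b in zip(sig_a, sig_b)]
    resid = [m - b for m, b in zip(mixture, sig_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("signatures are identical; fraction is not identifiable")
    p = sum(r * d for r, d in zip(resid, diff)) / denom
    return min(1.0, max(0.0, p))


# Hypothetical signatures over three features (genes or proteins).
tumor = [10.0, 0.0, 5.0]
immune = [0.0, 10.0, 5.0]
bulk = [3.0, 7.0, 5.0]  # constructed as 0.3 * tumor + 0.7 * immune
print(estimate_fraction(bulk, tumor, immune))
```

The same function works on either mRNA or protein measurements, which is the kind of cross-modality comparison the abstract describes; real methods handle many cell types and noise models.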
Affiliation(s)
- Song Feng
  - Pacific Northwest National Laboratory, Seattle, WA, USA
- Anna Calinawan
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Pei Wang
  - Icahn School of Medicine at Mount Sinai, New York, NY, USA
3. Jiang Y, Chen Z, Han N, Shang J, Wu A. sc-ImmuCC: hierarchical annotation for immune cell types in single-cell RNA-seq. Front Immunol 2023; 14:1223471. [PMID: 37545533 PMCID: PMC10399579 DOI: 10.3389/fimmu.2023.1223471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 07/03/2023] [Indexed: 08/08/2023]
Abstract
Accurately identifying immune cell types in single-cell RNA-sequencing (scRNA-Seq) data is critical to uncovering immune responses in health and disease. However, the high heterogeneity and sparsity of scRNA-Seq data, as well as the similarity in gene expression among immune cell types, pose a great challenge for accurate identification of immune cell types in scRNA-Seq data. Here, we developed a tool named sc-ImmuCC for hierarchical annotation of immune cell types from scRNA-Seq data, based on optimized gene sets and the ssGSEA algorithm. sc-ImmuCC simulates the natural differentiation of immune cells, and the hierarchical annotation includes three layers, which can annotate nine major immune cell types and 29 cell subtypes. The test results showed stable performance and strong consistency among different tissue datasets, with an average accuracy of 71-90%. In addition, the optimized gene sets and hierarchical annotation strategy could be applied to other methods to improve their annotation accuracy and the spectrum of annotated cell types and subtypes. We also applied sc-ImmuCC to a dataset composed of COVID-19, influenza, and healthy donors, and found that the proportion of monocytes in patients with COVID-19 and influenza was significantly higher than that in healthy people. The easy-to-use sc-ImmuCC tool provides a good way to comprehensively annotate immune cell types from scRNA-Seq data, and will also help study the immune mechanisms underlying physiological and pathological conditions.
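The layered, top-down annotation described above can be sketched roughly as follows. This is a toy illustration, not the sc-ImmuCC implementation: the hierarchy, the marker genes, and the function names are hypothetical, and a plain mean of marker expression stands in for an ssGSEA enrichment score.

```python
from statistics import mean

# Hypothetical two-layer hierarchy: parent label -> {child label: marker genes}.
# These gene sets are illustrative placeholders, not sc-ImmuCC's optimized sets.
HIERARCHY = {
    "root": {"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]},
    "T cell": {"CD4 T": ["CD4"], "CD8 T": ["CD8A", "CD8B"]},
}


def score(cell, genes):
    """Mean expression of a gene set; a simple stand-in for an ssGSEA score."""
    return mean(cell.get(g, 0.0) for g in genes)


def annotate(cell, hierarchy=HIERARCHY):
    """Walk the hierarchy top-down, picking the best-scoring child at each layer."""
    path, node = [], "root"
    while node in hierarchy:
        children = hierarchy[node]
        node = max(children, key=lambda label: score(cell, children[label]))
        path.append(node)
    return path


# One cell as a gene -> expression mapping (made-up values).
cell = {"CD3D": 5.0, "CD3E": 4.0, "CD8A": 3.0, "CD8B": 2.0, "CD4": 0.1}
print(annotate(cell))
```

Restricting each layer to the children of the label chosen one layer up means closely related subtypes are only ever compared against each other, which is the main appeal of a hierarchical scheme.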
4. Khodayari S, Khodayari H, Saeedi E, Mahmoodzadeh H, Sadrkhah A, Nayernia K. Single-Cell Transcriptomics for Unlocking Personalized Cancer Immunotherapy: Toward Targeting the Origin of Tumor Development Immunogenicity. Cancers (Basel) 2023; 15:3615. [PMID: 37509276 PMCID: PMC10377122 DOI: 10.3390/cancers15143615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2023] [Revised: 07/11/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023]
Abstract
Cancer immunotherapy is a promising approach for treating malignancies through the activation of anti-tumor immunity. However, the effectiveness and safety of immunotherapy can be limited by tumor complexity and heterogeneity, caused by the diverse molecular and cellular features of tumors and their microenvironments. Undifferentiated tumor cell niches, which we refer to as the "Origin of Tumor Development" (OTD) cellular population, are believed to be the source of these variations and cellular heterogeneity. From our perspective, the existence of distinct features within the OTD is expected to play a significant role in shaping the unique tumor characteristics observed in each patient. Single-cell transcriptomics is a high-resolution and high-throughput technique that provides insights into the genetic signatures of individual tumor cells, revealing mechanisms of tumor development, progression, and immune evasion. In this review, we explain how single-cell transcriptomics can be used to develop personalized cancer immunotherapy by identifying potential biomarkers and targets specific to each patient, such as immune checkpoint and tumor-infiltrating lymphocyte function, for targeting the OTD. Furthermore, in addition to offering a possible workflow, we discuss the future directions of, and perspectives on, single-cell transcriptomics, such as the development of powerful analytical tools and databases, that will aid in unlocking personalized cancer immunotherapy through the targeting of the patient's cellular OTD.
Affiliation(s)
- Saeed Khodayari
  - International Center for Personalized Medicine (P7MEDICINE), Luise-Rainer-Str. 6-12, 40235 Düsseldorf, Germany
- Hamid Khodayari
  - International Center for Personalized Medicine (P7MEDICINE), Luise-Rainer-Str. 6-12, 40235 Düsseldorf, Germany
- Elnaz Saeedi
  - Oxford Clinical Trials Research Unit, Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences (NDORMS), University of Oxford, Oxford OX3 7LD, UK
- Habibollah Mahmoodzadeh
  - Breast Disease Research Center, Tehran University of Medical Sciences, Tehran 1819613844, Iran
- Karim Nayernia
  - International Center for Personalized Medicine (P7MEDICINE), Luise-Rainer-Str. 6-12, 40235 Düsseldorf, Germany
5. Wang H, Yao Z, Luo R, Liu J, Wang Z, Zhang G. LaCOme: Learning the latent convolutional patterns among transcriptomic features to improve classifications. Gene 2023; 862:147246. [PMID: 36736509 DOI: 10.1016/j.gene.2023.147246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 12/22/2022] [Accepted: 01/27/2023] [Indexed: 02/04/2023]
Abstract
OMIC is a novel approach that analyses entire genetic or molecular profiles in humans and other organisms. It involves identifying and quantifying biological molecules that contribute to a species' structure, function, and dynamics. Finding the secrets of OMIC is like deciphering the biochemical code, but building data-driven models to mine the hidden phenotypic trait information has been a research hotspot. Transcriptome analysis is a popular biological technology for characterizing living systems' overall health, including cells and tissues. Individual transcript expression levels are known to be correlated with those of other transcripts. Nevertheless, most computational studies do not fully exploit these inter-feature correlations. Differential expression analyses, for example, assume that the expression levels of the transcripts are independent. Thus, we propose extracting these inter-feature correlations using the convolutional neural network (CNN) and transforming the transcriptomic features into a new space of convolutional transcriptomic (LaCOme) features. On most transcriptomic datasets in use, a series of comprehensive experiments have demonstrated that engineered LaCOme features outperform the original transcriptomic features in classification performances. Based on experimental results, OMIC data from biological samples could be further enriched using CNN to enhance computational analysis results. Also, feature rough screening can be used to extract valuable information from OMIC, regardless of the algorithm used to select features. It may always be better to create a novel feature than to keep the original. Furthermore, we investigated the feasibility of the feature construction method through cross-validation and independent verification, hoping to develop a more efficient and effective method.
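As a rough illustration of what a convolutional feature over transcript expression values looks like, the sketch below slides a small kernel across an ordered expression vector so that each output combines neighboring features. The kernel weights and expression values are made up; in a CNN the kernels would be learned during training, and this is not the LaCOme architecture itself.

```python
def conv1d(features, kernel):
    """Valid-mode 1D cross-correlation, the operation a CNN layer applies."""
    k = len(kernel)
    return [
        sum(features[i + j] * kernel[j] for j in range(k))
        for i in range(len(features) - k + 1)
    ]


# Hypothetical expression values for four transcripts in a fixed order.
expression = [1.0, 2.0, 4.0, 3.0]
# Hypothetical kernel; a trained network would learn these weights.
kernel = [0.5, 0.5]
print(conv1d(expression, kernel))  # [1.5, 3.0, 3.5]
```

Each output value mixes adjacent inputs, so the derived features can capture inter-feature relationships that a per-transcript analysis treats as independent. The result depends on how the transcripts are ordered, which is a design choice this sketch does not address.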
Affiliation(s)
- Hongyu Wang
  - Department of Nuclear Medicine, General Hospital of Northern Theater Command, Shenyang, Liaoning 110016, China; College of Software, Jilin University, Changchun, Jilin 130012, China
- Zhaomin Yao
  - Department of Nuclear Medicine, General Hospital of Northern Theater Command, Shenyang, Liaoning 110016, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning 110167, China
- Renli Luo
  - College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning 110167, China
- Jiahao Liu
  - School of Mathematical Sciences, Chongqing Normal University, Chongqing 401331, China
- Zhiguo Wang
  - Department of Nuclear Medicine, General Hospital of Northern Theater Command, Shenyang, Liaoning 110016, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning 110167, China
- Guoxu Zhang
  - Department of Nuclear Medicine, General Hospital of Northern Theater Command, Shenyang, Liaoning 110016, China; College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning 110167, China
6. Brendel M, Su C, Bai Z, Zhang H, Elemento O, Wang F. Application of Deep Learning on Single-cell RNA Sequencing Data Analysis: A Review. Genomics, Proteomics & Bioinformatics 2022; 20:814-835. [PMID: 36528240 PMCID: PMC10025684 DOI: 10.1016/j.gpb.2022.11.011] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/23/2022] [Revised: 08/17/2022] [Accepted: 11/24/2022] [Indexed: 12/23/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously. Analysis of scRNA-seq data plays an important role in the study of cell states and phenotypes, and has helped elucidate biological processes, such as those occurring during the development of complex organisms, and improved our understanding of disease states, such as cancer, diabetes, and coronavirus disease 2019 (COVID-19). Deep learning, a recent advance of artificial intelligence that has been used to address many problems involving large datasets, has also emerged as a promising tool for scRNA-seq data analysis, as it has a capacity to extract informative and compact features from noisy, heterogeneous, and high-dimensional scRNA-seq data to improve downstream analysis. The present review aims at surveying recently developed deep learning techniques in scRNA-seq data analysis, identifying key steps within the scRNA-seq data analysis pipeline that have been advanced by deep learning, and explaining the benefits of deep learning over more conventional analytic tools. Finally, we summarize the challenges in current deep learning approaches faced within scRNA-seq data and discuss potential directions for improvements in deep learning algorithms for scRNA-seq data analysis.
Affiliation(s)
- Matthew Brendel
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA; Institute for Computational Biomedicine, Caryl and Israel Englander Institute for Precision Medicine, Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
- Chang Su
  - Department of Health Service Administration and Policy, Temple University, Philadelphia, PA 19122, USA
- Zilong Bai
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
- Hao Zhang
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
- Olivier Elemento
  - Institute for Computational Biomedicine, Caryl and Israel Englander Institute for Precision Medicine, Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY 10065, USA