1
Li Y, Luo Y. STdGCN: spatial transcriptomic cell-type deconvolution using graph convolutional networks. Genome Biol 2024; 25:206. [PMID: 39103939 DOI: 10.1186/s13059-024-03353-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Accepted: 07/26/2024] [Indexed: 08/07/2024]
Abstract
Spatially resolved transcriptomics integrates high-throughput transcriptome measurements with preserved spatial cellular organization information. However, many technologies cannot reach single-cell resolution. We present STdGCN, a graph model leveraging single-cell RNA sequencing (scRNA-seq) as reference for cell-type deconvolution in spatial transcriptomic (ST) data. STdGCN incorporates expression profiles from scRNA-seq and spatial localization from ST data for deconvolution. Extensive benchmarking on multiple datasets demonstrates that STdGCN outperforms 17 state-of-the-art models. In a human breast cancer Visium dataset, STdGCN delineates stroma, lymphocytes, and cancer cells, aiding tumor microenvironment analysis. In human heart ST data, STdGCN identifies changes in endothelial-cardiomyocyte communications during tissue development.
Affiliation(s)
- Yawei Li
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Center for Collaborative AI in Healthcare, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Yuan Luo
  - Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
  - Center for Collaborative AI in Healthcare, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
2
Görtler F, Mensching-Buhr M, Skaar Ø, Schrod S, Sterr T, Schäfer A, Beißbarth T, Joshi A, Zacharias HU, Grellscheid SN, Altenbuchinger M. Adaptive digital tissue deconvolution. Bioinformatics 2024; 40:i100-i109. [PMID: 38940181 PMCID: PMC11256946 DOI: 10.1093/bioinformatics/btae263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
MOTIVATION: The inference of cellular compositions from bulk and spatial transcriptomics data increasingly complements data analyses. Multiple computational approaches were suggested and, recently, machine learning techniques were developed to systematically improve estimates. Such approaches allow the inference of additional, less abundant cell types. However, they rely on training data which do not capture the full biological diversity encountered in transcriptomics analyses; data can contain cellular contributions not seen in the training data and, as such, analyses can be biased or blurred. Thus, computational approaches have to deal with unknown, hidden contributions. Moreover, most methods are based on cellular archetypes which serve as a reference; e.g. a generic T-cell profile is used to infer the proportion of T-cells. It is well known that cells adapt their molecular phenotype to the environment and that pre-specified cell archetypes can distort the inference of cellular compositions.
RESULTS: We propose Adaptive Digital Tissue Deconvolution (ADTD) to estimate cellular proportions of pre-selected cell types together with possibly unknown and hidden background contributions. Moreover, ADTD adapts prototypic reference profiles to the molecular environment of the cells, which further resolves cell-type specific gene regulation from bulk transcriptomics data. We verify this in simulation studies and demonstrate that ADTD improves existing approaches in estimating cellular compositions. In an application to bulk transcriptomics data from breast cancer patients, we demonstrate that ADTD provides insights into cell-type specific molecular differences between breast cancer subtypes.
AVAILABILITY AND IMPLEMENTATION: A Python implementation of ADTD and a tutorial are available at GitLab and Zenodo (doi:10.5281/zenodo.7548362).
Affiliation(s)
- Franziska Görtler
  - Computational Biology Unit, Department of Biological Sciences, University of Bergen, N-5008 Bergen, Norway
  - Department of Oncology and Medical Physics, Haukeland University Hospital, 5021 Bergen, Norway
- Malte Mensching-Buhr
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Ørjan Skaar
  - Department of Informatics, Computational Biology Unit, University of Bergen, N-5008 Bergen, Norway
- Stefan Schrod
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Thomas Sterr
  - Institute of Theoretical Physics, University of Regensburg, 93053 Regensburg, Germany
- Andreas Schäfer
  - Institute of Theoretical Physics, University of Regensburg, 93053 Regensburg, Germany
- Tim Beißbarth
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
- Anagha Joshi
  - Department of Clinical Science, Computational Biology Unit, University of Bergen, N-5008 Bergen, Norway
- Helena U Zacharias
  - Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, 30625 Hannover, Germany
- Michael Altenbuchinger
  - Department of Medical Bioinformatics, University Medical Center Göttingen, 37075 Göttingen, Germany
3
Xue Y, Friedl V, Ding H, Wong CK, Stuart JM. Single-cell signatures identify microenvironment factors in tumors associated with patient outcomes. Cell Reports Methods 2024; 4:100799. [PMID: 38889686 PMCID: PMC11228369 DOI: 10.1016/j.crmeth.2024.100799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Revised: 04/30/2024] [Accepted: 05/21/2024] [Indexed: 06/20/2024]
Abstract
The cellular components of tumors and their microenvironment play pivotal roles in tumor progression, patient survival, and the response to cancer treatments. Unveiling a comprehensive cellular profile within bulk tumors via single-cell RNA sequencing (scRNA-seq) data is crucial, as it unveils intrinsic tumor cellular traits that elude identification through conventional cancer subtyping methods. Our contribution, scBeacon, is a tool that derives cell-type signatures by integrating and clustering multiple scRNA-seq datasets to extract signatures for deconvolving unrelated tumor datasets on bulk samples. Through the employment of scBeacon on The Cancer Genome Atlas (TCGA) cohort, we find cellular and molecular attributes within specific tumor categories, many with patient outcome relevance. We developed a tumor cell-type map to visually depict the relationships among TCGA samples based on the cell-type inferences.
Affiliation(s)
- Yuanqing Xue
  - UC Santa Cruz Department, Biomolecular Engineering, Genomics Institute, Santa Cruz, CA, USA
- Verena Friedl
  - UC Santa Cruz Department, Biomolecular Engineering, Genomics Institute, Santa Cruz, CA, USA
- Hongxu Ding
  - UC Santa Cruz Department, Biomolecular Engineering, Genomics Institute, Santa Cruz, CA, USA
- Christopher K Wong
  - UC Santa Cruz Department, Biomolecular Engineering, Genomics Institute, Santa Cruz, CA, USA
- Joshua M Stuart
  - UC Santa Cruz Department, Biomolecular Engineering, Genomics Institute, Santa Cruz, CA, USA
4
Nguyen H, Nguyen H, Tran D, Draghici S, Nguyen T. Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res 2024; 52:4761-4783. [PMID: 38619038 PMCID: PMC11109966 DOI: 10.1093/nar/gkae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/01/2024] [Accepted: 04/02/2024] [Indexed: 04/16/2024]
Abstract
Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).
Affiliation(s)
- Hung Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Ha Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Duc Tran
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
  - Advaita Bioinformatics, Ann Arbor, MI, USA
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
5
Li Y, Luo Y. Spatial Transcriptomic Cell-type Deconvolution Using Graph Neural Networks. bioRxiv 2023:2023.03.10.532112. [PMID: 37333198 PMCID: PMC10274700 DOI: 10.1101/2023.03.10.532112] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/20/2023]
Abstract
Spatially resolved transcriptomics performs high-throughput measurement of transcriptomes while preserving spatial information about the cellular organizations. However, many spatially resolved transcriptomic technologies can only distinguish spots consisting of a mixture of cells instead of working at single-cell resolution. Here, we present STdGCN, a graph neural network model designed for cell type deconvolution of spatial transcriptomic (ST) data that can leverage abundant single-cell RNA sequencing (scRNA-seq) data as reference. STdGCN is the first model incorporating the expression profiles from single cell data as well as the spatial localization information from the ST data for cell type deconvolution. Extensive benchmarking experiments on multiple ST datasets showed that STdGCN outperformed 14 published state-of-the-art models. Applied to a human breast cancer Visium dataset, STdGCN discerned spatial distributions between stroma, lymphocytes and cancer cells for tumor microenvironment dissection. In a human heart ST dataset, STdGCN detected the changes of potential endothelial-cardiomyocyte communications during tissue development.
Affiliation(s)
- Yawei Li
  - Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
  - Center for Collaborative AI in Healthcare, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
- Yuan Luo
  - Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
  - Center for Collaborative AI in Healthcare, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA
6
Luo J, Wu X, Cheng Y, Chen G, Wang J, Song X. Expression quantitative trait locus studies in the era of single-cell omics. Front Genet 2023; 14:1182579. [PMID: 37284065 PMCID: PMC10239882 DOI: 10.3389/fgene.2023.1182579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 04/26/2023] [Indexed: 06/08/2023]
Abstract
Genome-wide association studies have revealed that the regulation of gene expression bridges genetic variants and complex phenotypes. Profiling of the bulk transcriptome coupled with linkage analysis (expression quantitative trait locus (eQTL) mapping) has advanced our understanding of the relationship between genetic variants and gene regulation in the context of complex phenotypes. However, bulk transcriptomics has inherent limitations, as the regulation of gene expression tends to be cell-type-specific. The advent of single-cell RNA-seq technology now enables the identification of the cell-type-specific regulation of gene expression through a single-cell eQTL (sc-eQTL). In this review, we first provide an overview of sc-eQTL studies, including data processing and the mapping procedure of the sc-eQTL. We then discuss the benefits and limitations of sc-eQTL analyses. Finally, we present an overview of the current and future applications of sc-eQTL discoveries.
Affiliation(s)
- Jie Luo
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xinyi Wu
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Yuan Cheng
  - Institute of Vegetables, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Guang Chen
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Jian Wang
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Xijiao Song
  - State Key Laboratory for Managing Biotic and Chemical Threats to The Quality and Safety of Agro‐products, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
7
Jiang X, Luo D, Fernández E, Yang J, Li H, Jin KW, Zhan Y, Yao B, Bedi S, Xiao G, Zhan X, Li Q, Xie Y. Spatial Transcriptomics Arena (STAr): an Integrated Platform for Spatial Transcriptomics Methodology Research. bioRxiv 2023:2023.03.10.532127. [PMID: 36945650 PMCID: PMC10028992 DOI: 10.1101/2023.03.10.532127] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/13/2023]
Abstract
The emerging field of spatially resolved transcriptomics (SRT) has revolutionized biomedical research. SRT quantifies expression levels at different spatial locations, providing a new and powerful tool to interrogate novel biological insights. An essential question in the analysis of SRT data is to identify spatially variable (SV) genes; the expression levels of such genes have spatial variation across different tissues. SV genes usually play an important role in underlying biological mechanisms and tissue heterogeneity. Currently, several computational methods have been developed to detect such genes; however, there is a lack of unbiased assessment of these approaches to guide researchers in selecting the appropriate methods for their specific biomedical applications. In addition, it is difficult for researchers to implement different existing methods for either biological study or methodology development. Furthermore, currently available public SRT datasets are scattered across different websites and preprocessed in different ways, posing additional obstacles for quantitative researchers developing computational methods for SRT data analysis. To address these challenges, we designed Spatial Transcriptomics Arena (STAr), an open platform comprising 193 curated datasets from seven technologies, seven statistical methods, and analysis results. This resource allows users to retrieve high-quality datasets, apply or develop spatial gene detection methods, as well as browse and compare spatial gene analysis results. It also enables researchers to comprehensively evaluate SRT methodology research in both simulated and real datasets. Altogether, STAr is an integrated research resource intended to promote reproducible research and accelerate rigorous methodology development, which can eventually lead to an improved understanding of biological processes and diseases. STAr can be accessed at https://lce.biohpc.swmed.edu/star/.
8
Knowledge-graph-based cell-cell communication inference for spatially resolved transcriptomic data with SpaTalk. Nat Commun 2022; 13:4429. [PMID: 35908020 PMCID: PMC9338929 DOI: 10.1038/s41467-022-32111-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 04/20/2022] [Accepted: 07/18/2022] [Indexed: 12/19/2022]
Abstract
Spatially resolved transcriptomics provides genetic information in space toward elucidation of the spatial architecture in intact organs and the spatially resolved cell-cell communications mediating tissue homeostasis, development, and disease. To facilitate inference of spatially resolved cell-cell communications, we here present SpaTalk, which relies on a graph network and knowledge graph to model and score the ligand-receptor-target signaling network between spatially proximal cells by dissecting cell-type composition through a non-negative linear model and spatial mapping between single-cell transcriptomic and spatially resolved transcriptomic data. The benchmarked performance of SpaTalk on public single-cell spatial transcriptomic datasets is superior to that of existing inference methods. Then we apply SpaTalk to STARmap, Slide-seq, and 10X Visium data, revealing the in-depth communicative mechanisms underlying normal and disease tissues with spatial structure. SpaTalk can uncover spatially resolved cell-cell communications for single-cell and spot-based spatially resolved transcriptomic data universally, providing valuable insights into spatial inter-cellular tissue dynamics.
Cell-cell communication is a vital feature involving numerous biological processes. Here, the authors develop SpaTalk, a cell-cell communication inference method using knowledge graph for spatially resolved transcriptomic data, providing valuable insights into spatial intercellular tissue dynamics.
9
Detection of Cell Separation-Induced Gene Expression Through a Penalized Deconvolution Approach. Statistics in Biosciences 2022. [DOI: 10.1007/s12561-022-09344-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
10
Comprehensive evaluation of deconvolution methods for human brain gene expression. Nat Commun 2022; 13:1358. [PMID: 35292647 PMCID: PMC8924248 DOI: 10.1038/s41467-022-28655-4] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 09/05/2019] [Accepted: 01/28/2022] [Indexed: 11/08/2022]
Abstract
Transcriptome deconvolution aims to estimate the cellular composition of an RNA sample from its gene expression data, which in turn can be used to correct for composition differences across samples. The human brain is unique in its transcriptomic diversity, and comprises a complex mixture of cell-types, including transcriptionally similar subtypes of neurons. Here, we carry out a comprehensive evaluation of deconvolution methods for human brain transcriptome data, and assess the tissue-specificity of our key observations by comparison with human pancreas and heart. We evaluate eight transcriptome deconvolution approaches and nine cell-type signatures, testing the accuracy of deconvolution using in silico mixtures of single-cell RNA-seq data, RNA mixtures, as well as nearly 2000 human brain samples. Our results identify the main factors that drive deconvolution accuracy for brain data, and highlight the importance of biological factors influencing cell-type signatures, such as brain region and in vitro cell culturing.
Transcriptome deconvolution aims to estimate cellular composition based on gene expression data. Here the authors evaluate deconvolution methods for human brain transcriptome and conclude that partial deconvolution algorithms work best, but that appropriate cell-type signatures are also important.
11
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022]
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. 
The review and the accompanying software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
Affiliation(s)
- Minghui Wang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Won-min Song
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Chen Ming
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Qian Wang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Xianxiao Zhou
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Peng Xu
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Azra Krek
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Yonejung Yoon
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Lap Ho
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Miranda E. Orr
  - Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
  - Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Guo-Cheng Yuan
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
- Bin Zhang
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
12
Marquez-Galera A, de la Prida LM, Lopez-Atalaya JP. A protocol to extract cell-type-specific signatures from differentially expressed genes in bulk-tissue RNA-seq. STAR Protoc 2022; 3:101121. [PMID: 35118429 PMCID: PMC8792262 DOI: 10.1016/j.xpro.2022.101121] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/15/2022]
Abstract
Bulk-tissue RNA-seq is widely used to dissect variation in gene expression levels across tissues and under different experimental conditions. Here, we introduce a protocol that leverages existing single-cell expression data to deconvolve patterns of cell-type-specific gene expression in differentially expressed gene lists from highly heterogeneous tissue. We apply this protocol to interrogate cell-type-specific gene expression and variation in cell type composition between the distinct sublayers of the hippocampal CA1 region of the brain in a rodent model of epilepsy. For complete details on the use and execution of this protocol, please refer to Cid et al. (2021).
- A protocol to explore gene signatures from bulk RNA-seq at the cell-type-specific level
- Deconvolution of complex gene signatures from highly heterogeneous tissues
- Publicly available single-cell gene expression dataset is retrieved and curated
- Gene signatures across brain regions and disease states are surveyed in scRNA-seq data
13
Longo SK, Guo MG, Ji AL, Khavari PA. Integrating single-cell and spatial transcriptomics to elucidate intercellular tissue dynamics. Nat Rev Genet 2021; 22:627-644. [PMID: 34145435 PMCID: PMC9888017 DOI: 10.1038/s41576-021-00370-8] [Citation(s) in RCA: 385] [Impact Index Per Article: 128.3] [Accepted: 04/29/2021] [Indexed: 02/07/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) identifies cell subpopulations within tissue but does not capture their spatial distribution nor reveal local networks of intercellular communication acting in situ. A suite of recently developed techniques that localize RNA within tissue, including multiplexed in situ hybridization and in situ sequencing (here defined as high-plex RNA imaging) and spatial barcoding, can help address this issue. However, no method currently provides as complete a scope of the transcriptome as does scRNA-seq, underscoring the need for approaches to integrate single-cell and spatial data. Here, we review efforts to integrate scRNA-seq with spatial transcriptomics, including emerging integrative computational methods, and propose ways to effectively combine current methodologies.
Affiliation(s)
- Sophia K. Longo
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Margaret G. Guo
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Program in Biomedical Informatics, Stanford University, Stanford, CA, USA
- Andrew L. Ji
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Paul A. Khavari
  - Program in Epithelial Biology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Veterans Affairs Palo Alto Healthcare System, Palo Alto, CA, USA
14
Auerbach BJ, Hu J, Reilly MP, Li M. Applications of single-cell genomics and computational strategies to study common disease and population-level variation. Genome Res 2021; 31:1728-1741. [PMID: 34599006 PMCID: PMC8494214 DOI: 10.1101/gr.275430.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/13/2022]
Abstract
The advent and rapid development of single-cell technologies have made it possible to study cellular heterogeneity at an unprecedented resolution and scale. Cellular heterogeneity underlies phenotypic differences among individuals, and studying cellular heterogeneity is an important step toward our understanding of the disease molecular mechanism. Single-cell technologies offer opportunities to characterize cellular heterogeneity from different angles, but how to link cellular heterogeneity with disease phenotypes requires careful computational analysis. In this article, we will review the current applications of single-cell methods in human disease studies and describe what we have learned so far from existing studies about human genetic variation. As single-cell technologies are becoming widely applicable in human disease studies, population-level studies have become a reality. We will describe how we should go about pursuing and designing these studies, particularly how to select study subjects, how to determine the number of cells to sequence per subject, and the needed sequencing depth per cell. We also discuss computational strategies for the analysis of single-cell data and describe how single-cell data can be integrated with bulk tissue data and data generated from genome-wide association studies. Finally, we point out open problems and future research directions.
Affiliation(s)
- Benjamin J Auerbach
- Graduate Group in Genomics and Computational Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA
| | - Jian Hu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA
| | - Muredach P Reilly
- Division of Cardiology, Department of Medicine, Columbia University Irving Medical Center, New York, New York 10032, USA
| | - Mingyao Li
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania 19104, USA
| |
|
15
|
Thind AS, Monga I, Thakur PK, Kumari P, Dindhoria K, Krzak M, Ranson M, Ashford B. Demystifying emerging bulk RNA-Seq applications: the application and utility of bioinformatic methodology. Brief Bioinform 2021; 22:6330938. [PMID: 34329375 DOI: 10.1093/bib/bbab259] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 03/28/2021] [Revised: 06/14/2021] [Accepted: 06/18/2021] [Indexed: 12/13/2022] Open
Abstract
Significant innovations in next-generation sequencing techniques and bioinformatics tools have impacted our appreciation and understanding of RNA. Practical RNA sequencing (RNA-Seq) applications have evolved in conjunction with sequence technology and bioinformatic tools advances. In most projects, bulk RNA-Seq data is used to measure gene expression patterns, isoform expression, alternative splicing and single-nucleotide polymorphisms. However, RNA-Seq holds far more hidden biological information including details of copy number alteration, microbial contamination, transposable elements, cell type (deconvolution) and the presence of neoantigens. Recent novel and advanced bioinformatic algorithms developed the capacity to retrieve this information from bulk RNA-Seq data, thus broadening its scope. The focus of this review is to comprehend the emerging bulk RNA-Seq-based analyses, emphasizing less familiar and underused applications. In doing so, we highlight the power of bulk RNA-Seq in providing biological insights.
Affiliation(s)
- Amarinder Singh Thind
- University of Wollongong, Wollongong, Australia; Illawarra Health and Medical Research Institute, Wollongong, Australia
| | - Isha Monga
- Columbia University, New York City, NY, USA
| | | | - Pallawi Kumari
- Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
| | - Kiran Dindhoria
- Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
| | | | - Marie Ranson
- University of Wollongong, Wollongong, Australia; Illawarra Health and Medical Research Institute, Wollongong, Australia
| | - Bruce Ashford
- University of Wollongong, Wollongong, Australia; Illawarra Health and Medical Research Institute, Wollongong, Australia
| |
|
16
|
Aliee H, Theis FJ. AutoGeneS: Automatic gene selection using multi-objective optimization for RNA-seq deconvolution. Cell Syst 2021; 12:706-715.e4. [PMID: 34293324 DOI: 10.1016/j.cels.2021.05.006] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 02/08/2020] [Revised: 07/31/2020] [Accepted: 05/07/2021] [Indexed: 12/25/2022]
Abstract
Knowing cell-type proportions in a tissue is very important to identify which cells or cell types are targeted by a disease or perturbation. Hence, several deconvolution methods have been proposed to infer cell-type proportions from bulk RNA samples. Their performance with noisy reference profiles and closely correlated cell types highly depends on the set of genes undergoing deconvolution. In this work, we introduce AutoGeneS, a platform that automatically extracts discriminative genes and reveals the cellular heterogeneity of bulk RNA samples. AutoGeneS requires no prior knowledge about marker genes and selects genes by simultaneously optimizing multiple criteria: minimizing the correlation and maximizing the distance between cell types. AutoGeneS can be applied to reference profiles from various sources like single-cell experiments or sorted cell populations. Ground truth cell proportions analyzed by flow cytometry confirmed the accuracy of AutoGeneS in identifying cell-type proportions. AutoGeneS is available for use via a standalone Python package (https://github.com/theislab/AutoGeneS).
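The two selection criteria named in this abstract can be illustrated with a small sketch. It is not the AutoGeneS implementation: the function names, the toy profiles, the use of signed Pearson correlation and Euclidean distance summed over cell-type pairs, and the dominance check are all assumptions made for illustration.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (assumes neither is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def objectives(profiles, genes):
    """Return (total pairwise correlation, total pairwise distance) between
    cell-type reference profiles restricted to the chosen gene indices."""
    sub = {t: [p[g] for g in genes] for t, p in profiles.items()}
    corr = 0.0
    dist = 0.0
    for a, b in combinations(sorted(sub), 2):
        corr += pearson(sub[a], sub[b])    # criterion to minimize
        dist += math.dist(sub[a], sub[b])  # criterion to maximize
    return corr, dist

def dominates(a, b):
    """True if objective pair a is no worse than b on both criteria and better on one."""
    return a[0] <= b[0] and a[1] >= b[1] and (a[0] < b[0] or a[1] > b[1])

# Hypothetical reference profiles: 4 genes, 2 cell types.
profiles = {"A": [5, 0, 1, 2], "B": [0, 5, 1, 2]}

obj_x = objectives(profiles, (0, 1, 2))  # contains both marker-like genes
obj_y = objectives(profiles, (0, 2, 3))  # contains only one of them
```

In this toy case the gene set containing both marker-like genes gives a lower correlation and a larger distance, so it dominates the other candidate. A multi-objective optimizer would keep the full set of non-dominated gene sets instead of a single winner.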
Affiliation(s)
- Hananeh Aliee
- Institute of Computational Biology, Helmholtz Centre, Munich, Bayern 85764, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Centre, Munich, Bayern 85764, Germany; Department of Mathematics, Technical University of Munich, Munich, Bayern 85748, Germany; TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
| |
|
17
|
Kang K, Huang C, Li Y, Umbach DM, Li L. CDSeqR: fast complete deconvolution for gene expression data from bulk tissues. BMC Bioinformatics 2021; 22:262. [PMID: 34030626 PMCID: PMC8142515 DOI: 10.1186/s12859-021-04186-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/24/2020] [Accepted: 05/12/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Biological tissues consist of heterogeneous populations of cells. Because gene expression patterns from bulk tissue samples reflect the contributions from all cells in the tissue, understanding the contribution of individual cell types to the overall gene expression in the tissue is fundamentally important. We recently developed a computational method, CDSeq, that can simultaneously estimate both sample-specific cell-type proportions and cell-type-specific gene expression profiles using only bulk RNA-Seq counts from multiple samples. Here we present an R implementation of CDSeq (CDSeqR) with significant performance improvement over the original implementation in MATLAB and an added new function to aid cell type annotation. The R package would be of interest for the broader R community. RESULTS: We developed a novel strategy to substantially improve computational efficiency in both speed and memory usage. In addition, we designed and implemented a new function for annotating the CDSeq-estimated cell types using single-cell RNA sequencing (scRNA-seq) data. This function allows users to readily interpret and visualize the CDSeq-estimated cell types. In addition, this new function further allows the users to annotate CDSeq-estimated cell types using marker genes. We carried out additional validations of the CDSeqR software using synthetic, real cell mixtures, and real bulk RNA-seq data from the Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) project. CONCLUSIONS: The existing bulk RNA-seq repositories, such as TCGA and GTEx, provide enormous resources for better understanding changes in transcriptomics and human diseases. They are also potentially useful for studying cell-cell interactions in the tissue microenvironment. Bulk level analyses neglect tissue heterogeneity, however, and hinder investigation of cell-type-specific expression. The CDSeqR package may aid in silico dissection of bulk expression data, enabling researchers to recover cell-type-specific information.
Affiliation(s)
- Kai Kang
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA.
| | - Caizhi Huang
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA
| | - Yuanyuan Li
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA
| | - David M Umbach
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA
| | - Leping Li
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, Durham, NC, 27709, USA.
| |
|
18
|
Zhang M, Sheffield T, Zhan X, Li Q, Yang DM, Wang Y, Wang S, Xie Y, Wang T, Xiao G. Spatial molecular profiling: platforms, applications and analysis tools. Brief Bioinform 2021; 22:bbaa145. [PMID: 32770205 PMCID: PMC8138878 DOI: 10.1093/bib/bbaa145] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/23/2020] [Revised: 05/26/2020] [Accepted: 06/09/2020] [Indexed: 12/24/2022] Open
Abstract
Molecular profiling technologies, such as genome sequencing and proteomics, have transformed biomedical research, but most such technologies require tissue dissociation, which leads to loss of tissue morphology and spatial information. Recent developments in spatial molecular profiling technologies have enabled the comprehensive molecular characterization of cells while keeping their spatial and morphological contexts intact. Molecular profiling data generate deep characterizations of the genetic, transcriptional and proteomic events of cells, while tissue images capture the spatial locations, organizations and interactions of the cells together with their morphology features. These data, together with cell and tissue imaging data, provide unprecedented opportunities to study tissue heterogeneity and cell spatial organization. This review aims to provide an overview of these recent developments in spatial molecular profiling technologies and the corresponding computational methods developed for analyzing such data.
Affiliation(s)
- Minzhe Zhang
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Thomas Sheffield
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Xiaowei Zhan
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Qiwei Li
- Department of Mathematical Sciences at University of Texas at Dallas
| | - Donghan M Yang
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Yunguan Wang
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Shidan Wang
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Yang Xie
- Quantitative Biomedical Research Center at the University of Texas Southwestern Medical Center
| | - Tao Wang
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| | - Guanghua Xiao
- Department of Population and Data Sciences at University of Texas Southwestern Medical Center
| |
|
19
|
Du SZ, Chen C, Qin L, Tang XL. Bioinformatics analysis of immune infiltration in glioblastoma multiforme based on data using a methylation chip in the GEO database. Transl Cancer Res 2021; 10:1484-1491. [PMID: 35116473 PMCID: PMC8798202 DOI: 10.21037/tcr-21-74] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/12/2020] [Accepted: 02/26/2021] [Indexed: 12/12/2022]
Abstract
Background: Glioblastoma multiforme (GBM) is the most aggressive and malignant tumor of the central nervous system. The study was to obtain the data of immune cell infiltration based on the data of a methylation chip in the GEO, and to clarify its prognostic significance for GBM. Methods: The methylation data of glioblastoma was obtained by using the Illumina human methylation 450k BeadChip. The corrected expression was obtained by using edgeR. limma was used to correct the expression amount of the samples, and EpiDISH was used to translate the methylation expression data, so that the expression amount was transformed into the expression matrix of immune cells. The immune cells were then co-expressed, and the proportion and correlation of related immune cells was determined. The results of the cells in each of two groups were analyzed by enrichment and PCA mapping to establish the relevant differences. Results: The data of GBM patients were obtained from the methylation chip of the GEO database. Patients were divided into a long-term (SNU-LTS) (21 cases), and short-term survival group (SNU-STS) (12 cases). There were 73 genes with significant individual differences between the two groups (P<0.05). EpiDISH was used to translate the methylation expression data into the expression matrix of immune cells, which showed that the highest proportion of cells in groups were mono cells, while Gran cells and CD8T appeared in a very small number of samples. The positive correlation between mono and B cells was the strongest, while the negative correlation between mono and Gran cells was the strongest. A violin chart shows that there was no significant difference in the infiltration degree of six kinds of immune cells between the two groups. Principal component analysis (PCA) showed that there was individual difference between the two groups, but the overall consistency was high. Conclusions: Data on tumor immune cell infiltration can be obtained by using a methylation chip in the GEO database. This not only extends the application abilities of methylation chips but provides obvious individual differences. The study of tumor immune infiltrating cells may pave the way for targeted therapy in the treatment of GBM.
Affiliation(s)
- Song-Zhou Du
- Department of Neurosurgery, Jingzhou Hospital of Traditional Chinese Medicine, The Third Clinical Medical College, Yangtze University, Jingzhou, China
| | - Cheng Chen
- Department of Nuclear Medicine, Jingzhou Central Hospital, The Second Clinical Medical College, Yangtze University, Jingzhou, China
| | - Lu Qin
- Department of Thyroid Vascular Surgery, Jingzhou Central Hospital, The Second Clinical Medical College, Yangtze University, Jingzhou, China
| | - Xue-Lian Tang
- Department of Respiratory Medicine, Jingzhou Central Hospital, The Second Clinical Medical College, Yangtze University, Jingzhou, China
| |
|
20
|
Hunt GJ, Gagnon-Bartsch JA. The role of scale in the estimation of cell-type proportions. Ann Appl Stat 2021. [DOI: 10.1214/20-aoas1395] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
|
21
|
Jaakkola MK, Elo LL. Computational deconvolution to estimate cell type-specific gene expression from bulk data. NAR Genom Bioinform 2021; 3:lqaa110. [PMID: 33575652 PMCID: PMC7803005 DOI: 10.1093/nargab/lqaa110] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/20/2020] [Revised: 12/14/2020] [Accepted: 12/17/2020] [Indexed: 12/24/2022] Open
Abstract
Computational deconvolution is a time and cost-efficient approach to obtain cell type-specific information from bulk gene expression of heterogeneous tissues like blood. Deconvolution can aim to either estimate cell type proportions or abundances in samples, or estimate how strongly each present cell type expresses different genes, or both tasks simultaneously. Among the two separate goals, the estimation of cell type proportions/abundances is widely studied, but less attention has been paid on defining the cell type-specific expression profiles. Here, we address this gap by introducing a novel method Rodeo and empirically evaluating it and the other available tools from multiple perspectives utilizing diverse datasets.
Affiliation(s)
- Maria K Jaakkola
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Tykistökatu 6, FI-20520 Turku, Finland
| | - Laura L Elo
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Tykistökatu 6, FI-20520 Turku, Finland
| |
|
22
|
Chen Z, Wu A. Progress and challenge for computational quantification of tissue immune cells. Brief Bioinform 2021; 22:6065002. [PMID: 33401306 DOI: 10.1093/bib/bbaa358] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/28/2020] [Revised: 10/23/2020] [Accepted: 11/07/2020] [Indexed: 12/28/2022] Open
Abstract
Tissue immune cells have long been recognized as important regulators for the maintenance of balance in the body system. Quantification of the abundance of different immune cells will provide enhanced understanding of the correlation between immune cells and normal or abnormal situations. Currently, computational methods to predict tissue immune cell compositions from bulk transcriptomes have been largely developed. Therefore, summarizing the advantages and disadvantages is appropriate. In addition, an examination of the challenges and possible solutions for these computational models will assist the development of this field. The common hypothesis of these models is that the expression of signature genes for immune cell types might represent the proportion of immune cells that contribute to the tissue transcriptome. In general, we grouped all reported tools into three groups, including reference-free, reference-based scoring and reference-based deconvolution methods. In this review, a summary of all the currently reported computational immune cell quantification tools and their applications, limitations, and perspectives are presented. Furthermore, some critical problems are found that have limited the performance and application of these models, including inadequate immune cell type, the collinearity problem, the impact of the tissue environment on the immune cell expression level, and the deficiency of standard datasets for model validation. To address these issues, tissue specific training datasets that include all known immune cells, a hierarchical computational framework, and benchmark datasets including both tissue expression profiles and the abundances of all the immune cells are proposed to further promote the development of this field.
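The "common hypothesis" described in this abstract is usually written as a linear mixing model. The form below is a generic sketch and is not taken from the review itself. The symbols are assumptions: $y_g$ is the bulk expression of signature gene $g$, $s_{gk}$ is the reference expression of gene $g$ in immune cell type $k$, $p_k$ is the proportion of cell type $k$, and $\varepsilon_g$ is residual noise. Some tools drop the sum-to-one constraint and report relative scores instead.

```latex
y_g = \sum_{k=1}^{K} s_{gk}\, p_k + \varepsilon_g ,
\qquad p_k \ge 0 ,
\qquad \sum_{k=1}^{K} p_k = 1 .
```

Under this formulation, the collinearity problem mentioned in the abstract corresponds to columns $s_{\cdot k}$ of closely related cell types being nearly proportional, which makes the individual $p_k$ poorly identified.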
Affiliation(s)
- Ziyi Chen
- Suzhou Institute of Systems Medicine, Center for Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Jiangsu, Suzhou, China
| | - Aiping Wu
- Suzhou Institute of Systems Medicine, Center for Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Jiangsu, Suzhou, China
| |
|
23
|
Benchmarking of cell type deconvolution pipelines for transcriptomics data. Nat Commun 2020; 11:5650. [PMID: 33159064 PMCID: PMC7648640 DOI: 10.1038/s41467-020-19015-1] [Citation(s) in RCA: 185] [Impact Index Per Article: 46.3] [Received: 12/05/2019] [Accepted: 09/16/2020] [Indexed: 01/05/2023] Open
Abstract
Many computational methods have been developed to infer cell type proportions from bulk transcriptomics data. However, an evaluation of the impact of data transformation, pre-processing, marker selection, cell type composition and choice of methodology on the deconvolution results is still lacking. Using five single-cell RNA-sequencing (scRNA-seq) datasets, we generate pseudo-bulk mixtures to evaluate the combined impact of these factors. Both bulk deconvolution methodologies and those that use scRNA-seq data as reference perform best when applied to data in linear scale and the choice of normalization has a dramatic impact on some, but not all methods. Overall, methods that use scRNA-seq data have comparable performance to the best performing bulk methods whereas semi-supervised approaches show higher error values. Moreover, failure to include cell types in the reference that are present in a mixture leads to substantially worse results, regardless of the previous choices. Altogether, we evaluate the combined impact of factors affecting the deconvolution task across different datasets and propose general guidelines to maximize its performance. Inferring cell type proportions from transcriptomics data is affected by data transformation, normalization, choice of method and the markers used. Here, the authors use single-cell RNAseq datasets to evaluate the impact of these factors and propose guidelines to maximise deconvolution performance.
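The pseudo-bulk construction mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the benchmark's pipeline: the function name, the toy data and the cell-type labels are hypothetical. Raw counts are summed in linear scale, and the known cell-type fractions of the sampled cells serve as ground truth.

```python
import random

def make_pseudo_bulk(cells, labels, n_cells, rng):
    """Sum raw counts of n_cells randomly drawn cells into one mixture.

    cells  : list of per-cell count vectors (same gene order in every cell)
    labels : cell-type label of each cell
    Returns (mixture_counts, true_proportions).
    """
    picked = rng.sample(range(len(cells)), n_cells)  # draw without replacement
    n_genes = len(cells[0])
    mixture = [0] * n_genes
    type_counts = {}
    for i in picked:
        for g in range(n_genes):
            mixture[g] += cells[i][g]  # linear-scale sum, no log transform
        type_counts[labels[i]] = type_counts.get(labels[i], 0) + 1
    proportions = {t: c / n_cells for t, c in type_counts.items()}
    return mixture, proportions

# Toy reference: 3 genes, two hypothetical cell types.
toy_cells = [[10, 0, 1], [12, 1, 0], [0, 8, 2], [1, 9, 3]]
toy_labels = ["T", "T", "B", "B"]

mixture, proportions = make_pseudo_bulk(toy_cells, toy_labels, 4, random.Random(0))
```

A deconvolution method can then be scored by comparing its estimates for `mixture` against `proportions`. Leaving one cell type out of the reference while keeping it in the mixture would reproduce the missing-cell-type scenario the abstract describes.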
|
24
|
Single-cell and spatial transcriptomics enables probabilistic inference of cell type topography. Commun Biol 2020; 3:565. [PMID: 33037292 PMCID: PMC7547664 DOI: 10.1038/s42003-020-01247-y] [Citation(s) in RCA: 207] [Impact Index Per Article: 51.8] [Received: 03/15/2020] [Accepted: 08/04/2020] [Indexed: 12/21/2022] Open
Abstract
The field of spatial transcriptomics is rapidly expanding, and with it the repertoire of available technologies. However, several of the transcriptome-wide spatial assays do not operate on a single cell level, but rather produce data comprised of contributions from a – potentially heterogeneous – mixture of cells. Still, these techniques are attractive to use when examining complex tissue specimens with diverse cell populations, where complete expression profiles are required to properly capture their richness. Motivated by an interest to put gene expression into context and delineate the spatial arrangement of cell types within a tissue, we here present a model-based probabilistic method that uses single cell data to deconvolve the cell mixtures in spatial data. To illustrate the capacity of our method, we use data from different experimental platforms and spatially map cell types from the mouse brain and developmental heart, which arrange as expected. Alma Andersson et al. present a probabilistic framework that integrates single-cell and bulk spatial transcriptomics in order to spatially map cell types onto their respective tissues. They apply their method to the developing human heart and mouse brain to demonstrate the power of the technique.
|
25
|
Xu F, Ashbrook DG, Gao J, Starlard-Davenport A, Zhao W, Miller DB, O'Callaghan JP, Williams RW, Jones BC, Lu L. Genome-wide transcriptome architecture in a mouse model of Gulf War Illness. Brain Behav Immun 2020; 89:209-223. [PMID: 32574576 PMCID: PMC7787136 DOI: 10.1016/j.bbi.2020.06.018] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/17/2020] [Revised: 05/18/2020] [Accepted: 06/11/2020] [Indexed: 12/31/2022] Open
Abstract
Gulf War Illness (GWI) is thought to be a chronic neuroimmune disorder caused by in-theater exposure during the 1990-1991 Gulf War. There is a consensus that the illness is caused by exposure to insecticides and nerve agent toxicants. However, the heterogeneity in both development of disease and clinical outcomes strongly suggests a genetic contribution. Here, we modeled GWI in 30 BXD recombinant inbred mouse strains with a combined treatment of corticosterone (CORT) and diisopropyl fluorophosphate (DFP). We quantified transcriptomes from 409 prefrontal cortex samples. Compared to the untreated and DFP treated controls, the combined treatment significantly activated pathways such as cytokine-cytokine receptor interaction and TNF signaling pathway. Protein-protein interaction analysis defined 6 subnetworks for CORT + DFP, with the key regulators being Cxcl1, Il6, Ccnb1, Tnf, Agt, and Itgam. We also identified 21 differentially expressed genes having significant QTLs related to CORT + DFP, but without evidence for untreated and DFP treated controls, suggesting regions of the genome specifically involved in the response to CORT + DFP. We identified Adamts9 as a potential contributor to response to CORT + DFP and found links to symptoms of GWI. Furthermore, we observed a significant effect of CORT + DFP treatment on the relative proportion of myelinating oligodendrocytes, with a QTL on Chromosome 5. We highlight three candidates, Magi2, Sema3c, and Gnai1, based on their high expression in the brain and oligodendrocyte. In summary, our results show significant genetic effects of the CORT + DFP treatment, which mirrors gene and protein expression changes seen in GWI sufferers, providing insight into the disease and a testbed for future interventions.
Affiliation(s)
- Fuyi Xu
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - David G Ashbrook
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Jun Gao
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA; Institute of Animal Husbandry and Veterinary Science, Shanghai Academy of Agricultural Sciences, Shanghai 201106, China
| | - Athena Starlard-Davenport
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Wenyuan Zhao
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Diane B Miller
- Toxicology and Molecular Biology Branch, Health Effects Laboratory Division, Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Morgantown, WV 26505, USA
| | - James P O'Callaghan
- Molecular Neurotoxicology Laboratory, Centers for Disease Control and Prevention, National Institute for Occupational Safety and Health, Morgantown, WV 26505, USA
| | - Robert W Williams
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Byron C Jones
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA.
| | - Lu Lu
- Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA.
| |
|
26
|
Noé A, Cargill TN, Nielsen CM, Russell AJC, Barnes E. The Application of Single-Cell RNA Sequencing in Vaccinology. J Immunol Res 2020; 2020:8624963. [PMID: 32802896 PMCID: PMC7411487 DOI: 10.1155/2020/8624963] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/20/2020] [Accepted: 07/09/2020] [Indexed: 02/06/2023] Open
Abstract
Single-cell RNA sequencing allows highly detailed profiling of cellular immune responses from limited-volume samples, advancing prospects of a new era of systems immunology. The power of single-cell RNA sequencing offers various opportunities to decipher the immune response to infectious diseases and vaccines. Here, we describe the potential uses of single-cell RNA sequencing methods in prophylactic vaccine development, concentrating on infectious diseases including COVID-19. Using examples from several diseases, we review how single-cell RNA sequencing has been used to evaluate the immunological response to different vaccine platforms and regimens. By highlighting published and unpublished single-cell RNA sequencing studies relevant to vaccinology, we discuss some general considerations how the field could be enriched with the widespread adoption of this technology.
MESH Headings
- Animals
- Betacoronavirus/immunology
- COVID-19
- Cell Line
- Clinical Trials as Topic
- Coronavirus Infections/epidemiology
- Coronavirus Infections/immunology
- Coronavirus Infections/prevention & control
- Coronavirus Infections/virology
- Disease Models, Animal
- Drug Evaluation, Preclinical
- Host-Pathogen Interactions/genetics
- Host-Pathogen Interactions/immunology
- Humans
- Immunity, Cellular/genetics
- Immunity, Innate/genetics
- Immunogenicity, Vaccine
- Pandemics/prevention & control
- Pneumonia, Viral/epidemiology
- Pneumonia, Viral/immunology
- Pneumonia, Viral/prevention & control
- Pneumonia, Viral/virology
- RNA, Viral/isolation & purification
- RNA-Seq/methods
- SARS-CoV-2
- Single-Cell Analysis
- Vaccinology/methods
- Viral Vaccines/administration & dosage
- Viral Vaccines/immunology
Affiliation(s)
- Andrés Noé
- The Jenner Institute, University of Oxford, Old Road Campus Research Building, Oxford OX3 7DQ, UK
| | - Tamsin N. Cargill
- Peter Medawar Building for Pathogen Research and Oxford NIHR Biomedical Research Centre, Nuffield Department of Medicine, University of Oxford, South Parks Road, Oxford OX1 3SY, UK
- Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford OX3 9DU, UK
| | - Carolyn M. Nielsen
- The Jenner Institute, University of Oxford, Old Road Campus Research Building, Oxford OX3 7DQ, UK
| | | | - Eleanor Barnes
- Peter Medawar Building for Pathogen Research and Oxford NIHR Biomedical Research Centre, Nuffield Department of Medicine, University of Oxford, South Parks Road, Oxford OX1 3SY, UK
- Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford OX3 9DU, UK
| |
|
27
|
Abstract
Tumor immunology is undergoing a renaissance due to the recent profound clinical successes of tumor immunotherapy. These advances have coincided with an exponential growth in the development of -omics technologies. Armed with these technologies and their associated computational and modeling toolsets, systems biologists have turned their attention to tumor immunology in an effort to understand the precise nature and consequences of interactions between tumors and the immune system. Such interactions are inherently multivariate, spanning multiple time and size scales, cell types, and organ systems, rendering systems biology approaches particularly amenable to their interrogation. While in its infancy, the field of 'Cancer Systems Immunology' has already influenced our understanding of tumor immunology and immunotherapy. As the field matures, studies will move beyond descriptive characterizations toward functional investigations of the emergent behavior that govern tumor-immune responses. Thus, Cancer Systems Immunology holds incredible promise to advance our ability to fight this disease.
Affiliation(s)
| | - Edgar G Engleman
- Department of Pathology, Stanford University School of Medicine, Stanford, United States
- Division of Immunology and Rheumatology, Department of Medicine, Stanford University School of Medicine, Stanford, United States
- Stanford Cancer Institute, Stanford University, Stanford, United States
| |
|