1
Tiwari A, Trivedi R, Lin SY. Tumor microenvironment: barrier or opportunity towards effective cancer therapy. J Biomed Sci 2022; 29:83. [PMID: 36253762 PMCID: PMC9575280 DOI: 10.1186/s12929-022-00866-3] [Citation(s) in RCA: 161] [Impact Index Per Article: 53.7] [Received: 04/21/2022] [Accepted: 10/01/2022] [Indexed: 12/24/2022] Open
Abstract
The tumor microenvironment (TME) is a specialized ecosystem of host components, shaped by tumor cells to support tumor development and metastasis. With the advent of 3D culture and advanced bioinformatic methodologies, it is now possible to study the TME’s individual components and their interplay at higher resolution. A deeper understanding of immune cell diversity, stromal constituents, repertoire profiling, and neoantigen prediction in TMEs has provided the opportunity to explore the spatial and temporal regulation of immune therapeutic interventions. The variation in TME composition among patients plays an important role in determining responders and non-responders to cancer immunotherapy. Reprogramming TME components may therefore offer a way to overcome the widespread problem of immunotherapeutic resistance. This review focuses on the complexity of the TME and on the future prospects of its components as potential therapeutic targets. The latter part of the review describes the sophisticated 3D models emerging as valuable means to study TME components and gives an extensive account of advanced bioinformatic tools to profile TME components and predict neoantigens. Overall, this review provides a comprehensive account of the current knowledge available to target the TME.
Affiliation(s)
- Aadhya Tiwari
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Rakesh Trivedi
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Shiaw-Yih Lin
  - Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
2
Bunis DG, Wang W, Vallvé-Juanico J, Houshdaran S, Sen S, Ben Soltane I, Kosti I, Vo KC, Irwin JC, Giudice LC, Sirota M. Whole-Tissue Deconvolution and scRNAseq Analysis Identify Altered Endometrial Cellular Compositions and Functionality Associated With Endometriosis. Front Immunol 2022; 12:788315. [PMID: 35069565 PMCID: PMC8766492 DOI: 10.3389/fimmu.2021.788315] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 10/02/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
The uterine lining (endometrium) exhibits a pro-inflammatory phenotype in women with endometriosis, resulting in pain, infertility, and poor pregnancy outcomes. The full complement of cell types contributing to this phenotype has yet to be identified, as most studies have focused on bulk tissue or select cell populations. Herein, through integrating whole-tissue deconvolution and single-cell RNAseq, we comprehensively characterized immune and nonimmune cell types in the endometrium of women with or without disease and their dynamic changes across the menstrual cycle. We designed metrics to evaluate specificity of deconvolution signatures that resulted in single-cell identification of 13 novel signatures for immune cell subtypes in healthy endometrium. Guided by statistical metrics, we identified contributions of endometrial epithelial, endothelial, plasmacytoid dendritic cells, classical dendritic cells, monocytes, macrophages, and granulocytes to the endometrial pro-inflammatory phenotype, underscoring roles for nonimmune as well as immune cells to the dysfunctionality of this tissue.
Affiliation(s)
- Daniel G. Bunis
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
- Wanxin Wang
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Júlia Vallvé-Juanico
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Sahar Houshdaran
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Sushmita Sen
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Isam Ben Soltane
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
- Idit Kosti
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
- Kim Chi Vo
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Juan C. Irwin
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Linda C. Giudice
  - Center for Reproductive Sciences, University of California, San Francisco, San Francisco, CA, United States
- Marina Sirota
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, United States
  - Department of Pediatrics, Division of Neonatology, University of California, San Francisco, San Francisco, CA, United States
3
Cui X, Qin F, Yu X, Xiao F, Cai G. SCISSOR™: a single-cell inferred site-specific omics resource for tumor microenvironment association study. NAR Cancer 2021; 3:zcab037. [PMID: 34514416 PMCID: PMC8428296 DOI: 10.1093/narcan/zcab037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Revised: 08/24/2021] [Accepted: 08/31/2021] [Indexed: 12/04/2022] Open
Abstract
Tumor tissues are heterogeneous with different cell types in tumor microenvironment, which play an important role in tumorigenesis and tumor progression. Several computational algorithms and tools have been developed to infer the cell composition from bulk transcriptome profiles. However, they ignore the tissue specificity and thus a new resource for tissue-specific cell transcriptomic reference is needed for inferring cell composition in tumor microenvironment and exploring their association with clinical outcomes and tumor omics. In this study, we developed SCISSOR™ (https://thecailab.com/scissor/), an online open resource to fulfill that demand by integrating five orthogonal omics data of >6031 large-scale bulk samples, patient clinical outcomes and 451 917 high-granularity tissue-specific single-cell transcriptomic profiles of 16 cancer types. SCISSOR™ provides five major analysis modules that enable flexible modeling with adjustable parameters and dynamic visualization approaches. SCISSOR™ is valuable as a new resource for promoting tumor heterogeneity and tumor–tumor microenvironment cell interaction research, by delineating cells in the tissue-specific tumor microenvironment and characterizing their associations with tumor omics and clinical outcomes.
Affiliation(s)
- Xiang Cui
  - Department of Environmental Health Sciences, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Fei Qin
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Xuanxuan Yu
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Feifei Xiao
  - Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
- Guoshuai Cai
  - Department of Environmental Health Sciences, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
4
Najafi A, Ilchi S, Saberi AH, Motahari SA, Khalaj BH, Rabiee HR. On statistical learning of simplices: Unmixing problem revisited. Ann Stat 2021. [DOI: 10.1214/20-aos2016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Amir Najafi
  - Data Analytics Lab (DAL), Computer Engineering Department, Sharif University of Technology
- Saeed Ilchi
  - Data Analytics Lab (DAL), Computer Engineering Department, Sharif University of Technology
- Babak H. Khalaj
  - Electrical Engineering Department, Sharif University of Technology
- Hamid R. Rabiee
  - Data science and Machine learning Lab (DML), Computer Engineering Department, Sharif University of Technology
5
Data-driven detection of subtype-specific differentially expressed genes. Sci Rep 2021; 11:332. [PMID: 33432005 PMCID: PMC7801594 DOI: 10.1038/s41598-020-79704-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/17/2020] [Accepted: 12/11/2020] [Indexed: 11/08/2022] Open
Abstract
Among multiple subtypes of tissue or cell, subtype-specific differentially-expressed genes (SDEGs) are defined as being most-upregulated in only one subtype but not in any other. Detecting SDEGs plays a critical role in the molecular characterization and deconvolution of multicellular complex tissues. Classic differential analysis assumes a null hypothesis whose test statistic is not subtype-specific, thus can produce a high false positive rate and/or lower detection power. Here we first introduce a One-Versus-Everyone Fold Change (OVE-FC) test for detecting SDEGs. We then propose a scaled test statistic (OVE-sFC) for assessing the statistical significance of SDEGs that applies a mixture null distribution model and a tailored permutation test. The OVE-FC/sFC test was validated on both type 1 error rate and detection power using extensive simulation data sets generated from real gene expression profiles of purified subtype samples. The OVE-FC/sFC test was then applied to two benchmark gene expression data sets of purified subtype samples and detected many known or previously unknown SDEGs. Subsequent supervised deconvolution results on synthesized bulk expression data, obtained using the SDEGs detected from the independent purified expression data by the OVE-FC/sFC test, showed superior performance in deconvolution accuracy when compared with popular peer methods.
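As a rough illustration of the one-versus-everyone idea described in this abstract, the Python sketch below computes, for each gene, the ratio between its mean expression in the top subtype and its mean in the runner-up subtype. The function name `ove_fold_change`, the use of plain arithmetic means, and the toy data are assumptions made here for illustration; the scaled OVE-sFC statistic, the mixture null model, and the permutation test from the paper are not reproduced.

```python
import numpy as np

def ove_fold_change(expr, labels):
    """Illustrative one-versus-everyone fold change (hypothetical helper).

    expr   : genes x samples array of strictly positive expression values
    labels : subtype label for each sample (length = number of samples)
    Returns the top subtype per gene and the ratio of its mean to the
    highest mean among all other subtypes.
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(labels)
    subtypes = np.unique(labels)
    # Mean expression of every gene within every subtype (genes x subtypes).
    means = np.stack(
        [expr[:, labels == s].mean(axis=1) for s in subtypes], axis=1
    )
    order = np.argsort(means, axis=1)
    top, runner_up = order[:, -1], order[:, -2]
    rows = np.arange(means.shape[0])
    # Comparing against the runner-up is the same as taking the minimum
    # fold change of the top subtype against every other subtype.
    fold_change = means[rows, top] / means[rows, runner_up]
    return subtypes[top], fold_change

# Toy data (assumed): two genes, three subtypes with two samples each.
expr = np.array([
    [10.0, 10.0, 1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 1.0, 1.0, 8.0, 8.0],
])
labels = ["A", "A", "B", "B", "C", "C"]
top_subtype, fold_change = ove_fold_change(expr, labels)
```

A gene would be called subtype-specific only when this ratio is large; where to place that threshold, and how to assess significance, is what the paper's scaled statistic addresses.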
6
Yoosuf N, Navarro JF, Salmén F, Ståhl PL, Daub CO. Identification and transfer of spatial transcriptomics signatures for cancer diagnosis. Breast Cancer Res 2020; 22:6. [PMID: 31931856 PMCID: PMC6958738 DOI: 10.1186/s13058-019-1242-9] [Citation(s) in RCA: 59] [Impact Index Per Article: 11.8] [Received: 08/24/2019] [Accepted: 12/27/2019] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND: Distinguishing ductal carcinoma in situ (DCIS) from invasive ductal carcinoma (IDC) regions in clinical biopsies constitutes a diagnostic challenge. Spatial transcriptomics (ST) is an in situ capturing method that allows quantification and visualization of transcriptomes in individual tissue sections. Earlier studies have shown that the transcriptomes of breast cancer samples can be studied with spatial resolution in individual tissue sections. Supervised machine learning methods have previously been used in clinical studies to predict clinical outcomes for cancer types. METHODS: We used four publicly available ST breast cancer datasets from breast tissue sections annotated by pathologists as non-malignant, DCIS, or IDC. We trained and tested a machine learning method (support vector machine) based on the expert annotation as well as on automatic selection of cell types by their transcriptome profiles. RESULTS: We identified expression signatures for expert-annotated regions (non-malignant, DCIS, and IDC) and built machine learning models. Classification results for 798 expression signature transcripts showed high agreement with the expert pathologist annotation for DCIS (100%) and IDC (96%). Extending our analysis to include all 25,179 expressed transcripts resulted in an accuracy of 99% for DCIS and 98% for IDC. Further, classification based on an automatically identified expression signature covering all ST spots of tissue sections resulted in a prediction accuracy of 95% for DCIS and 91% for IDC. CONCLUSIONS: This concept study suggests that the ST signatures learned from expert-selected breast cancer tissue sections can be used to identify breast cancer regions in whole tissue sections, including regions not trained on. Furthermore, the identified expression signatures can classify cancer regions in tissue sections not used for training with high accuracy. Expert-generated, and even automatically generated, cancer signatures from ST data might be able to classify breast cancer regions and provide clinical decision support for pathologists in the future.
Affiliation(s)
- Niyaz Yoosuf
  - Department of Biosciences and Nutrition, Karolinska Institutet, 141 83, Huddinge, Sweden
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- José Fernández Navarro
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Fredrik Salmén
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
  - Hubrecht Institute-KNAW (Royal Netherlands Academy of Arts and Sciences) and University Medical Center Utrecht, Cancer Genomics Netherlands, Utrecht, the Netherlands
- Patrik L Ståhl
  - Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden
- Carsten O Daub
  - Department of Biosciences and Nutrition, Karolinska Institutet, 141 83, Huddinge, Sweden
7
Sun X, Sun S, Yang S. An Efficient and Flexible Method for Deconvoluting Bulk RNA-Seq Data with Single-Cell RNA-Seq Data. Cells 2019; 8:E1161. [PMID: 31569701 PMCID: PMC6830085 DOI: 10.3390/cells8101161] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 08/24/2019] [Revised: 09/23/2019] [Accepted: 09/26/2019] [Indexed: 12/25/2022] Open
Abstract
Estimating cell type compositions in complex diseases is an important step in investigating cellular heterogeneity, both for understanding disease etiology and for potentially facilitating early disease diagnosis and prevention. Here, we developed a computational statistical method, referred to as Multi-Omics Matrix Factorization (MOMF), to estimate the cell-type compositions of bulk RNA sequencing (RNA-seq) data by leveraging cell type-specific gene expression levels from single-cell RNA sequencing (scRNA-seq) data. MOMF not only directly models the count nature of gene expression data, but also effectively accounts for the uncertainty of cell type-specific mean gene expression levels. We demonstrate the benefits of MOMF through three real data applications, i.e., glioblastoma (GBM), colorectal cancer (CRC) and type II diabetes (T2D) studies. MOMF is able to accurately estimate disease-related cell type proportions, i.e., oligodendrocyte progenitor cells and macrophage cells, which are strongly associated with the survival of GBM and CRC, respectively.
Affiliation(s)
- Xifang Sun
  - Department of Mathematics, School of Science, Xi'an Shiyou University, 710065 Xi'an, China
- Shiquan Sun
  - School of Computer Science, Northwestern Polytechnical University, 710072 Xi'an, China
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
- Sheng Yang
  - Department of Biostatistics, School of Public Health, Nanjing Medical University, 211166 Nanjing, China
8
Avila Cobos F, Vandesompele J, Mestdagh P, De Preter K. Computational deconvolution of transcriptomics data from mixed cell populations. Bioinformatics 2019; 34:1969-1979. [PMID: 29351586 DOI: 10.1093/bioinformatics/bty019] [Citation(s) in RCA: 137] [Impact Index Per Article: 22.8] [Received: 08/18/2017] [Accepted: 01/10/2018] [Indexed: 12/22/2022] Open
Abstract
Summary: Gene expression analyses of bulk tissues often ignore cell type composition as an important confounding factor, resulting in a loss of signal from lowly abundant cell types. In this review, we highlight the importance and value of computational deconvolution methods to infer the abundance of different cell types and/or cell type-specific expression profiles in heterogeneous samples without performing physical cell sorting. We also explain the various deconvolution scenarios, the mathematical approaches used to solve them and the effect of data processing and different confounding factors on the accuracy of the deconvolution results. Contact: katleen.depreter@ugent.be. Supplementary information: Supplementary data are available at Bioinformatics online.
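One of the simplest deconvolution scenarios covered by reviews of this kind treats a bulk profile as a non-negative mixture of known cell type signatures. The Python sketch below shows that setup under stated assumptions: the signature matrix and bulk profile are toy values invented here, `estimate_proportions` is a hypothetical helper name, and non-negative least squares (`scipy.optimize.nnls`) stands in for whichever solver a given published method actually uses.

```python
import numpy as np
from scipy.optimize import nnls

def estimate_proportions(signature, bulk):
    """Estimate cell type proportions from one bulk profile (illustrative).

    signature : genes x cell_types matrix of reference expression
    bulk      : vector of bulk expression values, one per gene
    """
    # Solve min ||signature @ x - bulk|| subject to x >= 0.
    coef, _residual_norm = nnls(signature, bulk)
    total = coef.sum()
    # Rescale so the estimates sum to one and read as proportions.
    return coef / total if total > 0 else coef

# Toy reference (assumed): four genes, two cell types.
signature = np.array([
    [10.0, 1.0],
    [2.0, 8.0],
    [5.0, 5.0],
    [0.5, 6.0],
])
true_props = np.array([0.3, 0.7])
bulk = signature @ true_props  # noiseless synthetic mixture
estimated = estimate_proportions(signature, bulk)
```

With noiseless synthetic input the mixing proportions are recovered; real data add noise, platform effects and missing cell types, which are the confounders the abstract refers to.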
Affiliation(s)
- Francisco Avila Cobos
  - Center for Medical Genetics Ghent (CMGG), Ghent University, 9000 Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
  - Bioinformatics Institute Ghent from Nucleotides to Networks (BIG N2N), 9000 Ghent, Belgium
- Jo Vandesompele
  - Center for Medical Genetics Ghent (CMGG), Ghent University, 9000 Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
  - Bioinformatics Institute Ghent from Nucleotides to Networks (BIG N2N), 9000 Ghent, Belgium
- Pieter Mestdagh
  - Center for Medical Genetics Ghent (CMGG), Ghent University, 9000 Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
  - Bioinformatics Institute Ghent from Nucleotides to Networks (BIG N2N), 9000 Ghent, Belgium
- Katleen De Preter
  - Center for Medical Genetics Ghent (CMGG), Ghent University, 9000 Ghent, Belgium
  - Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
  - Bioinformatics Institute Ghent from Nucleotides to Networks (BIG N2N), 9000 Ghent, Belgium
9
Domanskyi S, Szedlak A, Hawkins NT, Wang J, Paternostro G, Piermarocchi C. Polled Digital Cell Sorter (p-DCS): Automatic identification of hematological cell types from single cell RNA-sequencing clusters. BMC Bioinformatics 2019; 20:369. [PMID: 31262249 PMCID: PMC6604348 DOI: 10.1186/s12859-019-2951-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 01/14/2019] [Accepted: 06/13/2019] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Single cell RNA sequencing (scRNA-seq) brings unprecedented opportunities for mapping the heterogeneity of complex cellular environments such as bone marrow, and provides insight into many cellular processes. Single cell RNA-seq has a far larger fraction of missing data reported as zeros (dropouts) than traditional bulk RNA-seq, and unsupervised clustering combined with Principal Component Analysis (PCA) can be used to overcome this limitation. After clustering, however, one has to interpret the average expression of markers on each cluster to identify the corresponding cell types, and this is normally done by hand by an expert curator. RESULTS We present a computational tool for processing single cell RNA-seq data that uses a voting algorithm to automatically identify cells based on approval votes received by known molecular markers. Using a stochastic procedure that accounts for imbalances in the number of known molecular signatures for different cell types, the method computes the statistical significance of the final approval score and automatically assigns a cell type to clusters without an expert curator. We demonstrate the utility of the tool in the analysis of eight samples of bone marrow from the Human Cell Atlas. The tool provides a systematic identification of cell types in bone marrow based on a list of markers of immune cell types, and incorporates a suite of visualization tools that can be overlaid on a t-SNE representation. The software is freely available as a Python package at https://github.com/sdomanskyi/DigitalCellSorter . CONCLUSIONS This methodology assures that extensive marker to cell type matching information is taken into account in a systematic way when assigning cell clusters to cell types. Moreover, the method allows for a high throughput processing of multiple scRNA-seq datasets, since it does not involve an expert curator, and it can be applied recursively to obtain cell sub-types. 
The software is designed to allow the user to substitute the marker to cell type matching information and apply the methodology to different cellular environments.
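The marker-voting idea in this abstract can be pictured with a small Python sketch. Everything below is an assumption for illustration: the function `assign_cell_types`, the marker table, and the toy clusters are invented here and are not the DigitalCellSorter API. The paper's stochastic significance procedure is also replaced by a plain normalization of votes by the number of known markers per cell type.

```python
def assign_cell_types(cluster_markers, marker_table):
    """Assign a cell type to each cluster by approval voting (illustrative).

    cluster_markers : dict mapping cluster id -> set of markers expressed
    marker_table    : dict mapping cell type -> set of known markers
    """
    assignments = {}
    for cluster, expressed in cluster_markers.items():
        scores = {}
        for cell_type, known in marker_table.items():
            votes = len(expressed & known)
            # Divide by the marker count so cell types with long marker
            # lists do not win simply by having more chances to match.
            scores[cell_type] = votes / len(known)
        best = max(scores, key=scores.get)
        # A cluster with no approving marker stays unassigned.
        assignments[cluster] = best if scores[best] > 0 else "Unassigned"
    return assignments

# Hypothetical marker table and clusters.
marker_table = {
    "T cell": {"CD3E", "CD4", "CD8A"},
    "B cell": {"CD19", "MS4A1", "CD79A"},
}
cluster_markers = {
    0: {"CD3E", "CD4"},
    1: {"CD19", "MS4A1", "CD79A"},
    2: {"XYZ"},
}
result = assign_cell_types(cluster_markers, marker_table)
```

Swapping in a different `marker_table` is the analogue of substituting the marker-to-cell-type information that the abstract describes as user-replaceable.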
Affiliation(s)
- Sergii Domanskyi
  - Department of Physics and Astronomy, Michigan State University, East Lansing, MI, 48824, USA
- Anthony Szedlak
  - Department of Physics and Astronomy, Michigan State University, East Lansing, MI, 48824, USA
- Nathaniel T Hawkins
  - Department of Physics and Astronomy, Michigan State University, East Lansing, MI, 48824, USA
- Carlo Piermarocchi
  - Department of Physics and Astronomy, Michigan State University, East Lansing, MI, 48824, USA
10
Clarke R, Tyson JJ, Tan M, Baumann WT, Jin L, Xuan J, Wang Y. Systems biology: perspectives on multiscale modeling in research on endocrine-related cancers. Endocr Relat Cancer 2019; 26:R345-R368. [PMID: 30965282 PMCID: PMC7045974 DOI: 10.1530/erc-18-0309] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/25/2019] [Accepted: 04/08/2019] [Indexed: 12/12/2022]
Abstract
Drawing on concepts from experimental biology, computer science, informatics, mathematics and statistics, systems biologists integrate data across diverse platforms and scales of time and space to create computational and mathematical models of the integrative, holistic functions of living systems. Endocrine-related cancers are well suited to study from a systems perspective because of the signaling complexities arising from the roles of growth factors, hormones and their receptors as critical regulators of cancer cell biology and from the interactions among cancer cells, normal cells and signaling molecules in the tumor microenvironment. Moreover, growth factors, hormones and their receptors are often effective targets for therapeutic intervention, such as estrogen biosynthesis, estrogen receptors or HER2 in breast cancer and androgen receptors in prostate cancer. Given the complexity underlying the molecular control networks in these cancers, a simple, intuitive understanding of how endocrine-related cancers respond to therapeutic protocols has proved incomplete and unsatisfactory. Systems biology offers an alternative paradigm for understanding these cancers and their treatment. To correctly interpret the results of systems-based studies requires some knowledge of how in silico models are built, and how they are used to describe a system and to predict the effects of perturbations on system function. In this review, we provide a general perspective on the field of cancer systems biology, and we explore some of the advantages, limitations and pitfalls associated with using predictive multiscale modeling to study endocrine-related cancers.
Affiliation(s)
- Robert Clarke
  - Department of Oncology, Georgetown University Medical Center, Washington, District of Columbia, USA
- John J Tyson
  - Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
- Ming Tan
  - Department of Biostatistics, Bioinformatics & Biomathematics, Georgetown University Medical Center, Washington, District of Columbia, USA
- William T Baumann
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA
- Lu Jin
  - Department of Oncology, Georgetown University Medical Center, Washington, District of Columbia, USA
- Jianhua Xuan
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, Virginia, USA
- Yue Wang
  - Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, Virginia, USA
11
Newman AM, Steen CB, Liu CL, Gentles AJ, Chaudhuri AA, Scherer F, Khodadoust MS, Esfahani MS, Luca BA, Steiner D, Diehn M, Alizadeh AA. Determining cell type abundance and expression from bulk tissues with digital cytometry. Nat Biotechnol 2019; 37:773-782. [PMID: 31061481 PMCID: PMC6610714 DOI: 10.1038/s41587-019-0114-2] [Citation(s) in RCA: 2493] [Impact Index Per Article: 415.5] [Received: 12/15/2017] [Accepted: 03/26/2019] [Indexed: 02/07/2023]
Abstract
Single-cell RNA-sequencing has emerged as a powerful technique for characterizing cellular heterogeneity, but it is currently impractical on large sample cohorts and cannot be applied to fixed specimens collected as part of routine clinical care. We previously developed an approach for digital cytometry, called CIBERSORT, that enables estimation of cell type abundances from bulk tissue transcriptomes. We now introduce CIBERSORTx, a machine learning method that extends this framework to infer cell-type-specific gene expression profiles without physical cell isolation. By minimizing platform-specific variation, CIBERSORTx also allows the use of single-cell RNA-sequencing data for large-scale tissue dissection. We evaluated the utility of CIBERSORTx in multiple tumor types, including melanoma, where single-cell reference profiles were used to dissect bulk clinical specimens, revealing cell-type-specific phenotypic states linked to distinct driver mutations and response to immune checkpoint blockade. We anticipate that digital cytometry will augment single-cell profiling efforts, enabling cost-effective, high-throughput tissue characterization without the need for antibodies, disaggregation or viable cells.
Affiliation(s)
- Aaron M Newman
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
- Chloé B Steen
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Department of Informatics, University of Oslo, Oslo, Norway
- Chih Long Liu
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Andrew J Gentles
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Center for Cancer Systems Biology, Stanford University, Stanford, CA, USA
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, USA
- Aadel A Chaudhuri
  - Department of Radiation Oncology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Florian Scherer
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Division of Hematology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Michael S Khodadoust
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Mohammad S Esfahani
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Center for Cancer Systems Biology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Bogdan A Luca
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, USA
- David Steiner
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Maximilian Diehn
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA
  - Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
- Ash A Alizadeh
  - Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA
  - Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Center for Cancer Systems Biology, Stanford University, Stanford, CA, USA
  - Stanford Cancer Institute, Stanford University, Stanford, CA, USA
  - Division of Hematology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA
12
Klopfenstein Q, Truntzer C, Vincent J, Ghiringhelli F. Cell lines and immune classification of glioblastoma define patient's prognosis. Br J Cancer 2019; 120:806-814. [PMID: 30899088 PMCID: PMC6474266 DOI: 10.1038/s41416-019-0404-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/25/2018] [Revised: 01/11/2019] [Accepted: 01/28/2019] [Indexed: 12/26/2022] Open
Abstract
Background: Prognostic markers for glioblastoma are lacking. Both intrinsic tumour characteristics and the microenvironment could influence cancer prognosis. The aim of our study was to generate a classification of pure glioblastoma cell lines and an immune classification in order to decipher the respective roles of glioblastoma cells and the microenvironment in prognosis. Methods: We worked on two large cohorts of patients with glioblastoma (TCGA, n = 481 and Rembrandt, n = 180) for which clinical data, transcriptomic profiles and outcome were recorded. Transcriptomic profiles of 129 pure glioblastoma cell lines were clustered to generate a glioblastoma cell line classification. The presence of glioblastoma cell line subtypes and immune cells was determined using deconvolution. Results: The glioblastoma cell line classification defined three new molecular groups, called oncogenic, metabolic and neuronal communication enriched. Neuronal communication-enriched tumours were associated with poor prognosis in both cohorts. Immune cell infiltrate was more frequent in the mesenchymal subgroup of the classical classification and in metabolic-enriched tumours. A combination of age, glioblastoma cell line classification and immune classification could be used to determine patient outcome in both cohorts. Conclusions: Our study shows that glioblastoma-bearing patients can be classified based on their age, glioblastoma cell line classification and immune classification. Combining this information improves the capacity to assess prognosis.
Affiliation(s)
- Quentin Klopfenstein
- Research Platform in Biological Oncology, Dijon, France; GIMI Genetic and Immunology Medical Institute, Dijon, France
| | - Caroline Truntzer
- Research Platform in Biological Oncology, Dijon, France; GIMI Genetic and Immunology Medical Institute, Dijon, France
| | - Julie Vincent
- Department of Medical Oncology, Centre GF Leclerc, Dijon, France
| | - Francois Ghiringhelli
- Research Platform in Biological Oncology, Dijon, France; GIMI Genetic and Immunology Medical Institute, Dijon, France; Department of Medical Oncology, Centre GF Leclerc, Dijon, France; INSERM, UMR1231, Dijon, France.
|
13
|
Abdolhosseini F, Azarkhalili B, Maazallahi A, Kamal A, Motahari SA, Sharifi-Zarchi A, Chitsaz H. Cell Identity Codes: Understanding Cell Identity from Gene Expression Profiles using Deep Neural Networks. Sci Rep 2019; 9:2342. [PMID: 30787315 PMCID: PMC6382891 DOI: 10.1038/s41598-019-38798-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 09/05/2017] [Accepted: 01/10/2019] [Indexed: 01/11/2023] Open
Abstract
Understanding cell identity is an important task in many biomedical areas. Expression patterns of specific marker genes have been used to characterize some limited cell types, but exclusive markers are not available for many cell types. A second approach is to use machine learning to discriminate cell types based on the whole gene expression profiles (GEPs). The accuracies of simple classification algorithms such as linear discriminators or support vector machines are limited due to the complexity of biological systems. We used deep neural networks to analyze 1040 GEPs from 16 different human tissues and cell types. After comparing different architectures, we identified a specific structure of deep autoencoders that can encode a GEP into a vector of 30 numeric values, which we call the cell identity code (CIC). The original GEP can be reproduced from the CIC with an accuracy comparable to technical replicates of the same experiment. Although we use an unsupervised approach to train the autoencoder, we show different values of the CIC are connected to different biological aspects of the cell, such as different pathways or biological processes. This network can use CIC to reproduce the GEP of the cell types it has never seen during the training. It also can resist some noise in the measurement of the GEP. Furthermore, we introduce classifier autoencoder, an architecture that can accurately identify cell type based on the GEP or the CIC.
Affiliation(s)
- Farzad Abdolhosseini
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Abbas Maazallahi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Aryan Kamal
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Ali Sharifi-Zarchi
- Department of Computer Engineering, Sharif University of Technology, Tehran, Iran.
| | - Hamidreza Chitsaz
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA.
|
14
|
Toker L, Mancarci BO, Tripathy S, Pavlidis P. Transcriptomic Evidence for Alterations in Astrocytes and Parvalbumin Interneurons in Subjects With Bipolar Disorder and Schizophrenia. Biol Psychiatry 2018; 84:787-796. [PMID: 30177255 PMCID: PMC6226343 DOI: 10.1016/j.biopsych.2018.07.010] [Citation(s) in RCA: 72] [Impact Index Per Article: 10.3] [Received: 11/09/2017] [Revised: 07/05/2018] [Accepted: 07/06/2018] [Indexed: 11/26/2022]
Abstract
Background: High-throughput expression analyses of postmortem brain tissue have been widely used to study bipolar disorder and schizophrenia. However, despite the extensive efforts, no consensus has emerged as to the functional interpretation of the findings. We hypothesized that incorporating information on cell type-specific expression would provide new insights. Methods: We reanalyzed 15 publicly available bulk tissue expression datasets on schizophrenia and bipolar disorder, representing various brain regions from eight different cohorts of subjects (unique subjects: 332 control, 129 bipolar disorder, 341 schizophrenia). We studied changes in the expression profiles of cell type marker genes and evaluated whether these expression profiles could serve as surrogates for relative abundance of their corresponding cells. Results: In both bipolar disorder and schizophrenia, we consistently observed an increase in the expression profiles of cortical astrocytes and a decrease in the expression profiles of fast-spiking parvalbumin interneurons. No changes in astrocyte expression profiles were observed in subcortical regions. Furthermore, we found that many of the genes previously identified as differentially expressed in schizophrenia are highly correlated with the expression profiles of astrocytes or fast-spiking parvalbumin interneurons. Conclusions: Our results indicate convergence of transcriptome studies of schizophrenia and bipolar disorder on changes in cortical astrocytes and fast-spiking parvalbumin interneurons, providing a unified interpretation of numerous studies. We suggest that these changes can be attributed to alterations in the relative abundance of the cells and are important for understanding the pathophysiology of the disorders.
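As an illustration of how a marker-gene expression profile can act as a surrogate for relative cell abundance, the following minimal sketch scores each sample by the mean z-score of a set of marker genes. The summary statistic, the function name `marker_profile`, the gene names and the toy values are all assumptions made for this example; the abstract does not specify how the authors summarised their marker genes, so this should not be read as their procedure.

```python
from statistics import mean, pstdev


def marker_profile(expression, markers):
    """Return one score per sample: the mean z-score of the marker genes.

    expression: dict mapping gene name -> list of per-sample expression values.
    markers: gene names assumed (for this sketch) to be specific to one cell type.
    """
    z_rows = []
    for gene in markers:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a constant gene carries no information about abundance
        z_rows.append([(v - mu) / sd for v in values])
    # Average the z-scores across marker genes within each sample.
    return [mean(column) for column in zip(*z_rows)]


# Toy data: three samples, two illustrative astrocyte markers (hypothetical values).
expression = {"GFAP": [1.0, 2.0, 3.0], "AQP4": [10.0, 20.0, 30.0]}
scores = marker_profile(expression, ["GFAP", "AQP4"])
```

A higher score is read here as a relatively higher abundance of the corresponding cell type in that sample, under the assumption that the markers are truly cell type-specific.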
Affiliation(s)
- Lilah Toker
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada; Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Burak Ogan Mancarci
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada; Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada; Graduate Program in Bioinformatics, University of British Columbia, Vancouver, British Columbia, Canada
| | - Shreejoy Tripathy
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada; Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Paul Pavlidis
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada; Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada.
|
15
|
BayesCCE: a Bayesian framework for estimating cell-type composition from DNA methylation without the need for methylation reference. Genome Biol 2018; 19:141. [PMID: 30241486 PMCID: PMC6151042 DOI: 10.1186/s13059-018-1513-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 11/01/2017] [Accepted: 08/20/2018] [Indexed: 11/10/2022] Open
Abstract
We introduce a Bayesian semi-supervised method for estimating cell counts from DNA methylation by leveraging an easily obtainable prior knowledge on the cell-type composition distribution of the studied tissue. We show mathematically and empirically that alternative methods which attempt to infer cell counts without methylation reference only capture linear combinations of cell counts rather than provide one component per cell type. Our approach allows the construction of components such that each component corresponds to a single cell type, and provides a new opportunity to investigate cell compositions in genomic studies of tissues for which it was not possible before.
|
16
|
Kelley KW, Nakao-Inoue H, Molofsky AV, Oldham MC. Variation among intact tissue samples reveals the core transcriptional features of human CNS cell classes. Nat Neurosci 2018; 21:1171-1184. [PMID: 30154505 PMCID: PMC6192711 DOI: 10.1038/s41593-018-0216-z] [Citation(s) in RCA: 125] [Impact Index Per Article: 17.9] [Received: 02/14/2018] [Accepted: 07/10/2018] [Indexed: 02/08/2023]
Abstract
It is widely assumed that cells must be physically isolated to study their molecular profiles. However, intact tissue samples naturally exhibit variation in cellular composition, which drives covariation of cell-class-specific molecular features. By analyzing transcriptional covariation in 7,221 intact CNS samples from 840 neurotypical individuals, representing billions of cells, we reveal the core transcriptional identities of major CNS cell classes in humans. By modeling intact CNS transcriptomes as a function of variation in cellular composition, we identify cell-class-specific transcriptional differences in Alzheimer's disease, among brain regions, and between species. Among these, we show that PMP2 is expressed by human but not mouse astrocytes and significantly increases mouse astrocyte size upon ectopic expression in vivo, causing them to more closely resemble their human counterparts. Our work is available as an online resource (http://oldhamlab.ctec.ucsf.edu/) and provides a generalizable strategy for determining the core molecular features of cellular identity in intact biological systems.
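The covariation idea described above can be illustrated with a toy sketch: if the abundance of a cell class varies across intact samples, genes specific to that class should rise and fall together. The code below ranks genes by their Pearson correlation with a seed marker across samples. The function names, gene labels and values are hypothetical, and this is only a simplified illustration of the principle, not the analysis pipeline used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_by_covariation(expression, seed):
    """Rank all other genes by correlation with the seed gene, highest first."""
    seed_values = expression[seed]
    scored = [
        (gene, pearson(seed_values, values))
        for gene, values in expression.items()
        if gene != seed
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data: four samples; all gene labels and values are made up.
expression = {
    "seed_marker": [1.0, 2.0, 3.0, 4.0],
    "gene_a": [2.0, 4.0, 6.0, 8.5],   # tracks the seed closely
    "gene_b": [1.0, 3.0, 1.0, 3.0],   # weakly related
    "gene_c": [4.0, 3.0, 2.0, 1.0],   # moves in the opposite direction
}
ranking = rank_by_covariation(expression, "seed_marker")
```

Genes at the top of such a ranking are candidates for sharing the seed's cell class, under the assumption that composition is the main source of variation between samples.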
Affiliation(s)
- Kevin W Kelley
- Department of Neurological Surgery, University of California at San Francisco, San Francisco, CA, USA
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California at San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California at San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California at San Francisco, San Francisco, CA, USA
- Medical Scientist Training Program and Neuroscience Graduate Program, University of California at San Francisco, San Francisco, CA, USA
| | - Hiromi Nakao-Inoue
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California at San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California at San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California at San Francisco, San Francisco, CA, USA
| | - Anna V Molofsky
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California at San Francisco, San Francisco, CA, USA
- Weill Institute for Neurosciences, University of California at San Francisco, San Francisco, CA, USA
- Department of Psychiatry, University of California at San Francisco, San Francisco, CA, USA
| | - Michael C Oldham
- Department of Neurological Surgery, University of California at San Francisco, San Francisco, CA, USA.
- The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California at San Francisco, San Francisco, CA, USA.
- Weill Institute for Neurosciences, University of California at San Francisco, San Francisco, CA, USA.
|
17
|
Wang N, Chen L, Wang Y. Mathematical Modeling and Deconvolution of Molecular Heterogeneity Identifies Novel Subpopulations in Complex Tissues. Methods Mol Biol 2018; 1751:223-236. [PMID: 29508301 DOI: 10.1007/978-1-4939-7710-9_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Tissue heterogeneity is both a major confounding factor and an underexploited information source. While a handful of reports have demonstrated the potential of supervised methods to deconvolve tissue heterogeneity, these approaches require a priori information on the marker genes or composition of known subpopulations. To address the critical problem of the absence of validated marker genes for many (including novel) subpopulations, we develop a novel unsupervised deconvolution method, Convex Analysis of Mixtures (CAM), within a well-grounded mathematical framework, to dissect mixed gene expressions in heterogeneous tissue samples. To facilitate the utility of this method, we implement an R-Java CAM package that provides comprehensive analytic functions and graphic user interface (GUI).
Affiliation(s)
- Niya Wang
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA.
| | - Lulu Chen
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA
| | - Yue Wang
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA, USA
|
18
|
Newman AM, Alizadeh AA. High-throughput genomic profiling of tumor-infiltrating leukocytes. Curr Opin Immunol 2016; 41:77-84. [PMID: 27372732 DOI: 10.1016/j.coi.2016.06.006] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 05/20/2016] [Accepted: 06/13/2016] [Indexed: 12/21/2022]
Abstract
Tumors are complex ecosystems comprised of diverse cell types including malignant cells, mesenchymal cells, and tumor-infiltrating leukocytes (TILs). While TILs are well known to play important roles in many aspects of cancer biology, recent developments in immuno-oncology have spurred considerable interest in TILs, particularly in relation to their optimal engagement by emerging immunotherapies. Traditionally, the enumeration of TIL phenotypic diversity and composition in solid tumors has relied on resolving single cells by flow cytometry and immunohistochemical methods. However, advances in genome-wide technologies and computational methods are now allowing TILs to be profiled with increasingly high resolution and accuracy directly from RNA mixtures of bulk tumor samples. In this review, we highlight recent progress in the development of in silico tumor dissection methods, and illustrate examples of how these strategies can be applied to characterize TILs in human tumors to facilitate personalized cancer therapy.
Affiliation(s)
- Aaron M Newman
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA; Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA.
| | - Ash A Alizadeh
- Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, CA, USA; Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University, Stanford, CA, USA; Division of Hematology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, CA, USA.
|
19
|
Mathematical modelling of transcriptional heterogeneity identifies novel markers and subpopulations in complex tissues. Sci Rep 2016; 6:18909. [PMID: 26739359 PMCID: PMC4703969 DOI: 10.1038/srep18909] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 09/18/2015] [Accepted: 11/23/2015] [Indexed: 01/18/2023] Open
Abstract
Tissue heterogeneity is both a major confounding factor and an underexploited information source. While a handful of reports have demonstrated the potential of supervised computational methods to deconvolute tissue heterogeneity, these approaches require a priori information on the marker genes or composition of known subpopulations. To address the critical problem of the absence of validated marker genes for many (including novel) subpopulations, we describe convex analysis of mixtures (CAM), a fully unsupervised in silico method, for identifying subpopulation marker genes directly from the original mixed gene expressions in scatter space that can improve molecular analyses in many biological contexts. Validated with predesigned mixtures, CAM on the gene expression data from peripheral leukocytes, brain tissue, and yeast cell cycle, revealed novel marker genes that were otherwise undetectable using existing methods. Importantly, CAM requires no a priori information on the number, identity, or composition of the subpopulations present in mixed samples, and does not require the presence of pure subpopulations in sample space. This advantage is significant in that CAM can achieve all of its goals using only a small number of heterogeneous samples, and is more powerful to distinguish between phenotypically similar subpopulations.
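The geometric intuition behind locating marker genes in scatter space can be sketched for the simplest possible case: two mixed samples and two subpopulations. If each gene's expression is rescaled so that its values across samples sum to one, every gene becomes a point on a line segment, and genes expressed in only one subpopulation fall at the two ends. The code below is a toy illustration under those assumptions; the function name, gene labels and mixing proportions are hypothetical, and it is not the CAM package or its full algorithm, which handles more samples and subpopulations.

```python
def extreme_genes(mixed):
    """Locate the two genes at the ends of the normalised scatter segment.

    mixed: dict mapping gene name -> (expression in sample 1, expression in sample 2).
    Returns (gene at the low end, gene at the high end, all coordinates).
    """
    # Rescale each gene so its two values sum to 1; keep the first coordinate.
    coords = {
        gene: x1 / (x1 + x2)
        for gene, (x1, x2) in mixed.items()
        if x1 + x2 > 0
    }
    low = min(coords, key=coords.get)
    high = max(coords, key=coords.get)
    return low, high, coords


# Toy mixtures: subpopulation A makes up 80% of sample 1 and 20% of sample 2,
# subpopulation B makes up 30% of sample 1 and 70% of sample 2 (made-up values).
mixed = {
    "marker_A": (8.0, 2.0),   # expressed only in A: 10 * (0.8, 0.2)
    "marker_B": (3.0, 7.0),   # expressed only in B: 10 * (0.3, 0.7)
    "shared": (5.5, 4.5),     # expressed equally in both: 5 * (0.8, 0.2) + 5 * (0.3, 0.7)
}
low, high, coords = extreme_genes(mixed)
```

In this toy case the shared gene lies strictly between the two markers, which is the property that lets marker genes be read off from the mixed data without knowing them in advance.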
|
20
|
Arendt BM, Comelli EM, Ma DWL, Lou W, Teterina A, Kim T, Fung SK, Wong DKH, McGilvray I, Fischer SE, Allard JP. Altered hepatic gene expression in nonalcoholic fatty liver disease is associated with lower hepatic n-3 and n-6 polyunsaturated fatty acids. Hepatology 2015; 61:1565-78. [PMID: 25581263 DOI: 10.1002/hep.27695] [Citation(s) in RCA: 242] [Impact Index Per Article: 24.2] [Received: 06/29/2014] [Accepted: 12/31/2014] [Indexed: 12/11/2022]
Abstract
In nonalcoholic fatty liver disease, hepatic gene expression and fatty acid (FA) composition have been reported independently, but a comprehensive gene expression profiling in relation to FA composition is lacking. The aim was to assess this relationship. In a cross-sectional study, hepatic gene expression (Illumina Microarray) was first compared among 20 patients with simple steatosis (SS), 19 with nonalcoholic steatohepatitis (NASH), and 24 healthy controls. The FA composition in hepatic total lipids was compared between SS and NASH, and associations between gene expression and FAs were examined. Gene expression differed mainly between healthy controls and patients (SS and NASH), including genes related to unsaturated FA metabolism. Twenty-two genes were differentially expressed between NASH and SS; most of them correlated with disease severity and related more to cancer progression than to lipid metabolism. Biologically active long-chain polyunsaturated FAs (PUFAs; eicosapentaenoic acid + docosahexaenoic acid, arachidonic acid) in hepatic total lipids were lower in NASH than in SS. This may be related to overexpression of FADS1, FADS2, and PNPLA3. The degree and direction of correlations between PUFAs and gene expression were different among SS and NASH, which may suggest that low PUFA content in NASH modulates gene expression in a different way compared with SS or, alternatively, that gene expression influences PUFA content differently depending on disease severity (SS versus NASH). Conclusion: Well-defined subjects with either healthy liver, SS, or NASH showed distinct hepatic gene expression profiles including genes involved in unsaturated FA metabolism. In patients with NASH, hepatic PUFAs were lower and associations with gene expression were different compared to SS.
Affiliation(s)
- Bianca M Arendt
- Toronto General Hospital, University Health Network, Toronto, ON, Canada
|
21
|
Robust enumeration of cell subsets from tissue expression profiles. Nat Methods 2015; 12:453-7. [PMID: 25822800 PMCID: PMC4739640 DOI: 10.1038/nmeth.3337] [Citation(s) in RCA: 8421] [Impact Index Per Article: 842.1] [Received: 05/23/2014] [Accepted: 02/02/2015] [Indexed: 12/15/2022]
Abstract
We introduce CIBERSORT, a method for characterizing cell composition of complex tissues from their gene expression profiles. When applied to enumeration of hematopoietic subsets in RNA mixtures from fresh, frozen, and fixed tissues, including solid tumors, CIBERSORT outperformed other methods with respect to noise, unknown mixture content, and closely related cell types. CIBERSORT should enable large-scale analysis of RNA mixtures for cellular biomarkers and therapeutic targets (http://cibersort.stanford.edu).
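To make the general idea of estimating cell composition from a mixed expression profile concrete, the sketch below solves a small reference-based deconvolution problem with non-negative least squares, implemented as projected gradient descent. This is a deliberately simpler technique swapped in for illustration: it does not reproduce CIBERSORT itself, which is based on support vector regression. The function name `deconvolve`, the signature matrix and the mixture values are hypothetical.

```python
def deconvolve(signature, mixture, steps=2000):
    """Estimate cell-type fractions f >= 0 such that signature * f ~ mixture.

    signature: list of rows, one per gene, each row holding the reference
               expression of that gene in every cell type.
    mixture:   list of bulk expression values, one per gene.
    Returns fractions rescaled to sum to 1.
    """
    n_types = len(signature[0])
    # Safe step size: the sum of squared entries bounds the largest eigenvalue
    # of S^T S, so 1 / that sum cannot overshoot.
    step = 1.0 / sum(v * v for row in signature for v in row)
    fractions = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual of the current fit, one value per gene.
        residual = [
            sum(s * f for s, f in zip(row, fractions)) - m
            for row, m in zip(signature, mixture)
        ]
        # Gradient of 0.5 * ||S f - m||^2 with respect to f.
        gradient = [
            sum(row[k] * r for row, r in zip(signature, residual))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the constraint f >= 0.
        fractions = [max(0.0, f - step * g) for f, g in zip(fractions, gradient)]
    total = sum(fractions)
    return [f / total for f in fractions] if total > 0 else fractions


# Toy reference: three genes, two cell types (made-up values).
signature = [[10.0, 1.0], [1.0, 10.0], [5.0, 5.0]]
# Bulk profile generated from 70% of type 1 and 30% of type 2.
mixture = [7.3, 3.7, 5.0]
estimated = deconvolve(signature, mixture)
```

On noise-free toy data the estimate recovers the generating fractions; with real, noisy data and closely related cell types, more robust estimators are generally preferred.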
|
22
|
Li R, Zhang W, Ji S. Automated identification of cell-type-specific genes in the mouse brain by image computing of expression patterns. BMC Bioinformatics 2014; 15:209. [PMID: 24947138 PMCID: PMC4078975 DOI: 10.1186/1471-2105-15-209] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 12/16/2013] [Accepted: 05/29/2014] [Indexed: 02/07/2023] Open
Abstract
Background: Differential gene expression patterns in cells of the mammalian brain result in the morphological, connectional, and functional diversity of cells. A wide variety of studies have shown that certain genes are expressed only in specific cell-types. Analysis of cell-type-specific gene expression patterns can provide insights into the relationship between genes, connectivity, brain regions, and cell-types. However, automated methods for identifying cell-type-specific genes are lacking to date. Results: Here, we describe a set of computational methods for identifying cell-type-specific genes in the mouse brain by automated image computing of in situ hybridization (ISH) expression patterns. We applied invariant image feature descriptors to capture local gene expression information from cellular-resolution ISH images. We then built image-level representations by applying vector quantization on the image descriptors. We employed regularized learning methods for classifying genes specifically expressed in different brain cell-types. These methods can also rank image features based on their discriminative power. We used a data set of 2,872 genes from the Allen Brain Atlas in the experiments. Results showed that our methods are predictive of cell-type-specificity of genes. Our classifiers achieved AUC values of approximately 87% when the enrichment level is set to 20. In addition, we showed that the highly-ranked image features captured the relationship between cell-types. Conclusions: Overall, our results showed that automated image computing methods could potentially be used to identify cell-type-specific genes in the mouse brain.
Affiliation(s)
- Shuiwang Ji
- Department of Computer Science, Old Dominion University, 23529 Norfolk, VA, USA.
|
23
|
Shannon CP, Balshaw R, Ng RT, Wilson-McManus JE, Keown P, McMaster R, McManus BM, Landsberg D, Isbel NM, Knoll G, Tebbutt SJ. Two-stage, in silico deconvolution of the lymphocyte compartment of the peripheral whole blood transcriptome in the context of acute kidney allograft rejection. PLoS One 2014; 9:e95224. [PMID: 24733377 PMCID: PMC3986379 DOI: 10.1371/journal.pone.0095224] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 11/12/2013] [Accepted: 03/24/2014] [Indexed: 01/21/2023] Open
Abstract
Acute rejection is a major complication of solid organ transplantation that prevents the long-term assimilation of the allograft. Various populations of lymphocytes are principal mediators of this process, infiltrating graft tissues and driving cell-mediated cytotoxicity. Understanding the lymphocyte-specific biology associated with rejection is therefore critical. Measuring genome-wide changes in transcript abundance in peripheral whole blood cells can deliver a comprehensive view of the status of the immune system. The heterogeneous nature of the tissue significantly affects the sensitivity and interpretability of traditional analyses, however. Experimental separation of cell types is an obvious solution, but is often impractical and, more worrying, may affect expression, leading to spurious results. Statistical deconvolution of the cell type-specific signal is an attractive alternative, but existing approaches still present some challenges, particularly in a clinical research setting. Obtaining time-matched sample composition to biologically interesting, phenotypically homogeneous cell sub-populations is costly and adds significant complexity to study design. We used a two-stage, in silico deconvolution approach that first predicts sample composition to biologically meaningful and homogeneous leukocyte sub-populations, and then performs cell type-specific differential expression analysis in these same sub-populations, from peripheral whole blood expression data. We applied this approach to a peripheral whole blood expression study of kidney allograft rejection. The patterns of differential composition uncovered are consistent with previous studies carried out using flow cytometry and provide a relevant biological context when interpreting cell type-specific differential expression results. We identified cell type-specific differential expression in a variety of leukocyte sub-populations at the time of rejection. The tissue-specificity of these differentially expressed probe-set lists is consistent with the originating tissue and their functional enrichment consistent with allograft rejection. Finally, we demonstrate that the strategy described here can be used to derive useful hypotheses by validating a cell type-specific ratio in an independent cohort using the nanoString nCounter assay.
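The second stage of such an approach, cell type-specific differential expression given known or predicted sample composition, can be sketched as a regression problem: the bulk expression of a gene in each sample is modelled as the composition-weighted sum of its expression in each cell type, and the per-cell-type values are compared between groups. The code below is a minimal sketch for two cell types using ordinary least squares. The function name, the proportions and the expression values are hypothetical, and the statistical procedure actually used by the authors is not described in the abstract.

```python
def cell_type_expression(proportions, bulk):
    """Least-squares estimate of one gene's expression in each of two cell types.

    proportions: list of (fraction of type 1, fraction of type 2), one per sample.
    bulk:        list of bulk expression values for the gene, one per sample.
    """
    # Entries of the 2x2 normal equations (P^T P) x = P^T y.
    a11 = sum(p[0] * p[0] for p in proportions)
    a12 = sum(p[0] * p[1] for p in proportions)
    a22 = sum(p[1] * p[1] for p in proportions)
    b1 = sum(p[0] * y for p, y in zip(proportions, bulk))
    b2 = sum(p[1] * y for p, y in zip(proportions, bulk))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("composition does not vary enough to separate the cell types")
    # Cramer's rule for the two unknowns.
    return [(b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det]


# Toy data: three samples per group with the same made-up compositions.
proportions = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.8)]
control_bulk = [8.4, 6.0, 3.6]     # generated from type-specific values (10, 2)
rejection_bulk = [9.2, 8.0, 6.8]   # generated from type-specific values (10, 6)

control = cell_type_expression(proportions, control_bulk)
rejection = cell_type_expression(proportions, rejection_bulk)
# Per-cell-type difference between groups: only type 2 changes in this toy case.
difference = [r - c for r, c in zip(rejection, control)]
```

In this toy case the bulk signal differs in every sample, yet the estimated change is confined to the second cell type, which is the kind of attribution a cell type-specific analysis is meant to provide.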
Affiliation(s)
- Casey P. Shannon
- PROOF Centre of Excellence, Vancouver, BC, Canada
- UBC James Hogg Centre for Heart Lung Innovations, Vancouver, BC, Canada
| | - Robert Balshaw
- PROOF Centre of Excellence, Vancouver, BC, Canada
- Department of Statistics, University of British Columbia, Vancouver, BC, Canada
| | - Raymond T. Ng
- PROOF Centre of Excellence, Vancouver, BC, Canada
- Department of Computer Science, University of British Columbia, Vancouver, BC, Canada
- UBC James Hogg Centre for Heart Lung Innovations, Vancouver, BC, Canada
| | - Janet E. Wilson-McManus
- PROOF Centre of Excellence, Vancouver, BC, Canada
- UBC James Hogg Centre for Heart Lung Innovations, Vancouver, BC, Canada
| | - Paul Keown
- PROOF Centre of Excellence, Vancouver, BC, Canada
- Department of Medicine, Division of Nephrology, University of British Columbia, Vancouver, BC, Canada
| | - Robert McMaster
- PROOF Centre of Excellence, Vancouver, BC, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, Canada
| | - Bruce M. McManus
- PROOF Centre of Excellence, Vancouver, BC, Canada
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, BC, Canada
- UBC James Hogg Centre for Heart Lung Innovations, Vancouver, BC, Canada
| | - David Landsberg
- Division of Nephrology, St. Paul's Hospital, and University of British Columbia, Vancouver, BC, Canada
| | - Nicole M. Isbel
- Department of Nephrology, Princess Alexandra Hospital, and University of Queensland, Brisbane, Australia
| | - Greg Knoll
- Ottawa Hospital Research Institute, Ottawa, On, Canada
| | - Scott J. Tebbutt
- PROOF Centre of Excellence, Vancouver, BC, Canada
- Department of Medicine, Division of Respiratory Medicine, University of British Columbia, Vancouver, BC, Canada
- UBC James Hogg Centre for Heart Lung Innovations, Vancouver, BC, Canada
|
24
|
Shen-Orr SS, Gaujoux R. Computational deconvolution: extracting cell type-specific information from heterogeneous samples. Curr Opin Immunol 2013; 25:571-8. [PMID: 24148234 DOI: 10.1016/j.coi.2013.09.015] [Citation(s) in RCA: 189] [Impact Index Per Article: 15.8] [Received: 07/21/2013] [Revised: 09/22/2013] [Accepted: 09/30/2013] [Indexed: 12/31/2022]
Abstract
The quantum unit of the immune system is the cell, yet analyzed samples are often heterogeneous with respect to cell subsets, which can mislead the interpretation of results. Experimentally, researchers face a difficult choice whether to profile heterogeneous samples with the ensuing confounding effects, or a priori focus on a few cell subsets of interest, potentially limiting new discoveries. An attractive alternative solution is to extract cell subset-specific information directly from heterogeneous samples via computational deconvolution techniques, thereby capturing both cell-centered and whole system level context. Such approaches are capable of unraveling novel biology, undetectable otherwise. Here we review the present state of available deconvolution techniques, their advantages and limitations, with a focus on blood expression data and immunological studies in general.
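For orientation, computational deconvolution of expression data is commonly framed as a linear mixing model. The notation below is an assumption introduced for this note and does not come from the abstract: x is the measured expression of gene g in heterogeneous sample j, s is the expression of gene g in cell subset k, f is the fraction of subset k in sample j, and the last term is noise.

```latex
x_{gj} = \sum_{k=1}^{K} s_{gk}\, f_{kj} + \varepsilon_{gj},
\qquad f_{kj} \ge 0,
\qquad \sum_{k=1}^{K} f_{kj} = 1 .
```

Under this framing, estimating the fractions f from known subset profiles s gives sample composition, while estimating s from known fractions f gives cell subset-specific expression; methods that assume neither must recover both from the mixed data alone.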
Affiliation(s)
- Shai S Shen-Orr
- Rappaport Institute of Medical Research, Technion-Israel Institute of Technology, Haifa 31096, Israel; Department of Immunology, Faculty of Medicine, Technion-Israel Institute of Technology, Haifa 31096, Israel; Faculty of Biology, Technion-Israel Institute of Technology, Haifa 31096, Israel.
|