1. Nguyen H, Nguyen H, Tran D, Draghici S, Nguyen T. Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res 2024; 52:4761-4783. [PMID: 38619038 PMCID: PMC11109966 DOI: 10.1093/nar/gkae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/01/2024] [Accepted: 04/02/2024] [Indexed: 04/16/2024]
Abstract
Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).
Affiliation(s)
- Hung Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Ha Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Duc Tran
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Sorin Draghici
  - Department of Computer Science, Wayne State University, Detroit, MI, USA
  - Advaita Bioinformatics, Ann Arbor, MI, USA
- Tin Nguyen
  - Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA

2. Garcia-Recio S, Hinoue T, Wheeler GL, Kelly BJ, Garrido-Castro AC, Pascual T, De Cubas AA, Xia Y, Felsheim BM, McClure MB, Rajkovic A, Karaesmen E, Smith MA, Fan C, Ericsson PIG, Sanders ME, Creighton CJ, Bowen J, Leraas K, Burns RT, Coppens S, Wheless A, Rezk S, Garrett AL, Parker JS, Foy KK, Shen H, Park BH, Krop I, Anders C, Gastier-Foster J, Rimawi MF, Nanda R, Lin NU, Isaacs C, Marcom PK, Storniolo AM, Couch FJ, Chandran U, Davis M, Silverstein J, Ropelewski A, Liu MC, Hilsenbeck SG, Norton L, Richardson AL, Symmans WF, Wolff AC, Davidson NE, Carey LA, Lee AV, Balko JM, Hoadley KA, Laird PW, Mardis ER, King TA, Perou CM. Multiomics in primary and metastatic breast tumors from the AURORA US network finds microenvironment and epigenetic drivers of metastasis. Nat Cancer 2023; 4:128-147. [PMID: 36585450 PMCID: PMC9886551 DOI: 10.1038/s43018-022-00491-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/05/2022] [Accepted: 11/11/2022] [Indexed: 12/31/2022]
Abstract
The AURORA US Metastasis Project was established with the goal to identify molecular features associated with metastasis. We assayed 55 females with metastatic breast cancer (51 primary cancers and 102 metastases) by RNA sequencing, tumor/germline DNA exome and low-pass whole-genome sequencing and global DNA methylation microarrays. Expression subtype changes were observed in ~30% of samples and were coincident with DNA clonality shifts, especially involving HER2. Downregulation of estrogen receptor (ER)-mediated cell-cell adhesion genes through DNA methylation mechanisms was observed in metastases. Microenvironment differences varied according to tumor subtype; the ER+/luminal subtype had lower fibroblast and endothelial content, while triple-negative breast cancer/basal metastases showed a decrease in B and T cells. In 17% of metastases, DNA hypermethylation and/or focal deletions were identified near HLA-A and were associated with reduced expression and lower immune cell infiltrates, especially in brain and liver metastases. These findings could have implications for treating individuals with metastatic breast cancer with immune- and HER2-targeting therapies.
Affiliation(s)
- Tomas Pascual
  - University of North Carolina, Chapel Hill, NC, USA
  - SOLTI Cancer Research Group, Barcelona, Spain
- Aguirre A De Cubas
  - Vanderbilt University Medical Center, Nashville, TN, USA
  - Medical University of South Carolina, Charleston, SC, USA
- Youli Xia
  - University of North Carolina, Chapel Hill, NC, USA
  - Boehringer Ingelheim, Ridgefield, CT, USA
- Marni B McClure
  - University of North Carolina, Chapel Hill, NC, USA
  - Johns Hopkins University, Baltimore, MD, USA
- Cheng Fan
  - University of North Carolina, Chapel Hill, NC, USA
- Jay Bowen
  - Nationwide Children's Hospital, Columbus, OH, USA
- Robyn T Burns
  - Translational Breast Cancer Research Consortium, Baltimore, USA
- Sara Coppens
  - Nationwide Children's Hospital, Columbus, OH, USA
- Amy Wheless
  - University of North Carolina, Chapel Hill, NC, USA
- Salma Rezk
  - University of North Carolina, Chapel Hill, NC, USA
- Hui Shen
  - Van Andel Institute, Grand Rapids, MI, USA
- Ben H Park
  - Vanderbilt University Medical Center, Nashville, TN, USA
- Ian Krop
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Nancy U Lin
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
- Uma Chandran
  - UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
- Michael Davis
  - UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
- Alexander Ropelewski
  - Pittsburgh Supercomputing Center, Carnegie Mellon University, Pittsburgh, PA, USA
- Larry Norton
  - Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Nancy E Davidson
  - Fred Hutchinson Cancer Research Center, University of Washington, Seattle, WA, USA
- Lisa A Carey
  - University of North Carolina, Chapel Hill, NC, USA
- Adrian V Lee
  - UPMC Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
- Justin M Balko
  - Vanderbilt University Medical Center, Nashville, TN, USA
- Tari A King
  - Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
  - Division of Breast Surgery, Brigham and Women's Hospital, Boston, MA, USA

3. Cai M, Yue M, Chen T, Liu J, Forno E, Lu X, Billiar T, Celedón J, McKennan C, Chen W, Wang J. Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution. Bioinformatics 2022; 38:3004-3010. [PMID: 35438146 PMCID: PMC9991889 DOI: 10.1093/bioinformatics/btac279] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/10/2022] [Revised: 03/22/2022] [Accepted: 04/13/2022] [Indexed: 01/04/2023]
Abstract
MOTIVATION: Tissue-level omics data such as transcriptomics and epigenomics are an average across diverse cell types. To extract cell-type-specific (CTS) signals, dozens of cellular deconvolution methods have been proposed to infer cell-type fractions from tissue-level data. However, these methods produce vastly different results under various real data settings. Simulation-based benchmarking studies showed no universally best deconvolution approaches. There have been attempts of ensemble methods, but they only aggregate multiple single-cell references or reference-free deconvolution methods.
RESULTS: To achieve a robust estimation of cellular fractions, we proposed EnsDeconv (Ensemble Deconvolution), which adopts CTS robust regression to synthesize the results from 11 single deconvolution methods, 10 reference datasets, 5 marker gene selection procedures, 5 data normalizations and 2 transformations. Unlike most benchmarking studies based on simulations, we compiled four large real datasets of 4937 tissue samples in total with measured cellular fractions and bulk gene expression from different tissues. Comprehensive evaluations demonstrated that EnsDeconv yields more stable, robust and accurate fractions than existing methods. We illustrated that EnsDeconv estimated cellular fractions enable various CTS downstream analyses such as differential fractions associated with clinical variables. We further extended EnsDeconv to analyze bulk DNA methylation data.
AVAILABILITY AND IMPLEMENTATION: EnsDeconv is freely available as an R-package from https://github.com/randel/EnsDeconv. The RNA microarray data from the TRAUMA study are available and can be accessed in GEO (GSE36809). The demographic and clinical phenotypes can be shared on reasonable request to the corresponding authors. The RNA-seq data from the EVAPR study cannot be shared publicly due to the privacy of individuals that participated in the clinical research in compliance with the IRB approval at the University of Pittsburgh. The RNA microarray data from the FHS study are available from dbGaP (phs000007.v32.p13). The RNA-seq data from ROS study is downloaded from AD Knowledge Portal.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Manqi Cai
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Molin Yue
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Tianmeng Chen
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Jinling Liu
  - Department of Engineering Management and Systems Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
  - Department of Biological Sciences, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Erick Forno
  - Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, USA
- Xinghua Lu
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Timothy Billiar
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Juan Celedón
  - Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, USA
- Chris McKennan
  - Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Wei Chen
  - Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, USA
- Jiebiao Wang
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA

4. Zeng J, Zhang Y, Shang Y, Mai J, Shi S, Lu M, Bu C, Zhang Z, Zhang Z, Li Y, Du Z, Xiao J. CancerSCEM: a database of single-cell expression map across various human cancers. Nucleic Acids Res 2021; 50:D1147-D1155. [PMID: 34643725 PMCID: PMC8728207 DOI: 10.1093/nar/gkab905] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 08/11/2021] [Revised: 09/15/2021] [Accepted: 09/29/2021] [Indexed: 12/11/2022]
Abstract
With the proliferating studies of human cancers by single-cell RNA sequencing technique (scRNA-seq), cellular heterogeneity, immune landscape and pathogenesis within diverse cancers have been uncovered successively. The exponential explosion of massive cancer scRNA-seq datasets in the past decade are calling for a burning demand to be integrated and processed for essential investigations in tumor microenvironment of various cancer types. To fill this gap, we developed a database of Cancer Single-cell Expression Map (CancerSCEM, https://ngdc.cncb.ac.cn/cancerscem), particularly focusing on a variety of human cancers. To date, CancerSCEM version 1.0 consists of 208 cancer samples across 28 studies and 20 human cancer types. A series of uniformly and multiscale analyses for each sample were performed, including accurate cell type annotation, functional gene expressions, cell interaction network, survival analysis, etc. Plus, we visualized CancerSCEM as a user-friendly web interface for users to browse, search, online analyze and download all the metadata as well as analytical results. More importantly and unprecedentedly, the newly-constructed comprehensive online analyzing platform in CancerSCEM integrates seven analyze functions, where investigators can interactively perform cancer scRNA-seq analyses. In all, CancerSCEM paves an informative and practical way to facilitate human cancer studies, and also provides insights into clinical therapy assessments.
Affiliation(s)
- Jingyao Zeng
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Yadong Zhang
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Yunfei Shang
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jialin Mai
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Shuo Shi
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Mingming Lu
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Congfan Bu
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Zhewen Zhang
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Zaichao Zhang
  - Department of Biology, The University of Western Ontario, London, Ontario N6A 5B7, Canada
- Yang Li
  - Beijing Tongren Eye Center, Beijing Key Laboratory of Intraocular Tumor Diagnosis and Treatment, Beijing Tongren Hospital, Capital Medical University, Beijing 100730, China
- Zhenglin Du
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Jingfa Xiao
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China