151
Zou J, Deng F, Wang M, Zhang Z, Liu Z, Zhang X, Hua R, Chen K, Zou X, Hao J. scCODE: an R package for data-specific differentially expressed gene detection on single-cell RNA-sequencing data. Brief Bioinform 2022; 23:6590434. [PMID: 35598331 DOI: 10.1093/bib/bbac180] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 01/26/2022] [Revised: 04/06/2022] [Accepted: 04/22/2022] [Indexed: 12/13/2022] Open
Abstract
Differential expression (DE) gene detection in single-cell ribonucleic acid (RNA)-sequencing (scRNA-seq) data is a key step in understanding the biological question under investigation. Gene filtering has been suggested to improve the performance of DE methods, but its influence has not been demonstrated. Furthermore, the optimal method differs between scRNA-seq datasets, so each dataset should benefit from a data-specific DE gene detection strategy. However, existing tools do not take gene filtering into consideration, and there is a lack of metrics for evaluating the optimal method on experimental datasets. Based on two new metrics, we propose single-cell Consensus Optimization of Differentially Expressed gene detection (scCODE), an R package that automatically optimizes DE gene detection for each experimental scRNA-seq dataset.
Affiliation(s)
- Jiawei Zou
  - School of Life Sciences and Biotechnology, Shanghai Centre for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
  - Institute of Clinical Science, Zhongshan Hospital, Fudan University, Shanghai, China
- Fulan Deng
  - School of Materials Science and Engineering, Shanghai Institute of Technology, Shanghai 201418, China
- Miaochen Wang
  - Department of Oral and Maxillofacial-Head & Neck Oncology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases; Shanghai Key Laboratory of Stomatology
- Zhen Zhang
  - Department of Oral and Maxillofacial-Head & Neck Oncology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases; Shanghai Key Laboratory of Stomatology
- Zheqi Liu
  - Department of Oral and Maxillofacial-Head & Neck Oncology, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine; College of Stomatology, Shanghai Jiao Tong University; National Center for Stomatology; National Clinical Research Center for Oral Diseases; Shanghai Key Laboratory of Stomatology
- Xiaobin Zhang
  - Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
  - Department of Cardiovascular Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Rong Hua
  - Department of Thoracic Surgery, Shanghai Chest Hospital, Shanghai Jiao Tong University, Shanghai, China
- Ke Chen
  - Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, 201602, China
- Xin Zou
  - Jinshan Hospital Center for Tumor Diagnosis & Therapy, Jinshan Hospital, Fudan University, Shanghai, 201508, China
- Jie Hao
  - Institute of Clinical Science, Zhongshan Hospital, Fudan University, Shanghai, China
152
Chen S, Luo Y, Gao H, Li F, Chen Y, Li J, You R, Hao M, Bian H, Xi X, Li W, Li W, Ye M, Meng Q, Zou Z, Li C, Li H, Zhang Y, Cui Y, Wei L, Chen F, Wang X, Lv H, Hua K, Jiang R, Zhang X. hECA: The cell-centric assembly of a cell atlas. iScience 2022; 25:104318. [PMID: 35602947 PMCID: PMC9114628 DOI: 10.1016/j.isci.2022.104318] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/01/2021] [Revised: 03/18/2022] [Accepted: 04/25/2022] [Indexed: 12/04/2022] Open
Abstract
The accumulation of massive single-cell omics data provides growing resources for building biomolecular atlases of all cells of human organs or the whole body. The true assembly of a cell atlas should be cell-centric rather than file-centric. We developed a unified informatics framework for seamless cell-centric data assembly and built the human Ensemble Cell Atlas (hECA) from scattered data. hECA v1.0 assembled 1,093,299 labeled human cells from 116 published datasets, covering 38 organs and 11 systems. We invented three new methods of atlas applications based on the cell-centric assembly: “in data” cell sorting for targeted data retrieval with customizable logic expressions, “quantitative portraiture” for multi-view representations of biological entities, and customizable reference creation for generating references for automatic annotations. Case studies on agile construction of user-defined sub-atlases and “in data” investigation of CAR-T off-targets in multiple organs showed the great potential enabled by the cell-centric ensemble atlas.
- A unified informatics framework for seamless cell-centric assembly of massive single-cell data
- Built the general-purpose human Ensemble Cell Atlas (hECA) V1.0 from scattered data
- Three new methods of applications enabling “in data” cell experiments and portraiture
- Case studies of agile atlas reconstruction and target therapies side-effect discovery
Affiliation(s)
- Sijie Chen
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Yanting Luo
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Haoxiang Gao
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Fanhong Li
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Yixin Chen
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Jiaqi Li
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Renke You
  - Fuzhou Institute of Data Technology, Changle, Fuzhou 350200, China
- Minsheng Hao
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Haiyang Bian
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Xi Xi
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Wenrui Li
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Weiyu Li
  - Fuzhou Institute of Data Technology, Changle, Fuzhou 350200, China
- Mingli Ye
  - Fuzhou Institute of Data Technology, Changle, Fuzhou 350200, China
- Qiuchen Meng
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Ziheng Zou
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Chen Li
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Haochen Li
  - School of Medicine, Tsinghua University, Beijing 100084, China
- Yangyuan Zhang
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Yanfei Cui
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Lei Wei
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Fufeng Chen
  - Fuzhou Institute of Data Technology, Changle, Fuzhou 350200, China
- Xiaowo Wang
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Hairong Lv
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
  - Fuzhou Institute of Data Technology, Changle, Fuzhou 350200, China
- Kui Hua
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Rui Jiang
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Xuegong Zhang
  - MOE Key Lab of Bioinformatics, Bioinformatics Division of BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
  - School of Medicine, Tsinghua University, Beijing 100084, China
  - School of Life Sciences, Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
153
Peng L, Renauer PA, Ökten A, Fang Z, Park JJ, Zhou X, Lin Q, Dong MB, Filler R, Xiong Q, Clark P, Lin C, Wilen CB, Chen S. Variant-specific vaccination induces systems immune responses and potent in vivo protection against SARS-CoV-2. Cell Rep Med 2022; 3:100634. [PMID: 35561673 PMCID: PMC9040489 DOI: 10.1016/j.xcrm.2022.100634] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/23/2021] [Revised: 03/06/2022] [Accepted: 04/21/2022] [Indexed: 12/27/2022]
Abstract
Lipid nanoparticle (LNP)-mRNA vaccines offer protection against COVID-19; however, multiple variant lineages caused widespread breakthrough infections. Here, we generate LNP-mRNAs specifically encoding wild-type (WT), B.1.351, and B.1.617 SARS-CoV-2 spikes, and systematically study their immune responses. All three LNP-mRNAs induced potent antibody and T cell responses in animal models; however, differences in neutralization activity have been observed between variants. All three vaccines offer potent protection against in vivo challenges of authentic viruses of WA-1, Beta, and Delta variants. Single-cell transcriptomics of WT- and variant-specific LNP-mRNA-vaccinated animals reveal a systematic landscape of immune cell populations and global gene expression. Variant-specific vaccination induces a systemic increase of reactive CD8 T cells and altered gene expression programs in B and T lymphocytes. BCR-seq and TCR-seq unveil repertoire diversity and clonal expansions in vaccinated animals. These data provide assessment of efficacy and direct systems immune profiling of variant-specific LNP-mRNA vaccination in vivo.
Affiliation(s)
- Lei Peng
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA
- Paul A Renauer
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA; Molecular Cell Biology, Genetics, and Development Program, Yale University, New Haven, CT, USA
- Arya Ökten
  - Department of Immunobiology, Yale University, New Haven, CT, USA; Department of Laboratory Medicine, Yale University, New Haven, CT, USA
- Zhenhao Fang
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA
- Jonathan J Park
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA; M.D.-Ph.D. Program, Yale University, West Haven, CT, USA
- Xiaoyu Zhou
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA
- Qianqian Lin
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA
- Matthew B Dong
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA; Department of Immunobiology, Yale University, New Haven, CT, USA; M.D.-Ph.D. Program, Yale University, West Haven, CT, USA; Immunobiology Program, Yale University, New Haven, CT, USA
- Renata Filler
  - Department of Immunobiology, Yale University, New Haven, CT, USA; Department of Laboratory Medicine, Yale University, New Haven, CT, USA
- Qiancheng Xiong
  - Department of Cell Biology, Yale University, New Haven, CT, USA; Nanobiology Institute, Yale University, New Haven, CT, USA
- Paul Clark
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA
- Chenxiang Lin
  - Department of Cell Biology, Yale University, New Haven, CT, USA; Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Nanobiology Institute, Yale University, New Haven, CT, USA
- Craig B Wilen
  - Department of Immunobiology, Yale University, New Haven, CT, USA; Department of Laboratory Medicine, Yale University, New Haven, CT, USA.
- Sidi Chen
  - Department of Genetics, Yale University School of Medicine, New Haven, CT, USA; System Biology Institute, Yale University, West Haven, CT, USA; Center for Cancer Systems Biology, Yale University, West Haven, CT, USA; Molecular Cell Biology, Genetics, and Development Program, Yale University, New Haven, CT, USA; M.D.-Ph.D. Program, Yale University, West Haven, CT, USA; Immunobiology Program, Yale University, New Haven, CT, USA; Department of Cell Biology, Yale University, New Haven, CT, USA; Department of Biomedical Engineering, Yale University, New Haven, CT, USA; Nanobiology Institute, Yale University, New Haven, CT, USA; Yale Comprehensive Cancer Center, Yale University School of Medicine, New Haven, CT, USA; Yale Stem Cell Center, Yale University School of Medicine, New Haven, CT, USA; Yale Center for Biomedical Data Science, Yale University School of Medicine, New Haven, CT, USA.
154
Duren Z, Chang F, Naqing F, Xin J, Liu Q, Wong WH. Regulatory analysis of single cell multiome gene expression and chromatin accessibility data with scREG. Genome Biol 2022; 23:114. [PMID: 35578363 PMCID: PMC9109353 DOI: 10.1186/s13059-022-02682-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 09/22/2021] [Accepted: 04/29/2022] [Indexed: 12/12/2022] Open
Abstract
Technological development has enabled the profiling of gene expression and chromatin accessibility from the same cell. We develop scREG, a dimension reduction methodology for single cell multiome data based on the concept of cis-regulatory potential. This concept is further used for the construction of subpopulation-specific cis-regulatory networks. The capability of inferring a useful regulatory network is demonstrated by a two-fold increase in network inference accuracy compared to the Pearson correlation-based method and a 27-fold enrichment of GWAS variants for inflammatory bowel disease in the cis-regulatory elements. The R package scREG provides comprehensive functions for single cell multiome data analysis.
Affiliation(s)
- Zhana Duren
  - Center for Human Genetics and Department of Genetics and Biochemistry, Clemson University, Greenwood, SC, 29646, USA.
- Fengge Chang
  - Center for Human Genetics and Department of Genetics and Biochemistry, Clemson University, Greenwood, SC, 29646, USA
- Fnu Naqing
  - Center for Human Genetics and Department of Genetics and Biochemistry, Clemson University, Greenwood, SC, 29646, USA
- Jingxue Xin
  - Department of Statistics, Department of Biomedical Data Science and Bio-X Program, Stanford University, Stanford, CA, 94305, USA
- Qiao Liu
  - Department of Statistics, Department of Biomedical Data Science and Bio-X Program, Stanford University, Stanford, CA, 94305, USA
- Wing Hung Wong
  - Department of Statistics, Department of Biomedical Data Science and Bio-X Program, Stanford University, Stanford, CA, 94305, USA.
155
Krstic N, Multani K, Wishart DS, Blydt-Hansen T, Cohen Freue GV. The impact of methodological choices when developing predictive models using urinary metabolite data. Stat Med 2022; 41:3511-3526. [PMID: 35567357 DOI: 10.1002/sim.9431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2020] [Revised: 04/15/2022] [Accepted: 04/26/2022] [Indexed: 11/08/2022]
Abstract
The continuous evolution of metabolomics over the past two decades has stimulated the search for metabolic biomarkers of many diseases. Metabolomic data measured from urinary samples can provide rich information of the biological events triggered by organ rejection in pediatric kidney transplant recipients. With additional validation, metabolic markers can be used to build clinically useful diagnostic tools. However, there are many methodological steps ranging from data processing to modeling that can influence the performance of the resulting metabolomic classifiers. In this study we focus on the comparison of various classification methods that can handle the complex structure of metabolomic data, including regularized classifiers, partial least squares discriminant analysis, and nonlinear classification models. We also examine the effectiveness of a physiological normalization technique widely used in the clinical and biochemical literature but not extensively analyzed and compared in urine metabolomic studies. While the main objective of this work is to interrogate metabolomic data of pediatric kidney transplant recipients to improve the diagnosis of T cell-mediated rejection (TCMR), we also analyze three independent datasets from other disease conditions to investigate the generalizability of our findings.
Affiliation(s)
- Nikolas Krstic
  - Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
- Kevin Multani
  - Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
  - Department of Physics, Stanford University, Stanford, California, USA
- David S Wishart
  - Departments of Computing Science and Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Tom Blydt-Hansen
  - Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada
- Gabriela V Cohen Freue
  - Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
156
Nault R, Saha S, Bhattacharya S, Dodson J, Sinha S, Maiti T, Zacharewski T. Benchmarking of a Bayesian single cell RNAseq differential gene expression test for dose-response study designs. Nucleic Acids Res 2022; 50:e48. [PMID: 35061903 PMCID: PMC9071439 DOI: 10.1093/nar/gkac019] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/27/2021] [Revised: 12/15/2021] [Accepted: 01/07/2022] [Indexed: 12/04/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNAseq) for the evaluation of chemicals, drugs, and food contaminants presents the opportunity to consider cellular heterogeneity in pharmacological and toxicological responses. Current differential gene expression analysis (DGEA) methods focus primarily on two-group comparisons, not the multi-group dose-response study designs used in safety assessments. To benchmark DGEA methods for dose-response scRNAseq experiments, we propose a multiplicity-corrected Bayesian testing approach and compare it against 8 other methods, including two frequentist fit-for-purpose tests, using simulated and experimental data. Our Bayesian test method outperformed all other tests for a broad range of accuracy metrics, including control of false positive error rates. Most notably, the fit-for-purpose and standard multiple-group DGEA methods were superior to the two-group scRNAseq methods for dose-response study designs. Collectively, our benchmarking of DGEA methods demonstrates the importance of considering study design when determining the most appropriate test methods.
Affiliation(s)
- Rance Nault
  - Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI, USA
  - Institute for Integrative Toxicology, Michigan State University, East Lansing, MI 48824, USA
- Satabdi Saha
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Sudin Bhattacharya
  - Biomedical Engineering Department, Pharmacology & Toxicology, Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Jack Dodson
  - Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI, USA
- Samiran Sinha
  - Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Tapabrata Maiti
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Tim Zacharewski
  - Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI, USA
  - Institute for Integrative Toxicology, Michigan State University, East Lansing, MI 48824, USA
157
Zhu B, Li H, Zhang L, Chandra SS, Zhao H. A Markov random field model-based approach for differentially expressed gene detection from single-cell RNA-seq data. Brief Bioinform 2022; 23:6581434. [PMID: 35514182 PMCID: PMC9487630 DOI: 10.1093/bib/bbac166] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/31/2022] [Revised: 04/02/2022] [Accepted: 04/13/2022] [Indexed: 11/13/2022] Open
Abstract
The development of single-cell RNA-sequencing (scRNA-seq) technologies has offered insights into complex biological systems at single-cell resolution. In particular, these techniques facilitate the identification of genes showing cell-type-specific differential expression (DE). In this paper, we introduce MARBLES, a novel statistical model for cross-condition DE gene detection from scRNA-seq data. MARBLES employs a Markov Random Field model to borrow information across similar cell types and utilizes cell-type-specific pseudobulk counts to account for sample-level variability. Our simulation results showed that MARBLES is more powerful than existing methods at detecting DE genes, with appropriate control of the false positive rate. Applications of MARBLES to real data identified novel disease-related DE genes and biological pathways from both a single-cell lipopolysaccharide mouse dataset with 24 381 cells and 11 076 genes and a Parkinson's disease human dataset with 76 212 cells and 15 891 genes. Overall, MARBLES is a powerful tool to identify cell-type-specific DE genes across conditions from scRNA-seq data.
Affiliation(s)
- Biqing Zhu
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
- Hongyu Li
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, 06511, USA
- Le Zhang
  - Department of Neurology, School of Medicine, Yale University, New Haven, CT, 06511, USA
- Sreeganga S Chandra
  - Department of Neurology, School of Medicine, Yale University, New Haven, CT, 06511, USA
  - Department of Neuroscience, School of Medicine, Yale University, New Haven, CT, 06511, USA
- Hongyu Zhao
  - Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT, 06511, USA
  - Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, 06511, USA
  - Corresponding author. Hongyu Zhao, 300 George Street, Ste 503, New Haven, CT 06511. E-mail:
158
Zhang X, Chen Z, Bhadani R, Cao S, Lu M, Lytal N, Chen Y, An L. NISC: Neural Network-Imputation for Single-Cell RNA Sequencing and Cell Type Clustering. Front Genet 2022; 13:847112. [PMID: 35591853 PMCID: PMC9110639 DOI: 10.3389/fgene.2022.847112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) reveals the transcriptome diversity in heterogeneous cell populations as it allows researchers to study gene expression at single-cell resolution. The latest advances in scRNA-seq technology have made it possible to profile tens of thousands of individual cells simultaneously. However, the technology also increases the number of missing values, i.e., dropouts, arising from technical constraints such as amplification failure during the reverse transcription step. The resulting sparsity of scRNA-seq count data can be very high, with greater than 90% of data entries being zeros, which becomes an obstacle for clustering cell types. Current imputation methods are not robust in the case of high sparsity. In this study, we develop a Neural Network-based Imputation for scRNA-seq count data, NISC. It uses an autoencoder, coupled with a weighted loss function and regularization, to correct the dropouts in scRNA-seq count data. A systematic evaluation shows that NISC is an effective imputation approach for handling sparse scRNA-seq count data, and its performance surpasses existing imputation methods in cell type identification.
Affiliation(s)
- Xiang Zhang
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, United States
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, United States
- Zhuo Chen
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, United States
- Rahul Bhadani
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, United States
  - Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, United States
- Siyang Cao
  - Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ, United States
- Meng Lu
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, United States
- Nicholas Lytal
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, United States
  - Department of Mathematics and Statistics, California State University at Chico, Chico, CA, United States
- Yin Chen
  - College of Pharmacy, University of Arizona, Tucson, AZ, United States
- Lingling An
  - Interdisciplinary Program in Statistics and Data Science, University of Arizona, Tucson, AZ, United States
  - Department of Biosystems Engineering, University of Arizona, Tucson, AZ, United States
  - Department of Biostatistics and Epidemiology, University of Arizona, Tucson, AZ, United States
  - *Correspondence: Lingling An,
159
Spatially informed cell-type deconvolution for spatial transcriptomics. Nat Biotechnol 2022; 40:1349-1359. [PMID: 35501392 PMCID: PMC9464662 DOI: 10.1038/s41587-022-01273-7] [Citation(s) in RCA: 121] [Impact Index Per Article: 60.5] [Received: 06/09/2021] [Accepted: 03/07/2022] [Indexed: 12/16/2022]
Abstract
Many spatially resolved transcriptomic technologies do not have single-cell resolution but measure the average gene expression for each spot from a mixture of cells of potentially heterogeneous cell types. Here, we introduce a deconvolution method, conditional autoregressive deconvolution (CARD), that combines cell type–specific expression information from single-cell RNA sequencing (scRNA-seq) with correlation in cell type composition across tissue locations. Modeling spatial correlation allows us to borrow the cell-type composition information across locations, improving accuracy of deconvolution even with a mismatched scRNA-seq reference. CARD can also impute cell type compositions and gene expression levels at unmeasured tissue locations, enable the construction of a refined spatial tissue map with a resolution arbitrarily higher than that measured in the original study, and perform deconvolution without a scRNA-seq reference. Applications to four datasets including a pancreatic cancer dataset identified multiple cell types and molecular markers with distinct spatial localization that define the progression, heterogeneity, and compartmentalization of pancreatic cancer.
160
Abondio P, De Intinis C, da Silva Gonçalves Vianez Júnior JL, Pace L. Single cell multiomic approaches to disentangle T cell heterogeneity. Immunol Lett 2022; 246:37-51. [DOI: 10.1016/j.imlet.2022.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2021] [Revised: 04/16/2022] [Accepted: 04/26/2022] [Indexed: 11/29/2022]
161
Affiliation(s)
- Greg Gibson
  - School of Biological Sciences and Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, Georgia, United States of America
162
Cossard GG, Godfroy O, Nehr Z, Cruaud C, Cock JM, Lipinska AP, Coelho SM. Selection drives convergent gene expression changes during transitions to co-sexuality in haploid sexual systems. Nat Ecol Evol 2022; 6:579-589. [PMID: 35314785 PMCID: PMC9085613 DOI: 10.1038/s41559-022-01692-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/09/2021] [Accepted: 02/07/2022] [Indexed: 11/25/2022]
Abstract
Co-sexuality has evolved repeatedly from unisexual (dioicous) ancestors across a wide range of taxa. However, the molecular changes underpinning this important transition remain unknown, particularly in organisms with haploid sexual systems such as bryophytes, red algae and brown algae. Here we explore four independent events of emergence of co-sexuality from unisexual ancestors in brown algal clades to examine the nature, evolution and degree of convergence of gene expression changes that accompany the breakdown of dioicy. The amounts of male versus female phenotypic differences in dioicous species were not correlated with the extent of sex-biased gene expression, in stark contrast to what is observed in animals. Although sex-biased genes exhibited a high turnover rate during brown alga diversification, some of their predicted functions were conserved across species. Transitions to co-sexuality consistently involved adaptive gene expression shifts and rapid sequence evolution, particularly for male-biased genes. Gene expression in co-sexual species was more similar to that in females rather than males of related dioicous species, suggesting that co-sexuality may have arisen from ancestral females. Finally, extensive convergent gene expression changes, driven by selection, were associated with the transition to co-sexuality. Together, our observations provide insights on how co-sexual systems arise from ancestral, haploid UV sexual systems.
Affiliation(s)
- Guillaume G Cossard
- Sorbonne Université, UPMC Univ Paris 06, CNRS, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS, Roscoff, France
- Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Olivier Godfroy
- Sorbonne Université, UPMC Univ Paris 06, CNRS, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS, Roscoff, France
| | - Zofia Nehr
- Sorbonne Université, UPMC Univ Paris 06, CNRS, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS, Roscoff, France
| | - Corinne Cruaud
- Genoscope, Institut de Biologie François Jacob, CEA, Université Paris-Saclay, Evry, France
| | - J Mark Cock
- Sorbonne Université, UPMC Univ Paris 06, CNRS, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS, Roscoff, France
| | - Agnieszka P Lipinska
- Sorbonne Université, UPMC Univ Paris 06, CNRS, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS, Roscoff, France
- Max Planck Institute for Biology Tübingen, Tübingen, Germany
| | - Susana M Coelho
- Sorbonne Université, UPMC Univ Paris 06, CNRS, Integrative Biology of Marine Models, Station Biologique de Roscoff, CS, Roscoff, France
- Max Planck Institute for Biology Tübingen, Tübingen, Germany
|
163
|
Huuhtanen J, Bhattacharya D, Lönnberg T, Kankainen M, Kerr C, Theodoropoulos J, Rajala H, Gurnari C, Kasanen T, Braun T, Teramo A, Zambello R, Herling M, Ishida F, Kawakami T, Salmi M, Loughran T, Maciejewski JP, Lähdesmäki H, Kelkka T, Mustjoki S. Single-cell characterization of leukemic and non-leukemic immune repertoires in CD8+ T-cell large granular lymphocytic leukemia. Nat Commun 2022; 13:1981. [PMID: 35411050 PMCID: PMC9001660 DOI: 10.1038/s41467-022-29173-z] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 06/14/2021] [Accepted: 02/17/2022] [Indexed: 12/13/2022] Open
Abstract
T cell large granular lymphocytic leukemia (T-LGLL) is a rare lymphoproliferative disorder of mature, clonally expanded T cells, where somatic-activating STAT3 mutations are common. Although T-LGLL has been described as a chronic T cell response to an antigen, the function of the non-leukemic immune system in this response is largely uncharacterized. Here, by utilizing single-cell RNA and T cell receptor profiling (scRNA+TCRαβ-seq), we show that irrespective of STAT3 mutation status, T-LGLL clonotypes are more cytotoxic and exhausted than healthy reactive clonotypes. In addition, T-LGLL clonotypes show more active cell communication than reactive clones with non-leukemic immune cells via costimulatory cell-cell interactions, monocyte-secreted proinflammatory cytokines, and T-LGLL-clone-secreted IFNγ. Besides the leukemic repertoire, the non-leukemic T cell repertoire in T-LGLL is also more mature, cytotoxic, and clonally restricted than in other cancers and autoimmune disorders. Finally, 72% of the leukemic T-LGLL clonotypes share T cell receptor similarities with their non-leukemic repertoire, linking the leukemic and non-leukemic repertoires together via possible common target antigens. Our results provide a rationale to prioritize therapies that target the entire immune repertoire and not only the T-LGLL clonotype.
Affiliation(s)
- Jani Huuhtanen
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
- Department of Computer Science, Aalto University, Espoo, Finland
| | - Dipabarna Bhattacharya
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
| | - Tapio Lönnberg
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- InFlames Flagship Center, University of Turku, Turku, Finland
| | - Matti Kankainen
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
| | - Cassandra Kerr
- Translational Hematology and Oncology Department, Taussig Cancer Center, Cleveland Clinic, Cleveland, OH, USA
| | - Jason Theodoropoulos
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
- Department of Computer Science, Aalto University, Espoo, Finland
| | - Hanna Rajala
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
| | - Carmelo Gurnari
- Translational Hematology and Oncology Department, Taussig Cancer Center, Cleveland Clinic, Cleveland, OH, USA
| | - Tiina Kasanen
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
| | - Till Braun
- Department I of Internal Medicine, Center for Integrated Oncology (CIO), Aachen-Bonn-Cologne-Duesseldorf, University of Cologne (UoC), Cologne, Germany
| | - Antonella Teramo
- Department of Medicine (DIMED), Hematology and Clinical Immunology Branch, Padova University School of Medicine, Padova, Italy
- Veneto Institute of Molecular Medicine (VIMM), Padova, Italy
| | - Renato Zambello
- Department of Medicine (DIMED), Hematology and Clinical Immunology Branch, Padova University School of Medicine, Padova, Italy
- Veneto Institute of Molecular Medicine (VIMM), Padova, Italy
| | - Marco Herling
- Department I of Internal Medicine, Center for Integrated Oncology (CIO), Aachen-Bonn-Cologne-Duesseldorf, University of Cologne (UoC), Cologne, Germany
- Clinic of Hematology and Cellular Therapy, University of Leipzig, Leipzig, Germany
| | - Fumihiro Ishida
- Department of Biomedical Laboratory Sciences, Shinshu University School of Medicine, Matsumoto, Japan
- Division of Hematology, Department of Internal Medicine, Shinshu University School of Medicine, Matsumoto, Japan
| | - Toru Kawakami
- Department of Biomedical Laboratory Sciences, Shinshu University School of Medicine, Matsumoto, Japan
- Division of Hematology, Department of Internal Medicine, Shinshu University School of Medicine, Matsumoto, Japan
| | - Marko Salmi
- InFlames Flagship Center, University of Turku, Turku, Finland
- MediCity Research Laboratory and Institute of Biomedicine, University of Turku, Turku, Finland
| | - Thomas Loughran
- Division of Hematology/Oncology, Department of Medicine, UVA Cancer Center, University of Virginia, Charlottesville, VA, USA
| | - Jaroslaw P Maciejewski
- Translational Hematology and Oncology Department, Taussig Cancer Center, Cleveland Clinic, Cleveland, OH, USA
| | - Harri Lähdesmäki
- Department of Computer Science, Aalto University, Espoo, Finland
| | - Tiina Kelkka
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland
| | - Satu Mustjoki
- Hematology Research Unit Helsinki, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland.
- Translational Immunology Research Program and Department of Clinical Chemistry and Hematology, University of Helsinki, Helsinki, Finland.
- iCAN Digital Precision Medicine Flagship, Helsinki, Finland.
|
164
|
Arzalluz-Luque A, Salguero P, Tarazona S, Conesa A. acorde unravels functionally interpretable networks of isoform co-usage from single cell data. Nat Commun 2022; 13:1828. [PMID: 35383181 PMCID: PMC8983708 DOI: 10.1038/s41467-022-29497-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/12/2021] [Accepted: 03/16/2022] [Indexed: 12/13/2022] Open
Abstract
Alternative splicing (AS) is a highly-regulated post-transcriptional mechanism known to modulate isoform expression within genes and contribute to cell-type identity. However, the extent to which alternative isoforms establish co-expression networks that may be relevant in cellular function has not been explored yet. Here, we present acorde, a pipeline that successfully leverages bulk long reads and single-cell data to confidently detect alternative isoform co-expression relationships. To achieve this, we develop and validate percentile correlations, an innovative approach that overcomes data sparsity and yields accurate co-expression estimates from single-cell data. Next, acorde uses correlations to cluster co-expressed isoforms into a network, unraveling cell type-specific alternative isoform usage patterns. By selecting same-gene isoforms between these clusters, we subsequently detect and characterize genes with co-differential isoform usage (coDIU) across cell types. Finally, we predict functional elements from long read-defined isoforms and provide insight into biological processes, motifs, and domains potentially controlled by the coordination of post-transcriptional regulation. The code for acorde is available at https://github.com/ConesaLab/acorde .
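One way to picture the percentile-correlation idea mentioned in this abstract: summarise each isoform's expression within each cell type by a few percentiles, concatenate those summaries across cell types, and correlate the summary vectors rather than the sparse per-cell values. The sketch below is an illustrative assumption and not acorde's actual implementation; the function names, the use of deciles, and the toy data are all hypothetical.

```python
from math import sqrt
from statistics import mean, quantiles


def percentile_profile(expr_by_cell_type, n=10):
    """Summarise one isoform as n-1 percentile cut points per cell type.

    expr_by_cell_type maps a cell-type label to per-cell expression values.
    Cell types are visited in sorted order so profiles are comparable.
    """
    profile = []
    for cell_type in sorted(expr_by_cell_type):
        values = expr_by_cell_type[cell_type]
        profile.extend(quantiles(values, n=n, method="inclusive"))
    return profile


def pearson(x, y):
    """Plain Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def percentile_correlation(iso_a, iso_b, n=10):
    """Correlate two isoforms through their per-cell-type percentile profiles."""
    return pearson(percentile_profile(iso_a, n), percentile_profile(iso_b, n))


# Hypothetical toy data: iso_a and iso_b are both neuron-enriched, but their
# zeros (dropouts) fall in different cells; iso_c is glia-enriched.
iso_a = {"neuron": [0, 5, 6, 0, 7, 8], "glia": [0, 0, 1, 0, 0, 1]}
iso_b = {"neuron": [4, 0, 0, 9, 6, 5], "glia": [1, 0, 0, 0, 1, 0]}
iso_c = {"neuron": [0, 0, 1, 0, 0, 0], "glia": [6, 0, 5, 7, 0, 8]}

r_ab = percentile_correlation(iso_a, iso_b)
r_ac = percentile_correlation(iso_a, iso_c)
```

Because the percentile summary discards which individual cell carried a zero, `iso_a` and `iso_b` correlate strongly here even though their dropouts do not coincide, while `iso_a` and `iso_c` come out negatively correlated.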
Affiliation(s)
- Angeles Arzalluz-Luque
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Institute for Integrative Systems Biology (CSIC-UV), Spanish National Research Council, Paterna, Valencia, Spain
| | - Pedro Salguero
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
| | - Sonia Tarazona
- Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain.
| | - Ana Conesa
- Institute for Integrative Systems Biology (CSIC-UV), Spanish National Research Council, Paterna, Valencia, Spain.
- Microbiology and Cell Sciences Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA.
|
165
|
Liu Y, Lin Y, Yang W, Lin Y, Wu Y, Zhang Z, Lin N, Wang X, Tong M, Yu R. Application of individualized differential expression analysis in human cancer proteome. Brief Bioinform 2022; 23:6562685. [DOI: 10.1093/bib/bbac096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 02/06/2022] [Accepted: 02/23/2022] [Indexed: 11/13/2022] Open
Abstract
Liquid chromatography–mass spectrometry-based quantitative proteomics can measure the expression of thousands of proteins from biological samples and has been increasingly applied in cancer research. Identifying differentially expressed proteins (DEPs) between tumors and normal controls is commonly used to investigate carcinogenesis mechanisms. While differential expression analysis (DEA) at an individual level is desired to identify patient-specific molecular defects for better patient stratification, most statistical DEP analysis methods only identify deregulated proteins at the population level. To date, robust individualized DEA algorithms have been proposed for ribonucleic acid data, but their performance on proteomics data is underexplored. Herein, we performed a systematic evaluation of five individualized DEA algorithms for proteins on cancer proteomic datasets from seven cancer types. Results show that the within-sample relative expression orderings (REOs) of protein pairs in normal tissues were highly stable, providing the basis for individualized DEA for proteins using REOs. Moreover, individualized DEA algorithms achieve higher precision in detecting sample-specific deregulated proteins than population-level methods. To facilitate the utilization of individualized DEA algorithms in proteomics for prognostic biomarker discovery and personalized medicine, we provide Individualized DEP Analysis (IDEPA-XMBD; XMBD: Xiamen Big Data, a biomedical open software initiative in the National Institute for Data Science in Health and Medicine, Xiamen University, China) (https://github.com/xmuyulab/IDEPA-XMBD), a user-friendly and open-source Python toolkit that integrates individualized DEA algorithms for DEP-associated deregulation pattern recognition.
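The REO idea in this abstract can be sketched in a few lines: find protein pairs whose within-sample ordering is consistent across normal samples, then flag pairs whose ordering is reversed in a single tumor sample. This is a minimal sketch under stated assumptions, not one of the five evaluated algorithms and not the IDEPA-XMBD API; the function names, the 95% stability threshold, and the toy abundances are hypothetical.

```python
from itertools import permutations


def stable_pairs(normal_samples, min_fraction=0.95):
    """Return ordered pairs (a, b) with a > b in at least min_fraction of normals.

    normal_samples is a list of dicts mapping protein name to abundance.
    """
    proteins = sorted(normal_samples[0])
    stable = []
    for a, b in permutations(proteins, 2):
        agree = sum(1 for s in normal_samples if s[a] > s[b])
        if agree / len(normal_samples) >= min_fraction:
            stable.append((a, b))
    return stable


def reversed_pairs(sample, stable):
    """Stable pairs whose ordering is flipped in one individual sample."""
    return [(a, b) for a, b in stable if sample[a] < sample[b]]


def reversal_counts(sample, stable):
    """Count, per protein, how many reversed pairs it takes part in."""
    counts = {}
    for a, b in reversed_pairs(sample, stable):
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


# Hypothetical abundances: P1 > P2 > P3 holds in every normal sample.
normals = [
    {"P1": 9.0, "P2": 5.0, "P3": 1.0},
    {"P1": 8.5, "P2": 5.5, "P3": 1.2},
    {"P1": 9.2, "P2": 4.8, "P3": 0.9},
]
tumor = {"P1": 9.1, "P2": 0.5, "P3": 1.1}

stable = stable_pairs(normals)
flipped = reversed_pairs(tumor, stable)
counts = reversal_counts(tumor, stable)
```

Counting reversals only points at candidate proteins; a real individualized method would still need a statistical test to decide which partner of a reversed pair is deregulated and in which direction.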
Affiliation(s)
- Yachen Liu
- School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
| | - Yalan Lin
- School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
| | - Wenxian Yang
- Aginome Scientific, Xiamen, Fujian 316005, China
| | - Yuxiang Lin
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
| | - Yujuan Wu
- School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
| | - Zheyang Zhang
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
| | - Nuoqi Lin
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
| | - Xianlong Wang
- Department of Bioinformatics, School of Medical Technology and Engineering, Key Laboratory of Medical Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Fujian Medical University, Fuzhou, Fujian 350122, China
| | - Mengsha Tong
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
- State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, School of Life Sciences, Xiamen University, Xiamen, Fujian 361102, China
| | - Rongshan Yu
- School of Informatics, Xiamen University, Xiamen, Fujian 316000, China
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian 316005, China
- Aginome Scientific, Xiamen, Fujian 316005, China
|
166
|
The expression pattern of VISTA in the PBMCs of relapsing-remitting multiple sclerosis patients: A single-cell RNA sequencing-based study. Biomed Pharmacother 2022; 148:112725. [DOI: 10.1016/j.biopha.2022.112725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 11/20/2022] Open
|
167
|
Robinson EL, Baker AH, Brittan M, McCracken I, Condorelli G, Emanueli C, Srivastava PK, Gaetano C, Thum T, Vanhaverbeke M, Angione C, Heymans S, Devaux Y, Pedrazzini T, Martelli F. Dissecting the transcriptome in cardiovascular disease. Cardiovasc Res 2022; 118:1004-1019. [PMID: 33757121 PMCID: PMC8930073 DOI: 10.1093/cvr/cvab117] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/20/2020] [Accepted: 03/22/2021] [Indexed: 12/12/2022] Open
Abstract
The human transcriptome comprises a complex network of coding and non-coding RNAs implicated in a myriad of biological functions. Non-coding RNAs exhibit highly organized spatial and temporal expression patterns and are emerging as critical regulators of differentiation, homeostasis, and pathological states, including in the cardiovascular system. This review defines the current knowledge gaps, unmet methodological needs, and describes the challenges in dissecting and understanding the role and regulation of the non-coding transcriptome in cardiovascular disease. These challenges include poor annotation of the non-coding genome, determination of the cellular distribution of transcripts, assessment of the role of RNA processing and identification of cell-type specific changes in cardiovascular physiology and disease. We highlight similarities and differences in the hurdles associated with the analysis of the non-coding and protein-coding transcriptomes. In addition, we discuss how the lack of consensus and absence of standardized methods affect reproducibility of data. These shortcomings should be defeated in order to make significant scientific progress and foster the development of clinically applicable non-coding RNA-based therapeutic strategies to lessen the burden of cardiovascular disease.
Affiliation(s)
- Emma L Robinson
- Department of Cardiology, Cardiovascular Research Institute Maastricht (CARIM), Universiteitssingel 50, 6229 Maastricht University, Maastricht, The Netherlands
- The Division of Cardiology, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Andrew H Baker
- Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh, EH16 4TJ, UK
| | - Mairi Brittan
- Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh, EH16 4TJ, UK
| | - Ian McCracken
- Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh, EH16 4TJ, UK
| | - G Condorelli
- Humanitas Research Hospital, Humanitas University, Via Manzoni 113, Rozzano, MI 20089, Italy
| | - C Emanueli
- Imperial College, National Heart and Lung Institute, Hammersmith campus, Du Cane Road, London W12 0NN, UK
| | - P K Srivastava
- Imperial College, National Heart and Lung Institute, Hammersmith campus, Du Cane Road, London W12 0NN, UK
| | - C Gaetano
- Laboratorio di Epigenetica, Istituti Clinici Scientifici Maugeri IRCCS, Via Maugeri 4, Pavia 27100, Italy
| | - T Thum
- Hannover Medical School, Institute of Molecular and Translational Therapeutic Strategies (IMTTS), Carl-Neuberg-Straße 1 30625 Hannover, Germany
| | - M Vanhaverbeke
- UZ Gasthuisberg Campus, KU Leuven, Herestraat 49 3000 Leuven, Belgium
| | - C Angione
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, TS4 3BX, UK
| | - S Heymans
- Department of Cardiology, Cardiovascular Research Institute Maastricht (CARIM), Universiteitssingel 50, 6229 Maastricht University, Maastricht, The Netherlands
| | - Y Devaux
- Cardiovascular Research Unit, Department of Population Health, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, L-1445 Strassen, Luxembourg
| | - T Pedrazzini
- Experimental Cardiology Unit, Division of Cardiology, Department of Cardiovascular Medicine, University of Lausanne Medical School, 1011 Lausanne, Switzerland
| | - F Martelli
- Molecular Cardiology Laboratory, IRCCS-Policlinico San Donato, Piazza Edmondo Malan, 2, 20097 San Donato, Milan, Italy
|
168
|
Cao X, Xing L, Majd E, He H, Gu J, Zhang X. A Systematic Evaluation of Supervised Machine Learning Algorithms for Cell Phenotype Classification Using Single-Cell RNA Sequencing Data. Front Genet 2022; 13:836798. [PMID: 35281805 PMCID: PMC8905542 DOI: 10.3389/fgene.2022.836798] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/15/2021] [Accepted: 01/18/2022] [Indexed: 11/13/2022] Open
Abstract
The new technology of single-cell RNA sequencing (scRNA-seq) can yield valuable insights into gene expression and give critical information about the cellular compositions of complex tissues. In recent years, vast numbers of scRNA-seq datasets have been generated and made publicly available, and this has enabled researchers to train supervised machine learning models for predicting or classifying various cell-level phenotypes. This has led to the development of many new methods for analyzing scRNA-seq data. Despite the popularity of such applications, there has as yet been no systematic investigation of the performance of these supervised algorithms using predictors from various sizes of scRNA-seq datasets. In this study, 13 popular supervised machine learning algorithms for cell phenotype classification were evaluated using published real and simulated datasets with diverse cell sizes. This benchmark comprises two parts. In the first, real datasets were used to assess the computing speed and cell phenotype classification performance of popular supervised algorithms. The classification performances were evaluated using the area under the receiver operating characteristic curve, F1-score, Precision, Recall, and false-positive rate. In the second part, we evaluated gene-selection performance using published simulated datasets with a known list of real genes. The results showed that ElasticNet with interactions performed the best for small and medium-sized datasets. The NaiveBayes classifier was found to be another appropriate method for medium-sized datasets. With large datasets, the performance of the XGBoost algorithm was found to be excellent. Ensemble algorithms were not found to be significantly superior to individual machine learning methods. Including interactions in the ElasticNet algorithm caused a significant performance improvement for small datasets. 
The linear discriminant analysis algorithm was found to be the best choice when speed is critical; it is the fastest method, it can scale to handle large sample sizes, and its performance is not much worse than the top performers.
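The evaluation metrics named in this abstract (area under the ROC curve, F1-score, precision, recall, and false-positive rate) all follow from the confusion matrix and the ranking of prediction scores. The sketch below shows how they can be computed for a binary cell-phenotype label; it is an illustration only, the function names and toy labels are hypothetical, and the benchmark's own code may differ.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1 and false-positive rate from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}


def roc_auc(y_true, scores, positive=1):
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of cells, which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels for eight cells (1 = phenotype of interest).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

For these toy labels, precision, recall and F1 are each 0.75, the false-positive rate is 0.25, and the AUC is 0.9375.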
Affiliation(s)
- Xiaowen Cao
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
| | - Li Xing
- Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, SK, Canada
| | - Elham Majd
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
| | - Hua He
- School of Science, Hebei University of Technology, Tianjin, China
| | - Junhua Gu
- School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
| | - Xuekui Zhang
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC, Canada
|
169
|
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022] Open
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. 
The review and the accompanying software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single cell level.
Affiliation(s)
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Won-min Song
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Chen Ming
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Qian Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Xianxiao Zhou
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Peng Xu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Azra Krek
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
| | - Yonejung Yoon
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Lap Ho
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Miranda E. Orr
- Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
| | - Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
| | - Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
|
170
|
Mao S, Zhang Y, Seelig G, Kannan S. CellMeSH: probabilistic cell-type identification using indexed literature. Bioinformatics 2022; 38:1393-1402. [PMID: 34893819 PMCID: PMC8826164 DOI: 10.1093/bioinformatics/btab834] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/09/2020] [Revised: 11/21/2021] [Accepted: 12/06/2021] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION: Single-cell RNA sequencing (scRNA-seq) is widely used for analyzing gene expression in multi-cellular systems and provides unprecedented access to cellular heterogeneity. scRNA-seq experiments aim to identify and quantify all cell types present in a sample. Measured single-cell transcriptomes are grouped by similarity and the resulting clusters are mapped to cell types based on cluster-specific gene expression patterns. While the process of generating clusters has become largely automated, annotation remains a laborious ad hoc effort that requires expert biological knowledge. RESULTS: Here, we introduce CellMeSH, a new automated approach to identifying cell types for clusters based on prior literature. CellMeSH combines a database of gene-cell-type associations with a probabilistic method for database querying. The database is constructed by automatically linking gene and cell-type information from millions of publications using existing indexed literature resources. Compared to manually constructed databases, CellMeSH is more comprehensive and is easily updated with new data. The probabilistic query method enables reliable information retrieval even though the gene-cell-type associations extracted from the literature are noisy. CellMeSH can also optionally use prior knowledge about tissues or cells to further improve annotation. CellMeSH achieves top-one and top-three accuracies on a number of mouse and human datasets that are consistently better than existing approaches. AVAILABILITY AND IMPLEMENTATION: Web server at https://uncurl.cs.washington.edu/db_query and API at https://github.com/shunfumao/cellmesh. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shunfu Mao
- Electrical and Computer Engineering Department, University of Washington, Seattle, WA 98195, USA
| | - Yue Zhang
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
| | - Georg Seelig
- Electrical and Computer Engineering Department, University of Washington, Seattle, WA 98195, USA
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA 98195, USA
| | - Sreeram Kannan
- Electrical and Computer Engineering Department, University of Washington, Seattle, WA 98195, USA
|
171
|
Qin F, Luo X, Xiao F, Cai G. SCRIP: an accurate simulator for single-cell RNA sequencing data. Bioinformatics 2022; 38:1304-1311. [PMID: 34874992 DOI: 10.1093/bioinformatics/btab824] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/15/2021] [Revised: 11/22/2021] [Accepted: 12/01/2021] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION: Recent advancements in single-cell RNA sequencing (scRNA-seq) have enabled time-efficient transcriptome profiling in individual cells. To optimize sequencing protocols and develop reliable analysis methods for various application scenarios, solid simulation methods for scRNA-seq data are required. However, due to the noisy nature of scRNA-seq data, currently available simulation methods cannot sufficiently capture and simulate important properties of real data, especially the biological variation. In this study, we developed scRNA-seq information producer (SCRIP), a novel simulator for scRNA-seq that is accurate and enables simulation of bursting kinetics. RESULTS: Compared to existing simulators, SCRIP showed significantly higher accuracy in simulating key data features, including mean-variance dependency, in all experiments. SCRIP also outperformed other methods in recovering cell-cell distances. The application of SCRIP in evaluating differential expression analysis methods showed that edgeR outperformed other examined methods in differential expression analyses, and ZINB-WaVE improved the AUC at high dropout rates. Collectively, this study provides the research community with a rigorous tool for scRNA-seq data simulation. AVAILABILITY AND IMPLEMENTATION: https://CRAN.R-project.org/package=SCRIP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fei Qin
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
| | - Xizhi Luo
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
| | - Feifei Xiao
- Department of Epidemiology and Biostatistics, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
| | - Guoshuai Cai
- Department of Environmental Health Sciences, Arnold School of Public Health, University of South Carolina, Columbia, SC 29208, USA
|
172
|
Baruzzo G, Patuzzi I, Di Camillo B. Beware to ignore the rare: how imputing zero-values can improve the quality of 16S rRNA gene studies results. BMC Bioinformatics 2022; 22:618. [PMID: 35130833 PMCID: PMC8822630 DOI: 10.1186/s12859-022-04587-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Accepted: 01/27/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: 16S rRNA-gene sequencing is a valuable approach to characterize the taxonomic content of the whole bacterial population inhabiting a metabolic and spatial niche, providing an important opportunity to study bacteria and their role in many health and environmental mechanisms. The analysis of data produced by amplicon sequencing, however, brings very specific methodological issues that need to be properly addressed to obtain reliable biological conclusions. Among these, 16S count data tend to be very sparse, with many null values reflecting species that are present but went unobserved due to the multiplexing constraints. However, current data workflows do not include a step in which the information about unobserved species is recovered. RESULTS: In this work, we evaluate for the first time the effects of introducing into the 16S data workflow a new preprocessing step, zero-imputation, to recover this lost information. Due to the lack of published zero-imputation methods specifically designed for 16S count data, we considered a set of zero-imputation strategies available for other frameworks, and benchmarked them using in silico 16S count data reflecting different experimental designs. Additionally, we assessed the effect of combining zero-imputation and normalization, i.e. the only preprocessing step in current 16S workflows. Overall, we benchmarked 35 16S preprocessing pipelines, assessing their ability to handle data sparsity, identify species presence/absence, recover sample proportional abundance distributions, and improve typical downstream analyses such as computation of alpha and beta diversity indices and differential abundance analysis. CONCLUSIONS: The results clearly show that 16S data analysis greatly benefits from a properly performed zero-imputation step, although the choice of the right zero-imputation method plays a pivotal role. In addition, we identify a set of best-performing pipelines that could be a valuable indication for data analysts.
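The idea of chaining zero-imputation with normalization can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the imputation strategies that were benchmarked, so the sketch uses simple pseudocount replacement followed by total-sum scaling, and the function names and the pseudocount value of 0.5 are hypothetical.

```python
def impute_zeros(counts, pseudocount=0.5):
    """Replace every zero count with a small pseudocount (illustrative strategy only)."""
    return [c if c > 0 else pseudocount for c in counts]


def total_sum_scale(values):
    """Normalize a sample to proportions that sum to 1 (total-sum scaling)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize a sample whose total is zero")
    return [v / total for v in values]


# Toy sample: counts for five taxa, two of them unobserved.
sample = [120, 0, 30, 0, 50]

imputed = impute_zeros(sample)
proportions = total_sum_scale(imputed)

# Sparsity = fraction of exact zeros before and after imputation.
sparsity_before = sample.count(0) / len(sample)
sparsity_after = imputed.count(0) / len(imputed)

print(sparsity_before, sparsity_after)
print([round(p, 4) for p in proportions])
```

Swapping in a different imputation or normalization function while keeping the two-step shape is one way to picture what a "preprocessing pipeline" means in this setting.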
Affiliation(s)
- Giacomo Baruzzo
- Department of Information Engineering, University of Padova, Padua, Italy
| | - Ilaria Patuzzi
- Department of Information Engineering, University of Padova, Padua, Italy
- Microbial Ecology Unit, Istituto Zooprofilattico Sperimentale Delle Venezie, Padua, Italy
- Research & Development Division, EuBiome S.R.L., Padua, Italy
| | - Barbara Di Camillo
- Department of Information Engineering, University of Padova, Padua, Italy.
- CRIBI Biotechnology Centre, University of Padova, Padua, Italy.
- Department of Comparative Biomedicine and Food Science, University of Padova, Padua, Italy.
|
173
|
Lovero D, D’Oronzo S, Palmirotta R, Cafforio P, Brown J, Wood S, Porta C, Lauricella E, Coleman R, Silvestris F. Correlation between targeted RNAseq signature of breast cancer CTCs and onset of bone-only metastases. Br J Cancer 2022; 126:419-429. [PMID: 34272498 PMCID: PMC8810805 DOI: 10.1038/s41416-021-01481-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/06/2021] [Revised: 06/04/2021] [Accepted: 06/30/2021] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Bone is the most frequent site of metastases from breast cancer (BC), but no biomarkers are yet available to predict skeletal dissemination. METHODS: We attempted to identify a gene signature correlated with bone metastasis (BM) onset in circulating tumour cells (CTCs), isolated by a DEPArray-based protocol from 40 metastatic BC patients and grouped according to metastasis sites, namely "BM" (bone-only), "ES" (extra-skeletal) or "BM + ES" (bone + extra-skeletal). RESULTS: A 134-gene panel was first validated through targeted RNA sequencing (RNAseq) on sub-clones of the MDA-MB-231 BC cell line with variable organotropism, which successfully shaped their clustering. The panel was then applied to CTC groups and, in particular, the "BM" vs "ES" CTC comparison revealed 31 differentially expressed genes, including MAF, CAPG, GIPC1 and IL1B, playing key prognostic roles in BC. CONCLUSION: Such evidence confirms that CTCs are suitable biological sources for organotropism investigation through targeted RNAseq and might deserve future applications in wide-scale prospective studies.
Affiliation(s)
- Domenica Lovero
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
| | - Stella D’Oronzo
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
| | - Raffaele Palmirotta
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
| | - Paola Cafforio
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
| | - Janet Brown
- Department of Oncology and Metabolism, University of Sheffield, Weston Park Hospital, Sheffield, UK (GRID grid.417079.c; ISNI 0000 0004 0391 9207)
| | - Steven Wood
- Department of Oncology and Metabolism, University of Sheffield, Weston Park Hospital, Sheffield, UK (GRID grid.417079.c; ISNI 0000 0004 0391 9207)
| | - Camillo Porta
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
| | - Eleonora Lauricella
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
| | - Robert Coleman
- Department of Oncology and Metabolism, University of Sheffield, Weston Park Hospital, Sheffield, UK (GRID grid.417079.c; ISNI 0000 0004 0391 9207)
| | - Franco Silvestris
- Department of Biomedical Sciences and Human Oncology, Section of Internal Medicine and Clinical Oncology, University of Bari Aldo Moro, Bari, Italy (GRID grid.7644.1; ISNI 0000 0001 0120 3326)
|
174
|
Denninger JK, Walker LA, Chen X, Turkoglu A, Pan A, Tapp Z, Senthilvelan S, Rindani R, Kokiko-Cochran ON, Bundschuh R, Yan P, Kirby ED. Robust Transcriptional Profiling and Identification of Differentially Expressed Genes With Low Input RNA Sequencing of Adult Hippocampal Neural Stem and Progenitor Populations. Front Mol Neurosci 2022; 15:810722. [PMID: 35173579 PMCID: PMC8842474 DOI: 10.3389/fnmol.2022.810722] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/07/2021] [Accepted: 01/05/2022] [Indexed: 11/17/2022] Open
Abstract
Multipotent neural stem cells (NSCs) are found in several isolated niches of the adult mammalian brain where they have unique potential to assist in tissue repair. Modern transcriptomics offer high-throughput methods for identifying disease- or injury-associated gene expression signatures in endogenous adult NSCs, but they require adaptation to accommodate the rarity of NSCs. Bulk RNA sequencing (RNAseq) of NSCs requires pooling several mice, which impedes application to labor-intensive injury models. Alternatively, single cell RNAseq can profile hundreds to thousands of cells from a single mouse and is increasingly used to study NSCs. The consequences of the low RNA input from a single NSC on downstream identification of differentially expressed genes (DEGs) remain insufficiently explored. Here, to clarify the role that low RNA input plays in NSC DEG identification, we directly compared DEGs in an oxidative stress model of cultured NSCs by bulk and single cell sequencing. While both methods yielded DEGs that were replicable, single cell sequencing using the 10X Chromium platform yielded DEGs derived from genes with higher relative transcript counts compared to non-DEGs and exhibited smaller fold changes than DEGs identified by bulk RNAseq. The loss of high fold-change DEGs in the single cell platform presents an important limitation for identifying disease-relevant genes. To facilitate identification of such genes, we determined an RNA-input threshold that enables transcriptional profiling of NSCs comparable to standard bulk sequencing and used it to establish a workflow for in vivo profiling of endogenous NSCs. We then applied this workflow to identify DEGs after lateral fluid percussion injury, a labor-intensive animal model of traumatic brain injury. Our work joins an emerging body of evidence suggesting that single cell RNA sequencing may underestimate the diversity of pathologic DEGs. However, our data also suggest that population-level transcriptomic analysis can be adapted to capture more of these DEGs with similar efficacy and diversity as standard bulk sequencing. Together, our data and workflow will be useful for investigators interested in understanding and manipulating adult hippocampal NSC responses to various stimuli.
Affiliation(s)
- Jiyeon K. Denninger
- Department of Psychology, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
| | - Logan A. Walker
- Department of Physics, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
| | - Xi Chen
- Comprehensive Cancer Center, College of Medicine, The Ohio State University, Columbus, OH, United States
| | - Altan Turkoglu
- Department of Physics, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
| | - Alex Pan
- Comprehensive Cancer Center, College of Medicine, The Ohio State University, Columbus, OH, United States
| | - Zoe Tapp
- Department of Neuroscience, Institute for Behavioral Medicine Research, The Ohio State University, Columbus, OH, United States
| | - Sakthi Senthilvelan
- Department of Psychology, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
| | - Raina Rindani
- Department of Psychology, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
| | - Olga N. Kokiko-Cochran
- Department of Neuroscience, Institute for Behavioral Medicine Research, The Ohio State University, Columbus, OH, United States
- Chronic Brain Injury Program, The Ohio State University, Columbus, OH, United States
| | - Ralf Bundschuh
- Department of Physics, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
- Division of Hematology, Department of Internal Medicine, College of Medicine, The Ohio State University, Columbus, OH, United States
- Department of Chemistry and Biochemistry, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
| | - Pearlly Yan
- Comprehensive Cancer Center, College of Medicine, The Ohio State University, Columbus, OH, United States
- Division of Hematology, Department of Internal Medicine, College of Medicine, The Ohio State University, Columbus, OH, United States
| | - Elizabeth D. Kirby
- Department of Psychology, College of Arts and Sciences, The Ohio State University, Columbus, OH, United States
- Chronic Brain Injury Program, The Ohio State University, Columbus, OH, United States
- *Correspondence: Elizabeth D. Kirby,
|
175
|
Skoufos G, Almodaresi F, Zakeri M, Paulson JN, Patro R, Hatzigeorgiou AG, Vlachos IS. AGAMEMNON: an Accurate metaGenomics And MEtatranscriptoMics quaNtificatiON analysis suite. Genome Biol 2022; 23:39. [PMID: 35101114 PMCID: PMC8802518 DOI: 10.1186/s13059-022-02610-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2020] [Accepted: 01/03/2022] [Indexed: 12/03/2022] Open
Abstract
We introduce AGAMEMNON (https://github.com/ivlachos/agamemnon) for the acquisition of microbial abundances from shotgun metagenomics and metatranscriptomic samples, single-microbe sequencing experiments, or sequenced host samples. AGAMEMNON delivers accurate abundances at genus, species, and strain resolution. It incorporates a time- and space-efficient indexing scheme for fast pattern matching, enabling indexing and analysis of vast datasets with widely available computational resources. Host-specific modules provide exceptional accuracy for microbial abundance quantification from tissue RNA/DNA sequencing, enabling the expansion of experiments lacking metagenomic/metatranscriptomic analyses. AGAMEMNON provides an R-Shiny application, permitting performance of investigations and visualizations from a graphics interface.
Affiliation(s)
- Giorgos Skoufos
- Department of Electrical & Computer Engineering, University of Thessaly, 38221, Volos, Greece.
- Hellenic Pasteur Institute, 11521, Athens, Greece.
- DIANA-Lab, Department of Computer Science and Biomedical Informatics, Univ. of Thessaly, 351 31, Lamia, Greece.
| | - Fatemeh Almodaresi
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Mohsen Zakeri
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Joseph N Paulson
- Department of Data Sciences, Genentech Inc., South San Francisco, CA, USA
| | - Rob Patro
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Artemis G Hatzigeorgiou
- Department of Electrical & Computer Engineering, University of Thessaly, 38221, Volos, Greece.
- Hellenic Pasteur Institute, 11521, Athens, Greece.
- DIANA-Lab, Department of Computer Science and Biomedical Informatics, Univ. of Thessaly, 351 31, Lamia, Greece.
| | - Ioannis S Vlachos
- Cancer Research Institute | HMS Initiative for RNA Medicine | Department of Pathology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, 02115, USA.
- Spatial Technologies Unit, Beth Israel Deaconess Medical Center, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
|
176
|
Jiang R, Sun T, Song D, Li JJ. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol 2022; 23:31. [PMID: 35063006 PMCID: PMC8783472 DOI: 10.1186/s13059-022-02601-5] [Citation(s) in RCA: 122] [Impact Index Per Article: 61.0] [Received: 09/27/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022] Open
Abstract
Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.
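As a rough illustration of two ideas mentioned in this abstract, the sketch below adds non-biological zeros to a toy count vector and then derives a binarized version of it. The masking scheme (each count independently set to zero with a fixed probability) is a hypothetical stand-in chosen for simplicity; the abstract does not describe the five mechanisms it refers to, and the function names here are illustrative.

```python
import random


def add_random_zeros(counts, rate, rng):
    """Set each count to zero independently with probability `rate` (assumed mechanism)."""
    return [0 if rng.random() < rate else c for c in counts]


def binarize(counts):
    """Map counts to 1 (detected) or 0 (not detected)."""
    return [1 if c > 0 else 0 for c in counts]


# Toy observed counts for one gene across eight cells.
observed = [3, 0, 12, 1, 0, 7, 2, 0]

rng = random.Random(0)  # fixed seed so the example is reproducible
masked = add_random_zeros(observed, rate=0.3, rng=rng)
binary = binarize(masked)

zero_fraction_observed = observed.count(0) / len(observed)
zero_fraction_masked = masked.count(0) / len(masked)
print(zero_fraction_observed, zero_fraction_masked)
print(binary)
```

Masking can only add zeros, never remove them, so the zero fraction after masking is at least as large as before; the binarized vector keeps only the detected/undetected pattern.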
Affiliation(s)
- Ruochen Jiang
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
| | - Tianyi Sun
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
| | - Dongyuan Song
- Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, 90095-7246, CA, USA
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA.
- Department of Human Genetics, University of California, Los Angeles, 90095-7088, CA, USA.
- Department of Computational Medicine, University of California, Los Angeles, 90095-1766, CA, USA.
- Department of Biostatistics, University of California, Los Angeles, 90095-1772, CA, USA.
|
177
|
Dai M, Pei X, Wang XJ. Accurate and fast cell marker gene identification with COSG. Brief Bioinform 2022; 23:6511197. [PMID: 35048116 DOI: 10.1093/bib/bbab579] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 08/13/2021] [Revised: 12/03/2021] [Accepted: 12/17/2021] [Indexed: 01/01/2023] Open
Abstract
Accurate cell classification is the groundwork for downstream analysis of single-cell sequencing data, yet identifying true marker genes for different cell types remains a major challenge. Here, we report COSine similarity-based marker Gene identification (COSG) as a cosine similarity-based method for more accurate and scalable marker gene identification. COSG is applicable to single-cell RNA sequencing data, single-cell ATAC sequencing data and spatially resolved transcriptome data. COSG is fast and scales to ultra-large datasets with millions of cells. Application to both simulated and real experimental datasets showed that the marker genes or genomic regions identified by COSG have greater cell-type specificity, demonstrating the superior performance of COSG in terms of both accuracy and efficiency as compared with other available methods.
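The general idea of scoring genes by cosine similarity can be sketched on toy data. This is not the COSG algorithm or its API; it is a minimal illustration that assumes a gene is scored against each cell group by the cosine similarity between its expression vector and a 0/1 indicator vector for that group. All names and values below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def marker_scores(expression, labels):
    """Rank genes per group by similarity to that group's indicator vector.

    expression: dict mapping gene name -> list of values, one per cell
    labels: list of group labels, one per cell
    Returns a dict mapping group -> list of (gene, score), best first.
    """
    result = {}
    for group in sorted(set(labels)):
        indicator = [1.0 if lab == group else 0.0 for lab in labels]
        scored = [(gene, cosine(vals, indicator)) for gene, vals in expression.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        result[group] = scored
    return result


# Toy data: six cells in two groups, three genes.
labels = ["T", "T", "T", "B", "B", "B"]
expression = {
    "geneA": [5, 4, 6, 0, 0, 0],  # expressed only in group T
    "geneB": [0, 0, 1, 7, 8, 6],  # expressed mostly in group B
    "geneC": [3, 3, 3, 3, 3, 3],  # uniformly expressed, not specific
}

scores = marker_scores(expression, labels)
top_T = scores["T"][0][0]
top_B = scores["B"][0][0]
print(top_T, top_B)
```

In this toy setting the group-restricted genes rank first for their own group, while the uniformly expressed gene receives an intermediate score for both groups, which is the kind of specificity a cosine-based score is meant to capture.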
Affiliation(s)
- Min Dai
- Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiaobing Pei
- School of Software, HuaZhong University of Science and Technology, Wuhan Hubei 430074, China
| | - Xiu-Jie Wang
- Institute of Genetics and Developmental Biology, Innovation Academy of Seed Design, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
|
178
|
Huang K, Wang C, Vagts C, Raguveer V, Finn PW, Perkins DL. Long non-coding RNAs (lncRNAs) NEAT1 and MALAT1 are differentially expressed in severe COVID-19 patients: An integrated single-cell analysis. PLoS One 2022; 17:e0261242. [PMID: 35007307 PMCID: PMC8746747 DOI: 10.1371/journal.pone.0261242] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 07/30/2021] [Accepted: 11/25/2021] [Indexed: 01/08/2023] Open
Abstract
Hyperactive and damaging inflammation is a hallmark of severe rather than mild Coronavirus disease 2019 (COVID-19). To uncover key inflammatory differentiators between severe and mild COVID-19, we applied an unbiased single-cell transcriptomic analysis. We integrated two single-cell RNA-seq datasets with COVID-19 patient samples, one that sequenced bronchoalveolar lavage (BAL) cells and one that sequenced peripheral blood mononuclear cells (PBMCs). The combined cell population was then analyzed with a focus on genes associated with disease severity. The immunomodulatory long non-coding RNAs (lncRNAs) NEAT1 and MALAT1 were highly differentially expressed between mild and severe patients in multiple cell types. Within those same cell types, the concurrent detection of other severity-associated genes involved in cellular stress response and apoptosis regulation suggests that the pro-inflammatory functions of these lncRNAs may foster cell stress and damage. Thus, NEAT1 and MALAT1 are potential components of immune dysregulation in COVID-19 that may provide targets for severity related diagnostic measures or therapy.
Affiliation(s)
- Kai Huang
- Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Catherine Wang
- College of Medicine, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Christen Vagts
- Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Vanitha Raguveer
- College of Medicine, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - Patricia W. Finn
- Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Department of Microbiology and Immunology, University of Illinois at Chicago, Chicago, Illinois, United States of America
| | - David L. Perkins
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Division of Nephrology, Department of Medicine, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Department of Surgery, University of Illinois at Chicago, Chicago, Illinois, United States of America
|
179
|
Multi-Omics Profiling of the Tumor Microenvironment. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2022; 1361:283-326. [DOI: 10.1007/978-3-030-91836-1_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
180
|
Koneshamoorthy A, Seniveratne-Epa D, Calder G, Sawyer M, Kay TWH, Farrell S, Loudovaris T, Mariana L, McCarthy D, Lyu R, Liu X, Thorn P, Tong J, Chin LK, Zacharin M, Trainer A, Taylor S, MacIsaac RJ, Sachithanandan N, Thomas HE, Krishnamurthy B. Case Report: Hypoglycemia Due to a Novel Activating Glucokinase Variant in an Adult - a Molecular Approach. Front Endocrinol (Lausanne) 2022; 13:842937. [PMID: 35370948 PMCID: PMC8969599 DOI: 10.3389/fendo.2022.842937] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/24/2021] [Accepted: 02/17/2022] [Indexed: 11/13/2022] Open
Abstract
We present a case of an obese 22-year-old man with an activating GCK variant who had neonatal hypoglycemia, re-emerging with hypoglycemia later in life. We investigated him for asymptomatic hypoglycemia with a family history of hypoglycemia. Genetic testing yielded a novel GCK missense class 3 variant that was subsequently found in his mother, sister and nephew and reclassified as a class 4 likely pathogenic variant. Glucokinase enables phosphorylation of glucose, the rate-limiting step of glycolysis in the liver and pancreatic β cells. It plays a crucial role in the regulation of insulin secretion. Inactivating variants in GCK cause hyperglycemia and activating variants cause hypoglycemia. Spleen-preserving distal pancreatectomy revealed diffuse hyperplastic islets, nuclear pleomorphism and periductular islets. Glucose-stimulated insulin secretion testing revealed increased insulin secretion in response to glucose. Cytoplasmic calcium, which triggers exocytosis of insulin-containing granules, showed a normal basal level but an increased glucose-stimulated level. Unbiased gene expression analysis using 10X single cell sequencing revealed upregulated INS and CKB genes and downregulated DLK1 and NPY genes in β-cells. Further studies are required to see if alteration in expression of these genes plays a role in the metabolic and histological phenotype associated with the pathogenic glucokinase variant. There were more large islets in the patient's pancreas than in control subjects but there was no difference in the proportion of β cells in the islets. His hypoglycemia was persistent after pancreatectomy, was refractory to diazoxide and improved with pasireotide. This case highlights the variable phenotype of GCK mutations. In-depth molecular analyses in the islets have revealed possible mechanisms for hyperplastic islets and insulin hypersecretion.
Affiliation(s)
- Anojian Koneshamoorthy
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Dilan Seniveratne-Epa
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Genevieve Calder
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Matthew Sawyer
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Thomas W. H. Kay
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
- Department of Medicine, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Stephen Farrell
- Department of Surgery, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Thomas Loudovaris
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
| | - Lina Mariana
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
| | - Davis McCarthy
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
- Melbourne Integrative Genomics, Faculty of Science, University of Melbourne, Melbourne, VIC, Australia
| | - Ruqian Lyu
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
| | - Xin Liu
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
- Melbourne Integrative Genomics, Faculty of Science, University of Melbourne, Melbourne, VIC, Australia
| | - Peter Thorn
- Charles Perkins Centre, School of Medical Sciences, University of Sydney, Sydney, NSW, Australia
| | - Jason Tong
- Charles Perkins Centre, School of Medical Sciences, University of Sydney, Sydney, NSW, Australia
| | - Lit Kim Chin
- Department of Diabetes and Endocrinology, Royal Children’s Hospital, Melbourne, VIC, Australia
| | - Margaret Zacharin
- Department of Diabetes and Endocrinology, Royal Children’s Hospital, Melbourne, VIC, Australia
| | - Alison Trainer
- Department of Genomic Medicine, Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Shelby Taylor
- Department of Genomic Medicine, Royal Melbourne Hospital, Melbourne, VIC, Australia
| | - Richard J. MacIsaac
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
- Department of Medicine, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Nirupa Sachithanandan
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
- Department of Medicine, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Helen E. Thomas
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
- Department of Medicine, St. Vincent’s Hospital, Melbourne, VIC, Australia
| | - Balasubramanian Krishnamurthy
- Department of Endocrinology and Diabetes, St. Vincent’s Hospital, Melbourne, VIC, Australia
- St. Vincent’s Institute of Medical Research, Melbourne, VIC, Australia
- Department of Medicine, St. Vincent’s Hospital, Melbourne, VIC, Australia
- *Correspondence: Balasubramanian Krishnamurthy,
|
181
|
Lin D, Chen Y, Negi S, Cheng D, Ouyang Z, Sexton D, Li K, Zhang B. CellDepot: A unified repository for scRNA-seq data and visual exploration. J Mol Biol 2021; 434:167425. [PMID: 34971674 DOI: 10.1016/j.jmb.2021.167425] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/28/2021] [Revised: 12/09/2021] [Accepted: 12/20/2021] [Indexed: 11/29/2022]
Abstract
CellDepot, containing over 270 datasets from 8 species and many tissues, serves as an integrated web application to empower scientists in exploring single-cell RNA-seq (scRNA-seq) datasets and comparing the datasets among various studies through a user-friendly interface with advanced visualization and analytical capabilities. To begin with, it provides an efficient data management system through which users can upload single cell datasets and query the database by multiple attributes such as species and cell types. In addition, the graphical multi-logic, multi-condition query builder and convenient filtering tool, backed by a MySQL database system, allow users to quickly find the datasets of interest and compare the expression of gene(s) across them. Moreover, by embedding the cellxgene VIP tool, CellDepot enables fast, interactive and scalable exploration of individual datasets to gain more refined insights such as cell composition, gene expression profiles, and differentially expressed genes among cell types by leveraging more than 20 frequently applied plotting functions and high-level analysis methods in single cell research. In summary, the web portal, available at http://celldepot.bxgenomics.com, promotes large-scale single cell data sharing, facilitates meta-analysis and visualization, and encourages scientists to contribute to the single-cell community in a tractable and collaborative way. Finally, CellDepot is released as open-source software under the MIT license to motivate crowd contribution, broad adoption, and local deployment for private datasets.
Affiliation(s)
- Dongdong Lin
- Research Department, Biogen, Inc., 225 Binney St, Cambridge, MA 02142, USA.
| | - Yirui Chen
- Research Department, Biogen, Inc., 225 Binney St, Cambridge, MA 02142, USA.
| | - Soumya Negi
- Research Department, Biogen, Inc., 225 Binney St, Cambridge, MA 02142, USA.
| | - Derrick Cheng
- BioInfoRx, Inc., 510 Charmany Dr, Suite 275A, Madison, WI 53719, USA.
| | - Zhengyu Ouyang
- BioInfoRx, Inc., 510 Charmany Dr, Suite 275A, Madison, WI 53719, USA.
| | - David Sexton
- Research Department, Biogen, Inc., 225 Binney St, Cambridge, MA 02142, USA.
| | - Kejie Li
- Research Department, Biogen, Inc., 225 Binney St, Cambridge, MA 02142, USA.
| | - Baohong Zhang
- Research Department, Biogen, Inc., 225 Binney St, Cambridge, MA 02142, USA.
| |
|
182
|
Schnell A, Huang L, Singer M, Singaraju A, Barilla RM, Regan BML, Bollhagen A, Thakore PI, Dionne D, Delorey TM, Pawlak M, Meyer Zu Horste G, Rozenblatt-Rosen O, Irizarry RA, Regev A, Kuchroo VK. Stem-like intestinal Th17 cells give rise to pathogenic effector T cells during autoimmunity. Cell 2021; 184:6281-6298.e23. [PMID: 34875227 PMCID: PMC8900676 DOI: 10.1016/j.cell.2021.11.018] [Citation(s) in RCA: 108] [Impact Index Per Article: 36.0] [Received: 04/27/2021] [Revised: 10/13/2021] [Accepted: 11/11/2021] [Indexed: 12/24/2022]
Abstract
While intestinal Th17 cells are critical for maintaining tissue homeostasis, recent studies have implicated their roles in the development of extra-intestinal autoimmune diseases including multiple sclerosis. However, the mechanisms by which tissue Th17 cells mediate these dichotomous functions remain unknown. Here, we characterized the heterogeneity, plasticity, and migratory phenotypes of tissue Th17 cells in vivo by combined fate mapping with profiling of the transcriptomes and TCR clonotypes of over 84,000 Th17 cells at homeostasis and during CNS autoimmune inflammation. Inter- and intra-organ single-cell analyses revealed a homeostatic, stem-like TCF1+ IL-17+ SLAMF6+ population that traffics to the intestine where it is maintained by the microbiota, providing a ready reservoir for the IL-23-driven generation of encephalitogenic GM-CSF+ IFN-γ+ CXCR6+ T cells. Our study defines a direct in vivo relationship between IL-17+ non-pathogenic and GM-CSF+ and IFN-γ+ pathogenic Th17 populations and provides a mechanism by which homeostatic intestinal Th17 cells direct extra-intestinal autoimmune disease.
Affiliation(s)
- Alexandra Schnell
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA; Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Linglin Huang
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Meromit Singer
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Immunology, Blavatnik Institute, Harvard Medical School, Boston, MA 02115, USA
| | - Anvita Singaraju
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Rocky M Barilla
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Brianna M L Regan
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Alina Bollhagen
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA; German Cancer Research Center, DKFZ, Heidelberg 69120, Germany
| | - Pratiksha I Thakore
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Danielle Dionne
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Toni M Delorey
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Mathias Pawlak
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA; Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Gerd Meyer Zu Horste
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA
| | - Orit Rozenblatt-Rosen
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Rafael A Irizarry
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Aviv Regev
- Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| | - Vijay K Kuchroo
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women's Hospital, Boston, MA 02115, USA; Klarman Cell Observatory, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| |
|
183
|
Dinh HQ, Pan F, Wang G, Huang QF, Olingy CE, Wu ZY, Wang SH, Xu X, Xu XE, He JZ, Yang Q, Orsulic S, Haro M, Li LY, Huang GW, Breunig JJ, Koeffler HP, Hedrick CC, Xu LY, Lin DC, Li EM. Integrated single-cell transcriptome analysis reveals heterogeneity of esophageal squamous cell carcinoma microenvironment. Nat Commun 2021; 12:7335. [PMID: 34921160 PMCID: PMC8683407 DOI: 10.1038/s41467-021-27599-5] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 06/09/2020] [Accepted: 11/29/2021] [Indexed: 02/05/2023] Open
Abstract
The tumor microenvironment is a highly complex ecosystem of diverse cell types, which shape cancer biology and impact the responsiveness to therapy. Here, we analyze the microenvironment of esophageal squamous cell carcinoma (ESCC) using single-cell transcriptome sequencing in 62,161 cells from blood, adjacent nonmalignant and matched tumor samples from 11 ESCC patients. We uncover heterogeneity in most cell types of the ESCC stroma, particularly in the fibroblast and immune cell compartments. We identify a tumor-specific subset of CST1+ myofibroblasts with prognostic values and potential biological significance. CST1+ myofibroblasts are also highly tumor-specific in other cancer types. Additionally, a subset of antigen-presenting fibroblasts is revealed and validated. Analyses of myeloid and T lymphoid lineages highlight the immunosuppressive nature of the ESCC microenvironment, and identify cancer-specific expression of immune checkpoint inhibitors. This work establishes a rich resource of stromal cell types of the ESCC microenvironment for further understanding of ESCC biology.
Affiliation(s)
- Huy Q Dinh
- McArdle Laboratory for Cancer Research, University of Wisconsin-Madison School of Medicine and Public Health, Madison, WI, USA.
- Division of Inflammation Biology, La Jolla Institute for Immunology, La Jolla, CA, USA.
| | - Feng Pan
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China
| | - Geng Wang
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China
- Department of Thoracic Surgery, Cancer Hospital of Shantou University Medical College, Shantou, China
| | - Qing-Feng Huang
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China
| | - Claire E Olingy
- Division of Inflammation Biology, La Jolla Institute for Immunology, La Jolla, CA, USA
| | | | | | - Xin Xu
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China
| | - Xiu-E Xu
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China
| | - Jian-Zhong He
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China
| | - Qian Yang
- Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Sandra Orsulic
- Department of Obstetrics and Gynecology and Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
| | - Marcela Haro
- Department of Obstetrics and Gynecology and Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
| | - Li-Yan Li
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
| | - Guo-Wei Huang
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China
| | - Joshua J Breunig
- Board of Governors Regenerative Medicine Institute and Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - H Phillip Koeffler
- Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
| | - Catherine C Hedrick
- Division of Inflammation Biology, La Jolla Institute for Immunology, La Jolla, CA, USA
| | - Li-Yan Xu
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China.
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China.
| | - De-Chen Lin
- Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA.
| | - En-Min Li
- Guangdong Provincial Key Laboratory of Infectious Diseases and Molecular Immunopathology, The Key Laboratory of Molecular Biology for High Cancer Incidence Coastal Chaoshan Area, Shantou University Medical College, Shantou, China.
- Guangdong Esophageal Cancer Research Institute, Shantou Sub-center, Shantou, China.
| |
|
184
|
Asami M, Lam BYH, Ma MK, Rainbow K, Braun S, VerMilyea MD, Yeo GSH, Perry ACF. Human embryonic genome activation initiates at the one-cell stage. Cell Stem Cell 2021; 29:209-216.e4. [PMID: 34936886 PMCID: PMC8826644 DOI: 10.1016/j.stem.2021.11.012] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 04/08/2021] [Revised: 08/24/2021] [Accepted: 11/29/2021] [Indexed: 12/13/2022]
Abstract
In human embryos, the initiation of transcription (embryonic genome activation [EGA]) occurs by the eight-cell stage, but its exact timing and profile are unclear. To address this, we profiled gene expression at depth in human metaphase II oocytes and bipronuclear (2PN) one-cell embryos. High-resolution single-cell RNA sequencing revealed previously inaccessible oocyte-to-embryo gene expression changes. This confirmed transcript depletion following fertilization (maternal RNA degradation) but also uncovered low-magnitude upregulation of hundreds of spliced transcripts. Gene expression analysis predicted embryonic processes including cell-cycle progression and chromosome maintenance as well as transcriptional activators that included cancer-associated gene regulators. Transcription was disrupted in abnormal monopronuclear (1PN) and tripronuclear (3PN) one-cell embryos. These findings indicate that human embryonic transcription initiates at the one-cell stage, sooner than previously thought. The pattern of gene upregulation promises to illuminate processes involved at the onset of human development, with implications for epigenetic inheritance, stem-cell-derived embryos, and cancer. Gene expression initiates at the one-cell stage in human embryos. Expression is of low magnitude but remains elevated until the eight-cell stage. Upregulated transcripts are spliced and correspond to embryonic processes. Upregulation is disrupted in morphologically abnormal one-cell embryos.
Affiliation(s)
- Maki Asami
- Laboratory of Mammalian Molecular Embryology, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, England
| | - Brian Y H Lam
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England
| | - Marcella K Ma
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England
| | - Kara Rainbow
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England
| | - Stefanie Braun
- Ovation Fertility Austin, Embryology and Andrology Laboratories, Austin, TX 78731, USA
| | - Matthew D VerMilyea
- Ovation Fertility Austin, Embryology and Andrology Laboratories, Austin, TX 78731, USA.
| | - Giles S H Yeo
- MRC Metabolic Diseases Unit, Wellcome-MRC Institute of Metabolic Science, Addenbrooke's Hospital, University of Cambridge, Cambridge CB2 0QQ, England.
| | - Anthony C F Perry
- Laboratory of Mammalian Molecular Embryology, Department of Biology and Biochemistry, University of Bath, Bath BA2 7AY, England.
| |
|
185
|
Azodi CB, Zappia L, Oshlack A, McCarthy DJ. splatPop: simulating population scale single-cell RNA sequencing data. Genome Biol 2021; 22:341. [PMID: 34911537 DOI: 10.1186/s13059-021-02546-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/26/2021] [Accepted: 11/19/2021] [Indexed: 11/10/2022] Open
Abstract
Population-scale single-cell RNA sequencing (scRNA-seq) is now viable, enabling finer resolution functional genomics studies and leading to a rush to adapt bulk methods and develop new single-cell-specific methods to perform these studies. Simulations are useful for developing, testing, and benchmarking methods but current scRNA-seq simulation frameworks do not simulate population-scale data with genetic effects. Here, we present splatPop, a model for flexible, reproducible, and well-documented simulation of population-scale scRNA-seq data with known expression quantitative trait loci. splatPop can also simulate complex batch, cell group, and conditional effects between individuals from different cohorts as well as genetically-driven co-expression.
Affiliation(s)
- Christina B Azodi
- St. Vincent's Institute of Medical Research, 9 Princes Street, Fitzroy, 3065, VIC, Australia; University of Melbourne, Royal Parade, Parkville, 3010, VIC, Australia
| | - Luke Zappia
- Department of Mathematics, Technical University of Munich, Boltzmannstraße 3, Garching bei München, 85748, Germany; Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstraße 1, Neuherberg, 85764, Germany
| | - Alicia Oshlack
- University of Melbourne, Royal Parade, Parkville, 3010, VIC, Australia; Peter MacCallum Cancer Centre, Grattan Street, Melbourne, 3000, VIC, Australia
| | - Davis J McCarthy
- St. Vincent's Institute of Medical Research, 9 Princes Street, Fitzroy, 3065, VIC, Australia; University of Melbourne, Royal Parade, Parkville, 3010, VIC, Australia.
| |
|
186
|
Hu Z, Ahmed AA, Yau C. CIDER: an interpretable meta-clustering framework for single-cell RNA-seq data integration and evaluation. Genome Biol 2021; 22:337. [PMID: 34903266 PMCID: PMC8667531 DOI: 10.1186/s13059-021-02561-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/05/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022] Open
Abstract
Clustering of joint single-cell RNA-Seq (scRNA-Seq) data is often challenged by confounding factors, such as batch effects and biologically relevant variability. Existing batch effect removal methods typically require strong assumptions on the composition of cell populations being near identical across samples. Here, we present CIDER, a meta-clustering workflow based on inter-group similarity measures. We demonstrate that CIDER outperforms other scRNA-Seq clustering methods and integration approaches in both simulated and real datasets. Moreover, we show that CIDER can be used to assess the biological correctness of integration in real datasets, while it does not require the existence of prior cellular annotations.
Affiliation(s)
- Zhiyuan Hu
- Ovarian Cancer Cell Laboratory, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, OX3 9DS, UK
- Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK
- Current Address: MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, OX3 9DS, UK
| | - Ahmed A Ahmed
- Ovarian Cancer Cell Laboratory, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, OX3 9DS, UK.
- Nuffield Department of Women's and Reproductive Health, University of Oxford, Oxford, OX3 9DU, UK.
| | - Christopher Yau
- Division of Informatics, Imaging and Data Sciences, Faculty of Biology Medicine and Health, The University of Manchester, Manchester, M13 9PT, UK.
- Alan Turing Institute, London, NW1 2DB, UK.
- Health Data Research UK, Gibbs Building, 215 Euston Road, London, NW1 2BE, UK.
| |
|
187
|
Das S, Rai A, Merchant ML, Cave MC, Rai SN. A Comprehensive Survey of Statistical Approaches for Differential Expression Analysis in Single-Cell RNA Sequencing Studies. Genes (Basel) 2021; 12:1947. [PMID: 34946896 PMCID: PMC8701051 DOI: 10.3390/genes12121947] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/29/2021] [Revised: 11/27/2021] [Accepted: 11/27/2021] [Indexed: 12/13/2022] Open
Abstract
Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expressions at the cell level. Differential Expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practices for addressing this involved borrowing methods from bulk RNA-seq, which are based on non-zero differences in average expressions of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we provide a review and classification of different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to obtain the method which will be best-performing globally through individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides proper guidelines for selecting the proper tool which performs best under particular experimental settings in the context of the scRNA-seq.
Affiliation(s)
- Samarendra Das
- Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
- Biostatistics and Bioinformatics Facility, JG Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
- School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
| | - Anil Rai
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, PUSA, New Delhi 110012, India
| | - Michael L. Merchant
- Department of Medicine, School of Medicine, University of Louisville, Louisville, KY 40202, USA
- Hepatobiology and Toxicology Center, University of Louisville, Louisville, KY 40202, USA
| | - Matthew C. Cave
- Biostatistics and Informatics Facility, Center for Integrative Environmental Health Sciences, University of Louisville, Louisville, KY 40202, USA
| | - Shesh N. Rai
- Biostatistics and Bioinformatics Facility, JG Brown Cancer Center, University of Louisville, Louisville, KY 40202, USA
- School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY 40292, USA
- Hepatobiology and Toxicology Center, University of Louisville, Louisville, KY 40202, USA
- Biostatistics and Informatics Facility, Center for Integrative Environmental Health Sciences, University of Louisville, Louisville, KY 40202, USA
- Christina Lee Brown Envirome Institute, University of Louisville, Louisville, KY 40202, USA
- Department of Bioinformatics and Biostatistics, School of Public Health and Information Science, University of Louisville, Louisville, KY 40202, USA
| |
|
188
|
Kim HJ, Wang K, Chen C, Lin Y, Tam PPL, Lin DM, Yang JYH, Yang P. Uncovering cell identity through differential stability with Cepo. Nat Comput Sci 2021; 1:784-790. [PMID: 38217190 DOI: 10.1038/s43588-021-00172-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/21/2021] [Accepted: 11/12/2021] [Indexed: 01/15/2024]
Abstract
The use of single-cell RNA-sequencing (scRNA-seq) allows observation of different cells at multi-tiered complexity in the same microenvironment. To get insights into cell identity using scRNA-seq data, we present Cepo, which generates cell-type-specific gene statistics of differentially stable genes from scRNA-seq data to define cell identity. When applied to multiple datasets, Cepo outperforms current methods in assigning cell identity and enhances several cell identification applications such as cell-type characterisation, spatial mapping of single cells and lineage inference of single cells.
Affiliation(s)
- Hani Jieun Kim
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales, Australia
- Computational Systems Biology Group, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
| | - Kevin Wang
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales, Australia
| | - Carissa Chen
- Computational Systems Biology Group, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
| | - Yingxin Lin
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
| | - Patrick P L Tam
- Embryology Unit, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia
- School of Medical Science, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia
| | - David M Lin
- Department of Biomedical Sciences, Cornell University, Ithaca, NY, USA
| | - Jean Y H Yang
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales, Australia
- Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia
| | - Pengyi Yang
- School of Mathematics and Statistics, The University of Sydney, Sydney, New South Wales, Australia.
- Computational Systems Biology Group, Children's Medical Research Institute, Faculty of Medicine and Health, The University of Sydney, Westmead, New South Wales, Australia.
- Charles Perkins Centre, The University of Sydney, Sydney, New South Wales, Australia.
- School of Medical Science, Faculty of Medicine and Health, The University of Sydney, Sydney, New South Wales, Australia.
| |
|
189
|
Yang P, Huang H, Liu C. Feature selection revisited in the single-cell era. Genome Biol 2021; 22:321. [PMID: 34847932 PMCID: PMC8638336 DOI: 10.1186/s13059-021-02544-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 07/26/2021] [Accepted: 11/15/2021] [Indexed: 12/13/2022] Open
Abstract
Recent advances in single-cell biotechnologies have resulted in high-dimensional datasets with increased complexity, making feature selection an essential technique for single-cell data analysis. Here, we revisit feature selection techniques and summarise recent developments. We review their application to a range of single-cell data types generated from traditional cytometry and imaging technologies and the latest array of single-cell omics technologies. We highlight some of the challenges and future directions and finally consider their scalability and make general recommendations on each type of feature selection method. We hope this review stimulates future research and application of feature selection in the single-cell era.
Affiliation(s)
- Pengyi Yang
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia.
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia.
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia.
| | - Hao Huang
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia
| | - Chunlei Liu
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia
| |
|
190
|
Ling W, Zhang W, Cheng B, Wei Y. Zero-inflated quantile rank-score based test (ZIQRank) with application to scRNA-seq differential gene expression analysis. Ann Appl Stat 2021; 15:1673-1696. [DOI: 10.1214/21-aoas1442] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Affiliation(s)
- Wodan Ling
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center
| | | | - Bin Cheng
- Department of Biostatistics, Columbia University
| | - Ying Wei
- Department of Biostatistics, Columbia University
| |
|
191
|
Bouland GA, Mahfouz A, Reinders MJT. Differential analysis of binarized single-cell RNA sequencing data captures biological variation. NAR Genom Bioinform 2021; 3:lqab118. [PMID: 34988441 PMCID: PMC8693570 DOI: 10.1093/nargab/lqab118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/28/2021] [Revised: 11/04/2021] [Accepted: 12/03/2021] [Indexed: 12/11/2022] Open
Abstract
Single-cell RNA sequencing data is characterized by a large number of zero counts, yet there is growing evidence that these zeros reflect biological variation rather than technical artifacts. We propose to use binarized expression profiles to identify the effects of biological variation in single-cell RNA sequencing data. Using 16 publicly available and simulated datasets, we show that a binarized representation of single-cell expression data accurately represents biological variation and reveals the relative abundance of transcripts more robustly than counts.
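As a rough illustration of what working with binarized expression can look like, the sketch below reduces each count to detected (1) or not detected (0) and compares detection fractions between two groups of cells. The two-proportion z-test, the function names, and the toy counts are assumptions chosen for illustration; the abstract does not specify the statistical test the authors used.

```python
import math

def binarize(counts):
    """Map each count to 1 (gene detected) or 0 (zero count)."""
    return [1 if c > 0 else 0 for c in counts]

def detection_z_test(counts_a, counts_b):
    """Two-proportion z-test on detection fractions (an illustrative choice,
    not necessarily the test used in the cited paper).
    Returns (fraction_a, fraction_b, z, two_sided_p)."""
    a, b = binarize(counts_a), binarize(counts_b)
    n_a, n_b = len(a), len(b)
    frac_a, frac_b = sum(a) / n_a, sum(b) / n_b
    pooled = (sum(a) + sum(b)) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Gene detected in all cells or in none: no evidence of a difference.
        return frac_a, frac_b, 0.0, 1.0
    z = (frac_a - frac_b) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return frac_a, frac_b, z, p

# Hypothetical counts of one gene in two groups of ten cells each.
group_a = [0, 3, 1, 0, 5, 2, 7, 1, 0, 4]
group_b = [0, 0, 0, 1, 0, 0, 0, 0, 2, 0]
frac_a, frac_b, z, p = detection_z_test(group_a, group_b)
print(f"detected in {frac_a:.0%} vs {frac_b:.0%} of cells, z={z:.2f}, p={p:.3f}")
```

Only the zero/non-zero pattern enters the test, so the magnitude of the non-zero counts has no influence on the result.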
Affiliation(s)
- Gerard A Bouland
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628 XE, The Netherlands
| | - Ahmed Mahfouz
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628 XE, The Netherlands
| | - Marcel J T Reinders
- Delft Bioinformatics Lab, Delft University of Technology, Delft 2628 XE, The Netherlands
| |
|
192
|
Fujii T, Maehara K, Fujita M, Ohkawa Y. Discriminative feature of cells characterizes cell populations of interest by a small subset of genes. PLoS Comput Biol 2021; 17:e1009579. [PMID: 34797848 PMCID: PMC8641884 DOI: 10.1371/journal.pcbi.1009579] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/24/2021] [Revised: 12/03/2021] [Accepted: 10/19/2021] [Indexed: 12/13/2022] Open
Abstract
Organisms are composed of various cell types with specific states. To obtain a comprehensive understanding of the functions of organs and tissues, cell types have been classified and defined by identifying specific marker genes. Statistical tests are critical for identifying marker genes, which often involve evaluating differences in the mean expression levels of genes. Differentially expressed gene (DEG)-based analysis has been the most frequently used method of this kind. However, in association with increases in sample size such as in single-cell analysis, DEG-based analysis has faced difficulties associated with the inflation of P-values. Here, we propose the concept of discriminative feature of cells (DFC), an alternative to using DEG-based approaches. We implemented DFC using logistic regression with an adaptive LASSO penalty to perform binary classification for discriminating a population of interest and variable selection to obtain a small subset of defining genes. We demonstrated that DFC prioritized gene pairs with non-independent expression using artificial data and that DFC enabled characterization of the muscle satellite/progenitor cell population. The results revealed that DFC well captured cell-type-specific markers, specific gene expression patterns, and subcategories of this cell population. DFC may complement DEG-based methods for interpreting large data sets. DEG-based analysis uses lists of genes with differences in expression between groups, while DFC, which can be termed a discriminative approach, has potential applications in the task of cell characterization. Upon recent advances in the high-throughput analysis of single cells, methods of cell characterization such as scRNA-seq can be effectively subjected to the discriminative methods. Statistical methods for detecting differences in individual gene expression are indispensable for understanding cell types. 
However, conventional statistical methods, such as differentially expressed gene (DEG)-based analysis, have faced difficulties associated with the inflation of P-values because of both the large sample size and selection bias introduced by exploratory data analysis such as single-cell transcriptomics. Here, we propose the concept of discriminative feature of cells (DFC), an alternative to using DEG-based approaches. We implemented DFC using logistic regression with an adaptive LASSO penalty to perform binary classification for the discrimination of a population of interest and variable selection to obtain a small subset of defining genes. We demonstrated that DFC prioritized gene pairs with non-independent expression using artificial data, and that it enabled characterization of the muscle satellite/progenitor cell population. The results revealed that DFC well captured cell-type-specific markers, specific gene expression patterns, and subcategories of this cell population. DFC may complement differentially expressed gene-based methods for interpreting large data sets.
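Illustrative sketch (not the authors' implementation): the abstract names logistic regression with an adaptive LASSO penalty as the basis of DFC. The plain-Python outline below shows one common way to fit such a model: a ridge-penalized pilot fit, per-gene weights that are large where the pilot coefficient is small, then proximal gradient descent on the weighted L1 problem. The function names, the toy data, the absence of an intercept, and the tuning values (lam, ridge, lr, n_iter) are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def gradient(X, y, beta):
    # Gradient of the mean logistic loss with respect to beta (no intercept).
    n, p = len(X), len(beta)
    grad = [0.0] * p
    for row, label in zip(X, y):
        err = sigmoid(sum(b * v for b, v in zip(beta, row))) - label
        for j in range(p):
            grad[j] += err * row[j] / n
    return grad


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def adaptive_lasso_logistic(X, y, lam=0.05, ridge=0.01, lr=0.5, n_iter=300):
    p = len(X[0])
    # Step 1: ridge-penalized pilot fit by plain gradient descent.
    pilot = [0.0] * p
    for _ in range(n_iter):
        g = gradient(X, y, pilot)
        pilot = [b - lr * (gj + ridge * b) for b, gj in zip(pilot, g)]
    # Step 2: adaptive weights, large where the pilot coefficient is small.
    weights = [1.0 / (abs(b) + 1e-8) for b in pilot]
    # Step 3: proximal gradient descent (ISTA) on the weighted L1 problem.
    beta = [0.0] * p
    for _ in range(n_iter):
        g = gradient(X, y, beta)
        beta = [soft_threshold(b - lr * gj, lr * lam * w)
                for b, gj, w in zip(beta, g, weights)]
    return beta


# Toy data (hypothetical): genes 0 and 1 define the population, genes 2 and 3 are noise.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(300)]
y = [1 if rng.random() < sigmoid(2 * r[0] - 2 * r[1]) else 0 for r in X]
coef = adaptive_lasso_logistic(X, y)
print([round(c, 3) for c in coef])
```

In this toy setting the weighted penalty is expected to keep the two informative genes and shrink the noise genes to or near zero, which is the variable-selection behaviour the abstract describes.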
Affiliation(s)
- Takeru Fujii
- Division of Transcriptomics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
- Department of Cellular Biochemistry, Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka, Japan
| | - Kazumitsu Maehara
- Division of Transcriptomics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
| | - Masatoshi Fujita
- Department of Cellular Biochemistry, Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka, Japan
| | - Yasuyuki Ohkawa
- Division of Transcriptomics, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
|
193
|
Millard N, Korsunsky I, Weinand K, Fonseka CY, Nathan A, Kang JB, Raychaudhuri S. Maximizing statistical power to detect differentially abundant cell states with scPOST. Cell Rep Methods 2021; 1:100120. [PMID: 35005693 PMCID: PMC8740883 DOI: 10.1016/j.crmeth.2021.100120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Revised: 07/27/2021] [Accepted: 10/27/2021] [Indexed: 11/30/2022]
Abstract
To estimate a study design's power to detect differential abundance, we require a framework that simulates many multi-sample single-cell datasets. However, current simulation methods are challenging for large-scale power analyses because they are computationally resource intensive and do not support easy simulation of multi-sample datasets. Current methods also lack modeling of important inter-sample variation, such as the variation in the frequency of cell states between samples that is observed in single-cell data. Thus, we developed single-cell POwer Simulation Tool (scPOST) to address these limitations and help investigators quickly simulate multi-sample single-cell datasets. Users may explore a range of effect sizes and study design choices (such as increasing the number of samples or cells per sample) to determine their effect on power, and thus choose the optimal study design for their planned experiments.
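Illustrative sketch (not scPOST itself): the idea of estimating power by simulating many multi-sample datasets can be outlined with the Python standard library alone. Each simulated sample draws its own cell-state frequency from a beta distribution, which stands in for inter-sample variation, and a two-group comparison is repeated over many simulated studies. The function names, the beta-binomial sampling model, the normal approximation used for the test, and all default values are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def simulate_group(rng, n_samples, n_cells, mean_freq, concentration):
    # Each sample gets its own cell-state frequency (inter-sample variation),
    # then every cell is assigned to the state with that probability.
    a = mean_freq * concentration
    b = (1.0 - mean_freq) * concentration
    freqs = []
    for _ in range(n_samples):
        p = rng.betavariate(a, b)
        hits = sum(1 for _ in range(n_cells) if rng.random() < p)
        freqs.append(hits / n_cells)
    return freqs


def welch_p_value(x, y):
    # Two-sided Welch-type test using a normal approximation (a simplification
    # that is slightly anti-conservative for small numbers of samples).
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0.0:
        return 1.0
    z = (fmean(x) - fmean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def estimate_power(fold_change, n_samples=8, n_cells=200, base_freq=0.1,
                   concentration=50.0, alpha=0.05, n_sim=200, seed=1):
    # Power = fraction of simulated studies in which the test rejects at alpha.
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        ctrl = simulate_group(rng, n_samples, n_cells, base_freq, concentration)
        case = simulate_group(rng, n_samples, n_cells,
                              base_freq * fold_change, concentration)
        if welch_p_value(ctrl, case) < alpha:
            rejections += 1
    return rejections / n_sim


power_null = estimate_power(1.0)    # no true difference: approximates the false-positive rate
power_strong = estimate_power(3.0)  # threefold increase in the cell state's frequency
print(power_null, power_strong)
```

Varying n_samples, n_cells, or fold_change in estimate_power mirrors the kind of design exploration the abstract describes, although a real analysis would rely on the published tool and its fitted variance estimates.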
Affiliation(s)
- Nghia Millard
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Ilya Korsunsky
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Kathryn Weinand
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Chamith Y. Fonseka
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Aparna Nathan
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Joyce B. Kang
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Boston, MA 02115, USA
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Versus Arthritis Centre for Genetics and Genomics, Centre for Musculoskeletal Research, Manchester Academic Health Science Centre, The University of Manchester, Manchester 46962, UK
|
194
|
Schmid KT, Höllbacher B, Cruceanu C, Böttcher A, Lickert H, Binder EB, Theis FJ, Heinig M. scPower accelerates and optimizes the design of multi-sample single cell transcriptomic studies. Nat Commun 2021; 12:6625. [PMID: 34785648 PMCID: PMC8595682 DOI: 10.1038/s41467-021-26779-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 03/15/2021] [Accepted: 10/22/2021] [Indexed: 12/13/2022] Open
Abstract
Single cell RNA-seq has revolutionized transcriptomics by providing cell type resolution for differential gene expression and expression quantitative trait loci (eQTL) analyses. However, efficient power analysis methods for single cell data and inter-individual comparisons are lacking. Here, we present scPower, a statistical framework for the design and power analysis of multi-sample single cell transcriptomic experiments. We modelled the relationship between sample size, the number of cells per individual, sequencing depth, and the power of detecting differentially expressed genes within cell types. We systematically evaluated these optimal parameter combinations for several single cell profiling platforms, and generated broad recommendations. In general, shallow sequencing of high numbers of cells leads to higher overall power than deep sequencing of fewer cells. The model, including priors, is implemented as an R package and is accessible as a web tool. scPower is a highly customizable tool that experimentalists can use to quickly compare a multitude of experimental designs and optimize for a limited budget.
Affiliation(s)
- Katharina T Schmid
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Department of Informatics, Technical University Munich, Munich, Germany
| | - Barbara Höllbacher
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Department of Informatics, Technical University Munich, Munich, Germany
| | - Cristiana Cruceanu
- Department of Translational Research, Max Planck Institute for Psychiatry, Munich, Germany
| | - Anika Böttcher
- Institute of Diabetes and Regeneration Research, Helmholtz Diabetes Center, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- School of Medicine, Technical University of Munich, Munich, Germany
| | - Heiko Lickert
- Institute of Diabetes and Regeneration Research, Helmholtz Diabetes Center, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- School of Medicine, Technical University of Munich, Munich, Germany
| | - Elisabeth B Binder
- Department of Translational Research, Max Planck Institute for Psychiatry, Munich, Germany
- Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Georgia, USA
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Department of Mathematics, Technical University Munich, Munich, Germany
| | - Matthias Heinig
- Institute of Computational Biology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany.
- Department of Informatics, Technical University Munich, Munich, Germany.
|
195
|
Pascual G, Domínguez D, Elosúa-Bayes M, Beckedorff F, Laudanna C, Bigas C, Douillet D, Greco C, Symeonidi A, Hernández I, Gil SR, Prats N, Bescós C, Shiekhattar R, Amit M, Heyn H, Shilatifard A, Benitah SA. Dietary palmitic acid promotes a prometastatic memory via Schwann cells. Nature 2021; 599:485-490. [PMID: 34759321 DOI: 10.1038/s41586-021-04075-0] [Citation(s) in RCA: 121] [Impact Index Per Article: 40.3] [Received: 04/03/2020] [Accepted: 09/30/2021] [Indexed: 11/09/2022]
Abstract
Fatty acid uptake and altered metabolism constitute hallmarks of metastasis, yet evidence of the underlying biology, as well as whether all dietary fatty acids are prometastatic, is lacking. Here we show that dietary palmitic acid (PA), but not oleic acid or linoleic acid, promotes metastasis in oral carcinomas and melanoma in mice. Tumours from mice that were fed a short-term palm-oil-rich diet (PA), or tumour cells that were briefly exposed to PA in vitro, remained highly metastatic even after being serially transplanted (without further exposure to high levels of PA). This PA-induced prometastatic memory requires the fatty acid transporter CD36 and is associated with the stable deposition of histone H3 lysine 4 trimethylation by the methyltransferase Set1A (as part of the COMPASS complex (Set1A/COMPASS)). Bulk, single-cell and positional RNA-sequencing analyses indicate that genes with this prometastatic memory predominantly relate to a neural signature that stimulates intratumoural Schwann cells and innervation, two parameters that are strongly correlated with metastasis but are aetiologically poorly understood. Mechanistically, tumour-associated Schwann cells secrete a specialized proregenerative extracellular matrix, the ablation of which inhibits metastasis initiation. Both the PA-induced memory of this proneural signature and its long-term boost in metastasis require the transcription factor EGR2 and the glial-cell-stimulating peptide galanin. In summary, we provide evidence that a dietary metabolite induces stable transcriptional and chromatin changes that lead to a long-term stimulation of metastasis, and that this is related to a proregenerative state of tumour-activated Schwann cells.
Affiliation(s)
- Gloria Pascual
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain.
| | - Diana Domínguez
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Marc Elosúa-Bayes
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Felipe Beckedorff
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Carmelo Laudanna
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Claudia Bigas
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Delphine Douillet
- Department of Biochemistry and Molecular Genetics and Simpson Querrey Center for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
| | - Carolina Greco
- Center for Epigenetics and Metabolism, Department of Biological Chemistry, University of California, Irvine, CA, USA
| | - Aikaterini Symeonidi
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Inmaculada Hernández
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Sara Ruiz Gil
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Neus Prats
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Coro Bescós
- Department of Oral and Maxillofacial Surgery, Vall D'Hebron Hospital, Barcelona, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Ramin Shiekhattar
- Sylvester Comprehensive Cancer Center, University of Miami Miller School of Medicine, Miami, FL, USA
| | - Moran Amit
- Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Holger Heyn
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Ali Shilatifard
- Department of Biochemistry and Molecular Genetics and Simpson Querrey Center for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.
| | - Salvador Aznar Benitah
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- ICREA, Catalan Institution for Research and Advanced Studies, Barcelona, Spain
|
196
|
Abstract
SARS-CoV-2, the etiological agent of COVID-19, is characterized by a delay in type I interferon (IFN-I)-mediated antiviral defenses alongside robust cytokine production. Here, we investigate the underlying molecular basis for this imbalance and implicate virus-mediated activation of NF-κB in the absence of other canonical IFN-I-related transcription factors. Epigenetic and single-cell transcriptomic analyses show a selective NF-κB signature that was most prominent in infected cells. Disruption of NF-κB signaling through the silencing of the NF-κB transcription factor p65 or p50 resulted in loss of virus replication that was rescued upon reconstitution. These findings could be further corroborated with the use of NF-κB inhibitors, which reduced SARS-CoV-2 replication in vitro. These data suggest that the robust cytokine production in response to SARS-CoV-2, despite a diminished IFN-I response, is the product of a dependency on NF-κB for viral replication. IMPORTANCE The COVID-19 pandemic has caused significant mortality and morbidity around the world. Although effective vaccines have been developed, large parts of the world remain unvaccinated while new SARS-CoV-2 variants keep emerging. Furthermore, despite extensive efforts and large-scale drug screenings, no fully effective antiviral treatment options have been discovered yet. Therefore, it is of the utmost importance to gain a better understanding of essential factors driving SARS-CoV-2 replication to be able to develop novel approaches to target SARS-CoV-2 biology.
|
197
|
Wang L. Single-cell normalization and association testing unifying CRISPR screen and gene co-expression analyses with Normalisr. Nat Commun 2021; 12:6395. [PMID: 34737291 PMCID: PMC8568964 DOI: 10.1038/s41467-021-26682-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/08/2021] [Accepted: 10/19/2021] [Indexed: 12/13/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) provides unprecedented technical and statistical potential to study gene regulation but is subject to technical variations and sparsity. Furthermore, statistical association testing remains difficult for scRNA-seq. Here we present Normalisr, a normalization and statistical association testing framework that unifies single-cell differential expression, co-expression, and CRISPR screen analyses with linear models. By systematically detecting and removing nonlinear confounders arising from library size at mean and variance levels, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased p-value estimation. The superior scalability allows us to reconstruct robust gene regulatory networks from trans-effects of guide RNAs in large-scale single cell CRISPRi screens. On conventional scRNA-seq, Normalisr recovers gene-level co-expression networks that recapitulated known gene functions.
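Illustrative sketch (not Normalisr's algorithm): the general idea of removing a library-size confounder before testing association with a linear model can be shown on toy data. Two simulated genes share no direct relationship, yet both scale with sequencing depth, so their raw correlation is high. Regressing each gene on the depth covariate and correlating the residuals removes that spurious signal. The abstract describes nonlinear confounders at mean and variance levels; this sketch handles only a single linear mean effect, and all names and simulated values are hypothetical.

```python
import math
import random
from statistics import fmean


def residualize(values, covariate):
    # Remove the least-squares linear effect of one covariate from values.
    mv, mc = fmean(values), fmean(covariate)
    sxy = sum((c - mc) * (v - mv) for c, v in zip(covariate, values))
    sxx = sum((c - mc) ** 2 for c in covariate)
    slope = sxy / sxx
    return [(v - mv) - slope * (c - mc) for c, v in zip(covariate, values)]


def pearson(x, y):
    # Pearson correlation coefficient of two equally long sequences.
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (hypothetical): both genes depend on log library size and nothing else.
rng = random.Random(0)
log_depth = [rng.gauss(0.0, 1.0) for _ in range(500)]
gene_a = [d + rng.gauss(0.0, 0.5) for d in log_depth]
gene_b = [d + rng.gauss(0.0, 0.5) for d in log_depth]

raw_r = pearson(gene_a, gene_b)
adjusted_r = pearson(residualize(gene_a, log_depth),
                     residualize(gene_b, log_depth))
print(round(raw_r, 3), round(adjusted_r, 3))
```

The adjusted value is a partial correlation given log depth; co-expression called from raw_r alone would largely reflect the shared technical covariate.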
Affiliation(s)
- Lingfei Wang
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA.
- Molecular Pathology Unit and Center for Cancer Research, Massachusetts General Hospital Research Institute, Charlestown, MA, USA.
|
198
|
Sommerfeld L, Finkernagel F, Jansen JM, Wagner U, Nist A, Stiewe T, Müller‐Brüsselbach S, Sokol AM, Graumann J, Reinartz S, Müller R. The multicellular signalling network of ovarian cancer metastases. Clin Transl Med 2021; 11:e633. [PMID: 34841720 PMCID: PMC8574964 DOI: 10.1002/ctm2.633] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 07/29/2021] [Revised: 10/08/2021] [Accepted: 10/15/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Transcoelomic spread is the major route of metastasis of ovarian high-grade serous carcinoma (HGSC) with the omentum as the major metastatic site. Its unique tumour microenvironment with its large populations of adipocytes, mesothelial cells and immune cells establishes an intercellular signaling network that is instrumental for metastatic growth yet poorly understood. METHODS Based on transcriptomic analysis of tumour cells, tumour-associated immune and stroma cells we defined intercellular signaling pathways for 284 cytokines and growth factors and their cognate receptors after bioinformatic adjustment for contaminating cell types. The significance of individual components of this network was validated by analysing clinical correlations and potentially pro-metastatic functions, including tumour cell migration, pro-inflammatory signal transduction and TAM expansion. RESULTS The data show an unexpected prominent role of host cells, and in particular of omental adipocytes, mesothelial cells and fibroblasts (CAF), in sustaining this signaling network. These cells, rather than tumour cells, are the major source of most cytokines and growth factors in the omental microenvironment (n = 176 vs. n = 13). Many of these factors target tumour cells, are linked to metastasis and are associated with a short survival. Likewise, tumour stroma cells play a major role in extracellular-matrix-triggered signaling. We have verified the functional significance of our observations for three exemplary instances. We show that the omental microenvironment (i) stimulates tumour cell migration and adhesion via WNT4 which is highly expressed by CAF; (ii) induces pro-tumourigenic TAM proliferation in conjunction with high CSF1 expression by omental stroma cells and (iii) triggers pro-inflammatory signaling, at least in part via a HSP70-NF-κB pathway. 
CONCLUSIONS The intercellular signaling network of omental metastases depends largely on factors secreted by immune and stroma cells, which provide an environment that supports ovarian HGSC progression. Clinically relevant pathways within this network represent novel options for therapeutic intervention.
Affiliation(s)
- Leah Sommerfeld
- Department of Translational Oncology, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
| | - Florian Finkernagel
- Department of Translational Oncology, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
| | - Julia M. Jansen
- Clinic for Gynecology, Gynecological Oncology and Gynecological Endocrinology, University Hospital (UKGM), Marburg, Germany
| | - Uwe Wagner
- Clinic for Gynecology, Gynecological Oncology and Gynecological Endocrinology, University Hospital (UKGM), Marburg, Germany
| | - Andrea Nist
- Genomics Core Facility, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
| | - Thorsten Stiewe
- Genomics Core Facility, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
- Institute of Molecular Oncology, Philipps University, Marburg, Germany
| | - Sabine Müller‐Brüsselbach
- Department of Translational Oncology, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
| | - Anna M. Sokol
- The German Centre for Cardiovascular Research (DZHK), Partner Site Rhine‐Main, Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
| | - Johannes Graumann
- The German Centre for Cardiovascular Research (DZHK), Partner Site Rhine‐Main, Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Institute for Translational Proteomics, Philipps University, Marburg, Germany
| | - Silke Reinartz
- Department of Translational Oncology, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
| | - Rolf Müller
- Department of Translational Oncology, Center for Tumor Biology and Immunology (ZTI), Philipps University, Marburg, Germany
|
199
|
Single-Cell RNA-Sequencing Identifies Infrapatellar Fat Pad Macrophage Polarization in Acute Synovitis/Fat Pad Fibrosis and Cell Therapy. Bioengineering (Basel) 2021; 8:bioengineering8110166. [PMID: 34821732 PMCID: PMC8615266 DOI: 10.3390/bioengineering8110166] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/09/2021] [Revised: 10/22/2021] [Accepted: 10/26/2021] [Indexed: 12/13/2022] Open
Abstract
The pathogenesis and progression of knee inflammatory pathologies are modulated partly by resident macrophages in the infrapatellar fat pad (IFP); thus, macrophage polarization towards pro-inflammatory (M1) or anti-inflammatory (M2) phenotypes is important in joint disease pathologies. Alteration of the M1/M2 balance contributes to the initiation and progression of joint inflammation and can potentially be altered with mesenchymal stem cell (MSC) therapy. In a rat model of acute synovial/IFP inflammation, a single intra-articular injection of IFP-MSC was performed, with (1) diseased rats not receiving IFP-MSC and (2) non-diseased rats as controls. After 4 days, cell-specific transcriptional profiling via single-cell RNA-sequencing was performed on isolated IFP tissue from each group. Eight transcriptomically distinct cell populations were identified within the IFP across all three treatment groups, with a noted difference in the proportion of myeloid cells across the groups. Myeloid cells consisted largely of macrophages (>90%): one M1 sub-cluster highly expressing pro-inflammatory markers and two M2 sub-clusters, one of which expressed higher levels of canonical M2 markers. Notably, the diseased samples (11.9%) had the lowest proportion of cells expressing M2 markers relative to healthy (14.8%) and MSC-treated (19.4%) samples. These results suggest a phenotypic polarization of IFP macrophages towards the pro-inflammatory M1 phenotype in an acute model of inflammation, which is alleviated by IFP-MSC therapy inducing a switch towards an alternate M2 status. Understanding the IFP cellular heterogeneity and associated transcriptional programs may offer insights into novel therapeutic strategies for disabling joint disease pathologies.
|
200
|
Li H, Zhu B, Xu Z, Adams T, Kaminski N, Zhao H. A Markov random field model for network-based differential expression analysis of single-cell RNA-seq data. BMC Bioinformatics 2021; 22:524. [PMID: 34702190 PMCID: PMC8549347 DOI: 10.1186/s12859-021-04412-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/25/2020] [Accepted: 09/15/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Recent development of single cell sequencing technologies has made it possible to identify genes with different expression (DE) levels at the cell type level between different groups of samples. In this article, we propose to borrow information through known biological networks to increase statistical power to identify differentially expressed genes (DEGs). RESULTS We develop MRFscRNAseq, which is based on a Markov random field (MRF) model to appropriately accommodate gene network information as well as dependencies among cell types to identify cell-type specific DEGs. We implement an Expectation-Maximization (EM) algorithm with mean field-like approximation to estimate model parameters and a Gibbs sampler to infer DE status. Simulation study shows that our method has better power to detect cell-type specific DEGs than conventional methods while appropriately controlling type I error rate. The usefulness of our method is demonstrated through its application to study the pathogenesis and biological processes of idiopathic pulmonary fibrosis (IPF) using a single-cell RNA-sequencing (scRNA-seq) data set, which contains 18,150 protein-coding genes across 38 cell types on lung tissues from 32 IPF patients and 28 normal controls. CONCLUSIONS The proposed MRF model is implemented in the R package MRFscRNAseq available on GitHub. By utilizing gene-gene and cell-cell networks, our method increases statistical power to detect differentially expressed genes from scRNA-seq data.
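Illustrative sketch (not the MRFscRNAseq implementation): the abstract mentions a Gibbs sampler for inferring DE status on a Markov random field. The toy example below samples binary DE states on a small gene network in which each gene's conditional log-odds combine its own evidence with the states of its neighbours. It omits the EM parameter estimation and the cell-type dependencies described in the abstract; the function name, the pairwise model form, the coupling value, and the five-gene network are assumptions for illustration only.

```python
import math
import random


def gibbs_de_posterior(evidence, edges, coupling=1.0, n_sweeps=2000,
                       burn_in=200, seed=0):
    # evidence[g]: per-gene log-odds of being DE from the data alone (hypothetical input).
    # edges: undirected gene-gene network as pairs of gene indices.
    rng = random.Random(seed)
    n = len(evidence)
    neighbors = [[] for _ in range(n)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    state = [1 if h > 0 else 0 for h in evidence]
    counts = [0] * n
    for sweep in range(burn_in + n_sweeps):
        for g in range(n):
            # Conditional log-odds: own evidence plus +coupling for each DE
            # neighbour and -coupling for each non-DE neighbour.
            field = evidence[g] + coupling * sum(2 * state[m] - 1
                                                 for m in neighbors[g])
            prob_de = 1.0 / (1.0 + math.exp(-field))
            state[g] = 1 if rng.random() < prob_de else 0
        if sweep >= burn_in:
            for g in range(n):
                counts[g] += state[g]
    # Posterior DE probability estimated as the fraction of retained sweeps in state 1.
    return [c / n_sweeps for c in counts]


# Gene 0 has ambiguous evidence but three strongly DE neighbours;
# gene 4 has the same ambiguous evidence and no neighbours.
evidence = [0.0, 3.0, 3.0, 3.0, 0.0]
edges = [(0, 1), (0, 2), (0, 3)]
posterior = gibbs_de_posterior(evidence, edges)
print([round(p, 3) for p in posterior])
```

Comparing genes 0 and 4 shows the borrowing of information through the network: identical data-level evidence yields a higher estimated DE probability for the gene connected to DE neighbours.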
Affiliation(s)
- Hongyu Li
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT 06511 USA
| | - Biqing Zhu
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511 USA
| | - Zhichao Xu
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT 06511 USA
| | - Taylor Adams
- Section of Pulmonary, Critical Care and Sleep Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT 06520 USA
| | - Naftali Kaminski
- Section of Pulmonary, Critical Care and Sleep Medicine, Department of Internal Medicine, Yale School of Medicine, New Haven, CT 06520 USA
| | - Hongyu Zhao
- Department of Biostatistics, School of Public Health, Yale University, New Haven, CT 06511 USA
- Program of Computational Biology and Bioinformatics, Yale University, New Haven, CT 06511 USA
|