1
Protti G, Spreafico R. A primer on single-cell RNA-seq analysis using dendritic cells as a case study. FEBS Lett 2024. [PMID: 39245787 DOI: 10.1002/1873-3468.15009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 07/18/2024] [Accepted: 08/12/2024] [Indexed: 09/10/2024]
Abstract
Recent advances in single-cell (sc) transcriptomics have revolutionized our understanding of dendritic cells (DCs), pivotal players of the immune system. ScRNA-sequencing (scRNA-seq) has unraveled a previously unrecognized complexity and heterogeneity of DC subsets, shedding light on their ontogeny and specialized roles. However, navigating the rapid technological progress and computational methods can be daunting for researchers unfamiliar with the field. This review aims to provide immunologists with a comprehensive introduction to sc transcriptomic analysis, offering insights into recent developments in DC biology. Addressing common analytical queries, we guide readers through popular tools and methodologies, supplemented with references to benchmarks and tutorials for in-depth understanding. By examining findings from pioneering studies, we illustrate how computational techniques have expanded our knowledge of DC biology. Through this synthesis, we aim to equip researchers with the necessary tools and knowledge to navigate and leverage scRNA-seq for unraveling the intricacies of DC biology and advancing immunological research.
Affiliation(s)
- Giulia Protti
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy
- Roberto Spreafico
  - Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
2
Zhao Y, Kohl C, Rosebrock D, Hu Q, Hu Y, Vingron M. CAbiNet: joint clustering and visualization of cells and genes for single-cell transcriptomics. Nucleic Acids Res 2024; 52:e57. [PMID: 38850160 PMCID: PMC11260446 DOI: 10.1093/nar/gkae480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 04/10/2024] [Accepted: 05/27/2024] [Indexed: 06/10/2024] Open
Abstract
A fundamental analysis task for single-cell transcriptomics data is clustering with subsequent visualization of cell clusters. The genes responsible for the clustering are only inferred in a subsequent step. Clustering cells and genes together would be the remit of biclustering algorithms, which are often bogged down by the size of single-cell data. Here we present 'Correspondence Analysis based Biclustering on Networks' (CAbiNet) for joint clustering and visualization of single-cell RNA-sequencing data. CAbiNet performs efficient co-clustering of cells and their respective marker genes and jointly visualizes the biclusters in a non-linear embedding for easy and interactive visual exploration of the data.
Affiliation(s)
- Yan Zhao
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
  - Department of Pharmacology, School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
  - Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
- Clemens Kohl
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Daniel Rosebrock
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
- Qinan Hu
  - Department of Pharmacology, School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
  - Joint Laboratory of Guangdong-Hong Kong Universities for Vascular Homeostasis and Diseases, School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
  - Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
- Yuhui Hu
  - Department of Pharmacology, School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
  - Joint Laboratory of Guangdong-Hong Kong Universities for Vascular Homeostasis and Diseases, School of Medicine, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
  - Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, 1088 Xueyuan Avenue, Shenzhen 518055, Guangdong, P.R. China
- Martin Vingron
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63-73, 14195 Berlin, Germany
3
Canzar S, Do VH, Jelić S, Laue S, Matijević D, Prusina T. Metric multidimensional scaling for large single-cell datasets using neural networks. Algorithms Mol Biol 2024; 19:21. [PMID: 38863064 PMCID: PMC11165904 DOI: 10.1186/s13015-024-00265-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 05/22/2024] [Indexed: 06/13/2024] Open
Abstract
Metric multidimensional scaling is one of the classical methods for embedding data into low-dimensional Euclidean space. It creates the low-dimensional embedding by approximately preserving the pairwise distances between the input points. However, current state-of-the-art approaches only scale to a few thousand data points. For larger data sets such as those occurring in single-cell RNA sequencing experiments, the running time becomes prohibitively large and thus alternative methods such as PCA are widely used instead. Here, we propose a simple neural network-based approach for solving the metric multidimensional scaling problem that is orders of magnitude faster than previous state-of-the-art approaches, and hence scales to data sets with up to a few million cells. At the same time, it provides a non-linear mapping between high- and low-dimensional space that can place previously unseen cells in the same embedding.
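The idea of "approximately preserving the pairwise distances" can be made concrete with a small sketch. The code below is not the authors' neural-network method. It only computes the raw stress, the classical sum of squared differences between original and embedded pairwise distances that metric MDS seeks to keep small; the exact loss used in the paper may differ. The function names and the toy points are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def raw_stress(high_dim, low_dim):
    """Sum over all point pairs of the squared difference between the
    distance in the original space and the distance in the embedding."""
    if len(high_dim) != len(low_dim):
        raise ValueError("both inputs must describe the same points")
    return sum(
        (euclidean(high_dim[i], high_dim[j]) - euclidean(low_dim[i], low_dim[j])) ** 2
        for i, j in combinations(range(len(high_dim)), 2)
    )


# Toy data (hypothetical): three collinear points in 3-D.
points_3d = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
perfect_1d = [(0.0,), (1.0,), (3.0,)]    # preserves every pairwise distance
distorted_1d = [(0.0,), (1.0,), (2.0,)]  # shortens two of the three distances

stress_perfect = raw_stress(points_3d, perfect_1d)
stress_distorted = raw_stress(points_3d, distorted_1d)
```

An embedding that reproduces every distance has zero stress; any distortion raises it. The loop is quadratic in the number of points, which hints at why naive approaches struggle on large datasets.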
Affiliation(s)
- Stefan Canzar
  - Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany
- Van Hoan Do
  - Center for Applied Mathematics and Informatics, Le Quy Don Technical University, Hanoi, Vietnam
- Slobodan Jelić
  - School of Applied Mathematics and Informatics, University of Osijek, Osijek, Croatia
- Sören Laue
  - Department of Informatics, Universität Hamburg, Hamburg, Germany
- Domagoj Matijević
  - School of Applied Mathematics and Informatics, University of Osijek, Osijek, Croatia
- Tomislav Prusina
  - Department of Informatics, Universität Hamburg, Hamburg, Germany
4
Roux de Bézieux H, Street K, Fischer S, Van den Berge K, Chance R, Risso D, Gillis J, Ngai J, Purdom E, Dudoit S. Improving replicability in single-cell RNA-Seq cell type discovery with Dune. BMC Bioinformatics 2024; 25:198. [PMID: 38789920 PMCID: PMC11127396 DOI: 10.1186/s12859-024-05814-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 05/17/2024] [Indexed: 05/26/2024] Open
Abstract
BACKGROUND: Single-cell transcriptome sequencing (scRNA-Seq) has allowed new types of investigations at unprecedented levels of resolution. Among the primary goals of scRNA-Seq is the classification of cells into distinct types. Many approaches build on existing clustering literature to develop tools specific to single-cell. However, almost all of these methods rely on heuristics or user-supplied parameters to control the number of clusters. This affects both the resolution of the clusters within the original dataset as well as their replicability across datasets. While many recommendations exist, in general, there is little assurance that any given set of parameters will represent an optimal choice in the trade-off between cluster resolution and replicability. For instance, another set of parameters may result in more clusters that are also more replicable.
RESULTS: Here, we propose Dune, a new method for optimizing the trade-off between the resolution of the clusters and their replicability. Our method takes as input a set of clustering results (or partitions) on a single dataset and iteratively merges clusters within each partition in order to maximize their concordance between partitions. As demonstrated on multiple datasets from different platforms, Dune outperforms existing techniques that rely on hierarchical merging for reducing the number of clusters, in terms of replicability of the resultant merged clusters as well as concordance with ground truth. Dune is available as an R package on Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/Dune.html.
CONCLUSIONS: Cluster refinement by Dune helps improve the robustness of any clustering analysis and reduces the reliance on tuning parameters. This method provides an objective approach for borrowing information across multiple clusterings to generate replicable clusters most likely to represent common biological features across multiple datasets.
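The merging idea described in the abstract can be illustrated with a minimal sketch. This is not the Dune implementation (which is an R package). It shows a single greedy merge step between two partitions, using the adjusted Rand index as the concordance measure. The abstract does not name the metric, so that choice, along with the function names and toy labels, is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same >= 2 items")
    # Pairs of items co-clustered in both labelings, in A only, and in B only.
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def best_single_merge(partition, reference):
    """Try merging every pair of clusters in `partition` and return the
    labeling (and its score) with the highest ARI against `reference`.
    The original labeling is returned if no merge improves the score."""
    best_labels = list(partition)
    best_score = adjusted_rand_index(partition, reference)
    for keep, drop in combinations(sorted(set(partition)), 2):
        merged = [keep if lab == drop else lab for lab in partition]
        score = adjusted_rand_index(merged, reference)
        if score > best_score:
            best_labels, best_score = merged, score
    return best_labels, best_score


# Toy example (hypothetical): an over-split partition vs. a coarser one.
over_split = [0, 0, 1, 1, 2, 2]
reference = ["a", "a", "a", "a", "b", "b"]
merged, score = best_single_merge(over_split, reference)
```

Here merging clusters 0 and 1 makes the two partitions identical. A full procedure along the lines the abstract describes would repeat such steps across a whole set of partitions until concordance stops improving.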
Affiliation(s)
- Hector Roux de Bézieux
  - Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA
  - Center for Computational Biology, University of California, Berkeley, CA, USA
- Kelly Street
  - Division of Biostatistics, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Koen Van den Berge
  - Department of Statistics, University of California, Berkeley, CA, USA
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
- Rebecca Chance
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Davide Risso
  - Department of Statistical Sciences, University of Padova, Padova, Italy
- Jesse Gillis
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- John Ngai
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Elizabeth Purdom
  - Department of Statistics, University of California, Berkeley, CA, USA
  - Center for Computational Biology, University of California, Berkeley, CA, USA
- Sandrine Dudoit
  - Department of Statistics, University of California, Berkeley, CA, USA
  - Division of Biostatistics, School of Public Health, University of California, Berkeley, CA, USA
  - Center for Computational Biology, University of California, Berkeley, CA, USA
5
Abdallah AT, Konermann A. Unraveling Divergent Transcriptomic Profiles: A Comparative Single-Cell RNA Sequencing Study of Epithelium, Gingiva, and Periodontal Ligament Tissues. Int J Mol Sci 2024; 25:5617. [PMID: 38891804 PMCID: PMC11172200 DOI: 10.3390/ijms25115617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Revised: 05/16/2024] [Accepted: 05/18/2024] [Indexed: 06/21/2024] Open
Abstract
The periodontium comprising periodontal ligament (PDL), gingiva, and epithelium play crucial roles in maintaining tooth integrity and function. Understanding tissue cellular composition and gene expression is crucial for illuminating periodontal pathophysiology. This study aimed to identify tissue-specific markers via scRNA-Seq. Primary human PDL, gingiva, and epithelium tissues (n = 7) were subjected to cell hashing and sorting. scRNA-Seq library preparation using 10× Genomics protocol and Illumina sequencing was conducted. The analysis was performed using Cellranger (v3.1.0), with downstream analysis via R packages Seurat (v5.0.1) and SCORPIUS (v1.0.9). Investigations identified eight distinct cellular clusters, revealing the ubiquitous presence of epithelial and gingival cells. PDL cells evolved in two clusters with numerical superiority. The other clusters showed varied predominance regarding gingival and epithelial cells or an equitable distribution of both. The cluster harboring most cells mainly consisted of PDL cells and was present in all donors. Some of the other clusters were also tissue-inherent, while the presence of others was environmentally influenced, revealing variability across donors. Two clusters exhibited genetic profiles associated with tissue development and cellular integrity, respectively, while all other clusters were distinguished by genes characteristic of immune responses. Developmental trajectory analysis uncovered that PDL cells may develop after epithelial and gingival cells, suggesting the inherent PDL cell-dominated cluster as a final developmental stage. This single-cell RNA sequencing study delineates the hierarchical organization of periodontal tissue development, identifies tissue-specific markers, and reveals the influence of environmental factors on cellular composition, advancing our understanding of periodontal biology and offering potential insights for therapeutic interventions.
Affiliation(s)
- Ali T. Abdallah
  - Cluster of Excellence Cellular Stress Responses in Aging-Associated Diseases (CECAD), Medical Faculty and University Hospital Cologne, University of Cologne, 50931 Cologne, Germany
  - Institute of Medical Statistics and Computational Biology, Faculty of Medicine, University of Cologne, 50924 Cologne, Germany
  - Cluster of Excellence Cellular Stress Responses in Aging-Associated Diseases (CECAD), Faculty of Mathematics and Natural Sciences, University of Cologne, 50931 Cologne, Germany
  - Interdisciplinary Center for Clinical Research, University Hospital RWTH, 52074 Aachen, Germany
- Anna Konermann
  - Department of Orthodontics, University Hospital Bonn, 53111 Bonn, Germany
6
Zhang K, Kan H, Mao A, Yu F, Geng L, Zhou T, Feng L, Ma X. Integrated Single-Cell Transcriptomic Atlas of Human Kidney Endothelial Cells. J Am Soc Nephrol 2024; 35:578-593. [PMID: 38351505 PMCID: PMC11149048 DOI: 10.1681/asn.0000000000000320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 02/09/2024] [Indexed: 03/23/2024] Open
Abstract
Key Points: We created a comprehensive reference atlas of normal human kidney endothelial cells. We confirmed that endothelial cell types in the human kidney were also highly conserved in the mouse kidney.
Background: Kidney endothelial cells are exposed to different microenvironmental conditions that support specific physiologic processes. However, the heterogeneity of human kidney endothelial cells has not yet been systematically described.
Methods: We reprocessed and integrated seven human kidney control single-cell/single-nucleus RNA sequencing datasets of >200,000 kidney cells in the same process.
Results: We identified five major cell types, 29,992 of which were endothelial cells. Endothelial cell reclustering identified seven subgroups that differed in molecular characteristics and physiologic functions. Mapping new data to a normal kidney endothelial cell atlas allows rapid data annotation and analysis. We confirmed that endothelial cell types in the human kidney were also highly conserved in the mouse kidney and identified endothelial marker genes that were conserved in humans and mice, as well as differentially expressed genes between corresponding subpopulations. Furthermore, combined analysis of single-cell transcriptome data with public genome-wide association study data showed a significant enrichment of endothelial cells, especially arterial endothelial cells, in BP heritability. Finally, we identified M1 and M12 from coexpression networks in endothelial cells that may be deeply involved in BP regulation.
Conclusions: We created a comprehensive reference atlas of normal human kidney endothelial cells that provides the molecular foundation for understanding how the identity and function of kidney endothelial cells are altered in disease, aging, and between species. Finally, we provide a publicly accessible online tool to explore the datasets described in this work (https://vascularmap.jiangnan.edu.cn).
Affiliation(s)
- Ka Zhang
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
  - School of Food Science and Technology, Jiangnan University, Wuxi, China
- Hao Kan
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Aiqin Mao
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Fan Yu
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Li Geng
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Tingting Zhou
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Lei Feng
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Xin Ma
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
7
Abdallah AT, Peitz M, Konermann A. Revealing Genetic Dynamics: scRNA-seq Unravels Modifications in Human PDL Cells across In Vivo and In Vitro Environments. Int J Mol Sci 2024; 25:4731. [PMID: 38731950 PMCID: PMC11083143 DOI: 10.3390/ijms25094731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Revised: 04/15/2024] [Accepted: 04/22/2024] [Indexed: 05/13/2024] Open
Abstract
The periodontal ligament (PDL) is a highly specialized fibrous tissue comprising heterogeneous cell populations of an intricate nature. These complexities, along with challenges due to cell culture, impede a comprehensive understanding of periodontal pathophysiology. This study aims to address this gap, employing single-cell RNA sequencing (scRNA-seq) technology to analyze the genetic intricacies of PDL both in vivo and in vitro. Primary human PDL samples (n = 7) were split for direct in vivo analysis and cell culture under serum-containing and serum-free conditions. Cell hashing and sorting, scRNA-seq library preparation using the 10x Genomics protocol, and Illumina sequencing were conducted. Primary analysis was performed using Cellranger, with downstream analysis via the R packages Seurat and SCORPIUS. Seven distinct PDL cell clusters were identified comprising different cellular subsets, each characterized by unique genetic profiles, with some showing donor-specific patterns in representation and distribution. Formation of these cellular clusters was influenced by culture conditions, particularly serum presence. Furthermore, certain cell populations were found to be inherent to the PDL tissue, while others exhibited variability across donors. This study elucidates specific genes and cell clusters within the PDL, revealing both inherent and context-driven subpopulations. The impact of culture conditions, notably the presence of serum, on cell cluster formation highlights the critical need for refining culture protocols, as comprehending these influences can drive the creation of superior culture systems vital for advancing research in PDL biology and regenerative therapies. These discoveries not only deepen our comprehension of PDL biology but also open avenues for future investigations into uncovering underlying mechanisms.
Affiliation(s)
- Ali T. Abdallah
  - Cluster of Excellence Cellular Stress Responses in Aging-Associated Diseases (CECAD), 50931 Cologne, Germany
  - Institute of Medical Statistics and Computational Biology, Faculty of Medicine, University of Cologne, 50923 Cologne, Germany
  - Interdisciplinary Center for Clinical Research, University Hospital RWTH, 52074 Aachen, Germany
- Michael Peitz
  - Institute of Reconstructive Neurobiology, Life and Brain Center, University Hospital Bonn, 53105 Bonn, Germany
- Anna Konermann
  - Department of Orthodontics, University Hospital Bonn, 53111 Bonn, Germany
8
Tian J, Bai X, Quek C. Single-Cell Informatics for Tumor Microenvironment and Immunotherapy. Int J Mol Sci 2024; 25:4485. [PMID: 38674070 PMCID: PMC11050520 DOI: 10.3390/ijms25084485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/12/2024] [Accepted: 04/16/2024] [Indexed: 04/28/2024] Open
Abstract
Cancer comprises malignant cells surrounded by the tumor microenvironment (TME), a dynamic ecosystem composed of heterogeneous cell populations that exert unique influences on tumor development. The immune community within the TME plays a substantial role in tumorigenesis and tumor evolution. The innate and adaptive immune cells "talk" to the tumor through ligand-receptor interactions and signaling molecules, forming a complex communication network to influence the cellular and molecular basis of cancer. Such intricate intratumoral immune composition and interactions foster the application of immunotherapies, which empower the immune system against cancer to elicit durable long-term responses in cancer patients. Single-cell technologies have allowed for the dissection and characterization of the TME to an unprecedented level, while recent advancements in bioinformatics tools have expanded the horizon and depth of high-dimensional single-cell data analysis. This review will unravel the intertwined networks between malignancy and immunity, explore the utilization of computational tools for a deeper understanding of tumor-immune communications, and discuss the application of these approaches to aid in diagnosis or treatment decision making in the clinical setting, as well as the current challenges faced by the researchers with their potential future improvements.
Affiliation(s)
- Camelia Quek
  - Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia; (J.T.); (X.B.)
9
Wilk AJ, Shalek AK, Holmes S, Blish CA. Comparative analysis of cell-cell communication at single-cell resolution. Nat Biotechnol 2024; 42:470-483. [PMID: 37169965 PMCID: PMC10638471 DOI: 10.1038/s41587-023-01782-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 03/28/2022] [Accepted: 04/05/2023] [Indexed: 05/13/2023]
Abstract
Inference of cell-cell communication from single-cell RNA sequencing data is a powerful technique to uncover intercellular communication pathways, yet existing methods perform this analysis at the level of the cell type or cluster, discarding single-cell-level information. Here we present Scriabin, a flexible and scalable framework for comparative analysis of cell-cell communication at single-cell resolution that is performed without cell aggregation or downsampling. We use multiple published atlas-scale datasets, genetic perturbation screens and direct experimental validation to show that Scriabin accurately recovers expected cell-cell communication edges and identifies communication networks that can be obscured by agglomerative methods. Additionally, we use spatial transcriptomic data to show that Scriabin can uncover spatial features of interaction from dissociated data alone. Finally, we demonstrate applications to longitudinal datasets to follow communication pathways operating between timepoints. Our approach represents a broadly applicable strategy to reveal the full structure of niche-phenotype relationships in health and disease.
Affiliation(s)
- Aaron J Wilk
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA, USA
- Alex K Shalek
  - Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Susan Holmes
  - Department of Statistics, Stanford University, Stanford, CA, USA
- Catherine A Blish
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
10
Xia J, Fei S, Huang Y, Lai W, Yu Y, Liang L, Wu H, Swevers L, Sun J, Feng M. Single-nucleus sequencing of silkworm larval midgut reveals the immune escape strategy of BmNPV in the midgut during the late stage of infection. Insect Biochem Mol Biol 2024; 164:104043. [PMID: 38013005 DOI: 10.1016/j.ibmb.2023.104043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 11/22/2023] [Accepted: 11/22/2023] [Indexed: 11/29/2023]
Abstract
The midgut is an important barrier against microorganism invasion and proliferation, yet is the first tissue encountered when a baculovirus naturally invades the host. However, only limited knowledge is available how different midgut cell types contribute to the immune response and the clearance or promotion of viral infection. Here, single-nucleus RNA sequencing (snRNA seq) was employed to analyze the responses of various cell subpopulations in the silkworm larval midgut to B. mori nucleopolyhedrovirus (BmNPV) infection. We identified 22 distinct clusters representing enteroendocrine cells (EEs), enterocytes (ECs), intestinal stem cells (ISCs), Goblet cell-like and muscle cell types in the BmNPV-infected and uninfected silkworm larvae midgut at 72 h post infection. Further, our results revealed that the strategies for immune escape of BmNPV in the midgut at the late stage of infection include (1) inhibiting the response of antiviral pathways; (2) inhibiting the expression of antiviral host factors; (3) stimulating expression levels of genes promoting BmNPV replication. These findings suggest that the midgut, as the first line of defense against the invasion of the baculovirus, has dual characteristics of "resistance" and "tolerance". Our single-cell dataset reveals the diversity of silkworm larval midgut cells, and the transcriptome analysis provides insights into the interaction between host and virus infection at the single-cell level.
Affiliation(s)
- Junming Xia
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Shigang Fei
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Yigui Huang
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Wenxuan Lai
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Yue Yu
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Lingying Liang
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Hailin Wu
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Luc Swevers
  - Insect Molecular Genetics and Biotechnology, National Centre for Scientific Research Demokritos, Institute of Biosciences and Applications, Athens, Greece
- Jingchen Sun
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Min Feng
  - Guangdong Provincial Key Laboratory of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
11
Goggin SM, Zunder ER. A hyperparameter-randomized ensemble approach for robust clustering across diverse datasets. bioRxiv 2023:2023.12.18.571953. [PMID: 38187667 PMCID: PMC10769222 DOI: 10.1101/2023.12.18.571953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Clustering analysis is widely used to group objects by similarity, but for complex datasets such as those produced by single-cell analysis, the currently available clustering methods are limited by accuracy, robustness, ease of use, and interpretability. To address these limitations, we developed an ensemble clustering method with hyperparameter randomization that outperforms other methods across a broad range of single-cell and synthetic datasets, without the need for manual hyperparameter selection. In addition to hard cluster labels, it also outputs soft cluster memberships to characterize continuum-like regions and per cell overlap scores to quantify the uncertainty in cluster assignment. We demonstrate the improved clustering interpretability from these features by tracing the intermediate stages between handwritten digits in the MNIST dataset, and between tanycyte subpopulations in the hypothalamus. This approach improves the quality of clustering and subsequent downstream analyses for single-cell datasets, and may also prove useful in other fields of data analysis.
Affiliation(s)
- Sarah M. Goggin
  - Neuroscience Graduate Program, School of Medicine, University of Virginia, Charlottesville, VA 22902
- Eli R. Zunder
  - Neuroscience Graduate Program, School of Medicine, University of Virginia, Charlottesville, VA 22902
  - Department of Biomedical Engineering, School of Engineering, University of Virginia, Charlottesville, VA 22902
12
Shree A, Pavan MK, Zafar H. scDREAMER for atlas-level integration of single-cell datasets using deep generative model paired with adversarial classifier. Nat Commun 2023; 14:7781. [PMID: 38012145 PMCID: PMC10682386 DOI: 10.1038/s41467-023-43590-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/13/2022] [Accepted: 11/14/2023] [Indexed: 11/29/2023] Open
Abstract
Integration of heterogeneous single-cell sequencing datasets generated across multiple tissue locations, time, and conditions is essential for a comprehensive understanding of the cellular states and expression programs underlying complex biological systems. Here, we present scDREAMER ( https://github.com/Zafar-Lab/scDREAMER ), a data-integration framework that employs deep generative models and adversarial training for both unsupervised and supervised (scDREAMER-Sup) integration of multiple batches. Using six real benchmarking datasets, we demonstrate that scDREAMER can overcome critical challenges including skewed cell type distribution among batches, nested batch-effects, large number of batches and conservation of development trajectory across batches. Our experiments also show that scDREAMER and scDREAMER-Sup outperform state-of-the-art unsupervised and supervised integration methods respectively in batch-correction and conservation of biological variation. Using a 1 million cells dataset, we demonstrate that scDREAMER is scalable and can perform atlas-level cross-species (e.g., human and mouse) integration while being faster than other deep-learning-based methods.
Affiliation(s)
- Ajita Shree
- Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India
| | - Musale Krushna Pavan
- Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India
| | - Hamim Zafar
- Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India.
- Department of Biological Sciences and Bioengineering, Indian Institute of Technology Kanpur, Kanpur, India.
- Mehta Family Centre for Engineering in Medicine, Indian Institute of Technology Kanpur, Kanpur, India.
| |
|
13
|
Mou CY, Zhang L, Zhao H, Huang ZP, Duan YL, Zhao ZM, Ke HY, Du J, Li Q, Zhou J. Single-nuclei RNA-seq reveals skin cell responses to Aeromonas hydrophila infection in Chinese longsnout catfish Leiocassis longirostris. Front Immunol 2023; 14:1271466. [PMID: 37908355 PMCID: PMC10613986 DOI: 10.3389/fimmu.2023.1271466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 09/25/2023] [Indexed: 11/02/2023] Open
Abstract
As the primary natural barrier that protects against adverse environmental conditions, the skin plays a crucial role in the innate immune response of fish, particularly in relation to bacterial infections. However, due to the diverse functionality and intricate anatomical and cellular composition of the skin, deciphering the immune response of the host is a challenging task. In this study, single nuclei RNA-sequencing (snRNA-seq) was performed on skin biopsies obtained from Chinese longsnout catfish (Leiocassis longirostris), comparing Aeromonas hydrophila-infected subjects to healthy control subjects. A total of 19,581 single nuclei cells were sequenced using 10x Genomics (10,400 in the control group and 9,181 in the treated group). Based on expressed unique transcriptional profiles, 33 cell clusters were identified and classified into 12 cell types including keratinocyte (KC), fibroblast (FB), endothelial cells (EC), secretory cells (SC), immune cells, smooth muscle cells (SMC), and other cells such as pericyte (PC), brush cell (BC), red blood cell (RBC), neuroendocrine cell (NDC), neuron cells (NC), and melanocyte (MC). Among these, three clusters of KCs, namely, KC1, KC2, and KC5 exhibited significant expansion after A. hydrophila infection. Analysis of pathway enrichment revealed that KC1 was primarily involved in environmental signal transduction, KC2 was primarily involved in endocrine function, and KC5 was primarily involved in metabolism. Finally, our findings suggest that neutrophils may play a crucial role in combating A. hydrophila infections. In summary, this study not only provides the first detailed comprehensive map of all cell types present in the skin of teleost fish but also sheds light on the immune response mechanism of the skin following A. hydrophila infection in Chinese longsnout catfish.
Affiliation(s)
- Qiang Li
- Fisheries Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, Sichuan, China
| | - Jian Zhou
- Fisheries Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, Sichuan, China
| |
|
14
|
Carbonetto P, Luo K, Sarkar A, Hung A, Tayeb K, Pott S, Stephens M. GoM DE: interpreting structure in sequence count data with differential expression analysis allowing for grades of membership. bioRxiv 2023:2023.03.03.531029. [PMID: 36945441 PMCID: PMC10028846 DOI: 10.1101/2023.03.03.531029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
Abstract
Parts-based representations, such as non-negative matrix factorization and topic modeling, have been used to identify structure from single-cell sequencing data sets, in particular structure that is not as well captured by clustering or other dimensionality reduction methods. However, interpreting the individual parts remains a challenge. To address this challenge, we extend methods for differential expression analysis by allowing cells to have partial membership to multiple groups. We call this grade of membership differential expression (GoM DE). We illustrate the benefits of GoM DE for annotating topics identified in several single-cell RNA-seq and ATAC-seq data sets.
Affiliation(s)
- Peter Carbonetto
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Research Computing Center, University of Chicago, Chicago, IL, USA
| | - Kaixuan Luo
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Abhishek Sarkar
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Vesalius Therapeutics, Cambridge, MA, USA
| | - Anthony Hung
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA
| | - Karl Tayeb
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL, USA
| | - Sebastian Pott
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA
| | - Matthew Stephens
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Department of Statistics, University of Chicago, Chicago, IL, USA
| |
|
15
|
Michielsen L, Lotfollahi M, Strobl D, Sikkema L, Reinders MT, Theis F, Mahfouz A. Single-cell reference mapping to construct and extend cell-type hierarchies. NAR Genom Bioinform 2023; 5:lqad070. [PMID: 37502708 PMCID: PMC10370450 DOI: 10.1093/nargab/lqad070] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/24/2023] [Accepted: 07/10/2023] [Indexed: 07/29/2023] Open
Abstract
Single-cell genomics is now producing an ever-increasing number of datasets that, when integrated, could provide large-scale reference atlases of tissue in health and disease. Such large-scale atlases increase the scale and generalizability of analyses and enable combining knowledge generated by individual studies. Specifically, individual studies often differ regarding cell annotation terminology and depth, with different groups specializing in different cell type compartments, often using distinct terminology. Understanding how these distinct sets of annotations are related and complement each other would mark a major step towards a consensus-based cell-type annotation reflecting the latest knowledge in the field. Whereas recent computational techniques, referred to as 'reference mapping' methods, facilitate the usage and expansion of existing reference atlases by mapping new datasets (i.e. queries) onto an atlas, a systematic approach towards harmonizing dataset-specific cell-type terminology and annotation depth is still lacking. Here, we present 'treeArches', a framework to automatically build and extend reference atlases while enriching them with an updatable hierarchy of cell-type annotations across different datasets. We demonstrate various use cases for treeArches, from automatically resolving relations between reference and query cell types to identifying unseen cell types absent in the reference, such as disease-associated cell states. We envision treeArches enabling data-driven construction of consensus atlas-level cell-type hierarchies and facilitating efficient usage of reference atlases.
Affiliation(s)
- Daniel Strobl
- Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Institute of Clinical Chemistry and Pathobiochemistry, TUM School of Medicine, Technical University of Munich, 81675 Munich, Germany
| | - Lisa Sikkema
- Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Germany
| | - Marcel J T Reinders
- Department of Human Genetics, Leiden University Medical Center, 2333ZC Leiden, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, 2333ZC Leiden, The Netherlands
- Delft Bioinformatics Lab, Delft University of Technology, 2628XE Delft, The Netherlands
| | - Fabian J Theis
- To whom correspondence should be addressed. Tel: +49 89 3187 43260;
| | - Ahmed Mahfouz
- Correspondence may also be addressed to Ahmed Mahfouz. Tel: +31 71 52 69513;
| |
|
16
|
Jeon SB, Koh H, Han AR, Kim J, Lee S, Lee JH, Im SS, Yoon YS, Lee JH, Lee JY. Ferric citrate and apo-transferrin enable erythroblast maturation with β-globin from hemogenic endothelium. NPJ Regen Med 2023; 8:46. [PMID: 37626061 PMCID: PMC10457393 DOI: 10.1038/s41536-023-00320-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2022] [Accepted: 08/08/2023] [Indexed: 08/27/2023] Open
Abstract
Red blood cell (RBC) generation from human pluripotent stem cells (PSCs) offers potential for innovative cell therapy in regenerative medicine as well as developmental studies. Ex vivo erythropoiesis from PSCs is currently limited by the low efficiency of functional RBCs with β-globin expression in culture systems. During induction of β-globin expression, the absence of a physiological microenvironment, such as a bone marrow niche, may impair cell maturation and lineage specification. Here, we describe a simple and reproducible culture system that can be used to generate erythroblasts with β-globin expression. We prepared a two-dimensional defined culture with ferric citrate treatment based on definitive hemogenic endothelium (HE). Floating erythroblasts derived from HE cells were primarily CD45+CD71+CD235a+ cells, and their number increased remarkably upon Fe treatment. Upon maturation, the erythroblasts cultured in the presence of ferric citrate showed high transcriptional levels of β-globin and enrichment of genes associated with heme synthesis and cell cycle regulation, indicating functionality. The rapid maturation of these erythroblasts into RBCs was observed when injected in vivo, suggesting the development of RBCs that were ready to grow. Hence, induction of β-globin expression may be explained by the effects of ferric citrate that promote cell maturation by binding with soluble transferrin and entering the cells. Taken together, upon treatment with Fe, erythroblasts showed advanced maturity with a high transcription of β-globin. These findings can help devise a stable protocol for the generation of clinically applicable RBCs.
Affiliation(s)
- Soo-Been Jeon
- CHA Advanced Research Institute, Bundang CHA Medical Center, CHA University, Seongnam, Kyunggi-do, 13488, South Korea
| | - Hyebin Koh
- Futuristic Animal Resource & Research Center (FARRC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, Republic of Korea
- Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, Republic of Korea
| | - A-Reum Han
- CHA Advanced Research Institute, Bundang CHA Medical Center, CHA University, Seongnam, Kyunggi-do, 13488, South Korea
| | - Jieun Kim
- National Primate Research Center (NPRC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, Republic of Korea
| | - Sunghun Lee
- CHA Advanced Research Institute, Bundang CHA Medical Center, CHA University, Seongnam, Kyunggi-do, 13488, South Korea
| | - Jae-Ho Lee
- Department of Physiology, Keimyung University School of Medicine, Daegu, 42601, Korea
| | - Seung-Soon Im
- Department of Physiology, Keimyung University School of Medicine, Daegu, 42601, Korea
| | - Young-Sup Yoon
- Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Korea
- Department of Medicine, Emory University, Atlanta, USA
| | - Jong-Hee Lee
- Department of Functional Genomics, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, Republic of Korea.
- National Primate Research Center (NPRC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), Cheongju, Republic of Korea.
| | - Ji Yoon Lee
- CHA Advanced Research Institute, Bundang CHA Medical Center, CHA University, Seongnam, Kyunggi-do, 13488, South Korea.
- Department of Biomedical Science, CHA University, Seongnam, Kyunggi-do, 13488, South Korea.
- Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Korea.
| |
|
17
|
Casey MJ, Fliege J, Sánchez-García RJ, MacArthur BD. An information-theoretic approach to single cell sequencing analysis. BMC Bioinformatics 2023; 24:311. [PMID: 37573291 PMCID: PMC10422744 DOI: 10.1186/s12859-023-05424-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 07/18/2023] [Indexed: 08/14/2023] Open
Abstract
BACKGROUND: Single-cell sequencing (sc-Seq) experiments are producing increasingly large data sets. However, large data sets do not necessarily contain large amounts of information. RESULTS: Here, we formally quantify the information obtained from a sc-Seq experiment and show that it corresponds to an intuitive notion of gene expression heterogeneity. We demonstrate a natural relation between our notion of heterogeneity and that of cell type, decomposing heterogeneity into that component attributable to differential expression between cell types (inter-cluster heterogeneity) and that remaining (intra-cluster heterogeneity). We test our definition of heterogeneity as the objective function of a clustering algorithm, and show that it is a useful descriptor for gene expression patterns associated with different cell types. CONCLUSIONS: Thus, our definition of gene heterogeneity leads to a biologically meaningful notion of cell type, as groups of cells that are statistically equivalent with respect to their patterns of gene expression. Our measure of heterogeneity, and its decomposition into inter- and intra-cluster, is non-parametric, intrinsic, unbiased, and requires no additional assumptions about expression patterns. Based on this theory, we develop an efficient method for the automatic unsupervised clustering of cells from sc-Seq data, and provide an R package implementation.
Affiliation(s)
- Michael J Casey
- Mathematical Sciences, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - Jörg Fliege
- Mathematical Sciences, University of Southampton, Southampton, UK
| | - Rubén J Sánchez-García
- Mathematical Sciences, University of Southampton, Southampton, UK.
- Institute for Life Sciences, University of Southampton, Southampton, UK.
- The Alan Turing Institute, London, UK.
| | - Ben D MacArthur
- Mathematical Sciences, University of Southampton, Southampton, UK.
- Institute for Life Sciences, University of Southampton, Southampton, UK.
- The Alan Turing Institute, London, UK.
- Centre for Human Development, Stem Cells and Regeneration, Faculty of Medicine, University of Southampton, Southampton, UK.
| |
|
18
|
Heumos L, Schaar AC, Lance C, Litinetskaya A, Drost F, Zappia L, Lücken MD, Strobl DC, Henao J, Curion F, Schiller HB, Theis FJ. Best practices for single-cell analysis across modalities. Nat Rev Genet 2023; 24:550-572. [PMID: 37002403 PMCID: PMC10066026 DOI: 10.1038/s41576-023-00586-w] [Citation(s) in RCA: 191] [Impact Index Per Article: 191.0] [Accepted: 02/14/2023] [Indexed: 04/03/2023]
Abstract
Recent advances in single-cell technologies have enabled high-throughput molecular profiling of cells across modalities and locations. Single-cell transcriptomics data can now be complemented by chromatin accessibility, surface protein expression, adaptive immune receptor repertoire profiling and spatial information. The increasing availability of single-cell data across modalities has motivated the development of novel computational methods to help analysts derive biological insights. As the field grows, it becomes increasingly difficult to navigate the vast landscape of tools and analysis steps. Here, we summarize independent benchmarking studies of unimodal and multimodal single-cell analysis across modalities to suggest comprehensive best-practice workflows for the most common analysis steps. Where independent benchmarks are not available, we review and contrast popular methods. Our article serves as an entry point for novices in the field of single-cell (multi-)omic analysis and guides advanced users to the most recent best practices.
Affiliation(s)
- Lukas Heumos
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Institute of Lung Health and Immunity and Comprehensive Pneumology Center, Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| | - Anna C Schaar
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Munich Center for Machine Learning, Technical University of Munich, Garching, Germany
| | - Christopher Lance
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Department of Paediatrics, Dr von Hauner Children's Hospital, University Hospital, Ludwig-Maximilians-Universität München, Munich, Germany
| | - Anastasia Litinetskaya
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Felix Drost
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| | - Luke Zappia
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Malte D Lücken
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Institute of Lung Health and Immunity, Helmholtz Munich, Munich, Germany
| | - Daniel C Strobl
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Institute of Clinical Chemistry and Pathobiochemistry, School of Medicine, Technical University of Munich, Munich, Germany
- TranslaTUM, Center for Translational Cancer Research, Technical University of Munich, Munich, Germany
| | - Juan Henao
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
| | - Fabiola Curion
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
| | - Herbert B Schiller
- Institute of Lung Health and Immunity and Comprehensive Pneumology Center, Helmholtz Munich; Member of the German Center for Lung Research (DZL), Munich, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Department of Computational Health, Helmholtz Munich, Munich, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- Munich Center for Machine Learning, Technical University of Munich, Garching, Germany.
| |
|
19
|
Tosoni G, Ayyildiz D, Bryois J, Macnair W, Fitzsimons CP, Lucassen PJ, Salta E. Mapping human adult hippocampal neurogenesis with single-cell transcriptomics: Reconciling controversy or fueling the debate? Neuron 2023; 111:1714-1731.e3. [PMID: 37015226 DOI: 10.1016/j.neuron.2023.03.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 10/11/2022] [Revised: 02/06/2023] [Accepted: 03/08/2023] [Indexed: 04/05/2023]
Abstract
The notion of exploiting the regenerative potential of the human brain in physiological aging or neurological diseases represents a particularly attractive alternative to conventional strategies for enhancing or restoring brain function. However, a major first question to address is whether the human brain does possess the ability to regenerate. The existence of human adult hippocampal neurogenesis (AHN) has been at the center of a fierce scientific debate for many years. The advent of single-cell transcriptomic technologies was initially viewed as a panacea to resolving this controversy. However, recent single-cell RNA sequencing studies in the human hippocampus yielded conflicting results. Here, we critically discuss and re-analyze previously published AHN-related single-cell transcriptomic datasets. We argue that, although promising, the single-cell transcriptomic profiling of AHN in the human brain can be confounded by methodological, conceptual, and biological factors that need to be consistently addressed across studies and openly discussed within the scientific community.
Affiliation(s)
- Giorgia Tosoni
- Laboratory of Neurogenesis and Neurodegeneration, Netherlands Institute for Neuroscience, 1105 BA, Amsterdam, the Netherlands
| | - Dilara Ayyildiz
- Laboratory of Neurogenesis and Neurodegeneration, Netherlands Institute for Neuroscience, 1105 BA, Amsterdam, the Netherlands
| | - Julien Bryois
- Roche Pharma Research and Early Development, Neuroscience and Rare Diseases, Roche Innovation Center, CH-4070, Basel, Switzerland
| | - Will Macnair
- Roche Pharma Research and Early Development, Neuroscience and Rare Diseases, Roche Innovation Center, CH-4070, Basel, Switzerland
| | - Carlos P Fitzsimons
- Brain Plasticity group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, 1098 XH, Amsterdam, the Netherlands
| | - Paul J Lucassen
- Brain Plasticity group, Swammerdam Institute for Life Sciences, Faculty of Science, University of Amsterdam, 1098 XH, Amsterdam, the Netherlands; Center for Urban Mental Health, University of Amsterdam, 1098 SM, Amsterdam, the Netherlands
| | - Evgenia Salta
- Laboratory of Neurogenesis and Neurodegeneration, Netherlands Institute for Neuroscience, 1105 BA, Amsterdam, the Netherlands.
| |
|
20
|
Thirant C, Peltier A, Durand S, Kramdi A, Louis-Brennetot C, Pierre-Eugène C, Gautier M, Costa A, Grelier A, Zaïdi S, Gruel N, Jimenez I, Lapouble E, Pierron G, Sitbon D, Brisse HJ, Gauthier A, Fréneaux P, Grossetête S, Baudrin LG, Raynal V, Baulande S, Bellini A, Bhalshankar J, Carcaboso AM, Geoerger B, Rohrer H, Surdez D, Boeva V, Schleiermacher G, Delattre O, Janoueix-Lerosey I. Reversible transitions between noradrenergic and mesenchymal tumor identities define cell plasticity in neuroblastoma. Nat Commun 2023; 14:2575. [PMID: 37142597 PMCID: PMC10160107 DOI: 10.1038/s41467-023-38239-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 01/04/2021] [Accepted: 04/21/2023] [Indexed: 05/06/2023] Open
Abstract
Noradrenergic and mesenchymal identities have been characterized in neuroblastoma cell lines according to their epigenetic landscapes and core regulatory circuitries. However, their relationship and relative contribution in patient tumors remain poorly defined. We now document spontaneous and reversible plasticity between the two identities, associated with epigenetic reprogramming, in several neuroblastoma models. Interestingly, xenografts with cells from each identity eventually harbor a noradrenergic phenotype suggesting that the microenvironment provides a powerful pressure towards this phenotype. Accordingly, such a noradrenergic cell identity is systematically observed in single-cell RNA-seq of 18 tumor biopsies and 15 PDX models. Yet, a subpopulation of these noradrenergic tumor cells presents with mesenchymal features that are shared with plasticity models, indicating that the plasticity described in these models has relevance in neuroblastoma patients. This work therefore emphasizes that intrinsic plasticity properties of neuroblastoma cells are dependent upon external cues of the environment to drive cell identity.
Affiliation(s)
- Cécile Thirant
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Agathe Peltier
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Simon Durand
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Amira Kramdi
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Caroline Louis-Brennetot
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Cécile Pierre-Eugène
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Margot Gautier
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Ana Costa
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Amandine Grelier
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Sakina Zaïdi
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Nadège Gruel
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- Institut Curie, Department of Translational Research, Paris, France
| | - Irène Jimenez
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
- Institut Curie, Department of Translational Research, Paris, France
- Institut Curie, Laboratoire Recherche Translationnelle en Oncologie Pédiatrique (RTOP), Laboratoire "Gilles Thomas", Paris, France
| | - Eve Lapouble
- Institut Curie, Unité de Génétique Somatique, Paris, France
| | - Gaëlle Pierron
- Institut Curie, Unité de Génétique Somatique, Paris, France
| | - Déborah Sitbon
- Institut Curie, Unité de Génétique Somatique, Paris, France
| | - Hervé J Brisse
- Institut Curie, Department of Imaging, PSL Research University, Paris, France
| | - Paul Fréneaux
- Institut Curie, Department of Biopathology, Paris, France
| | - Sandrine Grossetête
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
| | - Laura G Baudrin
- Institut Curie, Genomics of Excellence (ICGex) Platform, Paris, France. Institut Curie, Single Cell Initiative, Paris, France
| | - Virginie Raynal
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- Institut Curie, Genomics of Excellence (ICGex) Platform, Paris, France. Institut Curie, Single Cell Initiative, Paris, France
| | - Sylvain Baulande
- Institut Curie, Genomics of Excellence (ICGex) Platform, Paris, France. Institut Curie, Single Cell Initiative, Paris, France
| | - Angela Bellini
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
- Institut Curie, Department of Translational Research, Paris, France
- Institut Curie, Laboratoire Recherche Translationnelle en Oncologie Pédiatrique (RTOP), Laboratoire "Gilles Thomas", Paris, France
| | - Jaydutt Bhalshankar
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
- Institut Curie, Department of Translational Research, Paris, France
- Institut Curie, Laboratoire Recherche Translationnelle en Oncologie Pédiatrique (RTOP), Laboratoire "Gilles Thomas", Paris, France
| | - Angel M Carcaboso
- SJD Pediatric Cancer Center Barcelona, Institut de Recerca Sant Joan de Déu, Barcelona, Spain
| | - Birgit Geoerger
- Gustave Roussy Cancer Campus, INSERM U1015, Department of Pediatric and Adolescent Oncology, Université Paris-Saclay, Villejuif, France
| | - Hermann Rohrer
- Institute of Clinical Neuroanatomy, Dr. Senckenberg Anatomy, Neuroscience Center, Goethe University, Frankfurt/M, Germany
| | - Didier Surdez
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
- Balgrist University Hospital, Faculty of Medicine, University of Zurich (UZH), Zurich, Switzerland
| | - Valentina Boeva
- Inserm, U1016, Cochin Institute, CNRS UMR8104, Paris University, Paris, France
- ETH Zürich, Department of Computer Science, Institute for Machine Learning, Zürich, Switzerland
- Swiss Institute of Bioinformatics (SIB), Zürich, Switzerland
| | - Gudrun Schleiermacher
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
- Institut Curie, Department of Translational Research, Paris, France
- Institut Curie, Laboratoire Recherche Translationnelle en Oncologie Pédiatrique (RTOP), Laboratoire "Gilles Thomas", Paris, France
| | - Olivier Delattre
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France
- Institut Curie, Unité de Génétique Somatique, Paris, France
| | - Isabelle Janoueix-Lerosey
- Institut Curie, Inserm U830, PSL Research University, Diversity and Plasticity of Childhood Tumors Lab, Paris, France.
- SIREDO: Care, Innovation and Research for Children, Adolescents and Young Adults with Cancer, Institut Curie, Paris, France.
| |
|
21
|
Zhang S, Li X, Lin J, Lin Q, Wong KC. Review of single-cell RNA-seq data clustering for cell-type identification and characterization. RNA (New York, N.Y.) 2023; 29:517-530. [PMID: 36737104 PMCID: PMC10158997 DOI: 10.1261/rna.078965.121] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 03/27/2022] [Accepted: 01/03/2023] [Indexed: 05/06/2023]
Abstract
In recent years, the advances in single-cell RNA-seq techniques have enabled us to perform large-scale transcriptomic profiling at single-cell resolution in a high-throughput manner. Unsupervised learning such as data clustering has become the central component to identify and characterize novel cell types and gene expression patterns. In this study, we review the existing single-cell RNA-seq data clustering methods with critical insights into the related advantages and limitations. In addition, we also review the upstream single-cell RNA-seq data processing techniques such as quality control, normalization, and dimension reduction. We conduct performance comparison experiments to evaluate several popular single-cell RNA-seq clustering approaches on simulated and multiple single-cell transcriptomic data sets.
Affiliation(s)
- Shixiong Zhang
- School of Computer Science and Technology, Xidian University, Xi'an 710071, China
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| | - Xiangtao Li
- School of Artificial Intelligence, Jilin University, Jilin 130012, China
| | - Jiecong Lin
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| | - Qiuzhen Lin
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
| | - Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR, China
| |
|
22
|
Xu Y, Kramann R, McCord RP, Hayat S. MASI enables fast model-free standardization and integration of single-cell transcriptomics data. Commun Biol 2023; 6:465. [PMID: 37117305 PMCID: PMC10144903 DOI: 10.1038/s42003-023-04820-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 04/06/2023] [Indexed: 04/30/2023] Open
Abstract
Single-cell transcriptomics datasets from the same anatomical sites generated by different research labs are becoming increasingly common. However, fast and computationally inexpensive tools for standardization of cell-type annotation and data integration are still needed in order to increase research inclusivity. To standardize cell-type annotation and integrate single-cell transcriptomics datasets, we have built a fast model-free integration method, named MASI (Marker-Assisted Standardization and Integration). We benchmark MASI with other well-established methods and demonstrate that MASI outperforms other methods, in terms of integration, annotation, and speed. To harness knowledge from single-cell atlases, we demonstrate three case studies that cover integration across biological conditions, surveyed participants, and research groups, respectively. Finally, we show MASI can annotate approximately one million cells on a personal laptop, making large-scale single-cell data integration more accessible. We envision that MASI can serve as a cheap computational alternative for the single-cell research community.
Affiliation(s)
- Yang Xu
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN, 37996, USA
- Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Rafael Kramann
- Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
| | - Rachel Patton McCord
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN, 37996, USA.
| | - Sikander Hayat
- Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany.
| |
|
23
|
Riojas AM, Spradling-Reeves KD, Christensen CL, Hall-Ursone S, Cox LA. Cell-type deconvolution of bulk RNA-Seq from kidney using opensource bioinformatic tools. bioRxiv 2023:2023.02.13.528258. [PMID: 36824792 PMCID: PMC9949078 DOI: 10.1101/2023.02.13.528258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
Traditional bulk RNA-Seq pipelines do not assess cell-type composition within heterogeneous tissues. Therefore, it is difficult to determine whether conflicting findings among samples or datasets are the result of biological differences or technical differences due to variation in sample collections. This report provides a user-friendly, open source method to assess cell-type composition in bulk RNA-Seq datasets for heterogeneous tissues using published single cell (sc)RNA-Seq data as a reference. As an example, we apply the method to analysis of kidney cortex bulk RNA-Seq data from female (N=8) and male (N=9) baboons to assess whether observed transcriptome sex differences are biological or technical, i.e., variation due to ultrasound guided biopsy collections. We found cell-type composition was not statistically different in female versus male transcriptomes based on expression of 274 kidney cell-type specific transcripts, indicating differences in gene expression are not due to sampling differences. This method of cell-type composition analysis is recommended for providing rigor in analysis of bulk RNA-Seq datasets from complex tissues. It is clear that with reduced costs, more analyses will be done using scRNA-Seq; however, the approach described here is relevant for data mining and meta analyses of the thousands of bulk RNA-Seq data archived in the NCBI GEO public database.
Affiliation(s)
- Angelica M. Riojas
- Center for Precision Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
| | - Kimberly D. Spradling-Reeves
- Section on Molecular Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
| | | | - Shannan Hall-Ursone
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, Texas, USA
| | - Laura A. Cox
- Center for Precision Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Section on Molecular Medicine, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, Texas, USA
| |
|
24
|
Lotfollahi M, Rybakov S, Hrovatin K, Hediyeh-Zadeh S, Talavera-López C, Misharin AV, Theis FJ. Biologically informed deep learning to query gene programs in single-cell atlases. Nat Cell Biol 2023; 25:337-350. [PMID: 36732632 PMCID: PMC9928587 DOI: 10.1038/s41556-022-01072-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 03/02/2022] [Accepted: 12/08/2022] [Indexed: 02/04/2023]
Abstract
The increasing availability of large-scale single-cell atlases has enabled the detailed description of cell states. In parallel, advances in deep learning allow rapid analysis of newly generated query datasets by mapping them into reference atlases. However, existing data transformations learned to map query data are not easily explainable using biologically known concepts such as genes or pathways. Here we propose expiMap, a biologically informed deep-learning architecture that enables single-cell reference mapping. ExpiMap learns to map cells into biologically understandable components representing known 'gene programs'. The activity of each cell for a gene program is learned while simultaneously refining them and learning de novo programs. We show that expiMap compares favourably to existing methods while bringing an additional layer of interpretability to integrative single-cell analysis. Furthermore, we demonstrate its applicability to analyse single-cell perturbation responses in different tissues and species and resolve responses of patients who have coronavirus disease 2019 to different treatments across cell types.
Affiliation(s)
- Mohammad Lotfollahi
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Wellcome Sanger Institute, Cambridge, UK
| | - Sergei Rybakov
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Karin Hrovatin
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| | - Soroor Hediyeh-Zadeh
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Bioinformatics Division, WEHI, Melbourne, Victoria, Australia
| | - Carlos Talavera-López
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Division of Infectious Diseases and Tropical Medicine, Ludwig-Maximilian-Universität Klinikum, Munich, Germany
| | - Alexander V Misharin
- Division of Pulmonary and Critical Care Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
- Wellcome Sanger Institute, Cambridge, UK.
- Department of Mathematics, Technical University of Munich, Munich, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
| |
|
25
|
Sun L, Wang G, Zhang Z. SimCH: simulation of single-cell RNA sequencing data by modeling cellular heterogeneity at gene expression level. Brief Bioinform 2023; 24:6961608. [PMID: 36575569 DOI: 10.1093/bib/bbac590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 11/08/2022] [Accepted: 12/02/2022] [Indexed: 12/29/2022] Open
Abstract
Single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) has been a powerful technology for transcriptome analysis. However, the systematic validation of diverse computational tools used in scRNA-seq analysis remains challenging. Here, we propose a novel simulation tool, termed as Simulation of Cellular Heterogeneity (SimCH), for the flexible and comprehensive assessment of scRNA-seq computational methods. The Gaussian Copula framework is recruited to retain gene coexpression of experimental data shown to be associated with cellular heterogeneity. The synthetic count matrices generated by suitable SimCH modes closely match experimental data originating from either homogeneous or heterogeneous cell populations and either unique molecular identifier (UMI)-based or non-UMI-based techniques. We demonstrate how SimCH can benchmark several types of computational methods, including cell clustering, discovery of differentially expressed genes, trajectory inference, batch correction and imputation. Moreover, we show how SimCH can be used to conduct power evaluation of cell clustering methods. Given these merits, we believe that SimCH can accelerate single-cell research.
Affiliation(s)
- Lei Sun
- School of Information Engineering, Yangzhou University, Yangzhou, P.R. China; School of Artificial Intelligence, Yangzhou University, Yangzhou, P.R. China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing, P.R. China
| | - Gongming Wang
- School of Information Engineering, Yangzhou University, Yangzhou, P.R. China; School of Artificial Intelligence, Yangzhou University, Yangzhou, P.R. China; China Unicom Software Research Institute Jinan Branch, Jinan, P.R. China
| | - Zhihua Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing, P.R. China; School of Life Science, University of Chinese Academy of Sciences, Beijing, P.R. China
| |
|
26
|
Wrobel J, Harris C, Vandekar S. Statistical Analysis of Multiplex Immunofluorescence and Immunohistochemistry Imaging Data. Methods Mol Biol 2023; 2629:141-168. [PMID: 36929077 DOI: 10.1007/978-1-0716-2986-4_8] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 03/18/2023]
Abstract
Advances in multiplexed single-cell immunofluorescence (mIF) and multiplex immunohistochemistry (mIHC) imaging technologies have enabled the analysis of cell-to-cell spatial relationships that promise to revolutionize our understanding of tissue-based diseases and autoimmune disorders. Multiplex images are collected as multichannel TIFF files; then denoised, segmented to identify cells and nuclei, normalized across slides with protein markers to correct for batch effects, and phenotyped; and then tissue composition and spatial context at the cellular level are analyzed. This chapter discusses methods and software infrastructure for image processing and statistical analysis of mIF/mIHC data.
Affiliation(s)
- Julia Wrobel
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
| | - Coleman Harris
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Simon Vandekar
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
| |
|
27
|
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Affiliation(s)
- Min Su
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Tao Pan
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Qiu-Zhen Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Wei-Wei Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
| | - Yi Gong
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Department of Immunology, Nanjing Medical University, Nanjing, 211166 China
| | - Gang Xu
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Huan-Yu Yan
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Si Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Qiao-Zhen Shi
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Ya Zhang
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Xiao He
- Department of Laboratory Medicine, Women and Children’s Hospital of Chongqing Medical University, Chongqing, 401174 China
| | | | - Shi-Cai Fan
- Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110 Guangdong China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
| | - Murray J. Cairns
- School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW 2308 Australia
- Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
| | - Xi Wang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Yong-Sheng Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| |
|
28
|
Zeng L, Yang K, Zhang T, Zhu X, Hao W, Chen H, Ge J. Research progress of single-cell transcriptome sequencing in autoimmune diseases and autoinflammatory disease: A review. J Autoimmun 2022; 133:102919. [PMID: 36242821 DOI: 10.1016/j.jaut.2022.102919] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/13/2022] [Revised: 09/16/2022] [Accepted: 09/19/2022] [Indexed: 12/07/2022]
Abstract
Autoimmunity refers to the phenomenon that the body's immune system produces antibodies or sensitized lymphocytes to its own tissues to cause an immune response. Immune disorders caused by autoimmunity can mediate autoimmune diseases. Autoimmune diseases have complicated pathogenesis due to the many types of cells involved, and the mechanism is still unclear. The emergence of single-cell research technology can solve the problem that ordinary transcriptome technology cannot be accurate to cell type. It provides unbiased results through independent analysis of cells in tissues and provides more mRNA information for identifying cell subpopulations, which provides a novel approach to study disruption of immune tolerance and disturbance of pro-inflammatory pathways on a cellular basis. It may fundamentally change the understanding of molecular pathways in the pathogenesis of autoimmune diseases and develop targeted drugs. Single-cell transcriptome sequencing (scRNA-seq) has been widely applied in autoimmune diseases, which provides a powerful tool for demonstrating the cellular heterogeneity of tissues involved in various immune inflammations, identifying pathogenic cell populations, and revealing the mechanism of disease occurrence and development. This review describes the principles of scRNA-seq, introduces common sequencing platforms and practical procedures, and focuses on the progress of scRNA-seq in 41 autoimmune diseases, which include 9 systemic autoimmune diseases and autoinflammatory diseases (rheumatoid arthritis, systemic lupus erythematosus, etc.) and 32 organ-specific autoimmune diseases (5 Skin diseases, 3 Nervous system diseases, 4 Eye diseases, 2 Respiratory system diseases, 2 Circulatory system diseases, 6 Liver, Gallbladder and Pancreas diseases, 2 Gastrointestinal system diseases, 3 Muscle, Bones and joint diseases, 3 Urinary system diseases, 2 Reproductive system diseases). 
This review also prospects the molecular mechanism targets of autoimmune diseases from the multi-molecular level and multi-dimensional analysis combined with single-cell multi-omics sequencing technology (such as scRNA-seq, Single cell ATAC-seq and single cell immune group library sequencing), which provides a reference for further exploring the pathogenesis and marker screening of autoimmune diseases and autoimmune inflammatory diseases in the future.
Affiliation(s)
- Liuting Zeng
- Department of Rheumatology, Peking Union Medical College Hospital, Chinese Academy of Medical Science & Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases, State Key Laboratory of Complex Severe and Rare Diseases, Beijing, China.
| | - Kailin Yang
- Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China.
| | - Tianqing Zhang
- Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China
| | - Xiaofei Zhu
- Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China.
| | - Wensa Hao
- Institute of Materia Medica, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
| | - Hua Chen
- Department of Rheumatology, Peking Union Medical College Hospital, Chinese Academy of Medical Science & Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases, State Key Laboratory of Complex Severe and Rare Diseases, Beijing, China.
| | - Jinwen Ge
- Key Laboratory of Hunan Province for Integrated Traditional Chinese and Western Medicine on Prevention and Treatment of Cardio-Cerebral Diseases, Hunan University of Chinese Medicine, Changsha, China; Hunan Academy of Chinese Medicine, Changsha, China.
| |
|
29
|
Hrncir HR, Gracz AD. Cellular and transcriptional heterogeneity in the intrahepatic biliary epithelium. Gastro Hep Advances 2022; 2:108-120. [PMID: 36593993 PMCID: PMC9802653 DOI: 10.1016/j.gastha.2022.07.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Accepted: 07/19/2022] [Indexed: 01/05/2023]
Abstract
Epithelial tissues comprise heterogeneous cellular subpopulations, which often compartmentalize specialized functions like absorption and secretion to distinct cell types. In the liver, hepatocytes and biliary epithelial cells (BECs; also called cholangiocytes) are the two major epithelial lineages and play distinct roles in (1) metabolism, protein synthesis, detoxification, and (2) bile transport and modification, respectively. Recent technological advances, including single cell transcriptomic assays, have shed new light on well-established heterogeneity among hepatocytes, endothelial cells, and immune cells in the liver. However, a "ground truth" understanding of molecular heterogeneity in BECs has remained elusive, and the field currently lacks a set of consensus biomarkers for identifying BEC subpopulations. Here, we review long-standing definitions of BEC heterogeneity as well as emerging studies that aim to characterize BEC subpopulations using next generation single cell assays. Understanding cellular heterogeneity in the intrahepatic bile ducts holds promise for expanding our foundational mechanistic knowledge of BECs during homeostasis and disease.
Affiliation(s)
- Hannah R. Hrncir
- Division of Digestive Diseases, Department of Medicine, Emory University, Atlanta, Georgia
- Graduate Program in Biochemistry, Cell and Developmental Biology, Emory University, Atlanta, Georgia
| | - Adam D. Gracz
- Division of Digestive Diseases, Department of Medicine, Emory University, Atlanta, Georgia
- Graduate Program in Biochemistry, Cell and Developmental Biology, Emory University, Atlanta, Georgia
- Graduate Program in Genetics and Molecular Biology, Emory University, Atlanta, Georgia
| |
|
30
|
Li Z, Zhou X. BASS: multi-scale and multi-sample analysis enables accurate cell type clustering and spatial domain detection in spatial transcriptomic studies. Genome Biol 2022; 23:168. [PMID: 35927760 PMCID: PMC9351148 DOI: 10.1186/s13059-022-02734-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 01/31/2022] [Accepted: 07/21/2022] [Indexed: 02/08/2023] Open
Abstract
Spatial transcriptomic studies are reaching single-cell spatial resolution, with data often collected from multiple tissue sections. Here, we present a computational method, BASS, that enables multi-scale and multi-sample analysis for single-cell resolution spatial transcriptomics. BASS performs cell type clustering at the single-cell scale and spatial domain detection at the tissue regional scale, with the two tasks carried out simultaneously within a Bayesian hierarchical modeling framework. We illustrate the benefits of BASS through comprehensive simulations and applications to three datasets. The substantial power gain brought by BASS allows us to reveal accurate transcriptomic and cellular landscape in both cortex and hypothalamus.
Affiliation(s)
- Zheng Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.
| |
|
31
|
Wilk AJ, Shalek AK, Holmes S, Blish CA. Comparative analysis of cell-cell communication at single-cell resolution. bioRxiv 2022:2022.02.04.479209. [PMID: 35169794 PMCID: PMC8845414 DOI: 10.1101/2022.02.04.479209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 04/29/2023]
Abstract
Inference of cell-cell communication (CCC) from single-cell RNA-sequencing data is a powerful technique to uncover putative axes of multicellular coordination, yet existing methods perform this analysis at the level of the cell type or cluster, discarding single-cell level information. Here we present Scriabin, a flexible and scalable framework for comparative analysis of CCC at single-cell resolution. We leverage multiple published datasets to show that Scriabin recovers expected CCC edges and use spatial transcriptomic data, genetic perturbation screens, and direct experimental manipulation of receptor-ligand interactions to validate that the recovered edges are biologically meaningful. We then apply Scriabin to uncover co-expressed programs of CCC from atlas-scale datasets, validating known communication pathways required for maintaining the intestinal stem cell niche and revealing species-specific communication pathways. Finally, we utilize single-cell communication networks calculated using Scriabin to follow communication pathways that operate between timepoints in longitudinal datasets, highlighting bystander cells as important initiators of inflammatory reactions in acute SARS-CoV-2 infection. Our approach represents a broadly applicable strategy to leverage single-cell resolution data maximally toward uncovering CCC circuitry and rich niche-phenotype relationships in health and disease.
|
32
|
Cytotoxic innate lymphoid cells sense cancer cell-expressed interleukin-15 to suppress human and murine malignancies. Nat Immunol 2022; 23:904-915. [PMID: 35618834 DOI: 10.1038/s41590-022-01213-2] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 06/14/2021] [Accepted: 04/14/2022] [Indexed: 12/15/2022]
Abstract
Malignancy can be suppressed by the immune system. However, the classes of immunosurveillance responses and their mode of tumor sensing remain incompletely understood. Here, we show that although clear cell renal cell carcinoma (ccRCC) was infiltrated by exhaustion-phenotype CD8+ T cells that negatively correlated with patient prognosis, chromophobe RCC (chRCC) had abundant infiltration of granzyme A-expressing intraepithelial type 1 innate lymphoid cells (ILC1s) that positively associated with patient survival. Interleukin-15 (IL-15) promoted ILC1 granzyme A expression and cytotoxicity, and IL-15 expression in chRCC tumor tissue positively tracked with the ILC1 response. An ILC1 gene signature also predicted survival of a subset of breast cancer patients in association with IL-15 expression. Notably, ILC1s directly interacted with cancer cells, and IL-15 produced by cancer cells supported the expansion and anti-tumor function of ILC1s in a murine breast cancer model. Thus, ILC1 sensing of cancer cell IL-15 defines an immunosurveillance mechanism of epithelial malignancies.
|
33
|
Affiliation(s)
- Greg Gibson
- School of Biological Sciences and Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, Georgia, United States of America
| |
|
34
|
Ascensión AM, Ibáñez-Solé O, Inza I, Izeta A, Araúzo-Bravo MJ. Triku: a feature selection method based on nearest neighbors for single-cell data. Gigascience 2022; 11:6547682. [PMID: 35277963 PMCID: PMC8917514 DOI: 10.1093/gigascience/giac017] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/06/2021] [Revised: 09/24/2021] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND Feature selection is a relevant step in the analysis of single-cell RNA sequencing datasets. Most of the current feature selection methods are based on general univariate descriptors of the data such as the dispersion or the percentage of zeros. Despite the use of correction methods, the generality of these feature selection methods biases the genes selected towards highly expressed genes, instead of the genes defining the cell populations of the dataset. RESULTS Triku is a feature selection method that favors genes defining the main cell populations. It does so by selecting genes expressed by groups of cells that are close in the k-nearest neighbor graph. The expression of these genes is higher than the expected expression if the k-cells were chosen at random. Triku efficiently recovers cell populations present in artificial and biological benchmarking datasets, based on adjusted Rand index, normalized mutual information, supervised classification, and silhouette coefficient measurements. Additionally, gene sets selected by triku are more likely to be related to relevant Gene Ontology terms and contain fewer ribosomal and mitochondrial genes. CONCLUSION Triku is developed in Python 3 and is available at https://github.com/alexmascension/triku.
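The idea in the abstract above, that an informative gene is concentrated in neighbouring cells more than it would be under randomly chosen neighbours, can be illustrated with a toy sketch. This is not triku's implementation: the function names (`knn_indices`, `neighbourhood_sums`, `locality_score`), the variance-based score, and the 1-D toy data are all assumptions made for illustration only.

```python
import random
from statistics import mean, pvariance


def knn_indices(points, k):
    """Indices of the k nearest neighbours (squared Euclidean) of each point, excluding itself."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        result.append([j for _, j in dists[:k]])
    return result


def neighbourhood_sums(expr, neighbours):
    """Expression of one gene summed over each cell and its listed neighbours."""
    return [expr[i] + sum(expr[j] for j in nbrs) for i, nbrs in enumerate(neighbours)]


def locality_score(expr, neighbours, n_random=50, seed=0):
    """Spread of kNN neighbourhood sums minus the average spread under random neighbours.

    Hypothetical score: positive values suggest the gene is concentrated
    in groups of neighbouring cells rather than scattered.
    """
    rng = random.Random(seed)
    n = len(expr)
    k = len(neighbours[0])
    observed = pvariance(neighbourhood_sums(expr, neighbours))
    random_vars = []
    for _ in range(n_random):
        shuffled = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
        random_vars.append(pvariance(neighbourhood_sums(expr, shuffled)))
    return observed - mean(random_vars)


# Toy data: two well-separated groups of ten cells on a line.
points = [(x,) for x in range(10)] + [(x,) for x in range(100, 110)]
neighbours = knn_indices(points, k=3)

localized = [1] * 10 + [0] * 10   # gene expressed only in the first group
scattered = [1, 0] * 10           # same number of expressing cells, spread evenly

print(locality_score(localized, neighbours))
print(locality_score(scattered, neighbours))
```

In this toy setting the gene restricted to one group scores higher than the evenly scattered gene, even though both are expressed in the same number of cells.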
Affiliation(s)
- Alex M Ascensión
- Biodonostia Health Research Institute, Computational Biology and Systems Biomedicine Group, Paseo Dr. Begiristain, s/n, Donostia-San Sebastian, 20014, Spain
- Biodonostia Health Research Institute, Tissue Engineering Group, Paseo Dr. Begiristain, s/n, Donostia-San Sebastian, 20014, Spain
| | - Olga Ibáñez-Solé
- Biodonostia Health Research Institute, Computational Biology and Systems Biomedicine Group, Paseo Dr. Begiristain, s/n, Donostia-San Sebastian, 20014, Spain
- Biodonostia Health Research Institute, Tissue Engineering Group, Paseo Dr. Begiristain, s/n, Donostia-San Sebastian, 20014, Spain
| | - Iñaki Inza
- Intelligent Systems Group, Computer Science Faculty, University of the Basque Country, Donostia-San Sebastian, 20018, Spain
| | - Ander Izeta
- Biodonostia Health Research Institute, Tissue Engineering Group, Paseo Dr. Begiristain, s/n, Donostia-San Sebastian, 20014, Spain
| | - Marcos J Araúzo-Bravo
- Biodonostia Health Research Institute, Computational Biology and Systems Biomedicine Group, Paseo Dr. Begiristain, s/n, Donostia-San Sebastian, 20014, Spain
- Max Planck Institute for Molecular Biomedicine, Roentgenstr. 20, 48149 Muenster, Germany
- IKERBASQUE, Basque Foundation for Science, Euskadi plaza 5, Bilbao, 48009, Spain
- Department of Cell Biology and Histology, Faculty of Medicine and Nursing, University of Basque Country (UPV/EHU), 48940 Leioa, Spain
| |
|
35
|
Yu L, Cao Y, Yang JYH, Yang P. Benchmarking clustering algorithms on estimating the number of cell types from single-cell RNA-sequencing data. Genome Biol 2022; 23:49. [PMID: 35135612 PMCID: PMC8822786 DOI: 10.1186/s13059-022-02622-0] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 08/17/2021] [Accepted: 01/27/2022] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND A key task in single-cell RNA-seq (scRNA-seq) data analysis is to accurately detect the number of cell types in the sample, which can be critical for downstream analyses such as cell type identification. Various scRNA-seq data clustering algorithms have been specifically designed to automatically estimate the number of cell types through optimising the number of clusters in a dataset. The lack of benchmark studies, however, complicates the choice of the methods. RESULTS We systematically benchmark a range of popular clustering algorithms on estimating the number of cell types in a variety of settings by sampling from the Tabula Muris data to create scRNA-seq datasets with a varying number of cell types, varying number of cells in each cell type, and different cell type proportions. The large number of datasets enables us to assess the performance of the algorithms, covering four broad categories of approaches, from various aspects using a panel of criteria. We further cross-compared the performance on datasets with high cell numbers using Tabula Muris and Tabula Sapiens data. CONCLUSIONS We identify the strengths and weaknesses of each method on multiple criteria including the deviation of estimation from the true number of cell types, variability of estimation, clustering concordance of cells to their predefined cell types, and running time and peak memory usage. We then summarise these results into a multi-aspect recommendation to the users. The proposed stability-based approach for estimating the number of cell types is implemented in an R package and is freely available from ( https://github.com/PYangLab/scCCESS ).
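Two of the criteria named in the abstract above, the deviation of the estimated number of cell types from the true number and the concordance of clusters with predefined cell types, can be made concrete with a short sketch. The adjusted Rand index is one common concordance measure and is used here as an assumption; the abstract does not state which concordance measure the benchmark itself uses. Function names and toy labels are illustrative.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("labelings must have the same length")
    n = len(true_labels)
    if n < 2:
        return 1.0  # no pairs of cells to compare

    # Count pairs of cells that share a label, jointly and in each labeling
    sum_pairs = sum(comb(c, 2) for c in Counter(zip(true_labels, pred_labels)).values())
    sum_true = sum(comb(c, 2) for c in Counter(true_labels).values())
    sum_pred = sum(comb(c, 2) for c in Counter(pred_labels).values())

    expected = sum_true * sum_pred / comb(n, 2)
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are one cluster or all singletons
    return (sum_pairs - expected) / (max_index - expected)


def cluster_number_deviation(true_labels, pred_labels):
    """Estimated number of clusters minus the true number of cell types."""
    return len(set(pred_labels)) - len(set(true_labels))


# Toy example: two true cell types, three estimated clusters.
true_types = ["T", "T", "T", "B", "B", "B"]
clusters = [0, 0, 1, 1, 2, 2]

print(cluster_number_deviation(true_types, clusters))       # over-estimates by one cluster
print(round(adjusted_rand_index(true_types, clusters), 3))
```

A positive deviation indicates over-clustering and a negative one under-clustering; the adjusted Rand index is 1 for identical partitions regardless of how the clusters are named.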
Affiliation(s)
- Lijia Yu
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| | - Yue Cao
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| | - Jean Y H Yang
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| | - Pengyi Yang
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| |
|
36
|
Lotfollahi M, Naghipourfar M, Luecken MD, Khajavi M, Büttner M, Wagenstetter M, Avsec Ž, Gayoso A, Yosef N, Interlandi M, Rybakov S, Misharin AV, Theis FJ. Mapping single-cell data to reference atlases by transfer learning. Nat Biotechnol 2022; 40:121-130. [PMID: 34462589 PMCID: PMC8763644 DOI: 10.1038/s41587-021-01001-7] [Citation(s) in RCA: 189] [Impact Index Per Article: 94.5] [Received: 07/30/2020] [Accepted: 06/28/2021] [Indexed: 02/07/2023]
Abstract
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.
Affiliation(s)
- Mohammad Lotfollahi
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| | - Mohsen Naghipourfar
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Malte D Luecken
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Matin Khajavi
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Maren Büttner
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Marco Wagenstetter
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Žiga Avsec
- Department of Computer Science, Technical University of Munich, Munich, Germany
| | - Adam Gayoso
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Nir Yosef
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, USA
| | - Marta Interlandi
- Institute of Medical Informatics, University of Münster, Münster, Germany
| | - Sergei Rybakov
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Alexander V Misharin
- Division of Pulmonary and Critical Care Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Fabian J Theis
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| |
|
37
|
Mapping single-cell data to reference atlases by transfer learning. Nat Biotechnol 2022. [DOI: 10.1038/s41587-021-01001-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
38
|
Schiebout C, Frost HR. CAMML: Multi-Label Immune Cell-Typing and Stemness Analysis for Single-Cell RNA-sequencing. Pac Symp Biocomput 2022; 27:199-210. [PMID: 34890149 PMCID: PMC8669732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
Inferring the cell types in single-cell RNA-sequencing (scRNA-seq) data is of particular importance for understanding the potential cellular mechanisms and phenotypes occurring in complex tissues, such as the tumor-immune microenvironment (TME). The sparsity and noise of scRNA-seq data, combined with the fact that immune cell types often occur on a continuum, make cell typing of TME scRNA-seq data a significant challenge. Several single-label cell typing methods have been put forth to address the limitations of noise and sparsity, but accounting for the often overlapped spectrum of cell types in the immune TME remains an obstacle. To address this, we developed a new scRNA-seq cell-typing method, Cell-typing using variance Adjusted Mahalanobis distances with Multi-Labeling (CAMML). CAMML leverages cell type-specific weighted gene sets to score every cell in a dataset for every potential cell type. This allows cells to be labelled either by their highest scoring cell type as a single label classification or based on a score cut-off to give multi-label classification. For single-label cell typing, CAMML performance is comparable to existing cell typing methods, SingleR and Garnett. For scenarios where cells may exhibit features of multiple cell types (e.g., undifferentiated cells), the multi-label classification supported by CAMML offers important benefits relative to the current state-of-the-art methods. By integrating data across studies, omics platforms, and species, CAMML serves as a robust and adaptable method for overcoming the challenges of scRNA-seq analysis.
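The two labelling modes described in the abstract above, taking the highest-scoring cell type or keeping every cell type above a score cut-off, can be sketched as follows. This is only an illustration of the decision rule, not the CAMML package or its API; the function names, score values, and the 0.5 cut-off are made up.

```python
def single_label(scores):
    """Return the cell type with the highest score."""
    return max(scores, key=scores.get)


def multi_label(scores, cutoff):
    """Return every cell type whose score meets the cut-off, best first."""
    passing = [cell_type for cell_type, s in scores.items() if s >= cutoff]
    return sorted(passing, key=lambda cell_type: scores[cell_type], reverse=True)


# Hypothetical per-cell scores for three candidate cell types.
cell_scores = {
    "cell_1": {"T cell": 0.91, "NK cell": 0.78, "B cell": 0.05},
    "cell_2": {"T cell": 0.10, "NK cell": 0.12, "B cell": 0.88},
}

single = {cell: single_label(s) for cell, s in cell_scores.items()}
multi = {cell: multi_label(s, cutoff=0.5) for cell, s in cell_scores.items()}

print(single)
print(multi)
```

Here `cell_1` receives one label under the single-label rule but two under the cut-off rule, which is the situation the abstract describes for cells that show features of more than one cell type.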
|
39
|
You Y, Tian L, Su S, Dong X, Jabbari JS, Hickey PF, Ritchie ME. Benchmarking UMI-based single-cell RNA-seq preprocessing workflows. Genome Biol 2021; 22:339. [PMID: 34906205 PMCID: PMC8672463 DOI: 10.1186/s13059-021-02552-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/17/2021] [Accepted: 11/22/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Single-cell RNA-sequencing (scRNA-seq) technologies and associated analysis methods have rapidly developed in recent years. This includes preprocessing methods, which assign sequencing reads to genes to create count matrices for downstream analysis. While several packaged preprocessing workflows have been developed to provide users with convenient tools for handling this process, how they compare to one another and how they influence downstream analysis have not been well studied. RESULTS Here, we systematically benchmark the performance of 10 end-to-end preprocessing workflows (Cell Ranger, Optimus, salmon alevin, alevin-fry, kallisto bustools, dropSeqPipe, scPipe, zUMIs, celseq2, and scruff) using datasets yielding different biological complexity levels generated by CEL-Seq2 and 10x Chromium platforms. We compare these workflows in terms of their quantification properties directly and their impact on normalization and clustering by evaluating the performance of different method combinations. While the scRNA-seq preprocessing workflows compared vary in their detection and quantification of genes across datasets, after downstream analysis with performant normalization and clustering methods, almost all combinations produce clustering results that agree well with the known cell type labels that provided the ground truth in our analysis. CONCLUSIONS In summary, the choice of preprocessing method was found to be less important than other steps in the scRNA-seq analysis process. Our study comprehensively compares common scRNA-seq preprocessing workflows and summarizes their characteristics to guide workflow users.
Affiliation(s)
- Yue You
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Luyi Tian
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Shian Su
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Xueyi Dong
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Jafar S. Jabbari
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, Australia
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Peter F. Hickey
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
- Single-Cell Open Research Endeavour (SCORE), The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
| | - Matthew E. Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, Australia
| |
|
40
|
Liu J, Liu L, Qu S, Zhang T, Wang D, Ji Q, Wang T, Shi H, Song K, Fang W, Chen W, Yin W. GdClean: removal of Gadolinium contamination in mass cytometry data. Bioinformatics 2021; 37:4787-4792. [PMID: 34320625 DOI: 10.1093/bioinformatics/btab537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Revised: 07/12/2021] [Accepted: 07/27/2021] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION Mass cytometry (Cytometry by Time-Of-Flight, CyTOF) is a single-cell technology that is able to quantify multiplex biomarker expressions and is commonly used in basic life science and translational research. However, the widely used Gadolinium (Gd)-based contrast agents (GBCAs) in magnetic resonance imaging (MRI) scanning in clinical practice can lead to signal contamination on the Gd channels in the CyTOF analysis. This Gd contamination greatly affects the characterization of the real signal from Gd-isotope-conjugated antibodies, severely impairing the CyTOF data quality and ruining downstream single-cell data interpretation. RESULTS We first in-depth characterized the signals of Gd isotopes from a control sample that was not stained with Gd-labeled antibodies but was contaminated by Gd isotopes from GBCAs, and revealed the collinear intensity relationship across Gd contamination signals. We also found that the intensity ratios of detected Gd contamination signals to the reference Gd signal were highly correlated with the natural abundance ratios of corresponding Gd isotopes. We then developed a computational method named by GdClean to remove the Gd contamination signal at the single-cell level in the CyTOF data. We further demonstrated that the GdClean effectively cleaned up the Gd contamination signal while preserving the real Gd-labeled antibodies signal in Gd channels. All of these shed lights on the promising applications of the GdClean method in preprocessing CyTOF datasets for revealing the true single-cell information. AVAILABILITY AND IMPLEMENTATION The R package GdClean is available on GitHub at https://github.com/JunweiLiu0208/GdClean. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Junwei Liu
- Key Laboratory for Biomedical Engineering of the Ministry of Education, College of Biomedical Engineering and Instrument Science, School of Basic Medical Science and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Lulu Liu
- Department of Medical Oncology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310000, China
| | - Saisi Qu
- Key Laboratory for Biomedical Engineering of the Ministry of Education, College of Biomedical Engineering and Instrument Science, School of Basic Medical Science and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Tongtong Zhang
- Department of Hepatobiliary and Pancreatic Surgery, The Center for Integrated Oncology and Precision Medicine, Affiliated Hangzhou First People's Hospital, Zhejiang University, Hangzhou 310006, China
| | - Danyang Wang
- Department of Colorectal Surgery, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
| | - Qinghua Ji
- Department of Biological Testing, Zhejiang Puluoting Health Technology Co., Ltd, Hangzhou 311121, China
| | - Tian Wang
- Department of Biological Testing, Zhejiang Puluoting Health Technology Co., Ltd, Hangzhou 311121, China
| | - Hongyu Shi
- Department of Biological Testing, Zhejiang Puluoting Health Technology Co., Ltd, Hangzhou 311121, China
| | - Kaichen Song
- Key Laboratory for Biomedical Engineering of the Ministry of Education, College of Biomedical Engineering and Instrument Science, School of Basic Medical Science and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Weijia Fang
- Department of Medical Oncology, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou 310000, China
| | - Wei Chen
- Key Laboratory for Biomedical Engineering of the Ministry of Education, College of Biomedical Engineering and Instrument Science, School of Basic Medical Science and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
| | - Weiwei Yin
- Key Laboratory for Biomedical Engineering of the Ministry of Education, College of Biomedical Engineering and Instrument Science, School of Basic Medical Science and Department of Cardiology of the Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Department of Thoracic Surgery, Sir Run Run Shaw Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
| |
|
41
|
Zhao ZH, Wang XY, Schatten H, Sun QY. Single cell RNA sequencing techniques and applications in research of ovary development and related diseases. Reprod Toxicol 2021; 107:97-103. [PMID: 34896591 DOI: 10.1016/j.reprotox.2021.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 11/21/2021] [Accepted: 12/08/2021] [Indexed: 11/17/2022]
Abstract
The ovary is a highly organized composite of germ cells and various types of somatic cells, whose communications dictate ovary development to generate functional oocytes. The differences between individual cells might have profound effects on ovary functions. Single cell RNA sequencing techniques are promising approaches to explore the cell type composition of organisms, the dynamics of transcriptomes, the regulatory network between genes and the signaling pathways between cell types at the single cell resolution. In this review, we provide an overview of the currently available single cell RNA sequencing techniques including Smart-seq2 and Drop-seq, as well as their applications in biological and clinical research to provide a better understanding on the molecular mechanisms underlying ovary development and associated diseases.
Affiliation(s)
- Zheng-Hui Zhao
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xiao-Yu Wang
- School of Social Development and Public Policy, Beijing Normal University, Beijing, 100875, China
| | - Heide Schatten
- Department of Veterinary Pathobiology, University of Missouri, Columbia, MO, 65211, United States
| | - Qing-Yuan Sun
- Fertility Preservation Lab, Guangdong-Hong Kong Metabolism & Reproduction Joint Laboratory, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, 510317, China
| |
|
42
|
Bej S, Galow AM, David R, Wolfien M, Wolkenhauer O. Automated annotation of rare-cell types from single-cell RNA-sequencing data through synthetic oversampling. BMC Bioinformatics 2021; 22:557. [PMID: 34798805 PMCID: PMC8603509 DOI: 10.1186/s12859-021-04469-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/09/2021] [Accepted: 11/03/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The research landscape of single-cell and single-nuclei RNA-sequencing is evolving rapidly. In particular, the area for the detection of rare cells was highly facilitated by this technology. However, an automated, unbiased, and accurate annotation of rare subpopulations is challenging. Once rare cells are identified in one dataset, it is usually necessary to generate further specific datasets to enrich the analysis (e.g., with samples from other tissues). From a machine learning perspective, the challenge arises from the fact that rare-cell subpopulations constitute an imbalanced classification problem. We here introduce a Machine Learning (ML)-based oversampling method that uses gene expression counts of already identified rare cells as an input to generate synthetic cells to then identify similar (rare) cells in other publicly available experiments. We utilize single-cell synthetic oversampling (sc-SynO), which is based on the Localized Random Affine Shadowsampling (LoRAS) algorithm. The algorithm corrects for the overall imbalance ratio of the minority and majority class. RESULTS We demonstrate the effectiveness of our method for three independent use cases, each consisting of already published datasets. The first use case identifies cardiac glial cells in snRNA-Seq data (17 nuclei out of 8635). This use case was designed to take a larger imbalance ratio (~1 to 500) into account and only uses single-nuclei data. The second use case was designed to jointly use snRNA-Seq data and scRNA-Seq on a lower imbalance ratio (~1 to 26) for the training step to likewise investigate the potential of the algorithm to consider both single-cell capture procedures and the impact of "less" rare-cell types. The third dataset refers to the murine data of the Allen Brain Atlas, including more than 1 million cells. For validation purposes only, all datasets have also been analyzed traditionally using common data analysis approaches, such as the Seurat workflow. 
CONCLUSIONS In comparison to baseline testing without oversampling, our approach identifies rare-cells with a robust precision-recall balance, including a high accuracy and low false positive detection rate. A practical benefit of our algorithm is that it can be readily implemented in other and existing workflows. The code basis in R and Python is publicly available at FairdomHub, as well as GitHub, and can easily be transferred to identify other rare-cell types.
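The imbalance ratio quoted in the abstract above (17 nuclei out of 8635, roughly 1 to 500) and the general idea of synthetic oversampling can be sketched as below. The oversampler is a deliberately simplified assumption: it forms convex combinations of noisy copies of minority cells drawn from the whole minority class, and it is not the LoRAS algorithm or the sc-SynO code. The function names, `n_shadow`, and `noise_sd` are hypothetical.

```python
import random


def imbalance_ratio(n_minority, n_total):
    """Number of majority cells per minority cell."""
    return (n_total - n_minority) / n_minority


def synthetic_cells(minority, n_new, n_shadow=5, noise_sd=0.05, seed=0):
    """Generate synthetic minority samples as convex combinations of noisy copies.

    Simplified illustration only: noisy copies are drawn from the whole
    minority class rather than from local neighbourhoods.
    """
    rng = random.Random(seed)
    n_features = len(minority[0])
    generated = []
    for _ in range(n_new):
        # Noisy copies of randomly chosen minority cells
        shadows = [
            [x + rng.gauss(0.0, noise_sd) for x in rng.choice(minority)]
            for _ in range(n_shadow)
        ]
        # Random positive weights normalised to sum to 1 (a convex combination)
        raw = [rng.random() + 1e-12 for _ in range(n_shadow)]
        total = sum(raw)
        weights = [w / total for w in raw]
        generated.append([
            sum(w * s[j] for w, s in zip(weights, shadows))
            for j in range(n_features)
        ])
    return generated


# First use case from the abstract: 17 rare nuclei out of 8635.
ratio = imbalance_ratio(17, 8635)
print(round(ratio, 1))  # about 507 majority nuclei per rare nucleus

# Toy minority class with two features per cell (values are made up).
rare = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
print(synthetic_cells(rare, n_new=2))
```

With `noise_sd=0.0` every synthetic sample lies inside the convex hull of the minority cells, which is the property that keeps generated cells close to the observed rare population.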
Affiliation(s)
- Saptarshi Bej
- Department of Systems Biology and Bioinformatics, University of Rostock, 18057, Rostock, Germany
- Leibniz-Institute for Food Systems Biology, Technical University of Munich, 85354, Freising, Germany
| | - Anne-Marie Galow
- Institute of Genome Biology, Research Institute for Farm Animal Biology, 18196, Dummerstorf, Germany
| | - Robert David
- Department of Cardiac Surgery, Rostock University Medical Centre, 18057, Rostock, Germany
- Department of Life, Light and Matter, University of Rostock, 18059, Rostock, Germany
| | - Markus Wolfien
- Department of Systems Biology and Bioinformatics, University of Rostock, 18057, Rostock, Germany
| | - Olaf Wolkenhauer
- Department of Systems Biology and Bioinformatics, University of Rostock, 18057, Rostock, Germany
- Leibniz-Institute for Food Systems Biology, Technical University of Munich, 85354, Freising, Germany
- Stellenbosch Institute of Advanced Study, Stellenbosch University, Stellenbosch, 7602, South Africa
| |
|
43
|
Zappia L, Theis FJ. Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape. Genome Biol 2021; 22:301. [PMID: 34715899 PMCID: PMC8555270 DOI: 10.1186/s13059-021-02519-4] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 08/24/2021] [Accepted: 10/14/2021] [Indexed: 11/16/2022] Open
Abstract
Recent years have seen a revolution in single-cell RNA-sequencing (scRNA-seq) technologies, datasets, and analysis methods. Since 2016, the scRNA-tools database has cataloged software tools for analyzing scRNA-seq data. With the number of tools in the database passing 1000, we provide an update on the state of the project and the field. This data shows the evolution of the field and a change of focus from ordering cells on continuous trajectories to integrating multiple samples and making use of reference datasets. We also find that open science practices reward developers with increased recognition and help accelerate the field.
Affiliation(s)
- Luke Zappia
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
| |
|
44
|
Transcriptional Differences in Lipid-Metabolizing Enzymes in Murine Sebocytes Derived from Sebaceous Glands of the Skin and Preputial Glands. Int J Mol Sci 2021; 22:11631. [PMID: 34769061 PMCID: PMC8584257 DOI: 10.3390/ijms222111631] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2021] [Revised: 10/22/2021] [Accepted: 10/25/2021] [Indexed: 12/18/2022] Open
Abstract
Sebaceous glands are adnexal structures, which critically contribute to skin homeostasis and the establishment of a functional epidermal barrier. Sebocytes, the main cell population found within the sebaceous glands, are highly specialized lipid-producing cells. Sebaceous gland-resembling tissue structures are also found in male rodents in the form of preputial glands. Similar to sebaceous glands, they are composed of lipid-specialized sebocytes. Due to a lack of adequate organ culture models for skin sebaceous glands and the fact that preputial glands are much larger and easier to handle, previous studies used preputial glands as a model for skin sebaceous glands. Here, we compared both types of sebocytes, using a single-cell RNA sequencing approach, to unravel potential similarities and differences between the two sebocyte populations. In spite of common gene expression patterns due to general lipid-producing properties, we found significant differences in the expression levels of genes encoding enzymes involved in the biogenesis of specialized lipid classes. Specifically, genes critically involved in the mevalonate pathway, including squalene synthase, as well as the sphingolipid salvage pathway, such as ceramide synthase, (acid) sphingomyelinase or acid and alkaline ceramidases, were significantly less expressed by preputial gland sebocytes. Together, our data revealed tissue-specific sebocyte populations, indicating major developmental, functional as well as biosynthetic differences between both glands. The use of preputial glands as a surrogate model to study skin sebaceous glands is therefore limited, and major differences between both glands need to be carefully considered before planning an experiment.
|
45
|
Lu S, Conn DJ, Chen S, Johnson KD, Bresnick EH, Keleş S. MLG: multilayer graph clustering for multi-condition scRNA-seq data. Nucleic Acids Res 2021; 49:e127. [PMID: 34581807 PMCID: PMC8682753 DOI: 10.1093/nar/gkab823] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/10/2020] [Revised: 08/13/2021] [Accepted: 09/21/2021] [Indexed: 11/14/2022] Open
Abstract
Single-cell transcriptome sequencing (scRNA-seq) enabled investigations of cellular heterogeneity at exceedingly higher resolutions. Identification of novel cell types or transient developmental stages across multiple experimental conditions is one of its key applications. Linear and non-linear dimensionality reduction for data integration became a foundational tool in inference from scRNA-seq data. We present multilayer graph clustering (MLG) as an integrative approach for combining multiple dimensionality reduction of multi-condition scRNA-seq data. MLG generates a multilayer shared nearest neighbor cell graph with higher signal-to-noise ratio and outperforms current best practices in terms of clustering accuracy across large-scale benchmarking experiments. Application of MLG to a wide variety of datasets from multiple conditions highlights how MLG boosts signal-to-noise ratio for fine-grained sub-population identification. MLG is widely applicable to settings with single cell data integration via dimension reduction.
Affiliation(s)
- Shan Lu
- Department of Statistics, University of Wisconsin, Madison, WI 53706, USA
| | - Daniel J Conn
- Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, WI 53792, USA
| | - Shuyang Chen
- Department of Statistics, University of Wisconsin, Madison, WI 53706, USA
| | - Kirby D Johnson
- Wisconsin Blood Cancer Research Institute, Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, WI 53705, USA
| | - Emery H Bresnick
- Wisconsin Blood Cancer Research Institute, Department of Cell and Regenerative Biology, University of Wisconsin School of Medicine and Public Health, Madison, WI 53705, USA
| | - Sündüz Keleş
- Department of Statistics, University of Wisconsin, Madison, WI 53706, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin School of Medicine and Public Health, Madison, WI 53792, USA
| |
|
46
|
Schmidt F, Ranjan B, Lin Q, Krishnan V, Joanito I, Honardoost M, Nawaz Z, Venkatesh P, Tan J, Rayan N, Ong S, Prabhakar S. RCA2: a scalable supervised clustering algorithm that reduces batch effects in scRNA-seq data. Nucleic Acids Res 2021; 49:8505-8519. [PMID: 34320202 PMCID: PMC8344557 DOI: 10.1093/nar/gkab632] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/18/2021] [Revised: 07/06/2021] [Accepted: 07/13/2021] [Indexed: 11/13/2022] Open
Abstract
The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to group cells by technical, rather than biological, variation. Compared to de-novo (unsupervised) clustering, we demonstrate using multiple benchmarks that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.
Affiliation(s)
- Florian Schmidt
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Bobby Ranjan
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Quy Xiao Xuan Lin
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | | | - Ignasius Joanito
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Mohammad Amin Honardoost
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Department of Medicine, School of Medicine, National University of Singapore, 1 Kent Ridge Road, level 10, NUHS Tower Block, 119228, Singapore
| | - Zahid Nawaz
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Prasanna Nori Venkatesh
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Joanna Tan
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Nirmala Arul Rayan
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| | - Sin Tiong Ong
- DUKE-NUS Medical School, 8 College Rd, 169857, Singapore
- Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA
| | - Shyam Prabhakar
- Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
| |
|
47
|
Wang J, Zou Q, Lin C. A comparison of deep learning-based pre-processing and clustering approaches for single-cell RNA sequencing data. Brief Bioinform 2021; 23:6361043. [PMID: 34472590 DOI: 10.1093/bib/bbab345] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/26/2021] [Revised: 07/22/2021] [Accepted: 08/04/2021] [Indexed: 11/13/2022] Open
Abstract
The emergence of single cell RNA sequencing has facilitated the study of genomes, transcriptomes and proteomes. As available single-cell RNA-seq datasets are released continuously, one of the major challenges facing traditional RNA analysis tools is the high-dimensional, high-sparsity, high-noise and large-scale characteristics of single-cell RNA-seq data. Deep learning technologies match the characteristics of single-cell RNA-seq data perfectly and offer unprecedented promise. Here, we give a systematic review of the most popular single-cell RNA-seq analysis methods and tools based on deep learning models, involving the procedures of data preprocessing (quality control, normalization, data correction, dimensionality reduction and data visualization) and the clustering task for downstream analysis. We further evaluate the deep model-based analysis methods of data correction and clustering quantitatively on 11 gold standard datasets. Moreover, we discuss the data preferences of these methods and their limitations, and give some suggestions and guidance for users to select appropriate methods and tools.
Affiliation(s)
- Jiacheng Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Quan Zou
- School of Informatics, Xiamen University, Xiamen, China
| | - Chen Lin
- School of Informatics, Xiamen University, Xiamen, China
| |
|
48
|
Talukdar S, Chang Z, Winterhoff B, Starr TK. Single-Cell RNA Sequencing of Ovarian Cancer: Promises and Challenges. Adv Exp Med Biol 2021; 1330:113-123. [PMID: 34339033 DOI: 10.1007/978-3-030-73359-9_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Ovarian cancer remains the leading cause of death from gynecologic malignancy in the Western world. Tumors are comprised of heterogeneous populations of various cancer, immune, and stromal cells; it is hypothesized that rare cancer stem cells within these subpopulations lead to disease recurrence and treatment resistance. Technological advances now allow for the analysis of tumor genomes and transcriptomes at the single-cell level, which provides the resolution to potentially identify these rare cancer stem cells within the larger tumor. In this chapter, we review the evolution of next-generation RNA sequencing techniques, the methodology of single-cell isolation and sequencing, sequencing data analysis, and the potential applications in ovarian cancer. We also summarize the current published work using single-cell sequencing in ovarian cancer. By utilizing this novel technique to characterize the gene expression of rare subpopulations, new targets and treatment pathways may be identified in ovarian cancer to change treatment paradigms.
Affiliation(s)
- Shobhana Talukdar
- Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
| | - Zenas Chang
- Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
| | - Boris Winterhoff
- Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
| | - Timothy K Starr
- Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA.
- Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA.
- Institute of Health Informatics, University of Minnesota, Minneapolis, MN, USA.
| |
|
49
|
Wilk AJ, Lee MJ, Wei B, Parks B, Pi R, Martínez-Colón GJ, Ranganath T, Zhao NQ, Taylor S, Becker W, Jimenez-Morales D, Blomkalns AL, O’Hara R, Ashley EA, Nadeau KC, Yang S, Holmes S, Rabinovitch M, Rogers AJ, Greenleaf WJ, Blish CA. Multi-omic profiling reveals widespread dysregulation of innate immunity and hematopoiesis in COVID-19. J Exp Med 2021; 218:e20210582. [PMID: 34128959 PMCID: PMC8210586 DOI: 10.1084/jem.20210582] [Citation(s) in RCA: 124] [Impact Index Per Article: 41.3] [Received: 03/12/2021] [Revised: 05/13/2021] [Accepted: 05/13/2021] [Indexed: 12/20/2022] Open
Abstract
Our understanding of protective versus pathological immune responses to SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), is limited by inadequate profiling of patients at the extremes of the disease severity spectrum. Here, we performed multi-omic single-cell immune profiling of 64 COVID-19 patients across the full range of disease severity, from outpatients with mild disease to fatal cases. Our transcriptomic, epigenomic, and proteomic analyses revealed widespread dysfunction of peripheral innate immunity in severe and fatal COVID-19, including prominent hyperactivation signatures in neutrophils and NK cells. We also identified chromatin accessibility changes at NF-κB binding sites within cytokine gene loci as a potential mechanism for the striking lack of pro-inflammatory cytokine production observed in monocytes in severe and fatal COVID-19. We further demonstrated that emergency myelopoiesis is a prominent feature of fatal COVID-19. Collectively, our results reveal disease severity-associated immune phenotypes in COVID-19 and identify pathogenesis-associated pathways that are potential targets for therapeutic intervention.
Affiliation(s)
- Aaron J. Wilk
- Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA
- Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | - Madeline J. Lee
- Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | - Bei Wei
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
| | - Benjamin Parks
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Graduate Program in Computer Science, Stanford University School of Medicine, Stanford, CA
| | - Ruoxi Pi
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | | | - Thanmayi Ranganath
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | - Nancy Q. Zhao
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | - Shalina Taylor
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA
- Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA
- Vera Moulton Wall Center for Pulmonary Vascular Disease, Stanford University School of Medicine, Stanford, CA
| | - Winston Becker
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
| | | | | | - Andra L. Blomkalns
- Department of Emergency Medicine, Stanford University School of Medicine, Stanford, CA
| | - Ruth O’Hara
- Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA
| | - Euan A. Ashley
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | - Kari C. Nadeau
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Sean N. Parker Center for Allergy and Asthma Research, Stanford University School of Medicine, Stanford, CA
| | - Samuel Yang
- Department of Emergency Medicine, Stanford University School of Medicine, Stanford, CA
| | - Susan Holmes
- Department of Statistics, Stanford University, Stanford, CA
| | - Marlene Rabinovitch
- Department of Pediatrics, Stanford University School of Medicine, Stanford, CA
- Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA
- Vera Moulton Wall Center for Pulmonary Vascular Disease, Stanford University School of Medicine, Stanford, CA
| | - Angela J. Rogers
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
| | - William J. Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Department of Applied Physics, Stanford University, Stanford, CA
| | - Catherine A. Blish
- Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA
- Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Chan Zuckerberg Biohub, San Francisco, CA
| |
|
50
|
Song D, Li K, Hemminger Z, Wollman R, Li JJ. scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics 2021; 37:i358-i366. [PMID: 34252925 PMCID: PMC8275345 DOI: 10.1093/bioinformatics/btab273] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022] Open
Abstract
Motivation: Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative genes from scRNA-seq data. Moreover, single-cell targeted gene profiling technologies are gaining popularity for their low costs, high sensitivity and extra (e.g. spatial) information; however, they typically can only measure up to a few hundred genes. Then another challenging question is how to select genes for targeted gene profiling based on existing scRNA-seq data. Results: Here, we develop the single-cell Projective Non-negative Matrix Factorization (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Compared with existing gene selection methods, scPNMF has two advantages. First, its selected informative genes can better distinguish cell types. Second, it enables the alignment of new targeted gene profiling data with reference data in a low-dimensional space to facilitate the prediction of cell types in the new data. Technically, scPNMF modifies the PNMF algorithm for gene selection by changing the initialization and adding a basis selection step, which selects informative bases to distinguish cell types. We demonstrate that scPNMF outperforms the state-of-the-art gene selection methods on diverse scRNA-seq datasets. Moreover, we show that scPNMF can guide the design of targeted gene profiling experiments and the cell-type annotation on targeted gene profiling data. Availability and implementation: The R package is open-access and available at https://github.com/JSB-UCLA/scPNMF. The data used in this work are available at Zenodo: https://doi.org/10.5281/zenodo.4797997. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dongyuan Song
- Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, CA 90095-7246, USA
| | - Kexin Li
- Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
| | - Zachary Hemminger
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA 90095, USA
- Department of Integrative Biology and Physiology, University of California, Los Angeles, CA 90095-7239, USA
| | - Roy Wollman
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA 90095, USA
- Department of Integrative Biology and Physiology, University of California, Los Angeles, CA 90095-7239, USA
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095-1569, USA
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
- Department of Human Genetics, University of California, Los Angeles, CA 90095-7088, USA
- Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766, USA
- Department of Biostatistics, University of California Los Angeles, CA 90095-1772, USA
| |
|