1
Malekpour SA, Haghverdi L, Sadeghi M. Single-cell multi-omics analysis identifies context-specific gene regulatory gates and mechanisms. Brief Bioinform 2024; 25:bbae180. [PMID: 38653489 PMCID: PMC11036345 DOI: 10.1093/bib/bbae180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 01/29/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
There is growing interest in inferring context-specific gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. This involves identifying the regulatory relationships between transcription factors (TFs) and genes in individual cells, and then characterizing these relationships at the level of specific cell types or cell states. In this study, we introduce scGATE (single-cell gene regulatory gate), a novel computational tool for inferring TF-gene interaction networks and reconstructing Boolean logic gates involving regulatory TFs from scRNA-seq data. In contrast to current Boolean models, scGATE eliminates the need for individual formulations and likelihood calculations for each Boolean rule (e.g. AND, OR, XOR). By employing a Bayesian framework, scGATE infers the Boolean rule after fitting the model to the data, which significantly reduces the time complexity of logic-based studies. We use single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) data and TF DNA-binding motifs to filter out TFs that are not relevant to the regulation of a gene. By integrating single-cell clustering with these external cues, scGATE is able to infer context-specific networks. The performance of scGATE is evaluated using synthetic and real single-cell multi-omics data from mouse tissues and human blood, demonstrating its superiority over existing tools for reconstructing TF-gene networks. Additionally, scGATE provides a flexible framework for understanding the complex combinatorial and cooperative relationships among TFs regulating target genes by inferring Boolean logic gates among them.
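As a rough illustration of what reading a Boolean gate off data can look like, the minimal Python sketch below estimates, for two binarised TFs, how often the target gene is active under each of the four input combinations and then names the gate whose truth table matches. This is not scGATE's Bayesian model: the function names, the 0.5 activation threshold, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict

# Truth tables for three common two-input gates, keyed by (tf1, tf2) state.
GATES = {
    "AND": {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1},
    "OR":  {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1},
    "XOR": {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0},
}

def estimate_truth_table(tf1, tf2, target, threshold=0.5):
    """Fraction of cells with an active target per TF state, thresholded to 0/1.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [active cells, total cells]
    for a, b, y in zip(tf1, tf2, target):
        counts[(a, b)][0] += y
        counts[(a, b)][1] += 1
    return {state: int(active / total > threshold)
            for state, (active, total) in counts.items()}

def name_gate(table):
    """Return the name of the gate whose truth table matches, or None."""
    for name, gate in GATES.items():
        if all(table.get(state) == out for state, out in gate.items()):
            return name
    return None

# Hypothetical binarised expression for eight cells.
tf1    = [0, 0, 1, 1, 0, 1, 1, 0]
tf2    = [0, 1, 0, 1, 1, 1, 0, 0]
target = [0, 0, 0, 1, 0, 1, 0, 0]
print(name_gate(estimate_truth_table(tf1, tf2, target)))  # prints "AND"
```

Estimating the truth table first and naming the gate afterwards mirrors, in a very simplified way, the idea of inferring the rule after fitting rather than scoring each candidate rule separately.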
Affiliation(s)
- Seyed Amir Malekpour
  - School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), 19395-5746, Tehran, Iran
- Laleh Haghverdi
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
- Mehdi Sadeghi
  - Department of Medical Genetics, National Institute of Genetic Engineering and Biotechnology, 1497716316, Tehran, Iran
2
Bouman BJ, Demerdash Y, Sood S, Grünschläger F, Pilz F, Itani AR, Kuck A, Marot-Lassauzaie V, Haas S, Haghverdi L, Essers MA. Single-cell time series analysis reveals the dynamics of HSPC response to inflammation. Life Sci Alliance 2024; 7:e202302309. [PMID: 38110222 PMCID: PMC10728485 DOI: 10.26508/lsa.202302309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 12/08/2023] [Accepted: 12/11/2023] [Indexed: 12/20/2023] Open
Abstract
Hematopoietic stem and progenitor cells (HSPCs) are known to respond to acute inflammation; however, little is understood about the dynamics and heterogeneity of these stress responses in HSPCs. Here, we performed single-cell sequencing during the sensing, response, and recovery phases of the inflammatory response of HSPCs to treatment with the pro-inflammatory cytokine IFNα (a total of 10,046 cells from four time points spanning the first 72 h of response) to investigate the HSPCs' dynamic changes during acute inflammation. We developed novel computational approaches to process and analyze the resulting single-cell time series dataset. These include an unbiased cell type annotation and abundance analysis post inflammation, tools for identification of global and cell type-specific responding genes, and a semi-supervised linear regression approach for response pseudotime reconstruction. We discovered a variety of gene responses of the HSPCs to the treatment. Interestingly, we were able to associate a globally reduced myeloid differentiation program and a locally enhanced pyroptosis activity with reduced myeloid progenitor and differentiated cells after IFNα treatment. Altogether, the single-cell time series analyses have allowed us to study the heterogeneous and dynamic impact of IFNα on the HSPCs in an unbiased manner.
Affiliation(s)
- Brigitte J Bouman
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center in the Helmholtz Association, Berlin, Germany
  - Institute for Biology, Humboldt-Universität zu Berlin, Berlin, Germany
- Yasmin Demerdash
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
- Shubhankar Sood
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
- Florian Grünschläger
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
  - Division of Stem Cells and Cancer, Deutsches Krebsforschungszentrum (DKFZ) and DKFZ-ZMBH Alliance, Heidelberg, Germany
- Franziska Pilz
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
- Abdul R Itani
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
- Andrea Kuck
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
- Valérie Marot-Lassauzaie
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center in the Helmholtz Association, Berlin, Germany
  - Charité-Universitätsmedizin, Berlin, Germany
- Simon Haas
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center in the Helmholtz Association, Berlin, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - Department of Hematology, Oncology and Cancer Immunology, Campus Benjamin Franklin, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
  - German Cancer Consortium (DKTK), Heidelberg, Germany
  - Berlin Institute of Health (BIH) at Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
  - Charité-Universitätsmedizin, Berlin, Germany
- Laleh Haghverdi
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center in the Helmholtz Association, Berlin, Germany
- Marieke AG Essers
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - DKFZ-ZMBH Alliance, Heidelberg, Germany
3
Mölbert C, Haghverdi L. Adjustments to the reference dataset design improve cell type label transfer. Front Bioinform 2023; 3:1150099. [PMID: 37091908 PMCID: PMC10114588 DOI: 10.3389/fbinf.2023.1150099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 03/21/2023] [Indexed: 04/07/2023] Open
Abstract
The transfer of cell type labels from pre-annotated (reference) to newly collected data is an important task in single-cell data analysis. As the number of publicly available annotated datasets that can be used as reference, as well as the number of computational methods for cell type label transfer, is constantly growing, rationales for deciding which reference design and which method to use for a particular query dataset are needed. Using detailed data visualisations and interpretable statistical assessments, we benchmark a set of popular cell type annotation methods, test their performance on different cell types and study the effects of the design of reference data (e.g., cell sampling criteria, inclusion of multiple datasets in one reference, gene set selection) on the reliability of predictions. Our results highlight the need for further improvements in label transfer methods, as well as for the preparation of high-quality pre-annotated reference data with adequate sampling from all cell types of interest, for more reliable annotation of new datasets.
Affiliation(s)
- Carla Mölbert
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin Institute for Medical Systems Biology, Berlin, Germany
  - Department of Biology, Humboldt-Universität zu Berlin, Berlin, Germany
  - *Correspondence: Carla Mölbert,
- Laleh Haghverdi
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association, Berlin Institute for Medical Systems Biology, Berlin, Germany
4
Haghverdi L, Ludwig LS. Single-cell multi-omics and lineage tracing to dissect cell fate decision-making. Stem Cell Reports 2023; 18:13-25. [PMID: 36630900 PMCID: PMC9860164 DOI: 10.1016/j.stemcr.2022.12.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/30/2022] [Revised: 12/07/2022] [Accepted: 12/07/2022] [Indexed: 01/12/2023] Open
Abstract
The concept of cell fate relates to the future identity of a cell and its daughters, which is obtained via cell differentiation and division. Understanding, predicting, and manipulating cell fate has been a long-sought goal of developmental and regenerative biology. Recent insights obtained from single-cell genomic and integrative lineage-tracing approaches have further helped to identify molecular features predictive of cell fate. In this perspective, we discuss these approaches with a focus on theoretical concepts and future directions of the field to dissect molecular mechanisms underlying cell fate.
Affiliation(s)
- Laleh Haghverdi
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin Institute for Medical Systems Biology (BIMSB), Berlin, Germany
- Leif S. Ludwig
  - Max-Delbrück-Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin Institute for Medical Systems Biology (BIMSB), Berlin, Germany
  - Berlin Institute of Health at Charité – Universitätsmedizin Berlin, Berlin, Germany
  - Corresponding author
5
Hinze C, Kocks C, Leiz J, Karaiskos N, Boltengagen A, Cao S, Skopnik CM, Klocke J, Hardenberg JH, Stockmann H, Gotthardt I, Obermayer B, Haghverdi L, Wyler E, Landthaler M, Bachmann S, Hocke AC, Corman V, Busch J, Schneider W, Himmerkus N, Bleich M, Eckardt KU, Enghard P, Rajewsky N, Schmidt-Ott KM. Single-cell transcriptomics reveals common epithelial response patterns in human acute kidney injury. Genome Med 2022; 14:103. [PMID: 36085050 PMCID: PMC9462075 DOI: 10.1186/s13073-022-01108-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 01/11/2022] [Accepted: 08/12/2022] [Indexed: 01/07/2023] Open
Abstract
Background: Acute kidney injury (AKI) occurs frequently in critically ill patients and is associated with adverse outcomes. Cellular mechanisms underlying AKI and kidney cell responses to injury remain incompletely understood. Methods: We performed single-nuclei transcriptomics, bulk transcriptomics, molecular imaging studies, and conventional histology on kidney tissues from 8 individuals with severe AKI (stage 2 or 3 according to Kidney Disease: Improving Global Outcomes (KDIGO) criteria). Specimens were obtained within 1–2 h after individuals had succumbed to critical illness associated with respiratory infections, with 4 of 8 individuals diagnosed with COVID-19. Control kidney tissues were obtained post-mortem or after nephrectomy from individuals without AKI. Results: High-depth single cell-resolved gene expression data of human kidneys affected by AKI revealed enrichment of novel injury-associated cell states within the major cell types of the tubular epithelium, in particular in proximal tubules, thick ascending limbs, and distal convoluted tubules. Four distinct, hierarchically interconnected injured cell states were distinguishable and characterized by transcriptome patterns associated with oxidative stress, hypoxia, interferon response, and epithelial-to-mesenchymal transition, respectively. Transcriptome differences between individuals with AKI were driven primarily by the cell type-specific abundance of these four injury subtypes rather than by private molecular responses. AKI-associated changes in gene expression between individuals with and without COVID-19 were similar. Conclusions: The study provides an extensive resource of the cell type-specific transcriptomic responses associated with critical illness-associated AKI in humans, highlighting recurrent disease-associated signatures and inter-individual heterogeneity. Personalized molecular disease assessment in human AKI may foster the development of tailored therapies.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13073-022-01108-9.
Affiliation(s)
- Christian Hinze
  - Department of Nephrology and Hypertension, Hannover Medical School, Hannover, Germany
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Max Delbrueck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Christine Kocks
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Janna Leiz
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Max Delbrueck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Nikos Karaiskos
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Anastasiya Boltengagen
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Shuang Cao
  - Department of Nephrology and Hypertension, Hannover Medical School, Hannover, Germany
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Max Delbrueck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
- Christopher Mark Skopnik
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Deutsches Rheumaforschungszentrum, an Institute of the Leibniz Foundation, Berlin, Germany
- Jan Klocke
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Deutsches Rheumaforschungszentrum, an Institute of the Leibniz Foundation, Berlin, Germany
  - Berlin Institute of Health, Berlin, Germany
- Jan-Hendrik Hardenberg
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Helena Stockmann
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Inka Gotthardt
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Laleh Haghverdi
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Emanuel Wyler
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Markus Landthaler
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Sebastian Bachmann
  - Institute for Functional Anatomy, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Andreas C Hocke
  - Berlin Institute of Health, Berlin, Germany
  - Department of Infectious Diseases and Respiratory Medicine, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Victor Corman
  - Institute of Virology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Jonas Busch
  - Department of Urology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Wolfgang Schneider
  - Department of Pathology, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Nina Himmerkus
  - Institute of Physiology, Christian-Albrechts-Universität, Kiel, Germany
- Markus Bleich
  - Institute of Physiology, Christian-Albrechts-Universität, Kiel, Germany
- Kai-Uwe Eckardt
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
- Philipp Enghard
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Deutsches Rheumaforschungszentrum, an Institute of the Leibniz Foundation, Berlin, Germany
- Nikolaus Rajewsky
  - Berlin Institute for Medical Systems Biology, Max Delbrueck Center in the Helmholtz Association, Berlin, Germany
- Kai M Schmidt-Ott
  - Department of Nephrology and Hypertension, Hannover Medical School, Hannover, Germany
  - Department of Nephrology and Medical Intensive Care, Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt-Universität Zu Berlin, Berlin, Germany
  - Max Delbrueck Center for Molecular Medicine in the Helmholtz Association, Berlin, Germany
6
Marot-Lassauzaie V, Bouman BJ, Donaghy FD, Demerdash Y, Essers MAG, Haghverdi L. Towards reliable quantification of cell state velocities. PLoS Comput Biol 2022; 18:e1010031. [PMID: 36170235 PMCID: PMC9550177 DOI: 10.1371/journal.pcbi.1010031] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/17/2022] [Revised: 10/10/2022] [Accepted: 08/26/2022] [Indexed: 11/25/2022] Open
Abstract
A few years ago, it was proposed to use the simultaneous quantification of unspliced and spliced messenger RNA (mRNA) to add a temporal dimension to high-throughput snapshots of single-cell RNA sequencing data. This concept can yield additional insight into the transcriptional dynamics of the biological systems under study. However, current methods for inferring cell state velocities from such data (known as RNA velocities) are afflicted by several theoretical and computational problems, hindering realistic and reliable velocity estimation. We discuss these issues and propose new solutions for addressing some of the current challenges in consistency of data processing, velocity inference and visualisation. We translate our computational conclusions into two velocity analysis tools: one detailed method, κ-velo, and one heuristic method, eco-velo, each of which uses a different set of assumptions about the data.
Affiliation(s)
- Valérie Marot-Lassauzaie
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
  - Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Brigitte Joanne Bouman
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
  - Humboldt Universität zu Berlin, Institute for Biology, Berlin, Germany
- Fearghal Declan Donaghy
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
- Yasmin Demerdash
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - Faculty of Biosciences, University of Heidelberg, Heidelberg, Germany
- Marieke Alida Gertruda Essers
  - Division Inflammatory Stress in Stem Cells, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Heidelberg Institute for Stem Cell Technology and Experimental Medicine (HI-STEM gGMBH), Heidelberg, Germany
  - DKFZ-ZMBH Alliance, Heidelberg, Germany
- Laleh Haghverdi
  - Berlin Institute for Medical Systems Biology, Max Delbrück Center (BIMSB-MDC) in the Helmholtz Association, Berlin, Germany
7
Sood S, Demerdash Y, Bouman B, Mascaro RB, Jolly AS, Klein LS, Bast L, Pilz F, Grünschläger F, Haghverdi L, Essers M. 2023 – INFLAMMATION-RESPONDING MESENCHYMAL STROMAL CELLS MODULATE THE BONE MARROW MICROENVIRONMENT RESPONSE TO STRESS OVER TIME. Exp Hematol 2021. [DOI: 10.1016/j.exphem.2021.12.388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
8
Haghverdi L, Lun ATL, Morgan MD, Marioni JC. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat Biotechnol 2018; 36:421-427. [PMID: 29608177 PMCID: PMC6152897 DOI: 10.1038/nbt.4091] [Citation(s) in RCA: 1056] [Impact Index Per Article: 176.0] [Received: 07/03/2017] [Accepted: 02/01/2018] [Indexed: 12/23/2022]
Abstract
Large-scale single-cell RNA sequencing (scRNA-seq) data sets that are produced in different laboratories and at different times contain batch effects that may compromise the integration and interpretation of the data. Existing scRNA-seq analysis methods incorrectly assume that the composition of cell populations is either known or identical across batches. We present a strategy for batch correction based on the detection of mutual nearest neighbors (MNNs) in the high-dimensional expression space. Our approach does not rely on predefined or equal population compositions across batches; instead, it requires only that a subset of the population be shared between batches. We demonstrate the superiority of our approach compared with existing methods by using both simulated and real scRNA-seq data sets. Using multiple droplet-based scRNA-seq data sets, we demonstrate that our MNN batch-effect-correction method can be scaled to large numbers of cells.
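The core idea of detecting mutual nearest neighbors between two batches can be sketched in a few lines of Python. The sketch below only finds MNN pairs with a brute-force Euclidean search; it does not perform the batch correction itself, and the function names and toy 2-D "expression" profiles are assumptions chosen for illustration rather than the authors' implementation.

```python
import math

def nearest(query, reference, k):
    """Indices of the k cells in `reference` closest to `query` (Euclidean)."""
    order = sorted(range(len(reference)),
                   key=lambda j: math.dist(query, reference[j]))
    return set(order[:k])

def mutual_nearest_neighbors(batch_a, batch_b, k=1):
    """Pairs (i, j) where cell i of A and cell j of B are in each other's k nearest."""
    a_to_b = [nearest(cell, batch_b, k) for cell in batch_a]
    b_to_a = [nearest(cell, batch_a, k) for cell in batch_b]
    return sorted((i, j)
                  for i, neighbours in enumerate(a_to_b)
                  for j in neighbours
                  if i in b_to_a[j])

# Hypothetical toy data: batch B holds the first two cell states of batch A,
# shifted by a batch effect of (+1, +1); the third cell of A has no counterpart in B.
batch_a = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
batch_b = [(1.0, 1.0), (6.0, 6.0)]
pairs = mutual_nearest_neighbors(batch_a, batch_b, k=1)
print(pairs)  # prints [(0, 0), (1, 1)]
```

The third cell of batch A is left unpaired because its nearest neighbor in B prefers a different cell in A, which illustrates why the approach needs only a shared subset of the population rather than identical compositions.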
Affiliation(s)
- Laleh Haghverdi
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
  - Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Aaron T L Lun
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
- John C Marioni
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
  - Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK
  - Wellcome Trust Sanger Institute, Cambridge, UK
9
Haghverdi L, Büttner M, Wolf FA, Buettner F, Theis FJ. Diffusion pseudotime robustly reconstructs lineage branching. Nat Methods 2016; 13:845-8. [PMID: 27571553 DOI: 10.1038/nmeth.3971] [Citation(s) in RCA: 622] [Impact Index Per Article: 77.8] [Received: 02/04/2016] [Accepted: 07/20/2016] [Indexed: 12/15/2022]
Abstract
The temporal order of differentiating cells is intrinsically encoded in their single-cell expression profiles. We describe an efficient way to robustly estimate this order according to diffusion pseudotime (DPT), which measures transitions between cells using diffusion-like random walks. Our DPT software implementations make it possible to reconstruct the developmental progression of cells and identify transient or metastable states, branching decisions and differentiation endpoints.
Affiliation(s)
- Laleh Haghverdi
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
  - Department of Mathematics, Technische Universität München, Munich, Germany
- Maren Büttner
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- F Alexander Wolf
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- Florian Buettner
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Fabian J Theis
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
  - Department of Mathematics, Technische Universität München, Munich, Germany
10
Abstract
Motivation: High-dimensional single-cell snapshot data are becoming widespread in the systems biology community as a means to understand biological processes at the cellular level. However, as temporal information is lost with such data, mathematical models have been limited to capturing only static features of the underlying cellular mechanisms. Results: Here, we present a modular framework that allows recovery of the temporal behaviour from single-cell snapshot data and reverse engineering of the dynamics of gene expression. The framework combines a dimensionality reduction method with a cell time-ordering algorithm to generate pseudo time-series observations. These are in turn used to learn transcriptional ODE models and perform model selection on structural network features. We apply it to synthetic data and then to real hematopoietic stem cell data, to reconstruct gene expression dynamics along differentiation pathways and infer the structure of a key gene regulatory network. Availability and implementation: C++ and Matlab code available at https://www.helmholtz-muenchen.de/fileadmin/ICB/software/inferenceSnapshot.zip. Contact: fabian.theis@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Andrea Ocone
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
  - Department of Mathematics, Technical University Munich, 85747 Garching, Germany
- Laleh Haghverdi
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
  - Department of Mathematics, Technical University Munich, 85747 Garching, Germany
- Nikola S Mueller
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
  - Department of Mathematics, Technical University Munich, 85747 Garching, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
  - Department of Mathematics, Technical University Munich, 85747 Garching, Germany
11
Angerer P, Haghverdi L, Büttner M, Theis FJ, Marr C, Buettner F. destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 2015; 32:1241-3. [PMID: 26668002 DOI: 10.1093/bioinformatics/btv715] [Citation(s) in RCA: 357] [Impact Index Per Article: 39.7] [Received: 08/05/2015] [Accepted: 12/01/2015] [Indexed: 11/14/2022] Open
Abstract
Diffusion maps are a spectral method for non-linear dimension reduction and have recently been adapted for the visualization of single-cell expression data. Here we present destiny, an efficient R implementation of the diffusion map algorithm. Our package includes a single-cell specific noise model allowing for missing and censored values. In contrast to previous implementations, we further present an efficient nearest-neighbour approximation that allows for the processing of hundreds of thousands of cells and a functionality for projecting new data on existing diffusion maps. As an example, we apply destiny to a recent time-resolved mass cytometry dataset of cellular reprogramming. Availability and implementation: destiny is an open-source R/Bioconductor package (bioconductor.org/packages/destiny), also available at www.helmholtz-muenchen.de/icb/destiny. A detailed vignette describing functions and workflows is provided with the package. Contact: carsten.marr@helmholtz-muenchen.de or f.buettner@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Philipp Angerer
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Laleh Haghverdi
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Maren Büttner
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Fabian J Theis
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Boltzmannstr. 3, 85748 Garching, Germany
- Carsten Marr
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Florian Buettner
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
|
12
|
Haghverdi L, Buettner F, Theis FJ. Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 2015; 31:2989-98. [PMID: 26002886 DOI: 10.1093/bioinformatics/btv325] [Citation(s) in RCA: 364] [Impact Index Per Article: 40.4] [Received: 07/31/2014] [Accepted: 05/18/2015] [Indexed: 01/10/2023] Open
Abstract
MOTIVATION: Single-cell technologies have recently gained popularity in cellular differentiation studies regarding their ability to resolve potential heterogeneities in cell populations. Analyzing such high-dimensional single-cell data has its own statistical and computational challenges. Popular multivariate approaches are based on data normalization, followed by dimension reduction and clustering to identify subgroups. However, in the case of cellular differentiation, we would not expect clear clusters to be present but instead expect the cells to follow continuous branching lineages.
RESULTS: Here, we propose the use of diffusion maps to deal with the problem of defining differentiation trajectories. We adapt this method to single-cell data by adequate choice of kernel width and inclusion of uncertainties or missing measurement values, which enables the establishment of a pseudotemporal ordering of single cells in a high-dimensional gene expression space. We expect this output to reflect cell differentiation trajectories, where the data originates from intrinsic diffusion-like dynamics. Starting from a pluripotent stage, cells move smoothly within the transcriptional landscape towards more differentiated states with some stochasticity along their path. We demonstrate the robustness of our method with respect to extrinsic noise (e.g. measurement noise) and sampling density heterogeneities on simulated toy data as well as two single-cell quantitative polymerase chain reaction datasets (i.e. mouse haematopoietic stem cells and mouse embryonic stem cells) and an RNA-Seq data of human pre-implantation embryos. We show that diffusion maps perform considerably better than Principal Component Analysis and are advantageous over other techniques for non-linear dimension reduction such as t-distributed Stochastic Neighbour Embedding for preserving the global structures and pseudotemporal ordering of cells.
AVAILABILITY AND IMPLEMENTATION: The Matlab implementation of diffusion maps for single-cell data is available at https://www.helmholtz-muenchen.de/icb/single-cell-diffusion-map. CONTACT: fbuettner.phys@gmail.com, fabian.theis@helmholtz-muenchen.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Laleh Haghverdi
  - Institute of Computational Biology, Helmholtz Zentrum München 85764 Neuherberg, Germany and Department of Mathematics, Technische Universität München 85748 Garching, Germany
- Florian Buettner
  - Institute of Computational Biology, Helmholtz Zentrum München 85764 Neuherberg, Germany and Department of Mathematics, Technische Universität München 85748 Garching, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Zentrum München 85764 Neuherberg, Germany and Department of Mathematics, Technische Universität München 85748 Garching, Germany
|
13
|
Moignard V, Woodhouse S, Haghverdi L, Lilly AJ, Tanaka Y, Wilkinson AC, Buettner F, Macaulay IC, Jawaid W, Diamanti E, Nishikawa SI, Piterman N, Kouskoff V, Theis FJ, Fisher J, Göttgens B. Decoding the regulatory network of early blood development from single-cell gene expression measurements. Nat Biotechnol 2015; 33:269-276. [PMID: 25664528 PMCID: PMC4374163 DOI: 10.1038/nbt.3154] [Citation(s) in RCA: 288] [Impact Index Per Article: 32.0] [Received: 08/15/2014] [Accepted: 01/16/2015] [Indexed: 11/16/2022]
Abstract
Reconstruction of the molecular pathways controlling organ development has been hampered by a lack of methods to resolve embryonic progenitor cells. Here we describe a strategy to address this problem that combines gene expression profiling of large numbers of single cells with data analysis based on diffusion maps for dimensionality reduction and network synthesis from state transition graphs. Applying the approach to hematopoietic development in the mouse embryo, we map the progression of mesoderm toward blood using single-cell gene expression analysis of 3,934 cells with blood-forming potential captured at four time points between E7.0 and E8.5. Transitions between individual cellular states are then used as input to develop a single-cell network synthesis toolkit to generate a computationally executable transcriptional regulatory network model of blood development. Several model predictions concerning the roles of Sox and Hox factors are validated experimentally. Our results demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the transcriptional programs that underpin organogenesis.
Affiliation(s)
- Victoria Moignard
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Steven Woodhouse
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Laleh Haghverdi
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - Department of Mathematics, Technische Universität München, Garching, Germany
- Andrew J. Lilly
  - Cancer Research UK Stem Cell Haematopoiesis Group, Paterson Institute for Cancer Research, University of Manchester, Manchester, UK
- Yosuke Tanaka
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
  - Laboratory for Stem Cell Biology, RIKEN Center for Developmental Biology, Chuo-ku, Kobe, Japan
- Adam C. Wilkinson
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Florian Buettner
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
- Iain C. Macaulay
  - Sanger Institute-EBI Single Cell Genomics Centre, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Wajid Jawaid
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
- Evangelia Diamanti
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Shin-Ichi Nishikawa
  - Laboratory for Stem Cell Biology, RIKEN Center for Developmental Biology, Chuo-ku, Kobe, Japan
- Nir Piterman
  - Department of Computer Science, University of Leicester, Leicester, UK
- Valerie Kouskoff
  - Cancer Research UK Stem Cell Haematopoiesis Group, Paterson Institute for Cancer Research, University of Manchester, Manchester, UK
- Fabian J. Theis
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg, Germany
  - Department of Mathematics, Technische Universität München, Garching, Germany
- Jasmin Fisher
  - Microsoft Research Cambridge, Cambridge, UK
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- Berthold Göttgens
  - Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, UK
  - Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
|
14
|
Moignard V, Woodhouse S, Haghverdi L, Lilly J, Tanaka Y, Wilkinson A, Buettner F, Nishikawa SI, Piterman N, Kouskoff V, Theis F, Fisher J, Gottgens B. Decoding the transcriptional program for blood development from whole tissue single-cell gene expression measurements. Exp Hematol 2014. [DOI: 10.1016/j.exphem.2014.07.195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|