1
Song Y, Parada G, Lee JTH, Hemberg M. Mining alternative splicing patterns in scRNA-seq data using scASfind. Genome Biol 2024; 25:197. [PMID: 39075577 PMCID: PMC11285346 DOI: 10.1186/s13059-024-03323-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 06/26/2024] [Indexed: 07/31/2024] Open
Abstract
Single-cell RNA-seq (scRNA-seq) is widely used for transcriptome profiling, but most analyses focus on gene-level events, with less attention devoted to alternative splicing. Here, we present scASfind, a novel computational method to allow for quantitative analysis of cell type-specific splicing events using full-length scRNA-seq data. ScASfind utilizes an efficient data structure to store the percent spliced-in value for each splicing event. This makes it possible to exhaustively search for patterns among all differential splicing events, allowing us to identify marker events, mutually exclusive events, and events involving large blocks of exons that are specific to one or more cell types.
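To make the percent spliced-in (PSI) value mentioned in this abstract concrete, the following is a minimal sketch that assumes the common definition PSI = inclusion reads / (inclusion reads + exclusion reads). The function name `percent_spliced_in`, the `min_reads` coverage threshold, and the toy counts are hypothetical illustrations and are not taken from scASfind itself.

```python
from typing import Optional


def percent_spliced_in(inclusion_reads: int, exclusion_reads: int,
                       min_reads: int = 10) -> Optional[float]:
    """Return PSI in [0, 1], or None when coverage is too low to estimate it.

    Assumption: PSI = inclusion / (inclusion + exclusion). The min_reads
    cutoff is a hypothetical filter, not a documented scASfind parameter.
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total < min_reads:
        return None  # too few reads covering the event in this cell
    return inclusion_reads / total


# Hypothetical per-cell read counts for one splicing event: (inclusion, exclusion)
cells = {"cell_a": (30, 10), "cell_b": (2, 1), "cell_c": (0, 25)}
psi_by_cell = {name: percent_spliced_in(inc, exc)
               for name, (inc, exc) in cells.items()}
```

Returning `None` for low-coverage cells, rather than 0, keeps "not measured" distinct from "exon fully skipped", which matters when searching for cell type-specific patterns.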
Affiliation(s)
- Yuyao Song
  - Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, CB10 1SD, UK
- Guillermo Parada
  - Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
  - Donnelly Centre, University of Toronto, Toronto, ON, M5S 3E1, Canada
- Martin Hemberg
  - Wellcome Sanger Institute, Hinxton, CB10 1SA, UK
  - The Gene Lay Institute of Immunology and Inflammation, Brigham and Women's Hospital, Massachusetts General Hospital, and Harvard Medical School, Boston, MA, 02115, USA

2
Patikas N, Ansari R, Metzakopian E. Single-cell transcriptomics identifies perturbed molecular pathways in midbrain organoids using α-synuclein triplication Parkinson's disease patient-derived iPSCs. Neurosci Res 2023; 195:13-28. [PMID: 37271312 DOI: 10.1016/j.neures.2023.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 05/26/2023] [Accepted: 06/01/2023] [Indexed: 06/06/2023]
Abstract
Three-dimensional (3D) brain organoids provide a platform to study brain development, cellular coordination, and disease using human tissue. Here, we generate midbrain dopaminergic (mDA) organoids from induced pluripotent stem cells (iPSC) from healthy and Parkinson's Disease (PD) donors and assess them as a human PD model using single-cell RNAseq. We characterize cell types in our organoid cultures and analyze our model's Dopamine (DA) neurons using cytotoxic and genetic stressors. Our study provides the first in-depth, single-cell analysis of SNCA triplication and shows evidence for molecular dysfunction in oxidative phosphorylation, translation, and ER protein-folding in DA neurons. We perform an in-silico identification of rotenone-sensitive DA neurons and characterization of corresponding transcriptomic profiles associated with synaptic signalling and cholesterol biosynthesis. Finally, we show a novel chimera organoid model from healthy and PD iPSCs allowing the study of DA neurons from different individuals within the same tissue.
Affiliation(s)
- Nikolaos Patikas
  - UK Dementia Research Institute, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0AH, UK
- Rizwan Ansari
  - UK Dementia Research Institute, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0AH, UK
- Emmanouil Metzakopian
  - UK Dementia Research Institute, Department of Clinical Neurosciences, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0AH, UK

3
Mishra S, Pandey N, Chawla S, Sharma M, Chandra O, Jha IP, SenGupta D, Natarajan KN, Kumar V. Matching queried single-cell open-chromatin profiles to large pools of single-cell transcriptomes and epigenomes for reference supported analysis. Genome Res 2023; 33:218-231. [PMID: 36653120 PMCID: PMC10069468 DOI: 10.1101/gr.277015.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Accepted: 01/09/2023] [Indexed: 01/19/2023]
Abstract
The true benefits of large single-cell transcriptome and epigenome data sets can be realized only with the development of new approaches and search tools for annotating individual cells. Matching a single-cell epigenome profile to a large pool of reference cells remains a major challenge. Here, we present scEpiSearch, which enables searching, comparison, and independent classification of single-cell open-chromatin profiles against a large reference of single-cell expression and open-chromatin data sets. Across performance benchmarks, scEpiSearch outperformed multiple methods in accuracy of search and low-dimensional coembedding of single-cell profiles, irrespective of platforms and species. Here we also demonstrate the unconventional utilities of scEpiSearch by applying it on single-cell epigenome profiles of K562 cells and samples from patients with acute leukaemia to reveal different aspects of their heterogeneity, multipotent behavior, and dedifferentiated states. Applying scEpiSearch on our single-cell open-chromatin profiles from embryonic stem cells (ESCs), we identified ESC subpopulations with more activity and poising for endoplasmic reticulum stress and unfolded protein response. Thus, scEpiSearch solves the nontrivial problem of amalgamating information from a large pool of single cells to identify and study the regulatory states of cells using their single-cell epigenomes.
Affiliation(s)
- Shreya Mishra
  - Department for Computational Biology, IIIT Delhi 110020, India
- Neetesh Pandey
  - Department for Computational Biology, IIIT Delhi 110020, India
- Smriti Chawla
  - Department for Computational Biology, IIIT Delhi 110020, India
- Madhu Sharma
  - Department for Computational Biology, IIIT Delhi 110020, India
- Omkar Chandra
  - Department for Computational Biology, IIIT Delhi 110020, India
- Debarka SenGupta
  - Department for Computational Biology, IIIT Delhi 110020, India
  - Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane 4001, Australia
- Kedar Nath Natarajan
  - DTU Bioengineering, Technical University of Denmark, DK-2800 Kgs. Lyngby, Denmark
- Vibhor Kumar
  - Department for Computational Biology, IIIT Delhi 110020, India

4
Townes FW, Engelhardt BE. Nonnegative spatial factorization applied to spatial genomics. Nat Methods 2023; 20:229-238. [PMID: 36587187 PMCID: PMC9911348 DOI: 10.1038/s41592-022-01687-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 10/11/2021] [Accepted: 10/17/2022] [Indexed: 01/01/2023]
Abstract
Nonnegative matrix factorization (NMF) is widely used to analyze high-dimensional count data because, in contrast to real-valued alternatives such as factor analysis, it produces an interpretable parts-based representation. However, in applications such as spatial transcriptomics, NMF fails to incorporate known structure between observations. Here, we present nonnegative spatial factorization (NSF), a spatially-aware probabilistic dimension reduction model based on transformed Gaussian processes that naturally encourages sparsity and scales to tens of thousands of observations. NSF recovers ground truth factors more accurately than real-valued alternatives such as MEFISTO in simulations, and has lower out-of-sample prediction error than probabilistic NMF on three spatial transcriptomics datasets from mouse brain and liver. Since not all patterns of gene expression have spatial correlations, we also propose a hybrid extension of NSF that combines spatial and nonspatial components, enabling quantification of spatial importance for both observations and features. A TensorFlow implementation of NSF is available from https://github.com/willtownes/nsf-paper .
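As background for the parts-based representation this abstract describes, here is a minimal sketch of plain NMF using the classical multiplicative updates for squared (Frobenius) error. It is not NSF: it has no Gaussian process prior, no spatial coordinates, and no count likelihood. The helper names, the toy count matrix, and the default iteration count are all hypothetical choices made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frobenius_error(x, w, h):
    """Squared Frobenius norm of X - WH."""
    wh = matmul(w, h)
    return sum((xv - av) ** 2 for xr, ar in zip(x, wh) for xv, av in zip(xr, ar))


def nmf(x, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix X (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(x), len(x[0])
    # Strictly positive initial values keep every later update nonnegative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [frobenius_error(x, w, h)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
        errors.append(frobenius_error(x, w, h))
    return w, h, errors


# Hypothetical toy count matrix (rows: observations, columns: genes) with two blocks.
counts = [[5, 4, 0, 0], [4, 5, 0, 1], [0, 0, 6, 5], [0, 1, 5, 6]]
w, h, errors = nmf(counts, rank=2)
```

Because the updates only multiply and divide nonnegative quantities, both factors stay nonnegative, which is what makes the rows of `h` readable as additive "parts".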
Affiliation(s)
- F. William Townes
  - Present address: Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA (GRID: grid.147455.6; ISNI: 0000 0001 2097 0344)
- Barbara E. Engelhardt
  - Present address: Data Science and Biotechnology Institute, Gladstone Institutes, San Francisco, CA, USA (GRID: grid.249878.8; ISNI: 0000 0004 0572 7110)
  - Present address: Department of Biomedical Data Science, Stanford University, Stanford, CA, USA (GRID: grid.168010.e; ISNI: 0000 0004 1936 8956)

5
Cha J, Yu J, Cho JW, Hemberg M, Lee I. scHumanNet: a single-cell network analysis platform for the study of cell-type specificity of disease genes. Nucleic Acids Res 2022; 51:e8. [PMID: 36350625 PMCID: PMC9881140 DOI: 10.1093/nar/gkac1042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/20/2022] [Revised: 09/19/2022] [Accepted: 10/25/2022] [Indexed: 11/10/2022] Open
Abstract
A major challenge in single-cell biology is identifying cell-type-specific gene functions, which may substantially improve precision medicine. Differential expression analysis of genes is a popular, yet insufficient approach, and complementary methods that associate function with cell type are required. Here, we describe scHumanNet (https://github.com/netbiolab/scHumanNet), a single-cell network analysis platform for resolving cellular heterogeneity across gene functions in humans. Based on cell-type-specific gene networks (CGNs) constructed under the guidance of the HumanNet reference interactome, scHumanNet displayed higher functional relevance to the cellular context than CGNs built by other methods on single-cell transcriptome data. Cellular deconvolution of gene signatures based on network compactness across cell types revealed breast cancer prognostic markers associated with T cells. scHumanNet could also prioritize genes associated with particular cell types using CGN centrality and identified the differential hubness of CGNs between disease and healthy conditions. We demonstrated the usefulness of scHumanNet by uncovering T-cell-specific functional effects of GITR, a prognostic gene for breast cancer, and functional defects in autism spectrum disorder genes specific for inhibitory neurons. These results suggest that scHumanNet will advance our understanding of cell-type specificity across human disease genes.
Affiliation(s)
- Junha Cha
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Jiwon Yu
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Jae-Won Cho
  - Evergrande Center for Immunologic Disease, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA
- Martin Hemberg
  - Correspondence may also be addressed to Martin Hemberg. Tel: +1 857 307 1422
- Insuk Lee
  - To whom correspondence should be addressed. Tel: +82 2 2123 5559; Fax: +82 2 362 7265

6
Bergmann T, Liu Y, Skov J, Mogus L, Lee J, Pfisterer U, Handfield LF, Asenjo-Martinez A, Lisa-Vargas I, Seemann SE, Lee JTH, Patikas N, Kornum BR, Denham M, Hyttel P, Witter MP, Gorodkin J, Pers TH, Hemberg M, Khodosevich K, Hall VJ. Production of human entorhinal stellate cell-like cells by forward programming shows an important role of Foxp1 in reprogramming. Front Cell Dev Biol 2022; 10:976549. [PMID: 36046338 PMCID: PMC9420913 DOI: 10.3389/fcell.2022.976549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Accepted: 07/11/2022] [Indexed: 11/13/2022] Open
Abstract
Stellate cells are principal neurons in the entorhinal cortex that contribute to spatial processing. They also play a role in the context of Alzheimer's disease as they accumulate Amyloid beta early in the disease. Producing human stellate cells from pluripotent stem cells would allow researchers to study early mechanisms of Alzheimer's disease, however, no protocols currently exist for producing such cells. In order to develop novel stem cell protocols, we characterize at high resolution the development of the porcine medial entorhinal cortex by tracing neuronal and glial subtypes from mid-gestation to the adult brain to identify the transcriptomic profile of progenitor and adult stellate cells. Importantly, we could confirm the robustness of our data by extracting developmental factors from the identified intermediate stellate cell cluster and implemented these factors to generate putative intermediate stellate cells from human induced pluripotent stem cells. Six transcription factors identified from the stellate cell cluster including RUNX1T1, SOX5, FOXP1, MEF2C, TCF4, EYA2 were overexpressed using a forward programming approach to produce neurons expressing a unique combination of RELN, SATB2, LEF1 and BCL11B observed in stellate cells. Further analyses of the individual transcription factors led to the discovery that FOXP1 is critical in the reprogramming process and omission of RUNX1T1 and EYA2 enhances neuron conversion. Our findings contribute not only to the profiling of cell types within the developing and adult brain's medial entorhinal cortex but also provides proof-of-concept for using scRNAseq data to produce entorhinal intermediate stellate cells from human pluripotent stem cells in-vitro.
Affiliation(s)
- Tobias Bergmann
  - Group of Brain Development and Disease, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Yong Liu
  - Group of Brain Development and Disease, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Jonathan Skov
  - Group of Brain Development and Disease, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Leo Mogus
  - Group of Brain Development and Disease, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Julie Lee
  - Novo Nordisk Foundation Center for Stem Cell Research, DanStem University of Copenhagen, Copenhagen, Denmark
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Ulrich Pfisterer
  - Biotech Research and Innovation Centre (BRIC), Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Andrea Asenjo-Martinez
  - Biotech Research and Innovation Centre (BRIC), Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Irene Lisa-Vargas
  - Biotech Research and Innovation Centre (BRIC), Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Stefan E. Seemann
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Jimmy Tsz Hang Lee
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Nikolaos Patikas
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Birgitte Rahbek Kornum
  - Department of Neuroscience, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Mark Denham
  - Danish Research Institute of Translational Neuroscience (DANDRITE), Nordic EMBL Partnership for Molecular Medicine, Aarhus University, Aarhus, Denmark
- Poul Hyttel
  - Disease, Stem Cells and Embryology, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Menno P. Witter
  - Kavli Institute for Systems Neuroscience, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Jan Gorodkin
  - Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Tune H. Pers
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Martin Hemberg
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Konstantin Khodosevich
  - Biotech Research and Innovation Centre (BRIC), Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Vanessa Jane Hall
  - Group of Brain Development and Disease, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark

7
Ji Y, Lotfollahi M, Wolf FA, Theis FJ. Machine learning for perturbational single-cell omics. Cell Syst 2021; 12:522-537. [PMID: 34139164 DOI: 10.1016/j.cels.2021.05.016] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 01/13/2021] [Revised: 05/04/2021] [Accepted: 05/19/2021] [Indexed: 12/18/2022]
Abstract
Cell biology is fundamentally limited in its ability to collect complete data on cellular phenotypes and the wide range of responses to perturbation. Areas such as computer vision and speech recognition have addressed this problem of characterizing unseen or unlabeled conditions with the combined advances of big data, deep learning, and computing resources in the past 5 years. Similarly, recent advances in machine learning approaches enabled by single-cell data start to address prediction tasks in perturbation response modeling. We first define objectives in learning perturbation response in single-cell omics; survey existing approaches, resources, and datasets (https://github.com/theislab/sc-pert); and discuss how a perturbation atlas can enable deep learning models to construct an informative perturbation latent space. We then examine future avenues toward more powerful and explainable modeling using deep neural networks, which enable the integration of disparate information sources and an understanding of heterogeneous, complex, and unseen systems.
Affiliation(s)
- Yuge Ji
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Munich, Germany
- Mohammad Lotfollahi
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
- F Alexander Wolf
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - Cellarity, Cambridge, MA, USA
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
  - Department of Mathematics, Technical University of Munich, Munich, Germany
  - Cellarity, Cambridge, MA, USA