1
Hu H, Wang X, Feng S, Xu Z, Liu J, Heidrich-O'Hare E, Chen Y, Yue M, Zeng L, Rong Z, Chen T, Billiar T, Ding Y, Huang H, Duerr RH, Chen W. A unified model-based framework for doublet or multiplet detection in single-cell multiomics data. Nat Commun 2024; 15:5562. [PMID: 38956023; PMCID: PMC11220103; DOI: 10.1038/s41467-024-49448-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 06/03/2024] [Indexed: 07/04/2024] Open
Abstract
Droplet-based single-cell sequencing techniques rely on the fundamental assumption that each droplet encapsulates a single cell, enabling individual cell omics profiling. However, the inevitable issue of multiplets, where two or more cells are encapsulated within a single droplet, can lead to spurious cell type annotations and obscure true biological findings. The issue of multiplets is exacerbated in single-cell multiomics settings, where integrating cross-modality information for clustering can inadvertently promote the aggregation of multiplet clusters and increase the risk of erroneous cell type annotations. Here, we propose a compound Poisson model-based framework for multiplet detection in single-cell multiomics data. Leveraging experimental cell hashing results as the ground truth for multiplet status, we conducted trimodal DOGMA-seq experiments and generated 17 benchmarking datasets from two tissues, involving a total of 280,123 droplets. We demonstrated that the proposed method is an essential tool for integrating cross-modality multiplet signals, effectively eliminating multiplet clusters in single-cell multiomics data, a task at which the benchmarked single-omics methods proved inadequate.
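To illustrate why multiplets are described as inevitable, the following is a minimal sketch under a textbook assumption that is not taken from the paper: the number of cells per droplet follows a Poisson distribution with mean `lam`. The function name `multiplet_fraction` and the loading rates are hypothetical, and this is not the compound Poisson detection model the authors propose.

```python
import math


def multiplet_fraction(lam: float) -> float:
    """Fraction of non-empty droplets holding two or more cells.

    Assumes (hypothetically) that cells per droplet ~ Poisson(lam).
    """
    p_empty = math.exp(-lam)         # P(0 cells)
    p_single = lam * math.exp(-lam)  # P(exactly 1 cell)
    p_nonempty = 1.0 - p_empty       # P(at least 1 cell)
    return (p_nonempty - p_single) / p_nonempty


if __name__ == "__main__":
    # Illustrative loading rates only; real values depend on the protocol.
    for lam in (0.05, 0.1, 0.3):
        print(f"lam={lam:.2f}  multiplet fraction={multiplet_fraction(lam):.4f}")
```

Under this assumption the multiplet fraction is never zero and grows with the loading rate, which is consistent with the abstract's point that some multiplets always remain to be detected computationally.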
Affiliation(s)
- Haoran Hu
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Xinjun Wang
  - Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Site Feng
  - Department of Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, USA
  - School of Medicine, Tsinghua University, 100084, Beijing, China
- Zhongli Xu
  - School of Medicine, Tsinghua University, 100084, Beijing, China
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, 15224, USA
- Jing Liu
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, 15224, USA
- Yanshuo Chen
  - Department of Computer Science, University of Maryland, College Park, MD, 20742, USA
  - Center of Bioinformatics and Computational Biology, College Park, MD, 20740, USA
- Molin Yue
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Lang Zeng
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Ziqi Rong
  - School of Information, University of Michigan, Ann Arbor, MI, 48109, USA
- Tianmeng Chen
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Timothy Billiar
  - Department of Surgery, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Ying Ding
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15213, USA
- Heng Huang
  - Department of Computer Science, University of Maryland, College Park, MD, 20742, USA
  - Center of Bioinformatics and Computational Biology, College Park, MD, 20740, USA
- Richard H Duerr
  - Department of Medicine, University of Pittsburgh, Pittsburgh, PA, 15261, USA
  - Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- Wei Chen
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15213, USA
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, 15224, USA
  - Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA, 15261, USA
2
Yang Y, Pe’er D. REUNION: transcription factor binding prediction and regulatory association inference from single-cell multi-omics data. Bioinformatics 2024; 40:i567-i575. [PMID: 38940155; PMCID: PMC11211829; DOI: 10.1093/bioinformatics/btae234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION: Profiling of gene expression and chromatin accessibility by single-cell multi-omics approaches can help to systematically decipher how transcription factors (TFs) regulate target gene expression via cis-region interactions. However, integrating information from different modalities to discover regulatory associations is challenging, in part because motif scanning approaches miss many likely TF binding sites.
RESULTS: We develop REUNION, a framework for predicting genome-wide TF binding and cis-region-TF-gene "triplet" regulatory associations using single-cell multi-omics data. The first component of REUNION, Unify, utilizes information theory-inspired complementary score functions that incorporate TF expression, chromatin accessibility, and target gene expression to identify regulatory associations. The second component, Rediscover, takes Unify estimates as input for pseudo semi-supervised learning to predict TF binding in accessible genomic regions that may or may not include detected TF motifs. Rediscover leverages latent chromatin accessibility and sequence feature spaces of the genomic regions, without requiring chromatin immunoprecipitation data for model training. Applied to peripheral blood mononuclear cell data, REUNION on average outperforms alternative methods in TF binding prediction. In particular, it recovers missing region-TF associations from regions lacking detected motifs, which circumvents the reliance on motif scanning and facilitates discovery of novel associations involving potential co-binding transcriptional regulators. Newly identified region-TF associations, even in regions lacking a detected motif, improve the prediction of target gene expression in regulatory triplets, and are thus likely to genuinely participate in the regulation.
AVAILABILITY AND IMPLEMENTATION: All source code is available at https://github.com/yangymargaret/REUNION.
Affiliation(s)
- Yang Yang
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, United States
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, United States
- Dana Pe’er
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, United States
  - Howard Hughes Medical Institute, Chevy Chase, MD 20815, United States
3
Pang QY, Chiu YC, Huang RYJ. Regulating epithelial-mesenchymal plasticity from 3D genome organization. Commun Biol 2024; 7:750. [PMID: 38902393; PMCID: PMC11190238; DOI: 10.1038/s42003-024-06441-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 06/11/2024] [Indexed: 06/22/2024] Open
Abstract
Epithelial-mesenchymal transition (EMT) is a dynamic process enabling polarized epithelial cells to acquire mesenchymal features implicated in development and carcinoma progression. As our understanding evolves, it is clear the reversible execution of EMT arises from complex epigenomic regulation involving histone modifications and 3-dimensional (3D) genome structural changes, leading to a cascade of transcriptional events. This review summarizes current knowledge on chromatin organization in EMT, with a focus on hierarchical structures of the 3D genome and chromatin accessibility changes.
Affiliation(s)
- Qing You Pang
  - Neuro-Oncology Research Laboratory, National Neuroscience Institute, Singapore, 308433, Singapore
- Yi-Chia Chiu
  - School of Medicine, College of Medicine, National Taiwan University, Taipei, 10051, Taiwan
- Ruby Yun-Ju Huang
  - School of Medicine, College of Medicine, National Taiwan University, Taipei, 10051, Taiwan
  - Center for Advanced Computing and Imaging in Biomedicine, National Taiwan University, Taipei, 10051, Taiwan
  - Department of Obstetrics & Gynaecology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 119077, Singapore
4
Huang K, Xu Y, Feng T, Lan H, Ling F, Xiang H, Liu Q. The Advancement and Application of the Single-Cell Transcriptome in Biological and Medical Research. Biology 2024; 13:451. [PMID: 38927331; PMCID: PMC11200756; DOI: 10.3390/biology13060451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2024] [Revised: 06/11/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024]
Abstract
Single-cell RNA sequencing technology (scRNA-seq) has been steadily developing since its inception in 2009. Unlike bulk RNA-seq, scRNA-seq identifies the heterogeneity of tissue cells and reveals gene expression changes in individual cells at the microscopic level. Here, we review the development of scRNA-seq, which has gone through iterations of reverse transcription, in vitro transcription, Smart-seq, Drop-seq, 10x Genomics, and spatial single-cell transcriptome technologies. The technology of 10x Genomics has been widely applied in medicine and biology, producing rich research results. Furthermore, this review presents a summary of the analytical process for single-cell transcriptome data and its integration with other omics analyses, including genomics, epigenomics, proteomics, and metabolomics. The single-cell transcriptome has a wide range of applications in biology and medicine. This review analyzes the applications of scRNA-seq in cancer, stem cell research, developmental biology, microbiology, and other fields. In essence, scRNA-seq provides a means of elucidating gene expression patterns in single cells, thereby offering a valuable tool for scientific research. Nevertheless, the current single-cell transcriptome technology is still imperfect, and this review identifies its shortcomings and anticipates future developments. The objective of this review is to facilitate a deeper comprehension of scRNA-seq technology and its applications in biological and medical research, as well as to identify avenues for its future development in alignment with practical needs.
Affiliation(s)
- Kongwei Huang
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510641, China
- Yixue Xu
  - Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Animal Science and Technology, Guangxi University, Nanning 530005, China
- Tong Feng
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular Imaging, Center for Artificial Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Hong Lan
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Fei Ling
  - School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510641, China
- Hai Xiang
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Qingyou Liu
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
5
Liu J, Ma J, Wen J, Zhou X. A Cell Cycle-Aware Network for Data Integration and Label Transferring of Single-Cell RNA-Seq and ATAC-Seq. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024:e2401815. [PMID: 38887194; DOI: 10.1002/advs.202401815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/22/2024] [Indexed: 06/20/2024]
Abstract
In recent years, the integration of single-cell multi-omics data has provided a more comprehensive understanding of cell functions and internal regulatory mechanisms than any single omics layer can, but it still faces many challenges, such as omics-variance, sparsity, cell heterogeneity, and confounding factors. The cell cycle is a known confounder when analyzing other factors in single-cell RNA-seq data, but it is not clear how it affects integrated single-cell multi-omics data. Here, a cell cycle-aware network (CCAN) is developed to remove cell cycle effects from the integrated single-cell multi-omics data while keeping the cell type-specific variations. This is the first computational model to study the cell-cycle effects in the integration of single-cell multi-omics data. Validations on several benchmark datasets show the outstanding performance of CCAN in a variety of downstream analyses and applications, including removing cell cycle effects and batch effects of scRNA-seq datasets from different protocols, integrating paired and unpaired scRNA-seq and scATAC-seq data, accurately transferring cell type labels from scRNA-seq to scATAC-seq data, and characterizing the differentiation process from hematopoietic stem cells to different lineages in the integration of differentiation data.
Affiliation(s)
- Jiajia Liu
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Jian Ma
  - Department of Electronic Information and Computer Engineering, The Engineering & Technical College of Chengdu University of Technology, Leshan, Sichuan, 614000, China
- Jianguo Wen
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Xiaobo Zhou
  - Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
6
Cameron D, Vinh NN, Prapaiwongs P, Perry EA, Walters JTR, Li M, O'Donovan MC, Bray NJ. Genetic Implication of Prenatal GABAergic and Cholinergic Neuron Development in Susceptibility to Schizophrenia. Schizophr Bull 2024:sbae083. [PMID: 38869145; DOI: 10.1093/schbul/sbae083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2024]
Abstract
BACKGROUND: The ganglionic eminences (GE) are fetal-specific structures that give rise to gamma-aminobutyric acid (GABA)- and acetylcholine-releasing neurons of the forebrain. Given the evidence for GABAergic, cholinergic, and neurodevelopmental disturbances in schizophrenia, we tested the potential involvement of GE neuron development in mediating genetic risk for the condition.
STUDY DESIGN: We combined data from a recent large-scale genome-wide association study of schizophrenia with single-cell RNA sequencing data from the human GE to test the enrichment of schizophrenia risk variation in genes with high expression specificity for developing GE cell populations. We additionally performed the single nuclei Assay for Transposase-Accessible Chromatin with Sequencing (snATAC-Seq) to map potential regulatory genomic regions operating in individual cell populations of the human GE, using these to test for enrichment of schizophrenia common genetic variant liability and to functionally annotate non-coding variants associated with the disorder.
STUDY RESULTS: Schizophrenia common variant liability was enriched in genes with high expression specificity for developing neuron populations that are predicted to form dopamine D1 and D2 receptor-expressing GABAergic medium spiny neurons of the striatum, cortical somatostatin-positive GABAergic interneurons, calretinin-positive GABAergic neurons, and cholinergic neurons. Consistent with these findings, schizophrenia genetic risk was concentrated in predicted regulatory genomic sequence mapped in developing neuronal populations of the GE.
CONCLUSIONS: Our study implicates prenatal development of specific populations of GABAergic and cholinergic neurons in later susceptibility to schizophrenia, and provides a map of predicted regulatory genomic elements operating in cells of the GE.
Affiliation(s)
- Darren Cameron
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
- Ngoc-Nga Vinh
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
- Parinda Prapaiwongs
  - Neuroscience and Mental Health Innovation Institute, Cardiff University, Cardiff, UK
- Elizabeth A Perry
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
- James T R Walters
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
- Meng Li
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
  - Neuroscience and Mental Health Innovation Institute, Cardiff University, Cardiff, UK
- Michael C O'Donovan
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
- Nicholas J Bray
  - Division of Psychological Medicine and Clinical Neurosciences, Centre for Neuropsychiatric Genetics & Genomics, Cardiff University, Cardiff, UK
  - Neuroscience and Mental Health Innovation Institute, Cardiff University, Cardiff, UK
7
Majane AC, Cridland JM, Blair LK, Begun DJ. Evolution and genetics of accessory gland transcriptome divergence between Drosophila melanogaster and D. simulans. Genetics 2024; 227:iyae039. [PMID: 38518250; PMCID: PMC11151936; DOI: 10.1093/genetics/iyae039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 08/27/2023] [Accepted: 02/15/2024] [Indexed: 03/24/2024] Open
Abstract
Studies of allele-specific expression in interspecific hybrids have provided important insights into gene-regulatory divergence and hybrid incompatibilities. Many such investigations in Drosophila have used transcriptome data from complex mixtures of many tissues or from gonads; however, regulatory divergence may vary widely among species, sexes, and tissues. Thus, we lack sufficiently broad sampling to be confident about the general biological principles of regulatory divergence. Here, we seek to fill some of these gaps in the literature by characterizing regulatory evolution and hybrid misexpression in a somatic male sex organ, the accessory gland, in F1 hybrids between Drosophila melanogaster and D. simulans. The accessory gland produces seminal fluid proteins, which play an important role in male and female fertility and may be subject to adaptive divergence due to male-male or male-female interactions. We find that trans differences are relatively more abundant than cis, in contrast to most of the interspecific hybrid literature, though large effect-size trans differences are rare. Seminal fluid protein genes have significantly elevated levels of expression divergence and tend to be regulated through both cis and trans divergence. We find limited misexpression (over- or underexpression relative to both parents) in this organ compared to most other Drosophila studies. As in previous studies, male-biased genes are overrepresented among misexpressed genes and are much more likely to be underexpressed. ATAC-Seq data show that chromatin accessibility is correlated with expression differences among species and hybrid allele-specific expression. This work identifies unique regulatory evolution and hybrid misexpression properties of the accessory gland and suggests the importance of tissue-specific allele-specific expression studies.
Affiliation(s)
- Alex C Majane
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Julie M Cridland
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Logan K Blair
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- David J Begun
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
8
Jindal K, Adil MT, Yamaguchi N, Yang X, Wang HC, Kamimoto K, Rivera-Gonzalez GC, Morris SA. Single-cell lineage capture across genomic modalities with CellTag-multi reveals fate-specific gene regulatory changes. Nat Biotechnol 2024; 42:946-959. [PMID: 37749269; PMCID: PMC11180607; DOI: 10.1038/s41587-023-01931-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/22/2022] [Accepted: 07/31/2023] [Indexed: 09/27/2023]
Abstract
Complex gene regulatory mechanisms underlie differentiation and reprogramming. Contemporary single-cell lineage-tracing (scLT) methods use expressed, heritable DNA barcodes to combine cell lineage readout with single-cell transcriptomics. However, reliance on transcriptional profiling limits adaptation to other single-cell assays. With CellTag-multi, we present an approach that enables direct capture of heritable random barcodes expressed as polyadenylated transcripts, in both single-cell RNA sequencing and single-cell Assay for Transposase Accessible Chromatin using sequencing assays, allowing for independent clonal tracking of transcriptional and epigenomic cell states. We validate CellTag-multi to characterize progenitor cell lineage priming during mouse hematopoiesis. Additionally, in direct reprogramming of fibroblasts to endoderm progenitors, we identify core regulatory programs underlying on-target and off-target fates. Furthermore, we reveal the transcription factor Zfp281 as a regulator of reprogramming outcome, biasing cells toward an off-target mesenchymal fate. Our results establish CellTag-multi as a lineage-tracing method compatible with multiple single-cell modalities and demonstrate its utility in revealing fate-specifying gene regulatory changes across diverse paradigms of differentiation and reprogramming.
Affiliation(s)
- Kunal Jindal
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Mohd Tayyab Adil
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Naoto Yamaguchi
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Xue Yang
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Helen C Wang
  - Department of Pediatrics, Division of Hematology and Oncology, Washington University School of Medicine, St. Louis, MO, USA
- Kenji Kamimoto
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Guillermo C Rivera-Gonzalez
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Samantha A Morris
  - Department of Developmental Biology, Washington University School of Medicine, St. Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
  - Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO, USA
9
Liu Q, Ma W, Chen R, Li S, Wang Q, Wei C, Hong Y, Sun H, Cheng Q, Zhao J, Kang J. Multiome in the Same Cell Reveals the Impact of Osmotic Stress on Arabidopsis Root Tip Development at Single-Cell Level. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2308384. [PMID: 38634607; PMCID: PMC11199978; DOI: 10.1002/advs.202308384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 02/27/2024] [Indexed: 04/19/2024]
Abstract
Cell-specific transcriptional regulatory networks (TRNs) play vital roles in plant development and response to environmental stresses. However, traditional single-cell mono-omics techniques are unable to directly capture the relationships and dynamics between different layers of molecular information within the same cells. While advanced algorithms facilitate merging scRNA-seq and scATAC-seq datasets, accurate data integration remains a challenge, particularly when investigating cell-type-specific TRNs. By examining gene expression and chromatin accessibility simultaneously in 16,670 Arabidopsis root tip nuclei, the TRNs that govern root tip development under osmotic stress are reconstructed. In contrast to commonly used computational integration at the cell-type level, 12,968 peak-to-gene linkages are captured at the bona fide single-cell level, and TRNs are constructed at an unprecedented resolution. Furthermore, the datasets allow more accurate reconstruction of the coordinated changes in gene expression and chromatin states during cellular state transitions. During root tip development, chromatin accessibility of initial cells precedes gene expression, suggesting that changes in chromatin accessibility may prime cells for subsequent differentiation steps. Pseudo-time trajectory analysis reveals that osmotic stress can shift the functional differentiation of trichoblasts. Candidate stress-related gene-linked cis-regulatory elements (gl-cCREs) and their potential target genes are also identified, uncovering large cellular heterogeneity under osmotic stress.
Affiliation(s)
- Qing Liu
  - State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, International Joint R & D Center of Hebei Province in Modern Agricultural Biotechnology, College of Life Sciences, College of Horticulture, Hebei Agricultural University, Baoding 071000, China
- Wei Ma
  - State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, International Joint R & D Center of Hebei Province in Modern Agricultural Biotechnology, College of Life Sciences, College of Horticulture, Hebei Agricultural University, Baoding 071000, China
- Ruiying Chen
  - BGI Research, Beijing 102601, China
  - BGI Research, Shenzhen 518083, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Qifan Wang
  - State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, International Joint R & D Center of Hebei Province in Modern Agricultural Biotechnology, College of Life Sciences, College of Horticulture, Hebei Agricultural University, Baoding 071000, China
- Cai Wei
  - BGI Research, Beijing 102601, China
- Yiguo Hong
  - State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, International Joint R & D Center of Hebei Province in Modern Agricultural Biotechnology, College of Life Sciences, College of Horticulture, Hebei Agricultural University, Baoding 071000, China
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Hai-Xi Sun
  - BGI Research, Beijing 102601, China
  - BGI Research, Shenzhen 518083, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Qi Cheng
  - State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, International Joint R & D Center of Hebei Province in Modern Agricultural Biotechnology, College of Life Sciences, College of Horticulture, Hebei Agricultural University, Baoding 071000, China
- Jianjun Zhao
  - State Key Laboratory of North China Crop Improvement and Regulation, Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, International Joint R & D Center of Hebei Province in Modern Agricultural Biotechnology, College of Life Sciences, College of Horticulture, Hebei Agricultural University, Baoding 071000, China
- Jingmin Kang
  - BGI Research, Beijing 102601, China
  - BGI Research, Shenzhen 518083, China
10
Chen H, Ryu J, Vinyard ME, Lerer A, Pinello L. SIMBA: single-cell embedding along with features. Nat Methods 2024; 21:1003-1013. [PMID: 37248389; PMCID: PMC11166568; DOI: 10.1038/s41592-023-01899-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/04/2022] [Accepted: 04/26/2023] [Indexed: 05/31/2023]
Abstract
Most current single-cell analysis pipelines are limited to cell embeddings and rely heavily on clustering, while lacking the ability to explicitly model interactions between different feature types. Furthermore, these methods are tailored to specific tasks, as distinct single-cell problems are formulated differently. To address these shortcomings, here we present SIMBA, a graph embedding method that jointly embeds single cells and their defining features, such as genes, chromatin-accessible regions and DNA sequences, into a common latent space. By leveraging the co-embedding of cells and features, SIMBA allows for the study of cellular heterogeneity, clustering-free marker discovery, gene regulation inference, batch effect removal and omics data integration. We show that SIMBA provides a single framework that allows diverse single-cell problems to be formulated in a unified way and thus simplifies the development of new analyses and extension to new single-cell modalities. SIMBA is implemented as a comprehensive Python library (https://simba-bio.readthedocs.io).
Affiliation(s)
- Huidong Chen
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jayoung Ryu
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Michael E Vinyard
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA
- Department of Pathology, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
| | - Adam Lerer
- Facebook AI Research, New York, NY, USA.
| | - Luca Pinello
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Boston, MA, USA.
- Department of Pathology, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
|
11
|
Dorans E, Jagadeesh K, Dey K, Price AL. Linking regulatory variants to target genes by integrating single-cell multiome methods and genomic distance. medRxiv 2024:2024.05.24.24307813. [PMID: 38826240 PMCID: PMC11142273 DOI: 10.1101/2024.05.24.24307813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
Methods that analyze single-cell paired RNA-seq and ATAC-seq multiome data have shown great promise in linking regulatory elements to genes. However, existing methods differ in their modeling assumptions and approaches to account for biological and technical noise, leading to low concordance in their linking scores, and do not capture the effects of genomic distance. We propose pgBoost, an integrative modeling framework that trains a non-linear combination of existing linking strategies (including genomic distance) on fine-mapped eQTL data to assign a probabilistic score to each candidate SNP-gene link. We applied pgBoost to single-cell multiome data from 85k cells representing 6 major immune/blood cell types. pgBoost attained higher enrichment for fine-mapped eSNP-eGene pairs (e.g. 21x at distance >10kb) than existing methods (1.2-10x; p-value for difference = 5e-13 vs. distance-based method and < 4e-35 for each other method), with larger improvements at larger distances (e.g. 35x vs. 0.89-6.6x at distance >100kb; p-value for difference < 0.002 vs. each other method). pgBoost also outperformed existing methods in enrichment for CRISPR-validated links (e.g. 4.8x vs. 1.6-4.1x at distance >10kb; p-value for difference = 0.25 vs. distance-based method and < 2e-5 for each other method), with larger improvements at larger distances (e.g. 15x vs. 1.6-2.5x at distance >100kb; p-value for difference < 0.009 for each other method). Similar improvements in enrichment were observed for links derived from Activity-By-Contact (ABC) scores and GWAS data. We further determined that restricting pgBoost to features from a focal cell type improved the identification of SNP-gene links relevant to that cell type. We highlight several examples where pgBoost linked fine-mapped GWAS variants to experimentally validated or biologically plausible target genes that were not implicated by other methods.
In conclusion, a non-linear combination of linking strategies, including genomic distance, improves power to identify target genes underlying GWAS associations.
|
12
|
Wang W, Cen Y, Lu Z, Xu Y, Sun T, Xiao Y, Liu W, Li JJ, Wang C. scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data. Genome Biol 2024; 25:136. [PMID: 38783325 PMCID: PMC11112958 DOI: 10.1186/s13059-024-03284-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Accepted: 05/16/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
In droplet-based single-cell and single-nucleus RNA-seq assays, systematic contamination of ambient RNA molecules biases the quantification of gene expression levels. Existing methods correct the contamination for all genes globally. However, a specific evaluation of correction efficacy at varying contamination levels is lacking. Here, we show that DecontX and CellBender under-correct highly contaminating genes, while SoupX and scAR over-correct lowly/non-contaminating genes. We develop scCDC as the first method to detect the contamination-causing genes and only correct expression levels of these genes, some of which are cell-type markers. Compared with existing decontamination methods, scCDC excels in decontaminating highly contaminating genes while avoiding over-correction of other genes.
Affiliation(s)
- Weijian Wang
- Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China
| | - Yihui Cen
- Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China
| | - Zezhen Lu
- Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China
| | - Yueqing Xu
- Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China
| | - Tianyi Sun
- Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095, USA
| | - Ying Xiao
- Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, 310020, China
| | - Wanlu Liu
- Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China
| | - Jingyi Jessica Li
- Department of Statistics and Data Science, University of California, Los Angeles, CA, 90095, USA.
| | - Chaochen Wang
- Centre of Biomedical Systems and Informatics, International Campus, ZJU-UoE Institute, Zhejiang University School of Medicine, Zhejiang University, Haining, Zhejiang, 314400, China.
- Department of Gynecology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou, Zhejiang, 310020, China.
- Biomedical and Health Translational Research Centre, Zhejiang University, Haining, Zhejiang, 314400, China.
|
13
|
Wang C, Qiu J, Liu M, Wang Y, Yu Y, Liu H, Zhang Y, Han L. Microfluidic Biochips for Single-Cell Isolation and Single-Cell Analysis of Multiomics and Exosomes. Adv Sci (Weinh) 2024:e2401263. [PMID: 38767182 DOI: 10.1002/advs.202401263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 04/26/2024] [Indexed: 05/22/2024]
Abstract
Single-cell multiomic and exosome analyses are potent tools in various fields, such as cancer research, immunology, neuroscience, microbiology, and drug development. They facilitate the in-depth exploration of biological systems, providing insights into disease mechanisms and aiding in treatment. Single-cell isolation, which is crucial for single-cell analysis, ensures reliable cell isolation and quality control for further downstream analyses. Microfluidic chips are small lightweight systems that facilitate efficient and high-throughput single-cell isolation and real-time single-cell analysis on- or off-chip. Therefore, most current single-cell isolation and analysis technologies are based on the single-cell microfluidic technology. This review offers comprehensive guidance to researchers across different fields on the selection of appropriate microfluidic chip technologies for single-cell isolation and analysis. This review describes the design principles, separation mechanisms, chip characteristics, and cellular effects of various microfluidic chips available for single-cell isolation. Moreover, this review highlights the implications of using this technology for subsequent analyses, including single-cell multiomic and exosome analyses. Finally, the current challenges and future prospects of microfluidic chip technology are outlined for multiplex single-cell isolation and multiomic and exosome analyses.
Affiliation(s)
- Chao Wang
- Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China
| | - Jiaoyan Qiu
- Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China
| | - Mengqi Liu
- Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China
| | - Yihe Wang
- Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China
| | - Yang Yu
- Department of Periodontology, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University, Jinan, 250100, China
| | - Hong Liu
- State Key Laboratory of Crystal Materials, Shandong University, Jinan, 250100, China
| | - Yu Zhang
- Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China
| | - Lin Han
- Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China
- Shandong Engineering Research Center of Biomarker and Artificial Intelligence Application, Jinan, 250100, China
|
14
|
Zhou T, Zhang R, Jia D, Doty RT, Munday AD, Gao D, Xin L, Abkowitz JL, Duan Z, Ma J. GAGE-seq concurrently profiles multiscale 3D genome organization and gene expression in single cells. Nat Genet 2024:10.1038/s41588-024-01745-3. [PMID: 38744973 DOI: 10.1038/s41588-024-01745-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 04/05/2024] [Indexed: 05/16/2024]
Abstract
The organization of mammalian genomes features a complex, multiscale three-dimensional (3D) architecture, whose functional significance remains elusive because of limited single-cell technologies that can concurrently profile genome organization and transcriptional activities. Here, we introduce genome architecture and gene expression by sequencing (GAGE-seq), a scalable, robust single-cell co-assay measuring 3D genome structure and transcriptome simultaneously within the same cell. Applied to mouse brain cortex and human bone marrow CD34+ cells, GAGE-seq characterized the intricate relationships between 3D genome and gene expression, showing that multiscale 3D genome features inform cell-type-specific gene expression and link regulatory elements to target genes. Integration with spatial transcriptomic data revealed in situ 3D genome variations in mouse cortex. Observations in human hematopoiesis unveiled discordant changes between 3D genome organization and gene expression, underscoring a complex, temporal interplay at the single-cell level. GAGE-seq provides a powerful, cost-effective approach for exploring genome structure and gene expression relationships at the single-cell level across diverse biological contexts.
Affiliation(s)
- Tianming Zhou
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ruochi Zhang
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Eric and Wendy Schmidt Center, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Deyong Jia
- Department of Urology, University of Washington, Seattle, WA, USA
| | - Raymond T Doty
- Division of Hematology and Oncology, Department of Medicine/Fred Hutch Cancer Center, University of Washington, Seattle, WA, USA
| | - Adam D Munday
- Division of Hematology and Oncology, Department of Medicine/Fred Hutch Cancer Center, University of Washington, Seattle, WA, USA
| | - Daniel Gao
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
- Department of Chemistry, Pomona College, Claremont, CA, USA
| | - Li Xin
- Department of Urology, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
| | - Janis L Abkowitz
- Division of Hematology and Oncology, Department of Medicine/Fred Hutch Cancer Center, University of Washington, Seattle, WA, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA
| | - Zhijun Duan
- Division of Hematology and Oncology, Department of Medicine/Fred Hutch Cancer Center, University of Washington, Seattle, WA, USA.
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, USA.
| | - Jian Ma
- Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
|
15
|
Lotfollahi M, Hao Y, Theis FJ, Satija R. The future of rapid and automated single-cell data analysis using reference mapping. Cell 2024; 187:2343-2358. [PMID: 38729109 PMCID: PMC11184658 DOI: 10.1016/j.cell.2024.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Revised: 03/05/2024] [Accepted: 03/08/2024] [Indexed: 05/12/2024]
Abstract
As the number of single-cell datasets continues to grow rapidly, workflows that map new data to well-curated reference atlases offer enormous promise for the biological community. In this perspective, we discuss key computational challenges and opportunities for single-cell reference-mapping algorithms. We discuss how mapping algorithms will enable the integration of diverse datasets across disease states, molecular modalities, genetic perturbations, and diverse species and will eventually replace manual and laborious unsupervised clustering pipelines.
Affiliation(s)
- Mohammad Lotfollahi
- Institute of Computational Biology, Helmholtz Center Munich - German Research Center for Environmental Health, Neuherberg, Germany; Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Yuhan Hao
- Center for Genomics and Systems Biology, New York University, New York, NY, USA; New York Genome Center, New York, NY, USA
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich - German Research Center for Environmental Health, Neuherberg, Germany; Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK; Department of Mathematics, Technical University of Munich, Garching, Germany.
| | - Rahul Satija
- Center for Genomics and Systems Biology, New York University, New York, NY, USA; New York Genome Center, New York, NY, USA.
|
16
|
Jain S, Eadon MT. Spatial transcriptomics in health and disease. Nat Rev Nephrol 2024:10.1038/s41581-024-00841-1. [PMID: 38719971 DOI: 10.1038/s41581-024-00841-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 04/10/2024] [Indexed: 07/05/2024]
Abstract
The ability to localize hundreds of macromolecules to discrete locations, structures and cell types in a tissue is a powerful approach to understand the cellular and spatial organization of an organ. Spatially resolved transcriptomic technologies enable mapping of transcripts at single-cell or near single-cell resolution in a multiplex manner. The rapid development of spatial transcriptomic technologies has accelerated the pace of discovery in several fields, including nephrology. Its application to preclinical models and human samples has provided spatial information about new cell types discovered by single-cell sequencing and new insights into the cell-cell interactions within neighbourhoods, and has improved our understanding of the changes that occur in response to injury. Integration of spatial transcriptomic technologies with other omics methods, such as proteomics and spatial epigenetics, will further facilitate the generation of comprehensive molecular atlases, and provide insights into the dynamic relationships of molecular components in homeostasis and disease. This Review provides an overview of current and emerging spatial transcriptomic methods, their applications and remaining challenges for the field.
Affiliation(s)
- Sanjay Jain
- Division of Nephrology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA.
| | - Michael T Eadon
- Division of Nephrology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, USA.
|
17
|
Giansanti V, Giannese F, Botrugno OA, Gandolfi G, Balestrieri C, Antoniotti M, Tonon G, Cittaro D. Scalable integration of multiomic single-cell data using generative adversarial networks. Bioinformatics 2024; 40:btae300. [PMID: 38696763 DOI: 10.1093/bioinformatics/btae300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 03/22/2024] [Accepted: 04/30/2024] [Indexed: 05/04/2024]
Abstract
MOTIVATION: Single-cell profiling has become a common practice to investigate the complexity of tissues, organs, and organisms. Recent technological advances are expanding our capabilities to profile various molecular layers beyond the transcriptome such as, but not limited to, the genome, the epigenome, and the proteome. Depending on the experimental procedure, these data can be obtained from separate assays or the very same cells. Yet, integration of more than two assays is currently not supported by the majority of the computational frameworks available. RESULTS: We here propose a Multi-Omic data integration framework based on Wasserstein Generative Adversarial Networks suitable for the analysis of paired or unpaired data with a high number of modalities (>2). At the core of our strategy is a single network trained on all modalities together, limiting the computational burden when many molecular layers are evaluated. AVAILABILITY AND IMPLEMENTATION: Source code of our framework is available at https://github.com/vgiansanti/MOWGAN.
Affiliation(s)
- Valentina Giansanti
- Department of Informatics, Systems and Communication, Università degli Studi di Milano-Bicocca, Milan, 20125, Italy
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
| | - Francesca Giannese
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
| | - Oronza A Botrugno
- Functional Genomics of Cancer Unit, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
- Università Vita-Salute San Raffaele, Milan, 20132, Italy
| | - Giorgia Gandolfi
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
| | - Chiara Balestrieri
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
- Experimental Hematology Unit, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
| | - Marco Antoniotti
- Department of Informatics, Systems and Communication, Università degli Studi di Milano-Bicocca, Milan, 20125, Italy
- Bicocca Bioinformatics Biostatistics and Bioimaging Centre-B4, Università degli Studi di Milano-Bicocca, Milan, 20125, Italy
- Istituto di Bioimmagini e Fisiologia Molecolare, Consiglio Nazionale delle Ricerche (CNR), Milan, 20090, Italy
| | - Giovanni Tonon
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
- Functional Genomics of Cancer Unit, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
- Università Vita-Salute San Raffaele, Milan, 20132, Italy
| | - Davide Cittaro
- Center for Omics Sciences, IRCCS San Raffaele Scientific Institute, Milan, 20132, Italy
|
18
|
Wang H, Liu Z, Ma X. Learning Consistency and Specificity of Cells From Single-Cell Multi-Omic Data. IEEE J Biomed Health Inform 2024; 28:3134-3145. [PMID: 38709615 DOI: 10.1109/jbhi.2024.3370868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Advancements in single-cell technologies make it possible to profile the epigenome and transcriptome concomitantly at the single-cell level, providing opportunities to explore the underlying biological mechanisms. Despite significant efforts, integrative analysis of single-cell multi-omic data remains challenging because of the heterogeneity, complicated coupling and limited interpretability of the data. To handle these issues, we propose a novel self-representation Learning-based Multi-omics data Integrative Clustering algorithm (sLMIC) for the integration of single-cell epigenomic (DNA methylation or scATAC-seq) and transcriptomic (scRNA-seq) profiles, in which the consistent and specific features of cells are explicitly extracted to facilitate cell clustering. Specifically, sLMIC constructs a graph for each type of single-cell data, thereby transforming omics data into multi-layer networks, which effectively removes the heterogeneity of omic data. Then, sLMIC employs low-rank and exclusivity constraints to separate the self-representation of cells into two parts, i.e., the shared and specific features, which explicitly characterize the consistency and diversity of omic data, providing an effective strategy to model the structure of cell types. Feature extraction and cell clustering are jointly formulated as an overall objective function, where latent features of data are obtained under the guidance of cell clustering. Extensive experimental results on 13 single-cell multi-omics datasets from diverse organisms and tissues indicate that sLMIC outperforms state-of-the-art algorithms on various measures.
|
19
|
Xu J, Huang D, Zhang X. scmFormer Integrates Large-Scale Single-Cell Proteomics and Transcriptomics Data by Multi-Task Transformer. Adv Sci (Weinh) 2024; 11:e2307835. [PMID: 38483032 PMCID: PMC11109621 DOI: 10.1002/advs.202307835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 01/24/2024] [Indexed: 05/23/2024]
Abstract
Transformer-based models have revolutionized single-cell RNA-seq (scRNA-seq) data analysis. However, their applicability is challenged by the complexity and scale of single-cell multi-omics data. Here a novel single-cell multi-modal/multi-task transformer (scmFormer) is proposed to fill the existing gap in integrating single-cell proteomics with other omics data. Through systematic benchmarking, it is demonstrated that scmFormer excels in integrating large-scale single-cell multimodal data and heterogeneous multi-batch paired multi-omics data, while preserving shared information across batches and distinct biological information. scmFormer achieves a 54.5% higher average F1 score than the second-best method in transferring cell-type labels from single-cell transcriptomics to proteomics data. Using COVID-19 datasets, it is shown that scmFormer successfully integrates over 1.48 million cells on a personal computer. Moreover, it is also shown that scmFormer performs better than existing methods at generating the unmeasured modality and is well-suited for spatial multi-omic data. Thus, scmFormer is a powerful and comprehensive tool for analyzing single-cell multi-omics data.
Collapse
Affiliation(s)
- Jing Xu
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - De‐Shuang Huang
- Eastern Institute for Advanced Study, Eastern Institute of Technology, Ningbo, 315200, China
| | - Xiujun Zhang
- Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China
- Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, China
|
20
|
Tsuchida A, Kaneko T, Nishikawa K, Kawasaki M, Yokokawa R, Shintaku H. Opto-combinatorial indexing enables high-content transcriptomics by linking cell images and transcriptomes. Lab Chip 2024; 24:2287-2297. [PMID: 38506394 DOI: 10.1039/d3lc00866e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2024]
Abstract
We introduce a simple integrated analysis method that links cellular phenotypic behaviour with single-cell RNA sequencing (scRNA-seq) by utilizing a combination of optical indices from cells and hydrogel beads. With our method, the combinations, referred to as joint colour codes, enable the link via matching the optical combinations measured by conventional epi-fluorescence microscopy with the concatenated DNA molecular barcodes created by cell-hydrogel bead pairs and sequenced by next-generation sequencing. We validated our approach by demonstrating an accurate link between the cell image and scRNA-seq with mixed species experiments, longitudinal cell tagging by electroporation and lipofection, and gene expression analysis. Furthermore, we extended our approach to multiplexed chemical transcriptomics, which enabled us to identify distinct phenotypic behaviours in HeLa cells treated with various concentrations of paclitaxel, and determine the corresponding gene regulation associated with the formation of a multipolar spindle.
Affiliation(s)
- Arata Tsuchida
- Cluster for Pioneering Research, RIKEN, Main Research Building 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyotodaigaku-katsura, Nishikyo-ku, Kyoto 615-8540, Japan.
| | - Taikopaul Kaneko
- Cluster for Pioneering Research, RIKEN, Main Research Building 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Kaori Nishikawa
- Cluster for Pioneering Research, RIKEN, Main Research Building 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Mayu Kawasaki
- Cluster for Pioneering Research, RIKEN, Main Research Building 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Ryuji Yokokawa
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyotodaigaku-katsura, Nishikyo-ku, Kyoto 615-8540, Japan.
| | - Hirofumi Shintaku
- Cluster for Pioneering Research, RIKEN, Main Research Building 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
- Department of Micro Engineering, Graduate School of Engineering, Kyoto University, Kyotodaigaku-katsura, Nishikyo-ku, Kyoto 615-8540, Japan.
- Institute for Life and Medical Science, Kyoto University, 53 Kawara-cho, Shogoin, Sakyo-ku, Kyoto 606-8507, Japan
|
21
|
Yuan Q, Duren Z. Inferring gene regulatory networks from single-cell multiome data using atlas-scale external data. Nat Biotechnol 2024:10.1038/s41587-024-02182-7. [PMID: 38609714 DOI: 10.1038/s41587-024-02182-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 02/26/2024] [Indexed: 04/14/2024]
Abstract
Existing methods for gene regulatory network (GRN) inference rely on gene expression data alone or on lower resolution bulk data. Despite the recent integration of chromatin accessibility and RNA sequencing data, learning complex mechanisms from limited independent data points still presents a daunting challenge. Here we present LINGER (Lifelong neural network for gene regulation), a machine-learning method to infer GRNs from single-cell paired gene expression and chromatin accessibility data. LINGER incorporates atlas-scale external bulk data across diverse cellular contexts and prior knowledge of transcription factor motifs as a manifold regularization. LINGER achieves a fourfold to sevenfold relative increase in accuracy over existing methods and reveals a complex regulatory landscape of genome-wide association studies, enabling enhanced interpretation of disease-associated variants and genes. Following the GRN inference from reference single-cell multiome data, LINGER enables the estimation of transcription factor activity solely from bulk or single-cell gene expression data, leveraging the abundance of available gene expression data to identify driver regulators from case-control studies.
Affiliation(s)
- Qiuyue Yuan
- Center for Human Genetics, Department of Genetics and Biochemistry, Clemson University, Greenwood, SC, USA
| | - Zhana Duren
- Center for Human Genetics, Department of Genetics and Biochemistry, Clemson University, Greenwood, SC, USA.
|
22
|
Cao Y, Zhao X, Tang S, Jiang Q, Li S, Li S, Chen S. scButterfly: a versatile single-cell cross-modality translation method via dual-aligned variational autoencoders. Nat Commun 2024; 15:2973. [PMID: 38582890 PMCID: PMC10998864 DOI: 10.1038/s41467-024-47418-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 03/28/2024] [Indexed: 04/08/2024] [Open Access]
Abstract
Recent advancements for simultaneously profiling multi-omics modalities within individual cells have enabled the interrogation of cellular heterogeneity and molecular hierarchy. However, technical limitations lead to highly noisy multi-modal data and substantial costs. Although computational methods have been proposed to translate single-cell data across modalities, broad applications of the methods still remain impeded by formidable challenges. Here, we propose scButterfly, a versatile single-cell cross-modality translation method based on dual-aligned variational autoencoders and data augmentation schemes. With comprehensive experiments on multiple datasets, we provide compelling evidence of scButterfly's superiority over baseline methods in preserving cellular heterogeneity while translating datasets of various contexts and in revealing cell type-specific biological insights. Besides, we demonstrate the extensive applications of scButterfly for integrative multi-omics analysis of single-modality data, data enhancement of poor-quality single-cell multi-omics, and automatic cell type annotation of scATAC-seq data. Moreover, scButterfly can be generalized to unpaired data training, perturbation-response analysis, and consecutive translation.
Affiliation(s)
- Yichuan Cao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
| | - Xiamiao Zhao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
| | - Songming Tang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
| | - Qun Jiang
- MOE Key Laboratory of Bioinformatics and Bioinformatics Division of BNRIST, Department of Automation, Tsinghua University, 100084, Beijing, China
| | - Sijie Li
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China
| | - Siyu Li
- School of Statistics and Data Science, Nankai University, Tianjin, 300071, China
| | - Shengquan Chen
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin, 300071, China.
| |
|
23
|
Ren L, Huang D, Liu H, Ning L, Cai P, Yu X, Zhang Y, Luo N, Lin H, Su J, Zhang Y. Applications of single-cell omics and spatial transcriptomics technologies in gastric cancer (Review). Oncol Lett 2024; 27:152. [PMID: 38406595 PMCID: PMC10885005 DOI: 10.3892/ol.2024.14285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 01/19/2024] [Indexed: 02/27/2024] Open
Abstract
Gastric cancer (GC) is a prominent contributor to global cancer-related mortalities, and a deeper understanding of its molecular characteristics and tumor heterogeneity is required. Single-cell omics and spatial transcriptomics (ST) technologies have revolutionized cancer research by enabling the exploration of cellular heterogeneity and molecular landscapes at the single-cell level. In the present review, an overview of the advancements in single-cell omics and ST technologies and their applications in GC research is provided. Firstly, multiple single-cell omics and ST methods are discussed, highlighting their ability to offer unique insights into gene expression, genetic alterations, epigenomic modifications, protein expression patterns and cellular location in tissues. Furthermore, a summary is provided of key findings from previous research on single-cell omics and ST methods used in GC, which have provided valuable insights into genetic alterations, tumor diagnosis and prognosis, tumor microenvironment analysis, and treatment response. In summary, the application of single-cell omics and ST technologies has revealed the levels of cellular heterogeneity and the molecular characteristics of GC, and holds promise for improving diagnostics, personalized treatments and patient outcomes in GC.
Affiliation(s)
- Liping Ren
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, Sichuan 611844, P.R. China
| | - Danni Huang
- Department of Radiology, Central South University Xiangya School of Medicine Affiliated Haikou People's Hospital, Haikou, Hainan 570208, P.R. China
| | - Hongjiang Liu
- School of Computer Science and Technology, Aba Teachers College, Aba, Sichuan 624099, P.R. China
| | - Lin Ning
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, Sichuan 611844, P.R. China
| | - Peiling Cai
- School of Basic Medical Sciences, Chengdu University, Chengdu, Sichuan 610106, P.R. China
| | - Xiaolong Yu
- Hainan Yazhou Bay Seed Laboratory, Sanya Nanfan Research Institute, Material Science and Engineering Institute of Hainan University, Sanya, Hainan 572025, P.R. China
| | - Yang Zhang
- Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, Sichuan 611137, P.R. China
| | - Nanchao Luo
- School of Computer Science and Technology, Aba Teachers College, Aba, Sichuan 624099, P.R. China
| | - Hao Lin
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, P.R. China
| | - Jinsong Su
- Research Institute of Integrated Traditional Chinese Medicine and Western Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, Sichuan 611137, P.R. China
| | - Yinghui Zhang
- School of Healthcare Technology, Chengdu Neusoft University, Chengdu, Sichuan 611844, P.R. China
| |
|
24
|
Sakaue S, Weinand K, Isaac S, Dey KK, Jagadeesh K, Kanai M, Watts GFM, Zhu Z, Brenner MB, McDavid A, Donlin LT, Wei K, Price AL, Raychaudhuri S. Tissue-specific enhancer-gene maps from multimodal single-cell data identify causal disease alleles. Nat Genet 2024; 56:615-626. [PMID: 38594305 DOI: 10.1038/s41588-024-01682-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 02/07/2024] [Indexed: 04/11/2024]
Abstract
Translating genome-wide association study (GWAS) loci into causal variants and genes requires accurate cell-type-specific enhancer-gene maps from disease-relevant tissues. Building enhancer-gene maps is essential but challenging with current experimental methods in primary human tissues. Here we developed a nonparametric statistical method, SCENT (single-cell enhancer target gene mapping), that models association between enhancer chromatin accessibility and gene expression in single-cell or nucleus multimodal RNA sequencing and ATAC sequencing data. We applied SCENT to 9 multimodal datasets including >120,000 single cells or nuclei and created 23 cell-type-specific enhancer-gene maps. These maps were highly enriched for causal variants in expression quantitative trait loci and GWAS for 1,143 diseases and traits. We identified likely causal genes for both common and rare diseases and linked somatic mutation hotspots to target genes. We demonstrate that application of SCENT to multimodal data from disease-relevant human tissue enables the scalable construction of accurate cell-type-specific enhancer-gene maps, essential for defining noncoding variant function.
Affiliation(s)
- Saori Sakaue
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kathryn Weinand
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Shakson Isaac
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Kushal K Dey
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Karthik Jagadeesh
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Masahiro Kanai
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
| | - Gerald F M Watts
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Zhu Zhu
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael B Brenner
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Andrew McDavid
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA
| | - Laura T Donlin
- Hospital for Special Surgery, New York, NY, USA
- Weill Cornell Medicine, New York, NY, USA
| | - Kevin Wei
- Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Alkes L Price
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Epidemiology, Harvard T. H. Chan School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA
| | - Soumya Raychaudhuri
- Center for Data Sciences, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Divisions of Genetics and Rheumatology, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| |
|
25
|
Yuan CU, Quah FX, Hemberg M. Single-cell and spatial transcriptomics: Bridging current technologies with long-read sequencing. Mol Aspects Med 2024; 96:101255. [PMID: 38368637 DOI: 10.1016/j.mam.2024.101255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Revised: 01/30/2024] [Accepted: 02/07/2024] [Indexed: 02/20/2024]
Abstract
Single-cell technologies have transformed biomedical research over the last decade, opening up new possibilities for understanding cellular heterogeneity, both at the genomic and transcriptomic level. In addition, more recent developments of spatial transcriptomics technologies have made it possible to profile cells in their tissue context. In parallel, there have been substantial advances in sequencing technologies, and the third generation of methods are able to produce reads that are tens of kilobases long, with error rates matching the second generation short reads. Long reads technologies make it possible to better map large genome rearrangements and quantify isoform specific abundances. This further improves our ability to characterize functionally relevant heterogeneity. Here, we show how researchers have begun to combine single-cell, spatial transcriptomics, and long-read technologies, and how this is resulting in powerful new approaches to profiling both the genome and the transcriptome. We discuss the achievements so far, and we highlight remaining challenges and opportunities.
Affiliation(s)
- Chengwei Ulrika Yuan
- Department of Biochemistry, University of Cambridge, Cambridge, UK; Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
| | - Fu Xiang Quah
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Martin Hemberg
- Gene Lay Institute, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA.
| |
|
26
|
Wen X, Luo Z, Zhao W, Calandrelli R, Nguyen TC, Wan X, Charles Richard JL, Zhong S. Single-cell multiplex chromatin and RNA interactions in ageing human brain. Nature 2024; 628:648-656. [PMID: 38538789 PMCID: PMC11023937 DOI: 10.1038/s41586-024-07239-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Accepted: 02/26/2024] [Indexed: 04/06/2024]
Abstract
Dynamically organized chromatin complexes often involve multiplex chromatin interactions and sometimes chromatin-associated RNA [1-3]. Chromatin complex compositions change during cellular differentiation and ageing, and are expected to be highly heterogeneous among terminally differentiated single cells [4-7]. Here we introduce the multinucleic acid interaction mapping in single cells (MUSIC) technique for concurrent profiling of multiplex chromatin interactions, gene expression and RNA-chromatin associations within individual nuclei. When applied to 14 human frontal cortex samples from older donors, MUSIC delineated diverse cortical cell types and states. We observed that nuclei exhibiting fewer short-range chromatin interactions were correlated with both an 'older' transcriptomic signature and Alzheimer's disease pathology. Furthermore, the cell type exhibiting chromatin contacts between cis expression quantitative trait loci and a promoter tends to be that in which these cis expression quantitative trait loci specifically affect the expression of their target gene. In addition, female cortical cells exhibit highly heterogeneous interactions between XIST non-coding RNA and chromosome X, along with diverse spatial organizations of the X chromosomes. MUSIC presents a potent tool for exploration of chromatin architecture and transcription at cellular resolution in complex tissues.
Affiliation(s)
- Xingzhao Wen
- Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA
| | - Zhifei Luo
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Department of Genetics, School of Medicine, Stanford, CA, USA
| | - Wenxin Zhao
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
| | - Riccardo Calandrelli
- Institute of Engineering in Medicine, University of California San Diego, La Jolla, CA, USA
| | - Tri C Nguyen
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
- Department of Genetics, School of Medicine, Stanford, CA, USA
| | - Xueyi Wan
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA, USA
| | | | - Sheng Zhong
- Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA, USA.
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA, USA.
- Institute of Engineering in Medicine, University of California San Diego, La Jolla, CA, USA.
| |
|
27
|
Tian T, Lin S, Yang C. Beyond single cells: microfluidics empowering multiomics analysis. Anal Bioanal Chem 2024; 416:2203-2220. [PMID: 38008783 DOI: 10.1007/s00216-023-05028-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2023] [Revised: 10/26/2023] [Accepted: 10/30/2023] [Indexed: 11/28/2023]
Abstract
Single-cell multiomics technologies empower simultaneous measurement of multiple types of molecules within individual cells, providing a more profound comprehension compared with the analysis of discrete molecular layers from different cells. Microfluidic technology, on the other hand, has emerged as a pivotal facilitator for high-throughput single-cell analysis, offering precise control and manipulation of individual cells. This review primarily appraises cutting-edge microfluidic platforms employed in single-cell multiomics analysis. Furthermore, it discusses technological advancements in various single-cell omics such as genomics, transcriptomics, epigenomics, and proteomics, with their prospective applications. Finally, it provides future prospects of these integrated single-cell multiomics methodologies, shedding light on the possibilities for future biological research.
Affiliation(s)
- Tian Tian
- Chemistry and Biomedicine Innovation Center (ChemBIC), School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, China
| | - Shichao Lin
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province, Xiamen, 361005, China
| | - Chaoyong Yang
- Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province, Xiamen, 361005, China.
- The MOE Key Laboratory of Spectrochemical Analysis and Instrumentation, State Key Laboratory of Physical Chemistry of Solid Surfaces, Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| |
|
28
|
Chen F, Zou G, Wu Y, Ou-Yang L. Clustering single-cell multi-omics data via graph regularized multi-view ensemble learning. Bioinformatics 2024; 40:btae169. [PMID: 38547401 PMCID: PMC11015955 DOI: 10.1093/bioinformatics/btae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 02/21/2024] [Accepted: 03/26/2024] [Indexed: 04/15/2024] Open
Abstract
MOTIVATION: Single-cell clustering plays a crucial role in distinguishing between cell types, facilitating the analysis of cell heterogeneity mechanisms. While many existing clustering methods rely solely on gene expression data obtained from single-cell RNA sequencing techniques to identify cell clusters, the information contained in mono-omic data is often limited, leading to suboptimal clustering performance. The emergence of single-cell multi-omics sequencing technologies enables the integration of multiple omics data for identifying cell clusters, but how to integrate different omics data effectively remains challenging. In addition, designing a clustering method that performs well across various types of multi-omics data poses a persistent challenge due to the data's inherent characteristics. RESULTS: In this paper, we propose a graph-regularized multi-view ensemble clustering (GRMEC-SC) model for single-cell clustering. Our proposed approach can adaptively integrate multiple omics data and leverage insights from multiple base clustering results. We extensively evaluate our method on five multi-omics datasets through a series of rigorous experiments. The results of these experiments demonstrate that our GRMEC-SC model achieves competitive performance across diverse multi-omics datasets with varying characteristics. AVAILABILITY AND IMPLEMENTATION: Implementation of GRMEC-SC, along with examples, can be found on the GitHub repository: https://github.com/polarisChen/GRMEC-SC.
Affiliation(s)
- Fuqun Chen
- College of Electronic and Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen 518060, Guangdong, China
- Shenzhen Key Laboratory of Media Security and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen 518060, Guangdong, China
| | - Guanhua Zou
- College of Electronic and Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen 518060, Guangdong, China
- Shenzhen Key Laboratory of Media Security and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen 518060, Guangdong, China
| | - Yongxian Wu
- College of Electronic and Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen 518060, Guangdong, China
- Shenzhen Key Laboratory of Media Security and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen 518060, Guangdong, China
| | - Le Ou-Yang
- College of Electronic and Information Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University, Shenzhen 518060, Guangdong, China
- Shenzhen Key Laboratory of Media Security and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen 518060, Guangdong, China
| |
|
29
|
Zhang W, Cui Y, Liu B, Loza M, Park SJ, Nakai K. HyGAnno: hybrid graph neural network-based cell type annotation for single-cell ATAC sequencing data. Brief Bioinform 2024; 25:bbae152. [PMID: 38581422 PMCID: PMC10998639 DOI: 10.1093/bib/bbae152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 02/19/2024] [Accepted: 03/10/2024] [Indexed: 04/08/2024] Open
Abstract
Reliable cell type annotations are crucial for investigating cellular heterogeneity in single-cell omics data. Although various computational approaches have been proposed for single-cell RNA sequencing (scRNA-seq) annotation, high-quality cell labels are still lacking in single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) data, because of extreme sparsity and inconsistent chromatin accessibility between datasets. Here, we present a novel automated cell annotation method that transfers cell type information from a well-labeled scRNA-seq reference to an unlabeled scATAC-seq target, via a parallel graph neural network, in a semi-supervised manner. Unlike existing methods that utilize only gene expression or gene activity features, HyGAnno leverages genome-wide accessibility peak features to facilitate the training process. In addition, HyGAnno reconstructs a reference-target cell graph to detect cells with low prediction reliability, according to their specific graph connectivity patterns. HyGAnno was assessed across various datasets, showcasing its strengths in precise cell annotation, generating interpretable cell embeddings, robustness to noisy reference data and adaptability to tumor tissues.
Affiliation(s)
- Weihang Zhang
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
| | - Yang Cui
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
| | - Bowen Liu
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
| | - Martin Loza
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| | - Sung-Joon Park
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| | - Kenta Nakai
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
- Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
| |
|
30
|
Wen X, Luo Z, Zhao W, Calandrelli R, Nguyen TC, Wan X, Richard JLC, Zhong S. Single-cell multiplex chromatin and RNA interactions in aging human brain. bioRxiv 2024:2023.06.28.546457. [PMID: 37425846 PMCID: PMC10326989 DOI: 10.1101/2023.06.28.546457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Dynamically organized chromatin complexes often involve multiplex chromatin interactions and sometimes chromatin-associated RNA (caRNA) [1-3]. Chromatin complex compositions change during cellular differentiation and aging, and are expected to be highly heterogeneous among terminally differentiated single cells [4-7]. Here we introduce the Multi-Nucleic Acid Interaction Mapping in Single Cell (MUSIC) technique for concurrent profiling of multiplex chromatin interactions, gene expression, and RNA-chromatin associations within individual nuclei. Applied to 14 human frontal cortex samples from elderly donors, MUSIC delineates diverse cortical cell types and states. We observed that nuclei exhibiting fewer short-range chromatin interactions are correlated with an "older" transcriptomic signature and with Alzheimer's pathology. Furthermore, the cell type exhibiting chromatin contacts between cis expression quantitative trait loci (cis eQTLs) and a promoter tends to be the cell type where these cis eQTLs specifically affect their target gene's expression. Additionally, female cortical cells exhibit highly heterogeneous interactions between the XIST non-coding RNA and Chromosome X, along with diverse spatial organizations of the X chromosomes. MUSIC presents a potent tool for exploring chromatin architecture and transcription at cellular resolution in complex tissues.
Affiliation(s)
- Xingzhao Wen
- Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA 92093, USA
| | - Zhifei Luo
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | - Wenxin Zhao
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | - Riccardo Calandrelli
- Institute of Engineering in Medicine, University of California San Diego, La Jolla, CA 92093, USA
| | - Tri C. Nguyen
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | - Xueyi Wan
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
| | | | - Sheng Zhong
- Program in Bioinformatics and Systems Biology, University of California San Diego, La Jolla, CA 92093, USA
- Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA 92093, USA
- Institute of Engineering in Medicine, University of California San Diego, La Jolla, CA 92093, USA
| |
|
31
|
Lim J, Park C, Kim M, Kim H, Kim J, Lee DS. Advances in single-cell omics and multiomics for high-resolution molecular profiling. Exp Mol Med 2024; 56:515-526. [PMID: 38443594 PMCID: PMC10984936 DOI: 10.1038/s12276-024-01186-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 12/05/2023] [Accepted: 12/13/2023] [Indexed: 03/07/2024] Open
Abstract
Single-cell omics technologies have revolutionized molecular profiling by providing high-resolution insights into cellular heterogeneity and complexity. Traditional bulk omics approaches average signals from heterogeneous cell populations, thereby obscuring important cellular nuances. Single-cell omics studies enable the analysis of individual cells and reveal diverse cell types, dynamic cellular states, and rare cell populations. These techniques offer unprecedented resolution and sensitivity, enabling researchers to unravel the molecular landscape of individual cells. Furthermore, the integration of multimodal omics data within a single cell provides a comprehensive and holistic view of cellular processes. By combining multiple omics dimensions, multimodal omics approaches can facilitate the elucidation of complex cellular interactions, regulatory networks, and molecular mechanisms. This integrative approach enhances our understanding of cellular systems, from development to disease. This review provides an overview of the recent advances in single-cell and multimodal omics for high-resolution molecular profiling. We discuss the principles and methodologies for representatives of each omics method, highlighting the strengths and limitations of the different techniques. In addition, we present case studies demonstrating the applications of single-cell and multimodal omics in various fields, including developmental biology, neurobiology, cancer research, immunology, and precision medicine.
Affiliation(s)
- Jongsu Lim
- Department of Life Science, University of Seoul, Seoul, 02504, Republic of Korea
| | - Chanho Park
- Department of Life Science, University of Seoul, Seoul, 02504, Republic of Korea
| | - Minjae Kim
- Department of Life Science, University of Seoul, Seoul, 02504, Republic of Korea
| | - Hyukhee Kim
- Department of Life Science, University of Seoul, Seoul, 02504, Republic of Korea
| | - Junil Kim
- School of Systems Biomedical Science, Soongsil University, Seoul, 06978, Republic of Korea
| | - Dong-Sung Lee
- Department of Life Science, University of Seoul, Seoul, 02504, Republic of Korea.
| |
|
32
|
Liu C, Wang L, Liu Z. Quantification and visualization of cis-regulatory dynamics in single-cell multi-omics data with TREASMO. NAR Genom Bioinform 2024; 6:lqae007. [PMID: 38312937 PMCID: PMC10836941 DOI: 10.1093/nargab/lqae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 11/21/2023] [Accepted: 01/31/2024] [Indexed: 02/06/2024] Open
Abstract
Recent advances in single-cell multi-omics technologies have provided unprecedented insights into regulatory processes. We introduce TREASMO, a versatile Python package designed to quantify and visualize transcriptional regulatory dynamics in single-cell multi-omics datasets. TREASMO has four modules, spanning data preparation, correlation quantification, downstream analysis and visualization, enabling comprehensive dataset exploration. By introducing a novel single-cell gene-peak correlation strength index, TREASMO facilitates accurate identification of regulatory changes at single-cell resolution. Validation on a hematopoietic stem and progenitor cell dataset showcases TREASMO's capacity in quantifying the gene-peak correlation strength at the single-cell level, identifying regulatory markers and discovering temporal regulatory patterns along the trajectory.
Affiliation(s)
- Chaozhong Liu
- Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA
| | - Linhua Wang
- Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX 77030, USA
| | - Zhandong Liu
- Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston, TX 77030, USA
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
| |
|
33
|
Kaur H, Jha P, Ochatt SJ, Kumar V. Single-cell transcriptomics is revolutionizing the improvement of plant biotechnology research: recent advances and future opportunities. Crit Rev Biotechnol 2024; 44:202-217. [PMID: 36775666 DOI: 10.1080/07388551.2023.2165900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2022] [Revised: 11/04/2022] [Accepted: 12/08/2022] [Indexed: 02/14/2023]
Abstract
Single-cell approaches are a promising way to obtain high-resolution transcriptomics data and have the potential to revolutionize the study of plant growth and development. Recent years have seen unprecedented technological advances in plant biology that allow the transcriptional information of individual cells to be studied by single-cell RNA sequencing (scRNA-seq). This review focuses on the advancements of single-cell transcriptomics in plants over the past few years. It also offers new insight into how these emerging methods will expedite advanced research in plant biotechnology in the near future. Lastly, the technological hurdles and inherent limitations of single-cell technology that must be overcome to realize this potential knowledge gain are critically analyzed and discussed.
Affiliation(s)
- Harmeet Kaur
- Division of Research and Development, Plant Biotechnology Lab, Lovely Professional University, Phagwara, Punjab, India
- Department of Biotechnology, Lovely Faculty of Technology and Sciences, Lovely Professional University, Phagwara, Punjab, India
| | - Priyanka Jha
- Department of Biotechnology, Lovely Faculty of Technology and Sciences, Lovely Professional University, Phagwara, Punjab, India
- Department of Research Facilitation, Division of Research and Development, Lovely Professional University, Phagwara, Punjab, India
| | - Sergio J Ochatt
- Agroécologie, Institut Agro Dijon, INRAE, Univ. Bourgogne Franche-Comté, Dijon, France
| | - Vijay Kumar
- Division of Research and Development, Plant Biotechnology Lab, Lovely Professional University, Phagwara, Punjab, India
- Department of Biotechnology, Lovely Faculty of Technology and Sciences, Lovely Professional University, Phagwara, Punjab, India
|
34
|
Zhou G, Li T, Du J, Wu M, Lin D, Pu W, Zhang J, Gu Z. Harnessing HetHydrogel: A Universal Platform to Dropletize Single-Cell Multiomics. SMALL METHODS 2024:e2301631. [PMID: 38419597 DOI: 10.1002/smtd.202301631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 01/12/2024] [Indexed: 03/02/2024]
Abstract
A universal platform is developed for dropletizing single cell plate-based multiomic assays, consisting of three main pillars: a miniaturized open Heterogeneous Hydrogel reactor (abbreviated HetHydrogel) for multi-step biochemistry, its tunable permeability that allows Tn5 tagmentation, and single cell droplet barcoding. Through optimizing the HetHydrogel manufacturing procedure, the chemical composition, and cell permeation conditions, simultaneous high-throughput mitochondrial DNA genotyping and chromatin profiling at the single-cell level are demonstrated using a mixed-species experiment. This platform offers a powerful way to investigate the genotype-phenotype relationships of various mtDNA mutations in biological processes. The HetHydrogel platform is believed to have the potential to democratize droplet technologies, upgrading a whole range of plate-based single cell assays to high throughput format.
Affiliation(s)
- Guoqiang Zhou
- Center for Mitochondrial Genetics and Health, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, 511458, China
| | - Ting Li
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Jingjing Du
- Center for Mitochondrial Genetics and Health, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, 511458, China
| | - Mengying Wu
- Center for Mitochondrial Genetics and Health, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, 511458, China
| | - Deng Lin
- Center for Mitochondrial Genetics and Health, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, 511458, China
| | - Weilin Pu
- Center for Mitochondrial Genetics and Health, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, 511458, China
| | - Jingwei Zhang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, Shanghai, 200438, China
- Zhejiang Lab, Hangzhou, 310000, China
| | - Zhenglong Gu
- Center for Mitochondrial Genetics and Health, Greater Bay Area Institute of Precision Medicine (Guangzhou), Fudan University, Guangzhou, 511458, China
|
35
|
Kazwini NE, Sanguinetti G. SHARE-Topic: Bayesian interpretable modeling of single-cell multi-omic data. Genome Biol 2024; 25:55. [PMID: 38395871 PMCID: PMC10885556 DOI: 10.1186/s13059-024-03180-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Accepted: 01/31/2024] [Indexed: 02/25/2024] Open
Abstract
Multi-omic single-cell technologies, which simultaneously measure the transcriptional and epigenomic state of the same cell, enable understanding epigenetic mechanisms of gene regulation. However, noisy and sparse data pose fundamental statistical challenges to extract biological knowledge from complex datasets. SHARE-Topic, a Bayesian generative model of multi-omic single cell data using topic models, aims to address these challenges. SHARE-Topic identifies common patterns of co-variation between different omic layers, providing interpretable explanations for the data complexity. Tested on data from different technological platforms, SHARE-Topic provides low dimensional representations recapitulating known biology and defines associations between genes and distal regulators in individual cells.
Affiliation(s)
- Nour El Kazwini
- Theoretical and Scientific Data Science, Scuola Internazionale Superiore di Studi Avanzati, Trieste, Italy
| | - Guido Sanguinetti
- Theoretical and Scientific Data Science, Scuola Internazionale Superiore di Studi Avanzati, Trieste, Italy.
|
36
|
Kojima Y, Mii S, Hayashi S, Hirose H, Ishikawa M, Akiyama M, Enomoto A, Shimamura T. Single-cell colocalization analysis using a deep generative model. Cell Syst 2024; 15:180-192.e7. [PMID: 38387441 DOI: 10.1016/j.cels.2024.01.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 03/06/2023] [Accepted: 01/23/2024] [Indexed: 02/24/2024]
Abstract
Analyzing colocalization of single cells with heterogeneous molecular phenotypes is essential for understanding cell-cell interactions, and cellular responses to external stimuli and their biological functions in diseases and tissues. However, existing computational methodologies identified the colocalization patterns between predefined cell populations, which can obscure the molecular signatures arising from intercellular communication. Here, we introduce DeepCOLOR, a computational framework based on a deep generative model that recovers intercellular colocalization networks with single-cell resolution by the integration of single-cell and spatial transcriptomes. Along with colocalized population detection accuracy that is superior to existing methods in simulated dataset, DeepCOLOR identified plausible cell-cell interaction candidates between colocalized single cells and segregated cell populations defined by the colocalization relationships in mouse brain tissues, human squamous cell carcinoma samples, and human lung tissues infected with SARS-CoV-2. DeepCOLOR is applicable to studying cell-cell interactions behind various spatial niches. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Yasuhiro Kojima
- Laboratory of Computational Life Science, National Cancer Center Research Institute, Chuo-ku, Tokyo 104-0045, Japan; Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo 113-0034, Japan; Division of Systems Biology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan.
| | - Shinji Mii
- Department of Pathology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan
| | - Shuto Hayashi
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo 113-0034, Japan; Division of Systems Biology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan
| | - Haruka Hirose
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo 113-0034, Japan; Division of Systems Biology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan
| | - Masato Ishikawa
- Institute for Life and Medical Sciences, Kyoto University, Kyoto, Kyoto 606-8507, Japan; Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
| | - Masashi Akiyama
- Department of Dermatology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan
| | - Atsushi Enomoto
- Department of Pathology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan
| | - Teppei Shimamura
- Department of Computational and Systems Biology, Medical Research Institute, Tokyo Medical and Dental University, Bunkyo-ku, Tokyo 113-0034, Japan; Division of Systems Biology, Nagoya University Graduate School of Medicine, Nagoya, Aichi 466-8550, Japan.
|
37
|
Martini L, Bardini R, Savino A, Di Carlo S. Cross-Omic Transcription Factor Analysis: An Insight on Transcription Factor Accessibility and Expression Correlation. Genes (Basel) 2024; 15:268. [PMID: 38540327 PMCID: PMC10970009 DOI: 10.3390/genes15030268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 02/13/2024] [Accepted: 02/17/2024] [Indexed: 06/15/2024] Open
Abstract
It is well known how sequencing technologies propelled cellular biology research in recent years, providing incredible insight into the basic mechanisms of cells. Single-cell RNA sequencing is at the front in this field, with single-cell ATAC sequencing supporting it and becoming more popular. In this regard, multi-modal technologies play a crucial role, allowing the possibility to simultaneously perform the mentioned sequencing modalities on the same cells. Yet, there still needs to be a clear and dedicated way to analyze these multi-modal data. One of the current methods is to calculate the Gene Activity Matrix (GAM), which summarizes the accessibility of the genes at the genomic level, to have a more direct link with the transcriptomic data. However, this concept is not well defined, and it is unclear how various accessible regions impact the expression of the genes. Moreover, the transcription process is highly regulated by the transcription factors that bind to the different DNA regions. Therefore, this work presents a continuation of the meta-analysis of Genomic-Annotated Gene Activity Matrix (GAGAM) contributions, aiming to investigate the correlation between the TF expression and motif information in the different functional genomic regions to understand the different Transcription Factors (TFs) dynamics involved in different cell types.
Affiliation(s)
- Stefano Di Carlo
- Control and Computer Engineering Department, Politecnico di Torino, 10129 Torino, Italy; (L.M.); (R.B.); (A.S.)
|
38
|
Pereira MF, Finazzi V, Rizzuti L, Aprile D, Aiello V, Mollica L, Riva M, Soriani C, Dossena F, Shyti R, Castaldi D, Tenderini E, Carminho-Rodrigues MT, Bally JF, de Vries BBA, Gabriele M, Vitriolo A, Testa G. YY1 mutations disrupt corticogenesis through a cell-type specific rewiring of cell-autonomous and non-cell-autonomous transcriptional programs. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.16.580337. [PMID: 38405909 PMCID: PMC10888784 DOI: 10.1101/2024.02.16.580337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Germline mutations of YY1 cause Gabriele-de Vries syndrome (GADEVS), a neurodevelopmental disorder featuring intellectual disability and a wide range of systemic manifestations. To dissect the cellular and molecular mechanisms underlying GADEVS, we combined large-scale imaging, single-cell multiomics and gene regulatory network reconstruction in 2D and 3D patient-derived physiopathologically relevant cell lineages. YY1 haploinsufficiency causes a pervasive alteration of cell type specific transcriptional networks, disrupting corticogenesis at the level of neural progenitors and terminally differentiated neurons, including cytoarchitectural defects reminiscent of GADEVS clinical features. Transcriptional alterations in neurons propagated to neighboring astrocytes through a major non-cell autonomous pro-inflammatory effect that grounds the rationale for modulatory interventions. Together, neurodevelopmental trajectories, synaptic formation and neuronal-astrocyte cross talk emerged as salient domains of YY1 dosage-dependent vulnerability. Mechanistically, cell-type resolved reconstruction of gene regulatory networks uncovered the regulatory interplay between YY1, NEUROG2 and ETV5 and its aberrant rewiring in GADEVS. Our findings underscore the reach of advanced in vitro models in capturing developmental antecedents of clinical features and exposing their underlying mechanisms to guide the search for targeted interventions.
Affiliation(s)
- Marlene F Pereira
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Veronica Finazzi
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Ludovico Rizzuti
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Davide Aprile
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Vittorio Aiello
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Luca Mollica
- Department of Medical Biotechnology and Translational Medicine, University of Milan, Italy
| | - Matteo Riva
- Department of Medical Biotechnology and Translational Medicine, University of Milan, Italy
| | - Chiara Soriani
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
| | - Reinald Shyti
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Davide Castaldi
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Erika Tenderini
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Telethon Institute of Genetics and Medicine (TIGEM), Pozzuoli, Italy
| | - Julien F Bally
- Service of Neurology, Department of Clinical Neurosciences, Lausanne University Hospital & University of Lausanne, Lausanne, Switzerland
| | - Michele Gabriele
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA 02139, USA
- Koch Institute for Integrative Cancer Research, Cambridge, MA 02139, USA
| | - Alessandro Vitriolo
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
| | - Giuseppe Testa
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Via Adamello 16, 20139, Milan, Italy
- Department of Oncology and Hemato-Oncology, University of Milan, Via Santa Sofia 9, 20122, Milan, Italy
- Human Technopole, Viale Rita Levi-Montalcini 1, 20157, Milan, Italy
|
39
|
Bai D, Zhang X, Xiang H, Guo Z, Zhu C, Yi C. Simultaneous single-cell analysis of 5mC and 5hmC with SIMPLE-seq. Nat Biotechnol 2024:10.1038/s41587-024-02148-9. [PMID: 38336903 DOI: 10.1038/s41587-024-02148-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
Dynamic 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications to DNA regulate gene expression in a cell-type-specific manner and are associated with various biological processes, but the two modalities have not yet been measured simultaneously from the same genome at the single-cell level. Here we present SIMPLE-seq, a scalable, base resolution method for joint analysis of 5mC and 5hmC from thousands of single cells. Based on orthogonal labeling and recording of 'C-to-T' mutational signals from 5mC and 5hmC sites, SIMPLE-seq detects these two modifications from the same molecules in single cells and enables unbiased DNA methylation dynamics analysis of heterogeneous biological samples. We applied this method to mouse embryonic stem cells, human peripheral blood mononuclear cells and mouse brain to give joint epigenome maps at single-cell and single-molecule resolution. Integrated analysis of these two cytosine modifications reveals distinct epigenetic patterns associated with divergent regulatory programs in different cell types as well as cell states.
Affiliation(s)
- Dongsheng Bai
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
| | - Xiaoting Zhang
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
| | - Huifen Xiang
- Department of Obstetrics and Gynecology, First Affiliated Hospital of Anhui Medical University, Anhui, China
- NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract, Anhui Medical University, Anhui, China
| | - Zijian Guo
- State Key Laboratory of Coordination Chemistry, Coordination Chemistry Institute, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, China
| | - Chenxu Zhu
- New York Genome Center, New York, NY, USA.
- Department of Physiology and Biophysics, Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA.
| | - Chengqi Yi
- State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China.
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China.
- Department of Chemical Biology and Synthetic and Functional Biomolecules Center, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
|
40
|
Lin Y, Wu TY, Chen X, Wan S, Chao B, Xin J, Yang JYH, Wong WH, Wang YXR. Data integration and inference of gene regulation using single-cell temporal multimodal data with scTIE. Genome Res 2024; 34:119-133. [PMID: 38190633 PMCID: PMC10903952 DOI: 10.1101/gr.277960.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 12/13/2023] [Indexed: 01/10/2024]
Abstract
Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space by using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal data sets, we show scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome data set we generated from differentiating mouse embryonic stem cells over time, we show scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.
Affiliation(s)
- Yingxin Lin
- School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia
- Charles Perkins Centre, The University of Sydney, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR 999077, China
| | - Tung-Yu Wu
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
| | - Xi Chen
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
| | - Sheng Wan
- Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
| | - Brian Chao
- Department of Electrical Engineering, Stanford University, Stanford, California 94305-9505, USA
| | - Jingxue Xin
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
| | - Jean Y H Yang
- School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia
- Charles Perkins Centre, The University of Sydney, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR 999077, China
| | - Wing H Wong
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
- Department of Biomedical Data Science, Stanford University, Stanford, California 94305-5464, USA
- Bio-X Program, Stanford University, Stanford, California 94305, USA
| | - Y X Rachel Wang
- School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia
|
41
|
Liu J, Ma J, Wen J, Zhou X. A Cell Cycle-aware Network for Data Integration and Label Transferring of Single-cell RNA-seq and ATAC-seq. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.31.578213. [PMID: 38352302 PMCID: PMC10862874 DOI: 10.1101/2024.01.31.578213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
In recent years, the integration of single-cell multi-omics data has provided a more comprehensive understanding of cell functions and internal regulatory mechanisms than any single-omics perspective, but it still faces many challenges, such as omics-variance, sparsity, cell heterogeneity and confounding factors. The cell cycle is regarded as a confounder when analyzing other factors in single-cell RNA-seq data, but it is not clear how it affects integrated single-cell multi-omics data. Here, we developed a Cell Cycle-Aware Network (CCAN) to remove cell cycle effects from the integrated single-cell multi-omics data while keeping the cell type-specific variations. This is the first computational model to study the cell-cycle effects in the integration of single-cell multi-omics data. Validations on several benchmark datasets show the outstanding performance of CCAN in a variety of downstream analyses and applications, including removing cell cycle effects and batch effects of scRNA-seq datasets from different protocols, integrating paired and unpaired scRNA-seq and scATAC-seq data, accurately transferring cell type labels from scRNA-seq to scATAC-seq data, and characterizing the differentiation process from hematopoietic stem cells to different lineages in the integration of differentiation data.
|
42
|
Wang X, Wu X, Hong N, Jin W. Progress in single-cell multimodal sequencing and multi-omics data integration. Biophys Rev 2024; 16:13-28. [PMID: 38495443 PMCID: PMC10937857 DOI: 10.1007/s12551-023-01092-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/18/2023] [Accepted: 06/27/2023] [Indexed: 03/19/2024] Open
Abstract
With the rapid advance of single-cell sequencing technology, cell heterogeneity in various biological processes has been dissected at different omics levels. However, single-cell mono-omics results in fragmentation of information and cannot provide complete cell states. In the past several years, a variety of single-cell multimodal omics technologies have been developed to jointly profile multiple molecular modalities, including genome, transcriptome, epigenome, and proteome, from the same single cell. With the availability of single-cell multimodal omics data, we can simultaneously investigate the effects of genomic mutation or epigenetic modification on transcription and translation, and reveal the potential mechanisms underlying disease pathogenesis. Driven by the massive volume of single-cell omics data, integration methods for single-cell multi-omics data have developed rapidly. Integration of the massive multi-omics single-cell data in public databases in the future will make it possible to construct a cell atlas of multi-omics, enabling us to comprehensively understand cell state and gene regulation at single-cell resolution. In this review, we summarize the experimental methods for single-cell multimodal omics data and computational methods for multi-omics data integration. We also discuss the future development of this field.
Affiliation(s)
- Xuefei Wang
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Xinchao Wu
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Ni Hong
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Wenfei Jin
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
|
43
|
Zhang K, Zemke NR, Armand EJ, Ren B. A fast, scalable and versatile tool for analysis of single-cell omics data. Nat Methods 2024; 21:217-227. [PMID: 38191932 PMCID: PMC10864184 DOI: 10.1038/s41592-023-02139-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/09/2023] [Accepted: 11/23/2023] [Indexed: 01/10/2024]
Abstract
Single-cell omics technologies have revolutionized the study of gene regulation in complex tissues. A major computational challenge in analyzing these datasets is to project the large-scale and high-dimensional data into low-dimensional space while retaining the relative relationships between cells. This low dimension embedding is necessary to decompose cellular heterogeneity and reconstruct cell-type-specific gene regulatory programs. Traditional dimensionality reduction techniques, however, face challenges in computational efficiency and in comprehensively addressing cellular diversity across varied molecular modalities. Here we introduce a nonlinear dimensionality reduction algorithm, embodied in the Python package SnapATAC2, which not only achieves a more precise capture of single-cell omics data heterogeneities but also ensures efficient runtime and memory usage, scaling linearly with the number of cells. Our algorithm demonstrates exceptional performance, scalability and versatility across diverse single-cell omics datasets, including single-cell assay for transposase-accessible chromatin using sequencing, single-cell RNA sequencing, single-cell Hi-C and single-cell multi-omics datasets, underscoring its utility in advancing single-cell analysis.
Affiliation(s)
- Kai Zhang
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Westlake Laboratory of Life Sciences and Biomedicine, School of Life Sciences, Westlake University, Hangzhou, China
| | - Nathan R Zemke
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA
| | - Ethan J Armand
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
| | - Bing Ren
- Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, CA, USA.
- Center for Epigenomics, University of California, San Diego School of Medicine, La Jolla, CA, USA.
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA, USA.
|
44
|
Song D, Wang Q, Yan G, Liu T, Sun T, Li JJ. scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nat Biotechnol 2024; 42:247-252. [PMID: 37169966 PMCID: PMC11182337 DOI: 10.1038/s41587-023-01772-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 09/20/2022] [Accepted: 03/30/2023] [Indexed: 05/13/2023]
Abstract
We present a statistical simulator, scDesign3, to generate realistic single-cell and spatial omics data, including various cell states, experimental designs and feature modalities, by learning interpretable parameters from real data. Using a unified probabilistic model for single-cell and spatial omics data, scDesign3 infers biologically meaningful parameters; assesses the goodness-of-fit of inferred cell clusters, trajectories and spatial locations; and generates in silico negative and positive controls for benchmarking computational tools.
Affiliation(s)
- Dongyuan Song
- Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, CA, USA
| | - Qingyang Wang
- Department of Statistics, University of California, Los Angeles, CA, USA
| | - Guanao Yan
- Department of Statistics, University of California, Los Angeles, CA, USA
| | - Tianyang Liu
- Department of Statistics, University of California, Los Angeles, CA, USA
| | - Tianyi Sun
- Department of Statistics, University of California, Los Angeles, CA, USA
| | - Jingyi Jessica Li
- Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, CA, USA.
- Department of Statistics, University of California, Los Angeles, CA, USA.
- Department of Human Genetics, University of California, Los Angeles, CA, USA.
- Department of Computational Medicine, University of California, Los Angeles, CA, USA.
- Department of Biostatistics, University of California, Los Angeles, CA, USA.
- Radcliffe Institute for Advanced Study, Harvard University, Cambridge, MA, USA.
|
45
|
Hao Y, Stuart T, Kowalski MH, Choudhary S, Hoffman P, Hartman A, Srivastava A, Molla G, Madad S, Fernandez-Granda C, Satija R. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol 2024; 42:293-304. [PMID: 37231261 PMCID: PMC10928517 DOI: 10.1038/s41587-023-01767-y] [Citation(s) in RCA: 168] [Impact Index Per Article: 168.0] [Received: 03/08/2022] [Accepted: 03/28/2023] [Indexed: 05/27/2023]
Abstract
Mapping single-cell sequencing profiles to comprehensive reference datasets provides a powerful alternative to unsupervised analysis. However, most reference datasets are constructed from single-cell RNA-sequencing data and cannot be used to annotate datasets that do not measure gene expression. Here we introduce 'bridge integration', a method to integrate single-cell datasets across modalities using a multiomic dataset as a molecular bridge. Each cell in the multiomic dataset constitutes an element in a 'dictionary', which is used to reconstruct unimodal datasets and transform them into a shared space. Our procedure accurately integrates transcriptomic data with independent single-cell measurements of chromatin accessibility, histone modifications, DNA methylation and protein levels. Moreover, we demonstrate how dictionary learning can be combined with sketching techniques to improve computational scalability and harmonize 8.6 million human immune cell profiles from sequencing and mass cytometry experiments. Our approach, implemented in version 5 of our Seurat toolkit ( http://www.satijalab.org/seurat ), broadens the utility of single-cell reference datasets and facilitates comparisons across diverse molecular modalities.
Affiliation(s)
- Yuhan Hao
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Tim Stuart
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Madeline H Kowalski
- New York Genome Center, New York, NY, USA
- Institute for System Genetics, NYU Langone Medical Center, New York, NY, USA
| | - Saket Choudhary
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Paul Hoffman
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
| | - Austin Hartman
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
| | - Avi Srivastava
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Shaista Madad
- Center for Genomics and Systems Biology, New York University, New York, NY, USA
- New York Genome Center, New York, NY, USA
| | - Carlos Fernandez-Granda
- Center for Data Science, New York University, New York, NY, USA
- Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
| | - Rahul Satija
- Center for Genomics and Systems Biology, New York University, New York, NY, USA.
- New York Genome Center, New York, NY, USA.
|
46
|
Bawa G, Liu Z, Yu X, Tran LSP, Sun X. Introducing single cell stereo-sequencing technology to transform the plant transcriptome landscape. TRENDS IN PLANT SCIENCE 2024; 29:249-265. [PMID: 37914553 DOI: 10.1016/j.tplants.2023.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 10/01/2023] [Accepted: 10/02/2023] [Indexed: 11/03/2023]
Abstract
Single cell RNA-sequencing (scRNA-seq) advancements have helped detect transcriptional heterogeneities in biological samples. However, scRNA-seq cannot currently provide high-resolution spatial transcriptome information or identify subcellular organs in biological samples. These limitations have led to the development of spatially enhanced-resolution omics-sequencing (Stereo-seq), which combines spatial information with single cell transcriptomics to address the challenges of scRNA-seq alone. In this review, we discuss the advantages of Stereo-seq technology. We anticipate that the application of such an integrated approach in plant research will advance our understanding of biological processes in the plant transcriptomics era. We conclude with an outlook on how such integration will enhance crop improvement.
Affiliation(s)
- George Bawa
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, PR China
| | - Zhixin Liu
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, PR China
| | - Xiaole Yu
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, PR China
| | - Lam-Son Phan Tran
- Institute of Genomics for Crop Abiotic Stress Tolerance, Department of Plant and Soil Science, Texas Tech University, Lubbock, TX 79409, USA.
| | - Xuwu Sun
- National Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Key Laboratory of Plant Stress Biology, School of Life Sciences, Henan University, 85 Minglun Street, Kaifeng 475001, PR China.
|
47
|
Liu J, Jiang P, Lu Z, Yu Z, Qian P. Decoding leukemia at the single-cell level: clonal architecture, classification, microenvironment, and drug resistance. Exp Hematol Oncol 2024; 13:12. [PMID: 38291542 PMCID: PMC10826069 DOI: 10.1186/s40164-024-00479-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 01/16/2024] [Indexed: 02/01/2024] Open
Abstract
Leukemias are refractory hematological malignancies, characterized by marked intrinsic heterogeneity which poses significant obstacles to effective treatment. However, traditional bulk sequencing techniques have not been able to effectively unravel the heterogeneity among individual tumor cells. The emergence of single-cell sequencing technology has given us unprecedented resolution to comprehend the mechanisms underlying leukemogenesis and drug resistance across various levels, including the genome, epigenome, transcriptome and proteome. Here, we provide an overview of the currently prevalent single-cell sequencing technologies and a detailed summary of single-cell studies conducted on leukemia, with a specific focus on four key aspects: (1) leukemia's clonal architecture, (2) frameworks to determine leukemia subtypes, (3) tumor microenvironment (TME) and (4) the drug-resistant mechanisms of leukemia. This review provides a comprehensive summary of current single-cell studies on leukemia and highlights the markers and mechanisms that show promising clinical implications for the diagnosis and treatment of leukemia.
Affiliation(s)
- Jianche Liu
- Center for Stem Cell and Regenerative Medicine and Bone Marrow Transplantation Center of the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- International Campus, Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, 718 East Haizhou Road, Haining, 314400, China
| | - Penglei Jiang
- Center for Stem Cell and Regenerative Medicine and Bone Marrow Transplantation Center of the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- Institute of Hematology, Zhejiang Engineering Laboratory for Stem Cell and Immunotherapy, Zhejiang University, Hangzhou, 310058, China
| | - Zezhen Lu
- Center for Stem Cell and Regenerative Medicine and Bone Marrow Transplantation Center of the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- International Campus, Zhejiang University-University of Edinburgh Institute (ZJU-UoE Institute), Zhejiang University School of Medicine, Zhejiang University, 718 East Haizhou Road, Haining, 314400, China
| | - Zebin Yu
- Center for Stem Cell and Regenerative Medicine and Bone Marrow Transplantation Center of the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310058, China
- Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- Institute of Hematology, Zhejiang Engineering Laboratory for Stem Cell and Immunotherapy, Zhejiang University, Hangzhou, 310058, China
| | - Pengxu Qian
- Center for Stem Cell and Regenerative Medicine and Bone Marrow Transplantation Center of the First Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, 310058, China.
- Liangzhu Laboratory, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China.
- Institute of Hematology, Zhejiang Engineering Laboratory for Stem Cell and Immunotherapy, Zhejiang University, Hangzhou, 310058, China.
|
48
|
Yan F, Suzuki A, Iwaya C, Pei G, Chen X, Yoshioka H, Yu M, Simon LM, Iwata J, Zhao Z. Single-cell multiomics decodes regulatory programs for mouse secondary palate development. Nat Commun 2024; 15:821. [PMID: 38280850 PMCID: PMC10821874 DOI: 10.1038/s41467-024-45199-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 01/17/2024] [Indexed: 01/29/2024] Open
Abstract
Perturbations in gene regulation during palatogenesis can lead to cleft palate, which is among the most common congenital birth defects. Here, we perform single-cell multiome sequencing and profile chromatin accessibility and gene expression simultaneously within the same cells (n = 36,154) isolated from mouse secondary palate across embryonic days (E) 12.5, E13.5, E14.0, and E14.5. We construct five trajectories representing continuous differentiation of cranial neural crest-derived multipotent cells into distinct lineages. By linking open chromatin signals to gene expression changes, we characterize the underlying lineage-determining transcription factors. In silico perturbation analysis identifies transcription factors SHOX2 and MEOX2 as important regulators of the development of the anterior and posterior palate, respectively. In conclusion, our study charts epigenetic and transcriptional dynamics in palatogenesis, serving as a valuable resource for further cleft palate research.
Affiliation(s)
- Fangfang Yan
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Akiko Suzuki
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Department of Oral and Craniofacial Sciences, School of Dentistry, University of Missouri - Kansas City, Kansas City, Missouri, 64108, USA
| | - Chihiro Iwaya
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
| | - Guangsheng Pei
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Xian Chen
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Hiroki Yoshioka
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
| | - Meifang Yu
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
| | - Lukas M Simon
- Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX, 77030, USA.
| | - Junichi Iwata
- Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA.
- Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA.
- MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, 77030, USA.
| | - Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA.
|
49
|
Zhang Z, Ruf-Zamojski F, Zamojski M, Bernard D, Chen X, Troyanskaya O, Sealfon S. Peak-agnostic high-resolution cis-regulatory circuitry mapping using single cell multiome data. Nucleic Acids Res 2024; 52:572-582. [PMID: 38084892 PMCID: PMC10810203 DOI: 10.1093/nar/gkad1166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 11/15/2023] [Accepted: 11/27/2023] [Indexed: 01/26/2024] Open
Abstract
Single same cell RNAseq/ATACseq multiome data provide unparalleled potential to develop high resolution maps of the cell-type specific transcriptional regulatory circuitry underlying gene expression. We present CREMA, a framework that recovers the full cis-regulatory circuitry by modeling gene expression and chromatin activity in individual cells without peak-calling or cell type labeling constraints. We demonstrate that CREMA overcomes the limitations of existing methods that fail to identify about half of functional regulatory elements which are outside the called chromatin 'peaks'. These circuit sites outside called peaks are shown to be important cell type specific functional regulatory loci, sufficient to distinguish individual cell types. Analysis of mouse pituitary data identifies a Gata2-circuit for the gonadotrope-enriched disease-associated Pcsk1 gene, which is experimentally validated by reduced gonadotrope expression in a gonadotrope conditional Gata2-knockout model. We present a web accessible human immune cell regulatory circuit resource, and provide CREMA as an R package.
Affiliation(s)
- Zidong Zhang
- Department of Neurology, Center for Advanced Research on Diagnostic Assays, Icahn School of Medicine at Mount Sinai (ISMMS), New York, NY, USA
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
| | - Frederique Ruf-Zamojski
- Department of Neurology, Center for Advanced Research on Diagnostic Assays, Icahn School of Medicine at Mount Sinai (ISMMS), New York, NY, USA
| | - Michel Zamojski
- Department of Neurology, Center for Advanced Research on Diagnostic Assays, Icahn School of Medicine at Mount Sinai (ISMMS), New York, NY, USA
| | - Daniel J Bernard
- Department of Pharmacology and Therapeutics, McGill University, Montreal, QC H3G 1Y6, Canada
| | - Xi Chen
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
| | - Olga G Troyanskaya
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Department of Computer Science, Princeton University, Princeton, NJ, USA
| | - Stuart C Sealfon
- Department of Neurology, Center for Advanced Research on Diagnostic Assays, Icahn School of Medicine at Mount Sinai (ISMMS), New York, NY, USA
|
50
|
Wang L, Nie R, Miao X, Cai Y, Wang A, Zhang H, Zhang J, Cai J. InClust+: the deep generative framework with mask modules for multimodal data integration, imputation, and cross-modal generation. BMC Bioinformatics 2024; 25:41. [PMID: 38267858 PMCID: PMC10809631 DOI: 10.1186/s12859-024-05656-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 01/15/2024] [Indexed: 01/26/2024] Open
Abstract
BACKGROUND: With the development of single-cell technology, many cell traits can be measured. Furthermore, multi-omics profiling technology can jointly measure two or more traits in a single cell simultaneously. To process the rapidly accumulating data, computational methods for multimodal data integration are needed. RESULTS: Here, we present inClust+, a deep generative framework for multi-omics. It builds on the previous inClust, which is specific to transcriptome data, and is augmented with two mask modules designed for multimodal data processing: an input-mask module in front of the encoder and an output-mask module behind the decoder. InClust+ was first used to integrate scRNA-seq and MERFISH data from similar cell populations, and to impute MERFISH data based on scRNA-seq data. Then, inClust+ was shown to have the capability to integrate multimodal data (e.g. tri-modal data with gene expression, chromatin accessibility and protein abundance) with batch effects. Finally, inClust+ was used to integrate an unlabeled monomodal scRNA-seq dataset and two labeled multimodal CITE-seq datasets, transfer labels from the CITE-seq datasets to the scRNA-seq dataset, and generate the missing modality of protein abundance in the monomodal scRNA-seq data. In the above examples, the performance of inClust+ is better than or comparable to the most recent tools in the corresponding task. CONCLUSIONS: inClust+ is a suitable framework for handling multimodal data. Meanwhile, the successful implementation of masks in inClust+ means that they can be applied to other deep learning methods with a similar encoder-decoder architecture to broaden the application scope of these models.
Affiliation(s)
- Lifei Wang
- Shulan (Hangzhou) Hospital, Affiliated to Zhejiang Shuren University Shulan International Medical College, Hangzhou, China.
| | - Rui Nie
- China National Center for Bioinformation, Beijing, China
- Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Xuexia Miao
- China National Center for Bioinformation, Beijing, China
- Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yankai Cai
- School of Economic and Management, China University of Geoscience, Wuhan, China
| | - Anqi Wang
- Shulan (Hangzhou) Hospital, Affiliated to Zhejiang Shuren University Shulan International Medical College, Hangzhou, China
| | - Hanwen Zhang
- Shulan (Hangzhou) Hospital, Affiliated to Zhejiang Shuren University Shulan International Medical College, Hangzhou, China
| | - Jiang Zhang
- School of Systems Science, Beijing Normal University, Beijing, 100875, China.
| | - Jun Cai
- China National Center for Bioinformation, Beijing, China.
- Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China.
- University of Chinese Academy of Sciences, Beijing, 100049, China.
|