1
Shen X, Zuo L, Ye Z, Yuan Z, Huang K, Li Z, Yu Q, Zou X, Wei X, Xu P, Deng Y, Jin X, Xu X, Wu L, Zhu H, Qin P. Inferring cell trajectories of spatial transcriptomics via optimal transport analysis. Cell Syst 2025; 16:101194. [PMID: 39904341 DOI: 10.1016/j.cels.2025.101194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 09/30/2024] [Accepted: 01/10/2025] [Indexed: 02/06/2025]
Abstract
The integration of cell transcriptomics and spatial position to organize differentiation trajectories remains a challenge. Here, we introduce SpaTrack, which leverages optimal transport to reconcile both gene expression and spatial position from spatial transcriptomics into the transition costs, thereby reconstructing cell differentiation. SpaTrack can construct detailed spatial trajectories that reflect the differentiation topology and trace cell dynamics across multiple samples over temporal intervals. To capture the dynamic drivers of differentiation, SpaTrack models cell fate as a function of expression profiles influenced by transcription factors over time. By applying SpaTrack, we successfully disentangle spatiotemporal trajectories of axolotl telencephalon regeneration and mouse midbrain development. Diverse malignant lineages expanding within a primary tumor are uncovered. One lineage, characterized by upregulated epithelial mesenchymal transition, implants at the metastatic site and subsequently colonizes to form a secondary tumor. Overall, SpaTrack efficiently advances trajectory inference from spatial transcriptomics, providing valuable insights into differentiation processes.
Affiliation(s)
- Xunan Shen
- BGI Research, Chongqing 401329, China; BGI Research, Beijing 102601, China
- Zhongyang Yuan
- BGI Research, Chongqing 401329, China; State Key Laboratory of Medicinal Chemical Biology, College of Life Sciences, Nankai University, Tianjin 300071, China
- Ke Huang
- BGI Research, Chongqing 401329, China
- Zeyu Li
- BGI Research, Chongqing 401329, China
- Qichao Yu
- BGI Research, Chongqing 401329, China
- Xuanxuan Zou
- BGI Research, Chongqing 401329, China; Department of Neurology, Hubei Provincial Clinical Research Center for Parkinson's Disease, Xiangyang No. 1 People's Hospital, Hubei University of Medicine, Xiangyang 441000, China
- Ping Xu
- BGI Research, Chongqing 401329, China; BGI College & Henan Institute of Medical and Pharmaceutical Sciences, Zhengzhou University, Zhengzhou 450000, China
- Yaqi Deng
- Key Laboratory of Major Brain Disease and Aging Research (Ministry of Education), Institute for Brain Science and Disease, Chongqing Medical University, Chongqing, China
- Xin Jin
- BGI Research, Shenzhen 518083, China
- Xun Xu
- BGI Research, Shenzhen 518083, China.
- Liang Wu
- BGI Research, Chongqing 401329, China; BGI Research, Shenzhen 518083, China.
- Pengfei Qin
- BGI Research, Chongqing 401329, China; BGI Research, Shenzhen 518083, China.
2
Yamada T, Trentesaux C, Brunger JM, Xiao Y, Stevens AJ, Martyn I, Kasparek P, Shroff NP, Aguilar A, Bruneau BG, Boffelli D, Klein OD, Lim WA. Synthetic organizer cells guide development via spatial and biochemical instructions. Cell 2025; 188:778-795.e18. [PMID: 39706189 DOI: 10.1016/j.cell.2024.11.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 07/10/2024] [Accepted: 11/08/2024] [Indexed: 12/23/2024]
Abstract
In vitro development relies primarily on treating progenitor cells with media-borne morphogens and thus lacks native-like spatial information. Here, we engineer morphogen-secreting organizer cells programmed to self-assemble, via cell adhesion, around mouse embryonic stem (ES) cells in defined architectures. By inducing the morphogen WNT3A and its antagonist DKK1 from organizer cells, we generated diverse morphogen gradients, varying in range and steepness. These gradients were strongly correlated with morphogenetic outcomes: the range of minimum-maximum WNT activity determined the resulting range of anterior-to-posterior (A-P) axis cell lineages. Strikingly, shallow WNT activity gradients, despite showing truncated A-P lineages, yielded higher-resolution tissue morphologies, such as a beating, chambered cardiac-like structure associated with an endothelial network. Thus, synthetic organizer cells, which integrate spatial, temporal, and biochemical information, provide a powerful way to systematically and flexibly direct the development of ES or other progenitor cells in different directions within the morphogenetic landscape.
Affiliation(s)
- Toshimichi Yamada
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Coralie Trentesaux
- Department of Orofacial Sciences and Program in Craniofacial Biology, University of California, San Francisco, San Francisco, CA 94143, USA
- Jonathan M Brunger
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Yini Xiao
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Adam J Stevens
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Iain Martyn
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Petr Kasparek
- Department of Orofacial Sciences and Program in Craniofacial Biology, University of California, San Francisco, San Francisco, CA 94143, USA
- Neha P Shroff
- Department of Orofacial Sciences and Program in Craniofacial Biology, University of California, San Francisco, San Francisco, CA 94143, USA
- Angelica Aguilar
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA
- Benoit G Bruneau
- Gladstone Institutes, San Francisco, CA 94158, USA; Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94143, USA; Cardiovascular Research Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Dario Boffelli
- Department of Pediatrics, Cedars-Sinai Guerin Children's, Los Angeles, CA 90048, USA
- Ophir D Klein
- Department of Orofacial Sciences and Program in Craniofacial Biology, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Pediatrics, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Pediatrics, Cedars-Sinai Guerin Children's, Los Angeles, CA 90048, USA.
- Wendell A Lim
- Cell Design Institute and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, San Francisco, CA 94158, USA.
3
Wen Y, He H, Ma Y, Bao D, Cai LC, Wang H, Li Y, Zhao B, Cai Z. Computing hematopoiesis plasticity in response to genetic mutations and environmental stimulations. Life Sci Alliance 2025; 8:e202402971. [PMID: 39537342 PMCID: PMC11561260 DOI: 10.26508/lsa.202402971] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 11/06/2024] [Accepted: 11/06/2024] [Indexed: 11/16/2024] [Open Access]
Abstract
Cell plasticity (CP), describing a dynamic cell state, plays a crucial role in maintaining homeostasis during organ morphogenesis, regeneration, and trauma-to-repair biological processes. Single-cell omics datasets provide an unprecedented resource to empower CP analysis. Hematopoiesis offers fertile opportunities to develop quantitative methods for understanding CP. In this study, we generated high-quality lineage-negative single-cell RNA-sequencing datasets under various conditions and introduced a working pipeline named scPlasticity to interrogate naïve and disturbed plasticity of hematopoietic stem and progenitor cells with mutational or environmental challenges. Using the embedding methods UMAP or FA, a continuum of hematopoietic development is visually observed in wild type, where the pipeline confirms a low proportion of hybrid cells (P_hc, with bias range: 0.4∼0.6) on a transition trajectory. Upon Tet2 mutation, a driver of leukemia, or treatment with DSS, an inducer of colitis, P_hc is increased and the plasticity of hematopoietic stem and progenitor cells is enhanced. We prioritized several transcription factors and signaling pathways that are responsible for P_hc alterations. In silico perturbation suggests that knocking out EGR regulons or the pathways of IL-1R1 and β-adrenoreceptor partially reverses the increase in P_hc promoted by Tet2 mutation and inflammation.
Affiliation(s)
- Yuchen Wen
- National Key Laboratory of Experimental Hematology, Tianjin, China
- Tianjin Key Laboratory of Inflammatory Biology, Department of Pharmacology, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- Hang He
- National Key Laboratory of Experimental Hematology, Tianjin, China
- Tianjin Key Laboratory of Inflammatory Biology, Department of Pharmacology, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- Yunxi Ma
- National Key Laboratory of Experimental Hematology, Tianjin, China
- Tianjin Key Laboratory of Inflammatory Biology, Department of Pharmacology, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- Dengyi Bao
- National Key Laboratory of Experimental Hematology, Tianjin, China
- Tianjin Key Laboratory of Inflammatory Biology, Department of Pharmacology, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- Lorie Chen Cai
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- Huaquan Wang
- Department of Hematology, Tianjin Medical University Tianjin General Hospital, Tianjin, China
- Yanmei Li
- Department of Rheumatology and Immunology, Tianjin Medical University Tianjin General Hospital, Tianjin, China
- Baobing Zhao
- Department of Pharmacology, School of Pharmaceutical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, China
- Zhigang Cai
- National Key Laboratory of Experimental Hematology, Tianjin, China
- Tianjin Key Laboratory of Inflammatory Biology, Department of Pharmacology, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Science, Tianjin Medical University, Tianjin, China
- Department of Hematology, Tianjin Medical University Tianjin General Hospital, Tianjin, China
- Department of Rheumatology and Immunology, Tianjin Medical University Tianjin General Hospital, Tianjin, China
4
Gong J, Lee C, Kim H, Kim J, Jeon J, Park S, Cho K. Control of Cellular Differentiation Trajectories for Cancer Reversion. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2025; 12:e2402132. [PMID: 39661721 PMCID: PMC11744559 DOI: 10.1002/advs.202402132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 11/08/2024] [Indexed: 12/13/2024]
Abstract
Cellular differentiation is controlled by intricate layers of gene regulation, involving the modulation of gene expression by various transcriptional regulators. Due to the complexity of gene regulation, identifying master regulators across the differentiation trajectory has been a longstanding challenge. To tackle this problem, a computational framework, single-cell Boolean network inference and control (BENEIN), is presented. Applying BENEIN to human large intestinal single-cell transcriptome data, MYB, HDAC2, and FOXA2 are identified as the master regulators whose inhibition induces enterocyte differentiation. It is found that simultaneous knockdown of these master regulators can revert colorectal cancer cells into normal-like enterocytes by synergistically inducing differentiation and suppressing malignancy, which is validated by in vitro and in vivo experiments.
Affiliation(s)
- Jeong‐Ryeol Gong
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Chun‐Kyung Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Hoon‐Min Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Juhee Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Jaeog Jeon
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Sunmin Park
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Kwang‐Hyun Cho
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
5
Defard T, Desrentes A, Fouillade C, Mueller F. Homebuilt Imaging-Based Spatial Transcriptomics: Tertiary Lymphoid Structures as a Case Example. Methods Mol Biol 2025; 2864:77-105. [PMID: 39527218 DOI: 10.1007/978-1-0716-4184-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
Spatial transcriptomics methods provide insight into the cellular heterogeneity and spatial architecture of complex, multicellular systems. Combining molecular and spatial information provides important clues to study tissue architecture in development and disease. Here, we present a comprehensive do-it-yourself (DIY) guide to perform such experiments at reduced costs leveraging open-source approaches. This guide spans the entire life cycle of a project, from its initial definition to experimental choices, wet lab approaches, instrumentation, and analysis. As a concrete example, we focus on tertiary lymphoid structures (TLS), which we use to develop typical questions that can be addressed by these approaches.
Affiliation(s)
- Thomas Defard
- Institut Pasteur, Université Paris Cité, Photonic Bio-Imaging, Centre de Ressources et Recherches Technologiques (UTechS-PBI, C2RT), Paris, France
- Institut Pasteur, Université Paris Cité, Imaging and Modeling Unit, Paris, France
- Centre for Computational Biology (CBIO), Mines Paris, PSL University, Paris, France
- Institut Curie, PSL University, Paris, France
- INSERM, U900, Paris, France
- Auxence Desrentes
- UMRS1135 Sorbonne University, Paris, France
- INSERM U1135, Paris, France
- Team "Immune Microenvironment and Immunotherapy", Centre for Immunology and Microbial Infections (CIMI), Paris, France
- Charles Fouillade
- Institut Curie, Inserm U1021-CNRS UMR 3347, University Paris-Saclay, PSL Research University, Centre Universitaire, Orsay, France
- Florian Mueller
- Institut Pasteur, Université Paris Cité, Photonic Bio-Imaging, Centre de Ressources et Recherches Technologiques (UTechS-PBI, C2RT), Paris, France.
- Institut Pasteur, Université Paris Cité, Imaging and Modeling Unit, Paris, France.
6
Liao X, Kang L, Peng Y, Chai X, Xie P, Lin C, Ji H, Jiao Y, Liu J. Multivariate stochastic modeling for transcriptional dynamics with cell-specific latent time using SDEvelo. Nat Commun 2024; 15:10849. [PMID: 39738101 DOI: 10.1038/s41467-024-55146-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Accepted: 11/28/2024] [Indexed: 01/01/2025] [Open Access]
Abstract
Recently, RNA velocity has driven a paradigmatic change in single-cell RNA sequencing (scRNA-seq) studies, allowing the reconstruction and prediction of directed trajectories in cell differentiation and state transitions. Most existing methods of dynamic modeling use ordinary differential equations (ODE) for individual genes without applying multivariate approaches. However, this modeling strategy inadequately captures the intrinsically stochastic nature of transcriptional dynamics governed by a cell-specific latent time across multiple genes, potentially leading to erroneous results. Here, we present SDEvelo, a generative approach to inferring RNA velocity by modeling the dynamics of unspliced and spliced RNAs via multivariate stochastic differential equations (SDE). Uniquely, SDEvelo explicitly models inherent uncertainty in transcriptional dynamics while estimating a cell-specific latent time across genes. Using both simulated and four scRNA-seq and spatial transcriptomics datasets, we show that SDEvelo can model the random dynamic patterns of mature-state cells while accurately detecting carcinogenesis. Additionally, the estimated gene-shared latent time can facilitate many downstream analyses for biological discovery. We demonstrate that SDEvelo is computationally scalable and applicable to both scRNA-seq and sequencing-based spatial transcriptomics data.
Affiliation(s)
- Xu Liao
- School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen, China
- Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Lican Kang
- Institute for Math and AI, Wuhan University, Wuhan, China
- School of Mathematics and Statistics, Wuhan University, Wuhan, China
- Yihao Peng
- School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen, China
- Xiaoran Chai
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
- Peng Xie
- School of Biological Science & Medical Engineering, Southeast University, Nanjing, China
- Chengqi Lin
- Key Laboratory of Developmental Genes and Human Disease, School of Life Science and Technology, Southeast University, Nanjing, China
- Hongkai Ji
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Yuling Jiao
- School of Artificial Intelligence, Wuhan University, Wuhan, China.
- Hubei Key Laboratory of Computational Science, Wuhan University, Wuhan, China.
- Jin Liu
- School of Data Science, The Chinese University of Hong Kong-Shenzhen, Shenzhen, China.
7
Manchel A, Gee M, Vadigepalli R. From sampling to simulating: Single-cell multiomics in systems pathophysiological modeling. iScience 2024; 27:111322. [PMID: 39628578 PMCID: PMC11612781 DOI: 10.1016/j.isci.2024.111322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/06/2024] [Open Access]
Abstract
As single-cell omics data sampling and acquisition methods have accumulated at an unprecedented rate, various data analysis pipelines have been developed for the inference of cell types, cell states and their distribution, state transitions, state trajectories, and state interactions. This presents a new opportunity in which single-cell omics data can be utilized to generate high-resolution, high-fidelity computational models. In this review, we discuss how single-cell omics data can be used to build computational models to simulate biological systems at various scales. We propose that single-cell data can be integrated with physiological information to generate organ-specific models, which can then be assembled to generate multi-organ systems pathophysiological models. Finally, we discuss how generic multi-organ models can be brought to the patient-specific level thus permitting their use in the clinical setting.
Affiliation(s)
- Alexandra Manchel
- Daniel Baugh Institute of Functional Genomics/Computational Biology, Department of Pathology and Genomic Medicine, Thomas Jefferson University, Philadelphia, PA, USA
- Michelle Gee
- Daniel Baugh Institute of Functional Genomics/Computational Biology, Department of Pathology and Genomic Medicine, Thomas Jefferson University, Philadelphia, PA, USA
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE, USA
- Rajanikanth Vadigepalli
- Daniel Baugh Institute of Functional Genomics/Computational Biology, Department of Pathology and Genomic Medicine, Thomas Jefferson University, Philadelphia, PA, USA
8
Wang J, Ye F, Chai H, Jiang Y, Wang T, Ran X, Xia Q, Xu Z, Fu Y, Zhang G, Wu H, Guo G, Guo H, Ruan Y, Wang Y, Xing D, Xu X, Zhang Z. Advances and applications in single-cell and spatial genomics. Science China. Life Sciences 2024:10.1007/s11427-024-2770-x. [PMID: 39792333 DOI: 10.1007/s11427-024-2770-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Accepted: 10/10/2024] [Indexed: 01/12/2025]
Abstract
The applications of single-cell and spatial technologies in recent times have revolutionized the present understanding of cellular states and the cellular heterogeneity inherent in complex biological systems. These advancements offer unprecedented resolution in the examination of the functional genomics of individual cells and their spatial context within tissues. In this review, we have comprehensively discussed the historical development and recent progress in the field of single-cell and spatial genomics. We have reviewed the breakthroughs in single-cell multi-omics technologies, spatial genomics methods, and the computational strategies employed toward the analyses of single-cell atlas data. Furthermore, we have highlighted the advances made in constructing cellular atlases and their clinical applications, particularly in the context of disease. Finally, we have discussed the emerging trends, challenges, and opportunities in this rapidly evolving field.
Affiliation(s)
- Jingjing Wang
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Fang Ye
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Haoxi Chai
- Life Sciences Institute and The Second Affiliated Hospital, Zhejiang University, Hangzhou, 310058, China
- Yujia Jiang
- BGI Research, Shenzhen, 518083, China
- BGI Research, Hangzhou, 310030, China
- Teng Wang
- Biomedical Pioneering Innovation Center (BIOPIC) and School of Life Sciences, Peking University, Beijing, 100871, China
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Xia Ran
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Institute of Hematology, Zhejiang University, Hangzhou, 310000, China
- Qimin Xia
- Biomedical Pioneering Innovation Center (BIOPIC) and School of Life Sciences, Peking University, Beijing, 100871, China
- Ziye Xu
- Department of Laboratory Medicine of The First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Yuting Fu
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Guodong Zhang
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Hanyu Wu
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, 310058, China
- Guoji Guo
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China.
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, 310058, China.
- Zhejiang Provincial Key Lab for Tissue Engineering and Regenerative Medicine, Dr. Li Dak Sum & Yip Yio Chin Center for Stem Cell and Regenerative Medicine, Hangzhou, 310058, China.
- Institute of Hematology, Zhejiang University, Hangzhou, 310000, China.
- Hongshan Guo
- Bone Marrow Transplantation Center of the First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China.
- Institute of Hematology, Zhejiang University, Hangzhou, 310000, China.
- Yijun Ruan
- Life Sciences Institute and The Second Affiliated Hospital, Zhejiang University, Hangzhou, 310058, China.
- Yongcheng Wang
- Department of Laboratory Medicine of The First Affiliated Hospital & Liangzhu Laboratory, Zhejiang University School of Medicine, Hangzhou, 310058, China.
- Dong Xing
- Biomedical Pioneering Innovation Center (BIOPIC) and School of Life Sciences, Peking University, Beijing, 100871, China.
- Beijing Advanced Innovation Center for Genomics (ICG), Peking University, Beijing, 100871, China.
- Xun Xu
- BGI Research, Shenzhen, 518083, China.
- BGI Research, Hangzhou, 310030, China.
- Guangdong Provincial Key Laboratory of Genome Read and Write, BGI Research, Shenzhen, 518083, China.
- Zemin Zhang
- Biomedical Pioneering Innovation Center (BIOPIC) and School of Life Sciences, Peking University, Beijing, 100871, China.
9
Huang Z, Guo X, Qin J, Gao L, Ju F, Zhao C, Yu L. Accurate RNA velocity estimation based on multibatch network reveals complex lineage in batch scRNA-seq data. BMC Biol 2024; 22:290. [PMID: 39696422 PMCID: PMC11657662 DOI: 10.1186/s12915-024-02085-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Accepted: 12/02/2024] [Indexed: 12/20/2024] [Open Access]
Abstract
RNA velocity, as an extension of trajectory inference, is an effective method for understanding cell development using single-cell RNA sequencing (scRNA-seq) experiments. However, existing RNA velocity methods are limited by the batch effect because they cannot directly correct for batch effects in the input data, which comprises spliced and unspliced matrices in a proportional relationship. This limitation can lead to an incorrect velocity stream. This paper introduces VeloVGI, which addresses this issue innovatively in two key ways. Firstly, it employs an optimal transport (OT) and mutual nearest neighbor (MNN) approach to construct neighbors in batch data. This strategy overcomes the limitations of existing methods that are affected by the batch effect. Secondly, VeloVGI improves upon VeloVI's velocity estimation by incorporating the graph structure into the encoder for more effective feature extraction. The effectiveness of VeloVGI is demonstrated in various scenarios, including the mouse spinal cord and olfactory bulb tissue, as well as on several public datasets. The results show that VeloVGI outperformed other methods in terms of metric performance.
Affiliation(s)
- Zhaoyang Huang
- School of Computer Science and Technology, Xidian University, Xi'an 710071, Shaanxi, China
- Xinyang Guo
- School of Computer Science and Technology, Xidian University, Xi'an 710071, Shaanxi, China
- Jie Qin
- Orthopedic Department, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an 710071, Shaanxi, China
- Fen Ju
- Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an, China
- Chenguang Zhao
- Department of Rehabilitation Medicine, Xijing Hospital, Fourth Military Medical University, Xi'an, China.
- Liang Yu
- School of Computer Science and Technology, Xidian University, Xi'an 710071, Shaanxi, China.
10
Lederer AR, Leonardi M, Talamanca L, Bobrovskiy DM, Herrera A, Droin C, Khven I, Carvalho HJF, Valente A, Dominguez Mantes A, Mulet Arabí P, Pinello L, Naef F, La Manno G. Statistical inference with a manifold-constrained RNA velocity model uncovers cell cycle speed modulations. Nat Methods 2024; 21:2271-2286. [PMID: 39482463 PMCID: PMC11621032 DOI: 10.1038/s41592-024-02471-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2023] [Accepted: 09/15/2024] [Indexed: 11/03/2024]
Abstract
Across biological systems, cells undergo coordinated changes in gene expression, resulting in transcriptome dynamics that unfold within a low-dimensional manifold. While low-dimensional dynamics can be extracted using RNA velocity, these algorithms can be fragile and rely on heuristics lacking statistical control. Moreover, the estimated vector field is not dynamically consistent with the traversed gene expression manifold. To address these challenges, we introduce a Bayesian model of RNA velocity that couples velocity field and manifold estimation in a reformulated, unified framework, identifying the parameters of an explicit dynamical system. Focusing on the cell cycle, we implement VeloCycle to study gene regulation dynamics on one-dimensional periodic manifolds and validate its ability to infer cell cycle periods using live imaging. We also apply VeloCycle to reveal speed differences in regionally defined progenitors and Perturb-seq gene knockdowns. Overall, VeloCycle expands the single-cell RNA sequencing analysis toolkit with a modular and statistically consistent RNA velocity inference framework.
Affiliation(s)
- Alex R Lederer
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Maxine Leonardi
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Lorenzo Talamanca
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Daniil M Bobrovskiy
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Antonio Herrera
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Colas Droin
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Irina Khven
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Hugo J F Carvalho
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Alessandro Valente
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Albert Dominguez Mantes
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Laboratory of Bioimage Analysis and Computational Microscopy, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Pau Mulet Arabí
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | - Luca Pinello
- Molecular Pathology Unit, Massachusetts General Research Institute, Charlestown, MA, USA
- Massachusetts General Hospital Cancer Center, Harvard Medical School, Charlestown, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Felix Naef
- Laboratory of Computational and Systems Biology, Institute of Bioengineering, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
| | - Gioele La Manno
- Laboratory of Brain Development and Biological Data Science, Brain Mind Institute, Faculty of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
| |
Collapse
|
11
|
Xu X, Wen Q, Lan T, Zeng L, Zeng Y, Lin S, Qiu M, Na X, Yang C. Time-resolved single-cell transcriptomic sequencing. Chem Sci 2024; 15:19225-19246. [PMID: 39568874 PMCID: PMC11575584 DOI: 10.1039/d4sc05700g] [Received: 08/25/2024] [Accepted: 10/19/2024] [Indexed: 11/22/2024]
Abstract
Cells experience continuous transformation under both physiological and pathological circumstances. Single-cell RNA sequencing (scRNA-seq) is competent in disclosing the disparities of cells; nevertheless, it poses challenges in linking the individual cell state at distinct time points. Although computational approaches based on scRNA-seq data have been put forward for trajectory analysis, the result is based on assumptions and fails to reflect the actual states. Consequently, it is necessary to incorporate a "time anchor" into the scRNA-seq library for the temporal documentation of the dynamic expression pattern. This review comprehensively overviews the time-resolved single-cell transcriptomic sequencing methodologies and applications. As scRNA-seq functions as the basis for profiling single-cell expression patterns, the review initially introduces various scRNA-seq approaches. Subsequently, the review focuses on the different experimental strategies for introducing a "time anchor" to scRNA-seq, highlighting their principles, strengths, weaknesses, and comparing their adaptation in various scenarios. Next, it provides a brief summary of applications in immunity response, cancer progression, and embryo development. Finally, the review concludes with a forward-looking perspective on future advancements in time-resolved single-cell transcriptomic sequencing.
Affiliation(s)
- Xing Xu
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Department of Laboratory Medicine, Key Laboratory of Clinical Laboratory Technology for Precision Medicine, School of Medical Technology and Engineering, Fujian Medical University, Fuzhou 350122, China
- Qianxi Wen
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Tianchen Lan
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Liuqing Zeng
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Yonghao Zeng
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Shiyan Lin
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Minghao Qiu
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Xing Na
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Chaoyong Yang
  - The MOE Key Laboratory of Spectrochemical Analysis & Instrumentation, The Key Laboratory of Chemical Biology of Fujian Province, State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Department of Chemical Biology, Department of Chemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
  - Institute of Molecular Medicine, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200127, China
12
Zhang J, Chakravarthy M, Singh R. scMultiNODE: Integrative Model for Multi-Modal Temporal Single-Cell Data. bioRxiv 2024:2024.10.27.620531. [PMID: 39554192 PMCID: PMC11565911 DOI: 10.1101/2024.10.27.620531] [Indexed: 11/19/2024]
Abstract
Measuring single-cell genomic profiles at different timepoints enables our understanding of cell development. This understanding is more comprehensive when we perform an integrative analysis of multiple measurements (or modalities) across various developmental stages. However, obtaining such measurements from the same set of single cells is resource-intensive, restricting our ability to study them integratively. We propose an unsupervised integration model, scMultiNODE, that integrates gene expression and chromatin accessibility measurements in developing single cells while preserving cell type variations and cellular dynamics. scMultiNODE uses autoencoders to learn nonlinear low-dimensional cell representation and optimal transport to align cells across different measurements. Next, it utilizes neural ordinary differential equations to explicitly model cell development with a regularization term to learn a dynamic latent space. Our experiments on four real-world developmental single-cell datasets show that scMultiNODE can integrate temporally profiled multi-modal single-cell measurements better than existing methods that focus on cell type variations and tend to ignore cellular dynamics. We also show that scMultiNODE's joint latent space helps with the downstream analysis of single-cell development.
Affiliation(s)
- Jiaqi Zhang
  - Department of Computer Science, Brown University
- Ritambhara Singh
  - Department of Computer Science, Brown University
  - Center for Computational Molecular Biology, Brown University
13
Zhang J, Larschan E, Bigness J, Singh R. scNODE: generative model for temporal single cell transcriptomic data prediction. Bioinformatics 2024; 40:ii146-ii154. [PMID: 39230694 PMCID: PMC11373355 DOI: 10.1093/bioinformatics/btae393] [Indexed: 09/05/2024]
Abstract
SUMMARY: Measurement of single-cell gene expression at different timepoints enables the study of cell development. However, due to the resource constraints and technical challenges associated with the single-cell experiments, researchers can only profile gene expression at discrete and sparsely sampled timepoints. This missing timepoint information impedes downstream cell developmental analyses. We propose scNODE, an end-to-end deep learning model that can predict in silico single-cell gene expression at unobserved timepoints. scNODE integrates a variational autoencoder with neural ordinary differential equations to predict gene expression using a continuous and nonlinear latent space. Importantly, we incorporate a dynamic regularization term to learn a latent space that is robust against distribution shifts when predicting single-cell gene expression at unobserved timepoints. Our evaluations on three real-world scRNA-seq datasets show that scNODE achieves higher predictive performance than state-of-the-art methods. We further demonstrate that scNODE's predictions help cell trajectory inference under the missing timepoint paradigm and the learned latent space is useful for in silico perturbation analysis of relevant genes along a developmental cell path.
AVAILABILITY AND IMPLEMENTATION: The data and code are publicly available at https://github.com/rsinghlab/scNODE.
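Illustrative sketch (not the scNODE implementation): the abstract describes encoding cells into a latent space, evolving them with an ordinary differential equation, and decoding at an unobserved timepoint. The toy below mimics that flow with NumPy only. The linear `encoder`/`decoder` matrices, the untrained random MLP vector field, the explicit Euler integrator, and all sizes are assumptions made for illustration; a real model would learn these components and would typically use an adaptive ODE solver.

```python
import numpy as np


def make_mlp_field(latent_dim, hidden_dim, rng):
    """Build f(z) = tanh(z W1^T + b1) W2^T + b2 with random, untrained weights."""
    W1 = rng.normal(scale=0.1, size=(hidden_dim, latent_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(scale=0.1, size=(latent_dim, hidden_dim))
    b2 = np.zeros(latent_dim)

    def field(z):
        # z has shape (n_cells, latent_dim); the output has the same shape.
        return np.tanh(z @ W1.T + b1) @ W2.T + b2

    return field


def integrate_euler(field, z0, t0, t1, n_steps=100):
    """Integrate dz/dt = field(z) from t0 to t1 with fixed-step explicit Euler."""
    dt = (t1 - t0) / n_steps
    z = z0.copy()
    for _ in range(n_steps):
        z = z + dt * field(z)
    return z


rng = np.random.default_rng(0)
n_cells, n_genes, latent_dim = 5, 20, 3

# Hypothetical stand-ins for a trained encoder and decoder (plain linear maps).
encoder = rng.normal(scale=0.1, size=(n_genes, latent_dim))
decoder = rng.normal(scale=0.1, size=(latent_dim, n_genes))

# Synthetic counts observed at time t = 0.
x_t0 = rng.poisson(2.0, size=(n_cells, n_genes)).astype(float)

z_t0 = np.log1p(x_t0) @ encoder                 # encode to latent space
field = make_mlp_field(latent_dim, 16, rng)     # latent dynamics
z_t1 = integrate_euler(field, z_t0, 0.0, 1.5)   # evolve to an unobserved time
x_t1_pred = z_t1 @ decoder                      # decode to expression space
```

Because the latent state is a continuous function of time, the same integrator can be queried at any intermediate timepoint rather than only at the sampled ones.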
Affiliation(s)
- Jiaqi Zhang
  - Department of Computer Science, Brown University, Providence, RI 02906, United States
- Erica Larschan
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, United States
  - Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI 02912, United States
- Jeremy Bigness
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, United States
- Ritambhara Singh
  - Department of Computer Science, Brown University, Providence, RI 02906, United States
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912, United States
14
Hossain I, Fanfani V, Fischer J, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. Genome Biol 2024; 25:127. [PMID: 38773638 PMCID: PMC11106922 DOI: 10.1186/s13059-024-03264-0] [Received: 05/05/2023] [Accepted: 04/30/2024] [Indexed: 05/24/2024]
Abstract
BACKGROUND: Gene regulatory network (GRN) models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such gene regulatory ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the underlying GRN governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede either scalability, explainability, or both.
RESULTS: We developed PHOENIX, a modeling framework based on neural ordinary differential equations (NeuralODEs) and Hill-Langmuir kinetics, that overcomes limitations of other methods by flexibly incorporating prior domain knowledge and biological constraints to promote sparse, biologically interpretable representations of GRN ODEs. We tested the accuracy of PHOENIX in a series of in silico experiments, benchmarking it against several currently used tools. We demonstrated PHOENIX's flexibility by modeling regulation of oscillating expression profiles obtained from synchronized yeast cells. We also assessed the scalability of PHOENIX by modeling genome-scale GRNs for breast cancer samples ordered in pseudotime and for B cells treated with Rituximab.
CONCLUSIONS: PHOENIX uses a combination of user-defined prior knowledge and functional forms from systems biology to encode biological "first principles" as soft constraints on the GRN allowing us to predict subsequent gene expression patterns in a biologically explainable manner.
Affiliation(s)
- Viola Fanfani
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jonas Fischer
  - Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rebekka Burkholz
  - CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
15
Maizels RJ, Snell DM, Briscoe J. Reconstructing developmental trajectories using latent dynamical systems and time-resolved transcriptomics. Cell Syst 2024; 15:411-424.e9. [PMID: 38754365 DOI: 10.1016/j.cels.2024.04.004] [Received: 09/19/2023] [Revised: 02/01/2024] [Accepted: 04/17/2024] [Indexed: 05/18/2024]
Abstract
The snapshot nature of single-cell transcriptomics presents a challenge for studying the dynamics of cell fate decisions. Metabolic labeling and splicing can provide temporal information at single-cell level, but current methods have limitations. Here, we present a framework that overcomes these limitations: experimentally, we developed sci-FATE2, an optimized method for metabolic labeling with increased data quality, which we used to profile 45,000 embryonic stem (ES) cells differentiating into neural tube identities. Computationally, we developed a two-stage framework for dynamical modeling: VelvetVAE, a variational autoencoder (VAE) for velocity inference that outperforms all other tools tested, and VelvetSDE, a neural stochastic differential equation (nSDE) framework for simulating trajectory distributions. These recapitulate underlying dataset distributions and capture features such as decision boundaries between alternative fates and fate-specific gene expression. These methods recast single-cell analyses from descriptions of observed data to models of the dynamics that generated them, providing a framework for investigating developmental fate decisions.
Affiliation(s)
- Rory J Maizels
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK; University College, London, UK
- Daniel M Snell
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- James Briscoe
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
16
Arroyo-Esquivel J, Klausmeier CA, Litchman E. Using neural ordinary differential equations to predict complex ecological dynamics from population density data. J R Soc Interface 2024; 21:20230604. [PMID: 38745459 DOI: 10.1098/rsif.2023.0604] [Received: 10/17/2023] [Accepted: 03/25/2024] [Indexed: 05/16/2024]
Abstract
Simple models have been used to describe ecological processes for over a century. However, the complexity of ecological systems makes simple models subject to modelling bias due to simplifying assumptions or unaccounted factors, limiting their predictive power. Neural ordinary differential equations (NODEs) have surged as a machine-learning algorithm that preserves the dynamic nature of the data (Chen et al. 2018 Adv. Neural Inf. Process. Syst.). Although preserving the dynamics in the data is an advantage, the question of how NODEs perform as a forecasting tool of ecological communities is unanswered. Here, we explore this question using simulated time series of competing species in a time-varying environment. We find that NODEs provide more precise forecasts than autoregressive integrated moving average (ARIMA) models. We also find that untuned NODEs have a similar forecasting accuracy to untuned long-short term memory neural networks and both are outperformed in accuracy and precision by empirical dynamical modelling. However, we also find NODEs generally outperform all other methods when evaluating with the interval score, which evaluates precision and accuracy in terms of prediction intervals rather than pointwise accuracy. We also discuss ways to improve the forecasting performance of NODEs. The power of a forecasting tool such as NODEs is that it can provide insights into population dynamics and should thus broaden the approaches to studying time series of ecological communities.
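Illustrative sketch (an assumption about the metric, not taken from the paper): the abstract refers to an interval score that rewards narrow prediction intervals while penalising observations that fall outside them. A commonly used definition for a central (1 − α) prediction interval [l, u] and observation y is width plus (2/α) times the distance by which y misses the interval. The function name and argument names below are hypothetical, and the paper's exact variant may differ.

```python
import numpy as np


def interval_score(y, lower, upper, alpha):
    """Interval score for a central (1 - alpha) prediction interval.

    score = (upper - lower)
            + (2 / alpha) * max(lower - y, 0)   # penalty when y is below the interval
            + (2 / alpha) * max(y - upper, 0)   # penalty when y is above the interval
    Lower scores are better. Inputs may be scalars or arrays of equal shape.
    """
    y = np.asarray(y, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    width = upper - lower
    miss_below = np.clip(lower - y, 0.0, None)
    miss_above = np.clip(y - upper, 0.0, None)
    return width + (2.0 / alpha) * (miss_below + miss_above)


# A 90% interval [4, 6] scored against three observations:
scores = interval_score([5.0, 3.0, 7.5], [4.0, 4.0, 4.0], [6.0, 6.0, 6.0], alpha=0.1)
```

An observation inside the interval is charged only the interval width, so a forecaster cannot improve the score simply by widening intervals without limit.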
Affiliation(s)
- Christopher A Klausmeier
  - Department of Global Ecology, Carnegie Institution for Science, Stanford, CA, USA
  - W. K. Kellogg Biological Station, Michigan State University, Hickory Corners, MI, USA
  - Program in Ecology and Evolutionary Biology, Michigan State University, East Lansing, MI, USA
  - Department of Integrative Biology, Michigan State University, East Lansing, MI, USA
  - Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Elena Litchman
  - Department of Global Ecology, Carnegie Institution for Science, Stanford, CA, USA
  - W. K. Kellogg Biological Station, Michigan State University, Hickory Corners, MI, USA
  - Program in Ecology and Evolutionary Biology, Michigan State University, East Lansing, MI, USA
  - Department of Integrative Biology, Michigan State University, East Lansing, MI, USA
17
Gao CF, Vaikuntanathan S, Riesenfeld SJ. Dissection and integration of bursty transcriptional dynamics for complex systems. Proc Natl Acad Sci U S A 2024; 121:e2306901121. [PMID: 38669186 PMCID: PMC11067469 DOI: 10.1073/pnas.2306901121] [Received: 04/26/2023] [Accepted: 03/06/2024] [Indexed: 04/28/2024]
Abstract
RNA velocity estimation is a potentially powerful tool to reveal the directionality of transcriptional changes in single-cell RNA-sequencing data, but it lacks accuracy, absent advanced metabolic labeling techniques. We developed an approach, TopicVelo, that disentangles simultaneous, yet distinct, dynamics by using a probabilistic topic model, a highly interpretable form of latent space factorization, to infer cells and genes associated with individual processes, thereby capturing cellular pluripotency or multifaceted functionality. Focusing on process-associated cells and genes enables accurate estimation of process-specific velocities via a master equation for a transcriptional burst model accounting for intrinsic stochasticity. The method obtains a global transition matrix by leveraging cell topic weights to integrate process-specific signals. In challenging systems, this method accurately recovers complex transitions and terminal states, while our use of first-passage time analysis provides insights into transient transitions. These results expand the limits of RNA velocity, empowering future studies of cell fate and functional responses.
Affiliation(s)
- Cheng Frank Gao
  - Department of Chemistry, University of Chicago, Chicago, IL 60637
- Suriyanarayanan Vaikuntanathan
  - Department of Chemistry, University of Chicago, Chicago, IL 60637
  - Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637
- Samantha J. Riesenfeld
  - Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637
  - Department of Medicine, University of Chicago, Chicago, IL 60637
  - Committee on Immunology, Biological Sciences Division, University of Chicago, Chicago, IL 60637
18
Maizels RJ. A dynamical perspective: moving towards mechanism in single-cell transcriptomics. Philos Trans R Soc Lond B Biol Sci 2024; 379:20230049. [PMID: 38432314 PMCID: PMC10909508 DOI: 10.1098/rstb.2023.0049] [Received: 07/10/2023] [Accepted: 10/31/2023] [Indexed: 03/05/2024]
Abstract
As the field of single-cell transcriptomics matures, research is shifting focus from phenomenological descriptions of cellular phenotypes to a mechanistic understanding of the gene regulation underneath. This perspective considers the value of capturing dynamical information at single-cell resolution for gaining mechanistic insight; reviews the available technologies for recording and inferring temporal information in single cells; and explores whether better dynamical resolution is sufficient to adequately capture the causal relationships driving complex biological systems. This article is part of a discussion meeting issue 'Causes and consequences of stochastic processes in development and disease'.
Affiliation(s)
- Rory J. Maizels
  - The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
  - University College London, London WC1E 6BT, UK
19
Xie C, Yang Y, Yu H, He Q, Yuan M, Dong B, Zhang L, Yang M. RNA velocity prediction via neural ordinary differential equation. iScience 2024; 27:109635. [PMID: 38623336 PMCID: PMC11016905 DOI: 10.1016/j.isci.2024.109635] [Received: 08/18/2023] [Revised: 12/04/2023] [Accepted: 03/26/2024] [Indexed: 04/17/2024]
Abstract
RNA velocity is a crucial tool for unraveling the trajectory of cellular responses. Several approaches, including ordinary differential equations and machine learning models, have been proposed to interpret velocity. However, the practicality of these methods is constrained by underlying assumptions. In this study, we introduce SymVelo, a dual-path framework that effectively integrates high- and low-dimensional information. Rigorous benchmarking and extensive studies demonstrate that SymVelo is capable of inferring differentiation trajectories in developing organs, analyzing gene responses to stimulation, and uncovering transcription dynamics. Moreover, the adaptable architecture of SymVelo enables customization to accommodate intricate data and diverse modalities in forthcoming research, thereby providing a promising avenue for advancing our understanding of cellular behavior.
Affiliation(s)
- Chenxi Xie
  - MGI, BGI-Shenzhen, Shenzhen 518083, China
- Hao Yu
  - Peking University, Beijing 100871, China
- Qiushun He
  - MGI, BGI-Shenzhen, Shenzhen 518083, China
- Bin Dong
  - Peking University, Beijing 100871, China
- Li Zhang
  - Peking University, Beijing 100871, China
- Meng Yang
  - MGI, BGI-Shenzhen, Shenzhen 518083, China
20
Zhang K, Zhu J, Kong D, Zhang Z. Modeling single cell trajectory using forward-backward stochastic differential equations. PLoS Comput Biol 2024; 20:e1012015. [PMID: 38620017 PMCID: PMC11018287 DOI: 10.1371/journal.pcbi.1012015] [Received: 10/25/2023] [Accepted: 03/22/2024] [Indexed: 04/17/2024]
Abstract
Recent advances in single-cell sequencing technology have provided opportunities for mathematical modeling of dynamic developmental processes at the single-cell level, such as inferring developmental trajectories. Optimal transport has emerged as a promising theoretical framework for this task by computing pairings between cells from different time points. However, optimal transport methods have limitations in capturing nonlinear trajectories, as they are static and can only infer linear paths between endpoints. In contrast, stochastic differential equations (SDEs) offer a dynamic and flexible approach that can model non-linear trajectories, including the shape of the path. Nevertheless, existing SDE methods often rely on numerical approximations that can lead to inaccurate inferences, deviating from true trajectories. To address this challenge, we propose a novel approach combining forward-backward stochastic differential equations (FBSDE) with a refined approximation procedure. Our FBSDE model integrates the forward and backward movements of two SDEs in time, aiming to capture the underlying dynamics of single-cell developmental trajectories. Through comprehensive benchmarking on multiple scRNA-seq datasets, we demonstrate the superior performance of FBSDE compared to other methods, highlighting its efficacy in accurately inferring developmental trajectories.
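Illustrative sketch (not the FBSDE method): the abstract contrasts straight-line interpolation between endpoints with SDE models that trace curved, noisy paths, and notes that SDE methods rely on numerical approximation. The toy below shows the most basic such approximation, the Euler-Maruyama scheme, for dX = b(X) dt + σ dW. The double-well drift, the constant noise level `sigma`, and all parameter values are assumptions chosen only to make the example concrete; it simulates a single forward SDE and does not couple forward and backward equations.

```python
import numpy as np


def euler_maruyama(drift, sigma, x0, t0, t1, n_steps, rng):
    """Simulate dX = drift(X) dt + sigma dW on [t0, t1] with fixed step size.

    Returns an array of shape (n_steps + 1, *x0.shape) holding the whole path.
    """
    dt = (t1 - t0) / n_steps
    path = np.empty((n_steps + 1,) + x0.shape)
    path[0] = x0
    for k in range(n_steps):
        noise = rng.normal(size=x0.shape)  # standard normal increments
        path[k + 1] = path[k] + drift(path[k]) * dt + sigma * np.sqrt(dt) * noise
    return path


def double_well_drift(x):
    """Negative gradient of V(x) = x**4 / 4 - x**2 / 2, with stable states at x = -1 and x = +1."""
    return x - x**3


rng = np.random.default_rng(0)
x0 = np.full(200, 0.05)  # 200 hypothetical cells starting near the unstable state x = 0

# Noisy paths: individual cells can commit to either well.
paths = euler_maruyama(double_well_drift, 0.3, x0, 0.0, 10.0, 1000, rng)

# Noise-free reference path from x = 0.5, which relaxes to the well at x = +1.
reference = euler_maruyama(double_well_drift, 0.0, np.array([0.5]), 0.0, 10.0, 1000, rng)
```

A straight line drawn between the first and last rows of `paths` would miss the time each cell spends near the unstable state before committing, which is the kind of path shape a dynamic model is meant to capture.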
Affiliation(s)
- Kevin Zhang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Junhao Zhu
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Dehan Kong
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Zhaolei Zhang
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario, Canada
21
Liu Y, Huang K, Chen W. Resolving cellular dynamics using single-cell temporal transcriptomics. Curr Opin Biotechnol 2024; 85:103060. [PMID: 38194753 DOI: 10.1016/j.copbio.2023.103060] [Received: 10/01/2023] [Revised: 12/04/2023] [Accepted: 12/10/2023] [Indexed: 01/11/2024]
Abstract
Cellular dynamics, the transition of a cell from one state to another, is central to understanding developmental processes and disease progression. Single-cell transcriptomics has been pushing the frontiers of cellular dynamics studies into a genome-wide and single-cell level. While most single-cell RNA sequencing approaches are disruptive and only provide a snapshot of cell states, the dynamics of a cell could be reconstructed by either exploiting temporal information hiding in the transcriptomics data or integrating additional information. In this review, we describe these approaches, highlighting their underlying principles, key assumptions, and the rationality to interpret the results as models. We also discuss the recently emerging nondisruptive live-cell transcriptomics methods, which are highly complementary to the computational models for their assumption-free nature.
Affiliation(s)
- Yifei Liu
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Kai Huang
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Wanze Chen
  - Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
22
Ye F, Wang J, Li J, Mei Y, Guo G. Mapping Cell Atlases at the Single-Cell Level. Adv Sci (Weinh) 2024; 11:e2305449. [PMID: 38145338 PMCID: PMC10885669 DOI: 10.1002/advs.202305449] [Received: 08/07/2023] [Revised: 12/01/2023] [Indexed: 12/26/2023]
Abstract
Recent advancements in single-cell technologies have led to rapid developments in the construction of cell atlases. These atlases have the potential to provide detailed information about every cell type in different organisms, enabling the characterization of cellular diversity at the single-cell level. Global efforts in developing comprehensive cell atlases have profound implications for both basic research and clinical applications. This review provides a broad overview of the cellular diversity and dynamics across various biological systems. In addition, the incorporation of machine learning techniques into cell atlas analyses opens up exciting prospects for the field of integrative biology.
Affiliation(s)
- Fang Ye
- Bone Marrow Transplantation Center of the First Affiliated Hospital, and Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
- Liangzhu Laboratory, Zhejiang University, Hangzhou, Zhejiang 311121, China
| | - Jingjing Wang
- Bone Marrow Transplantation Center of the First Affiliated Hospital, and Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
- Liangzhu Laboratory, Zhejiang University, Hangzhou, Zhejiang 311121, China
| | - Jiaqi Li
- Bone Marrow Transplantation Center of the First Affiliated Hospital, and Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
| | - Yuqing Mei
- Bone Marrow Transplantation Center of the First Affiliated Hospital, and Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
| | - Guoji Guo
- Bone Marrow Transplantation Center of the First Affiliated Hospital, and Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang 310000, China
- Liangzhu Laboratory, Zhejiang University, Hangzhou, Zhejiang 311121, China
- Zhejiang Provincial Key Lab for Tissue Engineering and Regenerative Medicine, Dr. Li Dak Sum & Yip Yio Chin Center for Stem Cell and Regenerative Medicine, Hangzhou, Zhejiang 310058, China
- Institute of Hematology, Zhejiang University, Hangzhou, Zhejiang 310000, China
|
23
|
Lederer AR, Leonardi M, Talamanca L, Herrera A, Droin C, Khven I, Carvalho HJF, Valente A, Mantes AD, Arabí PM, Pinello L, Naef F, Manno GL. Statistical inference with a manifold-constrained RNA velocity model uncovers cell cycle speed modulations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.18.576093. [PMID: 38328127 PMCID: PMC10849531 DOI: 10.1101/2024.01.18.576093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Across a range of biological processes, cells undergo coordinated changes in gene expression, resulting in transcriptome dynamics that unfold within a low-dimensional manifold. Single-cell RNA-sequencing (scRNA-seq) only measures temporal snapshots of gene expression. However, information on the underlying low-dimensional dynamics can be extracted using RNA velocity, which models unspliced and spliced RNA abundances to estimate the rate of change of gene expression. Available RNA velocity algorithms can be fragile and rely on heuristics that lack statistical control. Moreover, the estimated vector field is not dynamically consistent with the traversed gene expression manifold. Here, we develop a generative model of RNA velocity and a Bayesian inference approach that solves these problems. Our model couples velocity field and manifold estimation in a reformulated, unified framework, so as to coherently identify the parameters of an autonomous dynamical system. Focusing on the cell cycle, we implemented VeloCycle to study gene regulation dynamics on one-dimensional periodic manifolds and validated using live-imaging its ability to infer actual cell cycle periods. We benchmarked RNA velocity inference with sensitivity analyses and demonstrated one- and multiple-sample testing. We also conducted Markov chain Monte Carlo inference on the model, uncovering key relationships between gene-specific kinetics and our gene-independent velocity estimate. Finally, we applied VeloCycle to in vivo samples and in vitro genome-wide Perturb-seq, revealing regionally-defined proliferation modes in neural progenitors and the effect of gene knockdowns on cell cycle speed. Ultimately, VeloCycle expands the scRNA-seq analysis toolkit with a modular and statistically rigorous RNA velocity inference framework.
|
24
|
Li M, Guo H, Wang B, Han Z, Wu S, Liu J, Huang H, Zhu J, An F, Lin Z, Mo K, Tan J, Liu C, Wang L, Deng X, Li G, Ji J, Ouyang H. The single-cell transcriptomic atlas and RORA-mediated 3D epigenomic remodeling in driving corneal epithelial differentiation. Nat Commun 2024; 15:256. [PMID: 38177186 PMCID: PMC10766623 DOI: 10.1038/s41467-023-44471-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 12/13/2023] [Indexed: 01/06/2024] Open
Abstract
Proper differentiation of corneal epithelial cells (CECs) from limbal stem/progenitor cells (LSCs) is required for maintenance of ocular homeostasis and clear vision. Here, using a single-cell transcriptomic atlas, we delineate the comprehensive and refined molecular regulatory dynamics during human CEC development and differentiation. We find that RORA is a CEC-specific molecular switch that initiates and drives LSCs to differentiate into mature CECs by activating PITX1. RORA dictates CEC differentiation by establishing CEC-specific enhancers and chromatin interactions between CEC gene promoters and distal regulatory elements. Conversely, RORA silences LSC-specific promoters and disrupts promoter-anchored chromatin loops to turn off LSC genes. Collectively, our work provides detailed and comprehensive insights into the transcriptional dynamics and RORA-mediated epigenetic remodeling underlying human corneal epithelial differentiation.
Affiliation(s)
- Mingsen Li
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China.
| | - Huizhen Guo
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Bofeng Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Zhuo Han
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Siqi Wu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Jiafeng Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Huaxing Huang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Jin Zhu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Fengjiao An
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Zesong Lin
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Kunlun Mo
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Jieying Tan
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Chunqiao Liu
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Li Wang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China
| | - Xin Deng
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, 999077, China
| | - Guigang Li
- Department of Ophthalmology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei Province, China
| | - Jianping Ji
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China.
| | - Hong Ouyang
- State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangdong Provincial Clinical Research Center for Ocular Diseases, Guangzhou, 510060, China.
|
25
|
Zhang C, Fang Y, Chen W, Chen Z, Zhang Y, Xie Y, Chen W, Xie Z, Guo M, Wang J, Tan C, Wang H, Tang C. Improving the RNA velocity approach with single-cell RNA lifecycle (nascent, mature and degrading RNAs) sequencing technologies. Nucleic Acids Res 2023; 51:e112. [PMID: 37941145 PMCID: PMC10711548 DOI: 10.1093/nar/gkad969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 09/27/2023] [Accepted: 10/14/2023] [Indexed: 11/10/2023] Open
Abstract
We presented an experimental method called FLOUR-seq, which combines BD Rhapsody and nanopore sequencing to detect the RNA lifecycle (including nascent, mature, and degrading RNAs) in cells. Additionally, we updated our HIT-scISOseq V2 to discover a more accurate RNA lifecycle using 10x Chromium and Pacbio sequencing. Most importantly, to explore how single-cell full-length RNA sequencing technologies could help improve the RNA velocity approach, we introduced a new algorithm called 'Region Velocity' to more accurately configure cellular RNA velocity. We applied this algorithm to study spermiogenesis and compared the performance of FLOUR-seq with Pacbio-based HIT-scISOseq V2. Our findings demonstrated that 'Region Velocity' is more suitable for analyzing single-cell full-length RNA data than traditional RNA velocity approaches. These novel methods could be useful for researchers looking to discover full-length RNAs in single cells and comprehensively monitor RNA lifecycle in cells.
Affiliation(s)
| | | | - Weitian Chen
- BGI, Shenzhen 518000, China
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen 518083, China
| | | | - Ying Zhang
- Guangdong Provincial Reproductive Science Institute (Guangdong Provincial Fertility Hospital), Guangzhou, China; NHC Key Laboratory of Male Reproduction and Genetics, Guangzhou, China
|
26
|
Jackson CA, Beheler-Amass M, Tjärnberg A, Suresh I, Hickey ASM, Bonneau R, Gresham D. Simultaneous estimation of gene regulatory network structure and RNA kinetics from single cell gene expression. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.09.21.558277. [PMID: 37790443 PMCID: PMC10542544 DOI: 10.1101/2023.09.21.558277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Cells respond to environmental and developmental stimuli by remodeling their transcriptomes through regulation of both mRNA transcription and mRNA decay. A central goal of biology is identifying the global set of regulatory relationships between factors that control mRNA production and degradation and their target transcripts and construct a predictive model of gene expression. Regulatory relationships are typically identified using transcriptome measurements and causal inference algorithms. RNA kinetic parameters are determined experimentally by employing run-on or metabolic labeling (e.g. 4-thiouracil) methods that allow transcription and decay rates to be separately measured. Here, we develop a deep learning model, trained with single-cell RNA-seq data, that both infers causal regulatory relationships and estimates RNA kinetic parameters. The resulting in silico model predicts future gene expression states and can be perturbed to simulate the effect of transcription factor changes. We acquired model training data by sequencing the transcriptomes of 175,000 individual Saccharomyces cerevisiae cells that were subject to an external perturbation and continuously sampled over a one hour period. The rate of change for each transcript was calculated on a per-cell basis to estimate RNA velocity. We then trained a deep learning model with transcriptome and RNA velocity data to calculate time-dependent estimates of mRNA production and decay rates. By separating RNA velocity into transcription and decay rates, we show that rapamycin treatment causes existing ribosomal protein transcripts to be rapidly destabilized, while production of new transcripts gradually slows over the course of an hour. The neural network framework we present is designed to explicitly model causal regulatory relationships between transcription factors and their genes, and shows superior performance to existing models on the basis of recovery of known regulatory relationships. 
We validated the predictive power of the model by perturbing transcription factors in silico and comparing transcriptome-wide effects with experimental data. Our study represents the first step in constructing a complete, predictive, biophysical model of gene expression regulation.
Affiliation(s)
- Christopher A Jackson
- Center For Genomics and Systems Biology, New York University, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
| | - Maggie Beheler-Amass
- Center For Genomics and Systems Biology, New York University, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
| | - Andreas Tjärnberg
- Center For Genomics and Systems Biology, New York University, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
| | - Ina Suresh
- Center For Genomics and Systems Biology, New York University, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
| | - Angela Shang-mei Hickey
- Center For Genomics and Systems Biology, New York University, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
| | | | - David Gresham
- Center For Genomics and Systems Biology, New York University, New York, NY, USA
- Department of Biology, New York University, New York, NY, USA
|
27
|
Groves SM, Quaranta V. Quantifying cancer cell plasticity with gene regulatory networks and single-cell dynamics. FRONTIERS IN NETWORK PHYSIOLOGY 2023; 3:1225736. [PMID: 37731743 PMCID: PMC10507267 DOI: 10.3389/fnetp.2023.1225736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023]
Abstract
Phenotypic plasticity of cancer cells can lead to complex cell state dynamics during tumor progression and acquired resistance. Highly plastic stem-like states may be inherently drug-resistant. Moreover, cell state dynamics in response to therapy allow a tumor to evade treatment. In both scenarios, quantifying plasticity is essential for identifying high-plasticity states or elucidating transition paths between states. Currently, methods to quantify plasticity tend to focus on 1) quantification of quasi-potential based on the underlying gene regulatory network dynamics of the system; or 2) inference of cell potency based on trajectory inference or lineage tracing in single-cell dynamics. Here, we explore both of these approaches and associated computational tools. We then discuss implications of each approach to plasticity metrics, and relevance to cancer treatment strategies.
Affiliation(s)
- Sarah M. Groves
- Department of Pharmacology, Vanderbilt University, Nashville, TN, United States
| | - Vito Quaranta
- Department of Pharmacology, Vanderbilt University, Nashville, TN, United States
- Department of Biochemistry, Vanderbilt University, Nashville, TN, United States
|
28
|
Erfanian N, Heydari AA, Feriz AM, Iañez P, Derakhshani A, Ghasemigol M, Farahpour M, Razavi SM, Nasseri S, Safarpour H, Sahebkar A. Deep learning applications in single-cell genomics and transcriptomics data analysis. Biomed Pharmacother 2023; 165:115077. [PMID: 37393865 DOI: 10.1016/j.biopha.2023.115077] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/24/2023] [Revised: 06/22/2023] [Accepted: 06/23/2023] [Indexed: 07/04/2023] Open
Abstract
Traditional bulk sequencing methods are limited to measuring the average signal in a group of cells, potentially masking heterogeneity, and rare populations. The single-cell resolution, however, enhances our understanding of complex biological systems and diseases, such as cancer, the immune system, and chronic diseases. However, the single-cell technologies generate massive amounts of data that are often high-dimensional, sparse, and complex, thus making analysis with traditional computational approaches difficult and unfeasible. To tackle these challenges, many are turning to deep learning (DL) methods as potential alternatives to the conventional machine learning (ML) algorithms for single-cell studies. DL is a branch of ML capable of extracting high-level features from raw inputs in multiple stages. Compared to traditional ML, DL models have provided significant improvements across many domains and applications. In this work, we examine DL applications in genomics, transcriptomics, spatial transcriptomics, and multi-omics integration, and address whether DL techniques will prove to be advantageous or if the single-cell omics domain poses unique challenges. Through a systematic literature review, we have found that DL has not yet revolutionized the most pressing challenges of the single-cell omics field. However, using DL models for single-cell omics has shown promising results (in many cases outperforming the previous state-of-the-art models) in data preprocessing and downstream analysis. Although developments of DL algorithms for single-cell omics have generally been gradual, recent advances reveal that DL can offer valuable resources in fast-tracking and advancing research in single-cell.
Affiliation(s)
- Nafiseh Erfanian
- Student Research Committee, Birjand University of Medical Sciences, Birjand, Iran
| | - A Ali Heydari
- Department of Applied Mathematics, University of California, Merced, CA, USA; Health Sciences Research Institute, University of California, Merced, CA, USA
| | - Adib Miraki Feriz
- Student Research Committee, Birjand University of Medical Sciences, Birjand, Iran
| | - Pablo Iañez
- Cellular Systems Genomics Group, Josep Carreras Research Institute, Barcelona, Spain
| | - Afshin Derakhshani
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada
| | | | - Mohsen Farahpour
- Department of Electronics, Faculty of Electrical and Computer Engineering, University of Birjand, Birjand, Iran
| | - Seyyed Mohammad Razavi
- Department of Electronics, Faculty of Electrical and Computer Engineering, University of Birjand, Birjand, Iran
| | - Saeed Nasseri
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran
| | - Hossein Safarpour
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran.
| | - Amirhossein Sahebkar
- Biotechnology Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran; Applied Biomedical Research Center, Mashhad University of Medical Sciences, Mashhad, Iran; Department of Biotechnology, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran.
|
29
|
Erbe R, Stein-O’Brien G, Fertig EJ. Transcriptomic forecasting with neural ordinary differential equations. PATTERNS (NEW YORK, N.Y.) 2023; 4:100793. [PMID: 37602211 PMCID: PMC10435954 DOI: 10.1016/j.patter.2023.100793] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/13/2022] [Revised: 04/03/2023] [Accepted: 06/13/2023] [Indexed: 08/22/2023]
Abstract
Single-cell transcriptomics technologies can uncover changes in the molecular states that underlie cellular phenotypes. However, understanding the dynamic cellular processes requires extending from inferring trajectories from snapshots of cellular states to estimating temporal changes in cellular gene expression. To address this challenge, we have developed a neural ordinary differential-equation-based method, RNAForecaster, for predicting gene expression states in single cells for multiple future time steps in an embedding-independent manner. We demonstrate that RNAForecaster can accurately predict future expression states in simulated single-cell transcriptomic data with cellular tracking over time. We then show that by using metabolic labeling single-cell RNA sequencing (scRNA-seq) data from constitutively dividing cells, RNAForecaster accurately recapitulates many of the expected changes in gene expression during progression through the cell cycle over a 3-day period. Thus, RNAForecaster enables short-term estimation of future expression states in biological systems from high-throughput datasets with temporal information.
Affiliation(s)
- Rossin Erbe
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins Convergence Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
| | - Genevieve Stein-O’Brien
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Johns Hopkins Convergence Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kavli Neurodiscovery Institute, Baltimore, MD, USA
- Single Cell Training and Analysis Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Elana J. Fertig
- Johns Hopkins Convergence Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
- Johns Hopkins Bloomberg Kimmel Institute for Immunotherapy, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA
|
30
|
Burdziak C, Zhao CJ, Haviv D, Alonso-Curbelo D, Lowe SW, Pe’er D. scKINETICS: inference of regulatory velocity with single-cell transcriptomics data. Bioinformatics 2023; 39:i394-i403. [PMID: 37387147 PMCID: PMC10311321 DOI: 10.1093/bioinformatics/btad267] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: Transcriptional dynamics are governed by the action of regulatory proteins and are fundamental to systems ranging from normal development to disease. RNA velocity methods for tracking phenotypic dynamics ignore information on the regulatory drivers of gene expression variability through time. RESULTS: We introduce scKINETICS (Key regulatory Interaction NETwork for Inferring Cell Speed), a dynamical model of gene expression change which is fit with the simultaneous learning of per-cell transcriptional velocities and a governing gene regulatory network. Fitting is accomplished through an expectation-maximization approach designed to learn the impact of each regulator on its target genes, leveraging biologically motivated priors from epigenetic data, gene-gene coexpression, and constraints on cells' future states imposed by the phenotypic manifold. Applying this approach to an acute pancreatitis dataset recapitulates a well-studied axis of acinar-to-ductal transdifferentiation whilst proposing novel regulators of this process, including factors with previously appreciated roles in driving pancreatic tumorigenesis. In benchmarking experiments, we show that scKINETICS successfully extends and improves existing velocity approaches to generate interpretable, mechanistic models of gene regulatory dynamics. AVAILABILITY AND IMPLEMENTATION: All Python code and an accompanying Jupyter notebook with demonstrations are available at http://github.com/dpeerlab/scKINETICS.
Affiliation(s)
- Cassandra Burdziak
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, Sloan Kettering Institute, 408 E 69th Street, New York, NY 10021, United States
| | - Chujun Julia Zhao
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, Sloan Kettering Institute, 408 E 69th Street, New York, NY 10021, United States
- Department of Biomedical Engineering, Columbia University, 1210 Amsterdam Ave, New York, NY 10027, United States
| | - Doron Haviv
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, Sloan Kettering Institute, 408 E 69th Street, New York, NY 10021, United States
| | - Direna Alonso-Curbelo
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology (BIST), Carrer de Baldiri Reixac, 10, Barcelona 08028, Spain
- Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, Sloan Kettering Institute, 408 E 69th Street, New York, NY 10021, United States
| | - Scott W Lowe
- Cancer Biology and Genetics Program, Memorial Sloan Kettering Cancer Center, Sloan Kettering Institute, 408 E 69th Street, New York, NY 10021, United States
- Howard Hughes Medical Institute, 4000 Jones Bridge Road, Chevy Chase, Maryland 20815, United States
| | - Dana Pe’er
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, Sloan Kettering Institute, 408 E 69th Street, New York, NY 10021, United States
- Howard Hughes Medical Institute, 4000 Jones Bridge Road, Chevy Chase, Maryland 20815, United States
|
31
|
Li Q. scTour: a deep learning architecture for robust inference and accurate prediction of cellular dynamics. Genome Biol 2023; 24:149. [PMID: 37353848 PMCID: PMC10290357 DOI: 10.1186/s13059-023-02988-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/27/2022] [Accepted: 06/13/2023] [Indexed: 06/25/2023] Open
Abstract
Despite the continued efforts, a batch-insensitive tool that can both infer and predict the developmental dynamics using single-cell genomics is lacking. Here, I present scTour, a novel deep learning architecture to perform robust inference and accurate prediction of cellular dynamics with minimal influence from batch effects. For inference, scTour simultaneously estimates the developmental pseudotime, delineates the vector field, and maps the transcriptomic latent space under a single, integrated framework. For prediction, scTour precisely reconstructs the underlying dynamics of unseen cellular states or a new independent dataset. scTour's functionalities are demonstrated in a variety of biological processes from 19 datasets.
Affiliation(s)
- Qian Li
- Department of Pathology, University of Cambridge, Cambridge, UK.
|
32
|
Gao CF, Vaikuntanathan S, Riesenfeld SJ. Dissection and Integration of Bursty Transcriptional Dynamics for Complex Systems. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.13.544828. [PMID: 37398022 PMCID: PMC10312759 DOI: 10.1101/2023.06.13.544828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
RNA velocity estimation is a potentially powerful tool to reveal the directionality of transcriptional changes in single-cell RNA-seq data, but it lacks accuracy, absent advanced metabolic labeling techniques. We developed a novel approach, TopicVelo, that disentangles simultaneous, yet distinct, dynamics by using a probabilistic topic model, a highly interpretable form of latent space factorization, to infer cells and genes associated with individual processes, thereby capturing cellular pluripotency or multifaceted functionality. Focusing on process-associated cells and genes enables accurate estimation of process-specific velocities via a master equation for a transcriptional burst model accounting for intrinsic stochasticity. The method obtains a global transition matrix by leveraging cell topic weights to integrate process-specific signals. In challenging systems, this method accurately recovers complex transitions and terminal states, while our novel use of first-passage time analysis provides insights into transient transitions. These results expand the limits of RNA velocity, empowering future studies of cell fate and functional responses.
Collapse
Affiliation(s)
| | | | - Samantha J Riesenfeld
- Institute for Biophysical Dynamics, University of Chicago, IL
- Pritzker School of Molecular Engineering, University of Chicago, IL
- Department of Medicine, University of Chicago, IL
- Committee on Immunology, University of Chicago, IL
|
33
|
Vodovotz Y. Towards systems immunology of critical illness at scale: from single cell 'omics to digital twins. Trends Immunol 2023; 44:345-355. [PMID: 36967340 PMCID: PMC10147586 DOI: 10.1016/j.it.2023.03.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/20/2023] [Revised: 03/06/2023] [Accepted: 03/07/2023] [Indexed: 04/05/2023]
Abstract
Single-cell 'omics methodology has yielded unprecedented insights based largely on data-centric informatics for reducing, and thus interpreting, massive datasets. In parallel, parsimonious mathematical modeling based on abstractions of pathobiology has also yielded major insights into inflammation and immunity, with these models being extended to describe multi-organ disease pathophysiology as the basis of 'digital twins' and in silico clinical trials. The integration of these distinct methods at scale can drive both basic and translational advances, especially in the context of critical illness, including diseases such as COVID-19. Here, I explore achievements and argue the challenges that are inherent to the integration of data-driven and mechanistic modeling approaches, highlighting the potential of modeling-based strategies for rational immune system reprogramming.
Affiliation(s)
- Yoram Vodovotz
- Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15213, USA; Center for Inflammation and Regeneration Modeling, McGowan Institute for Regenerative Medicine, University of Pittsburgh, Pittsburgh, PA 15219, USA; Center for Systems Immunology, University of Pittsburgh, Pittsburgh, PA 15219, USA.
34
|
Hossain I, Fanfani V, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. Research Square 2023:rs.3.rs-2675584. [PMID: 36993392 PMCID: PMC10055646 DOI: 10.21203/rs.3.rs-2675584/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/19/2023]
Abstract
Models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the causal gene-regulatory network (GRN) governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede scalability and/or explainability. To overcome these limitations, we developed PHOENIX, a modeling framework based on neural ordinary differential equations (NeuralODEs) and Hill-Langmuir kinetics, that can flexibly incorporate prior domain knowledge and biological constraints to promote sparse, biologically interpretable representations of ODEs. We test the accuracy of PHOENIX in a series of in silico experiments, benchmarking it against several currently used tools for ODE estimation. We also demonstrate PHOENIX's flexibility by studying oscillating expression data from synchronized yeast cells and assess its scalability by modelling genome-scale breast cancer expression for samples ordered in pseudotime. Finally, we show how the combination of user-defined prior knowledge and functional forms from systems biology allows PHOENIX to encode key properties of the underlying GRN, and subsequently predict expression patterns in a biologically explainable way.
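For readers unfamiliar with Hill-Langmuir kinetics, the following is a minimal hand-written sketch of a two-gene ODE in which gene X activates gene Y and gene Y represses gene X. It is not PHOENIX and contains no neural network; the function names, the parameter values, and the forward-Euler integrator are all assumptions chosen for illustration.

```python
def hill_activation(x, k, n):
    """Hill-Langmuir activation: fraction of maximal rate at regulator level x >= 0."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Hill-Langmuir repression: complement of the activation term."""
    return k**n / (k**n + x**n)


def simulate_two_gene(x0, y0, steps=2000, dt=0.01,
                      production=1.0, decay=1.0, k=0.5, n=2.0):
    """Forward-Euler integration of a toy two-gene circuit (illustrative only).

    dx/dt = production * hill_repression(y) - decay * x   (Y represses X)
    dy/dt = production * hill_activation(x) - decay * y   (X activates Y)
    Returns a list of (x, y) tuples of length steps + 1.
    """
    x, y = x0, y0
    trajectory = [(x, y)]
    for _ in range(steps):
        dx = production * hill_repression(y, k, n) - decay * x
        dy = production * hill_activation(x, k, n) - decay * y
        x, y = x + dt * dx, y + dt * dy
        trajectory.append((x, y))
    return trajectory


if __name__ == "__main__":
    final_x, final_y = simulate_two_gene(0.1, 0.1)[-1]
    print(f"x = {final_x:.3f}, y = {final_y:.3f}")
```

The saturating Hill terms keep both expression levels between zero and `production / decay` when the simulation starts inside that range, which is one reason such functional forms are attractive as biological constraints.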
Affiliation(s)
- Intekhab Hossain
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Viola Fanfani
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rebekka Burkholz
- Helmholtz Center for Information Security (CISPA), Saarbrücken, Germany
35
|
He D, Soneson C, Patro R. Understanding and evaluating ambiguity in single-cell and single-nucleus RNA-sequencing. bioRxiv 2023:2023.01.04.522742. [PMID: 36711921 PMCID: PMC9881993 DOI: 10.1101/2023.01.04.522742] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 01/06/2023]
Abstract
Recently, a new modification has been proposed by Hjörleifsson and Sullivan et al. to the model used to classify the splicing status of reads (as spliced (mature), unspliced (nascent), or ambiguous) in single-cell and single-nucleus RNA-seq data. Here, we evaluate both the theoretical basis and practical implementation of the proposed method. The proposed method is highly-conservative, and therefore, unlikely to mischaracterize reads as spliced (mature) or unspliced (nascent) when they are not. However, we find that it leaves a large fraction of reads classified as ambiguous, and, in practice, allocates these ambiguous reads in an all-or-nothing manner, and differently between single-cell and single-nucleus RNA-seq data. Further, as implemented in practice, the ambiguous classification is implicit and based on the index against which the reads are mapped, which leads to several drawbacks compared to methods that consider both spliced (mature) and unspliced (nascent) mapping targets simultaneously - for example, the ability to use confidently assigned reads to rescue ambiguous reads based on shared UMIs and gene targets. Nonetheless, we show that these conservative assignment rules can be obtained directly in existing approaches simply by altering the set of targets that are indexed. To this end, we introduce the spliceu reference and show that its use with alevin-fry recapitulates the more conservative proposed classification. We also observe that, on experimental data, and under the proposed allocation rules for ambiguous UMIs, the difference between the proposed classification scheme and existing conventions appears much smaller than previously reported. We demonstrate the use of the new piscem index for mapping simultaneously against spliced (mature) and unspliced (nascent) targets, allowing classification against the full nascent and mature transcriptome in human or mouse in <3GB of memory. Finally, we discuss the potential of incorporating probabilistic evidence into the inference of splicing status, and suggest that it may provide benefits beyond what can be obtained from discrete classification of UMIs as splicing-ambiguous.
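The idea of rescuing ambiguous reads through shared UMIs and gene targets can be illustrated with a small toy example. The sketch below is a deliberately simplified, hypothetical rule and is not the algorithm used by alevin-fry, piscem, or the method under evaluation; the function names and the input layout (one tuple of gene, UMI, and compatible target kinds per read) are assumptions made for illustration.

```python
from collections import defaultdict

VALID_KINDS = {"spliced", "unspliced"}


def classify_read(target_kinds):
    """Classify one read from the kinds of targets it is compatible with."""
    kinds = set(target_kinds)
    if not kinds or not kinds <= VALID_KINDS:
        raise ValueError(f"unexpected target kinds: {sorted(kinds)}")
    # Compatible with both spliced and unspliced targets: cannot decide alone.
    if len(kinds) == 2:
        return "ambiguous"
    return next(iter(kinds))


def resolve_umis(reads):
    """Assign one status per (gene, UMI) group (hypothetical rescue rule).

    A group takes the status of its confidently classified reads when they
    all agree; groups with no confident read, or with conflicting confident
    reads, stay ambiguous.
    """
    seen = defaultdict(set)
    for gene, umi, target_kinds in reads:
        seen[(gene, umi)].add(classify_read(target_kinds))
    resolved = {}
    for key, statuses in seen.items():
        confident = statuses - {"ambiguous"}
        resolved[key] = confident.pop() if len(confident) == 1 else "ambiguous"
    return resolved


if __name__ == "__main__":
    example_reads = [
        ("g1", "AAA", ["spliced", "unspliced"]),  # ambiguous on its own
        ("g1", "AAA", ["unspliced"]),             # confident read, same UMI
        ("g1", "CCC", ["spliced", "unspliced"]),  # no confident partner
    ]
    print(resolve_umis(example_reads))
```

In this toy input the first read is ambiguous in isolation but shares a gene and UMI with a confidently unspliced read, so the group resolves to unspliced, whereas the `CCC` group has no confident read and remains ambiguous.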
Affiliation(s)
- Dongze He
- Department of Cell Biology and Molecular Genetics and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA
- Charlotte Soneson
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Rob Patro
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA