1
Sashittal P, Zhang RY, Law BK, Strzalkowski A, Schmidt H, Bolondi A, Chan MM, Raphael BJ. Inferring cell differentiation maps from lineage tracing data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.09.611835. [PMID: 39314473 PMCID: PMC11419031 DOI: 10.1101/2024.09.09.611835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
During development, multipotent cells differentiate through a hierarchy of increasingly restricted progenitor cell types until they realize specialized cell types. A cell differentiation map describes this hierarchy, and inferring these maps is an active area of research spanning traditional single marker lineage studies to data-driven trajectory inference methods on single-cell RNA-seq data. Recent high-throughput lineage tracing technologies profile lineages and cell types at scale, but current methods to infer cell differentiation maps from these data rely on simple models with restrictive assumptions about the developmental process. We introduce a mathematical framework for cell differentiation maps based on the concept of potency, and develop an algorithm, Carta, that infers an optimal cell differentiation map from single-cell lineage tracing data. The key insight in Carta is to balance the trade-off between the complexity of the cell differentiation map and the number of unobserved cell type transitions on the lineage tree. We show that Carta more accurately infers cell differentiation maps on both simulated and real data compared to existing methods. In models of mammalian trunk development and mouse hematopoiesis, Carta identifies important features of development that are not revealed by other methods including convergent differentiation of specialized cell types, progenitor differentiation dynamics, and the refinement of routes of differentiation via new intermediate progenitors.
Affiliation(s)
- Palash Sashittal
  - Dept. of Computer Science, Princeton University, Princeton; 08544 NJ, USA
- Richard Y. Zhang
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton; 08544 NJ, USA
- Benjamin K. Law
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton; 08544 NJ, USA
  - Dept. of Molecular Biology, Princeton University, Princeton; 08544 NJ, USA
- Henri Schmidt
  - Dept. of Computer Science, Princeton University, Princeton; 08544 NJ, USA
- Adriano Bolondi
  - Dept. of Genome Regulation, Max Planck Institute for Molecular Genetics; 14195 Berlin, Germany
- Michelle M. Chan
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton; 08544 NJ, USA
  - Dept. of Molecular Biology, Princeton University, Princeton; 08544 NJ, USA
2
Shu C, Street K, Breton CV, Bastain TM, Wilson ML. A review of single-cell transcriptomics and epigenomics studies in maternal and child health. Epigenomics 2024; 16:775-793. [PMID: 38709139 PMCID: PMC11318716 DOI: 10.1080/17501911.2024.2343276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/11/2024] [Indexed: 05/07/2024]
Abstract
Single-cell sequencing technologies enhance our understanding of cellular dynamics throughout pregnancy. We outlined the workflow of single-cell sequencing techniques and reviewed single-cell studies in maternal and child health. We conducted a literature review of single-cell studies on maternal and child health using PubMed. We summarized the findings from 16 single-cell atlases of the human and mammalian placenta across gestational stages and 31 single-cell studies on maternal exposures and complications including infection, obesity, diet, gestational diabetes, pre-eclampsia, environmental exposure and preterm birth. Single-cell studies provide insights on novel cell types in placenta and cell type-specific marks associated with maternal exposures and complications.
Affiliation(s)
- Chang Shu
  - Center for Genetic Epidemiology, Division of Epidemiology & Genetics, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Kelly Street
  - Division of Biostatistics, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Carrie V Breton
  - Division of Environmental Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Theresa M Bastain
  - Division of Environmental Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Melissa L Wilson
  - Division of Disease Prevention, Policy, & Global Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
3
Xu F, Hu H, Lin H, Lu J, Cheng F, Zhang J, Li X, Shuai J. scGIR: deciphering cellular heterogeneity via gene ranking in single-cell weighted gene correlation networks. Brief Bioinform 2024; 25:bbae091. [PMID: 38487851 PMCID: PMC10940817 DOI: 10.1093/bib/bbae091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 02/08/2024] [Accepted: 02/15/2024] [Indexed: 03/18/2024]
Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for investigating cellular heterogeneity through high-throughput analysis of individual cells. Nevertheless, challenges arise from prevalent sequencing dropout events and noise effects, impacting subsequent analyses. Here, we introduce a novel algorithm, Single-cell Gene Importance Ranking (scGIR), which utilizes a single-cell gene correlation network to evaluate gene importance. The algorithm transforms single-cell sequencing data into a robust gene correlation network through statistical independence, with correlation edges weighted by gene expression levels. We then constructed a random walk model on the resulting weighted gene correlation network to rank the importance of genes. Our analysis of gene importance using the PageRank algorithm across nine authentic scRNA-seq datasets indicates that scGIR can effectively surmount technical noise, enabling the identification of cell types and inference of developmental trajectories. We demonstrated that the edges of gene correlation, weighted by expression, play a critical role in enhancing the algorithm's performance. Our findings emphasize that scGIR outperforms in enhancing the clustering of cell subtypes, reverse identifying differentially expressed marker genes, and uncovering genes with potential differential importance. Overall, we proposed a promising method capable of extracting more information from single-cell RNA sequencing datasets, potentially shedding new light on cellular processes and disease mechanisms.
Affiliation(s)
- Fei Xu
  - Department of Physics, Anhui Normal University, Wuhu 241002, China
  - Wenzhou Institute and Wenzhou Key Laboratory of Biophysics, University of Chinese Academy of Sciences, Wenzhou 325001, China
- Huan Hu
  - Institute of Applied Genomics, Fuzhou University, Fuzhou 350108, China
- Hai Lin
  - Wenzhou Institute and Wenzhou Key Laboratory of Biophysics, University of Chinese Academy of Sciences, Wenzhou 325001, China
- Jun Lu
  - Department of Physics, Anhui Normal University, Wuhu 241002, China
  - School of Medical Imageology, Wannan Medical College, Wuhu 241002, China
- Feng Cheng
  - Department of Physics, and Fujian Provincial Key Lab for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Jiqian Zhang
  - Department of Physics, Anhui Normal University, Wuhu 241002, China
- Xiang Li
  - Department of Physics, and Fujian Provincial Key Lab for Soft Functional Materials Research, Xiamen University, Xiamen 361005, China
- Jianwei Shuai
  - Wenzhou Institute and Wenzhou Key Laboratory of Biophysics, University of Chinese Academy of Sciences, Wenzhou 325001, China
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), Wenzhou 325001, China
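The scGIR abstract above describes ranking genes by a random walk (PageRank) on a weighted gene correlation network. As a rough illustration of that general idea only, the sketch below runs a plain power-iteration PageRank on a small weighted graph. It is not the scGIR implementation: the function name `pagerank`, the gene labels, and the toy edge weights are all hypothetical, and how scGIR actually builds and weights its network is not specified here.

```python
def pagerank(weights, damping=0.85, tol=1e-10, max_iter=1000):
    """Power-iteration PageRank on a weighted graph.

    weights[i][j] is the non-negative weight of the edge i -> j.
    Returns a list of scores that sums to 1.
    """
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Nodes with no outgoing weight spread their mass uniformly.
        dangling = sum(rank[i] for i in range(n) if out_sums[i] == 0)
        new_rank = [(1.0 - damping) / n + damping * dangling / n] * n
        for i in range(n):
            if out_sums[i] == 0:
                continue
            for j in range(n):
                if weights[i][j]:
                    # A walker at i moves to j with probability w_ij / sum_k w_ik.
                    new_rank[j] += damping * rank[i] * weights[i][j] / out_sums[i]
        converged = sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy network: symmetric weights between four made-up genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
weights = [
    [0.0, 2.0, 1.0, 0.0],
    [2.0, 0.0, 3.0, 0.0],
    [1.0, 3.0, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.0],
]
scores = pagerank(weights)
ranking = sorted(zip(genes, scores), key=lambda pair: pair[1], reverse=True)
for gene, score in ranking:
    print(f"{gene}: {score:.4f}")
```

In this toy graph the weakly connected `geneD` ends up with the lowest score, which is the qualitative behaviour one would expect from a weight-aware random walk.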
4
Kim IS. DNA Barcoding Technology for Lineage Recording and Tracing to Resolve Cell Fate Determination. Cells 2023; 13:27. [PMID: 38201231 PMCID: PMC10778210 DOI: 10.3390/cells13010027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 12/18/2023] [Accepted: 12/20/2023] [Indexed: 01/12/2024]
Abstract
In various biological contexts, cells receive signals and stimuli that prompt them to change their current state, leading to transitions into a future state. This change underlies the processes of development, tissue maintenance, immune response, and the pathogenesis of various diseases. Following the path of cells from their initial identity to their current state reveals how cells adapt to their surroundings and undergo transformations to attain adjusted cellular states. DNA-based molecular barcoding technology enables the documentation of a phylogenetic tree and the deterministic events of cell lineages, providing the mechanisms and timing of cell lineage commitment that can either promote homeostasis or lead to cellular dysregulation. This review comprehensively presents recently emerging molecular recording technologies that utilize CRISPR/Cas systems, base editing, recombination, and innate variable sequences in the genome. Detailing their underlying principles, applications, and constraints paves the way for the lineage tracing of every cell within complex biological systems, encompassing the hidden steps and intermediate states of organism development and disease progression.
Affiliation(s)
- Ik Soo Kim
  - Department of Microbiology, Gachon University College of Medicine, Incheon 21999, Republic of Korea
5
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. Cell Syst 2023; 14:822-843.e22. [PMID: 37751736 PMCID: PMC10725240 DOI: 10.1016/j.cels.2023.08.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/29/2023] [Revised: 08/16/2023] [Accepted: 08/25/2023] [Indexed: 09/28/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- John J Vastola
  - Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA.
6
Groves SM, Quaranta V. Quantifying cancer cell plasticity with gene regulatory networks and single-cell dynamics. FRONTIERS IN NETWORK PHYSIOLOGY 2023; 3:1225736. [PMID: 37731743 PMCID: PMC10507267 DOI: 10.3389/fnetp.2023.1225736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 08/25/2023] [Indexed: 09/22/2023]
Abstract
Phenotypic plasticity of cancer cells can lead to complex cell state dynamics during tumor progression and acquired resistance. Highly plastic stem-like states may be inherently drug-resistant. Moreover, cell state dynamics in response to therapy allow a tumor to evade treatment. In both scenarios, quantifying plasticity is essential for identifying high-plasticity states or elucidating transition paths between states. Currently, methods to quantify plasticity tend to focus on 1) quantification of quasi-potential based on the underlying gene regulatory network dynamics of the system; or 2) inference of cell potency based on trajectory inference or lineage tracing in single-cell dynamics. Here, we explore both of these approaches and associated computational tools. We then discuss implications of each approach to plasticity metrics, and relevance to cancer treatment strategies.
Affiliation(s)
- Sarah M. Groves
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, United States
- Vito Quaranta
  - Department of Pharmacology, Vanderbilt University, Nashville, TN, United States
  - Department of Biochemistry, Vanderbilt University, Nashville, TN, United States
7
Gorin G, Vastola JJ, Pachter L. Studying stochastic systems biology of the cell with single-cell genomics data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.17.541250. [PMID: 37292934 PMCID: PMC10245677 DOI: 10.1101/2023.05.17.541250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Recent experimental developments in genome-wide RNA quantification hold considerable promise for systems biology. However, rigorously probing the biology of living cells requires a unified mathematical framework that accounts for single-molecule biological stochasticity in the context of technical variation associated with genomics assays. We review models for a variety of RNA transcription processes, as well as the encapsulation and library construction steps of microfluidics-based single-cell RNA sequencing, and present a framework to integrate these phenomena by the manipulation of generating functions. Finally, we use simulated scenarios and biological data to illustrate the implications and applications of the approach.
Affiliation(s)
- Gennady Gorin
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, 91125
- John J. Vastola
  - Department of Neurobiology, Harvard Medical School, Boston, MA, 02115
- Lior Pachter
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125
  - Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, 91125
8
Fang S, Chen B, Zhang Y, Sun H, Liu L, Liu S, Li Y, Xu X. Computational Approaches and Challenges in Spatial Transcriptomics. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022:S1672-0229(22)00129-2. [PMID: 36252814 PMCID: PMC10372921 DOI: 10.1016/j.gpb.2022.10.001] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 02/06/2022] [Revised: 09/08/2022] [Accepted: 10/09/2022] [Indexed: 01/19/2023]
Abstract
The development of spatial transcriptomics (ST) technologies has transformed genetic research from a single-cell data level to a two-dimensional spatial coordinate system and facilitated the study of the composition and function of various cell subsets in different environments and organs. The large-scale data generated by these ST technologies, which contain spatial gene expression information, have elicited the need for spatially resolved approaches to meet the requirements of computational and biological data interpretation. These requirements include dealing with the explosive growth of data to determine the cell-level and gene-level expression, correcting the inner batch effect and loss of expression to improve the data quality, conducting efficient interpretation and in-depth knowledge mining both at the single-cell and tissue-wide levels, and conducting multi-omics integration analysis to provide an extensible framework toward the in-depth understanding of biological processes. However, algorithms designed specifically for ST technologies to meet these requirements are still in their infancy. Here, we review computational approaches to these problems in light of corresponding issues and challenges, and present forward-looking insights into algorithm development.
9
Minne M, Ke Y, Saura-Sanchez M, De Rybel B. Advancing root developmental research through single-cell technologies. CURRENT OPINION IN PLANT BIOLOGY 2022; 65:102113. [PMID: 34562694 PMCID: PMC7611778 DOI: 10.1016/j.pbi.2021.102113] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/21/2021] [Revised: 08/09/2021] [Accepted: 08/13/2021] [Indexed: 06/12/2023]
Abstract
Single-cell RNA-sequencing has greatly increased the spatiotemporal resolution of root transcriptomics data, but we are still only scratching the surface of its full potential. Despite the challenges that remain in the field, the orderly aligned structure of the Arabidopsis root meristem makes it specifically suitable for lineage tracing and trajectory analysis. These methods will become even more potent by increasing resolution and specificity using tissue-specific single-cell RNA-sequencing and spatial transcriptomics. Feeding multiple single-cell omics data sets into single-cell gene regulatory networks will accelerate the discovery of regulators of root development in multiple species. By providing transcriptome atlases for virtually any species, single-cell technologies could tempt many root developmental biologists to move beyond the comfort of the well-known Arabidopsis root meristem.
Affiliation(s)
- Max Minne
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Technologiepark 71, 9052, Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Yuji Ke
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Technologiepark 71, 9052, Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Maite Saura-Sanchez
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Technologiepark 71, 9052, Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium
- Bert De Rybel
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Technologiepark 71, 9052, Ghent, Belgium; VIB Center for Plant Systems Biology, Technologiepark 71, 9052, Ghent, Belgium.
10
Xie J, Yin Y, Wang J. TIPD: A Probability Distribution-Based Method for Trajectory Inference from Single-Cell RNA-Seq Data. Interdiscip Sci 2021; 13:652-665. [PMID: 34109565 DOI: 10.1007/s12539-021-00445-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2021] [Revised: 05/28/2021] [Accepted: 06/01/2021] [Indexed: 11/25/2022]
Abstract
Single-cell RNA-seq technology provides an unprecedented opportunity to allow researchers to study the biological heterogeneity during cell differentiation and development with higher resolution. Although many computational methods have been proposed to infer cell lineages from single-cell RNA-seq data, constructing accurate cell trajectories remains a challenge. We develop a novel trajectory inference method based on probability distribution (TIPD) to describe the heterogeneity of a cell population. TIPD combines signalling entropy and clustering results of the gene expression profile to describe the probability distributions of heterogeneous states in a cell population. It does not require external knowledge to determine the direction of the differentiation trajectories, so its application is not limited by the annotations of the data set. We also propose a new distance metric to measure the distance between the probability distributions of the identified heterogeneous states. On this distance matrix, a minimum spanning tree (MST) is built to reorganize the order of cell clusters. The constructed MST is calculated based on systems-level information, so it is consistent with the real biological process. We validated our method on four previously published single-cell RNA-seq data sets including linear and branching structures. The results showed that TIPD successfully reconstructed the differentiation trajectories that are highly consistent with the known differentiation trajectories and outperformed the other four state-of-the-art methods under different assessment criteria.
Affiliation(s)
- Jiang Xie
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Yiting Yin
  - School of Computer Engineering and Science, Shanghai University, Shanghai, China
- Jiao Wang
  - School of Life Sciences, Shanghai University, Shanghai, China.
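The TIPD abstract above mentions building a minimum spanning tree (MST) on a matrix of distances between cluster-level probability distributions. The sketch below shows only the generic MST step, using Prim's algorithm on a small symmetric distance matrix. The abstract does not define TIPD's distance metric, so the matrix values and the function name `minimum_spanning_tree` are hypothetical placeholders rather than anything taken from the paper.

```python
def minimum_spanning_tree(dist):
    """Prim's algorithm on a symmetric distance matrix.

    Returns the tree as a list of (parent, child, distance) edges,
    grown outward from node 0.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known link from each node to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the not-yet-included node that is closest to the current tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        # Update the cheapest links now that u is part of the tree.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


# Hypothetical distances between four cell clusters (placeholder values).
dist = [
    [0.0, 1.0, 4.0, 6.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 3.0],
    [6.0, 5.0, 3.0, 0.0],
]
tree = minimum_spanning_tree(dist)
for a, b, d in tree:
    print(f"cluster {a} -- cluster {b} (distance {d})")
```

For these placeholder distances the tree is a simple chain 0, 1, 2, 3, which corresponds to a linear ordering of clusters; a branching trajectory would show up as a node with more than two tree neighbours.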