1
Hossain I, Fanfani V, Fischer J, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. Genome Biol 2024; 25:127. [PMID: 38773638 PMCID: PMC11106922 DOI: 10.1186/s13059-024-03264-0] [Received: 05/05/2023] [Accepted: 04/30/2024] [Indexed: 05/24/2024] Open
Abstract
BACKGROUND: Gene regulatory network (GRN) models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such gene regulatory ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the underlying GRN governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede either scalability, explainability, or both.
RESULTS: We developed PHOENIX, a modeling framework based on neural ordinary differential equations (NeuralODEs) and Hill-Langmuir kinetics, that overcomes limitations of other methods by flexibly incorporating prior domain knowledge and biological constraints to promote sparse, biologically interpretable representations of GRN ODEs. We tested the accuracy of PHOENIX in a series of in silico experiments, benchmarking it against several currently used tools. We demonstrated PHOENIX's flexibility by modeling regulation of oscillating expression profiles obtained from synchronized yeast cells. We also assessed the scalability of PHOENIX by modeling genome-scale GRNs for breast cancer samples ordered in pseudotime and for B cells treated with Rituximab.
CONCLUSIONS: PHOENIX uses a combination of user-defined prior knowledge and functional forms from systems biology to encode biological "first principles" as soft constraints on the GRN, allowing us to predict subsequent gene expression patterns in a biologically explainable manner.
Affiliation(s)
- Viola Fanfani
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Jonas Fischer
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rebekka Burkholz
- CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
2
Hong T, Xing J. Data- and theory-driven approaches for understanding paths of epithelial-mesenchymal transition. Genesis 2024; 62:e23591. [PMID: 38553870 PMCID: PMC11017362 DOI: 10.1002/dvg.23591] [Received: 01/02/2024] [Revised: 02/16/2024] [Accepted: 03/16/2024] [Indexed: 04/02/2024]
Abstract
Reversible transitions between epithelial and mesenchymal cell states are a crucial form of epithelial plasticity for development and disease progression. Recent experimental data and mechanistic models showed multiple intermediate epithelial-mesenchymal transition (EMT) states as well as trajectories of EMT underpinned by complex gene regulatory networks. In this review, we summarize recent progress in quantifying EMT and characterizing EMT paths with computational methods and quantitative experiments, including omics-level measurements. We provide perspectives on how these studies can help relate fundamental cell biology to physiological and pathological outcomes of EMT.
Affiliation(s)
- Tian Hong
- Department of Biochemistry & Cellular and Molecular Biology, The University of Tennessee, Knoxville, Knoxville, TN, USA
- Jianhua Xing
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, USA
- UPMC-Hillman Cancer Center, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Physics and Astronomy, University of Pittsburgh, Pittsburgh, PA, USA
3
Barcenas M, Bocci F, Nie Q. Tipping points in epithelial-mesenchymal lineages from single-cell transcriptomics data. Biophys J 2024:S0006-3495(24)00201-7. [PMID: 38504523 DOI: 10.1016/j.bpj.2024.03.021] [Received: 12/11/2023] [Revised: 02/09/2024] [Accepted: 03/15/2024] [Indexed: 03/21/2024] Open
Abstract
Understanding cell fate decision-making during complex biological processes is an open challenge that is now aided by high-resolution single-cell sequencing technologies. Specifically, it remains challenging to identify and characterize transition states corresponding to "tipping points" whereby cells commit to new cell states. Here, we present a computational method that takes advantage of single-cell transcriptomics data to infer the stability and gene regulatory networks (GRNs) along cell lineages. Our method uses the unspliced and spliced counts from single-cell RNA sequencing data and cell ordering along lineage trajectories to train an RNA splicing multivariate model, from which cell-state stability along the lineage is inferred based on spectral analysis of the model's Jacobian matrix. Moreover, the model infers the RNA cross-species interactions resulting in GRNs and their variation along the cell lineage. When applied to epithelial-mesenchymal transition in ovarian and lung cancer-derived cell lines, our model predicts a saddle-node transition between the epithelial and mesenchymal states passing through an unstable, intermediate cell state. Furthermore, we show that the underlying GRN controlling epithelial-mesenchymal transition rearranges during the transition, resulting in denser and less modular networks in the intermediate state. Overall, our method represents a flexible tool to study cell lineages with a combination of theory-driven modeling and single-cell transcriptomics data.
Affiliation(s)
- Manuel Barcenas
- Department of Mathematics, University of California Irvine, Irvine, California
- Federico Bocci
- Department of Mathematics, University of California Irvine, Irvine, California; NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, California.
- Qing Nie
- Department of Mathematics, University of California Irvine, Irvine, California; NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, California.
4
Hossain I, Fanfani V, Fischer J, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. bioRxiv 2024:2023.02.24.529835. [PMID: 36909563 PMCID: PMC10002636 DOI: 10.1101/2023.02.24.529835] [Indexed: 06/18/2023]
Abstract
Modeling dynamics of gene regulatory networks using ordinary differential equations (ODEs) allows a deeper understanding of disease progression and response to therapy, thus aiding in intervention optimization. Although there exist methods to infer regulatory ODEs, these are generally limited to small networks, rely on dimensional reduction, or impose non-biological parametric restrictions, all of which impede scalability and explainability. PHOENIX is a neural ODE framework incorporating prior domain knowledge as soft constraints to infer sparse, biologically interpretable dynamics. Extensive experiments on simulated and real data demonstrate PHOENIX's unique ability to learn key regulatory dynamics while scaling to the whole genome.
5
Sha Y, Qiu Y, Zhou P, Nie Q. Reconstructing growth and dynamic trajectories from single-cell transcriptomics data. Nat Mach Intell 2023; 6:25-39. [PMID: 38274364 PMCID: PMC10805654 DOI: 10.1038/s42256-023-00763-w] [Received: 02/08/2023] [Accepted: 10/25/2023] [Indexed: 01/27/2024]
Abstract
Time-series single-cell RNA sequencing (scRNA-seq) datasets provide unprecedented opportunities to learn dynamic processes of cellular systems. Due to the destructive nature of sequencing, it remains challenging to link the scRNA-seq snapshots sampled at different time points. Here we present TIGON, a dynamic, unbalanced optimal transport algorithm that reconstructs dynamic trajectories and population growth simultaneously as well as the underlying gene regulatory network from multiple snapshots. To tackle the high-dimensional optimal transport problem, we introduce a deep learning method using a dimensionless formulation based on the Wasserstein-Fisher-Rao (WFR) distance. TIGON is evaluated on simulated data and compared with existing methods for its robustness and accuracy in predicting cell state transition and cell population growth. Using three scRNA-seq datasets, we show the importance of growth in the temporal inference, TIGON's capability in reconstructing gene expression at unmeasured time points and its applications to temporal gene regulatory networks and cell-cell communication inference.
Affiliation(s)
- Yutong Sha
- Department of Mathematics, University of California, Irvine, Irvine, CA USA
- Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, MI USA
- Peijie Zhou
- Department of Mathematics, University of California, Irvine, Irvine, CA USA
- Qing Nie
- Department of Mathematics, University of California, Irvine, Irvine, CA USA
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA USA
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, CA USA
6
Yampolskaya M, Herriges MJ, Ikonomou L, Kotton DN, Mehta P. scTOP: physics-inspired order parameters for cellular identification and visualization. Development 2023; 150:dev201873. [PMID: 37756586 PMCID: PMC10629677 DOI: 10.1242/dev.201873] [Received: 04/11/2023] [Accepted: 09/11/2023] [Indexed: 09/29/2023]
Abstract
Advances in single-cell RNA sequencing provide an unprecedented window into cellular identity. The abundance of data requires new theoretical and computational frameworks to analyze the dynamics of differentiation and integrate knowledge from cell atlases. We present 'single-cell Type Order Parameters' (scTOP): a statistical, physics-inspired approach for quantifying cell identity given a reference basis of cell types. scTOP can accurately classify cells, visualize developmental trajectories and assess the fidelity of engineered cells. Importantly, scTOP does this without feature selection, statistical fitting or dimensional reduction (e.g. uniform manifold approximation and projection, principal component analysis, etc.). We illustrate the power of scTOP using human and mouse datasets. By reanalyzing mouse lung data, we characterize a transient hybrid alveolar type 1/alveolar type 2 cell population. Visualizations of lineage tracing hematopoiesis data using scTOP confirm that a single clone can give rise to multiple mature cell types. We assess the transcriptional similarity between endogenous and donor-derived cells in the context of murine pulmonary cell transplantation. Our results suggest that physics-inspired order parameters can be an important tool for understanding differentiation and characterizing engineered cells. scTOP is available as an easy-to-use Python package.
Affiliation(s)
- Michael J. Herriges
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
- Laertis Ikonomou
- Department of Oral Biology, University at Buffalo, The State University of New York, Buffalo, NY 14215, USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, University at Buffalo, The State University of New York, Buffalo, NY 14215, USA
- Darrell N. Kotton
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA
- Pankaj Mehta
- Department of Physics, Boston University, Boston, MA 02215, USA
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA 02118, USA
- Faculty of Computing and Data Science, Boston University, Boston, MA 02215, USA
- Biological Design Center, Boston University, Boston, MA 02215, USA
7
Hossain I, Fanfani V, Quackenbush J, Burkholz R. Biologically informed NeuralODEs for genome-wide regulatory dynamics. Research Square 2023:rs.3.rs-2675584. [PMID: 36993392 PMCID: PMC10055646 DOI: 10.21203/rs.3.rs-2675584/v1] [Indexed: 06/19/2023]
Abstract
Models that are formulated as ordinary differential equations (ODEs) can accurately explain temporal gene expression patterns and promise to yield new insights into important cellular processes, disease progression, and intervention design. Learning such ODEs is challenging, since we want to predict the evolution of gene expression in a way that accurately encodes the causal gene-regulatory network (GRN) governing the dynamics and the nonlinear functional relationships between genes. Most widely used ODE estimation methods either impose too many parametric restrictions or are not guided by meaningful biological insights, both of which impede scalability and/or explainability. To overcome these limitations, we developed PHOENIX, a modeling framework based on neural ordinary differential equations (NeuralODEs) and Hill-Langmuir kinetics, that can flexibly incorporate prior domain knowledge and biological constraints to promote sparse, biologically interpretable representations of ODEs. We test the accuracy of PHOENIX in a series of in silico experiments, benchmarking it against several currently used tools for ODE estimation. We also demonstrate PHOENIX's flexibility by studying oscillating expression data from synchronized yeast cells and assess its scalability by modelling genome-scale breast cancer expression for samples ordered in pseudotime. Finally, we show how the combination of user-defined prior knowledge and functional forms from systems biology allows PHOENIX to encode key properties of the underlying GRN, and subsequently predict expression patterns in a biologically explainable way.
Affiliation(s)
- Intekhab Hossain
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Viola Fanfani
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Rebekka Burkholz
- Helmholtz Center for Information Security (CISPA), Saarbrücken, Germany
8
Hettinger ZR, Hu S, Mamiya H, Sahu A, Iijima H, Wang K, Gilmer G, Miller A, Nasello G, D'Amore A, Vorp DA, Rando TA, Xing J, Ambrosio F. Dynamical modeling reveals RNA decay mediates the effect of matrix stiffness on aged muscle stem cell fate. bioRxiv 2023:2023.02.24.529950. [PMID: 36865124 PMCID: PMC9980169 DOI: 10.1101/2023.02.24.529950] [Indexed: 04/16/2023]
Abstract
Loss of muscle stem cell (MuSC) self-renewal with aging reflects a combination of influences from the intracellular (e.g., post-transcriptional modifications) and extracellular (e.g., matrix stiffness) environment. Whereas conventional single cell analyses have revealed valuable insights into factors contributing to impaired self-renewal with age, most are limited by static measurements that fail to capture nonlinear dynamics. Using bioengineered matrices mimicking the stiffness of young and old muscle, we showed that while young MuSCs were unaffected by aged matrices, old MuSCs were phenotypically rejuvenated by young matrices. Dynamical modeling of RNA velocity vector fields in silico revealed that soft matrices promoted a self-renewing state in old MuSCs by attenuating RNA decay. Vector field perturbations demonstrated that the effects of matrix stiffness on MuSC self-renewal could be circumvented by fine-tuning the expression of the RNA decay machinery. These results demonstrate that post-transcriptional dynamics dictate the negative effect of aged matrices on MuSC self-renewal.
9
Yampolskaya M, Herriges M, Ikonomou L, Kotton D, Mehta P. scTOP: physics-inspired order parameters for cellular identification and visualization. bioRxiv 2023:2023.01.25.525581. [PMID: 36747864 PMCID: PMC9900792 DOI: 10.1101/2023.01.25.525581] [Indexed: 01/27/2023]
Abstract
Advances in single-cell RNA-sequencing (scRNA-seq) provide an unprecedented window into cellular identity. The increasing abundance of data requires new theoretical and computational frameworks for understanding cell fate determination, accurately classifying cell fates from expression data, and integrating knowledge from cell atlases. Here, we present single-cell Type Order Parameters (scTOP): a statistical-physics-inspired approach for constructing "order parameters" for cell fate given a reference basis of cell types. scTOP can quickly and accurately classify cells at a single-cell resolution, generate interpretable visualizations of developmental trajectories, and assess the fidelity of engineered cells. Importantly, scTOP does this without using feature selection, statistical fitting, or dimensional reduction (e.g., UMAP, PCA, etc.). We illustrate the power of scTOP utilizing a wide variety of human and mouse datasets (both in vivo and in vitro). By reanalyzing mouse lung alveolar development data, we characterize a transient perinatal hybrid alveolar type 1/alveolar type 2 (AT1/AT2) cell population that disappears by 15 days post-birth and show that it is transcriptionally distinct from previously identified adult AT2-to-AT1 transitional cell types. Visualizations of lineage tracing data on hematopoiesis using scTOP confirm that a single clone can give rise to as many as three distinct differentiated cell types. We also show how scTOP can quantitatively assess the transcriptional similarity between endogenous and transplanted cells in the context of murine pulmonary cell transplantation. Finally, we provide an easy-to-use Python implementation of scTOP. Our results suggest that physics-inspired order parameters can be an important tool for understanding development and characterizing engineered cells.
Affiliation(s)
- Michael Herriges
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Laertis Ikonomou
- Department of Oral Biology, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Darrell Kotton
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA, USA
- The Pulmonary Center and Department of Medicine, Boston University School of Medicine, Boston, MA, USA
- Pankaj Mehta
- Department of Physics, Boston University, Boston, MA 02215, USA
- Center for Regenerative Medicine of Boston University and Boston Medical Center, Boston, MA, USA
- Faculty of Computing and Data Science, Boston University, Boston, MA 02215, USA
- Biological Design Center, Boston University, Boston, MA 02215, USA
10
Alderfer S, Sun J, Tahtamouni L, Prasad A. Morphological signatures of actin organization in single cells accurately classify genetic perturbations using CNNs with transfer learning. Soft Matter 2022; 18:8342-8354. [PMID: 36222484 DOI: 10.1039/d2sm01000c] [Indexed: 06/16/2023]
Abstract
The actin cytoskeleton plays essential roles in countless cell processes, from cell division to migration to signaling. In cancer cells, cytoskeletal dynamics, cytoskeletal filament organization, and overall cell morphology are known to be altered substantially. We hypothesize that actin fiber organization and cell shape may carry specific signatures of genetic or signaling perturbations. We used convolutional neural networks (CNNs) on a small fluorescence microscopy image dataset of retinal pigment epithelial (RPE) cells and triple-negative breast cancer (TNBC) cells for identifying morphological signatures in cancer cells. Using a transfer learning approach, CNNs could be trained to accurately distinguish between normal and oncogenically transformed RPE cells with an accuracy of about 95% or better at the single cell level. Furthermore, CNNs could distinguish transformed cell lines differing by an oncogenic mutation from each other and could also detect knockdown of cofilin in TNBC cells, indicating that each single oncogenic mutation or cytoskeletal perturbation produces a unique signature in actin morphology. Application of the Local Interpretable Model-Agnostic Explanations (LIME) method for visually interpreting the CNN results revealed features of the global actin structure relevant for some cells and classification tasks. Interestingly, many of these features were supported by previous biological observation. Actin fiber organization is thus a sensitive marker for cell identity, and identification of its perturbations could be very useful for assaying cell phenotypes, including disease states.
Affiliation(s)
- Sydney Alderfer
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523, USA.
- School of Biomedical Engineering, Colorado State University, Fort Collins, CO 80523, USA
- Jiangyu Sun
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Lubna Tahtamouni
- Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Department of Biology and Biotechnology, The Hashemite University, Zarqa, Jordan
- Ashok Prasad
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523, USA.
- School of Biomedical Engineering, Colorado State University, Fort Collins, CO 80523, USA