1
Hirsch M, Pal S, Mehrabadi FR, Malikic S, Gruen C, Sassano A, Pérez-Guijarro E, Merlino G, Sahinalp C, Molloy EK, Day CP, Przytycka TM. Stochastic modelling of single-cell gene expression adaptation reveals non-genomic contribution to evolution of tumor subclones. bioRxiv 2024:2024.04.17.588869. [PMID: 38712152 PMCID: PMC11071284 DOI: 10.1101/2024.04.17.588869] [Indexed: 05/08/2024]
Abstract
Cancer progression is an evolutionary process driven by the selection of cells adapted to gain growth advantage. We present the first formal study on the adaptation of gene expression in subclonal evolution. We model evolutionary changes in gene expression as stochastic Ornstein-Uhlenbeck processes, jointly leveraging the evolutionary history of subclones and single-cell expression data. Applying our model to sublines derived from single cells of a mouse melanoma revealed that sublines with distinct phenotypes are underlined by different patterns of gene expression adaptation, indicating non-genetic mechanisms of cancer evolution. Interestingly, sublines previously observed to be resistant to anti-CTLA-4 treatment showed adaptive expression of genes related to invasion and non-canonical Wnt signaling, whereas sublines that responded to treatment showed adaptive expression of genes related to proliferation and canonical Wnt signaling. Our results suggest that clonal phenotypes emerge as the result of specific adaptivity patterns of gene expression.
Affiliation(s)
- M.G. Hirsch
  - National Library of Medicine, NIH, Bethesda, Maryland, USA
  - Department of Computer Science, University of Maryland, College Park, Maryland, USA
- Soumitra Pal
  - Neurobiology Neurodegeneration and Repair Lab, National Eye Institute, NIH, Bethesda, Maryland, USA
- Farid Rashidi Mehrabadi
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
  - Laboratory of Human Carcinogenesis, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
- Salem Malikic
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
- Charli Gruen
  - Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
- Antonella Sassano
  - Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
- Eva Pérez-Guijarro
  - Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
  - Instituto de Investigaciones Biomédicas Sols-Morreale, Consejo Superior de Investigaciones Científicas, Universidad Autónoma de Madrid (IIBM, CSIC-UAM), Madrid, Spain
- Glenn Merlino
  - Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
- Cenk Sahinalp
  - Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
- Erin K. Molloy
  - Department of Computer Science, University of Maryland, College Park, Maryland, USA
  - University of Maryland Institute for Advanced Computer Studies, College Park, Maryland, USA
- Chi-Ping Day
  - Laboratory of Cancer Biology and Genetics, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, Maryland, USA
2
Mah JL, Dunn CW. Cell type evolution reconstruction across species through cell phylogenies of single-cell RNA sequencing data. Nat Ecol Evol 2024; 8:325-338. [PMID: 38182680 DOI: 10.1038/s41559-023-02281-9] [Received: 06/06/2023] [Accepted: 11/16/2023] [Indexed: 01/07/2024]
Abstract
The origin and evolution of cell types has emerged as a key topic in evolutionary biology. Driven by rapidly accumulating single-cell datasets, recent attempts to infer cell type evolution have largely been limited to pairwise comparisons because we lack approaches to build cell phylogenies using model-based approaches. Here we approach the challenges of applying explicit phylogenetic methods to single-cell data by using principal components as phylogenetic characters. We infer a cell phylogeny from a large, comparative single-cell dataset of eye cells from five distantly related mammals. Robust cell type clades enable us to provide a phylogenetic, rather than phenetic, definition of cell type, allowing us to forgo marker genes and phylogenetically classify cells by topology. We further observe evolutionary relationships between diverse vessel endothelia and identify the myelinating and non-myelinating Schwann cells as sister cell types. Finally, we examine principal component loadings and describe the gene expression dynamics underlying the function and identity of cell type clades that have been conserved across the five species. A cell phylogeny provides a rigorous framework towards investigating the evolutionary history of cells and will be critical to interpret comparative single-cell datasets that aim to ask fundamental evolutionary questions.
Affiliation(s)
- Jasmine L Mah
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
- Casey W Dunn
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
3
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. Genome Biol Evol 2023; 15:evad211. [PMID: 38000902 PMCID: PMC10709115 DOI: 10.1093/gbe/evad211] [Received: 01/08/2023] [Revised: 11/09/2023] [Accepted: 11/17/2023] [Indexed: 11/26/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred models for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best-fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
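To illustrate what "constrained around an optimum" means for an Ornstein-Uhlenbeck (OU) model, the sketch below simulates a single OU trajectory using the exact Gaussian transition of the process. It is not taken from the paper: the function names and the parameter names (`theta` for the optimum, `alpha` for the strength of the pull, `sigma` for the noise scale) are assumptions chosen for illustration, and the model is a single trajectory rather than a process evolving along a phylogeny.

```python
import math
import random


def simulate_ou(x0, theta, alpha, sigma, dt, n_steps, rng):
    """Simulate one OU trajectory dX = alpha*(theta - X) dt + sigma dW.

    Uses the exact transition distribution over a step of length dt,
    so there is no discretization error. All names are illustrative.
    """
    decay = math.exp(-alpha * dt)
    # Standard deviation of the Gaussian noise added over one step.
    step_sd = math.sqrt(sigma ** 2 / (2 * alpha) * (1 - decay ** 2))
    xs = [x0]
    for _ in range(n_steps):
        xs.append(theta + (xs[-1] - theta) * decay + rng.gauss(0.0, step_sd))
    return xs


def stationary_variance(alpha, sigma):
    """Long-run variance of the OU process around the optimum."""
    return sigma ** 2 / (2 * alpha)


rng = random.Random(0)
path = simulate_ou(x0=5.0, theta=1.0, alpha=2.0, sigma=0.5,
                   dt=0.01, n_steps=5000, rng=rng)
# Discard the initial transient before averaging.
tail = path[1000:]
tail_mean = sum(tail) / len(tail)
```

Starting far from the optimum (`x0 = 5.0`), the trajectory is pulled back toward `theta` and then fluctuates around it with variance close to `sigma**2 / (2 * alpha)`. Under Brownian motion (the limit `alpha -> 0`) there is no such pull and the variance grows without bound.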
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Developmental Biology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Vancouver, BC, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
  - Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
4
Dimayacyac JR, Wu S, Jiang D, Pennell M. Evaluating the Performance of Widely Used Phylogenetic Models for Gene Expression Evolution. bioRxiv 2023:2023.02.09.527893. [PMID: 37645857 PMCID: PMC10461906 DOI: 10.1101/2023.02.09.527893] [Indexed: 08/31/2023]
Abstract
Phylogenetic comparative methods are increasingly used to test hypotheses about the evolutionary processes that drive divergence in gene expression among species. However, it is unknown whether the distributional assumptions of phylogenetic models designed for quantitative phenotypic traits are realistic for expression data and importantly, the reliability of conclusions of phylogenetic comparative studies of gene expression may depend on whether the data is well-described by the chosen model. To evaluate this, we first fit several phylogenetic models of trait evolution to 8 previously published comparative expression datasets, comprising a total of 54,774 genes with 145,927 unique gene-tissue combinations. Using a previously developed approach, we then assessed how well the best model of the set described the data in an absolute (not just relative) sense. First, we find that Ornstein-Uhlenbeck models, in which expression values are constrained around an optimum, were the preferred model for 66% of gene-tissue combinations. Second, we find that for 61% of gene-tissue combinations, the best fit model of the set was found to perform well; the rest were found to be performing poorly by at least one of the test statistics we examined. Third, we find that when simple models do not perform well, this appears to be typically a consequence of failing to fully account for heterogeneity in the rate of the evolution. We advocate that assessment of model performance should become a routine component of phylogenetic comparative expression studies; doing so can improve the reliability of inferences and inspire the development of novel models.
Affiliation(s)
- Jose Rafael Dimayacyac
  - Department of Zoology, University of British Columbia, Canada
  - Michael Smith Laboratories, University of British Columbia, Canada
- Shanyun Wu
  - Department of Zoology, University of British Columbia, Canada
  - Department of Genetics, Washington University School of Medicine, USA
- Daohan Jiang
  - Department of Quantitative and Computational Biology, University of Southern California, USA
- Matt Pennell
  - Department of Zoology, University of British Columbia, Canada
  - Department of Quantitative and Computational Biology, University of Southern California, USA
  - Department of Biological Sciences, University of Southern California, USA
5
Hibbins MS, Breithaupt LC, Hahn MW. Phylogenomic comparative methods: Accurate evolutionary inferences in the presence of gene tree discordance. Proc Natl Acad Sci U S A 2023; 120:e2220389120. [PMID: 37216509 PMCID: PMC10235958 DOI: 10.1073/pnas.2220389120] [Received: 11/30/2022] [Accepted: 04/24/2023] [Indexed: 05/24/2023]
Abstract
Phylogenetic comparative methods have long been a mainstay of evolutionary biology, allowing for the study of trait evolution across species while accounting for their common ancestry. These analyses typically assume a single, bifurcating phylogenetic tree describing the shared history among species. However, modern phylogenomic analyses have shown that genomes are often composed of mosaic histories that can disagree both with the species tree and with each other: so-called discordant gene trees. These gene trees describe shared histories that are not captured by the species tree, and therefore that are unaccounted for in classic comparative approaches. The application of standard comparative methods to species histories containing discordance leads to incorrect inferences about the timing, direction, and rate of evolution. Here, we develop two approaches for incorporating gene tree histories into comparative methods: one that constructs an updated phylogenetic variance-covariance matrix from gene trees, and another that applies Felsenstein's pruning algorithm over a set of gene trees to calculate trait histories and likelihoods. Using simulation, we demonstrate that our approaches generate much more accurate estimates of tree-wide rates of trait evolution than standard methods. We apply our methods to two clades of the wild tomato genus Solanum with varying rates of discordance, demonstrating the contribution of gene tree discordance to variation in a set of floral traits. Our approaches have the potential to be applied to a broad range of classic inference problems in phylogenetics, including ancestral state reconstruction and the inference of lineage-specific rate shifts.
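The idea of a variance-covariance matrix built from gene trees can be sketched as follows. Under Brownian motion, the covariance between two species is proportional to the branch length they share on the path from the root; one plausible way to account for discordance is to average the per-tree matrices, weighted by how often each gene tree topology occurs. This is an assumption made for illustration and may differ from the construction used in the paper. The tree encoding (a dict mapping each child node to its parent and branch length), the function names, and the toy trees and weights are all hypothetical.

```python
def root_path(tree, tip):
    """Map each node on the path from tip to root to the length of the
    branch above it. `tree` maps child -> (parent, branch_length)."""
    edges = {}
    node = tip
    while node in tree:
        parent, length = tree[node]
        edges[node] = length
        node = parent
    return edges


def bm_covariance(tree, tips):
    """Brownian-motion covariance: shared root-to-tip branch length."""
    paths = {t: root_path(tree, t) for t in tips}
    return [[sum(length for node, length in paths[a].items() if node in paths[b])
             for b in tips] for a in tips]


def average_covariance(trees_with_weights, tips):
    """Weighted average of per-tree covariance matrices (weights sum to 1)."""
    n = len(tips)
    total = [[0.0] * n for _ in range(n)]
    for tree, weight in trees_with_weights:
        cov = bm_covariance(tree, tips)
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * cov[i][j]
    return total


tips = ["A", "B", "C"]
# Toy species-tree topology ((A,B),C) and a discordant topology ((B,C),A).
concordant = {"A": ("AB", 1.0), "B": ("AB", 1.0),
              "AB": ("root", 1.0), "C": ("root", 2.0)}
discordant = {"B": ("BC", 0.5), "C": ("BC", 0.5),
              "BC": ("root", 1.5), "A": ("root", 2.0)}
cov = average_covariance([(concordant, 0.8), (discordant, 0.2)], tips)
```

In this toy example the species tree alone would give B and C zero covariance, whereas the averaged matrix assigns them a small positive covariance contributed by the discordant tree.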
Affiliation(s)
- Mark S. Hibbins
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S 3B2, Canada
  - Department of Biology, Indiana University, Bloomington, IN 47405
- Lara C. Breithaupt
  - Department of Biology, Indiana University, Bloomington, IN 47405
  - Department of Computer Science, Duke University, Durham, NC 27710
- Matthew W. Hahn
  - Department of Biology, Indiana University, Bloomington, IN 47405
  - Department of Computer Science, Indiana University, Bloomington, IN 47405