1
DeWitt WS, Evans SN, Hiesmayr E, Hummel S. Mean-field interacting multi-type birth-death processes with a view to applications in phylodynamics. Theor Popul Biol 2024; 159:1-12. [PMID: 39019333 DOI: 10.1016/j.tpb.2024.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Revised: 07/09/2024] [Accepted: 07/11/2024] [Indexed: 07/19/2024]
Abstract
Multi-type birth-death processes underlie approaches for inferring evolutionary dynamics from phylogenetic trees across biological scales, ranging from deep-time species macroevolution to rapid viral evolution and somatic cellular proliferation. A limitation of current phylogenetic birth-death models is that they require restrictive linearity assumptions that yield tractable message-passing likelihoods, but that also preclude interactions between individuals. Many fundamental evolutionary processes - such as environmental carrying capacity or frequency-dependent selection - entail interactions, and may strongly influence the dynamics in some systems. Here, we introduce a multi-type birth-death process in mean-field interaction with an ensemble of replicas of the focal process. We prove that, under quite general conditions, the ensemble's stochastically evolving interaction field converges to a deterministic trajectory in the limit of an infinite ensemble. In this limit, the replicas effectively decouple, and self-consistent interactions appear as nonlinearities in the infinitesimal generator of the focal process. We investigate a special case that is rich enough to model both carrying capacity and frequency-dependent selection while yielding tractable message-passing likelihoods in the context of a phylogenetic birth-death model.
Affiliation(s)
- William S DeWitt
- Department of Electrical Engineering & Computer Sciences, University of California, Berkeley, United States of America.
- Steven N Evans
- Department of Statistics, University of California, Berkeley, United States of America.
- Ella Hiesmayr
- Department of Statistics, University of California, Berkeley, United States of America.
- Sebastian Hummel
- Department of Statistics, University of California, Berkeley, United States of America.
2
Inferring density-dependent population dynamics mechanisms through rate disambiguation for logistic birth-death processes. J Math Biol 2023; 86:50. [PMID: 36864131 DOI: 10.1007/s00285-023-01877-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 11/21/2022] [Accepted: 01/18/2023] [Indexed: 03/04/2023]
Abstract
Density dependence is important in the ecology and evolution of microbial and cancer cells. Typically, we can only measure net growth rates, but the underlying density-dependent mechanisms that give rise to the observed dynamics can manifest in birth processes, death processes, or both. Therefore, we utilize the mean and variance of cell number fluctuations to separately identify birth and death rates from time series that follow stochastic birth-death processes with logistic growth. Our nonparametric method provides a novel perspective on stochastic parameter identifiability, which we validate by analyzing the accuracy in terms of the discretization bin size. We apply our method to the scenario where a homogeneous cell population goes through three stages: (1) grows naturally to its carrying capacity, (2) is treated with a drug that reduces its carrying capacity, and (3) overcomes the drug effect to restore its original carrying capacity. In each stage, we disambiguate whether the dynamics occur through the birth process, death process, or some combination of the two, which contributes to understanding drug resistance mechanisms. In the case of limited sample sizes, we provide an alternative method based on maximum likelihood and solve a constrained nonlinear optimization problem to identify the most likely density dependence parameter for a given cell number time series. Our methods can be applied to other biological systems at different scales to disambiguate density-dependent mechanisms underlying the same net growth rate.
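The idea of separating birth from death using the mean and variance of cell number fluctuations can be illustrated with a minimal sketch. It assumes the standard short-time approximation for a birth-death process, E[dN] ≈ (b - d)·dt and Var[dN] ≈ (b + d)·dt, where b and d are the total birth and death rates at the current population size. The function name and arguments are hypothetical, and this is not claimed to be the estimator used in the paper.

```python
def split_birth_death_rates(mean_increment, var_increment, dt):
    """Recover total birth and death rates from moments of dN over a short step dt.

    Assumption (not taken from the paper): E[dN] ~ (b - d) * dt and
    Var[dN] ~ (b + d) * dt, so the two moments give two linear equations in b and d.
    """
    net = mean_increment / dt    # estimate of b - d
    total = var_increment / dt   # estimate of b + d
    birth = 0.5 * (total + net)
    death = 0.5 * (total - net)
    return birth, death


# Example: increments with mean 1.0 and variance 3.0 over dt = 0.5
birth, death = split_birth_death_rates(1.0, 3.0, 0.5)
```

The same net growth rate b - d can come from many (b, d) pairs; it is the variance that separates them in this sketch.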
3
Zerenner T, Di Lauro F, Dashti M, Berthouze L, Kiss IZ. Probabilistic predictions of SIS epidemics on networks based on population-level observations. Math Biosci 2022; 350:108854. [PMID: 35659615 DOI: 10.1016/j.mbs.2022.108854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Revised: 05/17/2022] [Accepted: 05/23/2022] [Indexed: 11/16/2022]
Abstract
We predict the future course of ongoing susceptible-infected-susceptible (SIS) epidemics on regular, Erdős-Rényi and Barabási-Albert networks. It is known that the contact network influences the spread of an epidemic within a population. Therefore, observations of an epidemic, in this case at the population-level, contain information about the underlying network. This information, in turn, is useful for predicting the future course of an ongoing epidemic. To exploit this in a prediction framework, the exact high-dimensional stochastic model of an SIS epidemic on a network is approximated by a lower-dimensional surrogate model. The surrogate model is based on a birth-and-death process; the effect of the underlying network is described by a parametric model for the birth rates. We demonstrate empirically that the surrogate model captures the intrinsic stochasticity of the epidemic once it reaches a point from which it will not die out. Bayesian parameter inference allows for uncertainty about the model parameters and the class of the underlying network to be incorporated directly into probabilistic predictions. An evaluation of a number of scenarios shows that in most cases the resulting prediction intervals adequately quantify the prediction uncertainty. As long as the population-level data is available over a long-enough period, even if not sampled frequently, the model leads to excellent predictions where the underlying network is correctly identified and prediction uncertainty mainly reflects the intrinsic stochasticity of the spreading epidemic. For predictions inferred from shorter observational periods, uncertainty about parameters and network class dominate prediction uncertainty. The proposed method relies on minimal data at population-level, which is always likely to be available. 
This, combined with its numerical efficiency, makes the proposed method attractive to be used either as a standalone inference and prediction scheme or in conjunction with other inference and/or predictive models.
Affiliation(s)
- T Zerenner
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK.
- F Di Lauro
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK; Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7FL, UK
- M Dashti
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK
- L Berthouze
- Department of Informatics, University of Sussex, Falmer, Brighton, BN1 9QH, UK
- I Z Kiss
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK.
4
Csűrös M. Gain-loss-duplication models for copy number evolution on a phylogeny: Exact algorithms for computing the likelihood and its gradient. Theor Popul Biol 2022; 145:80-94. [DOI: 10.1016/j.tpb.2022.03.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2021] [Revised: 03/07/2022] [Accepted: 03/10/2022] [Indexed: 10/18/2022]
5
A Numerical Approach for Evaluating the Time-Dependent Distribution of a Quasi Birth-Death Process. Methodol Comput Appl Probab 2021. [DOI: 10.1007/s11009-021-09882-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 10/20/2022]
Abstract
This paper considers a continuous-time quasi birth-death (qbd) process, which informally can be seen as a birth-death process of which the parameters are modulated by an external continuous-time Markov chain. The aim is to numerically approximate the time-dependent distribution of the resulting bivariate Markov process in an accurate and efficient way. An approach based on the Erlangization principle is proposed and formally justified. Its performance is investigated and compared with two existing approaches: one based on numerical evaluation of the matrix exponential underlying the qbd process, and one based on the uniformization technique. It is shown that in many settings the approach based on Erlangization is faster than the other approaches, while still being highly accurate. In the last part of the paper, we demonstrate the use of the developed technique in the context of the evaluation of the likelihood pertaining to a time series, which can then be optimized over its parameters to obtain the maximum likelihood estimator. More specifically, through a series of examples with simulated and real-life data, we show how it can be deployed in model selection problems that involve the choice between a qbd and its non-modulated counterpart.
6
Manceau M, Gupta A, Vaughan T, Stadler T. The probability distribution of the ancestral population size conditioned on the reconstructed phylogenetic tree with occurrence data. J Theor Biol 2021; 509:110400. [PMID: 32739241 PMCID: PMC7733867 DOI: 10.1016/j.jtbi.2020.110400] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/28/2019] [Revised: 05/07/2020] [Accepted: 07/03/2020] [Indexed: 01/10/2023]
Abstract
We consider a homogeneous birth-death process with three different sampling schemes. First, individuals can be sampled through time and included in a reconstructed phylogenetic tree. Second, they can be sampled through time and only recorded as a point 'occurrence' along a timeline. Third, extant individuals can be sampled and included in the reconstructed phylogenetic tree with a fixed probability. We further consider that sampled individuals can be removed or not from the process, upon sampling, with fixed probability. We derive the probability distribution of the population size at any time in the past conditional on the joint observation of a reconstructed phylogenetic tree and a record of occurrences not included in the tree. We also provide an algorithm to simulate ancestral population size trajectories given the observation of a reconstructed phylogenetic tree and occurrences. This distribution can be readily used to draw inferences about the ancestral population size in the field of epidemiology and macroevolution. In epidemiology, these results will allow data from epidemiological case count studies to be used in conjunction with molecular sequencing data (yielding reconstructed phylogenetic trees) to coherently estimate prevalence through time. In macroevolution, it will foster the joint examination of the fossil record and extant taxa to reconstruct past biodiversity.
Affiliation(s)
- Marc Manceau
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland.
- Ankit Gupta
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Timothy Vaughan
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland.
7
Di Lauro F, Croix JC, Dashti M, Berthouze L, Kiss IZ. Network inference from population-level observation of epidemics. Sci Rep 2020; 10:18779. [PMID: 33139773 PMCID: PMC7606546 DOI: 10.1038/s41598-020-75558-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/14/2020] [Accepted: 09/21/2020] [Indexed: 12/03/2022] Open
Abstract
Using the continuous-time susceptible-infected-susceptible (SIS) model on networks, we investigate the problem of inferring the class of the underlying network when epidemic data is only available at population-level (i.e., the number of infected individuals at a finite set of discrete times of a single realisation of the epidemic), the only information likely to be available in real world settings. To tackle this, epidemics on networks are approximated by a Birth-and-Death process which keeps track of the number of infected nodes at population level. The rates of this surrogate model encode both the structure of the underlying network and disease dynamics. We use extensive simulations over Regular, Erdős–Rényi and Barabási–Albert networks to build network class-specific priors for these rates. We then use Bayesian model selection to recover the most likely underlying network class, based only on a single realisation of the epidemic. We show that the proposed methodology yields good results on both synthetic and real-world networks.
Affiliation(s)
- F Di Lauro
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK
- J-C Croix
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK
- M Dashti
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK
- L Berthouze
- Department of Informatics, University of Sussex, Falmer, BN1 9QH, UK
- I Z Kiss
- Department of Mathematics, University of Sussex, Falmer, Brighton, BN1 9QH, UK.
8
de Gunst M, Hautphenne S, Mandjes M, Sollie B. Parameter estimation for multivariate population processes: a saddlepoint approach. Stoch Models 2020. [DOI: 10.1080/15326349.2020.1832895] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Affiliation(s)
- Sophie Hautphenne
- School of Mathematics and Statistics, The University of Melbourne, Parkville, Victoria Melbourne, Australia
- Michel Mandjes
- University of Amsterdam Korteweg-de Vries Institute for Mathematics, Amsterdam, Netherlands
- Birgit Sollie
- Mathematics, VU University Amsterdam, Amsterdam, Netherlands
9
Zwaenepoel A, Van de Peer Y. Model-Based Detection of Whole-Genome Duplications in a Phylogeny. Mol Biol Evol 2020; 37:2734-2746. [PMID: 32359154 DOI: 10.1093/molbev/msaa111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023] Open
Abstract
Ancient whole-genome duplications (WGDs) leave signatures in comparative genomic data sets that can be harnessed to detect these events of presumed evolutionary importance. Current statistical approaches for the detection of ancient WGDs in a phylogenetic context have two main drawbacks. The first is that unwarranted restrictive assumptions on the "background" gene duplication and loss rates make inferences unreliable in the face of model violations. The second is that most methods can only be used to examine a limited set of a priori selected WGD hypotheses and cannot be used to discover WGDs in a phylogeny. In this study, we develop an approach for WGD inference using gene count data that seeks to overcome both issues. We employ a phylogenetic birth-death model that includes WGD in a flexible hierarchical Bayesian approach and use reversible-jump Markov chain Monte Carlo to perform Bayesian inference of branch-specific duplication, loss, and WGD retention rates across the space of WGD configurations. We evaluate the proposed method using simulations, apply it to data sets from flowering plants, and discuss the statistical intricacies of model-based WGD inference.
Affiliation(s)
- Arthur Zwaenepoel
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; Center for Plant Systems Biology, VIB, Ghent, Belgium; Bioinformatics Institute Ghent, Ghent, Belgium
- Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; Center for Plant Systems Biology, VIB, Ghent, Belgium; Bioinformatics Institute Ghent, Ghent, Belgium; Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
10
Davison AC, Hautphenne S, Kraus A. Parameter estimation for discretely observed linear birth-and-death processes. Biometrics 2020; 77:186-196. [PMID: 32306397 DOI: 10.1111/biom.13282] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/02/2019] [Revised: 03/21/2020] [Accepted: 03/30/2020] [Indexed: 11/29/2022]
Abstract
Birth-and-death processes are widely used to model the development of biological populations. Although they are relatively simple models, their parameters can be challenging to estimate, as the likelihood can become numerically unstable when data arise from the most common sampling schemes, such as annual population censuses. A further difficulty arises when the discrete observations are not equi-spaced, for example, when census data are unavailable for some years. We present two approaches to estimating the birth, death, and growth rates of a discretely observed linear birth-and-death process: via an embedded Galton-Watson process and by maximizing a saddlepoint approximation to the likelihood. We study asymptotic properties of the estimators, compare them on numerical examples, and apply the methodology to data on monitored populations.
Affiliation(s)
- A C Davison
- Institute of Mathematics, Ecole Polytechnique Fédérale de Lausanne, EPFL-FSB-MATH-STAT, Lausanne, Switzerland
- S Hautphenne
- Institute of Mathematics, Ecole Polytechnique Fédérale de Lausanne, EPFL-FSB-MATH-STAT, Lausanne, Switzerland; School of Mathematics and Statistics, The University of Melbourne, Melbourne, Australia
- A Kraus
- Department of Mathematics and Statistics, Masaryk University, Brno, Czech Republic
11
Abstract
In this paper we provide an introduction to statistical inference for the classical linear birth‒death process, focusing on computational aspects of the problem in the setting of discretely observed processes. The basic probabilistic properties are given in Section 2, focusing on computation of the transition functions. This is followed by a brief discussion of simulation methods in Section 3, and of frequentist methods in Section 4. Section 5 is devoted to Bayesian methods, from rejection sampling to Markov chain Monte Carlo and approximate Bayesian computation. In Section 6 we consider the time-inhomogeneous case. The paper ends with a brief discussion in Section 7.
12
Boys RJ, Ainsworth HF, Gillespie CS. Bayesian inference for a partially observed birth-death process using data on proportions. Aust N Z J Stat 2018. [DOI: 10.1111/anzs.12230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Richard J. Boys
- School of Mathematics, Statistics & Physics; Newcastle University; Newcastle Upon Tyne NE1 7RU UK
- Holly F. Ainsworth
- Institute of Health and Society; Newcastle University; Newcastle Upon Tyne NE1 4AX UK
- Colin S. Gillespie
- School of Mathematics, Statistics & Physics; Newcastle University; Newcastle Upon Tyne NE1 7RU UK
13
Crawford FW, Ho LST, Suchard MA. Computational methods for birth-death processes. Wiley Interdisciplinary Reviews. Computational Statistics 2018; 10:e1423. [PMID: 29942419 PMCID: PMC6014701 DOI: 10.1002/wics.1423] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 02/01/2023]
Abstract
Many important stochastic counting models can be written as general birth-death processes (BDPs). BDPs are continuous-time Markov chains on the non-negative integers in which only jumps to adjacent states are allowed. BDPs can be used to easily parameterize a rich variety of probability distributions on the non-negative integers, and straightforward conditions guarantee that these distributions are proper. BDPs also provide a mechanistic interpretation - birth and death of actual particles or organisms - that has proven useful in evolution, ecology, physics, and chemistry. Although the theoretical properties of general BDPs are well understood, traditionally statistical work on BDPs has been limited to the simple linear (Kendall) process. Aside from a few simple cases, it remains impossible to find analytic expressions for the likelihood of a discretely-observed BDP, and computational difficulties have hindered development of tools for statistical inference. But the gap between BDP theory and practical methods for estimation has narrowed in recent years. There are now robust methods for evaluating likelihoods for realizations of BDPs: finite-time transition, first passage, equilibrium probabilities, and distributions of summary statistics that arise commonly in applications. Recent work has also exploited the connection between continuously- and discretely-observed BDPs to derive EM algorithms for maximum likelihood estimation. Likelihood-based inference for previously intractable BDPs is much easier than previously thought and regression approaches analogous to Poisson regression are straightforward to derive. In this review, we outline the basic mathematical theory for BDPs and demonstrate new tools for statistical inference using data from BDPs.
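The definition above (a continuous-time Markov chain on the non-negative integers with jumps only to adjacent states) can be made concrete with a minimal simulation sketch. It uses the standard Gillespie-style construction: wait an exponential time with rate equal to the total jump rate, then move up or down by one. The function name `simulate_bdp` and its arguments are hypothetical and are not taken from the review.

```python
import random


def simulate_bdp(n0, birth_rate, death_rate, t_max, rng=None):
    """Simulate one path of a general birth-death process up to time t_max.

    birth_rate(n) and death_rate(n) are user-supplied rate functions
    (illustrative names); death_rate(0) should be 0 so the state stays non-negative.
    Returns a list of (jump_time, state) pairs starting with (0.0, n0).
    """
    rng = rng or random.Random()
    t, n = 0.0, n0
    path = [(t, n)]
    while True:
        b, d = birth_rate(n), death_rate(n)
        total = b + d
        if total <= 0.0:
            break  # absorbing state: no further jumps are possible
        t += rng.expovariate(total)  # exponential holding time
        if t > t_max:
            break
        # Jump up with probability b / total, otherwise down; always by exactly one.
        n += 1 if rng.random() < b / total else -1
        path.append((t, n))
    return path


# Linear (Kendall) process with per-capita birth rate 1.0 and death rate 0.8
path = simulate_bdp(5, lambda n: 1.0 * n, lambda n: 0.8 * n, t_max=2.0,
                    rng=random.Random(0))
```

Swapping in nonlinear rate functions, for example a logistic birth rate, requires no change to the simulator, which is what "general" means in this context.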
Affiliation(s)
- Forrest W Crawford
- Departments of Biostatistics, Ecology & Evolutionary Biology, and School of Management, Yale University
- Lam Si Tung Ho
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Marc A Suchard
- Departments of Biomathematics, Biostatistics and Human Genetics, University of California, Los Angeles
14
Ho LST, Xu J, Crawford FW, Minin VN, Suchard MA. Birth/birth-death processes and their computable transition probabilities with biological applications. J Math Biol 2018; 76:911-944. [PMID: 28741177 PMCID: PMC5783825 DOI: 10.1007/s00285-017-1160-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 08/03/2016] [Revised: 04/04/2017] [Indexed: 01/20/2023]
Abstract
Birth-death processes track the size of a univariate population, but many biological systems involve interaction between populations, necessitating models for two or more populations simultaneously. A lack of efficient methods for evaluating finite-time transition probabilities of bivariate processes, however, has restricted statistical inference in these models. Researchers rely on computationally expensive methods such as matrix exponentiation or Monte Carlo approximation, restricting likelihood-based inference to small systems, or indirect methods such as approximate Bayesian computation. In this paper, we introduce the birth/birth-death process, a tractable bivariate extension of the birth-death process, where rates are allowed to be nonlinear. We develop an efficient algorithm to calculate its transition probabilities using a continued fraction representation of their Laplace transforms. Next, we identify several exemplary models arising in molecular epidemiology, macro-parasite evolution, and infectious disease modeling that fall within this class, and demonstrate advantages of our proposed method over existing approaches to inference in these models. Notably, the ubiquitous stochastic susceptible-infectious-removed (SIR) model falls within this class, and we emphasize that computable transition probabilities newly enable direct inference of parameters in the SIR model. We also propose a very fast method for approximating the transition probabilities under the SIR model via a novel branching process simplification, and compare it to the continued fraction representation method with application to the 17th century plague in Eyam. Although the two methods produce similar maximum a posteriori estimates, the branching process approximation fails to capture the correlation structure in the joint posterior distribution.
Affiliation(s)
- Lam Si Tung Ho
- Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA, USA.
- Jason Xu
- Department of Biomathematics, University of California, Los Angeles, Los Angeles, CA, USA
- Vladimir N Minin
- Departments of Statistics and Biology, University of Washington, Seattle, WA, USA
- Marc A Suchard
- Departments of Biomathematics, Biostatistics and Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
15
Estimating dose-specific cell division and apoptosis rates from chemo-sensitivity experiments. Sci Rep 2018; 8:2705. [PMID: 29426887 PMCID: PMC5807362 DOI: 10.1038/s41598-018-21017-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/02/2017] [Accepted: 01/29/2018] [Indexed: 11/13/2022] Open
Abstract
In-vitro chemo-sensitivity experiments are an essential step in the early stages of cancer therapy development, but existing data analysis methods suffer from problems with fitting, do not permit assessment of uncertainty, and can give misleading estimates of cell growth inhibition. We present an approach (bdChemo) based on a mechanistic model of cell division and death that permits rigorous statistical analyses of chemo-sensitivity experiment data by simultaneous estimation of cell division and apoptosis rates as functions of dose, without making strong assumptions about the shape of the dose-response curve. We demonstrate the utility of this method using a large-scale NCI-DREAM challenge dataset. We developed an R package “bdChemo” implementing this method, available at https://github.com/YiyiLiu1/bdChemo.
16
Andronov A. On a reward rate estimation for the finite irreducible continuous-time Markov chain. Journal of Statistical Theory and Practice 2017. [DOI: 10.1080/15598608.2017.1282895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Alexander Andronov
- Mathematical Methods and Modeling, Transport and Telecommunication Institute, Riga, Latvia
17
Social interactions among grazing reef fish drive material flux in a coral reef ecosystem. Proc Natl Acad Sci U S A 2017; 114:4703-4708. [PMID: 28396400 DOI: 10.1073/pnas.1615652114] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Indexed: 11/18/2022] Open
Abstract
In human financial and social systems, exchanges of information among individuals cause speculative bubbles, behavioral cascades, and other correlated actions that profoundly influence system-level function. Exchanges of information are also widespread in ecological systems, but their effects on ecosystem-level processes are largely unknown. Herbivory is a critical ecological process in coral reefs, where diverse assemblages of fish maintain reef health by controlling the abundance of algae. Here, we show that social interactions have a major effect on fish grazing rates in a reef ecosystem. We combined a system for observing and manipulating large foraging areas in a coral reef with a class of dynamical decision-making models to reveal that reef fish use information about the density and actions of nearby fish to decide when to feed on algae and when to flee foraging areas. This "behavioral coupling" causes bursts of feeding activity that account for up to 68% of the fish community's consumption of algae. Moreover, correlations in fish behavior induce a feedback, whereby each fish spends less time feeding when fewer fish are present, suggesting that reducing fish stocks may not only reduce total algal consumption but could decrease the amount of algae each remaining fish consumes. Our results demonstrate that social interactions among consumers can have a dominant effect on the flux of energy and materials through ecosystems, and our methodology paves the way for rigorous in situ measurements of the behavioral rules that underlie ecological rates in other natural systems.
18
Liu LL, Brumbaugh J, Bar-Nur O, Smith Z, Stadtfeld M, Meissner A, Hochedlinger K, Michor F. Probabilistic Modeling of Reprogramming to Induced Pluripotent Stem Cells. Cell Rep 2016; 17:3395-3406. [PMID: 28009305 PMCID: PMC5467646 DOI: 10.1016/j.celrep.2016.11.080] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 05/31/2016] [Revised: 10/04/2016] [Accepted: 11/24/2016] [Indexed: 01/01/2023] Open
Abstract
Reprogramming of somatic cells to induced pluripotent stem cells (iPSCs) is typically an inefficient and asynchronous process. A variety of technological efforts have been made to accelerate and/or synchronize this process. To define a unified framework to study and compare the dynamics of reprogramming under different conditions, we developed an in silico analysis platform based on mathematical modeling. Our approach takes into account the variability in experimental results stemming from probabilistic growth and death of cells and potentially heterogeneous reprogramming rates. We suggest that reprogramming driven by the Yamanaka factors alone is a more heterogeneous process, possibly due to cell-specific reprogramming rates, which could be homogenized by the addition of additional factors. We validated our approach using publicly available reprogramming datasets, including data on early reprogramming dynamics as well as cell count data, and thus we demonstrated the general utility and predictive power of our methodology for investigating reprogramming and other cell fate change systems.
Affiliation(s)
- Lin L Liu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Justin Brumbaugh
- Massachusetts General Hospital Cancer Center and Center for Regenerative Medicine, Boston, MA 02114, USA; Harvard Stem Cell Institute, Cambridge, MA 02138, USA; Department of Stem Cell and Regenerative Biology, Cambridge, MA 02138, USA
- Ori Bar-Nur
- Massachusetts General Hospital Cancer Center and Center for Regenerative Medicine, Boston, MA 02114, USA; Harvard Stem Cell Institute, Cambridge, MA 02138, USA; Department of Stem Cell and Regenerative Biology, Cambridge, MA 02138, USA
- Zachary Smith
- Department of Stem Cell and Regenerative Biology, Cambridge, MA 02138, USA
- Matthias Stadtfeld
- The Helen L. and Martin S. Kimmel Center for Biology and Medicine, Skirball Institute of Biomolecular Medicine, Department of Cell Biology, NYU School of Medicine, New York, NY 10016, USA
- Alexander Meissner
- Department of Stem Cell and Regenerative Biology, Cambridge, MA 02138, USA
- Konrad Hochedlinger
- Massachusetts General Hospital Cancer Center and Center for Regenerative Medicine, Boston, MA 02114, USA; Harvard Stem Cell Institute, Cambridge, MA 02138, USA; Department of Stem Cell and Regenerative Biology, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Franziska Michor
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA.
19
|
Abstract
Birth-death processes are continuous-time Markov counting processes. Approximate moments can be computed by truncating the transition rate matrix. Using a coupling argument, we derive bounds for the total variation distance between the process and its finite approximation.
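The truncation idea in this abstract can be illustrated with a minimal sketch. It assumes a linear birth-death process with per-capita rates `birth` and `death`, which the abstract does not specify; the function name `truncated_bd_mean` and all parameter values are hypothetical. It integrates the forward equation of the truncated chain with a Runge-Kutta scheme rather than a matrix exponential, and it does not implement the coupling bounds derived in the paper.

```python
import math

def truncated_bd_mean(birth, death, n0, t, n_max, steps=2000):
    """Approximate E[X_t] for a linear birth-death process started at n0
    by truncating the state space to {0, ..., n_max}."""
    # Per-state rates of the linear process; births out of n_max are
    # switched off so the truncated chain stays on the finite grid.
    up = [birth * n if n < n_max else 0.0 for n in range(n_max + 1)]
    down = [death * n for n in range(n_max + 1)]

    def forward(p):
        # Kolmogorov forward equation dp/dt = p Q for the tridiagonal Q.
        dp = [0.0] * (n_max + 1)
        for n in range(n_max + 1):
            dp[n] -= (up[n] + down[n]) * p[n]
            if n < n_max:
                dp[n + 1] += up[n] * p[n]
            if n > 0:
                dp[n - 1] += down[n] * p[n]
        return dp

    p = [0.0] * (n_max + 1)
    p[n0] = 1.0
    h = t / steps
    for _ in range(steps):  # classical fourth-order Runge-Kutta step
        k1 = forward(p)
        k2 = forward([x + 0.5 * h * k for x, k in zip(p, k1)])
        k3 = forward([x + 0.5 * h * k for x, k in zip(p, k2)])
        k4 = forward([x + h * k for x, k in zip(p, k3)])
        p = [x + h * (a + 2 * b + 2 * c + d) / 6
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return sum(n * q for n, q in enumerate(p))

# Illustrative (assumed) parameters: a subcritical process, so very little
# probability mass reaches the truncation level.
approx = truncated_bd_mean(birth=1.0, death=1.5, n0=5, t=1.0, n_max=60)
exact = 5 * math.exp((1.0 - 1.5) * 1.0)  # mean of the untruncated linear process
```

For the linear process the untruncated mean is known in closed form, n0 * exp((birth - death) * t), so `approx` and `exact` can be compared directly to see how little the truncation level matters in this regime.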
Affiliation(s)
- Forrest W. Crawford
- Departments of Biostatistics and Ecology & Evolutionary Biology, Yale University, 60 College St, PO Box 208034 New Haven, CT 06510 USA, phone: (203) 785-6125
- Timothy C. Stutz
- Department of Biomathematics, University of California, Los Angeles
- Kenneth Lange
- Department of Biomathematics, University of California, Los Angeles
- Departments of Human Genetics and Statistics, University of California, Los Angeles
20
Xu J, Guttorp P, Kato-Maeda M, Minin VN. Likelihood-based inference for discretely observed birth-death-shift processes, with applications to evolution of mobile genetic elements. Biometrics 2015; 71:1009-21. [PMID: 26148963 DOI: 10.1111/biom.12352] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 02/01/2015] [Revised: 05/01/2015] [Accepted: 05/01/2015] [Indexed: 11/28/2022]
Abstract
Continuous-time birth-death-shift (BDS) processes are frequently used in stochastic modeling, with many applications in ecology and epidemiology. In particular, such processes can model evolutionary dynamics of transposable elements - important genetic markers in molecular epidemiology. Estimation of the effects of individual covariates on the birth, death, and shift rates of the process can be accomplished by analyzing patient data, but inferring these rates in a discretely and unevenly observed setting presents computational challenges. We propose a multi-type branching process approximation to BDS processes and develop a corresponding expectation maximization algorithm, where we use spectral techniques to reduce calculation of expected sufficient statistics to low-dimensional integration. These techniques yield an efficient and robust optimization routine for inferring the rates of the BDS process, and apply broadly to multi-type branching processes whose rates can depend on many covariates. After rigorously testing our methodology in simulation studies, we apply our method to study intrapatient time evolution of IS6110 transposable element, a genetic marker frequently used during estimation of epidemiological clusters of Mycobacterium tuberculosis infections.
Affiliation(s)
- Jason Xu
- Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Peter Guttorp
- Department of Statistics, University of Washington, Seattle, WA, U.S.A
- Midori Kato-Maeda
- School of Medicine, University of California, San Francisco, CA, U.S.A
- Vladimir N Minin
- Department of Statistics, University of Washington, Seattle, WA, U.S.A.; Department of Biology, University of Washington, Seattle, WA, U.S.A
21
Xu J, Minin VN. Efficient Transition Probability Computation for Continuous-Time Branching Processes via Compressed Sensing. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE : PROCEEDINGS OF THE ... CONFERENCE. CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE 2015; 2015:952-961. [PMID: 26949377 PMCID: PMC4775097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Branching processes are a class of continuous-time Markov chains (CTMCs) with ubiquitous applications. A general difficulty in statistical inference under partially observed CTMC models arises in computing transition probabilities when the discrete state space is large or uncountable. Classical methods such as matrix exponentiation are infeasible for large or countably infinite state spaces, and sampling-based alternatives are computationally intensive, requiring integration over all possible hidden events. Recent work has successfully applied generating function techniques to computing transition probabilities for linear multi-type branching processes. While these techniques often require significantly fewer computations than matrix exponentiation, they also become prohibitive in applications with large populations. We propose a compressed sensing framework that significantly accelerates the generating function method, decreasing computational cost up to a logarithmic factor by only assuming the probability mass of transitions is sparse. We demonstrate accurate and efficient transition probability computations in branching process models for blood cell formation and evolution of self-replicating transposable elements in bacterial genomes.
Affiliation(s)
- Jason Xu
- Department of Statistics, University of Washington, Seattle, WA 98195
- Vladimir N Minin
- Departments of Statistics and Biology, University of Washington, Seattle, WA 98195
22
Crawford FW, Weiss RE, Suchard MA. Sex, Lies and Self-Reported Counts: Bayesian Mixture Models for Heaping in Longitudinal Count Data via Birth-Death Processes. Ann Appl Stat 2015; 9:572-596. [PMID: 26500711 DOI: 10.1214/15-aoas809] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 11/19/2022]
Abstract
Surveys often ask respondents to report non-negative counts, but respondents may misremember or round to a nearby multiple of 5 or 10. This phenomenon is called heaping, and the error inherent in heaped self-reported numbers can bias estimation. Heaped data may be collected cross-sectionally or longitudinally and there may be covariates that complicate the inferential task. Heaping is a well-known issue in many survey settings, and inference for heaped data is an important statistical problem. We propose a novel reporting distribution whose underlying parameters are readily interpretable as rates of misremembering and rounding. The process accommodates a variety of heaping grids and allows for quasi-heaping to values nearly but not equal to heaping multiples. We present a Bayesian hierarchical model for longitudinal samples with covariates to infer both the unobserved true distribution of counts and the parameters that control the heaping process. Finally, we apply our methods to longitudinal self-reported counts of sex partners in a study of high-risk behavior in HIV-positive youth.
Affiliation(s)
- Robert E Weiss
- Department of Biostatistics, UCLA Fielding School of Public Health
- Marc A Suchard
- Department of Biostatistics, UCLA Fielding School of Public Health; Departments of Biomathematics and Human Genetics, David Geffen School of Medicine at UCLA