1
|
Pessoa P, Schweiger M, Pressé S. Avoiding matrix exponentials for large transition rate matrices. J Chem Phys 2024; 160:094109. [PMID: 38436441 PMCID: PMC10919955 DOI: 10.1063/5.0190527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2023] [Accepted: 02/07/2024] [Indexed: 03/05/2024] Open
Abstract
Exact methods for the exponentiation of matrices of dimension N can be computationally expensive in terms of execution time (N3) and memory requirements (N2), not to mention numerical precision issues. A matrix often exponentiated in the natural sciences is the rate matrix. Here, we explore five methods to exponentiate rate matrices, some of which apply more broadly to other matrix types. Three of the methods leverage a mathematical analogy between computing matrix elements of a matrix exponential process and computing transition probabilities of a dynamical process (technically a Markov jump process, MJP, typically simulated using Gillespie). In doing so, we identify a novel MJP-based method relying on restricting the number of "trajectory" jumps that incurs improved computational scaling. We then discuss this method's downstream implications on mixing properties of Monte Carlo posterior samplers. We also benchmark two other methods of matrix exponentiation valid for any matrix (beyond rate matrices and, more generally, positive definite matrices) related to solving differential equations: Runge-Kutta integrators and Krylov subspace methods. Under conditions where both the largest matrix element and the number of non-vanishing elements scale linearly with N-reasonable conditions for rate matrices often exponentiated-computational time scaling with the most competitive methods (Krylov and one of the MJP-based methods) reduces to N2 with total memory requirements of N.
Collapse
Affiliation(s)
| | | | - Steve Pressé
- Author to whom correspondence should be addressed:
| |
Collapse
|
2
|
Zolaktaf S, Dannenberg F, Schmidt M, Condon A, Winfree E. Predicting DNA kinetics with a truncated continuous-time Markov chain method. Comput Biol Chem 2023; 104:107837. [PMID: 36858009 DOI: 10.1016/j.compbiolchem.2023.107837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2022] [Revised: 02/05/2023] [Accepted: 02/21/2023] [Indexed: 03/03/2023]
Abstract
Predicting the kinetics of reactions involving nucleic acid strands is a fundamental task in biology and biotechnology. Reaction kinetics can be modeled as an elementary step continuous-time Markov chain, where states correspond to secondary structures and transitions correspond to base pair formation and breakage. Since the number of states in the Markov chain could be large, rates are determined by estimating the mean first passage time from sampled trajectories. As a result, the cost of kinetic predictions becomes prohibitively expensive for rare events with extremely long trajectories. Also problematic are scenarios where multiple predictions are needed for the same reaction, e.g., under different environmental conditions, or when calibrating model parameters, because a new set of trajectories is needed multiple times. We propose a new method, called pathway elaboration, to handle these scenarios. Pathway elaboration builds a truncated continuous-time Markov chain through both biased and unbiased sampling. The resulting Markov chain has moderate state space size, so matrix methods can efficiently compute reaction rates, even for rare events. Also the transition rates of the truncated Markov chain can easily be adapted when model or environmental parameters are perturbed, making model calibration feasible. We illustrate the utility of pathway elaboration on toehold-mediated strand displacement reactions, show that it well-approximates trajectory-based predictions of unbiased elementary step models on a wide range of reaction types for which such predictions are feasible, and demonstrate that it performs better than alternative truncation-based approaches that are applicable for mean first passage time estimation. Finally, in a small study, we use pathway elaboration to optimize the Metropolis kinetic model of Multistrand, an elementary step simulator, showing that the optimized parameters greatly improve reaction rate predictions. Our framework and dataset are available at https://github.com/DNA-and-Natural-Algorithms-Group/PathwayElaboration.
Collapse
Affiliation(s)
| | | | - Mark Schmidt
- University of British Columbia, Canada; Canada CIFAR AI Chair (Amii), Canada.
| | | | - Erik Winfree
- California Institute of Technology, United States of America.
| |
Collapse
|
3
|
Dinh T, Sidje RB. An adaptive solution to the chemical master equation using quantized tensor trains with sliding windows. Phys Biol 2020; 17:065014. [PMID: 32610302 DOI: 10.1088/1478-3975/aba1d2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
To cope with an extremely large or even infinite state space when solving the chemical master equation in biological problems, a potent strategy is to restrict to a finite state projection (FSP) and represent the transition matrix and probability vector in quantized tensor train (QTT) format, leading to savings in storage while retaining accuracy. In an earlier adaptive FSP-QTT algorithm, the multidimensional state space was downsized and kept in the form of a hyper rectangle that was updated when needed by selectively doubling some of its side dimensions. However, this could result in a much larger state space than necessary, with the effect of hampering both the execution time and stepping scheme. In this work, we improve the algorithm by enabling sliding windows that can dynamically slide, shrink or expand, with updates driven by a number of stochastic simulation algorithm trajectories. The ensuing state space is a considerably reduced hyper rectangle containing only the most probable states at each time step. Three numerical experiments of varying difficulty are performed to compare our approach with the original adaptive FSP-QTT algorithm.
Collapse
|
4
|
Kosarwal R, Kulasiri D, Samarasinghe S. Novel domain expansion methods to improve the computational efficiency of the Chemical Master Equation solution for large biological networks. BMC Bioinformatics 2020; 21:515. [PMID: 33176690 PMCID: PMC7656229 DOI: 10.1186/s12859-020-03668-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2020] [Accepted: 07/17/2020] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Numerical solutions of the chemical master equation (CME) are important for understanding the stochasticity of biochemical systems. However, solving CMEs is a formidable task. This task is complicated due to the nonlinear nature of the reactions and the size of the networks which result in different realizations. Most importantly, the exponential growth of the size of the state-space, with respect to the number of different species in the system makes this a challenging assignment. When the biochemical system has a large number of variables, the CME solution becomes intractable. We introduce the intelligent state projection (ISP) method to use in the stochastic analysis of these systems. For any biochemical reaction network, it is important to capture more than one moment: this allows one to describe the system's dynamic behaviour. ISP is based on a state-space search and the data structure standards of artificial intelligence (AI). It can be used to explore and update the states of a biochemical system. To support the expansion in ISP, we also develop a Bayesian likelihood node projection (BLNP) function to predict the likelihood of the states. RESULTS To demonstrate the acceptability and effectiveness of our method, we apply the ISP method to several biological models discussed in prior literature. The results of our computational experiments reveal that the ISP method is effective both in terms of the speed and accuracy of the expansion, and the accuracy of the solution. This method also provides a better understanding of the state-space of the system in terms of blueprint patterns. CONCLUSIONS The ISP is the de-novo method which addresses both accuracy and performance problems for CME solutions. It systematically expands the projection space based on predefined inputs. This ensures accuracy in the approximation and an exact analytical solution for the time of interest. The ISP was more effective both in predicting the behavior of the state-space of the system and in performance management, which is a vital step towards modeling large biochemical systems.
Collapse
Affiliation(s)
- Rahul Kosarwal
- Centre for Advanced Computational Solutions (C-fACS), Lincoln University, Lincoln, Christchurch, New Zealand
- Complex Systems, Big Data, and Informatics Initiative (CSBII), Lincoln University, Lincoln, Christchurch, New Zealand
| | - Don Kulasiri
- Centre for Advanced Computational Solutions (C-fACS), Lincoln University, Lincoln, Christchurch, New Zealand.
- Complex Systems, Big Data, and Informatics Initiative (CSBII), Lincoln University, Lincoln, Christchurch, New Zealand.
| | - Sandhya Samarasinghe
- Centre for Advanced Computational Solutions (C-fACS), Lincoln University, Lincoln, Christchurch, New Zealand
- Complex Systems, Big Data, and Informatics Initiative (CSBII), Lincoln University, Lincoln, Christchurch, New Zealand
| |
Collapse
|
5
|
Terebus A, Liu C, Liang J. Discrete and continuous models of probability flux of switching dynamics: Uncovering stochastic oscillations in a toggle-switch system. J Chem Phys 2019; 151:185104. [PMID: 31731858 DOI: 10.1063/1.5124823] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Abstract
The probability flux and velocity in stochastic reaction networks can help in characterizing dynamic changes in probability landscapes of these networks. Here, we study the behavior of three different models of probability flux, namely, the discrete flux model, the Fokker-Planck model, and a new continuum model of the Liouville flux. We compare these fluxes that are formulated based on, respectively, the chemical master equation, the stochastic differential equation, and the ordinary differential equation. We examine similarities and differences among these models at the nonequilibrium steady state for the toggle switch network under different binding and unbinding conditions. Our results show that at a strong stochastic condition of weak promoter binding, continuum models of Fokker-Planck and Liouville fluxes deviate significantly from the discrete flux model. Furthermore, we report the discovery of stochastic oscillation in the toggle-switch system occurring at weak binding conditions, a phenomenon captured only by the discrete flux model.
Collapse
Affiliation(s)
- Anna Terebus
- Richard and Loan Hill Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60607, USA
| | - Chun Liu
- Department of Applied Mathematics, Illinois Institute of Technology, Chicago, Illinois 60616, USA
| | - Jie Liang
- Richard and Loan Hill Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60607, USA
| |
Collapse
|
6
|
Reid BM, Sidje RB. Finite state projection for approximating the stationary solution to the chemical master equation using reaction rate equations. Math Biosci 2019; 316:108243. [PMID: 31449893 DOI: 10.1016/j.mbs.2019.108243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2019] [Revised: 08/21/2019] [Accepted: 08/22/2019] [Indexed: 10/26/2022]
Abstract
When modeling a physical system using a Markov chain, it is often instructive to compute its probability distribution at statistical equilibrium, thereby gaining insight into the stationary, or long-term, behavior of the system. Computing such a distribution directly is problematic when the state space of the system is large. Here, we look at the case of a chemical reaction system that models the dynamics of cellular processes, where it has become popular to constrain the computational burden by using a finite state projection, which aims only to capture the most likely states of the system, rather than every possible state. We propose an efficient method to further narrow these states to those that remain highly probable in the long run, after the transient behavior of the system has dissipated. Our strategy is to quickly estimate the local maxima of the stationary distribution using the reaction rate formulation, which is of considerably smaller size than the full-blown chemical master equation, and from there develop adaptive schemes to profile the distribution around the maxima. We include numerical tests that show the efficiency of our approach.
Collapse
Affiliation(s)
- Brandon M Reid
- Department of Mathematics, The University of Alabama, Tuscaloosa, AL 35487, USA.
| | - Roger B Sidje
- Department of Mathematics, The University of Alabama, Tuscaloosa, AL 35487, USA.
| |
Collapse
|
7
|
Vo HD, Fox Z, Baetica A, Munsky B. Bayesian Estimation for Stochastic Gene Expression Using Multifidelity Models. J Phys Chem B 2019; 123:2217-2234. [PMID: 30777763 DOI: 10.1021/acs.jpcb.8b10946] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The finite state projection (FSP) approach to solving the chemical master equation has enabled successful inference of discrete stochastic models to predict single-cell gene regulation dynamics. Unfortunately, the FSP approach is highly computationally intensive for all but the simplest models, an issue that is highly problematic when parameter inference and uncertainty quantification takes enormous numbers of parameter evaluations. To address this issue, we propose two new computational methods for the Bayesian inference of stochastic gene expression parameters given single-cell experiments. We formulate and verify an adaptive delayed acceptance Metropolis-Hastings (ADAMH) algorithm to utilize with reduced Krylov-basis projections of the FSP. We then introduce an extension of the ADAMH into a hybrid scheme that consists of an initial phase to construct a reduced model and a faster second phase to sample from the approximate posterior distribution determined by the constructed model. We test and compare both algorithms to an adaptive Metropolis algorithm with full FSP-based likelihood evaluations on three example models and simulated data to show that the new ADAMH variants achieve substantial speedup in comparison to the full FSP approach. By reducing the computational costs of parameter estimation, we expect the ADAMH approach to enable efficient data-driven estimation for more complex gene regulation models.
Collapse
Affiliation(s)
- Huy D Vo
- Department of Chemical and Biological Engineering , Colorado State University , Fort Collins , Colorado 80523 , United States
| | - Zachary Fox
- Keck Scholars, School of Biomedical Engineering , Colorado State University , Fort Collins , Colorado 80523 , United States
| | - Ania Baetica
- Department of Biochemistry and Biophysics , University of California San Francisco , San Francisco , California 94158 , United States
| | - Brian Munsky
- Department of Chemical and Biological Engineering , Colorado State University , Fort Collins , Colorado 80523 , United States.,Keck Scholars, School of Biomedical Engineering , Colorado State University , Fort Collins , Colorado 80523 , United States
| |
Collapse
|
8
|
Vo HD, Sidje RB. An adaptive solution to the chemical master equation using tensors. J Chem Phys 2018; 147:044102. [PMID: 28764339 DOI: 10.1063/1.4994917] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023] Open
Abstract
Solving the chemical master equation directly is difficult due to the curse of dimensionality. We tackle that challenge by a numerical scheme based on the quantized tensor train (QTT) format, which enables us to represent the solution in a compressed form that scales linearly with the dimension. We recast the finite state projection in this QTT framework and allow it to expand adaptively based on proven error criteria. The end result is a QTT-formatted matrix exponential that we evaluate through a combination of the inexact uniformization technique and the alternating minimal energy algorithm. Our method can detect when the equilibrium distribution is reached with an inexpensive test that exploits the structure of the tensor format. We successfully perform numerical tests on high-dimensional problems that had been out of reach for classical approaches.
Collapse
Affiliation(s)
- Huy D Vo
- Department of Mathematics, The University of Alabama, Tuscaloosa, Alabama 35487, USA
| | - Roger B Sidje
- Department of Mathematics, The University of Alabama, Tuscaloosa, Alabama 35487, USA
| |
Collapse
|
9
|
Dinh KN, Sidje RB. An application of the Krylov-FSP-SSA method to parameter fitting with maximum likelihood. Phys Biol 2017; 14:065001. [PMID: 29098989 DOI: 10.1088/1478-3975/aa868a] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
Monte Carlo methods such as the stochastic simulation algorithm (SSA) have traditionally been employed in gene regulation problems. However, there has been increasing interest to directly obtain the probability distribution of the molecules involved by solving the chemical master equation (CME). This requires addressing the curse of dimensionality that is inherent in most gene regulation problems. The finite state projection (FSP) seeks to address the challenge and there have been variants that further reduce the size of the projection or that accelerate the resulting matrix exponential. The Krylov-FSP-SSA variant has proved numerically efficient by combining, on one hand, the SSA to adaptively drive the FSP, and on the other hand, adaptive Krylov techniques to evaluate the matrix exponential. Here we apply this Krylov-FSP-SSA to a mutual inhibitory gene network synthetically engineered in Saccharomyces cerevisiae, in which bimodality arises. We show numerically that the approach can efficiently approximate the transient probability distribution, and this has important implications for parameter fitting, where the CME has to be solved for many different parameter sets. The fitting scheme amounts to an optimization problem of finding the parameter set so that the transient probability distributions fit the observations with maximum likelihood. We compare five optimization schemes for this difficult problem, thereby providing further insights into this approach of parameter estimation that is often applied to models in systems biology where there is a need to calibrate free parameters.
Collapse
Affiliation(s)
- Khanh N Dinh
- Department of Mathematics, University of Alabama, Tuscaloosa, AL 35487, United States of America
| | | |
Collapse
|
10
|
Chu BK, Tse MJ, Sato RR, Read EL. Markov State Models of gene regulatory networks. BMC SYSTEMS BIOLOGY 2017; 11:14. [PMID: 28166778 PMCID: PMC5294885 DOI: 10.1186/s12918-017-0394-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2016] [Accepted: 01/13/2017] [Indexed: 11/22/2022]
Abstract
Background Gene regulatory networks with dynamics characterized by multiple stable states underlie cell fate-decisions. Quantitative models that can link molecular-level knowledge of gene regulation to a global understanding of network dynamics have the potential to guide cell-reprogramming strategies. Networks are often modeled by the stochastic Chemical Master Equation, but methods for systematic identification of key properties of the global dynamics are currently lacking. Results The method identifies the number, phenotypes, and lifetimes of long-lived states for a set of common gene regulatory network models. Application of transition path theory to the constructed Markov State Model decomposes global dynamics into a set of dominant transition paths and associated relative probabilities for stochastic state-switching. Conclusions In this proof-of-concept study, we found that the Markov State Model provides a general framework for analyzing and visualizing stochastic multistability and state-transitions in gene networks. Our results suggest that this framework—adopted from the field of atomistic Molecular Dynamics—can be a useful tool for quantitative Systems Biology at the network scale. Electronic supplementary material The online version of this article (doi:10.1186/s12918-017-0394-4) contains supplementary material, which is available to authorized users.
Collapse
Affiliation(s)
- Brian K Chu
- Department of Chemical Engineering and Materials Science, University of California Irvine, Irvine, CA, USA
| | - Margaret J Tse
- Department of Chemical Engineering and Materials Science, University of California Irvine, Irvine, CA, USA
| | - Royce R Sato
- Department of Chemical Engineering and Materials Science, University of California Irvine, Irvine, CA, USA
| | - Elizabeth L Read
- Department of Chemical Engineering and Materials Science, University of California Irvine, Irvine, CA, USA. .,Department of Molecular Biology and Biochemistry, University of California Irvine, Irvine, CA, USA.
| |
Collapse
|
11
|
Dinh KN, Sidje RB. Understanding the finite state projection and related methods for solving the chemical master equation. Phys Biol 2016; 13:035003. [PMID: 27176781 DOI: 10.1088/1478-3975/13/3/035003] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
The finite state projection (FSP) method has enabled us to solve the chemical master equation of some biological models that were considered out of reach not long ago. Since the original FSP method, much effort has gone into transforming it into an adaptive time-stepping algorithm as well as studying its accuracy. Some of the improvements include the multiple time interval FSP, the sliding windows, and most notably the Krylov-FSP approach. Our goal in this tutorial is to give the reader an overview of the current methods that build on the FSP.
Collapse
Affiliation(s)
- Khanh N Dinh
- Department of Mathematics University of Alabama Tuscaloosa, Alabama 35487, USA
| | | |
Collapse
|
12
|
Abstract
A stochastic model of cellular p53 regulation was established in Leenders, and Tuszynski (2013 Front. Oncol. 3 1-16) to study the interactions of p53 with MDM2 proteins, where the stochastic analysis was done using a Monte Carlo approach. We revisit that model here using an alternative scheme, which is to directly solve the chemical master equation (CME) by an adaptive Krylov-based finite state projection method that combines the stochastic simulation algorithm with other computational strategies, namely Krylov approximation techniques to the matrix exponential, divide and conquer, and aggregation. We report numerical results that demonstrate the extend of tackling the CME with this combination of tools.
Collapse
Affiliation(s)
- Huy D Vo
- Department of Mathematics University of Alabama Tuscaloosa, AL 35487, USA
| | | |
Collapse
|