1
Liu C, Wang J. Distilling dynamical knowledge from stochastic reaction networks. Proc Natl Acad Sci U S A 2024; 121:e2317422121. [PMID: 38530895 PMCID: PMC10998579 DOI: 10.1073/pnas.2317422121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 02/20/2024] [Indexed: 03/28/2024]
Abstract
Stochastic reaction networks are widely used in the modeling of stochastic systems across diverse domains such as biology, chemistry, physics, and ecology. However, the comprehension of the dynamic behaviors inherent in stochastic reaction networks is a formidable undertaking, primarily due to the exponential growth in the number of possible states or trajectories as the state space dimension increases. In this study, we introduce a knowledge distillation method based on reinforcement learning principles, aimed at compressing the dynamical knowledge encoded in stochastic reaction networks into a singular neural network construct. The trained neural network possesses the capability to accurately predict the state conditional joint probability distribution that corresponds to the given query contexts, when prompted with rate parameters, initial conditions, and time values. This obviates the need to track the dynamical process, enabling the direct estimation of normalized state and trajectory probabilities, without necessitating the integration over the complete state space. By applying our method to representative examples, we have observed a high degree of accuracy in both multimodal and high-dimensional systems. Additionally, the trained neural network can serve as a foundational model for developing efficient algorithms for parameter inference and trajectory ensemble generation. These results collectively underscore the efficacy of our approach as a universal means of distilling knowledge from stochastic reaction networks. Importantly, our methodology also spotlights the potential utility in harnessing a singular, pretrained, large-scale model to encapsulate the solution space underpinning a wide spectrum of stochastic dynamical systems.
Affiliation(s)
- Chuanbo Liu
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, People’s Republic of China
- Jin Wang
- Center for Theoretical Interdisciplinary Sciences, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325001, People’s Republic of China
- Department of Chemistry and of Physics and Astronomy, State University of New York at Stony Brook, NY 11794-3400
2
Carruthers J, Finnie T. Using mixture density networks to emulate a stochastic within-host model of Francisella tularensis infection. PLoS Comput Biol 2023; 19:e1011266. [PMID: 38117811 PMCID: PMC10766174 DOI: 10.1371/journal.pcbi.1011266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 01/04/2024] [Accepted: 12/01/2023] [Indexed: 12/22/2023]
Abstract
For stochastic models with large numbers of states, analytical techniques are often impractical, and simulations time-consuming and computationally demanding. This limitation can hinder the practical implementation of such models. In this study, we demonstrate how neural networks can be used to develop emulators for two outputs of a stochastic within-host model of Francisella tularensis infection: the dose-dependent probability of illness and the incubation period. Once the emulators are constructed, we employ Markov Chain Monte Carlo sampling methods to parameterize the within-host model using records of human infection. This inference is only possible through the use of a mixture density network to emulate the incubation period, providing accurate approximations of the corresponding probability distribution. Notably, these estimates improve upon previous approaches that relied on bacterial counts from the lungs of macaques. Our findings reveal a 50% infectious dose of approximately 10 colony-forming units and we estimate that the incubation period can last for up to 11 days following low dose exposure.
Affiliation(s)
- Jonathan Carruthers
- Data, Analytics and Surveillance; UK Health Security Agency, Porton Down, United Kingdom
- Thomas Finnie
- Data, Analytics and Surveillance; UK Health Security Agency, Porton Down, United Kingdom
3
Benyó B, Paláncz B, Szlávecz Á, Szabó B, Kovács K, Chase JG. Classification-based deep neural network vs mixture density network models for insulin sensitivity prediction problem. Computer Methods and Programs in Biomedicine 2023; 240:107633. [PMID: 37343375 DOI: 10.1016/j.cmpb.2023.107633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 05/21/2023] [Accepted: 05/29/2023] [Indexed: 06/23/2023]
Abstract
Model-based glycemic control (GC) protocols are used to treat stress-induced hyperglycaemia in intensive care units (ICUs). The STAR (Stochastic-TARgeted) glycemic control protocol - used in clinical practice in several ICUs in New Zealand, Hungary, Belgium, and Malaysia - is a model-based GC protocol using a patient-specific, model-based insulin sensitivity to describe the patient's actual state. Two neural network based methods are defined in this study to predict the patient's insulin sensitivity parameter: a classification deep neural network and a Mixture Density Network based method. Treatment data from three different patient cohorts are used to train the network models. Accuracy of neural network predictions are compared with the current model-based predictions used to guide care. The prediction accuracy was found to be the same or better than the reference. The authors suggest that these methods may be a promising alternative in model-based clinical treatment for patient state prediction. Still, more research is needed to validate these findings, including in-silico simulations and clinical validation trials.
Affiliation(s)
- Balázs Benyó
- Department of Control Engineering and Information Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Béla Paláncz
- Department of Control Engineering and Information Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Ákos Szlávecz
- Department of Control Engineering and Information Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Bálint Szabó
- Department of Control Engineering and Information Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, Hungary
- Katalin Kovács
- Department of Informatics, Széchenyi István University, Győr, Hungary
- J Geoffrey Chase
- Department of Mechanical Engineering, University of Canterbury, Christchurch, New Zealand
4
Zhang W, Valencia A, Chang NB. Synergistic Integration Between Machine Learning and Agent-Based Modeling: A Multidisciplinary Review. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:2170-2190. [PMID: 34473633 DOI: 10.1109/tnnls.2021.3106777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Agent-based modeling (ABM) involves developing models in which agents make adaptive decisions in a changing environment. Machine-learning (ML) based inference models can improve sequential decision-making by learning agents' behavioral patterns. With the aid of ML, this emerging area can extend traditional agent-based schemes that hardcode agents' behavioral rules into an adaptive model. Even though there are plenty of studies that apply ML in ABMs, the generalized applicable scenarios, frameworks, and procedures for implementations are not well addressed. In this article, we provide a comprehensive review of applying ML in ABM based on four major scenarios, i.e., microagent-level situational awareness learning, microagent-level behavior intervention, macro-ABM-level emulator, and sequential decision-making. For these four scenarios, the related algorithms, frameworks, procedures of implementations, and multidisciplinary applications are thoroughly investigated. We also discuss how ML can improve prediction in ABMs by trading off the variance and bias and how ML can improve the sequential decision-making of microagent and macrolevel policymakers via a mechanism of reinforced behavioral intervention. At the end of this article, future perspectives of applying ML in ABMs are discussed with respect to data acquisition and quality issues, the possible solution of solving the convergence problem of reinforcement learning, interpretable ML applications, and bounded rationality of ABM.
5
Jørgensen ACS, Ghosh A, Sturrock M, Shahrezaei V. Efficient Bayesian inference for stochastic agent-based models. PLoS Comput Biol 2022; 18:e1009508. [PMID: 36197919 PMCID: PMC9576090 DOI: 10.1371/journal.pcbi.1009508] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/01/2021] [Revised: 10/17/2022] [Accepted: 09/21/2022] [Indexed: 11/14/2022]
Abstract
The modelling of many real-world problems relies on computationally heavy simulations of randomly interacting individuals or agents. However, the values of the parameters that underlie the interactions between agents are typically poorly known, and hence they need to be inferred from macroscopic observations of the system. Since statistical inference rests on repeated simulations to sample the parameter space, the high computational expense of these simulations can become a stumbling block. In this paper, we compare two ways to mitigate this issue in a Bayesian setting through the use of machine learning methods: One approach is to construct lightweight surrogate models to substitute the simulations used in inference. Alternatively, one might altogether circumvent the need for Bayesian sampling schemes and directly estimate the posterior distribution. We focus on stochastic simulations that track autonomous agents and present two case studies: tumour growths and the spread of infectious diseases. We demonstrate that good accuracy in inference can be achieved with a relatively small number of simulations, making our machine learning approaches orders of magnitude faster than classical simulation-based methods that rely on sampling the parameter space. However, we find that while some methods generally produce more robust results than others, no algorithm offers a one-size-fits-all solution when attempting to infer model parameters from observations. Instead, one must choose the inference technique with the specific real-world application in mind. The stochastic nature of the considered real-world phenomena poses an additional challenge that can become insurmountable for some approaches. Overall, we find machine learning approaches that create direct inference machines to be promising for real-world applications. We present our findings as general guidelines for modelling practitioners.
Affiliation(s)
- Marc Sturrock
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, Ireland
- Vahid Shahrezaei
- Department of Mathematics, Faculty of Natural Sciences, Imperial College London, London, United Kingdom
6
Sukys A, Öcal K, Grima R. Approximating Solutions of the Chemical Master Equation using Neural Networks. iScience 2022; 25:105010. [PMID: 36117994 PMCID: PMC9474291 DOI: 10.1016/j.isci.2022.105010] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/02/2022] [Revised: 06/13/2022] [Accepted: 08/18/2022] [Indexed: 10/27/2022]
7
Davahli MR, Karwowski W, Fiok K. Optimizing COVID-19 vaccine distribution across the United States using deterministic and stochastic recurrent neural networks. PLoS One 2021; 16:e0253925. [PMID: 34228740 PMCID: PMC8259963 DOI: 10.1371/journal.pone.0253925] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/29/2020] [Accepted: 06/15/2021] [Indexed: 11/18/2022]
Abstract
Optimizing COVID-19 vaccine distribution can help plan around the limited production and distribution of vaccination, particularly in early stages. One of the main criteria for equitable vaccine distribution is predicting the geographic distribution of active virus at the time of vaccination. This research developed sequence-learning models to predict the behavior of the COVID-19 pandemic across the US, based on previously reported information. For this objective, we used two time-series datasets of confirmed COVID-19 cases and COVID-19 effective reproduction numbers from January 22, 2020 to November 26, 2020 for all states in the US. The datasets have 310 time-steps (days) and 50 features (US states). To avoid training the models for all states, we categorized US states on the basis of their similarity to previously reported COVID-19 behavior. For this purpose, we used an unsupervised self-organizing map to categorize all states of the US into four groups on the basis of the similarity of their effective reproduction numbers. After selecting a leading state (the state with earliest outbreaks) in each group, we developed deterministic and stochastic Long Short Term Memory (LSTM) and Mixture Density Network (MDN) models. We trained the models with data from each leading state to make predictions, then compared the models with a baseline linear regression model. We also remove seasonality and trends from a dataset of non-stationary COVID-19 cases to determine the effects on prediction. We showed that the deterministic LSTM model trained on the COVID-19 effective reproduction numbers outperforms other prediction methods.
Affiliation(s)
- Mohammad Reza Davahli
- Department of Industrial Engineering and Management Systems, University of Central Florida, Orlando, Florida, United States of America
- Waldemar Karwowski
- Department of Industrial Engineering and Management Systems, University of Central Florida, Orlando, Florida, United States of America
- Krzysztof Fiok
- Department of Industrial Engineering and Management Systems, University of Central Florida, Orlando, Florida, United States of America
8
Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proceedings. International Conference on Intelligent Systems for Molecular Biology 1994. [PMID: 7584402 DOI: 10.2139/ssrn.3705225] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 05/14/2023]
Abstract
The algorithm described in this paper discovers one or more motifs in a collection of DNA or protein sequences by using the technique of expectation maximization to fit a two-component finite mixture model to the set of sequences. Multiple motifs are found by fitting a mixture model to the data, probabilistically erasing the occurrences of the motif thus found, and repeating the process to find successive motifs. The algorithm requires only a set of unaligned sequences and a number specifying the width of the motifs as input. It returns a model of each motif and a threshold which together can be used as a Bayes-optimal classifier for searching for occurrences of the motif in other databases. The algorithm estimates how many times each motif occurs in each sequence in the dataset and outputs an alignment of the occurrences of the motif. The algorithm is capable of discovering several different motifs with differing numbers of occurrences in a single dataset.
Affiliation(s)
- T L Bailey
- Department of Computer Science and Engineering, University of California at San Diego, La Jolla 92093-0114, USA