1
Yang J, Daily NJ, Pullinger TK, Wakatsuki T, Sobie EA. Creating cell-specific computational models of stem cell-derived cardiomyocytes using optical experiments. PLoS Comput Biol 2024; 20:e1011806. [PMID: 39259757 PMCID: PMC11460686 DOI: 10.1371/journal.pcbi.1011806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 10/08/2024] [Accepted: 08/08/2024] [Indexed: 09/13/2024] [Open Access]
Abstract
Human induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) have gained traction as a powerful model in cardiac disease and therapeutics research, since iPSCs are self-renewing and can be derived from healthy and diseased patients without invasive surgery. However, current iPSC-CM differentiation methods produce cardiomyocytes with immature, fetal-like electrophysiological phenotypes, and the variety of maturation protocols in the literature results in phenotypic differences between labs. Heterogeneity of iPSC donor genetic backgrounds contributes to additional phenotypic variability. Several mathematical models of iPSC-CM electrophysiology have been developed to help to predict cell responses, but these models individually do not capture the phenotypic variability observed in iPSC-CMs. Here, we tackle these limitations by developing a computational pipeline to calibrate cell preparation-specific iPSC-CM electrophysiological parameters. We used the genetic algorithm (GA), a heuristic parameter calibration method, to tune ion channel parameters in a mathematical model of iPSC-CM physiology. To systematically optimize an experimental protocol that generates sufficient data for parameter calibration, we created in silico datasets by simulating various protocols applied to a population of models with known conductance variations, and then fitted parameters to those datasets. We found that calibrating to voltage and calcium transient data under 3 varied experimental conditions, including electrical pacing combined with ion channel blockade and changing buffer ion concentrations, improved model parameter estimates and model predictions of unseen channel block responses. This observation also held when the fitted data were normalized, suggesting that normalized fluorescence recordings, which are more accessible and higher throughput than patch clamp recordings, could sufficiently inform conductance parameters. 
Therefore, this computational pipeline can be applied to different iPSC-CM preparations to determine cell line-specific ion channel properties and understand the mechanisms behind variability in perturbation responses.
Affiliation(s)
- Janice Yang
  - Department of Pharmacological Sciences & Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Neil J. Daily
  - InvivoSciences Inc., Madison, Wisconsin, United States of America
- Taylor K. Pullinger
  - Department of Pharmacological Sciences & Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Eric A. Sobie
  - Department of Pharmacological Sciences & Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
2
Isenberg NM, Mertins SD, Yoon BJ, Reyes KG, Urban NM. Identifying Bayesian optimal experiments for uncertain biochemical pathway models. Sci Rep 2024; 14:15237. [PMID: 38956095 PMCID: PMC11219779 DOI: 10.1038/s41598-024-65196-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 06/18/2024] [Indexed: 07/04/2024] [Open Access]
Abstract
Pharmacodynamic (PD) models are mathematical models of cellular reaction networks that include drug mechanisms of action. These models are useful for studying predictive therapeutic outcomes of novel drug therapies in silico. However, PD models are known to possess significant uncertainty with respect to constituent parameter data, leading to uncertainty in the model predictions. Furthermore, experimental data to calibrate these models is often limited or unavailable for novel pathways. In this study, we present a Bayesian optimal experimental design approach for improving PD model prediction accuracy. We then apply our method using simulated experimental data to account for uncertainty in hypothetical laboratory measurements. This leads to a probabilistic prediction of drug performance and a quantitative measure of which prospective laboratory experiment will optimally reduce prediction uncertainty in the PD model. The methods proposed here provide a way forward for uncertainty quantification and guided experimental design for models of novel biological pathways.
Affiliation(s)
- Susan D Mertins
  - Frederick National Laboratory for Cancer Research, Frederick, MD, 21702, USA
- Byung-Jun Yoon
  - Texas A&M University, College Station, TX, 77843, USA
  - Brookhaven National Laboratory, Upton, NY, 11973, USA
- Kristofer G Reyes
  - University at Buffalo, Buffalo, NY, 14260, USA
  - Brookhaven National Laboratory, Upton, NY, 11973, USA
3
Antal BB, Chesebro AG, Strey HH, Mujica-Parodi LR, Weistuch C. Achieving Occam's razor: Deep learning for optimal model reduction. PLoS Comput Biol 2024; 20:e1012283. [PMID: 39024398 PMCID: PMC11288447 DOI: 10.1371/journal.pcbi.1012283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Revised: 07/30/2024] [Accepted: 06/27/2024] [Indexed: 07/20/2024] [Open Access]
Abstract
All fields of science depend on mathematical models. Occam's razor refers to the principle that good models should exclude parameters beyond those minimally required to describe the systems they represent. This is because redundancy can lead to incorrect estimates of model parameters from data, and thus inaccurate or ambiguous conclusions. Here, we show how deep learning can be powerfully leveraged to apply Occam's razor to model parameters. Our method, FixFit, uses a feedforward deep neural network with a bottleneck layer to characterize and predict the behavior of a given model from its input parameters. FixFit has three major benefits. First, it provides a metric to quantify the original model's degree of complexity. Second, it allows for the unique fitting of data. Third, it provides an unbiased way to discriminate between experimental hypotheses that add value versus those that do not. In three use cases, we demonstrate the broad applicability of this method across scientific domains. To validate the method using a known system, we apply FixFit to recover known composite parameters for the Kepler orbit model and a dynamic model of blood glucose regulation. In the latter, we demonstrate the ability to fit the latent parameters to real data. To illustrate how the method can be applied to less well-established fields, we use it to identify parameters for a multi-scale brain model and reduce the search space for viable candidate mechanisms.
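The redundancy problem the abstract describes can be seen in a toy case. The sketch below is not FixFit and does not come from the paper; `toy_model` and `composite` are hypothetical names for a model whose output depends on two parameters only through their product, so the individual parameters cannot be recovered from data while the composite can.

```python
def toy_model(a, b, xs):
    """Hypothetical redundant model: output depends on a and b only via a * b."""
    return [a * b * x for x in xs]


def composite(a, b):
    """The single composite parameter that the outputs actually determine."""
    return a * b


xs = [0.0, 0.25, 0.5, 0.75, 1.0]

# Two different parameter pairs with the same product.
out_1 = toy_model(2.0, 3.0, xs)
out_2 = toy_model(1.0, 6.0, xs)

# The outputs coincide, so no fit to these data can tell the pairs apart.
print(out_1 == out_2)
print(composite(2.0, 3.0), composite(1.0, 6.0))
```

In this toy setting, fitting the single composite value instead of `a` and `b` separately is what makes the fit unique; the paper's method is aimed at finding such composites when they are not known in advance.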
Affiliation(s)
- Botond B. Antal
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York, United States of America
- Anthony G. Chesebro
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York, United States of America
- Helmut H. Strey
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York, United States of America
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, United States of America
- Lilianne R. Mujica-Parodi
  - Department of Biomedical Engineering, Stony Brook University, Stony Brook, New York, United States of America
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, United States of America
  - Santa Fe Institute, Santa Fe, New Mexico, United States of America
- Corey Weistuch
  - Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
4
Valentin S, Kleinegesse S, Bramley NR, Seriès P, Gutmann MU, Lucas CG. Designing optimal behavioral experiments using machine learning. eLife 2024; 13:e86224. [PMID: 38261382 PMCID: PMC10805374 DOI: 10.7554/elife.86224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 11/19/2023] [Indexed: 01/24/2024] [Open Access]
Abstract
Computational models are powerful tools for understanding human cognition and behavior. They let us express our theories clearly and precisely and offer predictions that can be subtle and often counter-intuitive. However, this same richness and ability to surprise means our scientific intuitions and traditional tools are ill-suited to designing experiments to test and compare these models. To avoid these pitfalls and realize the full potential of computational modeling, we require tools to design experiments that provide clear answers about what models explain human behavior and the auxiliary assumptions those models must make. Bayesian optimal experimental design (BOED) formalizes the search for optimal experimental designs by identifying experiments that are expected to yield informative data. In this work, we provide a tutorial on leveraging recent advances in BOED and machine learning to find optimal experiments for any kind of model that we can simulate data from, and show how by-products of this procedure allow for quick and straightforward evaluation of models and their parameters against real experimental data. As a case study, we consider theories of how people balance exploration and exploitation in multi-armed bandit decision-making tasks. We validate the presented approach using simulations and a real-world experiment. As compared to experimental designs commonly used in the literature, we show that our optimal designs more efficiently determine which of a set of models best account for individual human behavior, and more efficiently characterize behavior given a preferred model. At the same time, formalizing a scientific question such that it can be adequately addressed with BOED can be challenging and we discuss several potential caveats and pitfalls that practitioners should be aware of. We provide code to replicate all analyses as well as tutorial notebooks and pointers to adapt the methodology to different experimental settings.
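As a rough illustration of what "expected to yield informative data" means in BOED, the sketch below scores candidate designs by expected information gain (EIG) in a linear-Gaussian toy problem where the EIG has a closed form. This is an assumption-laden example, not the tutorial's method: the model `y = theta * design + noise`, the function name, and the default standard deviations are all hypothetical, and simulator-based models of the kind the abstract targets generally have no such closed form.

```python
import math


def expected_information_gain(design, prior_sd=1.0, noise_sd=0.5):
    """EIG in nats for the toy model y = theta * design + noise.

    Assumes theta ~ N(0, prior_sd^2) and noise ~ N(0, noise_sd^2), in which
    case the mutual information between theta and y is
    0.5 * ln(1 + design^2 * prior_sd^2 / noise_sd^2).
    """
    return 0.5 * math.log(1.0 + (design * prior_sd / noise_sd) ** 2)


# Hypothetical candidate designs (e.g. stimulus magnitudes).
candidate_designs = [0.0, 0.5, 1.0, 2.0]

# The optimal design is the candidate with the largest EIG.
best_design = max(candidate_designs, key=expected_information_gain)
print(best_design)
```

A design of zero carries no information about `theta` in this toy model, which is why its EIG is exactly zero; larger magnitudes increase the signal relative to the fixed noise.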
Affiliation(s)
- Simon Valentin
  - School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Neil R Bramley
  - Department of Psychology, University of Edinburgh, Edinburgh, United Kingdom
- Peggy Seriès
  - School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
- Michael U Gutmann
  - School of Informatics, University of Edinburgh, Edinburgh, United Kingdom
5
Yang J, Daily N, Pullinger TK, Wakatsuki T, Sobie EA. Creating cell-specific computational models of stem cell-derived cardiomyocytes using optical experiments. bioRxiv 2024:2024.01.07.574577. [PMID: 38260376 PMCID: PMC10802448 DOI: 10.1101/2024.01.07.574577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Human induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) have gained traction as a powerful model in cardiac disease and therapeutics research, since iPSCs are self-renewing and can be derived from healthy and diseased patients without invasive surgery. However, current iPSC-CM differentiation methods produce cardiomyocytes with immature, fetal-like electrophysiological phenotypes, and the variety of maturation protocols in the literature results in phenotypic differences between labs. Heterogeneity of iPSC donor genetic backgrounds contributes to additional phenotypic variability. Several mathematical models of iPSC-CM electrophysiology have been developed to help understand the ionic underpinnings of, and to simulate, various cell responses, but these models individually do not capture the phenotypic variability observed in iPSC-CMs. Here, we tackle these limitations by developing a computational pipeline to calibrate cell preparation-specific iPSC-CM electrophysiological parameters. We used the genetic algorithm (GA), a heuristic parameter calibration method, to tune ion channel parameters in a mathematical model of iPSC-CM physiology. To systematically optimize an experimental protocol that generates sufficient data for parameter calibration, we created simulated datasets by applying various protocols to a population of in silico cells with known conductance variations, and we fitted to those datasets. We found that calibrating models to voltage and calcium transient data under 3 varied experimental conditions, including electrical pacing combined with ion channel blockade and changing buffer ion concentrations, improved model parameter estimates and model predictions of unseen channel block responses. 
This observation held regardless of whether the fitted data were normalized, suggesting that normalized fluorescence recordings, which are more accessible and higher throughput than patch clamp recordings, could sufficiently inform conductance parameters. Therefore, this computational pipeline can be applied to different iPSC-CM preparations to determine cell line-specific ion channel properties and understand the mechanisms behind variability in perturbation responses.
Affiliation(s)
- Janice Yang
  - Department of Pharmacological Sciences & Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Neil Daily
  - InvivoSciences Inc., Madison, WI 53719, USA
- Taylor K. Pullinger
  - Department of Pharmacological Sciences & Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Eric A. Sobie
  - Department of Pharmacological Sciences & Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
6
Lei CL, Clerx M, Gavaghan DJ, Mirams GR. Model-driven optimal experimental design for calibrating cardiac electrophysiology models. Comput Methods Programs Biomed 2023; 240:107690. [PMID: 37478675 DOI: 10.1016/j.cmpb.2023.107690] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/14/2023] [Revised: 06/09/2023] [Accepted: 06/22/2023] [Indexed: 07/23/2023]
Abstract
BACKGROUND AND OBJECTIVE: Models of the cardiomyocyte action potential have contributed immensely to the understanding of heart function, pathophysiology, and the origin of heart rhythm disturbances. However, action potential models are highly nonlinear, making them difficult to parameterise and limiting them to describing 'average cell' dynamics, when cell-specific models would be ideal to uncover inter-cell variability but are too experimentally challenging to be achieved. Here, we focus on automatically designing experimental protocols that allow us to better identify cell-specific maximum conductance values for each major current type.
METHODS AND RESULTS: We developed an approach that applies optimal experimental designs to patch-clamp experiments, including both voltage-clamp and current-clamp experiments. We assessed the models calibrated to these new optimal designs by comparing them to the models calibrated to some of the commonly used designs in the literature. We showed that optimal designs are not only overall shorter in duration but also able to perform better than many of the existing experiment designs in terms of identifying model parameters and hence model predictive power.
CONCLUSIONS: For cardiac cellular electrophysiology, this approach will allow researchers to define their hypothesis of the dynamics of the system and automatically design experimental protocols that will result in theoretically optimal designs.
Affiliation(s)
- Chon Lok Lei
  - Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China; Department of Biomedical Sciences, Faculty of Health Sciences, University of Macau, Macau, China
- Michael Clerx
  - Centre for Mathematical Medicine & Biology, School of Mathematical Sciences, University of Nottingham, Nottingham, United Kingdom
- David J Gavaghan
  - Department of Computer Science, University of Oxford, Oxford, United Kingdom; Doctoral Training Centre, University of Oxford, Oxford, United Kingdom
- Gary R Mirams
  - Centre for Mathematical Medicine & Biology, School of Mathematical Sciences, University of Nottingham, Nottingham, United Kingdom
7
Thompson JC, Zavala VM, Venturelli OS. Integrating a tailored recurrent neural network with Bayesian experimental design to optimize microbial community functions. PLoS Comput Biol 2023; 19:e1011436. [PMID: 37773951 PMCID: PMC10540976 DOI: 10.1371/journal.pcbi.1011436] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/23/2022] [Accepted: 08/16/2023] [Indexed: 10/01/2023] [Open Access]
Abstract
Microbiomes interact dynamically with their environment to perform exploitable functions such as production of valuable metabolites and degradation of toxic metabolites for a wide range of applications in human health, agriculture, and environmental cleanup. Developing computational models to predict the key bacterial species and environmental factors to build and optimize such functions is crucial to accelerate microbial community engineering. However, there is an unknown web of interactions that determine the highly complex and dynamic behavior of these systems, which precludes the development of models based on known mechanisms. By contrast, entirely data-driven machine learning models can produce physically unrealistic predictions and often require significant amounts of experimental data to learn system behavior. We develop a physically-constrained recurrent neural network that preserves model flexibility but is constrained to produce physically consistent predictions and show that it can outperform existing machine learning methods in the prediction of certain experimentally measured species abundance and metabolite concentrations. Further, we present a closed-loop, Bayesian experimental design algorithm to guide data collection by selecting experimental conditions that simultaneously maximize information gain and target microbial community functions. Using a bioreactor case study, we demonstrate how the proposed framework can be used to efficiently navigate a large design space to identify optimal operating conditions. The proposed methodology offers a flexible machine learning approach specifically tailored to optimize microbiome target functions through the sequential design of informative experiments that seek to explore and exploit community functions.
Affiliation(s)
- Jaron C. Thompson
  - Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Victor M. Zavala
  - Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Ophelia S. Venturelli
  - Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
8
Haus ES, Drengstig T, Thorsen K. Structural identifiability of biomolecular controller motifs with and without flow measurements as model output. PLoS Comput Biol 2023; 19:e1011398. [PMID: 37639454 PMCID: PMC10491402 DOI: 10.1371/journal.pcbi.1011398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 09/08/2023] [Accepted: 07/28/2023] [Indexed: 08/31/2023] [Open Access]
Abstract
Controller motifs are simple biomolecular reaction networks with negative feedback. They can explain how regulatory function is achieved and are often used as building blocks in mathematical models of biological systems. In this paper we perform an extensive investigation into structural identifiability of controller motifs, specifically the so-called basic and antithetic controller motifs. Structural identifiability analysis is a useful tool in the creation and evaluation of mathematical models: it can be used to ensure that model parameters can be determined uniquely and to examine which measurements are necessary for this purpose. This is especially useful for biological models where parameter estimation can be difficult due to limited availability of measurable outputs. Our aim with this work is to investigate how structural identifiability is affected by controller motif complexity and choice of measurements. To increase the number of potential outputs we propose two methods for including flow measurements and show how this affects structural identifiability in combination with, or in the absence of, concentration measurements. In our investigation, we analyze 128 different controller motif structures using a combination of flow and/or concentration measurements, giving a total of 3648 instances. Among all instances, 34% of the measurement combinations provided structural identifiability. Our main findings for the controller motifs include: i) a single measurement is insufficient for structural identifiability, ii) measurements related to different chemical species are necessary for structural identifiability. Applying these findings results in a reduced subset of 1568 instances, where 80% are structurally identifiable, and more complex/interconnected motifs appear easier to structurally identify.
The model structures we have investigated are commonly used in models of biological systems, and our results demonstrate how different model structures and measurement combinations affect structural identifiability of controller motifs.
Affiliation(s)
- Eivind S. Haus
  - Department of Electrical Engineering and Computer Science, University of Stavanger, Stavanger, Norway
- Tormod Drengstig
  - Department of Electrical Engineering and Computer Science, University of Stavanger, Stavanger, Norway
- Kristian Thorsen
  - Department of Electrical Engineering and Computer Science, University of Stavanger, Stavanger, Norway
9
Grabowski F, Nałęcz-Jawecki P, Lipniacki T. Predictive power of non-identifiable models. Sci Rep 2023; 13:11143. [PMID: 37429934 DOI: 10.1038/s41598-023-37939-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 06/29/2023] [Indexed: 07/12/2023] [Open Access]
Abstract
Resolving practical non-identifiability of computational models typically requires either additional data or non-algorithmic model reduction, which frequently results in models containing parameters lacking direct interpretation. Here, instead of reducing models, we explore an alternative, Bayesian approach, and quantify the predictive power of non-identifiable models. We considered an example biochemical signalling cascade model as well as its mechanical analogue. For these models, we demonstrated that by measuring a single variable in response to a properly chosen stimulation protocol, the dimensionality of the parameter space is reduced, which allows for predicting the measured variable's trajectory in response to different stimulation protocols even if all model parameters remain unidentified. Moreover, one can predict how such a trajectory will transform in the case of a multiplicative change of an arbitrary model parameter. Successive measurements of remaining variables further reduce the dimensionality of the parameter space and enable new predictions. We analysed potential pitfalls of the proposed approach that can arise when the investigated model is oversimplified, incorrect, or when the training protocol is inadequate. The main advantage of the suggested iterative approach is that the predictive power of the model can be assessed and practically utilised at each step.
Affiliation(s)
- Frederic Grabowski
  - Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland
- Paweł Nałęcz-Jawecki
  - Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland
- Tomasz Lipniacki
  - Institute of Fundamental Technological Research, Polish Academy of Sciences, Warsaw, Poland
10
Cho H, Lewis AL, Storey KM, Byrne HM. Designing experimental conditions to use the Lotka-Volterra model to infer tumor cell line interaction types. J Theor Biol 2023; 559:111377. [PMID: 36470468 DOI: 10.1016/j.jtbi.2022.111377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/25/2022] [Accepted: 11/29/2022] [Indexed: 12/12/2022]
Abstract
The Lotka-Volterra model is widely used to model interactions between two species. Here, we generate synthetic data mimicking competitive, mutualistic and antagonistic interactions between two tumor cell lines, and then use the Lotka-Volterra model to infer the interaction type. Structural identifiability of the Lotka-Volterra model is confirmed, and practical identifiability is assessed for three experimental designs: (a) use of a single data set, with a mixture of both cell lines observed over time, (b) a sequential design where growth rates and carrying capacities are estimated using data from experiments in which each cell line is grown in isolation, and then interaction parameters are estimated from an experiment involving a mixture of both cell lines, and (c) a parallel experimental design where all model parameters are fitted to data from two mixtures (containing both cell lines but with different initial ratios) simultaneously. Each design is tested on data generated from the Lotka-Volterra model with noise added, to determine efficacy in an ideal sense. In addition to assessing each design for practical identifiability, we investigate how the predictive power of the model - i.e., its ability to fit data for initial ratios other than those to which it was calibrated - is affected by the choice of experimental design. The parallel calibration procedure is found to be optimal and is further tested on in silico data generated from a spatially-resolved cellular automaton model, which accounts for oxygen consumption and allows for variation in the intensity level of the interaction between the two cell lines. 
We use this study to highlight the care that must be taken when interpreting parameter estimates for the spatially-averaged Lotka-Volterra model when it is calibrated against data produced by the spatially-resolved cellular automaton model, since baseline competition for space and resources in the CA model may contribute to a discrepancy between the type of interaction used to generate the CA data and the type of interaction inferred by the LV model.
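For readers unfamiliar with how a two-species Lotka-Volterra model encodes interaction types, here is a minimal sketch. It assumes one common logistic parameterization with interaction coefficients `a12` and `a21`; the exact form, parameter values, and the `simulate_lv` helper are illustrative assumptions and are not taken from the paper. Forward Euler is used only to keep the example dependency-free.

```python
def simulate_lv(x0, y0, r1=1.0, r2=1.0, k1=1.0, k2=1.0,
                a12=0.0, a21=0.0, t_end=50.0, dt=0.01):
    """Forward-Euler integration of an assumed two-species Lotka-Volterra form:

        dx/dt = r1 * x * (1 - (x + a12 * y) / k1)
        dy/dt = r2 * y * (1 - (y + a21 * x) / k2)

    Positive a12/a21 reduce the partner's growth (competition); negative
    values enhance it (mutualism); mixed signs give an antagonistic pair.
    """
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        dx = r1 * x * (1.0 - (x + a12 * y) / k1)
        dy = r2 * y * (1.0 - (y + a21 * x) / k2)
        x, y = x + dt * dx, y + dt * dy
    return x, y


# Same initial mixture, three hypothetical interaction settings.
no_interaction = simulate_lv(0.1, 0.1)
competitive = simulate_lv(0.1, 0.1, a12=0.5, a21=0.5)
mutualistic = simulate_lv(0.1, 0.1, a12=-0.5, a21=-0.5)
print(no_interaction, competitive, mutualistic)
```

With these illustrative values, each population settles at its carrying capacity when the species do not interact, below it under competition, and above it under mutualism, which is the kind of signature the inferred interaction parameters are meant to capture.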
Affiliation(s)
- Heyrim Cho
  - Department of Mathematics, University of California, Riverside, CA, United States of America
- Allison L Lewis
  - Department of Mathematics, Lafayette College, Easton, PA, United States of America
- Kathleen M Storey
  - Department of Mathematics, Lafayette College, Easton, PA, United States of America
- Helen M Byrne
  - Department of Mathematics, University of Oxford, Oxford, UK
11
Vittadello ST, Stumpf MPH. Open problems in mathematical biology. Math Biosci 2022; 354:108926. [PMID: 36377100 DOI: 10.1016/j.mbs.2022.108926] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/16/2022] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 11/06/2022]
Abstract
Biology is data-rich, and it is equally rich in concepts and hypotheses. Part of trying to understand biological processes and systems is therefore to confront our ideas and hypotheses with data using statistical methods to determine the extent to which our hypotheses agree with reality. But doing so in a systematic way is becoming increasingly challenging as our hypotheses become more detailed, and our data becomes more complex. Mathematical methods are therefore gaining in importance across the life- and biomedical sciences. Mathematical models allow us to test our understanding, make testable predictions about future behaviour, and gain insights into how we can control the behaviour of biological systems. It has been argued that mathematical methods can be of great benefit to biologists to make sense of data. But mathematics and mathematicians are set to benefit equally from considering the often bewildering complexity inherent to living systems. Here we present a small selection of open problems and challenges in mathematical biology. We have chosen these open problems because they are of both biological and mathematical interest.
Affiliation(s)
- Sean T Vittadello
  - Melbourne Integrative Genomics, University of Melbourne, Australia; School of BioSciences, University of Melbourne, Australia
- Michael P H Stumpf
  - Melbourne Integrative Genomics, University of Melbourne, Australia; School of BioSciences, University of Melbourne, Australia; School of Mathematics and Statistics, University of Melbourne, Australia
12
Treloar NJ, Braniff N, Ingalls B, Barnes CP. Deep reinforcement learning for optimal experimental design in biology. PLoS Comput Biol 2022; 18:e1010695. [PMID: 36409776 PMCID: PMC9721483 DOI: 10.1371/journal.pcbi.1010695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2022] [Revised: 12/05/2022] [Accepted: 10/31/2022] [Indexed: 11/22/2022] [Open Access]
Abstract
The field of optimal experimental design uses mathematical techniques to determine experiments that are maximally informative from a given experimental setup. Here we apply a technique from artificial intelligence (reinforcement learning) to the optimal experimental design task of maximizing confidence in estimates of model parameter values. We show that a reinforcement learning approach performs favourably in comparison with a one-step ahead optimisation algorithm and a model predictive controller for the inference of bacterial growth parameters in a simulated chemostat. Further, we demonstrate the ability of reinforcement learning to train over a distribution of parameters, indicating that this approach is robust to parametric uncertainty.
Affiliation(s)
- Neythen J. Treloar
  - Department of Cell and Developmental Biology, University College London, London, United Kingdom
- Nathan Braniff
  - Department of Applied Mathematics, University of Waterloo, Waterloo, Canada
- Brian Ingalls
  - Department of Applied Mathematics, University of Waterloo, Waterloo, Canada
- Chris P. Barnes
  - Department of Cell and Developmental Biology, University College London, London, United Kingdom
  - UCL Genetics Institute, University College London, London, United Kingdom
|
13
|
Palgen JL, Perrillat-Mercerot A, Ceres N, Peyronnet E, Coudron M, Tixier E, Illigens BMW, Bosley J, L’Hostis A, Monteiro C. Integration of Heterogeneous Biological Data in Multiscale Mechanistic Model Calibration: Application to Lung Adenocarcinoma. Acta Biotheor 2022; 70:19. [PMID: 35796890 PMCID: PMC9261258 DOI: 10.1007/s10441-022-09445-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Accepted: 06/15/2022] [Indexed: 11/26/2022]
Abstract
Mechanistic models are built using knowledge as the primary information source, with well-established biological and physical laws determining the causal relationships within the model. Once the causal structure of the model is determined, parameters must be defined in order to accurately reproduce relevant data. Determining parameters and their values is particularly challenging in the case of models of pathophysiology, for which data for calibration is sparse. Multiple data sources might be required, and data may not be in a uniform or desirable format. We describe a calibration strategy to address the challenges of scarcity and heterogeneity of calibration data. Our strategy focuses on parameters whose initial values cannot be easily derived from the literature, and our goal is to determine the values of these parameters via calibration with constraints set by relevant data. When combined with a covariance matrix adaptation evolution strategy (CMA-ES), this step-by-step approach can be applied to a wide range of biological models. We describe a stepwise, integrative and iterative approach to multiscale mechanistic model calibration, and provide an example of calibrating a pathophysiological lung adenocarcinoma model. Using the approach described here we illustrate the successful calibration of a complex knowledge-based mechanistic model using only the limited heterogeneous datasets publicly available in the literature.
Affiliation(s)
- Nicoletta Ceres
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
- Matthieu Coudron
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
- Eliott Tixier
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
- Ben M. W. Illigens
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
  - Dresden International University, Freiberger Str. 37, Dresden, 01067 Germany
- Jim Bosley
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
- Adèle L’Hostis
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
- Claudio Monteiro
  - Novadiscovery, Pl. Giovanni da Verrazzano, Lyon, 69009 Rhône France
|
14
|
Eriksson O, Bhalla US, Blackwell KT, Crook SM, Keller D, Kramer A, Linne ML, Saudargienė A, Wade RC, Hellgren Kotaleski J. Combining hypothesis- and data-driven neuroscience modeling in FAIR workflows. eLife 2022; 11:e69013. [PMID: 35792600 PMCID: PMC9259018 DOI: 10.7554/elife.69013] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/06/2021] [Accepted: 05/13/2022] [Indexed: 12/22/2022] Open
Abstract
Modeling in neuroscience occurs at the intersection of different points of view and approaches. Typically, hypothesis-driven modeling brings a question into focus so that a model is constructed to investigate a specific hypothesis about how the system works or why certain phenomena are observed. Data-driven modeling, on the other hand, follows a more unbiased approach, with model construction informed by the computationally intensive use of data. At the same time, researchers employ models at different biological scales and at different levels of abstraction. Combining these models while validating them against experimental data increases understanding of the multiscale brain. However, a lack of interoperability, transparency, and reusability of both models and the workflows used to construct them creates barriers for the integration of models representing different biological scales and built using different modeling philosophies. We argue that the same imperatives that drive resources and policy for data - such as the FAIR (Findable, Accessible, Interoperable, Reusable) principles - also support the integration of different modeling approaches. The FAIR principles require that data be shared in formats that are Findable, Accessible, Interoperable, and Reusable. Applying these principles to models and modeling workflows, as well as the data used to constrain and validate them, would allow researchers to find, reuse, question, validate, and extend published models, regardless of whether they are implemented phenomenologically or mechanistically, as a few equations or as a multiscale, hierarchical system. To illustrate these ideas, we use a classical synaptic plasticity model, the Bienenstock-Cooper-Munro rule, as an example due to its long history, different levels of abstraction, and implementation at many scales.
Affiliation(s)
- Olivia Eriksson
  - Science for Life Laboratory, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Upinder Singh Bhalla
  - National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
- Kim T Blackwell
  - Department of Bioengineering, Volgenau School of Engineering, George Mason University, Fairfax, United States
- Sharon M Crook
  - School of Mathematical and Statistical Sciences, Arizona State University, Tempe, United States
- Daniel Keller
  - Blue Brain Project, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Andrei Kramer
  - Science for Life Laboratory, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
  - Department of Neuroscience, Karolinska Institute, Stockholm, Sweden
- Marja-Leena Linne
  - Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Ausra Saudargienė
  - Neuroscience Institute, Lithuanian University of Health Sciences, Kaunas, Lithuania
  - Department of Informatics, Vytautas Magnus University, Kaunas, Lithuania
- Rebecca C Wade
  - Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
  - Center for Molecular Biology (ZMBH), ZMBH-DKFZ Alliance, University of Heidelberg, Heidelberg, Germany
  - Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
- Jeanette Hellgren Kotaleski
  - Science for Life Laboratory, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
  - Department of Neuroscience, Karolinska Institute, Stockholm, Sweden
|
15
|
Sharp JA, Browning AP, Burrage K, Simpson MJ. Parameter estimation and uncertainty quantification using information geometry. J R Soc Interface 2022; 19:20210940. [PMID: 35472269 PMCID: PMC9042578 DOI: 10.1098/rsif.2021.0940] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/21/2022] Open
Abstract
In this work, we: (i) review likelihood-based inference for parameter estimation and the construction of confidence regions; and (ii) explore the use of techniques from information geometry, including geodesic curves and Riemann scalar curvature, to supplement typical techniques for uncertainty quantification, such as Bayesian methods, profile likelihood, asymptotic analysis and bootstrapping. These techniques from information geometry provide data-independent insights into uncertainty and identifiability, and can be used to inform data collection decisions. All code used in this work to implement the inference and information geometry techniques is available on GitHub.
Affiliation(s)
- Jesse A Sharp
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Queensland, Australia
- Alexander P Browning
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Queensland, Australia
- Kevin Burrage
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Queensland, Australia
  - Department of Computer Science, University of Oxford, Oxford, UK
- Matthew J Simpson
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
  - Centre for Data Science, Queensland University of Technology, Brisbane, Queensland, Australia
|
16
|
Litwin T, Timmer J, Kreutz C. Optimal Experimental Design Based on Two-Dimensional Likelihood Profiles. Front Mol Biosci 2022; 9:800856. [PMID: 35281278 PMCID: PMC8906444 DOI: 10.3389/fmolb.2022.800856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 01/07/2022] [Indexed: 11/13/2022] Open
Abstract
Dynamic behavior of biological systems is commonly represented by non-linear models such as ordinary differential equations. A frequently encountered task in such systems is the estimation of model parameters based on measurement of biochemical compounds. Non-linear models require special techniques to estimate the uncertainty of the obtained model parameters and predictions, e.g. by exploiting the concept of the profile likelihood. Model parameters with significant uncertainty associated with their estimates hinder the interpretation of model results. Informing these model parameters by optimal experimental design minimizes the additional amount of data and therefore resources required in experiments. However, existing techniques of experimental design either require prior parameter distributions in Bayesian approaches or do not adequately deal with the non-linearity of the system in frequentist approaches. For identification of optimal experimental designs, we propose a two-dimensional profile likelihood approach, providing a design criterion which meaningfully represents the expected parameter uncertainty after measuring data for a specified experimental condition. The described approach is implemented into the open source toolbox Data2Dynamics in Matlab. The applicability of the method is demonstrated on an established systems biology model. For this demonstration, available data has been censored to simulate a setting in which parameters are not yet well determined. After determining the optimal experimental condition from the censored ones, a realistic evaluation was possible by re-introducing the censored data point corresponding to the optimal experimental condition. This provided a validation that our method is feasible in real-world applications. The approach applies to, but is not limited to, models in systems biology.
Affiliation(s)
- Tim Litwin
  - Institute of Medical Biometry and Statistics (IMBI), Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
  - Freiburg Center for Data Analysis and Modelling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
  - *Correspondence: Tim Litwin,
- Jens Timmer
  - Freiburg Center for Data Analysis and Modelling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
  - Centre for Integrative Biological Signalling Studies (CIBSS), University of Freiburg, Freiburg, Germany
- Clemens Kreutz
  - Institute of Medical Biometry and Statistics (IMBI), Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
  - Freiburg Center for Data Analysis and Modelling (FDM), University of Freiburg, Freiburg, Germany
  - Centre for Integrative Biological Signalling Studies (CIBSS), University of Freiburg, Freiburg, Germany
|
17
|
Strouwen A, Nicolaï BM, Goos P. Robust dynamic experiments for the precise estimation of respiration and fermentation parameters of fruit and vegetables. PLoS Comput Biol 2022; 18:e1009610. [PMID: 35020716 PMCID: PMC8789162 DOI: 10.1371/journal.pcbi.1009610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2021] [Revised: 01/25/2022] [Accepted: 11/03/2021] [Indexed: 11/18/2022] Open
Abstract
Dynamic models based on non-linear differential equations are increasingly being used in many biological applications. Highly informative dynamic experiments are valuable for the identification of these dynamic models. The storage of fresh fruit and vegetables is one such application where dynamic experimentation is gaining momentum. In this paper, we construct optimal O2 and CO2 gas input profiles to estimate the respiration and fermentation kinetics of pear fruit. The optimal input profiles, however, depend on the true values of the respiration and fermentation parameters. Locally optimal design of input profiles, which uses a single initial guess for the parameters, is the traditional method to deal with this issue. This method, however, is very sensitive to the initial values selected for the model parameters. Therefore, we present a robust experimental design approach that can handle uncertainty on the model parameters.
Affiliation(s)
- Arno Strouwen
  - Department of Biosystems, KU Leuven, Leuven, Belgium
- Peter Goos
  - Department of Biosystems, KU Leuven, Leuven, Belgium
  - Department of Engineering Management, University of Antwerp, Antwerp, Belgium
|
18
|
Roesch E, Rackauckas C, Stumpf MPH. Collocation based training of neural ordinary differential equations. Stat Appl Genet Mol Biol 2021; 20:37-49. [PMID: 34237805 DOI: 10.1515/sagmb-2020-0025] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/06/2020] [Accepted: 05/04/2021] [Indexed: 11/15/2022]
Abstract
The predictive power of machine learning models often exceeds that of mechanistic modeling approaches. However, the interpretation of purely data-driven models, without any mechanistic basis, is often complicated, and predictive power by itself can be a poor metric by which we might want to judge different methods. In this work, we focus on the relatively new modeling techniques of neural ordinary differential equations. We discuss how they relate to machine learning and mechanistic models, with the potential to narrow the gulf between these two frameworks: they constitute a class of hybrid model that integrates ideas from data-driven and dynamical systems approaches. Training neural ODEs as representations of dynamical systems data has its own specific demands, and we here propose a collocation scheme as a fast and efficient training strategy. This alleviates the need for costly ODE solvers. We illustrate the advantages that collocation approaches offer, as well as their robustness to qualitative features of a dynamical system, and the quantity and quality of observational data. We focus on systems that exemplify some of the hallmarks of complex dynamical systems encountered in systems biology, and we map out how these methods can be used in the analysis of mathematical models of cellular and physiological processes.
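As an illustration of the general two-stage collocation idea this abstract refers to, the following is a minimal sketch under stated assumptions, not the authors' method: a linear-in-parameters right-hand side stands in for the neural network, derivatives are estimated by finite differences on noise-free samples rather than from a smoothed fit, and the test system is logistic growth. The function names and all parameter values are hypothetical.

```python
import numpy as np

def logistic_solution(t, r, K, x0):
    """Closed-form solution of dx/dt = r*x*(1 - x/K) (used only to make data)."""
    return K / (1.0 + (K / x0 - 1.0) * np.exp(-r * t))

def fit_rhs_by_collocation(t, x):
    """Fit dx/dt = a*x + b*x**2 without ever calling an ODE solver.

    Stage 1: estimate dx/dt directly from the sampled trajectory.
    Stage 2: regress the derivative estimates on the state features.
    """
    dxdt = np.gradient(x, t, edge_order=2)        # finite-difference derivative
    design = np.column_stack([x, x**2])           # features of the assumed RHS
    coef, *_ = np.linalg.lstsq(design, dxdt, rcond=None)
    a, b = coef
    return a, -a / b                              # r = a, K = -a/b

# Illustrative (assumed) ground truth: r = 1.0, K = 10.0, x0 = 1.0
t = np.linspace(0.0, 10.0, 201)
x = logistic_solution(t, r=1.0, K=10.0, x0=1.0)
r_hat, K_hat = fit_rhs_by_collocation(t, x)
print(f"estimated r = {r_hat:.3f}, estimated K = {K_hat:.3f}")
```

The point of the sketch is that the fit reduces to a regression on derivative estimates, so no numerical integration is needed during training; with noisy data the derivative-estimation stage would need a smoother.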
Affiliation(s)
- Elisabeth Roesch
  - Melbourne Integrative Genomics, University of Melbourne, 30 Royal Parade, Parkville, VIC 3052, Australia
  - School of Mathematics and Statistics, University of Melbourne, 813 Swanston Street, Parkville, VIC 3010, Australia
- Christopher Rackauckas
  - Department of Mathematics, Massachusetts Institute of Technology, 182 Memorial Dr, Cambridge, MA 02142, USA
  - Julia Computing, 240 Elm Street, 2nd Floor, Somerville, Massachusetts 02144, USA
  - Pumas-AI, 14711 Kamputa Drive, Centerville, VA 20120, USA
- Michael P H Stumpf
  - Melbourne Integrative Genomics, University of Melbourne, 30 Royal Parade, Parkville, VIC 3052, Australia
  - School of Mathematics and Statistics, University of Melbourne, 813 Swanston Street, Parkville, VIC 3010, Australia
|
19
|
Introducing Parameter Clustering to the OED Procedure for Model Calibration of a Synthetic Inducible Promoter in S. cerevisiae. Processes (Basel) 2021. [DOI: 10.3390/pr9061053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
In recent years, synthetic gene circuits for adding new cell features have become one of the most powerful tools in biological and pharmaceutical research and development. However, because of the inherent non-linearity and noisy experimental data, the experiment-based model calibration of these synthetic parts is perceived as a laborious and time-consuming procedure. Although optimal experimental design (OED) based on the Fisher information matrix (FIM) has proven to be an effective means of improving calibration efficiency, the required computation increases dramatically with the model size (number of parameters). To reduce the OED complexity without losing calibration accuracy, this paper proposes two OED approaches with different parameter clustering methods and validates the accuracy of the calibrated models with in-silico experiments. A model of an inducible synthetic promoter in S. cerevisiae is adopted for benchmarking. The comparison with the traditional off-line OED approach suggests that the OED approaches with both clustering methods significantly reduce the complexity of OED problems (by at least 49.0%), while slightly improving the calibration accuracy (11.8% and 19.6% lower estimation error on average for the FIM-based and sensitivity-based approaches). This study implies that, for calibrating non-linear models of biological pathways, cluster-based OED could be a beneficial approach to improve the efficiency of optimal experimental design.
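To make the FIM-based design criterion concrete, here is a minimal sketch under stated assumptions. It does not implement the parameter clustering approaches of the paper: it uses a hypothetical one-parameter decay model y(t) = A·exp(-k·t) with independent Gaussian noise, for which the Fisher information about k at a single sampling time is the squared sensitivity divided by the noise variance. The function name and all values are illustrative.

```python
import numpy as np

def fisher_information_k(t, A, k, sigma):
    """Fisher information about k from one measurement of y = A*exp(-k*t).

    Assumes additive Gaussian noise with standard deviation sigma, so the
    information is (dy/dk)^2 / sigma^2.
    """
    sensitivity = -A * t * np.exp(-k * t)   # dy/dk
    return (sensitivity / sigma) ** 2

# Candidate sampling times and an assumed prior guess for the parameter.
candidate_times = np.linspace(0.1, 10.0, 100)
info = fisher_information_k(candidate_times, A=1.0, k=0.5, sigma=0.1)

# With a single parameter, maximising the (scalar) FIM is the D-optimal choice.
best_time = candidate_times[np.argmax(info)]
print(f"most informative sampling time: {best_time:.2f}")
```

For this toy model the sensitivity t·exp(-k·t) peaks at t = 1/k, so the selected time depends on the parameter guess, which is why such designs are only locally optimal. With many parameters the FIM becomes a matrix built from all sensitivities, and its size is the cost the abstract describes.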
|
20
|
An Information-Theoretic Framework for Optimal Design: Analysis of Protocols for Estimating Soft Tissue Parameters in Biaxial Experiments. AXIOMS 2021. [DOI: 10.3390/axioms10020079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/27/2022]
Abstract
A new framework for optimal design based on the information-theoretic measures of mutual information, conditional mutual information and their combination is proposed. The framework is tested on the analysis of protocols—a combination of angles along which strain measurements can be acquired—in a biaxial experiment of soft tissues for the estimation of hyperelastic constitutive model parameters. The proposed framework considers the information gain about the parameters from the experiment as the key criterion to be maximised, which can be directly used for optimal design. Information gain is computed through k-nearest neighbour algorithms applied to the joint samples of the parameters and measurements produced by the forward and observation models. For biaxial experiments, the results show that low angles have a relatively low information content compared to high angles. The results also show that a smaller number of angles with suitably chosen combinations can result in higher information gains when compared to a larger number of angles which are poorly combined. Finally, it is shown that the proposed framework is consistent with classical approaches, particularly D-optimal design.
|
21
|
Barrett R, White AD. Investigating Active Learning and Meta-Learning for Iterative Peptide Design. J Chem Inf Model 2021; 61:95-105. [PMID: 33350829 PMCID: PMC7842147 DOI: 10.1021/acs.jcim.0c00946] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/13/2020] [Indexed: 01/14/2023]
Abstract
Often the development of novel functional peptides is not amenable to high throughput or purely computational screening methods. Peptides must be synthesized one at a time in a process that does not generate large amounts of data. One way this method can be improved is by ensuring that each experiment provides the best improvement in both peptide properties and predictive modeling accuracy. Here, we study the effectiveness of active learning, optimizing experiment order, and meta-learning, transferring knowledge between contexts, to reduce the number of experiments necessary to build a predictive model. We present a multitask benchmark database of peptides designed to advance these methods for experimental design. Each task is a binary classification of peptides represented as a sequence string. We find neither active learning method tested to be better than random choice. The meta-learning method Reptile was found to improve the average accuracy across data sets. Combining meta-learning with active learning offers inconsistent benefits.
Affiliation(s)
- Rainier Barrett
  - Department of Chemical Engineering, University of Rochester, Rochester, New York 14627, United States
- Andrew D. White
  - Department of Chemical Engineering, University of Rochester, Rochester, New York 14627, United States
|
22
|
Lomeli LM, Iniguez A, Tata P, Jena N, Liu ZY, Van Etten R, Lander AD, Shahbaba B, Lowengrub JS, Minin VN. Optimal experimental design for mathematical models of haematopoiesis. J R Soc Interface 2021; 18:20200729. [PMID: 33499768 PMCID: PMC7879761 DOI: 10.1098/rsif.2020.0729] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/07/2020] [Accepted: 01/04/2021] [Indexed: 11/12/2022] Open
Abstract
The haematopoietic system has a highly regulated and complex structure in which cells are organized to successfully create and maintain new blood cells. It is known that feedback regulation is crucial to tightly control this system, but the specific mechanisms by which control is exerted are not completely understood. In this work, we aim to uncover the underlying mechanisms in haematopoiesis by conducting perturbation experiments, where animal subjects are exposed to an external agent in order to observe the system response and evolution. We have developed a novel Bayesian hierarchical framework for optimal design of perturbation experiments and proper analysis of the data collected. We use a deterministic model that accounts for feedback and feedforward regulation on cell division rates and self-renewal probabilities. A significant obstacle is that the experimental data are not longitudinal, rather each data point corresponds to a different animal. We overcome this difficulty by modelling the unobserved cellular levels as latent variables. We then use principles of Bayesian experimental design to optimally distribute time points at which the haematopoietic cells are quantified. We evaluate our approach using synthetic and real experimental data and show that an optimal design can lead to better estimates of model parameters.
Affiliation(s)
- Luis Martinez Lomeli
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
- Abdon Iniguez
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
- Prasanthi Tata
  - Division of Hematology/Oncology, University of California Irvine, Irvine, CA, USA
- Nilamani Jena
  - Division of Hematology/Oncology, University of California Irvine, Irvine, CA, USA
- Zhong-Ying Liu
  - Division of Hematology/Oncology, University of California Irvine, Irvine, CA, USA
- Richard Van Etten
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
  - Division of Hematology/Oncology, University of California Irvine, Irvine, CA, USA
  - Department of Biological Chemistry, University of California Irvine, Irvine, CA, USA
  - Center for Cancer Systems Biology, University of California Irvine, Irvine, CA, USA
  - Chao Family Comprehensive Cancer Center, University of California Irvine, Irvine, CA, USA
- Arthur D. Lander
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
  - Center for Cancer Systems Biology, University of California Irvine, Irvine, CA, USA
  - Chao Family Comprehensive Cancer Center, University of California Irvine, Irvine, CA, USA
  - Department of Developmental and Cell Biology, University of California Irvine, Irvine, CA, USA
  - Department of Biomedical Engineering, University of California Irvine, Irvine, CA, USA
- Babak Shahbaba
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
  - Center for Cancer Systems Biology, University of California Irvine, Irvine, CA, USA
  - Department of Statistics, University of California Irvine, Irvine, CA, USA
- John S. Lowengrub
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
  - Center for Cancer Systems Biology, University of California Irvine, Irvine, CA, USA
  - Chao Family Comprehensive Cancer Center, University of California Irvine, Irvine, CA, USA
  - Department of Biomedical Engineering, University of California Irvine, Irvine, CA, USA
  - Department of Mathematics, University of California Irvine, Irvine, CA, USA
- Vladimir N. Minin
  - Center for Complex Biological Systems, University of California Irvine, Irvine, CA, USA
  - Center for Cancer Systems Biology, University of California Irvine, Irvine, CA, USA
  - Department of Statistics, University of California Irvine, Irvine, CA, USA
|
23
|
Vlazaki M, Price DJ, Restif O. An experimental design tool to optimize inference precision in data-driven mathematical models of bacterial infections in vivo. J R Soc Interface 2020; 17:20200717. [PMID: 33323052 DOI: 10.1098/rsif.2020.0717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
The management of bacterial diseases calls for a detailed knowledge about the dynamic changes in host-bacteria interactions. Biological insights are gained by integrating experimental data with mechanistic mathematical models to infer experimentally unobservable quantities. This inter-disciplinary field would benefit from experiments with maximal information content yielding high-precision inference. Here, we present a computationally efficient tool for optimizing experimental design in terms of parameter inference in studies using isogenic-tagged strains. We study the effect of three experimental design factors: number of biological replicates, sampling timepoint selection and number of copies per tagged strain. We conduct a simulation study to establish the relationship between our optimality criterion and the size of parameter estimate confidence intervals, and showcase its application in a range of biological scenarios reflecting different dynamics patterns observed in experimental infections. We show that in low-variance systems with low killing and replication rates, predicting high-precision experimental designs is consistently achieved; higher replicate sizes and strategic timepoint selection yield more precise estimates. Finally, we address the question of resource allocation under constraints; given a fixed number of host animals and a constraint on total inoculum size per host, infections with fewer strains at higher copies per strain lead to higher-precision inference.
Affiliation(s)
- Myrto Vlazaki
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- David J Price
  - Centre for Epidemiology and Biostatistics, University of Melbourne, Grattan Street, Parkville, Victoria 3010, Australia
  - The Doherty Institute for Infection and Immunity, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia
- Olivier Restif
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
|
24
|
Browning AP, Warne DJ, Burrage K, Baker RE, Simpson MJ. Identifiability analysis for stochastic differential equation models in systems biology. J R Soc Interface 2020; 17:20200652. [PMID: 33323054 PMCID: PMC7811582 DOI: 10.1098/rsif.2020.0652] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/11/2020] [Accepted: 11/24/2020] [Indexed: 12/26/2022] Open
Abstract
Mathematical models are routinely calibrated to experimental data, with goals ranging from building predictive models to quantifying parameters that cannot be measured. Whether or not reliable parameter estimates are obtainable from the available data can easily be overlooked. Such issues of parameter identifiability have important ramifications for both the predictive power of a model and the mechanistic insight that can be obtained. Identifiability analysis is well-established for deterministic, ordinary differential equation (ODE) models, but there are no commonly adopted methods for analysing identifiability in stochastic models. We provide an accessible introduction to identifiability analysis and demonstrate how existing ideas for analysis of ODE models can be applied to stochastic differential equation (SDE) models through four practical case studies. To assess structural identifiability, we study ODEs that describe the statistical moments of the stochastic process using open-source software tools. Using practically motivated synthetic data and Markov chain Monte Carlo methods, we assess parameter identifiability in the context of available data. Our analysis shows that SDE models can often extract more information about parameters than deterministic descriptions. All code used to perform the analysis is available on GitHub.
Affiliation(s)
- Alexander P. Browning
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
- David J. Warne
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
- Kevin Burrage
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, Queensland University of Technology, Brisbane, Australia
  - Department of Computer Science, University of Oxford, Oxford, UK
- Ruth E. Baker
  - Mathematical Institute, University of Oxford, Oxford, UK
- Matthew J. Simpson
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
|
25
|
Bandiera L, Gomez-Cabeza D, Gilman J, Balsa-Canto E, Menolascina F. Optimally Designed Model Selection for Synthetic Biology. ACS Synth Biol 2020; 9:3134-3144. [PMID: 33152239 DOI: 10.1021/acssynbio.0c00393] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/04/2023]
Abstract
Modeling parts and circuits represents a significant roadblock to automating the Design-Build-Test-Learn cycle in synthetic biology. Once models are developed, discriminating among them requires informative data, computational resources, and skills that might not be readily available. The high cost entailed in model discrimination frequently leads to subjective choices on the selected structures and, in turn, to suboptimal models. Here, we outline frequentist and Bayesian approaches to model discrimination. We ranked three candidate models of a genetic toggle switch, which was adopted as a test case, according to the support from in vivo data. We show that, in each framework, efficient model discrimination can be achieved via optimally designed experiments. We offer a dynamical-systems interpretation of our optimization results and investigate their sensitivity to key parameters in the characterization of synthetic circuits. Our approach suggests that optimal experimental design is an effective strategy to discriminate between competing models of a gene regulatory network. Independent of the adopted framework, optimally designed perturbations exploit regions in the input space that maximally distinguish predictions from the competing models.
Affiliation(s)
- Lucia Bandiera
  - Institute for Bioengineering, The University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
  - SynthSys - Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
- David Gomez-Cabeza
  - Institute for Bioengineering, The University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
- James Gilman
  - Institute for Bioengineering, The University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
- Eva Balsa-Canto
  - (Bio)Process Engineering Group, IIM-CSIC Spanish Research Council, Vigo, 36208, Spain
- Filippo Menolascina
  - Institute for Bioengineering, The University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
  - SynthSys - Centre for Synthetic and Systems Biology, The University of Edinburgh, Edinburgh, EH9 3BF, United Kingdom
|
26
|
Development of new media formulations for cell culture operations based on regression models. Bioprocess Biosyst Eng 2020; 44:453-472. [PMID: 33111178 DOI: 10.1007/s00449-020-02456-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2020] [Accepted: 09/25/2020] [Indexed: 10/23/2022]
Abstract
The paper discusses modelling and optimization of a multi-component cell culture medium. The specific productivity (Qp) was considered a function of the medium components and their possible interactions, described by linear factors, two-way interactions and squared terms. This results in a high-dimensional problem where the number of variables p (represented by the medium components and their interactions) is much larger than the number of observations n. Principal Components Regression (PCR), Partial Least Squares (PLS), Lasso and Elastic Net regressions were compared as modelling tools to deal with a high-dimensional [Formula: see text] problem. PCR and PLS regression models gave better predictions and were used for robust optimization of the medium composition by nonlinear optimization. The case studies show that it is possible to formulate new media that result in higher Qp than those provided by the initial media experiments available. Also, the multivariate statistical approach permitted us to select the media that are most informative about the optimum, thus permitting modelling and optimization with a reduced set of initial experiments.
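As a rough illustration of one of the regression tools named in this abstract, the following is a minimal sketch of principal components regression when the number of variables exceeds the number of observations. It is not the paper's implementation: the synthetic data, the function names `fit_pcr` and `predict_pcr`, and the choice of three components are all assumptions made for the example.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal components regression: regress y on the leading PC scores of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # The thin SVD is defined even when p > n; rows of Vt are principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T                      # (p, k) loadings
    scores = Xc @ V                              # (n, k) component scores
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    beta = V @ gamma                             # coefficients on the p original inputs
    return beta, x_mean, y_mean

def predict_pcr(model, X):
    beta, x_mean, y_mean = model
    return (X - x_mean) @ beta + y_mean

# Hypothetical data: 20 observations, 100 variables, 3 latent factors.
rng = np.random.default_rng(0)
n, p, k = 20, 100, 3
latent = rng.normal(size=(n, k))
X = latent @ rng.normal(size=(k, p)) + 0.01 * rng.normal(size=(n, p))
y = latent @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.normal(size=n)

model = fit_pcr(X, y, n_components=k)
residual = y - predict_pcr(model, X)
r2 = 1.0 - residual.var() / y.var()
print(f"training R^2 with {k} components: {r2:.3f}")
```

The training R^2 printed here only confirms that the fit runs on a p > n problem; in practice the number of components would be chosen by cross-validation, which this sketch omits.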
|
27
|
Stumpf MPH. Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds. J R Soc Interface 2020; 17:20200419. [PMID: 33081645 PMCID: PMC7653378 DOI: 10.1098/rsif.2020.0419] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022] Open
Abstract
Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare the performance of different candidate models at describing a particular biological system quantitatively. Model selection has been applied with great success to problems where a small number (typically less than 10) of models are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics, but as I show here are not guaranteed to improve on individual estimators or predictors. I will show in the cases of model selection and network inference when we can trust ensembles, and when we should be cautious. The analyses suggest that the careful construction of an ensemble (choosing good predictors) is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook which allows carrying out an assessment of ensemble estimators is provided.
Affiliation(s)
- Michael P H Stumpf
  - School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
|
28
|
Bayesian Information-Theoretic Calibration of Radiotherapy Sensitivity Parameters for Informing Effective Scanning Protocols in Cancer. J Clin Med 2020; 9:jcm9103208. [PMID: 33027933 PMCID: PMC7601810 DOI: 10.3390/jcm9103208] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/30/2020] [Revised: 10/01/2020] [Accepted: 10/03/2020] [Indexed: 12/03/2022] Open
Abstract
With new advancements in technology, it is now possible to collect data for a variety of different metrics describing tumor growth, including tumor volume, composition, and vascularity, among others. For any proposed model of tumor growth and treatment, we observe large variability among individual patients’ parameter values, particularly those relating to treatment response; thus, exploiting the use of these various metrics for model calibration can be helpful to infer such patient-specific parameters both accurately and early, so that treatment protocols can be adjusted mid-course for maximum efficacy. However, taking measurements can be costly and invasive, limiting clinicians to a sparse collection schedule. As such, the determination of optimal times and metrics for which to collect data in order to best inform proper treatment protocols could be of great assistance to clinicians. In this investigation, we employ a Bayesian information-theoretic calibration protocol for experimental design in order to identify the optimal times at which to collect data for informing treatment parameters. Within this procedure, data collection times are chosen sequentially to maximize the reduction in parameter uncertainty with each added measurement, ensuring that a budget of n high-fidelity experimental measurements results in maximum information gain about the low-fidelity model parameter values. In addition to investigating the optimal temporal pattern for data collection, we also develop a framework for deciding which metrics should be utilized at each data collection point. We illustrate this framework with a variety of toy examples, each utilizing a radiotherapy treatment regimen. For each scenario, we analyze the dependence of the predictive power of the low-fidelity model upon the measurement budget.
|
29
|
Optimal experiment design under parametric uncertainty: A comparison of a sensitivities based approach versus a polynomial chaos based stochastic approach. Chem Eng Sci 2020. [DOI: 10.1016/j.ces.2020.115651] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 12/15/2022]
|
30
|
Srinivasan S, Cluett WR, Mahadevan R. A scalable method for parameter identification in kinetic models of metabolism using steady-state data. Bioinformatics 2020; 35:5216-5225. [PMID: 31197317 DOI: 10.1093/bioinformatics/btz445] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/29/2018] [Revised: 04/26/2019] [Accepted: 06/05/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: In kinetic models of metabolism, the parameter values determine the dynamic behaviour predicted by these models. Estimating parameters from in vivo experimental data requires the parameters to be structurally identifiable, and the data to be informative enough to estimate these parameters. Existing methods to determine the structural identifiability of parameters in kinetic models of metabolism can only be applied to models of small metabolic networks due to their computational complexity. Additionally, a priori experimental design, a necessity to obtain informative data for parameter estimation, also does not account for using steady-state data to estimate parameters in kinetic models. RESULTS: Here, we present a scalable methodology to structurally identify parameters for each flux in a kinetic model of metabolism based on the availability of steady-state data. In doing so, we also address the issue of determining the number and nature of experiments for generating steady-state data to estimate these parameters. By using a small metabolic network as an example, we show that most parameters in fluxes expressed by mechanistic enzyme kinetic rate laws can be identified using steady-state data, and the steady-state data required for their estimation can be obtained from selective experiments involving both substrate and enzyme level perturbations. The methodology can be used in combination with other identifiability and experimental design algorithms that use dynamic data to determine the most informative experiments requiring the least resources to perform. AVAILABILITY AND IMPLEMENTATION: https://github.com/LMSE/ident. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shyam Srinivasan
  - Department of Chemical Engineering and Applied Chemistry, 200 College Street, University of Toronto, Toronto, ON, M5S3E5, Canada
- William R Cluett
  - Department of Chemical Engineering and Applied Chemistry, 200 College Street, University of Toronto, Toronto, ON, M5S3E5, Canada
- Radhakrishnan Mahadevan
  - Department of Chemical Engineering and Applied Chemistry, 200 College Street, University of Toronto, Toronto, ON, M5S3E5, Canada
  - Institute of Biomaterials and Biomedical Engineering, 164 College Street, University of Toronto, Toronto, ON, M5S 3G9, Canada
|
31
|
Whittaker DG, Clerx M, Lei CL, Christini DJ, Mirams GR. Calibration of ionic and cellular cardiac electrophysiology models. WILEY INTERDISCIPLINARY REVIEWS. SYSTEMS BIOLOGY AND MEDICINE 2020; 12:e1482. [PMID: 32084308 PMCID: PMC8614115 DOI: 10.1002/wsbm.1482] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 10/30/2019] [Revised: 01/17/2020] [Accepted: 01/18/2020] [Indexed: 12/30/2022]
Abstract
Cardiac electrophysiology models are among the most mature and well-studied mathematical models of biological systems. This maturity is bringing new challenges as models are being used increasingly to make quantitative rather than qualitative predictions. As such, calibrating the parameters within ion current and action potential (AP) models to experimental data sets is a crucial step in constructing a predictive model. This review highlights some of the fundamental concepts in cardiac model calibration and is intended to be readily understood by computational and mathematical modelers working in other fields of biology. We discuss the classic and latest approaches to calibration in the electrophysiology field, at both the ion channel and cellular AP scales. We end with a discussion of the many challenges that work to date has raised and the need for reproducible descriptions of the calibration process to enable models to be recalibrated to new data sets and built upon for new studies. This article is categorized under: Analytical and Computational Methods > Computational Methods; Physiology > Mammalian Physiology in Health and Disease; Models of Systems Properties and Processes > Cellular Models.
Affiliation(s)
- Dominic G. Whittaker
  - Centre for Mathematical Medicine & Biology, School of Mathematical Sciences, University of Nottingham, Nottingham, UK
- Michael Clerx
  - Computational Biology & Health Informatics, Department of Computer Science, University of Oxford, Oxford, UK
- Chon Lok Lei
  - Computational Biology & Health Informatics, Department of Computer Science, University of Oxford, Oxford, UK
- Gary R. Mirams
  - Centre for Mathematical Medicine & Biology, School of Mathematical Sciences, University of Nottingham, Nottingham, UK
|
32
|
Fox ZR, Neuert G, Munsky B. Optimal Design of Single-Cell Experiments within Temporally Fluctuating Environments. COMPLEXITY 2020; 2020:8536365. [PMID: 32982137 PMCID: PMC7515449 DOI: 10.1155/2020/8536365] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
Modern biological experiments are becoming increasingly complex, and designing these experiments to yield the greatest possible quantitative insight is an open challenge. Increasingly, computational models of complex stochastic biological systems are being used to understand and predict biological behaviors or to infer biological parameters. Such quantitative analyses can also help to improve experiment designs for particular goals, such as to learn more about specific model mechanisms or to reduce prediction errors in certain situations. A classic approach to experiment design is to use the Fisher information matrix (FIM), which quantifies the expected information a particular experiment will reveal about model parameters. The Finite State Projection based FIM (FSP-FIM) was recently developed to compute the FIM for discrete stochastic gene regulatory systems, whose complex response distributions do not satisfy standard assumptions of Gaussian variations. In this work, we develop the FSP-FIM analysis for a stochastic model of stress response genes in S. cerevisiae under time-varying MAPK induction. We verify this FSP-FIM analysis and use it to optimize the number of cells that should be quantified at particular times to learn as much as possible about the model parameters. We then extend the FSP-FIM approach to explore how different measurement times or genetic modifications help to minimize uncertainty in the sensing of extracellular environments, and we experimentally validate the FSP-FIM to rank single-cell experiments for their abilities to minimize estimation uncertainty of NaCl concentrations during yeast osmotic shock. This work demonstrates the potential of quantitative models to not only make sense of modern biological data sets, but to close the loop between quantitative modeling and experimental data collection.
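To make the idea of ranking experiments by Fisher information concrete, here is a minimal sketch of the classic Gaussian-noise FIM, which is the standard setting this abstract contrasts with, not the FSP-FIM itself. The exponential-decay model, the noise level, the two candidate sampling schedules, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def model(theta, t):
    """Toy response a * exp(-k * t); a stand-in for a real mechanistic model."""
    a, k = theta
    return a * np.exp(-k * t)

def fisher_information(theta, t, sigma=0.1, eps=1e-6):
    """FIM = J^T J / sigma^2 for i.i.d. Gaussian noise, with J from central differences."""
    theta = np.asarray(theta, dtype=float)
    J = np.empty((t.size, theta.size))
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        J[:, j] = (model(theta + step, t) - model(theta - step, t)) / (2 * eps)
    return J.T @ J / sigma**2

theta0 = (1.0, 0.5)                      # assumed nominal parameter values
early = np.linspace(0.0, 0.5, 5)         # design 1: all samples at early times
spread = np.linspace(0.0, 6.0, 5)        # design 2: samples spread over the decay

fim_early = fisher_information(theta0, early)
fim_spread = fisher_information(theta0, spread)

# D-optimality: a larger determinant means a smaller joint confidence region.
det_early = np.linalg.det(fim_early)
det_spread = np.linalg.det(fim_spread)
print(f"det(FIM) early: {det_early:.3g}, spread: {det_spread:.3g}")
```

Under these assumptions the spread design has the larger determinant, so it would be preferred under the D-optimality criterion. The FIM is evaluated at a nominal parameter guess, so the ranking is local and can change if that guess is far from the truth.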
Affiliation(s)
- Zachary R Fox
  - Inria Saclay Ile-de-France, Palaiseau 91120, France
  - Institut Pasteur, USR 3756 IP CNRS Paris, 75015, France
  - School of Biomedical Engineering, Colorado State University Fort Collins, CO 80523, USA
- Gregor Neuert
  - Department of Molecular Physiology and Biophysics, School of Medicine, Vanderbilt University, Nashville, TN 37232, USA
- Brian Munsky
  - Department of Chemical and Biological Engineering, Colorado State University Fort Collins, CO 80523, USA
  - School of Biomedical Engineering, Colorado State University Fort Collins, CO 80523, USA
|
33
|
Warne DJ, Baker RE, Simpson MJ. Simulation and inference algorithms for stochastic biochemical reaction networks: from basic concepts to state-of-the-art. J R Soc Interface 2020; 16:20180943. [PMID: 30958205 DOI: 10.1098/rsif.2018.0943] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 11/12/2022] Open
Abstract
Stochasticity is a key characteristic of intracellular processes such as gene regulation and chemical signalling. Therefore, characterizing stochastic effects in biochemical systems is essential to understand the complex dynamics of living things. Mathematical idealizations of biochemically reacting systems must be able to capture stochastic phenomena. While robust theory exists to describe such stochastic models, the computational challenges in exploring these models can be a significant burden in practice since realistic models are analytically intractable. Determining the expected behaviour and variability of a stochastic biochemical reaction network requires many probabilistic simulations of its evolution. Using a biochemical reaction network model to assist in the interpretation of time-course data from a biological experiment is an even greater challenge due to the intractability of the likelihood function for determining observation probabilities. These computational challenges have been subjects of active research for over four decades. In this review, we present an accessible discussion of the major historical developments and state-of-the-art computational techniques relevant to simulation and inference problems for stochastic biochemical reaction network models. Detailed algorithms for particularly important methods are described and complemented with Matlab® implementations. As a result, this review provides a practical and accessible introduction to computational methods for stochastic models within the life sciences community.
Affiliation(s)
- David J Warne
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland 4001, Australia
- Ruth E Baker
  - Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
- Matthew J Simpson
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland 4001, Australia
|
34
|
Melinscak F, Bach DR. Computational optimization of associative learning experiments. PLoS Comput Biol 2020; 16:e1007593. [PMID: 31905214 PMCID: PMC6964915 DOI: 10.1371/journal.pcbi.1007593] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/22/2019] [Revised: 01/16/2020] [Accepted: 12/09/2019] [Indexed: 02/02/2023] Open
Abstract
With computational biology striving to provide more accurate theoretical accounts of biological systems, use of increasingly complex computational models seems inevitable. However, this trend engenders a challenge of optimal experimental design: due to the flexibility of complex models, it is difficult to intuitively design experiments that will efficiently expose differences between candidate models or allow accurate estimation of their parameters. This challenge is well exemplified in associative learning research. Associative learning theory has a rich tradition of computational modeling, resulting in a growing space of increasingly complex models, which in turn renders manual design of informative experiments difficult. Here we propose a novel method for computational optimization of associative learning experiments. We first formalize associative learning experiments using a low number of tunable design variables, to make optimization tractable. Next, we combine simulation-based Bayesian experimental design with Bayesian optimization to arrive at a flexible method of tuning design variables. Finally, we validate the proposed method through extensive simulations covering both the objectives of accurate parameter estimation and model selection. The validation results show that computationally optimized experimental designs have the potential to substantially improve upon manual designs drawn from the literature, even when prior information guiding the optimization is scarce. Computational optimization of experiments may help address recent concerns over reproducibility by increasing the expected utility of studies, and it may even incentivize practices such as study pre-registration, since optimization requires a pre-specified analysis plan. Moreover, design optimization has the potential not only to improve basic research in domains such as associative learning, but also to play an important role in translational research. 
For example, design of behavioral and physiological diagnostic tests in the nascent field of computational psychiatry could benefit from an optimization-based approach, similar to the one presented here.
Affiliation(s)
- Filip Melinscak
  - Computational Psychiatry Research, Department of Psychiatry, Psychotherapy, and Psychosomatics, University of Zurich, Zurich, Switzerland
  - Neuroscience Center Zurich, University of Zurich, Zurich, Switzerland
- Dominik R. Bach
  - Computational Psychiatry Research, Department of Psychiatry, Psychotherapy, and Psychosomatics, University of Zurich, Zurich, Switzerland
  - Neuroscience Center Zurich, University of Zurich, Zurich, Switzerland
  - Wellcome Centre for Human Neuroimaging and Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
|
35
|
Gordon N, Gilkey L, Smith RC, Michaud I, Williams B, Mousseau V, Hooper R, Jones C. A Mutual Information–Based Experimental Design Framework to Use High-Fidelity Nuclear Reactor Codes to Calibrate Low-Fidelity Codes. NUCL TECHNOL 2019. [DOI: 10.1080/00295450.2019.1590073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
Affiliation(s)
- Natalie Gordon
  - Sandia National Laboratories, Albuquerque, New Mexico 87185
- Lindsay Gilkey
  - Sandia National Laboratories, Albuquerque, New Mexico 87185
- Ralph C. Smith
  - North Carolina State University, Department of Mathematics, Raleigh, North Carolina 27695
- Isaac Michaud
  - Los Alamos National Laboratory, Los Alamos, New Mexico 87545
- Brian Williams
  - Los Alamos National Laboratory, Los Alamos, New Mexico 87545
- Russell Hooper
  - Sandia National Laboratories, Albuquerque, New Mexico 87185
- Chris Jones
  - Kansas State University, Department of Civil Engineering, Manhattan, Kansas 66506
|
36
|
Eriksson O, Jauhiainen A, Maad Sasane S, Kramer A, Nair AG, Sartorius C, Hellgren Kotaleski J. Uncertainty quantification, propagation and characterization by Bayesian analysis combined with global sensitivity analysis applied to dynamical intracellular pathway models. Bioinformatics 2019; 35:284-292. [PMID: 30010712 PMCID: PMC6330009 DOI: 10.1093/bioinformatics/bty607] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/16/2017] [Accepted: 07/10/2018] [Indexed: 11/14/2022] Open
Abstract
Motivation: Dynamical models describing intracellular phenomena are increasing in size and complexity as more information is obtained from experiments. These models are often over-parameterized with respect to the quantitative data used for parameter estimation, resulting in uncertainty in the individual parameter estimates as well as in the predictions made from the model. Here we combine Bayesian analysis with global sensitivity analysis (GSA) in order to give better informed predictions, to point out weaker parts of the model that are important targets for further experiments, and to give guidance on parameters that are essential in distinguishing different qualitative output behaviours. Results: We used approximate Bayesian computation (ABC) to estimate the model parameters from experimental data, as well as to quantify the uncertainty in this estimation (inverse uncertainty quantification), resulting in a posterior distribution for the parameters. This parameter uncertainty was next propagated to a corresponding uncertainty in the predictions (forward uncertainty propagation), and a GSA was performed on the predictions using the posterior distribution as the possible values for the parameters. This methodology was applied to a relatively large model relevant for synaptic plasticity, using experimental data from several sources. We could hereby point out those parameters that by themselves have the largest contribution to the uncertainty of the prediction as well as identify parameters important to separate between qualitatively different predictions. This approach is useful for both experimental design and model building. Availability and implementation: Source code is freely available at https://github.com/alexjau/uqsa. Supplementary information: Supplementary data are available at Bioinformatics online.
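The two-step workflow described in this abstract (inverse uncertainty quantification by ABC, then forward propagation to a prediction) can be sketched on a toy problem. This is a minimal rejection-ABC example, not the authors' code from the linked repository: the one-parameter decay model, the uniform prior, the RMS distance, the tolerance of 0.05, and every name below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 20)            # assumed observation times
sigma = 0.02                              # assumed measurement noise level

def simulate(k, rng):
    """Toy model: exponential decay exp(-k t) observed with Gaussian noise."""
    k = np.atleast_1d(k)
    clean = np.exp(-k[:, None] * t[None, :])
    return clean + rng.normal(0.0, sigma, size=clean.shape)

# Synthetic "experimental" data generated with a known rate.
true_k = 0.5
data = simulate(true_k, rng)[0]

# Inverse step: rejection ABC keeps prior draws whose simulation lies close to the data.
prior_draws = rng.uniform(0.0, 2.0, size=5000)
sims = simulate(prior_draws, rng)
distance = np.sqrt(np.mean((sims - data) ** 2, axis=1))
posterior = prior_draws[distance < 0.05]

# Forward step: push the accepted parameters through the model at an unobserved time.
t_new = 8.0
prediction = np.exp(-posterior * t_new)
lo, hi = np.percentile(prediction, [2.5, 97.5])
print(f"accepted {posterior.size} draws; posterior mean k = {posterior.mean():.3f}")
print(f"95% prediction interval at t = {t_new}: [{lo:.4f}, {hi:.4f}]")
```

Plain rejection sampling is used here only because it is the simplest ABC variant; it becomes inefficient as the number of parameters grows, and larger models typically need sequential or MCMC-based ABC schemes.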
Affiliation(s)
- Olivia Eriksson
  - Science for Life Laboratory, Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
  - Science for Life Laboratory, Department of Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden
  - Swedish e-Science Research Centre (SeRC), KTH Royal Institute of Technology, Stockholm, Sweden
- Alexandra Jauhiainen
  - Biometrics, Early Clinical Development, IMED Biotech Unit, AstraZeneca, Gothenburg, Sweden
- Andrei Kramer
  - Science for Life Laboratory, Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Anu G Nair
  - Science for Life Laboratory, Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
- Jeanette Hellgren Kotaleski
  - Science for Life Laboratory, Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden
  - Science for Life Laboratory, Department of Numerical Analysis and Computer Science, Stockholm University, Stockholm, Sweden
  - Swedish e-Science Research Centre (SeRC), KTH Royal Institute of Technology, Stockholm, Sweden
|
37
|
Roesch E, Stumpf MPH. Parameter inference in dynamical systems with co-dimension 1 bifurcations. ROYAL SOCIETY OPEN SCIENCE 2019; 6:190747. [PMID: 31824698 PMCID: PMC6837231 DOI: 10.1098/rsos.190747] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/07/2019] [Accepted: 09/20/2019] [Indexed: 05/03/2023]
Abstract
Dynamical systems with intricate behaviour are all-pervasive in biology. Many of the most interesting biological processes indicate the presence of bifurcations, i.e. phenomena where a small change in a system parameter causes qualitatively different behaviour. Bifurcation theory has become a rich field of research in its own right and evaluating the bifurcation behaviour of a given dynamical system can be challenging. An even greater challenge, however, is to learn the bifurcation structure of dynamical systems from data, where the precise model structure is not known. Here, we study one aspect of this problem: the practical implications that the presence of bifurcations has on our ability to infer model parameters and initial conditions from empirical data; we focus on the canonical co-dimension 1 bifurcations and provide a comprehensive analysis of how dynamics and our ability to infer kinetic parameters are linked. The picture thus emerging is surprisingly nuanced and suggests that identification of the qualitative dynamics (the bifurcation diagram) should precede any attempt at inferring kinetic parameters.
|
38
|
Parag KV, Pybus OG. Robust Design for Coalescent Model Inference. Syst Biol 2019; 68:730-743. [PMID: 30726979 DOI: 10.1093/sysbio/syz008] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/22/2018] [Revised: 01/28/2019] [Accepted: 02/04/2019] [Indexed: 11/08/2023] Open
Abstract
The coalescent process describes how changes in the size or structure of a population influence the genealogical patterns of sequences sampled from that population. The estimation of (effective) population size changes from genealogies that are reconstructed from these sampled sequences is an important problem in many biological fields. Often, population size is characterized by a piecewise-constant function, with each piece serving as a population size parameter to be estimated. Estimation quality depends on both the statistical coalescent inference method employed, and on the experimental protocol, which controls variables such as the sampling of sequences through time and space, or the transformation of model parameters. While there is an extensive literature on coalescent inference methodology, there is comparatively little work on experimental design. The research that does exist is largely simulation-based, precluding the development of provable or general design theorems. We examine three key design problems: temporal sampling of sequences under the skyline demographic coalescent model, spatio-temporal sampling under the structured coalescent model, and time discretization for sequentially Markovian coalescent models. In all cases, we prove that 1) working in the logarithm of the parameters to be inferred (e.g., population size) and 2) distributing informative coalescent events uniformly among these log-parameters, is uniquely robust. "Robust" means that the total and maximum uncertainty of our parameter estimates are minimized, and made insensitive to their unknown (true) values. This robust design theorem provides rigorous justification for several existing coalescent experimental design decisions and leads to usable guidelines for future empirical or simulation-based investigations. Given its persistence among models, this theorem may form the basis of an experimental design paradigm for coalescent inference.
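One way to see why working in log-parameters removes the dependence on the unknown true value is the following short calculation for a single piecewise-constant segment. It is a sketch under the standard Kingman coalescent likelihood and is not taken from the paper: the symbols m, A and η are notation assumed here, and the segment is assumed to be delimited by coalescent events so that each waiting time contributes fully.

```latex
% Assumed notation for one segment with population size N:
%   m = number of coalescent events in the segment,
%   A = \sum_i \binom{k_i}{2}\,\Delta t_i, the lineage-pair-weighted waiting times.
\begin{align}
\ell(N) &= -m \log N - \frac{A}{N} + \text{const}, \\
\frac{\partial^2 \ell}{\partial N^2} &= \frac{m}{N^2} - \frac{2A}{N^3},
\qquad \mathbb{E}[A] = mN, \\
\mathcal{I}(N) &= -\mathbb{E}\!\left[\frac{\partial^2 \ell}{\partial N^2}\right] = \frac{m}{N^2}, \\
\mathcal{I}(\eta) &= \mathcal{I}(N)\left(\frac{\mathrm{d}N}{\mathrm{d}\eta}\right)^{2}
 = \frac{m}{N^2}\,N^2 = m,
\qquad \eta = \log N .
\end{align}
```

In the original parameter the information depends on the unknown N, whereas in η = log N it depends only on the event count m. This is consistent with the abstract's two conditions: work in the logarithm, then spread the informative coalescent events evenly across the log-parameters.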
Affiliation(s)
- Kris V Parag
- Department of Zoology, University of Oxford, Oxford OX1 3SY, UK
| | - Oliver G Pybus
- Department of Zoology, University of Oxford, Oxford OX1 3SY, UK
| |
|
39
|
Schmidt K, Smith RC, Hite J, Mattingly J, Azmy Y, Rajan D, Goldhahn R. Sequential optimal positioning of mobile sensors using mutual information. Stat Anal Data Min 2019. [DOI: 10.1002/sam.11431] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022]
Affiliation(s)
- Kathleen Schmidt
- Applied Statistics Group Lawrence Livermore National Laboratory Livermore California
| | - Ralph C. Smith
- Department of Mathematics North Carolina State University Raleigh North Carolina
| | - Jason Hite
- Department of Nuclear Engineering North Carolina State University Raleigh North Carolina
| | - John Mattingly
- Department of Nuclear Engineering North Carolina State University Raleigh North Carolina
| | - Yousry Azmy
- Department of Nuclear Engineering North Carolina State University Raleigh North Carolina
| | - Deepak Rajan
- Applied Statistics Group Lawrence Livermore National Laboratory Livermore California
| | - Ryan Goldhahn
- Applied Statistics Group Lawrence Livermore National Laboratory Livermore California
| |
|
40
|
Sai A, Kong N. Exploring the information transmission properties of noise-induced dynamics: application to glioma differentiation. BMC Bioinformatics 2019; 20:375. [PMID: 31272368 PMCID: PMC6610902 DOI: 10.1186/s12859-019-2970-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/08/2019] [Accepted: 06/26/2019] [Indexed: 12/21/2022] Open
Abstract
Background: Cells operate in an uncertain environment, where critical cell decisions must be enacted in the presence of biochemical noise. Information theory can measure the extent to which such noise perturbs normal cellular function, in which cells must perceive environmental cues and relay signals accurately to make timely and informed decisions. Using multivariate response data can greatly improve estimates of the latent information content underlying important cell fates, like differentiation. Results: We undertake an information theoretic analysis of two stochastic models concerning glioma differentiation therapy, an alternative cancer treatment modality whose underlying intracellular mechanisms remain poorly understood. Discernible changes in response dynamics, as captured by summary measures, were observed at low noise levels. Mitigating certain feedback mechanisms present in the signaling network improved information transmission overall, as did targeted subsampling and clustering of response dynamics. Conclusion: Computing the channel capacity of noisy signaling pathways presents great probative value in uncovering the prevalent trends in noise-induced dynamics. Areas of high dynamical variation can provide concise snapshots of informative system behavior that may otherwise be overlooked. Through this approach, we can examine the delicate interplay between noise and information, from signal to response, through the observed behavior of relevant system components. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2970-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Aditya Sai
- Weldon School of Biomedical Engineering, Purdue University, 206 S Martin Jischke Drive, West Lafayette, 47907, IN, USA.
| | - Nan Kong
- Weldon School of Biomedical Engineering, Purdue University, 206 S Martin Jischke Drive, West Lafayette, 47907, IN, USA
| |
|
41
|
Kuckelkorn U, Stübler S, Textoris-Taube K, Kilian C, Niewienda A, Henklein P, Janek K, Stumpf MPH, Mishto M, Liepe J. Proteolytic dynamics of human 20S thymoproteasome. J Biol Chem 2019; 294:7740-7754. [PMID: 30914481 DOI: 10.1074/jbc.ra118.007347] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 12/28/2018] [Revised: 02/26/2019] [Indexed: 01/22/2023] Open
Abstract
Efficient immunosurveillance by CD8+ T cells in the periphery depends on positive/negative selection of thymocytes and thus on the dynamics of antigen degradation and epitope production by thymoproteasome and immunoproteasome in the thymus. Although studies in mouse systems have shown how thymoproteasome activity differs from that of immunoproteasome and strongly impacts the T cell repertoire, the proteolytic dynamics and the regulation of human thymoproteasome are unknown. By combining biochemical and computational modeling approaches, we show here that human 20S thymoproteasome and immunoproteasome differ not only in the proteolytic activity of the catalytic sites but also in the peptide transport. These differences impinge upon the quantity of peptide products rather than where the substrates are cleaved. The comparison of the two human 20S proteasome isoforms depicts different processing of antigens that are associated with tumors and autoimmune diseases.
Affiliation(s)
- Ulrike Kuckelkorn
- From the Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Institut für Biochemie, Germany, 10117 Berlin, Germany
| | - Sabine Stübler
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, United Kingdom.,Mathematical Modelling and Systems Biology, Institute of Mathematics, University of Potsdam, 14469 Potsdam, Germany
| | - Kathrin Textoris-Taube
- Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institut für Biochemie, Germany, 10117 Berlin, Germany.,Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Shared Facility for Mass Spectrometry, 10117 Berlin, Germany
| | - Christiane Kilian
- From the Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Institut für Biochemie, Germany, 10117 Berlin, Germany
| | - Agathe Niewienda
- Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institut für Biochemie, Germany, 10117 Berlin, Germany.,Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Shared Facility for Mass Spectrometry, 10117 Berlin, Germany
| | - Petra Henklein
- From the Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Institut für Biochemie, Germany, 10117 Berlin, Germany
| | - Katharina Janek
- Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institut für Biochemie, Germany, 10117 Berlin, Germany.,Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Shared Facility for Mass Spectrometry, 10117 Berlin, Germany
| | - Michael P H Stumpf
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, United Kingdom.,Melbourne Integrative Genomics, Schools of BioSciences and of Maths & Stats, University of Melbourne, Parkville, 3010 Victoria, Australia
| | - Michele Mishto
- Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institut für Biochemie, Germany, 10117 Berlin, Germany.,Centre for Inflammation Biology and Cancer Immunology (CIBCI) and Peter Gorer Department of Immunobiology, School of Immunology and Microbial Science, King's College London, London SE1 1UL, United Kingdom
| | - Juliane Liepe
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, United Kingdom.,Max Planck Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| |
|
42
|
Dianzani C, Vecchio D, Clemente N, Chiocchetti A, Martinelli Boneschi F, Galimberti D, Dianzani U, Comi C, Mishto M, Liepe J. Untangling Extracellular Proteasome-Osteopontin Circuit Dynamics in Multiple Sclerosis. Cells 2019; 8:cells8030262. [PMID: 30897778 PMCID: PMC6468732 DOI: 10.3390/cells8030262] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/14/2019] [Revised: 03/14/2019] [Accepted: 03/18/2019] [Indexed: 12/12/2022] Open
Abstract
The function of proteasomes in extracellular space is still largely unknown. The extracellular proteasome-osteopontin circuit has recently been hypothesized to be part of the inflammatory machinery regulating relapse/remission phase alternation in multiple sclerosis. However, it is still unclear what dynamics there are between the different elements of the circuit, what the role of proteasome isoforms is, and whether these inflammatory circuit dynamics are associated with the clinical severity of multiple sclerosis. To shed light on these aspects of this novel inflammatory circuit, we integrated in vitro proteasome isoform data, cell chemotaxis cell culture data, and clinical data of multiple sclerosis cohorts in a coherent computational inference framework. Thereby, we modeled extracellular osteopontin-proteasome circuit dynamics during relapse/remission alternation in multiple sclerosis. Applying this computational framework to a longitudinal study on single multiple sclerosis patients suggests a complex interaction between extracellular proteasome isoforms and osteopontin with potential clinical implications.
Affiliation(s)
- Chiara Dianzani
- Department of Drug Science and Technology, University of Turin, 10126 Torino, Italy.
| | - Domizia Vecchio
- Interdisciplinary Research Centre of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Amedeo Avogadro, 28100 Novara, Italy.
| | - Nausicaa Clemente
- Interdisciplinary Research Centre of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Amedeo Avogadro, 28100 Novara, Italy.
| | - Annalisa Chiocchetti
- Interdisciplinary Research Centre of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Amedeo Avogadro, 28100 Novara, Italy.
| | - Filippo Martinelli Boneschi
- Department of Biomedical Sciences for Health, University of Milan, 20122 Milan, Italy.
- MS Research Unit and Department of Neurology, IRCCS Policlinico San Donato, San Donato Milanese, 20097 Milan, Italy.
| | - Daniela Galimberti
- Department of Biomedical, Surgical and Dental Sciences, University of Milan, "Dino Ferrari" Centre, 20100 Milano, Italy.
- Fondazione IRCCS Cà Granda, Ospedale Maggiore Policlinico, 20100 Milano, Italy.
| | - Umberto Dianzani
- Interdisciplinary Research Centre of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Amedeo Avogadro, 28100 Novara, Italy.
| | - Cristoforo Comi
- Interdisciplinary Research Centre of Autoimmune Diseases (IRCAD), University of Piemonte Orientale, Amedeo Avogadro, 28100 Novara, Italy.
- Department of Translational Medicine, Section of Neurology, University of Piemonte Orientale, 28100 Novara, Italy.
| | - Michele Mishto
- Centre for Inflammation Biology and Cancer Immunology (CIBCI) & Peter Gorer Department of Immunobiology, King's College London, SE1 1UL London, UK.
- Institute for Biochemistry, Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institut für Biochemie, Germany, 10117 Berlin, Germany.
| | - Juliane Liepe
- Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany.
| |
|
43
|
Strutz J, Martin J, Greene J, Broadbelt L, Tyo K. Metabolic kinetic modeling provides insight into complex biological questions, but hurdles remain. Curr Opin Biotechnol 2019; 59:24-30. [PMID: 30851632 DOI: 10.1016/j.copbio.2019.02.005] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 08/31/2018] [Revised: 01/25/2019] [Accepted: 02/04/2019] [Indexed: 01/16/2023]
Abstract
Metabolic models containing kinetic information can answer unique questions about cellular metabolism that are useful to metabolic engineering. Several kinetic modeling frameworks have recently been developed or improved. In addition, techniques for systematic identification of model structure, including regulatory interactions, have been reported. Each framework has advantages and limitations, which can make it difficult to choose the most appropriate framework. Common limitations are data availability and computational time, especially in large-scale modeling efforts. However, recently developed experimental techniques, parameter identification algorithms, and model reduction techniques help alleviate these bottlenecks. Opportunities for additional improvements may come from the rich literature in catalysis and chemical networks. In all, kinetic models are positioned to make a significant impact in cellular engineering.
Affiliation(s)
- Jonathan Strutz
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
| | - Jacob Martin
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
| | - Jennifer Greene
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
| | - Linda Broadbelt
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA
| | - Keith Tyo
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL, USA; Center for Synthetic Biology, Northwestern University, Evanston, IL, USA.
| |
|
44
|
Using Experimental Data and Information Criteria to Guide Model Selection for Reaction–Diffusion Problems in Mathematical Biology. Bull Math Biol 2019; 81:1760-1804. [DOI: 10.1007/s11538-019-00589-x] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 10/15/2018] [Accepted: 02/20/2019] [Indexed: 12/20/2022]
|
45
|
Treece BW, Kienzle PA, Hoogerheide DP, Majkrzak CF, Lösche M, Heinrich F. Optimization of reflectometry experiments using information theory. J Appl Crystallogr 2019; 52:47-59. [PMID: 30800029 PMCID: PMC6362612 DOI: 10.1107/s1600576718017016] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 08/02/2018] [Accepted: 11/30/2018] [Indexed: 12/26/2022] Open
Abstract
A framework based on Bayesian statistics and information theory is developed to optimize the design of surface-sensitive reflectometry experiments. The method applies to model-based reflectivity data analysis, uses simulated reflectivity data and is capable of optimizing experiments that probe a sample under more than one condition. After presentation of the underlying theory and its implementation, the framework is applied to exemplary test problems for which the information gain ΔH is determined. Reflectivity data are simulated for the current generation of neutron reflectometers at the NIST Center for Neutron Research. However, the simulation can be easily modified for X-ray or neutron instruments at any source. With application to structural biology in mind, this work explores the dependence of ΔH on the scattering length density of aqueous solutions in which the sample structure is bathed, on the counting time and on the maximum momentum transfer of the measurement. Finally, the impact of a buried magnetic reference layer on ΔH is investigated.
Affiliation(s)
- Bradley W. Treece
- Department of Physics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
| | - Paul A. Kienzle
- Center for Neutron Research, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899-6102, USA
| | - David P. Hoogerheide
- Center for Neutron Research, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899-6102, USA
| | - Charles F. Majkrzak
- Center for Neutron Research, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899-6102, USA
| | - Mathias Lösche
- Department of Physics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
- Center for Neutron Research, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899-6102, USA
- Department of Biomedical Engineering, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
| | - Frank Heinrich
- Department of Physics, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, Pennsylvania 15213, USA
- Center for Neutron Research, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, Maryland 20899-6102, USA
| |
|
46
|
Dony L, He F, Stumpf MPH. Parametric and non-parametric gradient matching for network inference: a comparison. BMC Bioinformatics 2019; 20:52. [PMID: 30683048 PMCID: PMC6346534 DOI: 10.1186/s12859-018-2590-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/04/2018] [Accepted: 12/21/2018] [Indexed: 11/24/2022] Open
Abstract
Background: Reverse engineering of gene regulatory networks from time series gene-expression data is a challenging problem, not only because of the vast sets of candidate interactions but also due to the stochastic nature of gene expression. We limit our analysis to nonlinear differential equation based inference methods. In order to avoid the computational cost of large-scale simulations, a two-step Gaussian process interpolation based gradient matching approach has been proposed to solve differential equations approximately. Results: We apply a gradient matching inference approach to a large number of candidate models, including parametric differential equations or their corresponding non-parametric representations, and evaluate the network inference performance under various settings for different inference objectives. We use model averaging, based on the Bayesian Information Criterion (BIC), to combine the different inferences. The performance of different inference approaches is evaluated using area under the precision-recall curves. Conclusions: We found that parametric methods can provide comparable, and often improved, inference compared to non-parametric methods; the latter, however, require no kinetic information and are computationally more efficient. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2590-7) contains supplementary material, which is available to authorized users.
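The abstract above mentions BIC-based model averaging without stating the weighting scheme. As a minimal sketch, assuming the common choice of weights proportional to exp(-ΔBIC/2) (which approximates posterior model probabilities under equal model priors), the combination step might look as follows. The function names `bic` and `bic_weights` are hypothetical and are not taken from the cited paper.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def bic_weights(bics):
    """Turn a list of BIC values into normalized model-averaging weights.

    Assumption (not stated in the abstract): weight_i is proportional to
    exp(-0.5 * (BIC_i - BIC_min)). Subtracting the minimum avoids underflow.
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical candidate models: lower BIC receives a larger weight.
weights = bic_weights([10.0, 12.0, 20.0])
```

Under this scheme, a per-model prediction (for example, an edge score in a network) would be averaged using these weights; whether the cited study averages exactly this way is not stated in the abstract.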
Affiliation(s)
- Leander Dony
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.,Institute of Computational Biology, Helmholtz Center Munich, German Research Center for Environmental Health, Neuherberg, 85764, Germany.,Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, Munich, 80804, Germany
| | - Fei He
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK.,School of Computing, Electronics, and Mathematics, Coventry University, Coventry, CV1 2JH, UK
| | - Michael P H Stumpf
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK. .,Melbourne Integrative Genomics, School of BioScience & School of Mathematics and Statistics, University of Melbourne, Parkville Melbourne, 3010, Australia.
| |
|
47
|
Bandiera L, Gomez Cabeza D, Balsa-Canto E, Menolascina F. Bayesian model selection in synthetic biology: factor levels and observation functions. IFAC-PapersOnLine 2019. [DOI: 10.1016/j.ifacol.2019.12.231] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 02/02/2023]
|
48
|
Dony L, Mackerodt J, Ward S, Filippi S, Stumpf MPH, Liepe J. PEITH(Θ): perfecting experiments with information theory in Python with GPU support. Bioinformatics 2018; 34:1249-1250. [PMID: 29228182 PMCID: PMC5998942 DOI: 10.1093/bioinformatics/btx776] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/24/2017] [Accepted: 12/04/2017] [Indexed: 01/27/2023] Open
Abstract
Motivation: Different experiments provide differing levels of information about a biological system. This makes it difficult, a priori, to select one of them beyond mere speculation and/or belief, especially when resources are limited. With the increasing diversity of experimental approaches and general advances in quantitative systems biology, methods that quantify the information a given experiment carries about the question we want to answer become crucial. Results: PEITH(Θ) is a general-purpose Python framework for experimental design in systems biology. PEITH(Θ) uses Bayesian inference and information theory to determine which experiments are most informative for estimating all model parameters and/or performing model predictions. Availability and implementation: https://github.com/MichaelPHStumpf/Peitho
Affiliation(s)
| | | | | | - Sarah Filippi
- Faculty of Medicine, School of Public Health.,Department of Mathematics, Imperial College London, London SW7 2AZ, UK
| | | | - Juliane Liepe
- Department of Life Sciences.,Max-Planck-Institute for Biophysical Chemistry, 37077 Göttingen, Germany
| |
|
49
|
Jeong JE, Qiu P. Quantifying the relative importance of experimental data points in parameter estimation. BMC Syst Biol 2018; 12:103. [PMID: 30463558 PMCID: PMC6249737 DOI: 10.1186/s12918-018-0622-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
Background: Ordinary differential equations (ODEs) are often used to understand biological processes. Since ODE-based models usually contain many unknown parameters, parameter estimation is an important step toward deeper understanding of the process. Parameter estimation is often formulated as a least squares optimization problem, where all experimental data points are considered equally important. However, this equal-weight formulation ignores the possibility that data points differ in relative importance, and may lead to misleading parameter estimation results. Therefore, we propose to introduce weights to account for the relative importance of different data points when formulating the least squares optimization problem. Each weight is defined by the uncertainty of one data point given the other data points. If one data point can be accurately inferred given the other data, the uncertainty of this data point is low and its importance is low. Conversely, if inferring one data point from the other data is almost impossible, it has high uncertainty and carries more information for estimating parameters. Results: A G1/S transition model with 6 parameters and with 12 parameters, and a MAPK module with 14 parameters, were used to test the weighted formulation. In each case, evenly spaced experimental data points were used. Weights calculated in these models showed similar patterns: high weights for data points in dynamic regions and low weights for data points in flat regions. We developed a sampling algorithm to evaluate the weighted formulation, and demonstrated that the weighted formulation reduced the redundancy in the data. For the G1/S transition model with 12 parameters, we examined unevenly spaced experimental data points, strategically sampled to have more measurement points where the weights were relatively high, and fewer measurement points where the weights were relatively low. This analysis showed that the proposed weights can be used for designing measurement time points. Conclusions: Giving a different weight to each data point according to its relative importance compared to other data points is an effective method for improving robustness of parameter estimation by reducing the redundancy in the experimental data.
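To make the weighted least squares formulation described in this abstract concrete, the following minimal sketch shows only the objective and a toy fit. It does not reproduce the paper's uncertainty-based weight calculation: the weights are supplied by the caller, and the one-parameter decay model, the grid search, and the names `weighted_sse` and `fit_decay_rate` are hypothetical illustrations.

```python
import math


def weighted_sse(observed, predicted, weights):
    """Weighted sum of squared residuals: sum_i w_i * (y_i - f_i)^2."""
    if not (len(observed) == len(predicted) == len(weights)):
        raise ValueError("observed, predicted and weights must have equal length")
    return sum(w * (y - f) ** 2 for y, f, w in zip(observed, predicted, weights))


def fit_decay_rate(times, observed, weights, candidates):
    """Pick the rate k (from a candidate grid) minimizing the weighted cost
    for the toy model y(t) = exp(-k * t)."""
    def cost(k):
        predicted = [math.exp(-k * t) for t in times]
        return weighted_sse(observed, predicted, weights)
    return min(candidates, key=cost)


# Toy data generated from k = 0.5; larger (hypothetical) weights are placed
# on the early, fast-changing points and a smaller one on the flat tail.
times = [0.0, 1.0, 2.0, 3.0]
observed = [math.exp(-0.5 * t) for t in times]
weights = [1.0, 2.0, 2.0, 0.5]
best_k = fit_decay_rate(times, observed, weights, [0.1, 0.3, 0.5, 0.7, 0.9])
```

Setting all weights to 1 recovers the ordinary equal-weight least squares problem that the abstract contrasts against.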
Affiliation(s)
- Jenny E. Jeong
- Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, 30332 GA USA
| | - Peng Qiu
- Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, 30332 GA USA
| |
|
50
|
Comprehensive experimental design for chemical engineering processes: A two-layer iterative design approach. Chem Eng Sci 2018. [DOI: 10.1016/j.ces.2018.05.047] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/19/2022]
|