1
Xu S, Xu T, Yang Y, Chen X. Learning metabolic dynamics from irregular observations by Bidirectional Time-Series State Transfer Network. mSystems 2024; 9:e0069724. [PMID: 39057922 PMCID: PMC11334518 DOI: 10.1128/msystems.00697-24] [Received: 05/21/2024] [Accepted: 07/03/2024] [Indexed: 07/28/2024]
Abstract
Modeling microbial metabolic dynamics is important for the rational optimization of both biosynthetic systems and industrial processes to facilitate green and efficient biomanufacturing. Classical approaches utilize explicit equation systems to represent metabolic networks, enabling the quantification of pathway fluxes to identify metabolic bottlenecks. However, these white-box models, despite their diverse applications, have limitations in simulating metabolic dynamics and are intrinsically inaccurate for industrial strains that lack information on network structures and kinetic parameters. On the other hand, black-box models do not rely on prior mechanistic knowledge of strains but are built upon observed time-series trajectories of biosynthetic systems in action. In practice, these observations are typically irregular, with discontinuously observed time points across multiple independent batches, each time point potentially containing missing measurements. Learning from such irregular data remains challenging for existing approaches. To address this issue, we present the Bidirectional Time-Series State Transfer Network (BTSTN) for modeling metabolic dynamics directly from irregular observations. Using evaluation data sets derived from both ideal dynamic systems and a real-world fermentation process, we demonstrate that BTSTN accurately reconstructs dynamic behaviors and predicts future trajectories. This approach exhibits enhanced robustness against missing measurements and noise compared with state-of-the-art methods.
IMPORTANCE Industrial biosynthetic systems often involve strains with unclear genetic backgrounds, posing challenges in modeling their distinct metabolic dynamics. In such scenarios, white-box models, which commonly rely on inferred networks, are of limited applicability and accuracy. In contrast, black-box models, such as statistical models and neural networks, are directly fitted or learned from observed time-series trajectories of biosynthetic systems in action. These methods typically assume regular observations without missing time points or measurements. If the observations are irregular, a pre-processing step becomes necessary to obtain a fully filled data set for subsequent model training, which inevitably introduces errors into the resulting models. BTSTN is a novel approach that natively learns from irregular observations. This distinctive feature makes it a unique addition to the current arsenal of technologies modeling metabolic dynamics.
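One common way to learn from observations with missing measurements, without first filling them in, is to score a model only on the entries that were actually observed. The sketch below illustrates that general idea with a masked mean squared error; it is not the BTSTN architecture or its loss, and the function name `masked_mse` and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def masked_mse(pred, obs):
    """Mean squared error over observed entries only.

    NaN in `obs` marks a missing measurement (an assumed convention).
    """
    mask = ~np.isnan(obs)
    if not mask.any():
        raise ValueError("no observed entries")
    # Missing entries contribute zero error instead of being imputed.
    diff = np.where(mask, pred - obs, 0.0)
    return float((diff ** 2).sum() / mask.sum())

# Hypothetical batch: 3 time points, 2 measured variables, 2 values missing.
obs = np.array([[1.0, np.nan],
                [2.0, 3.0],
                [np.nan, 5.0]])
pred = np.array([[1.0, 9.0],
                 [2.5, 3.0],
                 [7.0, 4.0]])

loss = masked_mse(pred, obs)
print(loss)
```

Predictions at the missing positions (9.0 and 7.0 here) do not affect the loss, so no pre-processing step is needed to complete the data set.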
Affiliation(s)
- Shaohua Xu
  - School of Basic Medical Sciences and the First Affiliated Hospital Department of Radiation Oncology, Zhejiang University School of Medicine, Hangzhou, China
  - Zhejiang Provincial Key Laboratory for Microbial Biochemistry and Metabolic Engineering, Hangzhou, China
- Ting Xu
  - School of Basic Medical Sciences and the First Affiliated Hospital Department of Radiation Oncology, Zhejiang University School of Medicine, Hangzhou, China
- Yuping Yang
  - School of Basic Medical Sciences and the First Affiliated Hospital Department of Radiation Oncology, Zhejiang University School of Medicine, Hangzhou, China
- Xin Chen
  - School of Basic Medical Sciences and the First Affiliated Hospital Department of Radiation Oncology, Zhejiang University School of Medicine, Hangzhou, China
  - Zhejiang Provincial Key Laboratory for Microbial Biochemistry and Metabolic Engineering, Hangzhou, China
2
Biddau G, Caviglia G, Piana M, Sommariva S. PCA-based synthetic sensitivity coefficients for chemical reaction network in cancer. Sci Rep 2024; 14:17706. [PMID: 39085332 PMCID: PMC11291660 DOI: 10.1038/s41598-024-67862-5] [Received: 03/17/2024] [Accepted: 07/16/2024] [Indexed: 08/02/2024]
Abstract
Chemical reaction networks are powerful tools for modeling cell signaling and its disruptions in diseases like cancer. Realistic chemical reaction networks involve hundreds of proteins and reactions, resulting in a model depending on a consistently large number of kinetic parameters. Since finely calibrating all the parameters would require an unrealistic amount of data, proper sensitivity analysis is required to identify a subset of parameters for which fine tuning is needed and thus provide a fundamental tool for the qualitative analysis of the network. We present a multidisciplinary approach for computing a set of synthetic sensitivity indices. These indices rank the kinetic parameters, based on the impact that errors in their values would have on the protein concentration profile at equilibrium. Our tests on a chemical reaction network devised for colorectal cells demonstrate the effectiveness of the considered sensitivity indices in different scenarios including in-silico drug dosage and novel therapeutic target discovery. The Matlab code for computing the synthetic sensitivity indices and the data concerning the network for colorectal cells are available at https://github.com/theMIDAgroup/CRN_sensitivity.
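To make the idea of ranking kinetic parameters by their effect on equilibrium concentrations concrete, here is a minimal sketch using plain local log-sensitivities computed by central finite differences. It is not the PCA-based synthetic index of the paper and does not use the authors' Matlab code; the toy two-species network (synthesis rate `s`, conversion rate `k1`, degradation rate `k2`) and all function names are assumptions.

```python
import numpy as np

def equilibrium(params):
    """Steady state of the toy network dA/dt = s - k1*A, dB/dt = k1*A - k2*B."""
    s, k1, k2 = params
    return np.array([s / k1, s / k2])

def log_sensitivities(f, params, rel_step=1e-6):
    """Matrix S[i, j] = d ln x_i* / d ln p_j via central differences."""
    params = np.asarray(params, dtype=float)
    n_out = f(params).size
    S = np.empty((n_out, params.size))
    for j in range(params.size):
        up, down = params.copy(), params.copy()
        up[j] *= 1 + rel_step
        down[j] *= 1 - rel_step
        S[:, j] = (np.log(f(up)) - np.log(f(down))) / (np.log(up[j]) - np.log(down[j]))
    return S

names = ["s", "k1", "k2"]          # hypothetical parameter names
params = [2.0, 0.5, 4.0]           # hypothetical nominal values

S = log_sensitivities(equilibrium, params)
scores = np.linalg.norm(S, axis=0)            # one aggregate score per parameter
ranking = [names[j] for j in np.argsort(-scores)]
print(S.round(3))
print(ranking)
```

In this toy case the synthesis rate affects both species and therefore ranks first, while each of the other two rates affects only one species.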
Affiliation(s)
- Giorgia Biddau
  - MIDA, Dipartimento di Matematica, Dipartimento di Eccellenza 2023-2027, Università di Genova, Genova, Italy
- Giacomo Caviglia
  - MIDA, Dipartimento di Matematica, Dipartimento di Eccellenza 2023-2027, Università di Genova, Genova, Italy
- Michele Piana
  - MIDA, Dipartimento di Matematica, Dipartimento di Eccellenza 2023-2027, Università di Genova, Genova, Italy
  - IRCCS Ospedale Policlinico San Martino, LISCOMP, Genova, Italy
- Sara Sommariva
  - MIDA, Dipartimento di Matematica, Dipartimento di Eccellenza 2023-2027, Università di Genova, Genova, Italy
3
Massonis G, Villaverde AF, Banga JR. Improving dynamic predictions with ensembles of observable models. Bioinformatics 2022; 39:6842325. [PMID: 36416122 PMCID: PMC9805594 DOI: 10.1093/bioinformatics/btac755] [Received: 06/30/2022] [Revised: 10/20/2022] [Accepted: 11/22/2022] [Indexed: 11/24/2022]
Abstract
MOTIVATION Dynamic mechanistic modelling in systems biology has been hampered by the complexity and variability associated with the underlying interactions, and by uncertain and sparse experimental measurements. Ensemble modelling, a concept initially developed in statistical mechanics, has been introduced in biological applications with the aim of mitigating those issues. Ensemble modelling uses a collection of different models compatible with the observed data to describe the phenomena of interest. However, since systems biology models often suffer from a lack of identifiability and observability, ensembles of models are particularly unreliable when predicting non-observable states. RESULTS We present a strategy to assess and improve the reliability of a class of model ensembles. In particular, we consider kinetic models described using ordinary differential equations with a fixed structure. Our approach builds an ensemble with a selection of the parameter vectors found when performing parameter estimation with a global optimization metaheuristic. This technique enforces diversity during the sampling of parameter space and it can quantify the uncertainty in the predictions of state trajectories. We couple this strategy with structural identifiability and observability analysis, and when these tests detect possible prediction issues we obtain model reparameterizations that surmount them. The end result is an ensemble of models with the ability to predict the internal dynamics of a biological process. We demonstrate our approach with models of glucose regulation, cell division, circadian oscillations and the JAK-STAT signalling pathway. AVAILABILITY AND IMPLEMENTATION The code that implements the methodology and reproduces the results is available at https://doi.org/10.5281/zenodo.6782638. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
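The core ensemble idea, simulating one fixed-structure model under several parameter vectors and summarizing the spread of the resulting trajectories, can be sketched as follows. This is an illustrative toy, not the published code: the one-state decay model, the three parameter vectors, and the function name `simulate` are all assumptions, and no identifiability or observability analysis is performed here.

```python
import numpy as np

def simulate(theta, t):
    """Closed-form solution of dx/dt = -k*x with x(0) = x0 (toy model)."""
    k, x0 = theta
    return x0 * np.exp(-k * t)

# Hypothetical ensemble: parameter vectors (k, x0) that all fit some data set.
ensemble = np.array([[0.9, 1.0],
                     [1.0, 1.0],
                     [1.1, 1.0]])
t = np.linspace(0.0, 5.0, 6)

# One trajectory per ensemble member: shape (members, time points).
trajs = np.stack([simulate(theta, t) for theta in ensemble])

mean = trajs.mean(axis=0)     # ensemble prediction
spread = trajs.std(axis=0)    # simple uncertainty measure across members
print(mean.round(4))
print(spread.round(4))
```

The spread is zero at the shared initial condition and nonzero afterwards, which is the kind of trajectory-level uncertainty an ensemble is meant to expose.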
Affiliation(s)
- Gemma Massonis
  - Computational Biology Lab, MBG-CSIC (Spanish National Research Council), Pontevedra, Galicia 36143, Spain
4
Boada Y, Vignoni A, Alarcon-Ruiz I, Andreu-Vilarroig C, Monfort-Llorens R, Requena A, Picó J. Characterization of Gene Circuit Parts Based on Multiobjective Optimization by Using Standard Calibrated Measurements. Chembiochem 2019; 20:2653-2665. [PMID: 31269324 DOI: 10.1002/cbic.201900272] [Received: 04/27/2019] [Revised: 06/12/2019] [Indexed: 01/08/2023]
Abstract
Standardization and characterization of biological parts are necessary for the further development of bottom-up synthetic biology. Herein, an easy-to-use methodology that embodies both a calibration procedure and a multiobjective optimization approach is proposed to characterize biological parts. The calibration procedure generates values for specific fluorescence per cell expressed as standard units of molecules of equivalent fluorescein per particle. The use of absolute standard units enhances the characterization of model parameters for biological parts by bringing measurement and estimation results from different sources into a common domain, so they can be integrated and compared faithfully. The multiobjective optimization procedure exploits these concepts by estimating the values of the model parameters, which represent biological parts of interest, while considering a varied range of experimental and circuit contexts. Thus, multiobjective optimization provides a robust characterization of these parts. The proposed calibration and characterization methodology can be used as a guide for good practices in dry and wet laboratories; it not only allows portability between models but is also useful for generating libraries of tested and well-characterized biological parts.
Affiliation(s)
- Yadira Boada
  - Synthetic Biology and Biosystems Control Lab, I.U. de Automática e Informática Industrial (ai2), Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
  - Centro Universitario EDEM, Escuela de Empresarios, La Marina de València, Muelle de la Aduana S/N, 46024, Valencia, Spain
- Alejandro Vignoni
  - Synthetic Biology and Biosystems Control Lab, I.U. de Automática e Informática Industrial (ai2), Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
- Iván Alarcon-Ruiz
  - Synthetic Biology and Biosystems Control Lab, I.U. de Automática e Informática Industrial (ai2), Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
  - Escuela Técnica Superior de Ingeniería Agronómica y del Medio Natural, Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
- Carlos Andreu-Vilarroig
  - Escuela Técnica Superior de Ingeniería Industrial, Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
- Roger Monfort-Llorens
  - Synthetic Biology and Biosystems Control Lab, I.U. de Automática e Informática Industrial (ai2), Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
  - Escuela Técnica Superior de Ingeniería Industrial, Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
- Adrián Requena
  - Synthetic Biology and Biosystems Control Lab, I.U. de Automática e Informática Industrial (ai2), Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
  - Escuela Técnica Superior de Ingeniería Agronómica y del Medio Natural, Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
- Jesús Picó
  - Synthetic Biology and Biosystems Control Lab, I.U. de Automática e Informática Industrial (ai2), Universitat Politècnica de Valencia, Camino de Vera S/N, 46022, Valencia, Spain
5
Liu F, Heiner M, Gilbert D. Fuzzy Petri nets for modelling of uncertain biological systems. Brief Bioinform 2018; 21:198-210. [PMID: 30590430 DOI: 10.1093/bib/bby118] [Received: 08/21/2018] [Revised: 10/09/2018] [Accepted: 11/16/2018] [Indexed: 12/28/2022]
Abstract
The modelling of biological systems is accompanied by epistemic uncertainties that range from structural uncertainty to parametric uncertainty, due to limitations such as insufficient understanding of the underlying mechanism and incomplete measurement data of a system. Fuzzy logic approaches such as fuzzy Petri nets (FPNs) are effective in addressing these issues. In this paper, we review FPNs that have been used for modelling uncertain biological systems, which we classify in three categories: basic fuzzy Petri nets, fuzzy quantitative Petri nets and Petri nets with fuzzy kinetic parameters. For each category of these FPNs, we summarize its modelling capabilities and current applications, discuss its merits and drawbacks and give suggestions for further research. This understanding of how to use FPNs for modelling uncertain biological systems will assist readers in selecting appropriate FPN classes for specific modelling circumstances. This review may also promote the extensive research and application of FPNs in the systems biology area.
Affiliation(s)
- Fei Liu
  - School of Software Engineering, South China University of Technology, Guangzhou, P. R. China
- Monika Heiner
  - Department of Computer Science, Brandenburg University of Technology Cottbus-Senftenberg, Cottbus, Germany
- David Gilbert
  - Department of Computer Science, Brunel University London, Middlesex, UK
6
Henriques D, Villaverde AF, Rocha M, Saez-Rodriguez J, Banga JR. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models. PLoS Comput Biol 2017; 13:e1005379. [PMID: 28166222 PMCID: PMC5319798 DOI: 10.1371/journal.pcbi.1005379] [Received: 05/11/2016] [Revised: 02/21/2017] [Accepted: 01/24/2017] [Indexed: 11/19/2022]
Abstract
Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). 
For this task, SELDOM's ensemble prediction is not only consistently better than predictions from individual models, but also often outperforms the state of the art represented by the methods used in the HPN-DREAM challenge.
Affiliation(s)
- David Henriques
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
- Alejandro F. Villaverde
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Miguel Rocha
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Julio Saez-Rodriguez
  - Joint Research Center for Computational Biomedicine, RWTH-Aachen University, Aachen, Germany
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Julio R. Banga
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
7
Penas DR, González P, Egea JA, Doallo R, Banga JR. Parameter estimation in large-scale systems biology models: a parallel and self-adaptive cooperative strategy. BMC Bioinformatics 2017; 18:52. [PMID: 28109249 PMCID: PMC5251293 DOI: 10.1186/s12859-016-1452-4] [Received: 02/24/2016] [Accepted: 12/24/2016] [Indexed: 12/02/2022]
Abstract
Background The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. Results The performance and robustness of saCeSS are illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state of the art methods (from days to minutes, in several cases) even when only a small number of processors is used. Conclusions The new parallel cooperative method presented here allows the solution of medium and large scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1452-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- David R Penas
  - BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
- Patricia González
  - Computer Architecture Group, Universidade da Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
- Jose A Egea
  - Department of Applied Mathematics and Statistics, Universidad Politécnica de Cartagena, c/ Dr. Fleming s/n, Cartagena, 30202, Spain
- Ramón Doallo
  - Computer Architecture Group, Universidade da Coruña, Campus de Elviña s/n, A Coruña, 15071, Spain
- Julio R Banga
  - BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
8
Srinivasan S, Cluett WR, Mahadevan R. Constructing kinetic models of metabolism at genome-scales: A review. Biotechnol J 2016; 10:1345-59. [PMID: 26332243 DOI: 10.1002/biot.201400522] [Received: 09/08/2014] [Revised: 04/01/2015] [Accepted: 07/08/2015] [Indexed: 11/08/2022]
Abstract
Constraint-based modeling of biological networks (metabolism, transcription and signal transduction), although used successfully in many applications, suffers from specific limitations such as the lack of representation of metabolite concentrations and enzymatic regulation, which are necessary for a complete physiologically relevant model. Kinetic models conversely overcome these shortcomings and enable dynamic analysis of biological systems for enhanced in silico hypothesis generation. Nonetheless, kinetic models also have limitations for modeling at genome-scales chiefly due to: (i) model non-linearity; (ii) computational tractability; (iii) parameter identifiability; (iv) estimability; and (v) uncertainty. In order to support further development of kinetic models as viable alternatives to constraint-based models, this review presents a brief description of the existing obstacles towards building genome-scale kinetic models. Specific kinetic modeling frameworks capable of overcoming these obstacles are covered in this review. The tractability and physiological feasibility of these models are discussed with the objective of using available in vivo experimental observations to define the model parameter space. Among the different methods discussed, Monte Carlo kinetic models of metabolism stand out as potentially tractable methods to model genome scale networks while also addressing in vivo parameter uncertainty.
Affiliation(s)
- Shyam Srinivasan
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, Canada
- William R Cluett
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, Canada
- Radhakrishnan Mahadevan
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, Canada
  - Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada
9
Costa RS, Hartmann A, Vinga S. Kinetic modeling of cell metabolism for microbial production. J Biotechnol 2015; 219:126-41. [PMID: 26724578 DOI: 10.1016/j.jbiotec.2015.12.023] [Received: 10/01/2015] [Revised: 11/25/2015] [Accepted: 12/15/2015] [Indexed: 12/20/2022]
Abstract
Kinetic models of cellular metabolism are important tools for the rational design of metabolic engineering strategies and for explaining properties of complex biological systems. Recent developments in high-throughput experimental data are leading to new computational approaches for building kinetic models of metabolism. Herein, we briefly survey the available databases, standards and software tools that can be applied to kinetic models of metabolism. In addition, we give an overview of recently developed ordinary differential equation (ODE)-based kinetic models of metabolism and illustrate some of their main applications in guiding metabolic engineering design. Finally, we review the emerging kinetic modeling approaches for large-scale networks, discussing their main advantages, challenges and limitations.
Affiliation(s)
- Rafael S Costa
  - IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
- Andras Hartmann
  - IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
- Susana Vinga
  - IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal
10
Almquist J, Bendrioua L, Adiels CB, Goksör M, Hohmann S, Jirstrand M. A Nonlinear Mixed Effects Approach for Modeling the Cell-To-Cell Variability of Mig1 Dynamics in Yeast. PLoS One 2015; 10:e0124050. [PMID: 25893847 PMCID: PMC4404321 DOI: 10.1371/journal.pone.0124050] [Received: 10/23/2014] [Accepted: 02/25/2015] [Indexed: 11/29/2022]
Abstract
The last decade has seen a rapid development of experimental techniques that allow data collection from individual cells. These techniques have enabled the discovery and characterization of variability within a population of genetically identical cells. Nonlinear mixed effects (NLME) modeling is an established framework for studying variability between individuals in a population, frequently used in pharmacokinetics and pharmacodynamics, but its potential for studies of cell-to-cell variability in molecular cell biology is yet to be exploited. Here we take advantage of this novel application of NLME modeling to study cell-to-cell variability in the dynamic behavior of the yeast transcription repressor Mig1. In particular, we investigate a recently discovered phenomenon where Mig1 during a short and transient period exits the nucleus when cells experience a shift from high to intermediate levels of extracellular glucose. A phenomenological model based on ordinary differential equations describing the transient dynamics of nuclear Mig1 is introduced, and according to the NLME methodology the parameters of this model are in turn modeled by a multivariate probability distribution. Using time-lapse microscopy data from nearly 200 cells, we estimate this parameter distribution according to the approach of maximizing the population likelihood. Based on the estimated distribution, parameter values for individual cells are furthermore characterized and the resulting Mig1 dynamics are compared to the single-cell time-series data. The proposed NLME framework is also compared to the intuitive but limited standard two-stage (STS) approach. We demonstrate that the latter may overestimate variabilities by up to almost fivefold. Finally, Monte Carlo simulations of the inferred population model are used to predict the distribution of key characteristics of the Mig1 transient response. We find that with decreasing levels of post-shift glucose, the transient response of Mig1 tends to be faster and more extended, and displays increased cell-to-cell variability.
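Why a two-stage approach can overestimate population variability is easy to see in a toy simulation: the variance of per-cell estimates contains both the true between-cell variance and the estimation error of each cell. The sketch below is not the paper's ODE model or an NLME fit; the Gaussian setup, the values of `omega` and `sigma`, and the simple moment correction are assumptions chosen only to show the inflation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_obs = 5000, 4
omega, sigma = 1.0, 2.0   # assumed between-cell SD and measurement-noise SD

# True per-cell parameter and noisy repeated measurements of it.
theta = rng.normal(10.0, omega, size=n_cells)
y = theta[:, None] + rng.normal(0.0, sigma, size=(n_cells, n_obs))

# Stage 1: estimate each cell separately. Stage 2: variance of the estimates.
theta_hat = y.mean(axis=1)
sts_var = theta_hat.var(ddof=1)          # expected near omega^2 + sigma^2 / n_obs

# Moment correction: subtract the expected estimation-error variance.
within_var = y.var(axis=1, ddof=1).mean()    # pooled estimate of sigma^2
corrected_var = sts_var - within_var / n_obs

print(f"true variance      {omega**2:.2f}")
print(f"two-stage variance {sts_var:.2f}")
print(f"corrected variance {corrected_var:.2f}")
```

With these settings the two-stage variance is roughly twice the true value, and the inflation grows as cells are observed less often or more noisily. A mixed effects model addresses this by estimating the population distribution and the measurement noise jointly.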
Affiliation(s)
- Joachim Almquist
  - Fraunhofer-Chalmers Centre, Chalmers Science Park, Göteborg, Sweden
  - Systems and Synthetic Biology, Department of Chemical and Biological Engineering, Chalmers University of Technology, Göteborg, Sweden
- Loubna Bendrioua
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Göteborg, Sweden
  - Department of Physics, University of Gothenburg, Göteborg, Sweden
- Mattias Goksör
  - Department of Physics, University of Gothenburg, Göteborg, Sweden
- Stefan Hohmann
  - Department of Chemistry and Molecular Biology, University of Gothenburg, Göteborg, Sweden
- Mats Jirstrand
  - Fraunhofer-Chalmers Centre, Chalmers Science Park, Göteborg, Sweden
11
Villaverde AF, Bongard S, Mauch K, Müller D, Balsa-Canto E, Schmid J, Banga JR. A consensus approach for estimating the predictive accuracy of dynamic models in biology. Computer Methods and Programs in Biomedicine 2015; 119:17-28. [PMID: 25716416 DOI: 10.1016/j.cmpb.2015.02.001] [Received: 07/16/2014] [Revised: 12/19/2014] [Accepted: 02/02/2015] [Indexed: 06/04/2023]
Abstract
Mathematical models that predict the complex dynamic behaviour of cellular networks are fundamental in systems biology, and provide an important basis for biomedical and biotechnological applications. However, obtaining reliable predictions from large-scale dynamic models is commonly a challenging task due to lack of identifiability. The present work addresses this challenge by presenting a methodology for obtaining high-confidence predictions from dynamic models using time-series data. First, to preserve the complex behaviour of the network while reducing the number of estimated parameters, model parameters are combined in sets of meta-parameters, which are obtained from correlations between biochemical reaction rates and between concentrations of the chemical species. Next, an ensemble of models with different parameterizations is constructed and calibrated. Finally, the ensemble is used for assessing the reliability of model predictions by defining a measure of convergence of model outputs (consensus) that is used as an indicator of confidence. We report results of computational tests carried out on a metabolic model of Chinese Hamster Ovary (CHO) cells, which are used for recombinant protein production. Using noisy simulated data, we find that the aggregated ensemble predictions are on average more accurate than the predictions of individual ensemble models. Furthermore, ensemble predictions with high consensus are statistically more accurate than ensemble predictions with large variance. The procedure provides quantitative estimates of the confidence in model predictions and enables the analysis of sufficiently complex networks as required for practical applications.
Affiliation(s)
- Sophia Bongard
  - Insilico Biotechnology AG, Meitnerstraße 8, 70563 Stuttgart, Germany
- Klaus Mauch
  - Insilico Biotechnology AG, Meitnerstraße 8, 70563 Stuttgart, Germany
- Dirk Müller
  - Insilico Biotechnology AG, Meitnerstraße 8, 70563 Stuttgart, Germany
- Eva Balsa-Canto
  - Bioprocess Engineering Group, IIM-CSIC, Eduardo Cabello 6, 36208 Vigo, Spain
- Joachim Schmid
  - Insilico Biotechnology AG, Meitnerstraße 8, 70563 Stuttgart, Germany
- Julio R Banga
  - Bioprocess Engineering Group, IIM-CSIC, Eduardo Cabello 6, 36208 Vigo, Spain
12
Abstract
Mathematical models of natural systems are abstractions of much more complicated processes. Developing informative and realistic models of such systems typically involves suitable statistical inference methods, domain expertise, and a modicum of luck. Except for cases where physical principles provide sufficient guidance, it will also be generally possible to come up with a large number of potential models that are compatible with a given natural system and any finite amount of data generated from experiments on that system. Here we develop a computational framework to systematically evaluate potentially vast sets of candidate differential equation models in light of experimental and prior knowledge about biological systems. This topological sensitivity analysis enables us to evaluate quantitatively the dependence of model inferences and predictions on the assumed model structures. Failure to consider the impact of structural uncertainty introduces biases into the analysis and potentially gives rise to misleading conclusions.
|
13
|
Almquist J, Cvijovic M, Hatzimanikatis V, Nielsen J, Jirstrand M. Kinetic models in industrial biotechnology - Improving cell factory performance. Metab Eng 2014; 24:38-60. [PMID: 24747045 DOI: 10.1016/j.ymben.2014.03.007] [Citation(s) in RCA: 158] [Impact Index Per Article: 15.8] [Received: 09/11/2013] [Revised: 03/07/2014] [Accepted: 03/09/2014] [Indexed: 11/16/2022]
Abstract
An increasing number of industrial bioprocesses capitalize on living cells by using them as cell factories that convert sugars into chemicals. These processes range from the production of bulk chemicals in yeasts and bacteria to the synthesis of therapeutic proteins in mammalian cell lines. One of the tools in the continuous search for improved performance of such production systems is the development and application of mathematical models. To be of value for industrial biotechnology, mathematical models should be able to assist in the rational design of cell factory properties or in the production processes in which they are utilized. Kinetic models are particularly suitable towards this end because they are capable of representing the complex biochemistry of cells in a more complete way compared to most other types of models. They can, at least in principle, be used to in detail understand, predict, and evaluate the effects of adding, removing, or modifying molecular components of a cell factory and for supporting the design of the bioreactor or fermentation process. However, several challenges still remain before kinetic modeling will reach the degree of maturity required for routine application in industry. Here we review the current status of kinetic cell factory modeling. Emphasis is on modeling methodology concepts, including model network structure, kinetic rate expressions, parameter estimation, optimization methods, identifiability analysis, model reduction, and model validation, but several applications of kinetic models for the improvement of cell factories are also discussed.
Affiliation(s)
- Joachim Almquist
- Fraunhofer-Chalmers Centre, Chalmers Science Park, SE-412 88 Göteborg, Sweden; Systems and Synthetic Biology, Department of Chemical and Biological Engineering, Chalmers University of Technology, SE-412 96 Göteborg, Sweden.
- Marija Cvijovic
- Mathematical Sciences, Chalmers University of Technology and University of Gothenburg, SE-412 96 Göteborg, Sweden; Mathematical Sciences, University of Gothenburg, SE-412 96 Göteborg, Sweden.
- Vassily Hatzimanikatis
- Laboratory of Computational Systems Biotechnology, Ecole Polytechnique Federale de Lausanne, CH 1015 Lausanne, Switzerland.
- Jens Nielsen
- Systems and Synthetic Biology, Department of Chemical and Biological Engineering, Chalmers University of Technology, SE-412 96 Göteborg, Sweden.
- Mats Jirstrand
- Fraunhofer-Chalmers Centre, Chalmers Science Park, SE-412 88 Göteborg, Sweden.
|
14
|
Villaverde AF, Banga JR. Reverse engineering and identification in systems biology: strategies, perspectives and challenges. J R Soc Interface 2014; 11:20130505. [PMID: 24307566 PMCID: PMC3869153 DOI: 10.1098/rsif.2013.0505] [Citation(s) in RCA: 163] [Impact Index Per Article: 16.3] [Received: 06/07/2013] [Accepted: 11/12/2013] [Indexed: 12/17/2022] Open
Abstract
The interplay of mathematical modelling with experiments is one of the central elements in systems biology. The aim of reverse engineering is to infer, analyse and understand, through this interplay, the functional and regulatory mechanisms of biological systems. Reverse engineering is not exclusive of systems biology and has been studied in different areas, such as inverse problem theory, machine learning, nonlinear physics, (bio)chemical kinetics, control theory and optimization, among others. However, it seems that many of these areas have been relatively closed to outsiders. In this contribution, we aim to compare and highlight the different perspectives and contributions from these fields, with emphasis on two key questions: (i) why are reverse engineering problems so hard to solve, and (ii) what methods are available for the particular problems arising from systems biology?
Affiliation(s)
- Julio R. Banga
- BioProcess Engineering Group, IIM-CSIC, Spanish National Research Council, Vigo 36208, Spain
|
15
|
Kourdis PD, Goussis DA. Glycolysis in Saccharomyces cerevisiae: Algorithmic exploration of robustness and origin of oscillations. Math Biosci 2013; 243:190-214. [DOI: 10.1016/j.mbs.2013.03.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 08/02/2012] [Revised: 03/03/2013] [Accepted: 03/04/2013] [Indexed: 01/15/2023]
|
16
|
Sunnaker M, Zamora-Sillero E, Dechant R, Ludwig C, Busetto AG, Wagner A, Stelling J. Automatic Generation of Predictive Dynamic Models Reveals Nuclear Phosphorylation as the Key Msn2 Control Mechanism. Sci Signal 2013; 6:ra41. [DOI: 10.1126/scisignal.2003621] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 12/11/2022]
|
17
|
Abstract
Reconstructing gene regulatory networks from high-throughput data is a long-standing problem. Through the DREAM project (Dialogue on Reverse Engineering Assessment and Methods), we performed a comprehensive blind assessment of over thirty network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae, and in silico microarray data. We characterize performance, data requirements, and inherent biases of different inference approaches offering guidelines for both algorithm application and development. We observe that no single inference method performs optimally across all datasets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse datasets. Thereby, we construct high-confidence networks for E. coli and S. aureus, each comprising ~1700 transcriptional interactions at an estimated precision of 50%. We experimentally test 53 novel interactions in E. coli, of which 23 were supported (43%). Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.
|
18
|
|
19
|
Serrano MÁ, Sagués F. Network-based scoring system for genome-scale metabolic reconstructions. BMC Syst Biol 2011; 5:76. [PMID: 21595941 PMCID: PMC3113238 DOI: 10.1186/1752-0509-5-76] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/03/2011] [Accepted: 05/19/2011] [Indexed: 11/17/2022]
Abstract
Background: Network reconstructions at the cell level are a major development in Systems Biology. However, we are far from fully exploiting their potential. Often, the incremental complexity of the pursued systems overrides experimental capabilities, or increasingly sophisticated protocols are underutilized to merely refine confidence levels of already established interactions. For metabolic networks, the currently employed confidence scoring system rates reactions discretely according to nested categories of experimental evidence or model-based likelihood. Results: Here, we propose a complementary network-based scoring system that exploits the statistical regularities of a metabolic network as a bipartite graph. As an illustration, we apply it to the metabolism of Escherichia coli. The model is adjusted to the observations to derive connection probabilities between individual metabolite-reaction pairs and, after validation, to assess the reliability of each reaction in probabilistic terms. This network-based scoring system uncovers very specific reactions that could be functionally or evolutionarily important, identifies prominent experimental targets, and enables further confirmation of modeling results. Conclusions: We foresee a wide range of potential applications at different sub-cellular or supra-cellular levels of biological interactions given the natural bipartivity of many biological networks.
Affiliation(s)
- M Ángeles Serrano
- Departament de Química Física, Universitat de Barcelona, Martí i Franquès 1, Barcelona, 08028, Spain.
|
20
|
Siegal-Gaskins D, Mejia-Guerra MK, Smith GD, Grotewold E. Emergence of switch-like behavior in a large family of simple biochemical networks. PLoS Comput Biol 2011; 7:e1002039. [PMID: 21589886 PMCID: PMC3093349 DOI: 10.1371/journal.pcbi.1002039] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 11/23/2010] [Accepted: 03/21/2011] [Indexed: 01/13/2023] Open
Abstract
Bistability plays a central role in the gene regulatory networks (GRNs) controlling many essential biological functions, including cellular differentiation and cell cycle control. However, establishing the network topologies that can exhibit bistability remains a challenge, in part due to the exceedingly large variety of GRNs that exist for even a small number of components. We begin to address this problem by employing chemical reaction network theory in a comprehensive in silico survey to determine the capacity for bistability of more than 40,000 simple networks that can be formed by two transcription factor-coding genes and their associated proteins (assuming only the most elementary biochemical processes). We find that there exist reaction rate constants leading to bistability in ∼90% of these GRN models, including several circuits that do not contain any of the TF cooperativity commonly associated with bistable systems, and the majority of which could only be identified as bistable through an original subnetwork-based analysis. A topological sorting of the two-gene family of networks based on the presence or absence of biochemical reactions reveals eleven minimal bistable networks (i.e., bistable networks that do not contain within them a smaller bistable subnetwork). The large number of previously unknown bistable network topologies suggests that the capacity for switch-like behavior in GRNs arises with relative ease and is not easily lost through network evolution. To highlight the relevance of the systematic application of CRNT to bistable network identification in real biological systems, we integrated publicly available protein-protein interaction, protein-DNA interaction, and gene expression data from Saccharomyces cerevisiae, and identified several GRNs predicted to behave in a bistable fashion. 
Switch-like behavior is found across a wide range of biological systems, and as a result there is significant interest in identifying the various ways in which biochemical reactions can be combined to yield a switch-like response. In this work we use a set of mathematical tools from chemical reaction network theory that provide information about the steady-states of a reaction network irrespective of the values of network rate constants, to conduct a large computational study of a family of model networks consisting of only two protein-coding genes. We find that a large majority of these networks (∼90%) have (for some set of parameters) the mathematical property known as bistability and can behave in a switch-like manner. Interestingly, the capacity for switch-like behavior is often maintained as networks increase in size through the introduction of new reactions. We then demonstrate using published yeast data how theoretical parameter-free surveys such as this one can be used to discover possible switch-like circuits in real biological systems. Our results highlight the potential usefulness of parameter-free modeling for the characterization of complex networks and to the study of network evolution, and are suggestive of a role for it in the development of novel synthetic biological switches.
Affiliation(s)
- Dan Siegal-Gaskins
- Mathematical Biosciences Institute, The Ohio State University, Columbus, Ohio, United States of America.
|
21
|
Gomez-Cabrero D, Compte A, Tegner J. Workflow for generating competing hypothesis from models with parameter uncertainty. Interface Focus 2011; 1:438-49. [PMID: 22670212 DOI: 10.1098/rsfs.2011.0015] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 02/14/2011] [Accepted: 03/07/2011] [Indexed: 01/07/2023] Open
Abstract
Mathematical models are increasingly used in life sciences. However, contrary to other disciplines, biological models are typically over-parametrized and loosely constrained by scarce experimental data and prior knowledge. Recent efforts on analysis of complex models have focused on isolated aspects without considering an integrated approach ranging from model building to derivation of predictive experiments and refutation or validation of robust model behaviours. Here, we develop such an integrative workflow, a sequence of actions expanding upon current efforts with the purpose of setting the stage for a methodology facilitating an extraction of core behaviours and competing mechanistic hypotheses residing within underdetermined models. To this end, we make use of optimization search algorithms, statistical (machine-learning) classification techniques and cluster-based analysis of the state variables' dynamics and their corresponding parameter sets. We apply the workflow to a mathematical model of fat accumulation in the arterial wall (atherogenesis), a complex phenomenon with limited quantitative understanding, thus leading to a model plagued with inherent uncertainty. We find that the mathematical atherogenesis model can still be understood in terms of a few key behaviours despite the large number of parameters. This result enabled us to derive distinct mechanistic predictions from the model despite the lack of confidence in the model parameters. We conclude that building integrative workflows enables investigators to embrace modelling of complex biological processes despite uncertainty in parameters.
Affiliation(s)
- David Gomez-Cabrero
- Department of Medicine, Karolinska Institutet, Unit of Computational Medicine, Centre for Molecular Medicine, Solna, Stockholm, Sweden
|
22
|
Chandran D, Bergmann FT, Sauro HM. Computer-aided design of biological circuits using TinkerCell. Bioeng Bugs 2010; 1:274-81. [PMID: 21327060 PMCID: PMC3026467 DOI: 10.4161/bbug.1.4.12506] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 05/02/2010] [Revised: 05/25/2010] [Accepted: 05/28/2010] [Indexed: 11/19/2022] Open
Abstract
Synthetic biology is an engineering discipline that builds on modeling practices from systems biology and wet-lab techniques from genetic engineering. As synthetic biology advances, efficient procedures will be developed that will allow a synthetic biologist to design, analyze, and build biological networks. In this idealized pipeline, computer-aided design (CAD) is a necessary component. The role of a CAD application would be to allow efficient transition from a general design to a final product. TinkerCell is a design tool for serving this purpose in synthetic biology. In TinkerCell, users build biological networks using biological parts and modules. The network can be analyzed using one of several functions provided by TinkerCell or custom programs from third-party sources. Since best practices for modeling and constructing synthetic biology networks have not yet been established, TinkerCell is designed as a flexible and extensible application that can adjust itself to changes in the field.
Affiliation(s)
- Deepak Chandran
- Department of Bioengineering; University of Washington; Seattle, WA, USA.
|
23
|
Heinemann M, Sauer U. Systems biology of microbial metabolism. Curr Opin Microbiol 2010; 13:337-43. [PMID: 20219420 DOI: 10.1016/j.mib.2010.02.005] [Citation(s) in RCA: 89] [Impact Index Per Article: 6.4] [Received: 02/01/2010] [Accepted: 02/13/2010] [Indexed: 12/20/2022]
Abstract
One current challenge in metabolic systems biology is to map out the regulation networks that control metabolism. From progress in this area, we conclude that non-transcriptional mechanisms (e.g. metabolite-protein interactions and protein phosphorylation) are highly relevant in actually controlling metabolic function. Furthermore, recent results highlight more functions of enzymes and metabolites than currently appreciated in genome-scale metabolic reconstructions, thereby adding another level of complexity. Combining experimental analyses and modeling efforts we are also beginning to understand how metabolic behavior emerges. Particularly, we recognize that metabolism is not simply a dull workhorse process but rather takes very active control of itself and other cellular processes, rendering true system-level understanding of metabolism possibly more difficult than for other cellular systems.
Affiliation(s)
- Matthias Heinemann
- ETH Zurich, Institute of Molecular Systems Biology, Wolfgang-Pauli-Str. 16, 8093 Zurich, Switzerland.
|
24
|
A unifying view of 21st century systems biology. FEBS Lett 2010; 583:3891-4. [PMID: 19913537 DOI: 10.1016/j.febslet.2009.11.024] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.0] [Received: 11/06/2009] [Revised: 11/10/2009] [Accepted: 11/10/2009] [Indexed: 11/21/2022]
Abstract
The idea that multi-scale dynamic complex systems formed by interacting macromolecules and metabolites, cells, organs and organisms underlie some of the most fundamental aspects of life was proposed by a few visionaries half a century ago. We are witnessing a powerful resurgence of this idea made possible by the availability of nearly complete genome sequences, ever improving gene annotations and interactome network maps, the development of sophisticated informatic and imaging tools, and importantly, the use of engineering and physics concepts such as control and graph theory. Alongside four other fundamental "great ideas" as suggested by Sir Paul Nurse, namely, the gene, the cell, the role of chemistry in biological processes, and evolution by natural selection, systems-level understanding of "What is Life" may materialize as one of the major ideas of biology.
|