201
|
Li P, Vu QD. Identification of parameter correlations for parameter estimation in dynamic biological models. BMC Systems Biology 2013; 7:91. [PMID: 24053643 PMCID: PMC4015753 DOI: 10.1186/1752-0509-7-91] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 05/14/2013] [Accepted: 09/12/2013] [Indexed: 11/15/2022]
Abstract
Background One of the challenging tasks in systems biology is parameter estimation in nonlinear dynamic models. A biological model usually contains a large number of correlated parameters leading to non-identifiability problems. Although many approaches have been developed to address both structural and practical non-identifiability problems, very few studies have been made to systematically investigate parameter correlations. Results In this study we present an approach that is able to identify both pairwise parameter correlations and higher order interrelationships among parameters in nonlinear dynamic models. Correlations are interpreted as surfaces in the subspaces of correlated parameters. Based on the correlation information obtained in this way both structural and practical non-identifiability can be clarified. Moreover, it can be concluded from the correlation analysis that a minimum number of data sets with different inputs for experimental design are needed to relieve the parameter correlations, which corresponds to the maximum number of correlated parameters among the correlation groups. Conclusions The information of pairwise and higher order interrelationships among parameters in biological models gives a deeper insight into the cause of non-identifiability problems. The result of our correlation analysis provides a necessary condition for experimental design in order to acquire suitable measurement data for unique parameter estimation.
Affiliation(s)
- Pu Li
- Department of Simulation and Optimal Processes, Institute of Automation and Systems Engineering, Ilmenau University of Technology, P.O. Box 100565, 98684 Ilmenau, Germany.
|
202
|
Taylor B, Lee TJ, Weitz JS. A guide to sensitivity analysis of quantitative models of gene expression dynamics. Methods 2013; 62:109-20. [DOI: 10.1016/j.ymeth.2013.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2012] [Accepted: 03/08/2013] [Indexed: 11/30/2022] Open
|
203
|
Oguz C, Laomettachit T, Chen KC, Watson LT, Baumann WT, Tyson JJ. Optimization and model reduction in the high dimensional parameter space of a budding yeast cell cycle model. BMC Systems Biology 2013; 7:53. [PMID: 23809412 PMCID: PMC3702416 DOI: 10.1186/1752-0509-7-53] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 10/07/2012] [Accepted: 06/19/2013] [Indexed: 01/16/2023]
Abstract
Background Parameter estimation from experimental data is critical for mathematical modeling of protein regulatory networks. For realistic networks with dozens of species and reactions, parameter estimation is an especially challenging task. In this study, we present an approach for parameter estimation that is effective in fitting a model of the budding yeast cell cycle (comprising 26 nonlinear ordinary differential equations containing 126 rate constants) to the experimentally observed phenotypes (viable or inviable) of 119 genetic strains carrying mutations of cell cycle genes. Results Starting from an initial guess of the parameter values, which correctly captures the phenotypes of only 72 genetic strains, our parameter estimation algorithm quickly improves the success rate of the model to 105–111 of the 119 strains. This success rate is comparable to the best values achieved by a skilled modeler manually choosing parameters over many weeks. The algorithm combines two search and optimization strategies. First, we use Latin hypercube sampling to explore a region surrounding the initial guess. From these samples, we choose ∼20 different sets of parameter values that correctly capture wild type viability. These sets form the starting generation of differential evolution that selects new parameter values that perform better in terms of their success rate in capturing phenotypes. In addition to producing highly successful combinations of parameter values, we analyze the results to determine the parameters that are most critical for matching experimental outcomes and the most competitive strains whose correct outcome with a given parameter vector forces numerous other strains to have incorrect outcomes. These “most critical parameters” and “most competitive strains” provide biological insights into the model. Conversely, the “least critical parameters” and “least competitive strains” suggest ways to reduce the computational complexity of the optimization. 
Conclusions Our approach proves to be a useful tool to help systems biologists fit complex dynamical models to large experimental datasets. In the process of fitting the model to the data, the tool identifies suggestive correlations among aspects of the model and the data.
Affiliation(s)
- Cihan Oguz
- Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia 24061, USA
|
204
|
López C DC, Barz T, Peñuela M, Villegas A, Ochoa S, Wozny G. Model-based identifiable parameter determination applied to a simultaneous saccharification and fermentation process model for bio-ethanol production. Biotechnol Prog 2013; 29:1064-82. [PMID: 23749438 DOI: 10.1002/btpr.1753] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 11/22/2012] [Revised: 04/02/2013] [Indexed: 01/30/2023]
Abstract
In this work, a methodology for model-based identifiable parameter determination (MBIPD) is presented. This systematic approach is proposed for structure and parameter identification of nonlinear models of biological reaction networks. Usually, problems of this kind are over-parameterized, with large correlations between parameters. Hence, the related inverse problems for parameter determination and analysis are mathematically ill-posed and numerically difficult to solve. The proposed MBIPD methodology comprises several tasks: (i) model selection, (ii) tracking of an adequate initial guess, and (iii) an iterative parameter estimation step which includes an identifiable parameter subset selection (SsS) algorithm and accuracy analysis of the estimated parameters. The SsS algorithm is based on the analysis of the sensitivity matrix by rank revealing factorization methods. Using this, a reduction of the parameter search space to a reasonable subset, which can be reliably and efficiently estimated from available measurements, is achieved. The simultaneous saccharification and fermentation (SSF) process for bio-ethanol production from cellulosic material is used as a case study for testing the methodology. The successful application of MBIPD to the SSF process demonstrates a relatively large reduction in the identified parameter space. It is shown by cross-validation that, using the identified parameters (despite the reduction of the search space), the model is still able to predict the experimental data properly. Moreover, it is shown that the model is easily and efficiently adapted to new process conditions by solving reduced and well-conditioned problems.
Affiliation(s)
- Diana C López C
- Chair of Process Dynamics and Operation, Technische Universität Berlin, Sekr.KWT-9, Str. Des 17. Juni 135, D-10623, Berlin, Germany.
|
205
|
Vanlier J, Tiemann CA, Hilbers PAJ, van Riel NAW. Parameter uncertainty in biochemical models described by ordinary differential equations. Math Biosci 2013; 246:305-14. [PMID: 23535194 DOI: 10.1016/j.mbs.2013.03.006] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Received: 10/03/2012] [Revised: 03/07/2013] [Accepted: 03/12/2013] [Indexed: 12/21/2022]
Abstract
Improved mechanistic understanding of biochemical networks is one of the driving ambitions of Systems Biology. Computational modeling allows the integration of various sources of experimental data in order to put this conceptual understanding to the test in a quantitative manner. The aim of computational modeling is to obtain both predictive and explanatory models for complex phenomena, thereby providing useful approximations of reality with varying levels of detail. As the complexity required to describe different systems increases, so does the need for determining how well such predictions can be made. Despite efforts to make tools for uncertainty analysis available to the field, these methods have not yet found widespread use in Systems Biology. Additionally, the suitability of the different methods strongly depends on the problem and system under investigation. This review provides an introduction to some of the techniques available and gives an overview of the state-of-the-art methods for parameter uncertainty analysis.
Affiliation(s)
- J Vanlier
- Eindhoven University of Technology, Department of Biomedical Engineering, Eindhoven, The Netherlands; Netherlands Consortium for Systems Biology, University of Amsterdam, Amsterdam, 1098 XH, The Netherlands.
|
206
|
Abdullah A, Deris S, Anwar S, Arjunan SNV. An evolutionary firefly algorithm for the estimation of nonlinear biological model parameters. PLoS One 2013; 8:e56310. [PMID: 23469172 PMCID: PMC3587642 DOI: 10.1371/journal.pone.0056310] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 07/30/2012] [Accepted: 01/08/2013] [Indexed: 11/19/2022] Open
Abstract
The development of accurate computational models of biological processes is fundamental to computational systems biology. These models are usually represented by mathematical expressions that rely heavily on the system parameters. The measurement of these parameters is often difficult. Therefore, they are commonly estimated by fitting the predicted model to the experimental data using optimization methods. The complexity and nonlinearity of the biological processes pose a significant challenge, however, to the development of accurate and fast optimization methods. We introduce a new hybrid optimization method incorporating the Firefly Algorithm and the evolutionary operation of the Differential Evolution method. The proposed method improves solutions by neighbourhood search using evolutionary procedures. Testing our method on models for the arginine catabolism and the negative feedback loop of the p53 signalling pathway, we found that it estimated the parameters with high accuracy and within a reasonable computation time compared to well-known approaches, including Particle Swarm Optimization, Nelder-Mead, and Firefly Algorithm. We have also verified the reliability of the parameters estimated by the method using an a posteriori practical identifiability test.
Affiliation(s)
- Afnizanfaizal Abdullah
- Artificial Intelligence and Bioinformatics Group, Faculty of Computing, Universiti Teknologi Malaysia, Johor, Malaysia
- Safaai Deris
- Artificial Intelligence and Bioinformatics Group, Faculty of Computing, Universiti Teknologi Malaysia, Johor, Malaysia
- Sohail Anwar
- Pennsylvania State University, Altoona, Pennsylvania, United States of America
- Satya N. V. Arjunan
- Laboratory for Biochemical Simulation, RIKEN Quantitative Biology Center, Osaka, Japan
|
207
|
Raue A, Kreutz C, Theis FJ, Timmer J. Joining forces of Bayesian and frequentist methodology: a study for inference in the presence of non-identifiability. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2013; 371:20110544. [PMID: 23277602 DOI: 10.1098/rsta.2011.0544] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Indexed: 06/01/2023]
Abstract
Increasingly complex applications involve large datasets in combination with nonlinear and high-dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications, experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations, Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view, insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile-likelihood approach offers to detect and to resolve non-identifiability by experimental design iteratively. Therefore, it allows one to better constrain the posterior probability distribution until Markov chain Monte Carlo sampling can be used securely. Using an application from cell biology, we compare both methods and show that a successive application of the two methods facilitates a realistic assessment of uncertainty in model predictions.
Affiliation(s)
- Andreas Raue
- Institute for Physics, University of Freiburg, Freiburg, Germany
|
208
|
Stroh M, Hutmacher MM, Pang J, Lutz R, Magara H, Stone J. Simultaneous pharmacokinetic model for rolofylline and both M1-trans and M1-cis metabolites. AAPS Journal 2013; 15:498-504. [PMID: 23355301 DOI: 10.1208/s12248-012-9443-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/24/2012] [Accepted: 11/13/2012] [Indexed: 12/17/2022]
Abstract
Rolofylline is a potent, selective adenosine A1 receptor antagonist that was under development for the treatment of patients with acute congestive heart failure and renal impairment. Rolofylline is metabolized primarily to the pharmacologically active M1-trans and M1-cis metabolites (metabolites) by cytochrome P450 (CYP) 3A4. The aim of this investigation was to provide a pharmacokinetic (PK) model for rolofylline and metabolites following intravenous administration to healthy volunteers. Data included for this investigation came from a randomized, double-blind, dose-escalation trial in four groups of healthy volunteers (N=36) where single doses of rolofylline, spanning 1 to 60 mg, were infused over 1-2 h. The rolofylline and metabolite data were analyzed simultaneously using NONMEM. The simultaneous PK model comprised, in part, a two-compartment linear PK model for rolofylline, with estimates of clearance and volume of distribution at steady-state of 24.4 L/h and 239 L, respectively. In addition, the final PK model contained provisions for both conversion of rolofylline to metabolites and stereochemical conversion of M1-trans to M1-cis. Accordingly, the final model captured known aspects of rolofylline metabolism and was capable of simultaneously describing the PK of rolofylline and metabolites in healthy volunteers.
Affiliation(s)
- Mark Stroh
- Merck Sharp & Dohme Corp, Whitehouse Station, NJ, USA.
|
209
|
Chakrabarty A, Buzzard GT, Rundell AE. Model-based design of experiments for cellular processes. Wiley Interdisciplinary Reviews-Systems Biology and Medicine 2013; 5:181-203. [PMID: 23293047 DOI: 10.1002/wsbm.1204] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 01/07/2023]
Affiliation(s)
- Ankush Chakrabarty
- School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
|
210
|
Rateitschak K, Winter F, Lange F, Jaster R, Wolkenhauer O. Parameter identifiability and sensitivity analysis predict targets for enhancement of STAT1 activity in pancreatic cancer and stellate cells. PLoS Comput Biol 2012; 8:e1002815. [PMID: 23284277 PMCID: PMC3527226 DOI: 10.1371/journal.pcbi.1002815] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 04/01/2012] [Accepted: 10/01/2012] [Indexed: 12/13/2022] Open
Abstract
The present work exemplifies how parameter identifiability analysis can be used to gain insights into differences in experimental systems and how uncertainty in parameter estimates can be handled. The case study, presented here, investigates interferon-gamma (IFNγ) induced STAT1 signalling in two cell types that play a key role in pancreatic cancer development: pancreatic stellate and cancer cells. IFNγ inhibits the growth for both types of cells and may be prototypic of agents that simultaneously hit cancer and stroma cells. We combined time-course experiments with mathematical modelling to focus on the common situation in which variations between profiles of experimental time series, from different cell types, are observed. To understand how biochemical reactions are causing the observed variations, we performed a parameter identifiability analysis. We successfully identified reactions that differ in pancreatic stellate cells and cancer cells, by comparing confidence intervals of parameter value estimates and the variability of model trajectories. Our analysis shows that useful information can also be obtained from nonidentifiable parameters. For the prediction of potential therapeutic targets we studied the consequences of uncertainty in the values of identifiable and nonidentifiable parameters. Interestingly, the sensitivity of model variables is robust against parameter variations and against differences between IFNγ induced STAT1 signalling in pancreatic stellate and cancer cells. This provides the basis for a prediction of therapeutic targets that are valid for both cell types.
Affiliation(s)
- Katja Rateitschak
- Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany.
|
211
|
Berthoumieux S, Brilli M, Kahn D, de Jong H, Cinquemani E. On the identifiability of metabolic network models. J Math Biol 2012; 67:1795-832. [PMID: 23229063 DOI: 10.1007/s00285-012-0614-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 05/09/2012] [Revised: 10/08/2012] [Indexed: 01/07/2023]
Abstract
A major problem for the identification of metabolic network models is parameter identifiability, that is, the possibility to unambiguously infer the parameter values from the data. Identifiability problems may be due to the structure of the model, in particular implicit dependencies between the parameters, or to limitations in the quantity and quality of the available data. We address the detection and resolution of identifiability problems for a class of pseudo-linear models of metabolism, so-called linlog models. Linlog models have the advantage that parameter estimation reduces to linear or orthogonal regression, which facilitates the analysis of identifiability. We develop precise definitions of structural and practical identifiability, and clarify the fundamental relations between these concepts. In addition, we use singular value decomposition to detect identifiability problems and reduce the model to an identifiable approximation by a principal component analysis approach. The criterion is adapted to real data, which are frequently scarce, incomplete, and noisy. The test of the criterion on a model with simulated data shows that it is capable of correctly identifying the principal components of the data vector. The application to a state-of-the-art dataset on central carbon metabolism in Escherichia coli yields the surprising result that only 4 out of 31 reactions, and 37 out of 100 parameters, are identifiable. This underlines the practical importance of identifiability analysis and model reduction in the modeling of large-scale metabolic networks. Although our approach has been developed in the context of linlog models, it carries over to other pseudo-linear models, such as generalized mass-action (power-law) models. Moreover, it provides useful hints for the identifiability analysis of more general classes of nonlinear models of metabolism.
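The singular-value test mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' criterion: it assumes a generic linear regression matrix `X` (rows are samples, columns are parameters), and the function name `identifiable_rank` and the tolerance `rel_tol` are hypothetical choices made for illustration.

```python
import numpy as np

def identifiable_rank(X, rel_tol=1e-8):
    """Numerical rank of X and a basis of non-identifiable parameter directions.

    Assumption: X is a (samples x parameters) regression matrix. Singular
    values below rel_tol times the largest one are treated as zero.
    """
    _, s, Vt = np.linalg.svd(X)            # full Vt has shape (p, p)
    rank = int(np.sum(s > rel_tol * s[0]))
    null_dirs = Vt[rank:]                  # rows span the unidentifiable subspace
    return rank, null_dirs

# Toy data: the third column is the sum of the first two, so the three
# parameters cannot all be inferred from data generated through X.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 2))
X = np.column_stack([A[:, 0], A[:, 1], A[:, 0] + A[:, 1]])

rank, null_dirs = identifiable_rank(X)
print("identifiable combinations:", rank)
print("unidentifiable direction(s):", null_dirs)
```

In this toy case the single unidentifiable direction is proportional to (1, 1, -1): moving the parameters along it leaves the model output unchanged, which is the kind of implicit dependency the abstract describes.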
|
212
|
Ensemble kinetic modeling of metabolic networks from dynamic metabolic profiles. Metabolites 2012; 2:891-912. [PMID: 24957767 PMCID: PMC3901226 DOI: 10.3390/metabo2040891] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 09/14/2012] [Revised: 11/02/2012] [Accepted: 11/05/2012] [Indexed: 01/21/2023] Open
Abstract
Kinetic modeling of metabolic pathways has important applications in metabolic engineering, but significant challenges still remain. The difficulties faced vary from finding best-fit parameters in a highly multidimensional search space to incomplete parameter identifiability. To meet some of these challenges, an ensemble modeling method is developed for characterizing a subset of kinetic parameters that give statistically equivalent goodness-of-fit to time series concentration data. The method is based on the incremental identification approach, where the parameter estimation is done in a step-wise manner. Numerical efficacy is achieved by reducing the dimensionality of parameter space and using efficient random parameter exploration algorithms. The shift toward using model ensembles, instead of the traditional "best-fit" models, is necessary to directly account for model uncertainty during the application of such models. The performance of the ensemble modeling approach has been demonstrated in the modeling of a generic branched pathway and the trehalose pathway in Saccharomyces cerevisiae using generalized mass action (GMA) kinetics.
|
213
|
Transtrum MK, Qiu P. Optimal experiment selection for parameter estimation in biological differential equation models. BMC Bioinformatics 2012; 13:181. [PMID: 22838836 PMCID: PMC3536579 DOI: 10.1186/1471-2105-13-181] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 01/31/2012] [Accepted: 07/12/2012] [Indexed: 12/17/2022] Open
Abstract
Background Parameter estimation in biological models is a common yet challenging problem. In this work we explore the problem for gene regulatory networks modeled by differential equations with unknown parameters, such as decay rates, reaction rates, Michaelis-Menten constants, and Hill coefficients. We explore the question of to what extent parameters can be efficiently estimated by appropriate experimental selection. Results A minimization formulation is used to find the parameter values that best fit the experimental data. When the data are insufficient, the minimization problem often has many local minima that fit the data reasonably well. We show that selecting a new experiment based on the local Fisher Information of one local minimum generates additional data that allow one to successfully discriminate among the many local minima. The parameters can be estimated to high accuracy by iteratively performing minimization and experiment selection. We show that the experiment choices are roughly independent of which local minimum is used to calculate the local Fisher Information. Conclusions We show that by an appropriate choice of experiments, one can, in principle, efficiently and accurately estimate all the parameters of a gene regulatory network. In addition, we demonstrate that appropriate experiment selection can also allow one to restrict model predictions without constraining the parameters, using many fewer experiments. We suggest that predicting model behaviors and inferring parameters represent two different approaches to model calibration with different requirements on data and experimental cost.
Affiliation(s)
- Mark K Transtrum
- Department of Bioinformatics and Computational Biology, University of Texas M.D. Anderson Cancer Center, Houston, Texas, USA
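As a rough illustration of selecting an experiment from local Fisher Information, as described in the abstract of this entry, the sketch below scores candidate measurement times by the log-determinant of the updated information matrix (a D-optimality criterion). Everything here is an assumption made for illustration rather than the authors' procedure: the two-parameter decay model y(t) = A·exp(-k·t), the Gaussian noise level `sigma`, the function names, and the candidate times.

```python
import numpy as np

def sensitivities(times, A, k):
    """Rows are [dy/dA, dy/dk] for the assumed model y(t) = A * exp(-k * t)."""
    t = np.asarray(times, dtype=float)
    e = np.exp(-k * t)
    return np.column_stack([e, -A * t * e])

def fisher_information(times, A, k, sigma=1.0):
    """Local Fisher Information for independent Gaussian noise of std sigma."""
    S = sensitivities(times, A, k)
    return S.T @ S / sigma**2

def pick_next_time(existing, candidates, A, k, sigma=1.0):
    """Return the candidate time that maximizes log det of the updated FIM."""
    base = fisher_information(existing, A, k, sigma)
    scores = []
    for t in candidates:
        s = sensitivities([t], A, k)
        scores.append(np.linalg.slogdet(base + s.T @ s / sigma**2)[1])
    return candidates[int(np.argmax(scores))], scores

# Current parameter estimate (a local minimum) and measurements already taken.
A_hat, k_hat = 2.0, 1.0
existing_times = [0.0, 0.1]
candidate_times = [0.05, 1.0, 10.0]

best_time, scores = pick_next_time(existing_times, candidate_times, A_hat, k_hat)
print("next measurement time:", best_time)
```

With these numbers the intermediate time is preferred: a measurement very close to the existing ones adds little new information, and a very late one sees a signal that has already decayed.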
|
214
|
A Simple Model to Control Growth Rate of Synthetic E. coli during the Exponential Phase: Model Analysis and Parameter Estimation. Computational Methods in Systems Biology 2012. [DOI: 10.1007/978-3-642-33636-2_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/20/2022]
|