1
Dorešić D, Grein S, Hasenauer J. Efficient parameter estimation for ODE models of cellular processes using semi-quantitative data. Bioinformatics 2024; 40:i558-i566. [PMID: 38940161 PMCID: PMC11211815 DOI: 10.1093/bioinformatics/btae210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
Motivation: Quantitative dynamical models facilitate the understanding of biological processes and the prediction of their dynamics. The parameters of these models are commonly estimated from experimental data. Yet, experimental data generated from different techniques do not provide direct information about the state of the system but a nonlinear (monotonic) transformation of it. For such semi-quantitative data, when this transformation is unknown, it is not apparent how the model simulations and the experimental data can be compared.

Results: We propose a versatile spline-based approach for the integration of a broad spectrum of semi-quantitative data into parameter estimation. We derive analytical formulas for the gradients of the hierarchical objective function and show that this substantially increases the estimation efficiency. Subsequently, we demonstrate that the method allows for the reliable discovery of unknown measurement transformations. Furthermore, we show that this approach can significantly improve the parameter inference based on semi-quantitative data in comparison to available methods.

Availability and implementation: Modelers can easily apply our method by using our implementation in the open-source Python Parameter EStimation TOolbox (pyPESTO) available at https://github.com/ICB-DCM/pyPESTO.
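To illustrate the comparison problem the abstract describes, here is a minimal sketch in which the unknown monotone measurement transformation is estimated by isotonic regression (pool adjacent violators). This is a deliberately simpler stand-in for the spline-based method of the paper, not the pyPESTO implementation, and the names `isotonic_fit` and `semi_quantitative_objective` are hypothetical.

```python
def isotonic_fit(values):
    """Least-squares non-decreasing fit to `values` (pool adjacent violators)."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def semi_quantitative_objective(simulated, measured):
    """Squared misfit after fitting the best monotone map simulation -> data.

    Inner problem: find the monotone transformation (here isotonic, an
    assumption made for brevity). Outer problem: the returned value would be
    minimized over the model parameters that produce `simulated`.
    """
    order = sorted(range(len(simulated)), key=lambda i: simulated[i])
    z_sorted = [measured[i] for i in order]
    fitted = isotonic_fit(z_sorted)
    return sum((z - f) ** 2 for z, f in zip(z_sorted, fitted))


# Data that are a saturating (monotone) readout of the true state.
measured = [0.50, 0.67, 0.75, 0.80]
good_simulation = [1.0, 2.0, 3.0, 4.0]   # same ordering as the data
bad_simulation = [4.0, 3.0, 2.0, 1.0]    # reversed ordering

good_cost = semi_quantitative_objective(good_simulation, measured)
bad_cost = semi_quantitative_objective(bad_simulation, measured)
```

A simulation whose ordering matches the data incurs no cost, whatever the shape of the monotone transformation, while a simulation with the wrong ordering is penalized.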
Affiliation(s)
- Domagoj Dorešić
  - Life and Medical Sciences (LIMES) Institute, University of Bonn, 53113 Bonn, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, 85764 Neuherberg, Germany
- Stephan Grein
  - Life and Medical Sciences (LIMES) Institute, University of Bonn, 53113 Bonn, Germany
- Jan Hasenauer
  - Life and Medical Sciences (LIMES) Institute, University of Bonn, 53113 Bonn, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, 85764 Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, 85748 Garching, Germany
2
van Sluijs B, Zhou T, Helwig B, Baltussen MG, Nelissen FHT, Heus HA, Huck WTS. Iterative design of training data to control intricate enzymatic reaction networks. Nat Commun 2024; 15:1602. [PMID: 38383500 PMCID: PMC10881569 DOI: 10.1038/s41467-024-45886-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 02/06/2024] [Indexed: 02/23/2024]
Abstract
Kinetic modeling of in vitro enzymatic reaction networks is vital to understand and control the complex behaviors emerging from the nonlinear interactions inside. However, modeling is severely hampered by the lack of training data. Here, we introduce a methodology that combines an active learning-like approach and flow chemistry to efficiently create optimized datasets for a highly interconnected enzymatic reactions network with multiple sub-pathways. The optimal experimental design (OED) algorithm designs a sequence of out-of-equilibrium perturbations to maximize the information about the reaction kinetics, yielding a descriptive model that allows control of the output of the network towards any cost function. We experimentally validate the model by forcing the network to produce different product ratios while maintaining a minimum level of overall conversion efficiency. Our workflow scales with the complexity of the system and enables the optimization of previously unobtainable network outputs.
Affiliation(s)
- Bob van Sluijs
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
- Tao Zhou
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
- Britta Helwig
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
- Mathieu G Baltussen
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
- Frank H T Nelissen
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
- Hans A Heus
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
- Wilhelm T S Huck
  - Institute for Molecules and Materials, Radboud University, Nijmegen, AJ, The Netherlands
3
Villaverde AF, Pathirana D, Fröhlich F, Hasenauer J, Banga JR. A protocol for dynamic model calibration. Brief Bioinform 2022; 23:bbab387. [PMID: 34619769 PMCID: PMC8769694 DOI: 10.1093/bib/bbab387] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/27/2021] [Revised: 08/06/2021] [Accepted: 08/29/2021] [Indexed: 12/23/2022]
Abstract
Ordinary differential equation models are nowadays widely used for the mechanistic description of biological processes and their temporal evolution. These models typically have many unknown and nonmeasurable parameters, which have to be determined by fitting the model to experimental data. In order to perform this task, known as parameter estimation or model calibration, the modeller faces challenges such as poor parameter identifiability, lack of sufficiently informative experimental data and the existence of local minima in the objective function landscape. These issues tend to worsen with larger model sizes, increasing the computational complexity and the number of unknown parameters. An incorrectly calibrated model is problematic because it may result in inaccurate predictions and misleading conclusions. For nonexpert users, there are a large number of potential pitfalls. Here, we provide a protocol that guides the user through all the steps involved in the calibration of dynamic models. We illustrate the methodology with two models and provide all the code required to reproduce the results and perform the same analysis on new models. Our protocol provides practitioners and researchers in biological modelling with a one-stop guide that is at the same time compact and sufficiently comprehensive to cover all aspects of the problem.
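The local-minima problem mentioned in the abstract can be illustrated with a small sketch of multi-start local optimization, one common remedy. The one-dimensional objective, the fixed step size, and the function names below are assumptions chosen for illustration; none of this is taken from the code accompanying the protocol.

```python
def objective(x):
    """Toy objective with a local minimum near x = 1 and a global one near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytical derivative of `objective`."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def local_minimize(x0, step=0.01, n_iter=2000):
    """Plain gradient descent from a single start point."""
    x = x0
    for _ in range(n_iter):
        x -= step * gradient(x)
    return x


def multistart(starts):
    """Run one local search per start point and keep the best end point."""
    candidates = [local_minimize(x0) for x0 in starts]
    return min(candidates, key=objective)


single_x = local_minimize(2.0)                   # gets stuck in the local minimum
best_x = multistart([-2.0, -1.0, 0.0, 1.0, 2.0])  # finds the global minimum
```

A single local search started at `2.0` ends in the shallower minimum on the positive side, while the multi-start run also explores the basin of the deeper minimum and returns it.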
Affiliation(s)
- Alejandro F Villaverde
  - Universidade de Vigo, Department of Systems Engineering & Control, Vigo 36310, Galicia, Spain
- Dilan Pathirana
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53115, Germany
- Fabian Fröhlich
  - Institute of Computational Biology, Helmholtz Zentrum München, Neuherberg 85764, Germany
- Jan Hasenauer
  - Center for Mathematics, Technische Universität München, Garching 85748, Germany
  - Harvard Medical School, Cambridge, MA 02115, USA
- Julio R Banga
  - Bioprocess Engineering Group, IIM-CSIC, Vigo 36208, Galicia, Spain
4
Schmiester L, Weindl D, Hasenauer J. Efficient gradient-based parameter estimation for dynamic models using qualitative data. Bioinformatics (Oxford, England) 2021. [PMID: 34260697 DOI: 10.1101/2021.02.06.430039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/15/2023]
Abstract
Motivation: Unknown parameters of dynamical models are commonly estimated from experimental data. However, while various efficient optimization and uncertainty analysis methods have been proposed for quantitative data, methods for qualitative data are rare and suffer from bad scaling and convergence.

Results: Here, we propose an efficient and reliable framework for estimating the parameters of ordinary differential equation models from qualitative data. In this framework, we derive a semi-analytical algorithm for gradient calculation of the optimal scaling method developed for qualitative data. This enables the use of efficient gradient-based optimization algorithms. We demonstrate that the use of gradient information improves performance of optimization and uncertainty quantification on several application examples. On average, we achieve a speedup of more than one order of magnitude compared to gradient-free optimization. In addition, in some examples, the gradient-based approach yields substantially improved objective function values and quality of the fits. Accordingly, the proposed framework substantially improves the parameterization of models from qualitative data.

Availability and implementation: The proposed approach is implemented in the open-source Python Parameter EStimation TOolbox (pyPESTO). pyPESTO is available at https://github.com/ICB-DCM/pyPESTO. All application examples and code to reproduce this study are available at https://doi.org/10.5281/zenodo.4507613.

Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leonard Schmiester
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg 85764, Germany
  - Center for Mathematics, Technische Universität München, Garching 85748, Germany
- Daniel Weindl
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg 85764, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg 85764, Germany
  - Center for Mathematics, Technische Universität München, Garching 85748, Germany
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn 53113, Germany
5
Schmiester L, Weindl D, Hasenauer J. Efficient gradient-based parameter estimation for dynamic models using qualitative data. Bioinformatics 2021; 37:4493-4500. [PMID: 34260697 PMCID: PMC8652033 DOI: 10.1093/bioinformatics/btab512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/06/2021] [Revised: 07/02/2021] [Accepted: 07/08/2021] [Indexed: 11/22/2022]
Affiliation(s)
- Leonard Schmiester
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, 85764, Germany
  - Center for Mathematics, Technische Universität München, Garching, 85748, Germany
- Daniel Weindl
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, 85764, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, 85764, Germany
  - Center for Mathematics, Technische Universität München, Garching, 85748, Germany
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, 53113, Germany
6
Schmiester L, Schälte Y, Bergmann FT, Camba T, Dudkin E, Egert J, Fröhlich F, Fuhrmann L, Hauber AL, Kemmer S, Lakrisenko P, Loos C, Merkt S, Müller W, Pathirana D, Raimúndez E, Refisch L, Rosenblatt M, Stapor PL, Städter P, Wang D, Wieland FG, Banga JR, Timmer J, Villaverde AF, Sahle S, Kreutz C, Hasenauer J, Weindl D. PEtab-Interoperable specification of parameter estimation problems in systems biology. PLoS Comput Biol 2021; 17:e1008646. [PMID: 33497393 PMCID: PMC7864467 DOI: 10.1371/journal.pcbi.1008646] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 08/07/2020] [Revised: 02/05/2021] [Accepted: 12/18/2020] [Indexed: 01/24/2023]
Abstract
Reproducibility and reusability of the results of data-based modeling studies are essential. Yet there has been, so far, no broadly supported format for the specification of parameter estimation problems in systems biology. Here, we introduce PEtab, a format which facilitates the specification of parameter estimation problems using Systems Biology Markup Language (SBML) models and a set of tab-separated value files describing the observation model and experimental data as well as parameters to be estimated. We already implemented PEtab support into eight well-established model simulation and parameter estimation toolboxes with hundreds of users in total. We provide a Python library for validation and modification of a PEtab problem and currently 20 example parameter estimation problems based on recent studies.
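As a rough illustration of what "tab-separated value files describing experimental data" can look like in practice, the sketch below parses a small measurement table with the Python standard library only. The column names follow my understanding of the PEtab measurement table and should be checked against the format specification; the observable and condition IDs, the values, and the helper names `read_tsv` and `group_by_observable` are made up for this example, and the PEtab Python library itself is not used.

```python
import csv
import io

# Hypothetical measurement table; IDs and values are invented for illustration.
MEASUREMENTS_TSV = (
    "observableId\tsimulationConditionId\ttime\tmeasurement\n"
    "obs_a\tcond_1\t0.0\t0.10\n"
    "obs_a\tcond_1\t5.0\t0.42\n"
    "obs_b\tcond_1\t5.0\t1.30\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def group_by_observable(rows):
    """Collect (time, measurement) pairs for each observable ID."""
    grouped = {}
    for row in rows:
        point = (float(row["time"]), float(row["measurement"]))
        grouped.setdefault(row["observableId"], []).append(point)
    return grouped


measurements = group_by_observable(read_tsv(MEASUREMENTS_TSV))
```

Keeping the data in plain tab-separated tables, separate from the SBML model, is what lets different toolboxes read the same problem definition.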
Affiliation(s)
- Leonard Schmiester
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Yannik Schälte
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Tacio Camba
  - Department of Applied Mathematics II, University of Vigo, Vigo, Galicia, Spain
  - BioProcess Engineering Group, IIM-CSIC, Vigo, Galicia, Spain
- Erika Dudkin
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Janine Egert
  - Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
- Fabian Fröhlich
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
- Lara Fuhrmann
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Adrian L. Hauber
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
- Svenja Kemmer
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
- Polina Lakrisenko
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Carolin Loos
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
  - Ragon Institute of MGH, MIT and Harvard, Cambridge, Massachusetts, USA
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Simon Merkt
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Wolfgang Müller
  - Heidelberg Institute for Theoretical Studies (HITS gGmbH), Heidelberg, Germany
- Dilan Pathirana
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Elba Raimúndez
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Lukas Refisch
  - Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
- Marcus Rosenblatt
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
- Paul L. Stapor
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Philipp Städter
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Dantong Wang
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
- Franz-Georg Wieland
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
- Julio R. Banga
  - BioProcess Engineering Group, IIM-CSIC, Vigo, Galicia, Spain
- Jens Timmer
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
  - Institute of Physics, University of Freiburg, Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
- Sven Sahle
  - BioQUANT/COS, Heidelberg University, Heidelberg, Germany
- Clemens Kreutz
  - Faculty of Medicine and Medical Center, Institute of Medical Biometry and Statistics, University of Freiburg, Freiburg, Germany
  - Freiburg Center for Data Analysis and Modeling (FDM), University of Freiburg, Freiburg, Germany
  - Signalling Research Centres BIOSS and CIBSS, University of Freiburg, Freiburg, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany
  - Center for Mathematics, Technische Universität München, Garching, Germany
  - Faculty of Mathematics and Natural Sciences, University of Bonn, Bonn, Germany
- Daniel Weindl
  - Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany