1
Stapor P, Weindl D, Ballnus B, Hug S, Loos C, Fiedler A, Krause S, Hroß S, Fröhlich F, Hasenauer J. PESTO: Parameter EStimation TOolbox. Bioinformatics 2019; 34:705-707. [PMID: 29069312 PMCID: PMC5860618 DOI: 10.1093/bioinformatics/btx676] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 03/24/2017] [Accepted: 10/20/2017] [Indexed: 11/15/2022]
Abstract
Summary: PESTO is a widely applicable and highly customizable toolbox for parameter estimation in MathWorks MATLAB. It offers scalable algorithms for optimization, uncertainty analysis and identifiability analysis, all of which work generically by treating the objective function as a black box. Hence, PESTO can be used for any parameter estimation problem for which the user can provide a deterministic objective function in MATLAB.
Availability and implementation: PESTO is a MATLAB toolbox, freely available under the BSD license. The source code, along with extensive documentation and example code, can be downloaded from https://github.com/ICB-DCM/PESTO/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paul Stapor
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Daniel Weindl
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany
- Benjamin Ballnus
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Sabine Hug
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Carolin Loos
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Anna Fiedler
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Sabrina Krause
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Sabrina Hroß
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Fabian Fröhlich
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764 Neuherberg, Germany; Center for Mathematics, Technische Universität München, 85748 Garching, Germany
2
Abstract
Motivation: Mathematical models have become standard tools for the investigation of cellular processes and the unraveling of signal processing mechanisms. The parameters of these models are usually derived from the available data using optimization and sampling methods. However, the efficiency of these methods is limited by the properties of the mathematical model, e.g. non-identifiabilities, and the resulting posterior distribution. In particular, multi-modal distributions with long valleys or pronounced tails are difficult to optimize and sample. Thus, the development and improvement of optimization and sampling methods is the subject of ongoing research.
Results: We propose a region-based adaptive parallel tempering algorithm which adapts to the features of the problem-specific posterior distribution, i.e. its modes and valleys. The algorithm combines several established algorithms to overcome their individual shortcomings and to improve sampling efficiency. We assessed its properties for established benchmark problems and two ordinary differential equation models of biochemical reaction networks. The proposed algorithm outperformed state-of-the-art methods in terms of calculation efficiency and mixing. Since the algorithm does not rely on a specific problem structure, but adapts to the posterior distribution, it is suitable for a variety of model classes.
Availability and implementation: The code, written in MATLAB, is available both as Supplementary Material and in a Git repository.
Supplementary information: Supplementary data are available at Bioinformatics online.
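Illustration (not from the article): the tempering idea that this abstract builds on can be sketched in a few lines. The Python sketch below implements plain parallel tempering with random-walk Metropolis updates. It is not the region-based adaptive algorithm described above and has no adaptation at all. The bimodal target `log_post`, the temperature ladder `betas`, the step size and all function names are assumptions chosen for illustration. The whole log-posterior is tempered here, which is one of several common conventions.

```python
import math
import random


def log_post(x):
    """Unnormalized log-density of a hypothetical bimodal 1-D target (modes near -3 and +3)."""
    a = -0.5 * ((x + 3.0) / 0.5) ** 2
    b = -0.5 * ((x - 3.0) / 0.5) ** 2
    m = max(a, b)
    # log-sum-exp of the two mode contributions, for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def parallel_tempering(n_iter=20000, betas=(1.0, 0.5, 0.25, 0.1, 0.02),
                       step=1.0, seed=0):
    """Return the samples of the untempered chain (beta = 1, the first entry of betas)."""
    rng = random.Random(seed)
    x = [-3.0] * len(betas)            # every chain starts in the left mode
    lp = [log_post(v) for v in x]
    samples = []
    for _ in range(n_iter):
        # 1) Random-walk Metropolis update of each chain on its tempered target p(x)^beta.
        for k, beta in enumerate(betas):
            prop = x[k] + rng.gauss(0.0, step / math.sqrt(beta))
            lp_prop = log_post(prop)
            # 1 - random() lies in (0, 1], so the logarithm is always defined
            if math.log(1.0 - rng.random()) < beta * (lp_prop - lp[k]):
                x[k], lp[k] = prop, lp_prop
        # 2) Propose swapping the states of one randomly chosen adjacent pair of chains.
        k = rng.randrange(len(betas) - 1)
        log_alpha = (betas[k] - betas[k + 1]) * (lp[k + 1] - lp[k])
        if math.log(1.0 - rng.random()) < log_alpha:
            x[k], x[k + 1] = x[k + 1], x[k]
            lp[k], lp[k + 1] = lp[k + 1], lp[k]
        samples.append(x[0])
    return samples


if __name__ == "__main__":
    draws = parallel_tempering()
    right = sum(1 for s in draws if s > 0) / len(draws)
    print(f"fraction of samples in the right-hand mode: {right:.2f}")
```

The flattened high-temperature chains cross the barrier between the modes easily, and the swap moves pass those crossings down to the untempered chain, which would otherwise tend to stay in the mode where it started.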
Affiliation(s)
- Benjamin Ballnus
  - Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Garching, Germany
- Steffen Schaper
  - Bayer AG, Engineering and Technologies, Applied Mathematics, Leverkusen, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Garching, Germany
- Jan Hasenauer
  - Institute of Computational Biology, Helmholtz Zentrum München–German Research Center for Environmental Health, Neuherberg, Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Garching, Germany
3
Ballnus B, Hug S, Hatz K, Görlitz L, Hasenauer J, Theis FJ. Comprehensive benchmarking of Markov chain Monte Carlo methods for dynamical systems. BMC Syst Biol 2017; 11:63. [PMID: 28646868 PMCID: PMC5482939 DOI: 10.1186/s12918-017-0433-1] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 02/03/2017] [Accepted: 05/10/2017] [Indexed: 11/12/2022]
Abstract
BACKGROUND: In quantitative biology, mathematical models are used to describe and analyze biological processes. The parameters of these models are usually unknown and need to be estimated from experimental data using statistical methods. In particular, Markov chain Monte Carlo (MCMC) methods have become increasingly popular as they allow for a rigorous analysis of parameter and prediction uncertainties without the need for assuming parameter identifiability or removing non-identifiable parameters. A broad spectrum of MCMC algorithms has been proposed, including single- and multi-chain approaches. However, selecting and tuning sampling algorithms suited for a given problem remains challenging, and a comprehensive comparison of different methods is not yet available.
RESULTS: We present the results of a thorough benchmarking of state-of-the-art single- and multi-chain sampling methods, including Adaptive Metropolis, Delayed Rejection Adaptive Metropolis, Metropolis adjusted Langevin algorithm, Parallel Tempering and Parallel Hierarchical Sampling. Different initialization and adaptation schemes are considered. To ensure a comprehensive and fair comparison, we consider problems with a range of features such as bifurcations, periodic orbits, multistability of steady-state solutions and chaotic regimes. These problem properties give rise to various posterior distributions including uni- and multi-modal distributions and non-normally distributed mode tails. For an objective comparison, we developed a pipeline for the semi-automatic comparison of sampling results.
CONCLUSION: The comparison of MCMC algorithms, initialization and adaptation schemes revealed that overall multi-chain algorithms perform better than single-chain algorithms. In some cases this performance can be further increased by using a preceding multi-start local optimization scheme. These results can inform the selection of sampling methods and the benchmark collection can serve for the evaluation of new algorithms. Furthermore, our results confirm the need to address exploration quality of MCMC chains before applying the commonly used quality measure of effective sample size to prevent false analysis conclusions.
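Illustration (not from the article): the caveat about effective sample size (ESS) can be made concrete with a small Python sketch. The estimator below divides the chain length by an integrated autocorrelation time, summing autocorrelations up to the first non-positive lag. This truncation rule, the function names and the synthetic chains are assumptions for illustration, and this is not the comparison pipeline developed in the article.

```python
import random


def effective_sample_size(chain):
    """Estimate ESS as n / tau, with tau summed up to the first non-positive autocorrelation."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [v - mean for v in chain]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        raise ValueError("chain has zero variance; ESS is undefined")
    tau = 1.0
    for lag in range(1, n):
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * var)
        if acf <= 0.0:
            break                      # simple truncation rule (an assumption)
        tau += 2.0 * acf
    return n / tau


def ar1_chain(n, rho, rng):
    """Synthetic autocorrelated chain: x[t] = rho * x[t-1] + Gaussian noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    rng = random.Random(1)
    n = 2000
    slow = ar1_chain(n, 0.9, rng)                      # strongly autocorrelated chain
    # Hypothetical chain that mixes well but only ever visits one of two modes at -3 and +3.
    stuck = [rng.gauss(-3.0, 0.5) for _ in range(n)]
    print("ESS, autocorrelated chain:", round(effective_sample_size(slow)))
    print("ESS, chain stuck in one mode:", round(effective_sample_size(stuck)))
```

The chain that never leaves one mode has little autocorrelation and so receives a high ESS, although it has missed half of the hypothetical target. ESS alone would rank it above the slower chain, which is why exploration should be checked first.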
Affiliation(s)
- Benjamin Ballnus
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstraße 1, Neuherberg, 85764 Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Boltzmannstraße 15, Garching, 85748 Germany
- Sabine Hug
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstraße 1, Neuherberg, 85764 Germany
- Kathrin Hatz
  - Bayer AG, Engineering & Technologies, Applied Mathematics, Kaiser-Wilhelm-Allee, Leverkusen, 51368 Germany
- Linus Görlitz
  - Bayer AG, Engineering & Technologies, Applied Mathematics, Kaiser-Wilhelm-Allee, Leverkusen, 51368 Germany
- Jan Hasenauer
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstraße 1, Neuherberg, 85764 Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Boltzmannstraße 15, Garching, 85748 Germany
- Fabian J. Theis
  - Helmholtz Zentrum München - German Research Center for Environmental Health, Institute of Computational Biology, Ingolstädter Landstraße 1, Neuherberg, 85764 Germany
  - Technische Universität München, Center for Mathematics, Chair of Mathematical Modeling of Biological Systems, Boltzmannstraße 15, Garching, 85748 Germany