1
Paul RD, Jadebeck JF, Stratmann A, Wiechert W, Nöh K. hopsy - a methods marketplace for convex polytope sampling in Python. Bioinformatics 2024; 40:btae430. [PMID: 38950177; PMCID: PMC11245314; DOI: 10.1093/bioinformatics/btae430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 05/10/2024] [Accepted: 06/28/2024] [Indexed: 07/03/2024] [Open access]
Abstract
Summary: Effective collaboration between developers of Bayesian inference methods and users is key to advance our quantitative understanding of biosystems. We here present hopsy, a versatile open-source platform designed to provide convenient access to powerful Markov chain Monte Carlo sampling algorithms tailored to models defined on convex polytopes (CP). Based on the high-performance C++ sampling library HOPS, hopsy inherits its strengths and extends its functionalities with the accessibility of the Python programming language. A versatile plugin-mechanism enables seamless integration with domain-specific models, providing method developers with a framework for testing, benchmarking, and distributing CP samplers to approach real-world inference tasks. We showcase hopsy by solving common and newly composed domain-specific sampling problems, highlighting important design choices. By likening hopsy to a marketplace, we emphasize its role in bringing together users and developers, where users get access to state-of-the-art methods, and developers contribute their own innovative solutions for challenging domain-specific inference problems.
Availability and implementation: Sources, documentation and a continuously updated list of sampling algorithms are available at https://jugit.fz-juelich.de/IBG-1/ModSim/hopsy, with Linux, Windows and MacOS binaries at https://pypi.org/project/hopsy/.
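As a rough illustration of the kind of problem the abstract refers to, the following is a minimal pure-Python sketch of coordinate hit-and-run sampling on a convex polytope {x : Ax ≤ b}. It is not hopsy's API or implementation. The function name `coordinate_hit_and_run` and its parameters are hypothetical, and the sketch assumes a bounded polytope and a strictly interior starting point.

```python
import random


def coordinate_hit_and_run(A, b, x0, n_samples, thinning=1, seed=0):
    """Draw approximately uniform samples from {x : A x <= b}.

    Assumes the polytope is bounded and x0 lies strictly inside it.
    All names here are illustrative, not part of any library.
    """
    rng = random.Random(seed)
    x = list(x0)
    dim = len(x)
    samples = []
    for step in range(n_samples * thinning):
        i = rng.randrange(dim)  # pick a random coordinate direction
        # Feasible interval [lo, hi] for a move t along coordinate i.
        lo, hi = float("-inf"), float("inf")
        for row, bound in zip(A, b):
            slack = bound - sum(a * xj for a, xj in zip(row, x))
            if row[i] > 0:
                hi = min(hi, slack / row[i])
            elif row[i] < 0:
                lo = max(lo, slack / row[i])
        x[i] += rng.uniform(lo, hi)  # uniform point on the feasible chord
        if (step + 1) % thinning == 0:
            samples.append(list(x))  # store only every `thinning`-th state
    return samples


# Usage: the 2-D simplex x >= 0, y >= 0, x + y <= 1.
A = [[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]]
b = [0.0, 0.0, 1.0]
samples = coordinate_hit_and_run(A, b, [0.25, 0.25], n_samples=200, thinning=5, seed=1)
```

Every stored state satisfies the constraints by construction, because each move stays on the feasible chord through the current point.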
Affiliation(s)
- Richard D Paul
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Institute of Advanced Simulations, IAS-8: Data Analytics and Machine Learning, Forschungszentrum Jülich, 52428 Jülich, Germany
- Johann F Jadebeck
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52074 Aachen, Germany
- Anton Stratmann
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52074 Aachen, Germany
- Wolfgang Wiechert
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52428 Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52074 Aachen, Germany
- Katharina Nöh
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52428 Jülich, Germany
2
Theorell A, Jadebeck JF, Wiechert W, McFadden J, Nöh K. Rethinking 13C-metabolic flux analysis - The Bayesian way of flux inference. Metab Eng 2024; 83:137-149. [PMID: 38582144; DOI: 10.1016/j.ymben.2024.03.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 03/22/2024] [Accepted: 03/23/2024] [Indexed: 04/08/2024]
Abstract
Metabolic reaction rates (fluxes) play a crucial role in comprehending cellular phenotypes and are essential in areas such as metabolic engineering, biotechnology, and biomedical research. The state-of-the-art technique for estimating fluxes is metabolic flux analysis using isotopic labelling (13C-MFA), which uses a dataset-model combination to determine the fluxes. Bayesian statistical methods are gaining popularity in the field of life sciences, but the use of 13C-MFA is still dominated by conventional best-fit approaches. The slow take-up of Bayesian approaches is, at least partly, due to the unfamiliarity of Bayesian methods to metabolic engineering researchers. To address this unfamiliarity, we here outline similarities and differences between the two approaches and highlight particular advantages of the Bayesian way of flux analysis. With a real-life example, re-analysing a moderately informative labelling dataset of E. coli, we identify situations in which Bayesian methods are advantageous and more informative, pointing to potential pitfalls of current 13C-MFA evaluation approaches. We propose the use of Bayesian model averaging (BMA) for flux inference as a means of overcoming the problem of model uncertainty through its tendency to assign low probabilities to both, models that are unsupported by data, and models that are overly complex. In this capacity, BMA resembles a tempered Ockham's razor. With the tempered razor as a guide, BMA-based 13C-MFA alleviates the problem of model selection uncertainty and is thereby capable of becoming a game changer for metabolic engineering by uncovering new insights and inspiring novel approaches.
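The weighting idea behind Bayesian model averaging can be sketched in a few lines. The example below is a generic textbook-style illustration, not the authors' 13C-MFA workflow: it assumes that a log marginal likelihood (log evidence) is already available for each candidate model, and the function names `bma_weights` and `bma_mean` are hypothetical.

```python
import math


def bma_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from per-model log evidences.

    With no prior given, all models are assumed equally likely a priori.
    """
    n = len(log_evidences)
    if log_priors is None:
        log_priors = [-math.log(n)] * n
    log_post = [le + lp for le, lp in zip(log_evidences, log_priors)]
    shift = max(log_post)  # subtract the maximum for numerical stability
    unnorm = [math.exp(v - shift) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def bma_mean(weights, model_means):
    """Model-averaged posterior mean of a quantity, e.g. a single flux."""
    return sum(w * mu for w, mu in zip(weights, model_means))


# Usage with made-up log evidences for three candidate models.
weights = bma_weights([-100.0, -105.0, -120.0])
averaged_flux = bma_mean(weights, [1.2, 0.9, 2.0])
```

Because the marginal likelihood integrates over each model's parameters, poorly fitting models and needlessly flexible models both tend to receive small weights, which matches the "tempered Ockham's razor" behaviour the abstract describes.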
Affiliation(s)
- Axel Theorell
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Johann F Jadebeck
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52062 Aachen, Germany
- Wolfgang Wiechert
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52062 Aachen, Germany
- Johnjoe McFadden
  - Department of Microbial and Cellular Sciences, University of Surrey, GU2 7XH Guildford, United Kingdom
- Katharina Nöh
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
3
Jadebeck JF, Wiechert W, Nöh K. Practical sampling of constraint-based models: Optimized thinning boosts CHRR performance. PLoS Comput Biol 2023; 19:e1011378. [PMID: 37566638; PMCID: PMC10446239; DOI: 10.1371/journal.pcbi.1011378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2023] [Revised: 08/23/2023] [Accepted: 07/21/2023] [Indexed: 08/13/2023] [Open access]
Abstract
Thinning is a sub-sampling technique to reduce the memory footprint of Markov chain Monte Carlo. Despite being commonly used, thinning is rarely considered efficient. For sampling constraint-based models, a highly relevant use-case in systems biology, we here demonstrate that thinning boosts computational and, thereby, sampling efficiencies of the widely used Coordinate Hit-and-Run with Rounding (CHRR) algorithm. By benchmarking CHRR with thinning with simplices and genome-scale metabolic networks of up to thousands of dimensions, we find a substantial increase in computational efficiency compared to unthinned CHRR, in our examples by orders of magnitude, as measured by the effective sample size per time (ESS/t), with performance gains growing with polytope (effective network) dimension. Using a set of benchmark models we derive a ready-to-apply guideline for tuning thinning to efficient and effective use of compute resources without requiring additional coding effort. Our guideline is validated using three (out-of-sample) large-scale networks and we show that it allows sampling convex polytopes uniformly to convergence in a fraction of time, thereby unlocking the rigorous investigation of hitherto intractable models. The derivation of our guideline is explained in detail, allowing future researchers to update it as needed as new model classes and more training data becomes available. CHRR with deliberate utilization of thinning thereby paves the way to keep pace with progressing model sizes derived with the constraint-based reconstruction and analysis (COBRA) tool set. Sampling and evaluation pipelines are available at https://jugit.fz-juelich.de/IBG-1/ModSim/fluxomics/chrrt.
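To make the two quantities in the abstract concrete, here is a small pure-Python sketch of thinning and of an effective sample size (ESS) estimate for a one-dimensional chain. It does not reproduce the paper's thinning guideline or its ESS/t benchmark. The function names are hypothetical, and the ESS estimator is a crude one that truncates the autocorrelation sum at the first non-positive lag.

```python
import random


def thin(chain, k):
    """Keep every k-th state of the chain (k = 1 keeps everything)."""
    return chain[k - 1::k]


def effective_sample_size(chain):
    """Crude ESS estimate: n / (1 + 2 * sum of positive autocorrelations)."""
    n = len(chain)
    mean = sum(chain) / n
    centered = [v - mean for v in chain]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = sum(centered[t] * centered[t + lag] for t in range(n - lag)) / (n * var)
        if rho <= 0.0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Usage: a strongly autocorrelated AR(1) chain as a stand-in for MCMC output.
rng = random.Random(0)
chain = [0.0]
for _ in range(1999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

thinned = thin(chain, 10)  # stores 200 of the 2000 states
ess_full = effective_sample_size(chain)
ess_thinned = effective_sample_size(thinned)
```

Comparing `ess_thinned` with `ess_full` shows how much information survives when only a tenth of the states is stored. Dividing each ESS by the wall-clock time of the corresponding run would give an ESS/t figure of the kind the abstract refers to.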
Affiliation(s)
- Johann F. Jadebeck
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, Aachen, Germany
- Wolfgang Wiechert
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, Aachen, Germany
- Katharina Nöh
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, Jülich, Germany
4
Borah Slater K, Beyß M, Xu Y, Barber J, Costa C, Newcombe J, Theorell A, Bailey MJ, Beste DJV, McFadden J, Nöh K. One-shot 13C15N-metabolic flux analysis for simultaneous quantification of carbon and nitrogen flux. Mol Syst Biol 2023; 19:e11099. [PMID: 36705093; PMCID: PMC9996240; DOI: 10.15252/msb.202211099] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/28/2022] [Revised: 01/06/2023] [Accepted: 01/13/2023] [Indexed: 01/28/2023] [Open access]
Abstract
Metabolic flux is the final output of cellular regulation and has been extensively studied for carbon but much less is known about nitrogen, which is another important building block for living organisms. For the tuberculosis pathogen, this is particularly important in informing the development of effective drugs targeting the pathogen's metabolism. Here we performed 13C15N dual isotopic labeling of Mycobacterium bovis BCG steady state cultures, quantified intracellular carbon and nitrogen fluxes and inferred reaction bidirectionalities. This was achieved by model scope extension and refinement, implemented in a multi-atom transition model, within the statistical framework of Bayesian model averaging (BMA). Using BMA-based 13C15N-metabolic flux analysis, we jointly resolve carbon and nitrogen fluxes quantitatively. We provide the first nitrogen flux distributions for amino acid and nucleotide biosynthesis in mycobacteria and establish glutamate as the central node for nitrogen metabolism. We improved resolution of the notoriously elusive anaplerotic node in central carbon metabolism and revealed possible operation modes. Our study provides a powerful and statistically rigorous platform to simultaneously infer carbon and nitrogen metabolism in any biological system.
Affiliation(s)
- Martin Beyß
  - Forschungszentrum Jülich GmbH, Institute of Bio- and Geosciences, IBG-1: Biotechnology, Jülich, Germany
  - Computational Systems Biotechnology, RWTH Aachen University, Aachen, Germany
- Ye Xu
  - Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
- Jim Barber
  - Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
- Catia Costa
  - Faculty of Engineering and Physical Sciences, University of Surrey, Guildford, UK
- Jane Newcombe
  - Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
- Axel Theorell
  - Forschungszentrum Jülich GmbH, Institute of Bio- and Geosciences, IBG-1: Biotechnology, Jülich, Germany
  - Present address: Computational Systems Biology, ETH Zürich, Basel, Switzerland
- Melanie J Bailey
  - Faculty of Engineering and Physical Sciences, University of Surrey, Guildford, UK
- Dany J V Beste
  - Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
- Johnjoe McFadden
  - Faculty of Health and Medical Sciences, University of Surrey, Guildford, UK
- Katharina Nöh
  - Forschungszentrum Jülich GmbH, Institute of Bio- and Geosciences, IBG-1: Biotechnology, Jülich, Germany
5
Feierabend M, Renz A, Zelle E, Nöh K, Wiechert W, Dräger A. High-Quality Genome-Scale Reconstruction of Corynebacterium glutamicum ATCC 13032. Front Microbiol 2021; 12:750206. [PMID: 34867870; PMCID: PMC8634658; DOI: 10.3389/fmicb.2021.750206] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/30/2021] [Accepted: 10/19/2021] [Indexed: 11/30/2022] [Open access]
Abstract
Corynebacterium glutamicum belongs to the microbes of enormous biotechnological relevance. In particular, its strain ATCC 13032 is a widely used producer of L-amino acids at an industrial scale. Its apparent robustness also turns it into a favorable platform host for a wide range of further compounds, mainly because of emerging bio-based economies. A deep understanding of the biochemical processes in C. glutamicum is essential for a sustainable enhancement of the microbe's productivity. Computational systems biology has the potential to provide a valuable basis for driving metabolic engineering and biotechnological advances, such as increased yields of healthy producer strains based on genome-scale metabolic models (GEMs). Advanced reconstruction pipelines are now available that facilitate the reconstruction of GEMs and support their manual curation. This article presents iCGB21FR, an updated and unified GEM of C. glutamicum ATCC 13032 with high quality regarding comprehensiveness and data standards, built with the latest modeling techniques and advanced reconstruction pipelines. It comprises 1042 metabolites, 1539 reactions, and 805 genes with detailed annotations and database cross-references. The model validation took place using different media and resulted in realistic growth rate predictions under aerobic and anaerobic conditions. The new GEM produces all canonical amino acids, and its phenotypic predictions are consistent with laboratory data. The in silico model proved fruitful in adding knowledge to the metabolism of C. glutamicum: iCGB21FR still produces L-glutamate with the knock-out of the enzyme pyruvate carboxylase, despite the common belief to be relevant for the amino acid's production. We conclude that integrating high standards into the reconstruction of GEMs facilitates replicating validated knowledge, closing knowledge gaps, and making it a useful basis for metabolic engineering. 
The model is freely available from BioModels Database under identifier MODEL2102050001.
Affiliation(s)
- Martina Feierabend
  - Computational Systems Biology of Infections and Antimicrobial-Resistant Pathogens, Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
  - Department of Computer Science, University of Tübingen, Tübingen, Germany
- Alina Renz
  - Computational Systems Biology of Infections and Antimicrobial-Resistant Pathogens, Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
  - Department of Computer Science, University of Tübingen, Tübingen, Germany
- Elisabeth Zelle
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich, Germany
- Katharina Nöh
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich, Germany
- Wolfgang Wiechert
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, Aachen, Germany
- Andreas Dräger
  - Computational Systems Biology of Infections and Antimicrobial-Resistant Pathogens, Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
  - Department of Computer Science, University of Tübingen, Tübingen, Germany
6
Theorell A, Jadebeck JF, Nöh K, Stelling J. PolyRound: polytope rounding for random sampling in metabolic networks. Bioinformatics 2021; 38:566-567. [PMID: 34329395; PMCID: PMC8723145; DOI: 10.1093/bioinformatics/btab552] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/01/2020] [Revised: 05/25/2021] [Accepted: 07/29/2021] [Indexed: 02/03/2023] [Open access]
Abstract
Summary: Random flux sampling is a powerful tool for the constraint-based analysis of metabolic networks. The most efficient sampling method relies on a rounding transform of the constraint polytope, but no available rounding implementation can round all relevant models. By removing redundant polytope constraints on the go, PolyRound simplifies the numerical problem and rounds all the 108 models in the BiGG database without parameter tuning, compared to ∼50% for the state-of-the-art implementation.
Availability and implementation: The implementation is available on gitlab: https://gitlab.com/csb.ethz/PolyRound.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Johann F Jadebeck
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52425 Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, 52062 Aachen, Germany
- Katharina Nöh
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52425 Jülich, Germany
7
Beyß M, Parra-Peña VD, Ramirez-Malule H, Nöh K. Robustifying Experimental Tracer Design for 13C-Metabolic Flux Analysis. Front Bioeng Biotechnol 2021; 9:685323. [PMID: 34239861; PMCID: PMC8258161; DOI: 10.3389/fbioe.2021.685323] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/24/2021] [Accepted: 05/19/2021] [Indexed: 11/25/2022] [Open access]
Abstract
13C metabolic flux analysis (MFA) has become an indispensable tool to measure metabolic reaction rates (fluxes) in living organisms, having an increasingly diverse range of applications. Here, the choice of the 13C labeled tracer composition makes the difference between an information-rich experiment and an experiment with only limited insights. To improve the chances for an informative labeling experiment, optimal experimental design approaches have been devised for 13C-MFA, all relying on some a priori knowledge about the actual fluxes. If such prior knowledge is unavailable, e.g., for research organisms and producer strains, existing methods are left with a chicken-and-egg problem. In this work, we present a general computational method, termed robustified experimental design (R-ED), to guide the decision making about suitable tracer choices when prior knowledge about the fluxes is lacking. Instead of focusing on one mixture, optimal for specific flux values, we pursue a sampling based approach and introduce a new design criterion, which characterizes the extent to which mixtures are informative in view of all possible flux values. The R-ED workflow enables the exploration of suitable tracer mixtures and provides full flexibility to trade off information and cost metrics. The potential of the R-ED workflow is showcased by applying the approach to the industrially relevant antibiotic producer Streptomyces clavuligerus, where we suggest informative, yet economic labeling strategies.
Affiliation(s)
- Martin Beyß
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich, Germany
  - Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University, Aachen, Germany
- Katharina Nöh
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, Jülich, Germany