1
Guil F, Hidalgo JF, García JM. On the representativeness and stability of a set of EFMs. Bioinformatics 2023; 39:btad356. [PMID: 37252834 PMCID: PMC10264373 DOI: 10.1093/bioinformatics/btad356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Revised: 04/18/2023] [Accepted: 05/30/2023] [Indexed: 06/01/2023]
Abstract
MOTIVATION Elementary flux modes are a well-known tool for analyzing metabolic networks. The whole set of elementary flux modes (EFMs) cannot be computed in most genome-scale networks due to their large cardinality. Therefore, different methods have been proposed to compute a smaller subset of EFMs that can be used for studying the structure of the network. These latter methods pose the problem of studying the representativeness of the calculated subset. In this article, we present a methodology to tackle this problem. RESULTS We have introduced the concept of stability for a particular network parameter and its relation to the representativeness of the EFM extraction method studied. We have also defined several metrics to study and compare the EFM biases. We have applied these techniques to compare the relative behavior of previously proposed methods in two case studies. Furthermore, we have presented a new method for the EFM computation (PiEFM), which is more stable (less biased) than previous ones, has suitable representativeness measures, and exhibits better variability in the extracted EFMs. AVAILABILITY AND IMPLEMENTATION Software and additional material are freely available at https://github.com/biogacop/PiEFM.
Affiliation(s)
- Francisco Guil
- Grupo de Arquitectura y Computación Paralela, Departamento de Ingeniería y Tecnología de Computadores, Facultad de Informática, Universidad de Murcia, Campus de Espinardo, Murcia 30100, Spain
- José F Hidalgo
- Grupo de Arquitectura y Computación Paralela, Departamento de Ingeniería y Tecnología de Computadores, Facultad de Informática, Universidad de Murcia, Campus de Espinardo, Murcia 30100, Spain
- José M García
- Grupo de Arquitectura y Computación Paralela, Departamento de Ingeniería y Tecnología de Computadores, Facultad de Informática, Universidad de Murcia, Campus de Espinardo, Murcia 30100, Spain
2
Sedaghat N, Stephen T, Chindelevitch L. Speeding Up the Structural Analysis of Metabolic Network Models Using the Fredman-Khachiyan Algorithm B. J Comput Biol 2023; 30:678-694. [PMID: 37327036 DOI: 10.1089/cmb.2022.0319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
The problem of computing the Elementary Flux Modes (EFMs) and Minimal Cut Sets (MCSs) of a metabolic network is fundamental to metabolic network analysis. A key insight is that they can be understood as a dual pair of monotone Boolean functions (MBFs). Using this insight, the computation reduces to the question of generating a dual pair of MBFs from an oracle. If one of the two sets (functions) is known, then the other can be computed through a process known as dualization. Fredman and Khachiyan provided two algorithms, which they called simply A and B, that can serve as an engine for oracle-based generation or dualization of MBFs. We look at efficiencies available in implementing their algorithm B, which we will refer to as FK-B. Like their algorithm A, FK-B certifies whether two given MBFs in Conjunctive Normal Form and Disjunctive Normal Form are dual or not, and if they are not dual it returns a conflicting assignment (CA), that is, an assignment that makes one of the given Boolean functions True and the other one False. The FK-B algorithm is a recursive algorithm that searches through the tree of assignments to find a CA. If it does not find any CA, the given Boolean functions are dual. In this article, we propose six techniques applicable to FK-B and hence to the dualization process. Although these techniques do not reduce the time complexity, they considerably reduce the running time in practice. We evaluate the proposed improvements by applying them to compute the MCSs from the EFMs in the 19 small- and medium-sized models from the BioModels database, along with 4 models of biomass synthesis in Escherichia coli that were used in an earlier computational survey (Haus et al., 2008).
Affiliation(s)
- Nafiseh Sedaghat
- School of Computing Science, Simon Fraser University, Burnaby, Canada
- Tamon Stephen
- Department of Mathematics, Simon Fraser University, Burnaby, Canada
- Leonid Chindelevitch
- MRC Center for Global Infectious Disease Analysis, School of Public Health, Imperial College, London, United Kingdom
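The notion of a conflicting assignment in the abstract of entry 2 can be illustrated with a small sketch. It is not FK-B: instead of the recursive tree search described there, it tries every assignment, so it is exponential in the number of variables and only suitable for toy inputs. It assumes the duality test amounts to checking that the monotone CNF and the monotone DNF take the same value on every assignment, which is how the abstract characterizes a conflicting assignment. All function names and the toy formulas are hypothetical.

```python
from itertools import product


def eval_dnf(terms, true_vars):
    """A monotone DNF is True when all variables of at least one term are True."""
    return any(term <= true_vars for term in terms)


def eval_cnf(clauses, true_vars):
    """A monotone CNF is True when every clause contains at least one True variable."""
    return all(clause & true_vars for clause in clauses)


def find_conflicting_assignment(cnf, dnf, variables):
    """Exhaustive search (NOT FK-B): return the set of True variables of an
    assignment on which cnf and dnf disagree, or None if they always agree."""
    variables = sorted(variables)
    for bits in product([False, True], repeat=len(variables)):
        true_vars = frozenset(v for v, b in zip(variables, bits) if b)
        if eval_cnf(cnf, true_vars) != eval_dnf(dnf, true_vars):
            return true_vars
    return None


# Toy example (assumed for illustration): (a OR b) AND (a OR c) == a OR (b AND c)
cnf = [frozenset("ab"), frozenset("ac")]
dnf_complete = [frozenset("a"), frozenset("bc")]
dnf_incomplete = [frozenset("a")]  # the term (b AND c) is missing

print(find_conflicting_assignment(cnf, dnf_complete, "abc"))    # None
print(find_conflicting_assignment(cnf, dnf_incomplete, "abc"))  # the assignment with b and c True
```

In an oracle-based generation loop, a conflicting assignment such as the one found for `dnf_incomplete` is what points to a term or clause that is still missing from one of the two forms.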
3
Buchner BA, Zanghellini J. EFMlrs: a Python package for elementary flux mode enumeration via lexicographic reverse search. BMC Bioinformatics 2021; 22:547. [PMID: 34758748 PMCID: PMC8579665 DOI: 10.1186/s12859-021-04417-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/23/2021] [Accepted: 09/27/2021] [Indexed: 12/02/2022]
Abstract
Background Elementary flux mode (EFM) analysis is a well-established, yet computationally challenging approach to characterize metabolic networks. Standard algorithms require huge amounts of memory and lack scalability, which limits their application to single servers and consequently limits a comprehensive analysis to medium-scale networks. Recently, Avis et al. developed mplrs, a parallel version of the lexicographic reverse search (lrs) algorithm, which, in principle, enables an EFM analysis on high-performance computing environments (Avis and Jordan. mplrs: a scalable parallel vertex/facet enumeration code. arXiv:1511.06487, 2017). Here we test its applicability for EFM enumeration. Results We developed EFMlrs, a Python package that gives users access to the enumeration capabilities of mplrs. EFMlrs uses COBRApy to process metabolic models from SBML files, performs loss-free compressions of the stoichiometric matrix, and generates suitable inputs for mplrs as well as efmtool, providing support not only for our proposed new method for EFM enumeration but also for already established tools. By leveraging COBRApy, EFMlrs also allows the application of additional reaction boundaries and seamlessly integrates into existing workflows. Conclusion We show that due to mplrs’s properties, the algorithm is perfectly suited for high-performance computing (HPC) and thus offers new possibilities for the unbiased analysis of substantially larger metabolic models via EFM analyses. EFMlrs is an open-source program that comes together with a designated workflow and can be easily installed via pip. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04417-9.
Affiliation(s)
- Bianca A Buchner
- Department of Biotechnology, University of Natural Resources and Life Sciences, Vienna, Austria; Austrian Centre of Industrial Biotechnology, Vienna, Austria
- Jürgen Zanghellini
- Department of Analytical Chemistry, University of Vienna, Vienna, Austria
4
Answer Set Programming for Computing Constraints-Based Elementary Flux Modes: Application to Escherichia coli Core Metabolism. Processes (Basel) 2020. [DOI: 10.3390/pr8121649] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/05/2023]
Abstract
Elementary Flux Modes (EFMs) provide a rigorous basis to systematically characterize the steady state, cellular phenotypes, as well as metabolic network robustness and fragility. However, the number of EFMs typically grows exponentially with the size of the metabolic network, leading to excessive computational demands, and unfortunately, a large fraction of these EFMs are not biologically feasible due to system constraints. This combinatorial explosion often prevents the complete analysis of genome-scale metabolic models. Traditionally, EFMs are computed by the double description method, an efficient algorithm based on matrix calculation; however, only a few constraints can be integrated into this computation. They must be monotonic with regard to the set inclusion of the supports; otherwise, they must be treated in post-processing and thus do not save computational time. We present aspefm, a hybrid computational tool based on Answer Set Programming (ASP) and Linear Programming (LP) that permits the computation of EFMs while implementing many different types of constraints. We apply our methodology to the Escherichia coli core model, which contains 226×10^6 EFMs. In considering transcriptional and environmental regulation, thermodynamic constraints, and resource usage considerations, the solution space is reduced to 1118 EFMs that can be computed directly with aspefm. The solution set, for E. coli growth on O2 gradients spanning fully aerobic to anaerobic, can be further reduced to four optimal EFMs using post-processing and Pareto front analysis.
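The abstract of entry 4 relies on the support of a flux mode, that is, the set of reactions carrying nonzero flux. As a minimal sketch that is not part of aspefm, the code below applies the standard rank test for elementarity: a steady-state flux vector is elementary when the stoichiometric matrix restricted to its support has a one-dimensional nullspace. The toy network, the function names, and the use of exact fractions are assumptions made for illustration, and sign constraints for irreversible reactions are assumed to hold and are not checked.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a small matrix by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def support(v):
    """Indices of the reactions carrying nonzero flux."""
    return [j for j, x in enumerate(v) if x != 0]


def is_steady_state(S, v):
    """Check S v = 0, i.e. every metabolite is balanced."""
    return all(sum(a * x for a, x in zip(row, v)) == 0 for row in S)


def is_elementary(S, v):
    """Rank test: nullity of S restricted to supp(v) must equal 1."""
    supp = support(v)
    if not supp or not is_steady_state(S, v):
        return False
    sub = [[row[j] for j in supp] for row in S]
    return rank(sub) == len(supp) - 1


# Hypothetical toy network. Columns: R0 (-> A), R1 (A -> B), R2 (B ->), R3 (A ->)
S = [
    [1, -1, 0, -1],  # metabolite A
    [0, 1, -1, 0],   # metabolite B
]
print(is_elementary(S, [1, 1, 1, 0]))  # True: the route through B
print(is_elementary(S, [1, 0, 0, 1]))  # True: the direct drain of A
print(is_elementary(S, [2, 1, 1, 1]))  # False: sum of the two modes above
```

The third vector is a valid steady state, but its support strictly contains the supports of the other two, which is why it fails the test.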
5
Klamt S, Mahadevan R, von Kamp A. Speeding up the core algorithm for the dual calculation of minimal cut sets in large metabolic networks. BMC Bioinformatics 2020; 21:510. [PMID: 33167871 PMCID: PMC7654042 DOI: 10.1186/s12859-020-03837-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/18/2020] [Accepted: 10/23/2020] [Indexed: 12/16/2022]
Abstract
Background The concept of minimal cut sets (MCS) has become an important mathematical framework for analyzing and (re)designing metabolic networks. However, the calculation of MCS in genome-scale metabolic models is a complex computational problem. The development of duality-based algorithms in the last years allowed the enumeration of thousands of MCS in genome-scale networks by solving mixed-integer linear problems (MILP). A recent advancement in this field was the introduction of the MCS2 approach. In contrast to the Farkas-lemma-based dual system used in earlier studies, the MCS2 approach employs a more condensed representation of the dual system based on the nullspace of the stoichiometric matrix, which, due to its reduced dimension, holds promise to further enhance MCS computations. Results In this work, we introduce several new variants and modifications of duality-based MCS algorithms and benchmark their effects on the overall performance. As one major result, we generalize the original MCS2 approach (which was limited to blocking the operation of certain target reactions) to the most general case of MCS computations with arbitrary target and desired regions. Building upon these developments, we introduce a new MILP variant which allows maximal flexibility in the formulation of MCS problems and fully leverages the reduced size of the nullspace-based dual system. With a comprehensive set of benchmarks, we show that the MILP with the nullspace-based dual system outperforms the MILP with the Farkas-lemma-based dual system speeding up MCS computation with an averaged factor of approximately 2.5. We furthermore present several simplifications in the formulation of constraints, mainly related to binary variables, which further enhance the performance of MCS-related MILP. 
However, the benchmarks also reveal that some highly condensed formulations of constraints, especially on reversible reactions, may lead to worse behavior when compared to variants with a larger number of (more explicit) constraints and involved variables. Conclusions Our results further enhance the algorithmic toolbox for MCS calculations and are of general importance for theoretical developments as well as for practical applications of the MCS framework.
Affiliation(s)
- Steffen Klamt
- Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, 39106, Magdeburg, Germany
- Radhakrishnan Mahadevan
- Department of Chemical Engineering and Applied Chemistry, University of Toronto, 200 College Street, Toronto, ON, M5S 3E5, Canada
- Axel von Kamp
- Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, 39106, Magdeburg, Germany
6
Miraskarshahi R, Zabeti H, Stephen T, Chindelevitch L. MCS2: minimal coordinated supports for fast enumeration of minimal cut sets in metabolic networks. Bioinformatics 2020; 35:i615-i623. [PMID: 31510702 PMCID: PMC6612898 DOI: 10.1093/bioinformatics/btz393] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/15/2022]
Abstract
Motivation Constraint-based modeling of metabolic networks helps researchers gain insight into the metabolic processes of many organisms, both prokaryotic and eukaryotic. Minimal cut sets (MCSs) are minimal sets of reactions whose inhibition blocks a target reaction in a metabolic network. Most approaches for finding the MCSs in constraint-based models require, either as an intermediate step or as a byproduct of the calculation, the computation of the set of elementary flux modes (EFMs), a convex basis for the valid flux vectors in the network. Recently, Ballerstein et al. proposed a method for computing the MCSs of a network without first computing its EFMs, by creating a dual network whose EFMs are a superset of the MCSs of the original network. However, their dual network is always larger than the original network and depends on the target reaction. Here we propose the construction of a different dual network, which is typically smaller than the original network and is independent of the target reaction, for the same purpose. We prove the correctness of our approach, minimal coordinated support (MCS2), and describe how it can be modified to compute the few smallest MCSs for a given target reaction. Results We compare MCS2 to the method of Ballerstein et al. and two other existing methods. We show that MCS2 succeeds in calculating the full set of MCSs in many models where other approaches cannot finish within a reasonable amount of time. Thus, in addition to its theoretical novelty, our approach provides a practical advantage over existing methods. Availability and implementation MCS2 is freely available at https://github.com/RezaMash/MCS under the GNU 3.0 license. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Reza Miraskarshahi
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Hooman Zabeti
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Tamon Stephen
- Department of Mathematics, Simon Fraser University, Burnaby, BC, Canada
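The definition of an MCS in the abstract of entry 6 can be made concrete with a small sketch of the classical EFM-based route that the abstract contrasts MCS2 with: when the full set of EFMs is known, the MCSs for a target reaction are the minimal sets of reactions that intersect every EFM containing the target. The code below is a brute-force enumeration for toy inputs only, not the dual-network construction of MCS2 or of Ballerstein et al. The function name, reaction names, and example EFMs are hypothetical.

```python
from itertools import combinations


def minimal_cut_sets(efm_supports, target):
    """Brute-force minimal hitting sets of the EFMs that use `target`.

    efm_supports: iterable of sets of reaction names (one set per EFM).
    Returns a list of frozensets, smallest cut sets first.
    """
    target_efms = [frozenset(s) for s in efm_supports if target in s]
    if not target_efms:
        return []  # the target carries no flux in any EFM: nothing to cut
    reactions = sorted(set().union(*target_efms))
    found = []
    for size in range(1, len(reactions) + 1):
        for candidate in combinations(reactions, size):
            candidate = frozenset(candidate)
            # Skip supersets of an already found cut set: they are not minimal.
            if any(mcs <= candidate for mcs in found):
                continue
            # A cut set must disable (intersect) every EFM using the target.
            if all(candidate & efm for efm in target_efms):
                found.append(candidate)
    return found


# Toy EFM supports (assumed for illustration); "T" is the target reaction.
efms = [
    {"R1", "R2", "T"},
    {"R1", "R3", "T"},
    {"R4", "R5"},  # does not involve the target, so it is ignored
]
for mcs in minimal_cut_sets(efms, "T"):
    print(sorted(mcs))
```

Candidates are enumerated by increasing size, so any set that passes the superset check and hits all target EFMs is guaranteed to be minimal. The cost grows exponentially with the number of reactions, which is the scalability problem the duality-based methods in entries 5 and 6 address.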
7
Nelson WC, Graham EB, Crump AR, Fansler SJ, Arntzen EV, Kennedy DW, Stegen JC. Distinct temporal diversity profiles for nitrogen cycling genes in a hyporheic microbiome. PLoS One 2020; 15:e0228165. [PMID: 31986180 PMCID: PMC6984685 DOI: 10.1371/journal.pone.0228165] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/21/2019] [Accepted: 01/08/2020] [Indexed: 11/29/2022]
Abstract
Biodiversity is thought to prevent decline in community function in response to changing environmental conditions through replacement of organisms with similar functional capacity but different optimal growth characteristics. We examined how this concept translates to the within-gene level by exploring seasonal dynamics of within-gene diversity for genes involved in nitrogen cycling in hyporheic zone communities. Nitrification genes displayed low richness—defined as the number of unique within-gene phylotypes—across seasons. Conversely, denitrification genes varied in both richness and the degree to which phylotypes were recruited or lost. These results demonstrate that there is not a universal mechanism for maintaining community functional potential for nitrogen cycling activities, even across seasonal environmental shifts to which communities would be expected to be well adapted. As such, extreme environmental changes could have very different effects on the stability of the different nitrogen cycle activities. These outcomes suggest a need to modify existing conceptual models that link biodiversity to microbiome function to incorporate within-gene diversity. Specifically, we suggest an expanded conceptualization that 1) recognizes component steps (genes) with low diversity as potential bottlenecks influencing pathway-level function, and 2) includes variation in both the number of entities (e.g. species, phylotypes) that can contribute to a given process and the turnover of those entities in response to shifting conditions. Building these concepts into process-based ecosystem models represents an exciting opportunity to connect within-gene-scale ecological dynamics to ecosystem-scale services.
Affiliation(s)
- William C. Nelson
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Emily B. Graham
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Alex R. Crump
- Department of Soil and Water Systems, University of Idaho, Moscow, Idaho, United States of America
- Sarah J. Fansler
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Evan V. Arntzen
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- David W. Kennedy
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- James C. Stegen
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
8
Systems biology based metabolic engineering for non-natural chemicals. Biotechnol Adv 2019; 37:107379. [DOI: 10.1016/j.biotechadv.2019.04.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 12/04/2018] [Revised: 02/23/2019] [Accepted: 04/01/2019] [Indexed: 12/17/2022]
9