1
Moyer DC, Reimertz J, Segrè D, Fuxman Bass JI. Semi-Automatic Detection of Errors in Genome-Scale Metabolic Models. bioRxiv: the preprint server for biology 2024:2024.06.24.600481. [PMID: 38979177 PMCID: PMC11230171 DOI: 10.1101/2024.06.24.600481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background: Genome-Scale Metabolic Models (GSMMs) are used for numerous tasks requiring computational estimates of metabolic fluxes, from predicting novel drug targets to engineering microbes to produce valuable compounds. A key limiting step in most applications of GSMMs is ensuring their representation of the target organism's metabolism is complete and accurate. Identifying and visualizing errors in GSMMs is complicated by the fact that they contain thousands of densely interconnected reactions. Furthermore, many errors in GSMMs only become apparent when considering pathways of connected reactions collectively, as opposed to examining reactions individually.
Results: We present Metabolic Accuracy Check and Analysis Workflow (MACAW), a collection of algorithms for detecting errors in GSMMs. The relative frequencies of errors we detect in manually curated GSMMs appear to reflect the different approaches used to curate them. Changing the method used to automatically create a GSMM from a particular organism's genome can have a larger impact on the kinds of errors in the resulting GSMM than using the same method with a different organism's genome. Our algorithms are particularly capable of identifying errors that are only apparent at the pathway level, including loops, and nontrivial cases of dead ends.
Conclusions: MACAW is capable of identifying inaccuracies of varying severity in a wide range of GSMMs. Correcting these errors can measurably improve the predictive capacity of a GSMM. The relative prevalence of each type of error we identify in a large collection of GSMMs could help shape future efforts for further automation of error correction and GSMM creation.
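To make the notion of a dead end concrete, the following is a minimal sketch of the simplest (trivial) case only: a metabolite that the network can produce but never consume, or consume but never produce. It is not MACAW's algorithm, which the abstract says also handles nontrivial, pathway-level cases. The toy network, the reaction names, and the function `find_dead_end_metabolites` are all hypothetical and chosen for illustration.

```python
def find_dead_end_metabolites(reactions):
    """Return metabolites that can only be produced or only be consumed.

    `reactions` maps a reaction name to (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = consumed,
    positive = produced). This data layout is an assumption of the sketch.
    """
    produced, consumed = set(), set()
    for stoichiometry, reversible in reactions.values():
        for metabolite, coefficient in stoichiometry.items():
            # A reversible reaction can both produce and consume its participants.
            if coefficient > 0 or reversible:
                produced.add(metabolite)
            if coefficient < 0 or reversible:
                consumed.add(metabolite)
    all_metabolites = produced | consumed
    return sorted(m for m in all_metabolites
                  if not (m in produced and m in consumed))


# Hypothetical toy network: A is imported, D is exported, C has no outlet.
toy_reactions = {
    "EX_A": ({"A": 1}, False),            # uptake of A
    "R1":   ({"A": -1, "B": 1}, False),   # A -> B
    "R2":   ({"B": -1, "C": 1}, False),   # B -> C
    "R3":   ({"B": -1, "D": 1}, False),   # B -> D
    "EX_D": ({"D": -1}, False),           # export of D
}

dead_ends = find_dead_end_metabolites(toy_reactions)
print(dead_ends)
```

In this toy network only `C` is flagged, because nothing consumes it. Detecting dead ends that arise only when several connected reactions are considered together would need flux-based analysis rather than this purely structural check.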
2
Jalili M, Scharm M, Wolkenhauer O, Salehzadeh-Yazdi A. Metabolic function-based normalization improves transcriptome data-driven reduction of genome-scale metabolic models. NPJ Syst Biol Appl 2023; 9:15. [PMID: 37210409 DOI: 10.1038/s41540-023-00281-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/28/2022] [Accepted: 05/10/2023] [Indexed: 05/22/2023]
Abstract
Genome-scale metabolic models (GEMs) are extensively used to simulate cell metabolism and predict cell phenotypes. GEMs can also be tailored to generate context-specific GEMs, using omics data integration approaches. To date, many integration approaches have been developed, each with specific pros and cons, and none of these algorithms systematically outperforms the others. The key to successful implementation of such integration algorithms lies in the optimal selection of parameters, and thresholding is a crucial component in this process. To improve the predictive accuracy of context-specific models, we introduce a new integration framework that improves the ranking of related genes and homogenizes the expression values of those gene sets using single-sample Gene Set Enrichment Analysis (ssGSEA). In this study, we coupled ssGSEA with GIMME and validated the advantages of the proposed framework in predicting the ethanol formation of yeast grown in glucose-limited chemostats and in simulating the metabolic behavior of yeast grown on four different carbon sources. This framework enhances the predictive accuracy of GIMME, which we demonstrate by predicting yeast physiology in nutrient-limited cultures.
Affiliation(s)
- Mahdi Jalili
  - Hematology, Oncology and SCT Research Center, Tehran University of Medical Sciences, Tehran, Iran
- Olaf Wolkenhauer
  - Department of Systems Biology and Bioinformatics, University of Rostock, Rostock, Germany
  - Stellenbosch University, Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre, Stellenbosch, South Africa
  - Leibniz Institute for Food Systems Biology at the Technical University Munich, Freising, Germany
3
Briones-Baez MF, Aguilera-Vazquez L, Rangel-Valdez N, Martinez-Salazar AL, Zuñiga C. Multi-Objective Optimization of Microalgae Metabolism: An Evolutive Algorithm Based on FBA. Metabolites 2022; 12:metabo12070603. [PMID: 35888727 PMCID: PMC9325016 DOI: 10.3390/metabo12070603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Revised: 06/17/2022] [Accepted: 06/22/2022] [Indexed: 12/07/2022]
Abstract
Studies enabled by metabolic models of different species of microalgae have become significant since they allow us to understand changes in their metabolism and physiological stages. The most widely used method to study cell metabolism is flux balance analysis (FBA), which commonly focuses on optimizing a single objective function. However, recent studies have brought attention to the simultaneous optimization of multiple objectives. Such strategies have found application in optimizing biomass and several other bioproducts of interest; they usually use approaches such as multi-level models or enumeration schemes. This work proposes an alternative in silico multiobjective model based on an evolutionary algorithm that offers a broader approximation of the Pareto frontier, allowing a better angle for decision making in metabolic engineering. The proposed strategy is validated on a reduced metabolic network of the microalga Chlamydomonas reinhardtii while optimizing protein production, carbohydrate production, and CO2 uptake. The results from the conducted experimental design show a favorable difference in the number of solutions achieved compared to a classic tool solving FBA.
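The Pareto frontier mentioned in this abstract can be illustrated with a minimal sketch of the dominance filter that any multiobjective method relies on. This is not the evolutionary algorithm of the paper; it only shows how non-dominated solutions are selected from a set of candidates when every objective is maximized. The candidate values and the names `dominates` and `pareto_front` are hypothetical.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` in every objective and
    strictly better in at least one (all objectives maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical candidate solutions: (protein, carbohydrate, CO2 uptake),
# in arbitrary units invented for this example.
candidates = [
    (0.9, 0.1, 0.5),
    (0.5, 0.5, 0.5),
    (0.4, 0.4, 0.4),   # dominated by (0.5, 0.5, 0.5)
    (0.1, 0.9, 0.6),
    (0.1, 0.8, 0.6),   # dominated by (0.1, 0.9, 0.6)
]

front = pareto_front(candidates)
print(front)
```

The three surviving points each trade one objective against another, which is the set of alternatives a decision maker would compare. The pairwise filter is quadratic in the number of candidates, which is adequate for a sketch but not for large populations.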
Affiliation(s)
- Monica Fabiola Briones-Baez
  - TECNM/Instituto Tecnológico de Ciudad Madero, División de Estudios de Posgrado e Investigación, Los Mangos 89440, Mexico; (M.F.B.-B.); (L.A.-V.)
- Luciano Aguilera-Vazquez
  - TECNM/Instituto Tecnológico de Ciudad Madero, División de Estudios de Posgrado e Investigación, Los Mangos 89440, Mexico; (M.F.B.-B.); (L.A.-V.)
- Nelson Rangel-Valdez
  - CONACyT-TECNM/Instituto Tecnológico de Ciudad Madero, División de Estudios de Posgrado e Investigación, Los Mangos 89440, Mexico;
- Ana Lidia Martinez-Salazar
  - TECNM/Instituto Tecnológico de Ciudad Madero, División de Estudios de Posgrado e Investigación, Los Mangos 89440, Mexico; (M.F.B.-B.); (L.A.-V.)
- Cristal Zuñiga
  - Department of Biology, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182, USA;
4
Vijayakumar S, Angione C. Protocol for hybrid flux balance, statistical, and machine learning analysis of multi-omic data from the cyanobacterium Synechococcus sp. PCC 7002. STAR Protoc 2021; 2:100837. [PMID: 34632416 PMCID: PMC8488602 DOI: 10.1016/j.xpro.2021.100837] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
Combining a computational framework for flux balance analysis with machine learning improves the accuracy of predicting metabolic activity across conditions, while enabling mechanistic interpretation. This protocol presents a guide to condition-specific metabolic modeling that integrates regularized flux balance analysis with machine learning approaches to extract key features from transcriptomic and fluxomic data. We demonstrate the protocol as applied to Synechococcus sp. PCC 7002; we also outline how it can be adapted to any species or community with available multi-omic data. For complete details on the use and execution of this protocol, please refer to Vijayakumar et al. (2020).
Affiliation(s)
- Supreeta Vijayakumar
  - School of Computing, Engineering & Digital Technologies, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
- Claudio Angione
  - School of Computing, Engineering & Digital Technologies, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough TS1 3BX, UK
  - Healthcare Innovation Centre, Teesside University, Middlesbrough TS1 3BX, UK
5
Occhipinti A, Hamadi Y, Kugler H, Wintersteiger CM, Yordanov B, Angione C. Discovering Essential Multiple Gene Effects Through Large Scale Optimization: An Application to Human Cancer Metabolism. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2339-2352. [PMID: 32248120 DOI: 10.1109/tcbb.2020.2973386] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/11/2023]
Abstract
Computational modelling of metabolic processes has proven to be a useful approach to formulate our knowledge and improve our understanding of core biochemical systems that are crucial to maintaining cellular functions. Towards understanding the broader role of metabolism on cellular decision-making in health and disease conditions, it is important to integrate the study of metabolism with other core regulatory systems and omics within the cell, including gene expression patterns. After quantitatively integrating gene expression profiles with a genome-scale reconstruction of human metabolism, we propose a set of combinatorial methods to reverse engineer gene expression profiles and to find pairs and higher-order combinations of genetic modifications that simultaneously optimize multi-objective cellular goals. This enables us to suggest classes of transcriptomic profiles that are most suitable to achieve given metabolic phenotypes. We demonstrate how our techniques are able to compute beneficial, neutral or "toxic" combinations of gene expression levels. We test our methods on nine tissue-specific cancer models, comparing our outcomes with the corresponding normal cells, identifying genes as targets for potential therapies. Our methods open the way to a broad class of applications that require an understanding of the interplay among genotype, metabolism, and cellular behaviour, at scale.
6
Yaneske E, Zampieri G, Bertoldi L, Benvenuto G, Angione C. Genome-scale metabolic modelling of SARS-CoV-2 in cancer cells reveals an increased shift to glycolytic energy production. FEBS Lett 2021; 595:2350-2365. [PMID: 34409594 PMCID: PMC8427129 DOI: 10.1002/1873-3468.14180] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/18/2021] [Revised: 08/02/2021] [Accepted: 08/15/2021] [Indexed: 01/08/2023]
Abstract
Cancer is considered a high‐risk condition for severe illness resulting from COVID‐19. The interaction between severe acute respiratory syndrome coronavirus‐2 (SARS‐CoV‐2) and human metabolism is key to elucidating the risk posed by COVID‐19 for cancer patients and identifying effective treatments, yet it is largely uncharacterised on a mechanistic level. We present a genome‐scale map of short‐term metabolic alterations triggered by SARS‐CoV‐2 infection of cancer cells. Through transcriptomic‐ and proteomic‐informed genome‐scale metabolic modelling, we characterise the role of RNA and fatty acid biosynthesis in conjunction with a rewiring in energy production pathways and enhanced cytokine secretion. These findings link together complementary aspects of viral invasion of cancer cells, while providing mechanistic insights that can inform the development of treatment strategies.
Affiliation(s)
- Elisabeth Yaneske
  - School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
- Guido Zampieri
  - School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
  - Department of Biology, University of Padua, Italy
- Claudio Angione
  - School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
  - Healthcare Innovation Centre, Teesside University, Middlesbrough, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough, UK
7
Integrating systemic and molecular levels to infer key drivers sustaining metabolic adaptations. PLoS Comput Biol 2021; 17:e1009234. [PMID: 34297714 PMCID: PMC8336858 DOI: 10.1371/journal.pcbi.1009234] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/05/2021] [Revised: 08/04/2021] [Accepted: 07/01/2021] [Indexed: 12/02/2022]
Abstract
Metabolic adaptations to complex perturbations, like the response to pharmacological treatments in multifactorial diseases such as cancer, can be described through measurements of part of the fluxes and concentrations at the systemic level and individual transporter and enzyme activities at the molecular level. In the framework of Metabolic Control Analysis (MCA), ensembles of linear constraints can be built integrating these measurements at both systemic and molecular levels, which are expressed as relative differences or changes produced in the metabolic adaptation. Here, combining MCA with Linear Programming, an efficient computational strategy is developed to infer additional non-measured changes at the molecular level that are required to satisfy these constraints. An application of this strategy is illustrated by using a set of fluxes, concentrations, and differentially expressed genes that characterize the response to cyclin-dependent kinases 4 and 6 inhibition in colon cancer cells. Decreases and increases in transporter and enzyme individual activities required to reprogram the measured changes in fluxes and concentrations are compared with down-regulated and up-regulated metabolic genes to unveil those that are key molecular drivers of the metabolic response. Deciphering the essential events in the reprogramming of metabolic networks subjected to complex perturbations, including the response to pharmacological treatments in multifactorial diseases like cancer, is crucial for the design of efficient therapies. Yet, tools to infer the molecular drivers sustaining such metabolic responses remain elusive for large metabolic networks. Here we develop an efficient computational strategy that integrates measured changes at systemic and molecular levels and combines metabolic control analysis with linear programming tools to infer key molecular drivers sustaining the metabolic adaptations to complex perturbations, such as an antitumoral drug therapy. 
The collective behavior is approximated using linear expressions where the adaptation of systemic concentrations and fluxes to a perturbation is described as a function of the molecular reprogramming of transport and enzyme activities. Starting from measured changes in fluxes and concentrations, we identify changes in the reprogramming of transporter and enzyme activities that are required to orchestrate the metabolic adaptation of colon cancer cells to a cell cycle inhibitor.
8
Vijayakumar S, Rahman PK, Angione C. A Hybrid Flux Balance Analysis and Machine Learning Pipeline Elucidates Metabolic Adaptation in Cyanobacteria. iScience 2020; 23:101818. [PMID: 33354660 PMCID: PMC7744713 DOI: 10.1016/j.isci.2020.101818] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 05/26/2020] [Revised: 10/23/2020] [Accepted: 11/13/2020] [Indexed: 01/20/2023]
Abstract
Machine learning has recently emerged as a promising tool for inferring multi-omic relationships in biological systems. At the same time, genome-scale metabolic models (GSMMs) can be integrated with such multi-omic data to refine phenotypic predictions. In this work, we use a multi-omic machine learning pipeline to analyze a GSMM of Synechococcus sp. PCC 7002, a cyanobacterium with large potential to produce renewable biofuels. We use regularized flux balance analysis to observe flux response between conditions across photosynthesis and energy metabolism. We then incorporate principal-component analysis, k-means clustering, and LASSO regularization to reduce dimensionality and extract key cross-omic features. Our results suggest that combining metabolic modeling with machine learning elucidates mechanisms used by cyanobacteria to cope with fluctuations in light intensity and salinity that cannot be detected using transcriptomics alone. Furthermore, GSMMs introduce critical mechanistic details that improve the performance of omic-based machine learning methods.
Affiliation(s)
- Supreeta Vijayakumar
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
- Pattanathu K.S.M. Rahman
  - Centre for Enzyme Innovation, Institute of Biological and Biomedical Sciences, School of Biological Sciences, University of Portsmouth, Portsmouth, Hampshire PO1 2UP, UK
  - Tara Biologics, Woking, Surrey GU21 6BP, UK
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough TS1 3BX, UK
  - Healthcare Innovation Centre, Teesside University, Middlesbrough TS1 3BX, UK
9
Tsiantis N, Banga JR. Using optimal control to understand complex metabolic pathways. BMC Bioinformatics 2020; 21:472. [PMID: 33087041 PMCID: PMC7579911 DOI: 10.1186/s12859-020-03808-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/12/2020] [Accepted: 10/13/2020] [Indexed: 12/16/2022]
Abstract
Background: Optimality principles have been used to explain the structure and behavior of living matter at different levels of organization, from basic phenomena at the molecular level, up to complex dynamics in whole populations. Most of these studies have assumed a single-criteria approach. Such optimality principles have been justified from an evolutionary perspective. In the context of the cell, previous studies have shown how dynamics of gene expression in small metabolic models can be explained assuming that cells have developed optimal adaptation strategies. Most of these works have considered rather simplified representations, such as small linear pathways, or reduced networks with a single branching point, and a single objective for the optimality criteria.
Results: Here we consider the extension of this approach to more realistic scenarios, i.e. biochemical pathways of arbitrary size and structure. We first show that exploiting optimality principles for these networks poses great challenges due to the complexity of the associated optimal control problems. Second, in order to surmount such challenges, we present a computational framework which has been designed with scalability and efficiency in mind, including mechanisms to avoid the most common pitfalls. Third, we illustrate its performance with several case studies considering the central carbon metabolism of S. cerevisiae and B. subtilis. In particular, we consider metabolic dynamics during nutrient shift experiments.
Conclusions: We show how multi-objective optimal control can be used to predict temporal profiles of enzyme activation and metabolite concentrations in complex metabolic pathways. Further, we also show how to consider general cost/benefit trade-offs. In this study we have considered metabolic pathways, but this computational framework can also be applied to analyze the dynamics of other complex pathways, such as signal transduction or gene regulatory networks.
Affiliation(s)
- Nikolaos Tsiantis
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, C/Eduardo Cabello 6, 36208 Vigo, Spain
  - Department of Chemical Engineering, University of Vigo, 36310 Vigo, Spain
- Julio R. Banga
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, C/Eduardo Cabello 6, 36208 Vigo, Spain
10
A mechanism-aware and multiomic machine-learning pipeline characterizes yeast cell growth. Proc Natl Acad Sci U S A 2020; 117:18869-18879. [PMID: 32675233 DOI: 10.1073/pnas.2002959117] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Indexed: 11/18/2022]
Abstract
Metabolic modeling and machine learning are key components in the emerging next generation of systems and synthetic biology tools, targeting the genotype-phenotype-environment relationship. Rather than being used in isolation, it is becoming clear that their value is maximized when they are combined. However, the potential of integrating these two frameworks for omic data augmentation and integration is largely unexplored. We propose, rigorously assess, and compare machine-learning-based data integration techniques, combining gene expression profiles with computationally generated metabolic flux data to predict yeast cell growth. To this end, we create strain-specific metabolic models for 1,143 Saccharomyces cerevisiae mutants and we test 27 machine-learning methods, incorporating state-of-the-art feature selection and multiview learning approaches. We propose a multiview neural network using fluxomic and transcriptomic data, showing that the former increases the predictive accuracy of the latter and reveals functional patterns that are not directly deducible from gene expression alone. We test the proposed neural network on a further 86 strains generated in a different experiment, therefore verifying its robustness to an additional independent dataset. Finally, we show that introducing mechanistic flux features improves the predictions also for knockout strains whose genes were not modeled in the metabolic reconstruction. Our results thus demonstrate that fusing experimental cues with in silico models, based on known biochemistry, can contribute with disjoint information toward biologically informed and interpretable machine learning. Overall, this study provides tools for understanding and manipulating complex phenotypes, increasing both the prediction accuracy and the extent of discernible mechanistic biological insights.
11
Zampieri G, Vijayakumar S, Yaneske E, Angione C. Machine and deep learning meet genome-scale metabolic modeling. PLoS Comput Biol 2019; 15:e1007084. [PMID: 31295267 PMCID: PMC6622478 DOI: 10.1371/journal.pcbi.1007084] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Indexed: 12/25/2022]
Abstract
Omic data analysis is steadily growing as a driver of basic and applied molecular biology research. Core to the interpretation of complex and heterogeneous biological phenotypes are computational approaches in the fields of statistics and machine learning. In parallel, constraint-based metabolic modeling has established itself as the main tool to investigate large-scale relationships between genotype, phenotype, and environment. The development and application of these methodological frameworks have occurred independently for the most part, whereas the potential of their integration for biological, biomedical, and biotechnological research is less known. Here, we describe how machine learning and constraint-based modeling can be combined, reviewing recent works at the intersection of both domains and discussing the mathematical and practical aspects involved. We overlap systematic classifications from both frameworks, making them accessible to nonexperts. Finally, we delineate potential future scenarios, propose new joint theoretical frameworks, and suggest concrete points of investigation for this joint subfield. A multiview approach merging experimental and knowledge-driven omic data through machine learning methods can incorporate key mechanistic information in an otherwise biologically-agnostic learning process.
Affiliation(s)
- Guido Zampieri
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Supreeta Vijayakumar
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Elisabeth Yaneske
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
  - Healthcare Innovation Centre, Teesside University, Middlesbrough, United Kingdom
12
Tran VDT, Moretti S, Coste AT, Amorim-Vaz S, Sanglard D, Pagni M. Condition-specific series of metabolic sub-networks and its application for gene set enrichment analysis. Bioinformatics 2019; 35:2258-2266. [PMID: 30445518 PMCID: PMC6596900 DOI: 10.1093/bioinformatics/bty929] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/13/2018] [Revised: 10/16/2018] [Accepted: 11/09/2018] [Indexed: 11/12/2022]
Abstract
Motivation: Genome-scale metabolic networks and transcriptomic data represent complementary sources of knowledge about an organism's metabolism, yet their integration to achieve biological insight remains challenging.
Results: We investigate here condition-specific series of metabolic sub-networks constructed by successively removing genes from a comprehensive network. The optimal order of gene removal is deduced from transcriptomic data. The sub-networks are evaluated via a fitness function, which estimates their degree of alteration. We then consider how a gene set, i.e. a group of genes contributing to a common biological function, is depleted in different series of sub-networks to detect the difference between experimental conditions. The method, named metaboGSE, is validated on public data for Yarrowia lipolytica and mouse. It is shown to produce GO terms of higher specificity compared to popular gene set enrichment methods like GSEA or topGO.
Availability and implementation: The metaboGSE R package is available at https://CRAN.R-project.org/package=metaboGSE.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Van Du T Tran
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Sébastien Moretti
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
  - Evolutionary Bioinformatics Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Alix T Coste
  - Institute of Microbiology, University Hospital and University of Lausanne, CH-1015 Lausanne, Switzerland
- Sara Amorim-Vaz
  - Institute of Microbiology, University Hospital and University of Lausanne, CH-1015 Lausanne, Switzerland
- Dominique Sanglard
  - Institute of Microbiology, University Hospital and University of Lausanne, CH-1015 Lausanne, Switzerland
- Marco Pagni
  - Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
13
Human Systems Biology and Metabolic Modelling: A Review-From Disease Metabolism to Precision Medicine. BioMed Research International 2019; 2019:8304260. [PMID: 31281846 PMCID: PMC6590590 DOI: 10.1155/2019/8304260] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 11/01/2018] [Revised: 02/07/2019] [Accepted: 05/20/2019] [Indexed: 01/06/2023]
Abstract
In cell and molecular biology, metabolism is the only system that can be fully simulated at genome scale. Metabolic systems biology offers powerful abstraction tools to simulate all known metabolic reactions in a cell, therefore providing a snapshot that is close to its observable phenotype. In this review, we cover the 15 years of human metabolic modelling. We show that, although the past five years have not experienced large improvements in the size of the gene and metabolite sets in human metabolic models, their accuracy is rapidly increasing. We also describe how condition-, tissue-, and patient-specific metabolic models shed light on cell-specific changes occurring in the metabolic network, therefore predicting biomarkers of disease metabolism. We finally discuss current challenges and future promising directions for this research field, including machine/deep learning and precision medicine. In the omics era, profiling patients and biological processes from a multiomic point of view is becoming more common and less expensive. Starting from multiomic data collected from patients and N-of-1 trials where individual patients constitute different case studies, methods for model-building and data integration are being used to generate patient-specific models. Coupled with state-of-the-art machine learning methods, this will allow characterizing each patient's disease phenotype and delivering precision medicine solutions, therefore leading to preventative medicine, reduced treatment, and in silico clinical trials.
14
Oulas A, Minadakis G, Zachariou M, Sokratous K, Bourdakou MM, Spyrou GM. Systems Bioinformatics: increasing precision of computational diagnostics and therapeutics through network-based approaches. Brief Bioinform 2019; 20:806-824. [PMID: 29186305 PMCID: PMC6585387 DOI: 10.1093/bib/bbx151] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 07/17/2017] [Revised: 10/17/2017] [Indexed: 02/01/2023]
Abstract
Systems Bioinformatics is a relatively new approach, which lies in the intersection of systems biology and classical bioinformatics. It focuses on integrating information across different levels using a bottom-up approach as in systems biology with a data-driven top-down approach as in bioinformatics. The advent of omics technologies has provided the stepping-stone for the emergence of Systems Bioinformatics. These technologies provide a spectrum of information ranging from genomics, transcriptomics and proteomics to epigenomics, pharmacogenomics, metagenomics and metabolomics. Systems Bioinformatics is the framework in which systems approaches are applied to such data, setting the level of resolution as well as the boundary of the system of interest and studying the emerging properties of the system as a whole rather than the sum of the properties derived from the system's individual components. A key approach in Systems Bioinformatics is the construction of multiple networks representing each level of the omics spectrum and their integration in a layered network that exchanges information within and between layers. Here, we provide evidence on how Systems Bioinformatics enhances computational therapeutics and diagnostics, hence paving the way to precision medicine. The aim of this review is to familiarize the reader with the emerging field of Systems Bioinformatics and to provide a comprehensive overview of its current state-of-the-art methods and technologies. Moreover, we provide examples of success stories and case studies that utilize such methods and tools to significantly advance research in the fields of systems biology and systems medicine.
Affiliation(s)
- Anastasis Oulas
  - Bioinformatics European Research Area Chair, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- George Minadakis
  - Bioinformatics European Research Area Chair, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Margarita Zachariou
  - Bioinformatics European Research Area Chair, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Kleitos Sokratous
  - Bioinformatics European Research Area Chair, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- Marilena M Bourdakou
  - Bioinformatics European Research Area Chair, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
- George M Spyrou
  - Bioinformatics European Research Area Chair, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
|
15
|
Ajjolli Nagaraja A, Fontaine N, Delsaut M, Charton P, Damour C, Offmann B, Grondin-Perez B, Cadet F. Flux prediction using artificial neural network (ANN) for the upper part of glycolysis. PLoS One 2019; 14:e0216178. [PMID: 31067238 PMCID: PMC6505829 DOI: 10.1371/journal.pone.0216178] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/28/2019] [Accepted: 04/15/2019] [Indexed: 01/08/2023]
Abstract
The selection of optimal enzyme concentrations in multienzyme cascade reactions for the highest product yield is, in practice, a very expensive and time-consuming process. The modelling of biological pathways is difficult because of the complexity of the system. Mathematical modelling of the system using an analytical approach depends on many enzyme parameters, which must be obtained through tedious and expensive experiments. The artificial neural network (ANN) method has been successfully applied in different fields of science to perform complex functions. In this study, ANN models were trained to predict the flux for the upper part of glycolysis, as inferred by NADH consumption, using the concentrations of four enzymes: phosphoglucoisomerase, phosphofructokinase, fructose-bisphosphate-aldolase and triose-phosphate-isomerase. Out of three ANN algorithms, the neuralnet package was implemented with two activation functions, “logistic” and “tanh”. The prediction of the flux was very efficient: using a cross validation procedure, RMSE and R2 were 0.847 and 0.93 for the logistic function and 0.804 and 0.94 for the tanh function, respectively. This study showed that a systemic approach such as ANN could be used for accurate prediction of the flux through the metabolic pathway. This could help to save a lot of time and costs, particularly from an industrial perspective. The R-code is available at: https://github.com/DSIMB/ANN-Glycolysis-Flux-Prediction.
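The two accuracy metrics quoted in this abstract can be illustrated with a minimal Python sketch. It assumes that R2 denotes the coefficient of determination (the abstract does not state which definition was used), and the flux values below are made-up illustrative numbers, not data from the study. The authors' own implementation is the R code linked above.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    squared_errors = ((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(sum(squared_errors) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted fluxes (arbitrary units).
observed_flux = [1.0, 2.0, 3.0, 4.0]
predicted_flux = [1.5, 2.0, 2.5, 4.0]

flux_rmse = rmse(observed_flux, predicted_flux)
flux_r2 = r_squared(observed_flux, predicted_flux)
print(f"RMSE = {flux_rmse:.3f}, R2 = {flux_r2:.3f}")
```

In a cross validation setting, these metrics would typically be computed on held-out folds and then averaged.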
Affiliation(s)
- Anamya Ajjolli Nagaraja
  - LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, France
- Mathieu Delsaut
  - LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, France
- Philippe Charton
  - DSIMB, INSERM, UMR S-1134, Laboratory of Excellence LABEX GR, Faculty of Sciences and Technology, University of La Reunion & University Paris Diderot, Paris, France
- Cedric Damour
  - LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, France
- Bernard Offmann
  - Université de Nantes, Unité Fonctionnalité et Ingénierie des Protéines (UFIP), UMR 6286 CNRS, UFR Sciences et Techniques, chemin de la Houssinière, France
- Brigitte Grondin-Perez
  - LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, France
- Frederic Cadet
  - DSIMB, INSERM, UMR S-1134, Laboratory of Excellence LABEX GR, Faculty of Sciences and Technology, University of La Reunion & University Paris Diderot, Paris, France
|
16
|
Di Stefano A, Scatà M, Vijayakumar S, Angione C, La Corte A, Liò P. Social dynamics modeling of chrono-nutrition. PLoS Comput Biol 2019; 15:e1006714. [PMID: 30699206 PMCID: PMC6370249 DOI: 10.1371/journal.pcbi.1006714] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 12/05/2017] [Revised: 02/11/2019] [Accepted: 12/14/2018] [Indexed: 12/13/2022]
Abstract
Gut microbiota and human relationships are strictly connected to each other. What we eat reflects our body-mind connection and synchronizes with people around us. However, how this impacts on gut microbiota and, conversely, how gut bacteria influence our dietary behaviors has not been explored yet. To quantify the complex dynamics of this interplay between gut and human behaviors we explore the "gut-human behavior axis" and its evolutionary dynamics in a real-world scenario represented by the social multiplex network. We consider a dual type of similarity, homophily and gut similarity, other than psychological and unconscious biases. We analyze the dynamics of social and gut microbial communities, quantifying the impact of human behaviors on diets and gut microbial composition and, backwards, through a control mechanism. Meal timing mechanisms and "chrono-nutrition" play a crucial role in feeding behaviors, along with the quality and quantity of food intake. Considering a population of shift workers, we explore the dynamic interplay between their eating behaviors and gut microbiota, modeling the social dynamics of chrono-nutrition in a multiplex network. Our findings allow us to quantify the relation between human behaviors and gut microbiota through the methodological introduction of gut metabolic modeling and statistical estimators, able to capture their dynamic interplay. Moreover, we find that the timing of gut microbial communities is slower than social interactions and shift-working, and the impact of shift-working on the dynamics of chrono-nutrition is a fluctuation of strategies with a major propensity for defection (e.g. high-fat meals). 
A deeper understanding of the relation between gut microbiota and the dietary behavioral patterns, by embedding also the related social aspects, allows improving the overall knowledge about metabolic models and their implications for human health, opening the possibility to design promising social therapeutic dietary interventions.
Affiliation(s)
- Alessandro Di Stefano
  - Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), CNIT (National Inter-University Consortium for Telecommunications) Catania, Italy
- Marialisa Scatà
  - Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), CNIT (National Inter-University Consortium for Telecommunications) Catania, Italy
- Supreeta Vijayakumar
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Aurelio La Corte
  - Dipartimento di Ingegneria Elettrica, Elettronica e Informatica (DIEEI), CNIT (National Inter-University Consortium for Telecommunications) Catania, Italy
- Pietro Liò
  - Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
|
17
|
Mancini A, Eyassu F, Conway M, Occhipinti A, Liò P, Angione C, Pucciarelli S. CiliateGEM: an open-project and a tool for predictions of ciliate metabolic variations and experimental condition design. BMC Bioinformatics 2018; 19:442. [PMID: 30497359 PMCID: PMC6266953 DOI: 10.1186/s12859-018-2422-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND The study of cell metabolism is becoming central in several fields such as biotechnology, evolution/adaptation and human disease investigations. Here we present CiliateGEM, the first metabolic network reconstruction draft of the freshwater ciliate Tetrahymena thermophila. We also provide the tools and resources to simulate different growth conditions and to predict metabolic variations. CiliateGEM can be extended to other ciliates in order to set up a meta-model, i.e. a metabolic network reconstruction valid for all ciliates. Ciliates are complex unicellular eukaryotes of presumably monophyletic origin, with a phylogenetic position that is equidistant from plants and animals. These cells represent a new concept of unicellular system with a high degree of species and population biodiversity and of cell complexity. Ciliates perform in a single cell all the functions of a pluricellular organism, including locomotion, feeding, digestion, and sexual processes. RESULTS After generating the model, we performed in-silico simulations in the presence and absence of glucose. The lack of this nutrient caused a 32.1% reduction in the rate of biomass synthesis. Despite the glucose starvation, growth did not stop, owing to the use of alternative carbon sources such as amino acids. CONCLUSIONS The future models obtained from CiliateGEM may represent a new approach to describe the metabolism of ciliates. This tool will be a useful resource for the ciliate research community in order to extend the use of these species as model organisms in different research fields. An improved understanding of ciliate metabolism could be relevant to elucidate the basis of biological phenomena like genotype-phenotype relationships, population genetics, and cilia-related disease mechanisms.
Affiliation(s)
- Alessio Mancini
  - School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Filmon Eyassu
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
- Maxwell Conway
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Pietro Liò
  - Computer Laboratory, University of Cambridge, Cambridge, UK
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
- Sandra Pucciarelli
  - School of Biosciences and Veterinary Medicine, University of Camerino, Camerino, Italy
|
18
|
Abstract
BACKGROUND Ageing can be classified in two different ways, chronological ageing and biological ageing. While chronological age is a measure of the time that has passed since birth, biological (also known as transcriptomic) ageing is defined by how time and the environment affect an individual in comparison to other individuals of the same chronological age. Recent research studies have shown that transcriptomic age is associated with certain genes, and that each of those genes has an effect size. Using these effect sizes we can calculate the transcriptomic age of an individual from their age-associated gene expression levels. The limitation of this approach is that it does not consider how these changes in gene expression affect the metabolism of individuals and hence their observable cellular phenotype. RESULTS We propose a method based on poly-omic constraint-based models and machine learning in order to further the understanding of transcriptomic ageing. We use normalised CD4 T-cell gene expression data from peripheral blood mononuclear cells in 499 healthy individuals to create individual metabolic models. These models are then combined with a transcriptomic age predictor and chronological age to provide new insights into the differences between transcriptomic and chronological ageing. As a result, we propose a novel metabolic age predictor. CONCLUSIONS We show that our poly-omic predictors provide a more detailed analysis of transcriptomic ageing compared to gene-based approaches, and represent a basis for furthering our knowledge of the ageing mechanisms in human cells.
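The effect-size calculation mentioned in the background of this abstract can be sketched in Python as a weighted sum. This is only an illustration under stated assumptions: the abstract does not give the actual formula, so a plain linear combination of expression levels and effect sizes is assumed here, and the gene names, effect sizes and expression values are all hypothetical.

```python
def weighted_expression_score(expression, effect_sizes):
    """Sum of effect_size * expression over genes present in both mappings.

    Assumed linear form; any scaling of the score to an age in years
    is not shown, since the abstract does not describe it.
    """
    shared_genes = expression.keys() & effect_sizes.keys()
    return sum(effect_sizes[g] * expression[g] for g in shared_genes)


# Hypothetical age-associated genes and their effect sizes.
effect_sizes = {"GENE_A": 0.5, "GENE_B": -0.25}

# Hypothetical normalised expression levels for one individual.
# GENE_C has no effect size, so it does not contribute to the score.
expression = {"GENE_A": 2.0, "GENE_B": 2.0, "GENE_C": 1.0}

score = weighted_expression_score(expression, effect_sizes)
print(score)
```

A score of this kind uses gene expression alone, which is the limitation the abstract points out: it says nothing about how those expression changes affect metabolism.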
Affiliation(s)
- Elisabeth Yaneske
  - Department of Computer Science and Information Systems, Teesside University, Borough Road, Middlesbrough, UK
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Borough Road, Middlesbrough, UK
|
19
|
Cuperlovic-Culf M. Machine Learning Methods for Analysis of Metabolic Data and Metabolic Pathway Modeling. Metabolites 2018; 8:E4. [PMID: 29324649 PMCID: PMC5875994 DOI: 10.3390/metabo8010004] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 12/13/2017] [Revised: 01/08/2018] [Accepted: 01/09/2018] [Indexed: 01/15/2023]
Abstract
Machine learning uses experimental data to optimize clustering or classification of samples or features, or to develop, augment or verify models that can be used to predict behavior or properties of systems. It is expected that machine learning will help provide actionable knowledge from a variety of big data including metabolomics data, as well as results of metabolism models. A variety of machine learning methods has been applied in bioinformatics and metabolism analyses including self-organizing maps, support vector machines, the kernel machine, Bayesian networks or fuzzy logic. To a lesser extent, machine learning has also been utilized to take advantage of the increasing availability of genomics and metabolomics data for the optimization of metabolic network models and their analysis. In this context, machine learning has aided the development of metabolic networks, the calculation of parameters for stoichiometric and kinetic models, as well as the analysis of major features in the model for the optimal application of bioreactors. Examples of this very interesting, albeit highly complex, application of machine learning for metabolism modeling will be the primary focus of this review presenting several different types of applications for model optimization, parameter determination or system analysis using models, as well as the utilization of several different types of machine learning technologies.
Affiliation(s)
- Miroslava Cuperlovic-Culf
  - Digital Technologies Research Center, National Research Council of Canada, 1200 Montreal Road, Ottawa, ON K1A 0R6, Canada.
|
20
|
Occhipinti A, Eyassu F, Rahman TJ, Rahman PKSM, Angione C. In silico engineering of Pseudomonas metabolism reveals new biomarkers for increased biosurfactant production. PeerJ 2018; 6:e6046. [PMID: 30588397 PMCID: PMC6301282 DOI: 10.7717/peerj.6046] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 04/02/2018] [Accepted: 10/30/2018] [Indexed: 01/29/2023]
Abstract
BACKGROUND Rhamnolipids, biosurfactants with a wide range of biomedical applications, are amphiphilic molecules produced on the surfaces of or excreted extracellularly by bacteria including Pseudomonas aeruginosa. However, Pseudomonas putida is a non-pathogenic model organism with greater metabolic versatility and potential for industrial applications. METHODS We investigate in silico the metabolic capabilities of P. putida for rhamnolipids biosynthesis using statistical, metabolic and synthetic engineering approaches after introducing key genes (RhlA and RhlB) from P. aeruginosa into a genome-scale model of P. putida. This pipeline combines machine learning methods with multi-omic modelling, and drives the engineered P. putida model toward an optimal production and export of rhamnolipids out of the membrane. RESULTS We identify a substantial increase in synthesis of rhamnolipids by the engineered model compared to the control model. We apply statistical and machine learning techniques on the metabolic reaction rates to identify distinct features on the structure of the variables and individual components driving the variation of growth and rhamnolipids production. We finally provide a computational framework for integrating multi-omics data and identifying latent pathways and genes for the production of rhamnolipids in P. putida. CONCLUSIONS We anticipate that our results will provide a versatile methodology for integrating multi-omics data for topological and functional analysis of P. putida toward maximization of biosurfactant production.
Affiliation(s)
- Annalisa Occhipinti
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
- Filmon Eyassu
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
- Thahira J. Rahman
  - Technology Futures Institute, School of Science, Engineering and Design, Teesside University, Middlesbrough, UK
- Pattanathu K. S. M. Rahman
  - Technology Futures Institute, School of Science, Engineering and Design, Teesside University, Middlesbrough, UK
  - Institute of Biological and Biomedical Sciences, School of Biological Sciences, University of Portsmouth, Portsmouth, UK
- Claudio Angione
  - Department of Computer Science and Information Systems, Teesside University, Middlesbrough, UK
|
21
|
Eyassu F, Angione C. Modelling pyruvate dehydrogenase under hypoxia and its role in cancer metabolism. Royal Society Open Science 2017; 4:170360. [PMID: 29134060 PMCID: PMC5666243 DOI: 10.1098/rsos.170360] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 04/20/2017] [Accepted: 09/25/2017] [Indexed: 05/18/2023]
Abstract
Metabolism is the only biological system that can be fully modelled at genome scale. As a result, metabolic models have been increasingly used to study the molecular mechanisms of various diseases. Hypoxia, a state of low oxygen tension, is a well-known characteristic of many cancer cells. Pyruvate dehydrogenase (PDH) controls the flux of metabolites between glycolysis and the tricarboxylic acid cycle and is a key enzyme in metabolic reprogramming in cancer metabolism. Here, we develop and manually curate a constraint-based metabolic model to investigate the mechanism of pyruvate dehydrogenase under hypoxia. Our results characterize the activity of pyruvate dehydrogenase and its decline during hypoxia. This decline results in lactate accumulation, which is consistent with recent hypoxia studies and is a well-known feature of cancer metabolism. We apply machine-learning techniques to the flux datasets to identify reactions that drive these variations. We also identify distinct features in the structure of the variables and individual metabolic components in the switch from normoxia to hypoxia. Our results provide a framework for future studies by integrating multi-omics data to predict condition-specific metabolic phenotypes under hypoxia.
|