1
Turanli B, Gulfidan G, Aydogan OO, Kula C, Selvaraj G, Arga KY. Genome-scale metabolic models in translational medicine: the current status and potential of machine learning in improving the effectiveness of the models. Mol Omics 2024; 20:234-247. [PMID: 38444371 DOI: 10.1039/d3mo00152k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
The genome-scale metabolic model (GEM) has emerged as one of the leading modeling approaches for systems-level metabolic studies and has been widely explored for a broad range of organisms and applications. Owing to the development of genome sequencing technologies and available biochemical data, it is possible to reconstruct GEMs for model and non-model microorganisms as well as for multicellular organisms such as humans and animal models. GEMs will evolve in parallel with the availability of biological data, new mathematical modeling techniques and the development of automated GEM reconstruction tools. The use of high-quality, context-specific GEMs, a subset of the original GEM in which inactive reactions are removed while maintaining metabolic functions in the extracted model, for model organisms along with machine learning (ML) techniques could increase their applications and effectiveness in translational research in the near future. Here, we briefly review the current state of GEMs, discuss the potential contributions of ML approaches for more efficient and frequent application of these models in translational research, and explore the extension of GEMs to integrative cellular models.
Affiliation(s)
- Beste Turanli
- Marmara University, Faculty of Engineering, Department of Bioengineering, Istanbul, Turkey.
- Health Biotechnology Joint Research and Application Center of Excellence, Istanbul, Turkey
- Gizem Gulfidan
- Marmara University, Faculty of Engineering, Department of Bioengineering, Istanbul, Turkey.
- Ozge Onluturk Aydogan
- Marmara University, Faculty of Engineering, Department of Bioengineering, Istanbul, Turkey.
- Ceyda Kula
- Marmara University, Faculty of Engineering, Department of Bioengineering, Istanbul, Turkey.
- Health Biotechnology Joint Research and Application Center of Excellence, Istanbul, Turkey
- Gurudeeban Selvaraj
- Concordia University, Centre for Research in Molecular Modeling & Department of Chemistry and Biochemistry, Quebec, Canada
- Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha Dental College and Hospital, Department of Biomaterials, Bioinformatics Unit, Chennai, India
- Kazim Yalcin Arga
- Marmara University, Faculty of Engineering, Department of Bioengineering, Istanbul, Turkey.
- Health Biotechnology Joint Research and Application Center of Excellence, Istanbul, Turkey
- Marmara University, Genetic and Metabolic Diseases Research and Investigation Center, Istanbul, Turkey
2
Wu D, Xu F, Xu Y, Huang M, Li Z, Chu J. Towards a hybrid model-driven platform based on flux balance analysis and a machine learning pipeline for biosystem design. Synth Syst Biotechnol 2024; 9:33-42. [PMID: 38234412 PMCID: PMC10793177 DOI: 10.1016/j.synbio.2023.12.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Revised: 12/22/2023] [Accepted: 12/22/2023] [Indexed: 01/19/2024]
Abstract
Metabolic modeling and machine learning (ML) are crucial components of the evolving next-generation tools in systems and synthetic biology, aiming to unravel the intricate relationship between genotype, phenotype, and the environment. Nonetheless, the comprehensive exploration of integrating these two frameworks, and fully harnessing the potential of fluxomic data, remains an unexplored territory. In this study, we present, rigorously evaluate, and compare ML-based techniques for data integration. The hybrid model revealed that the overexpression of six target genes and the knockout of seven target genes contribute to enhanced ethanol production. Specifically, we investigated the influence of succinate dehydrogenase (SDH) on ethanol biosynthesis in Saccharomyces cerevisiae through shake flask experiments. The findings indicate a noticeable increase in ethanol yield, ranging from 6 % to 10 %, in SDH subunit gene knockout strains compared to the wild-type strain. Moreover, in pursuit of a high-yielding strain for ethanol production, dual-gene deletion experiments were conducted targeting glycerol-3-phosphate dehydrogenase (GPD) and SDH. The results unequivocally demonstrate significant enhancements in ethanol production for the engineered strains Δsdh4Δgpd1, Δsdh5Δgpd1, Δsdh6Δgpd1, Δsdh4Δgpd2, Δsdh5Δgpd2, and Δsdh6Δgpd2, with improvements of 21.6 %, 27.9 %, and 22.7 %, respectively. Overall, the results highlighted that integrating mechanistic flux features substantially improves the prediction of gene knockout strains not accounted for in metabolic reconstructions. In addition, the finding in this study delivers valuable tools for comprehending and manipulating intricate phenotypes, thereby enhancing prediction accuracy and facilitating deeper insights into mechanistic aspects within the field of synthetic biology.
Affiliation(s)
- Yaying Xu
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, People's Republic of China
- Mingzhi Huang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, People's Republic of China
- Zhimin Li
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, People's Republic of China
- Ju Chu
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, People's Republic of China
3
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023]
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Martin H Rau
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Benjamín J Sánchez
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Kristian Jensen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Ahmad A Zeidan
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
4
Strain B, Morrissey J, Antonakoudis A, Kontoravdi C. Genome-scale models as a vehicle for knowledge transfer from microbial to mammalian cell systems. Comput Struct Biotechnol J 2023; 21:1543-1549. [PMID: 36879884 PMCID: PMC9984296 DOI: 10.1016/j.csbj.2023.02.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/15/2022] [Revised: 02/06/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023]
Abstract
With the plethora of omics data becoming available for mammalian cell and, increasingly, human cell systems, Genome-scale metabolic models (GEMs) have emerged as a useful tool for their organisation and analysis. The systems biology community has developed an array of tools for the solution, interrogation and customisation of GEMs as well as algorithms that enable the design of cells with desired phenotypes based on the multi-omics information contained in these models. However, these tools have largely found application in microbial cells systems, which benefit from smaller model size and ease of experimentation. Herein, we discuss the major outstanding challenges in the use of GEMs as a vehicle for accurately analysing data for mammalian cell systems and transferring methodologies that would enable their use to design strains and processes. We provide insights on the opportunities and limitations of applying GEMs to human cell systems for advancing our understanding of health and disease. We further propose their integration with data-driven tools and their enrichment with cellular functions beyond metabolism, which would, in theory, more accurately describe how resources are allocated intracellularly.
Affiliation(s)
- Benjamin Strain
- Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- James Morrissey
- Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- Cleo Kontoravdi
- Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
5
Methods for Stratification and Validation Cohorts: A Scoping Review. J Pers Med 2022; 12:jpm12050688. [PMID: 35629113 PMCID: PMC9144352 DOI: 10.3390/jpm12050688] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/09/2022] [Revised: 03/31/2022] [Accepted: 04/15/2022] [Indexed: 12/12/2022]
Abstract
Personalized medicine requires large cohorts for patient stratification and validation of patient clustering. However, standards and harmonized practices on the methods and tools to be used for the design and management of cohorts in personalized medicine remain to be defined. This study aims to describe the current state-of-the-art in this area. A scoping review was conducted searching in PubMed, EMBASE, Web of Science, Psycinfo and Cochrane Library for reviews about tools and methods related to cohorts used in personalized medicine. The search focused on cancer, stroke and Alzheimer’s disease and was limited to reports in English, French, German, Italian and Spanish published from 2005 to April 2020. The screening process was reported through a PRISMA flowchart. Fifty reviews were included, mostly including information about how data were generated (25/50) and about tools used for data management and analysis (24/50). No direct information was found about the quality of data and the requirements to monitor associated clinical data. A scarcity of information and standards was found in specific areas such as sample size calculation. With this information, comprehensive guidelines could be developed in the future to improve the reproducibility and robustness in the design and management of cohorts in personalized medicine studies.
6
Wu L, Xie X, Liang T, Ma J, Yang L, Yang J, Li L, Xi Y, Li H, Zhang J, Chen X, Ding Y, Wu Q. Integrated Multi-Omics for Novel Aging Biomarkers and Antiaging Targets. Biomolecules 2021; 12:39. [PMID: 35053186 PMCID: PMC8773837 DOI: 10.3390/biom12010039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/02/2021] [Revised: 12/17/2021] [Accepted: 12/19/2021] [Indexed: 12/12/2022]
Abstract
Aging is closely related to the occurrence of human diseases; however, its exact biological mechanism is unclear. Advancements in high-throughput technology provide new opportunities for omics research to understand the pathological process of various complex human diseases. However, single-omics technologies only provide limited insights into the biological mechanisms of diseases. DNA, RNA, protein, metabolites, and microorganisms usually play complementary roles and perform certain biological functions together. In this review, we summarize multi-omics methods based on the most relevant biomarkers in single-omics to better understand molecular functions and disease causes. The integration of multi-omics technologies can systematically reveal the interactions among aging molecules from a multidimensional perspective. Our review provides new insights regarding the discovery of aging biomarkers, mechanism of aging, and identification of novel antiaging targets. Overall, data from genomics, transcriptomics, proteomics, metabolomics, integromics, microbiomics, and systems biology contribute to the identification of new candidate biomarkers for aging and novel targets for antiaging interventions.
Affiliation(s)
- Lei Wu
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China; (J.M.); (X.C.)
- Xinqiang Xie
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- Tingting Liang
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China; (J.M.); (X.C.)
- Jun Ma
- School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China; (J.M.); (X.C.)
- Lingshuang Yang
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- Juan Yang
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China; (J.M.); (X.C.)
- Longyan Li
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- Yu Xi
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- Haixin Li
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- Jumei Zhang
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
- Xuefeng Chen
- School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an 710021, China; (J.M.); (X.C.)
- Yu Ding
- Department of Food Science and Technology, Institute of Food Safety and Nutrition, Jinan University, Guangzhou 510632, China
- Qingping Wu
- Guangdong Provincial Key Laboratory of Microbial Safety and Health, State Key Laboratory of Applied Microbiology Southern China, Institute of Microbiology, Guangdong Academy of Sciences, Guangzhou 510070, China; (L.W.); (X.X.); (T.L.); (L.Y.); (J.Y.); (L.L.); (Y.X.); (H.L.); (J.Z.)
7
Vijayakumar S, Angione C. Protocol for hybrid flux balance, statistical, and machine learning analysis of multi-omic data from the cyanobacterium Synechococcus sp. PCC 7002. STAR Protoc 2021; 2:100837. [PMID: 34632416 PMCID: PMC8488602 DOI: 10.1016/j.xpro.2021.100837] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/15/2022]
Abstract
Combining a computational framework for flux balance analysis with machine learning improves the accuracy of predicting metabolic activity across conditions, while enabling mechanistic interpretation. This protocol presents a guide to condition-specific metabolic modeling that integrates regularized flux balance analysis with machine learning approaches to extract key features from transcriptomic and fluxomic data. We demonstrate the protocol as applied to Synechococcus sp. PCC 7002; we also outline how it can be adapted to any species or community with available multi-omic data. For complete details on the use and execution of this protocol, please refer to Vijayakumar et al. (2020).
Affiliation(s)
- Supreeta Vijayakumar
- School of Computing, Engineering & Digital Technologies, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
- Claudio Angione
- School of Computing, Engineering & Digital Technologies, Teesside University, Middlesbrough, North Yorkshire TS1 3BX, UK
- Centre for Digital Innovation, Teesside University, Middlesbrough TS1 3BX, UK
- Healthcare Innovation Centre, Teesside University, Middlesbrough TS1 3BX, UK
8
Sahu A, Blätke MA, Szymański JJ, Töpfer N. Advances in flux balance analysis by integrating machine learning and mechanism-based models. Comput Struct Biotechnol J 2021; 19:4626-4640. [PMID: 34471504 PMCID: PMC8382995 DOI: 10.1016/j.csbj.2021.08.004] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 05/15/2021] [Revised: 08/03/2021] [Accepted: 08/03/2021] [Indexed: 02/08/2023]
Abstract
The availability of multi-omics data sets and genome-scale metabolic models for various organisms provide a platform for modeling and analyzing genotype-to-phenotype relationships. Flux balance analysis is the main tool for predicting flux distributions in genome-scale metabolic models and various data-integrative approaches enable modeling context-specific network behavior. Due to its linear nature, this optimization framework is readily scalable to multi-tissue or -organ and even multi-organism models. However, both data and model size can hamper a straightforward biological interpretation of the estimated fluxes. Moreover, flux balance analysis simulates metabolism at steady-state and thus, in its most basic form, does not consider kinetics or regulatory events. The integration of flux balance analysis with complementary data analysis and modeling techniques offers the potential to overcome these challenges. In particular machine learning approaches have emerged as the tool of choice for data reduction and selection of most important variables in big data sets. Kinetic models and formal languages can be used to simulate dynamic behavior. This review article provides an overview of integrative studies that combine flux balance analysis with machine learning approaches, kinetic models, such as physiology-based pharmacokinetic models, and formal graphical modeling languages, such as Petri nets. We discuss the mathematical aspects and biological applications of these integrated approaches and outline challenges and future perspectives.
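For orientation, the linear optimization framework this abstract refers to is commonly written as the linear program below. The notation ($S$, $v$, $c$, $v^{\min}$, $v^{\max}$) is a conventional choice assumed here for illustration and is not taken from the article itself.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S\,v = 0, \\
& v^{\min}_{j} \le v_{j} \le v^{\max}_{j}, \qquad j = 1,\dots,n
\end{aligned}
```

Here $S$ is the $m \times n$ stoichiometric matrix (metabolites by reactions), $v$ is the vector of reaction fluxes, and $c$ weights the objective, for example a biomass reaction. The constraint $S\,v = 0$ encodes the steady-state assumption mentioned in the abstract, and because both the objective and the constraints are linear, the problem remains tractable for large multi-tissue or multi-organism networks.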
Affiliation(s)
- Ankur Sahu
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
- Mary-Ann Blätke
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
- Jędrzej Jakub Szymański
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
- Nadine Töpfer
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Corrensstraße 3, 06466 Gatersleben, Germany
9
Magazzù G, Zampieri G, Angione C. Multimodal regularised linear models with flux balance analysis for mechanistic integration of omics data. Bioinformatics 2021; 37:3546-3552. [PMID: 33974036 DOI: 10.1093/bioinformatics/btab324] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/16/2020] [Revised: 01/06/2021] [Accepted: 04/27/2021] [Indexed: 12/13/2022]
Abstract
MOTIVATION High-throughput biological data, thanks to technological advances, have become cheaper to collect, leading to the availability of vast amounts of omic data of different types. In parallel, the in silico reconstruction and modelling of metabolic systems is now acknowledged as a key tool to complement experimental data on a large scale. The integration of these model- and data-driven information is therefore emerging as a new challenge in systems biology, with no clear guidance on how to better take advantage of the inherent multi-source and multi-omic nature of these data types while preserving mechanistic interpretation. RESULTS Here we investigate different regularisation techniques for high-dimensional data derived from the integration of gene expression profiles with metabolic flux data, extracted from strain-specific metabolic models, to improve cellular growth rate predictions. To this end, we propose ad-hoc extensions of previous regularisation frameworks including group, view-specific and principal component regularisation, and experimentally compare them using data from 1,143 Saccharomyces cerevisiae strains. We observe a divergence between methods in terms of regression accuracy and integration effectiveness based on the type of regularisation employed. In multi-omic regression tasks, when learning from experimental and model-generated omic data, our results demonstrate the competitiveness and ease of interpretation of multimodal regularised linear models compared to data-hungry methods based on neural networks. AVAILABILITY All data, models, and code produced in this work are available on GitHub at https://github.com/Angione-Lab/HybridGroupIPFLasso_pc2Lasso. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Giuseppe Magazzù
- School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
- Guido Zampieri
- School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
- Department of Biology, University of Padova, Padova, Italy
- Claudio Angione
- School of Computing, Engineering and Digital Technologies, Teesside University, Middlesbrough, UK
- Healthcare Innovation Centre, Teesside University, Middlesbrough, UK
- Centre for Digital Innovation, Teesside University, Middlesbrough, UK
10
Cakmak A, Celik MH. Personalized Metabolic Analysis of Diseases. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:1014-1025. [PMID: 32750887 DOI: 10.1109/tcbb.2020.3008196] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
The metabolic wiring of patient cells is altered drastically in many diseases, including cancer. Understanding the nature of such changes may pave the way for new therapeutic opportunities as well as the development of personalized treatment strategies for patients. In this paper, we propose an algorithm called Metabolitics, which allows systems-level analysis of changes in the biochemical network of cells in disease states. It enables the study of a disease at both reaction- and pathway-level granularities for a detailed and summarized view of disease etiology. Metabolitics employs flux variability analysis with a dynamically built objective function based on biofluid metabolomics measurements in a personalized manner. Moreover, Metabolitics builds supervised classification models to discriminate between patients and healthy subjects based on the computed metabolic network changes. The use of Metabolitics is demonstrated for three distinct diseases, namely, breast cancer, Crohn's disease, and colorectal cancer. Our results show that the constructed supervised learning models successfully differentiate patients from healthy individuals by an average f1-score of 88 percent. Besides, in addition to the confirmation of previously reported breast cancer-associated pathways, we discovered that Biotin Metabolism along with Arginine and Proline Metabolism is subject to a significant increase in flux capacity, which have not been reported before.
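For readers unfamiliar with the metric quoted above, the F1-score is the harmonic mean of precision and recall. The sketch below is a minimal illustration only: the function name and the confusion-matrix counts are hypothetical and are not data from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score, the harmonic mean of precision and recall.

    tp, fp and fn are the counts of true positives, false positives
    and false negatives for the "patient" class.
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen for illustration; not taken from the paper.
score = f1_score(tp=44, fp=6, fn=6)
print(round(score, 2))
```

Because the F1-score ignores true negatives, it is often preferred over plain accuracy when the patient and healthy groups are of unequal size.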
11
Sun X, Zhao B, Qu H, Chen S, Hao X, Chen S, Qin Z, Chen G, Fan Y. Sera and lungs metabonomics reveals key metabolites of resveratrol protecting against PAH in rats. Biomed Pharmacother 2021; 133:110910. [PMID: 33378990 DOI: 10.1016/j.biopha.2020.110910] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/27/2020] [Revised: 10/12/2020] [Accepted: 10/18/2020] [Indexed: 01/13/2023]
Abstract
Pulmonary arterial hypertension (PAH) is a disease with high morbidity and mortality. Currently, the intrinsic metabolic alterations and potential mechanisms of PAH are still not fully understood. Previously, we found that the polyphenol resveratrol (Rev) reversed the remodeling of the pulmonary vasculature and decreased the number of mitochondria in pulmonary arterial smooth muscle cells (PASMCs) (Lei Yu et al. (2017)). However, the potential effects of Rev on the altered metabolites derived from lung tissue and serum have not been fully elucidated. Thus, we conducted a systematic investigation using metabonomics. Various metabolites in different pathways, including amino acid metabolism, the tricarboxylic acid (TCA) cycle, acetylcholine metabolism, and fatty acid metabolism and biosynthesis, were explored in the sera and lung tissues of male Wistar rats in three groups (normal group, PAH group, and PAH with Rev treatment group). We found that leucine and isoleucine degradation; valine, leucine and isoleucine biosynthesis; tryptophan metabolism; and aminoacyl-tRNA biosynthesis were involved in the development of PAH. Hydroxyphenyllactic acid, isopalmitic acid, and cytosine might be key metabolites. Further work in this area may inform personalized treatment approaches in the clinical practice of PAH by elucidating pathophysiological mechanisms through experimental verification.
Affiliation(s)
- Xiangju Sun
- Department of Pharmacy, Fourth Affiliated Hospital, Harbin Medical University, Harbin, 150001, China
- Baoshan Zhao
- College of Basic Medical Sciences, Harbin Medical University, Daqing, 163319, China
- Huichong Qu
- College of Pharmacy, Harbin Medical University, Daqing, 163319, China
- Shuo Chen
- College of Pharmacy, Harbin Medical University, Daqing, 163319, China
- Xuewei Hao
- Inspection Institute, Harbin Medical University, Daqing, Heilongjiang Province, 163319, China
- Siyue Chen
- College of Pharmacy, Harbin Medical University, Daqing, 163319, China
- Zhuwen Qin
- College of Pharmacy, Harbin Medical University, Daqing, 163319, China
- Guoyou Chen
- College of Pharmacy, Harbin Medical University, Daqing, 163319, China.
- Yuhua Fan
- College of Basic Medical Sciences, Harbin Medical University, Daqing, 163319, China.
12
Suthers PF, Foster CJ, Sarkar D, Wang L, Maranas CD. Recent advances in constraint and machine learning-based metabolic modeling by leveraging stoichiometric balances, thermodynamic feasibility and kinetic law formalisms. Metab Eng 2020; 63:13-33. [PMID: 33310118 DOI: 10.1016/j.ymben.2020.11.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/21/2020] [Revised: 11/13/2020] [Accepted: 11/27/2020] [Indexed: 12/16/2022]
Abstract
Understanding the governing principles behind organisms' metabolism and growth underpins their effective deployment as bioproduction chassis. A central objective of metabolic modeling is predicting how metabolism and growth are affected by both external environmental factors and internal genotypic perturbations. The fundamental concepts of reaction stoichiometry, thermodynamics, and mass action kinetics have emerged as the foundational principles of many modeling frameworks designed to describe how and why organisms allocate resources towards both growth and bioproduction. This review focuses on the latest algorithmic advancements that have integrated these foundational principles into increasingly sophisticated quantitative frameworks.
Affiliation(s)
- Patrick F Suthers
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, The Pennsylvania State University, University Park, PA, USA
| | - Charles J Foster
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Debolina Sarkar
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Lin Wang
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Costas D Maranas
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA; DOE Center for Advanced Bioenergy and Bioproducts Innovation, The Pennsylvania State University, University Park, PA, USA.
|
13
|
Antonakoudis A, Barbosa R, Kotidis P, Kontoravdi C. The era of big data: Genome-scale modelling meets machine learning. Comput Struct Biotechnol J 2020; 18:3287-3300. [PMID: 33240470 PMCID: PMC7663219 DOI: 10.1016/j.csbj.2020.10.011] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 08/01/2020] [Revised: 10/07/2020] [Accepted: 10/08/2020] [Indexed: 12/15/2022]
Abstract
With omics data being generated at an unprecedented rate, genome-scale modelling has become pivotal in its organisation and analysis. However, machine learning methods have been gaining ground in cases where knowledge is insufficient to represent the mechanisms underlying such data or as a means for data curation prior to attempting mechanistic modelling. We discuss the latest advances in genome-scale modelling and the development of optimisation algorithms for network and error reduction, intracellular constraining and applications to strain design. We further review applications of supervised and unsupervised machine learning methods to omics datasets from microbial and mammalian cell systems and present efforts to harness the potential of both modelling approaches through hybrid modelling.
Affiliation(s)
- Cleo Kontoravdi
- Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
|
14
|
Cabbia A, Hilbers PA, van Riel NA. A Distance-Based Framework for the Characterization of Metabolic Heterogeneity in Large Sets of Genome-Scale Metabolic Models. Patterns (N Y) 2020; 1:100080. [PMID: 33205127 PMCID: PMC7660451 DOI: 10.1016/j.patter.2020.100080] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/25/2020] [Revised: 05/29/2020] [Accepted: 07/03/2020] [Indexed: 12/17/2022]
Abstract
Gene expression and protein abundance data of cells or tissues belonging to healthy and diseased individuals can be integrated and mapped onto genome-scale metabolic networks to produce patient-derived models. As the number of available and newly developed genome-scale metabolic models increases, new methods are needed to objectively analyze large sets of models and to identify the determinants of metabolic heterogeneity. We developed a distance-based workflow that combines consensus machine learning and metabolic modeling techniques and used it to apply pattern recognition algorithms to collections of genome-scale metabolic models, both microbial and human. Model composition, network topology and flux distribution provide complementary aspects of metabolic heterogeneity in patient-specific genome-scale models of skeletal muscle. Using consensus clustering analysis we identified the metabolic processes involved in the individual responses to endurance training in older adults.
Affiliation(s)
- Andrea Cabbia
- Computational Biology, Eindhoven University of Technology, Groene Loper 5, 5612 AE Eindhoven, the Netherlands
| | - Peter A.J. Hilbers
- Computational Biology, Eindhoven University of Technology, Groene Loper 5, 5612 AE Eindhoven, the Netherlands
| | - Natal A.W. van Riel
- Computational Biology, Eindhoven University of Technology, Groene Loper 5, 5612 AE Eindhoven, the Netherlands
- Amsterdam University Medical Centers, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, the Netherlands
|
15
|
A mechanism-aware and multiomic machine-learning pipeline characterizes yeast cell growth. Proc Natl Acad Sci U S A 2020; 117:18869-18879. [PMID: 32675233 DOI: 10.1073/pnas.2002959117] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Indexed: 11/18/2022]
Abstract
Metabolic modeling and machine learning are key components in the emerging next generation of systems and synthetic biology tools, targeting the genotype-phenotype-environment relationship. Rather than being used in isolation, it is becoming clear that their value is maximized when they are combined. However, the potential of integrating these two frameworks for omic data augmentation and integration is largely unexplored. We propose, rigorously assess, and compare machine-learning-based data integration techniques, combining gene expression profiles with computationally generated metabolic flux data to predict yeast cell growth. To this end, we create strain-specific metabolic models for 1,143 Saccharomyces cerevisiae mutants and we test 27 machine-learning methods, incorporating state-of-the-art feature selection and multiview learning approaches. We propose a multiview neural network using fluxomic and transcriptomic data, showing that the former increases the predictive accuracy of the latter and reveals functional patterns that are not directly deducible from gene expression alone. We test the proposed neural network on a further 86 strains generated in a different experiment, therefore verifying its robustness to an additional independent dataset. Finally, we show that introducing mechanistic flux features improves the predictions also for knockout strains whose genes were not modeled in the metabolic reconstruction. Our results thus demonstrate that fusing experimental cues with in silico models, based on known biochemistry, can contribute with disjoint information toward biologically informed and interpretable machine learning. Overall, this study provides tools for understanding and manipulating complex phenotypes, increasing both the prediction accuracy and the extent of discernible mechanistic biological insights.
|
16
|
Solovev I, Shaposhnikov M, Moskalev A. Multi-omics approaches to human biological age estimation. Mech Ageing Dev 2020; 185:111192. [DOI: 10.1016/j.mad.2019.111192] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 02/18/2019] [Revised: 11/07/2019] [Accepted: 11/25/2019] [Indexed: 01/01/2023]
|
17
|
Zampieri G, Vijayakumar S, Yaneske E, Angione C. Machine and deep learning meet genome-scale metabolic modeling. PLoS Comput Biol 2019; 15:e1007084. [PMID: 31295267 PMCID: PMC6622478 DOI: 10.1371/journal.pcbi.1007084] [Citation(s) in RCA: 150] [Impact Index Per Article: 30.0] [Indexed: 12/25/2022]
Abstract
Omic data analysis is steadily growing as a driver of basic and applied molecular biology research. Core to the interpretation of complex and heterogeneous biological phenotypes are computational approaches in the fields of statistics and machine learning. In parallel, constraint-based metabolic modeling has established itself as the main tool to investigate large-scale relationships between genotype, phenotype, and environment. The development and application of these methodological frameworks have occurred independently for the most part, whereas the potential of their integration for biological, biomedical, and biotechnological research is less known. Here, we describe how machine learning and constraint-based modeling can be combined, reviewing recent works at the intersection of both domains and discussing the mathematical and practical aspects involved. We overlap systematic classifications from both frameworks, making them accessible to nonexperts. Finally, we delineate potential future scenarios, propose new joint theoretical frameworks, and suggest concrete points of investigation for this joint subfield. A multiview approach merging experimental and knowledge-driven omic data through machine learning methods can incorporate key mechanistic information in an otherwise biologically-agnostic learning process.
Affiliation(s)
- Guido Zampieri
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
| | - Supreeta Vijayakumar
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
| | - Elisabeth Yaneske
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
| | - Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, Middlesbrough, United Kingdom
- Healthcare Innovation Centre, Teesside University, Middlesbrough, United Kingdom
|
18
|
Hornung R, Wright MN. Block Forests: random forests for blocks of clinical and omics covariate data. BMC Bioinformatics 2019; 20:358. [PMID: 31248362 PMCID: PMC6598279 DOI: 10.1186/s12859-019-2942-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/15/2019] [Accepted: 06/07/2019] [Indexed: 12/25/2022]
Abstract
Background: In the last years more and more multi-omics data are becoming available, that is, data featuring measurements of several types of omics data for each patient. Using multi-omics data as covariate data in outcome prediction is both promising and challenging due to the complex structure of such data. Random forest is a prediction method known for its ability to render complex dependency patterns between the outcome and the covariates. Against this background we developed five candidate random forest variants tailored to multi-omics covariate data. These variants modify the split point selection of random forest to incorporate the block structure of multi-omics data and can be applied to any outcome type for which a random forest variant exists, such as categorical, continuous and survival outcomes. Using 20 publicly available multi-omics data sets with survival outcome we compared the prediction performances of the block forest variants with alternatives. We also considered the common special case of having clinical covariates and measurements of a single omics data type available. Results: We identify one variant termed “block forest” that outperformed all other approaches in the comparison study. In particular, it performed significantly better than standard random survival forest (adjusted p-value: 0.027). The two best performing variants have in common that the block choice is randomized in the split point selection procedure. In the case of having clinical covariates and a single omics data type available, the improvements of the variants over random survival forest were larger than in the case of the multi-omics data. The degrees of improvements over random survival forest varied strongly across data sets. Moreover, considering all clinical covariates mandatorily improved the performance. This result should however be interpreted with caution, because the level of predictive information contained in clinical covariates depends on the specific application. Conclusions: The new prediction method block forest for multi-omics data can significantly improve the prediction performance of random forest and outperformed alternatives in the comparison. Block forest is particularly effective for the special case of using clinical covariates in combination with measurements of a single omics data type.
Affiliation(s)
- Roman Hornung
- Institute for Medical Information Processing, Biometry and Epidemiology, University of Munich, Marchioninistr. 15, Munich, 81377, Germany.
| | - Marvin N Wright
- Leibniz Institute for Prevention Research and Epidemiology - BIPS, Achterstr. 30, Bremen, 28359, Germany
- Section of Biostatistics, Department of Public Health, University of Copenhagen, Øster Farimagsgade 5, Copenhagen, 1014, Denmark
|
19
|
Human Systems Biology and Metabolic Modelling: A Review-From Disease Metabolism to Precision Medicine. Biomed Res Int 2019; 2019:8304260. [PMID: 31281846 PMCID: PMC6590590 DOI: 10.1155/2019/8304260] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 11/01/2018] [Revised: 02/07/2019] [Accepted: 05/20/2019] [Indexed: 01/06/2023]
Abstract
In cell and molecular biology, metabolism is the only system that can be fully simulated at genome scale. Metabolic systems biology offers powerful abstraction tools to simulate all known metabolic reactions in a cell, therefore providing a snapshot that is close to its observable phenotype. In this review, we cover the 15 years of human metabolic modelling. We show that, although the past five years have not experienced large improvements in the size of the gene and metabolite sets in human metabolic models, their accuracy is rapidly increasing. We also describe how condition-, tissue-, and patient-specific metabolic models shed light on cell-specific changes occurring in the metabolic network, therefore predicting biomarkers of disease metabolism. We finally discuss current challenges and future promising directions for this research field, including machine/deep learning and precision medicine. In the omics era, profiling patients and biological processes from a multiomic point of view is becoming more common and less expensive. Starting from multiomic data collected from patients and N-of-1 trials where individual patients constitute different case studies, methods for model-building and data integration are being used to generate patient-specific models. Coupled with state-of-the-art machine learning methods, this will allow characterizing each patient's disease phenotype and delivering precision medicine solutions, therefore leading to preventative medicine, reduced treatment, and in silico clinical trials.
|