1. Gonçalves DM, Henriques R, Costa RS. Predicting metabolic fluxes from omics data via machine learning: Moving from knowledge-driven towards data-driven approaches. Comput Struct Biotechnol J 2023; 21:4960-4973. [PMID: 37876626; PMCID: PMC10590844; DOI: 10.1016/j.csbj.2023.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 10/01/2023] [Accepted: 10/01/2023] [Indexed: 10/26/2023] Open
Abstract
The accurate prediction of phenotypes in microorganisms is a major challenge for systems biology. Genome-scale models (GEMs) are a widely used mathematical formalism for predicting metabolic fluxes using constraint-based modeling methods such as flux balance analysis (FBA). However, they require prior knowledge of the metabolic network of an organism and appropriate objective functions, which often hampers the prediction of metabolic fluxes under different conditions. Moreover, the integration of omics data to improve the accuracy of phenotype predictions in different physiological states is still in its infancy. Here, we present a novel approach for predicting fluxes under various conditions. We explore the use of supervised machine learning (ML) models using transcriptomics and/or proteomics data and compare their performance against the standard parsimonious FBA (pFBA) approach, using case studies of Escherichia coli as an example. Our results show that the proposed omics-based ML approach shows promise for predicting both internal and external metabolic fluxes, with smaller prediction errors than the pFBA approach. The code, data, and detailed results are available at the project's repository[1].
Affiliation(s)
- Daniel M. Gonçalves
  - INESC-ID, Rua Alves Redol, 9, Lisbon, 1000-029, Portugal
  - Instituto Superior Técnico, Av. Rovisco Pais, 1, Lisbon, 1049-001, Portugal
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, 2829-516, Portugal
- Rui Henriques
  - INESC-ID, Rua Alves Redol, 9, Lisbon, 1000-029, Portugal
  - Instituto Superior Técnico, Av. Rovisco Pais, 1, Lisbon, 1049-001, Portugal
- Rafael S. Costa
  - LAQV-REQUIMTE, Department of Chemistry, NOVA School of Science and Technology, Universidade NOVA de Lisboa, Caparica, 2829-516, Portugal

2. Zhou Y, Gao J. A Novel Online Nomogram Established with Five Features before Surgical Resection for Predicating Prognosis of Neuroblastoma Children: A Population-Based Study. Technol Cancer Res Treat 2023; 22:15330338221145141. [PMID: 36604997; PMCID: PMC9829992; DOI: 10.1177/15330338221145141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023] Open
Abstract
Background: Neuroblastoma (NB) is the most common childhood cancer, but doctors are unable to predict its overall survival (OS) rate before surgery. We aimed to predict the OS of NB children with some clinical features obtained from biopsy before surgery. Methods: Clinical features of NB children were retrospectively collected from the Therapeutically Applicable Research to Generate Effective Treatments database. The C-index, area under the receiver operating characteristic curve (AUC), calibration curves, and decision curve analysis were used to evaluate nomogram models. Results: A total of 488 NB children were evaluated, and the Boruta algorithm was used to detect risk factors. The results showed that artificial neural networks with selected features were able to predict more than 90% of NB children. Five risk factors were used in the construction of the nomogram: age at diagnosis, MYCN status, ploidy value, histology, and mitosis-karyorrhexis index (MKI). The C-index of the nomogram in the training cohort and validation cohort was 0.716 and 0.731, respectively. AUC values for 1-, 3-, and 5-year OS predictions were 0.706, 0.755, and 0.762, respectively, and showed good calibration. Decision curve analysis indicated better predictability with the nomogram model based on Cox regression compared with models that included all variables or histology only. Also, the Kaplan-Meier curves showed a significantly higher survival probability in the low-risk group (total score < 118.34) versus the high-risk group (total score ≥ 118.34) (p < 0.05) using the nomogram model. Conclusions: A web application based on the nomogram model in the present study can be accessed at https://mdzhou.shinyapps.io/DynNomapp/, which could help doctors make accurate clinical decisions about NB children.
Affiliation(s)
- Yu Zhou
  - Department of Child Rehabilitation Division, Huai’an Maternal and Child Health Care Center, Huai’an, China
  - Affiliated Hospital of Yang Zhou University Medical College Huai’an Maternal and Child Health Care Center, Huai’an, China
- Jing Gao
  - Department of Child Rehabilitation Division, Huai’an Maternal and Child Health Care Center, Huai’an, China
  - Affiliated Hospital of Yang Zhou University Medical College Huai’an Maternal and Child Health Care Center, Huai’an, China
  - Jing Gao, Department of Child Rehabilitation Division, Huai’an Maternal and Child Health Care Center, Huai’an 223002, China.

3. Lo-Thong-Viramoutou O, Charton P, Cadet XF, Grondin-Perez B, Saavedra E, Damour C, Cadet F. Non-linearity of Metabolic Pathways Critically Influences the Choice of Machine Learning Model. Front Artif Intell 2022; 5:744755. [PMID: 35757298; PMCID: PMC9226554; DOI: 10.3389/frai.2022.744755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Accepted: 04/29/2022] [Indexed: 11/13/2022] Open
Abstract
The use of machine learning (ML) in life sciences has gained wide interest over the past years, as it speeds up the development of high-performing models. Important modeling tools in biology, such as mechanistic models and metabolic networks, have proven their worth for pathway design, as they allow better understanding of the mechanisms involved in the functioning of organisms. However, little has been done on the use of ML to model metabolic pathways, and the degree of non-linearity associated with them is not clear. Here, we report the construction of different metabolic pathways with several linear and non-linear ML models. Different types of data are used; they lead to the prediction of important biological data, such as pathway flux and final product concentration. A comparison reveals that the data features impact model performance and highlights the effectiveness of non-linear models (e.g., QRF: RMSE = 0.021 nmol·min⁻¹ and R² = 1 vs. Bayesian GLM: RMSE = 1.379 nmol·min⁻¹ and R² = 0.823). It turns out that the greater the degree of non-linearity of the pathway, the better suited a non-linear model will be. Therefore, a decision-making support for pathway modeling is established. These findings generally support the hypothesis that non-linear aspects predominate within the metabolic pathways. This must be taken into account when devising possible applications of these pathways for the identification of biomarkers of diseases (e.g., infections, cancer, neurodegenerative diseases) or the optimization of industrial production processes.
Affiliation(s)
- Ophélie Lo-Thong-Viramoutou
  - University of Paris, BIGR—Biologie Intégrée du Globule Rouge, Inserm, UMR_S1134, Paris, France
  - Laboratory of Excellence GR-Ex, Paris, France
  - Laboratory DSIMB, UMR_S1134, BIGR, Inserm, Faculty of Sciences and Technology, University of La Reunion, Saint-Denis, France
- Philippe Charton
  - University of Paris, BIGR—Biologie Intégrée du Globule Rouge, Inserm, UMR_S1134, Paris, France
  - Laboratory of Excellence GR-Ex, Paris, France
  - Laboratory DSIMB, UMR_S1134, BIGR, Inserm, Faculty of Sciences and Technology, University of La Reunion, Saint-Denis, France
- Brigitte Grondin-Perez
  - EnergyLab, EA 4079, Faculty of Sciences and Technology, University of La Reunion, Saint-Denis, France
- Emma Saavedra
  - Departamento de Bioquímica, Instituto Nacional de Cardiología Ignacio Chávez, Mexico City, Mexico
- Cédric Damour
  - EnergyLab, EA 4079, Faculty of Sciences and Technology, University of La Reunion, Saint-Denis, France
- Frédéric Cadet
  - University of Paris, BIGR—Biologie Intégrée du Globule Rouge, Inserm, UMR_S1134, Paris, France
  - Laboratory of Excellence GR-Ex, Paris, France
  - Laboratory DSIMB, UMR_S1134, BIGR, Inserm, Faculty of Sciences and Technology, University of La Reunion, Saint-Denis, France

4. Dong H, Wang X. Identification of Signature Genes and Construction of an Artificial Neural Network Model of Prostate Cancer. J Healthc Eng 2022; 2022:1562511. [PMID: 35432828; PMCID: PMC9010146; DOI: 10.1155/2022/1562511] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/01/2022] [Revised: 03/21/2022] [Accepted: 03/23/2022] [Indexed: 11/22/2022]
Abstract
This study aimed to establish an artificial neural network (ANN) model based on prostate cancer signature genes (PCaSGs) to identify patients with prostate cancer (PCa). In the present study, 270 differentially expressed genes (DEGs) were identified between PCa and normal prostate (NP) groups by differential gene expression analysis. Next, we performed Metascape gene annotation, pathway and process enrichment analysis, and PPI enrichment analysis on all 270 DEGs. Then, we screened out 30 PCaSGs based on random forest analysis and constructed an ANN model based on the gene score matrix consisting of these 30 PCaSGs. Lastly, analysis of microarray dataset GSE46602 showed that the accuracy of this model for predicting PCa and NP samples was 88.9% and 78.6%, respectively. Our results suggest that the ANN model based on PCaSGs can be used to effectively identify patients with PCa and will be helpful for early PCa diagnosis and treatment.
Affiliation(s)
- Hongye Dong
  - Department of Kidney Disease and Blood Purification Center, The Second Hospital of Tianjin Medical University, Tianjin 300211, China
- Xu Wang
  - Department of Urology, Tianjin Institute of Urology, The Second Hospital of Tianjin Medical University, Tianjin 300211, China

5. Vijayakumar S, Magazzù G, Moon P, Occhipinti A, Angione C. A Practical Guide to Integrating Multimodal Machine Learning and Metabolic Modeling. Methods Mol Biol 2022; 2399:87-122. [PMID: 35604554; DOI: 10.1007/978-1-0716-1831-8_5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Complex, distributed, and dynamic sets of clinical biomedical data are collectively referred to as multimodal clinical data. To accommodate the volume and heterogeneity of such diverse data types and aid in their interpretation when they are combined with a multi-scale predictive model, machine learning is a useful tool that can be wielded to deconstruct biological complexity and extract relevant outputs. Additionally, genome-scale metabolic models (GSMMs) are one of the main frameworks striving to bridge the gap between genotype and phenotype by incorporating prior biological knowledge into mechanistic models. Consequently, the utilization of GSMMs as a foundation for the integration of multi-omic data originating from different domains is a valuable pursuit towards refining predictions. In this chapter, we show how cancer multi-omic data can be analyzed via multimodal machine learning and metabolic modeling. Firstly, we focus on the merits of adopting an integrative, systems-biology-led approach to biomedical data mining. Following this, we propose how constraint-based metabolic models can provide a stable yet adaptable foundation for the integration of multimodal data with machine learning. Finally, we provide a step-by-step tutorial for the combination of machine learning and GSMMs, which includes: (i) tissue-specific constraint-based modeling; (ii) survival analysis using time-to-event prediction for cancer; and (iii) classification and regression approaches for multimodal machine learning. The code associated with the tutorial can be found at https://github.com/Angione-Lab/Tutorials_Combining_ML_and_GSMM.
Affiliation(s)
- Supreeta Vijayakumar
  - Computational Systems Biology and Data Analytics Research Group, Teesside University, Middlesbrough, UK
- Giuseppe Magazzù
  - Computational Systems Biology and Data Analytics Research Group, Teesside University, Middlesbrough, UK
- Pradip Moon
  - Computational Systems Biology and Data Analytics Research Group, Teesside University, Middlesbrough, UK
- Annalisa Occhipinti
  - Computational Systems Biology and Data Analytics Research Group, Middlesbrough, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough, UK
- Claudio Angione
  - Computational Systems Biology and Data Analytics Research Group, Teesside University, Middlesbrough, UK
  - Centre for Digital Innovation, Teesside University, Middlesbrough, UK
  - Healthcare Innovation Centre, Teesside University, Middlesbrough, UK

6. Tinte MM, Chele KH, van der Hooft JJJ, Tugizimana F. Metabolomics-Guided Elucidation of Plant Abiotic Stress Responses in the 4IR Era: An Overview. Metabolites 2021; 11:445. [PMID: 34357339; PMCID: PMC8305945; DOI: 10.3390/metabo11070445] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/25/2021] [Revised: 06/30/2021] [Accepted: 07/03/2021] [Indexed: 12/27/2022] Open
Abstract
Plants are constantly challenged by changing environmental conditions that include abiotic stresses. These stresses limit their development and productivity and consequently threaten our food security, especially considering the pressure of the increasing global population. Thus, there is an urgent need for the next generation of crops with high productivity and resilience to climate change. The dawn of a new era characterized by the emergence of fourth industrial revolution (4IR) technologies has redefined the ideological boundaries of research and applications in plant sciences. Recent technological advances, machine learning (ML)-based computational tools, and omics data analysis approaches are allowing scientists to derive comprehensive metabolic descriptions and models for the target plant species under specific conditions. Such accurate metabolic descriptions are essential for devising a roadmap for the next generation of crops that are resilient to environmental deterioration. By synthesizing the recent literature and collating data on metabolomics studies of plant responses to abiotic stresses, in the context of the 4IR era, we point out the opportunities and challenges offered by omics science, analytical intelligence, computational tools and big data analytics. Specifically, we highlight technological advancements in (plant) metabolomics workflows and the use of machine learning and computational tools to decipher the dynamics in the chemical space that define plant responses to abiotic stress conditions.
Affiliation(s)
- Morena M. Tinte
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
- Kekeletso H. Chele
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
- Fidele Tugizimana
  - Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa
  - International Research and Development Division, Omnia Group, Ltd., Johannesburg 2021, South Africa

7. Illias HA, Lim MM, Abu Bakar AH, Mokhlis H, Ishak S, Amir MDM. Classification of abnormal location in medium voltage switchgears using hybrid gravitational search algorithm-artificial intelligence. PLoS One 2021; 16:e0253967. [PMID: 34197530; PMCID: PMC8248718; DOI: 10.1371/journal.pone.0253967] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/24/2021] [Accepted: 06/17/2021] [Indexed: 11/17/2022] Open
Abstract
In power system networks, automatic fault diagnosis techniques for switchgears that are highly accurate and less time-consuming are important. In this work, classification of abnormal location in switchgears is proposed using hybrid gravitational search algorithm (GSA)-artificial intelligence (AI) techniques. The measurement data were obtained from ultrasound, transient earth voltage, temperature and sound sensors. The AI classifiers used include artificial neural network (ANN) and support vector machine (SVM). The performance of both classifiers was optimized by an optimization technique, GSA. The advantages of applying GSA to the AI classifiers in classifying the abnormal location in switchgears are easy implementation, fast convergence and low computational cost. For performance comparison, several well-known metaheuristic techniques were also applied to the AI classifiers. In the comparison between ANN and SVM without optimization by GSA, SVM yields 2% higher accuracy than ANN. However, ANN yields slightly higher accuracy than SVM after combining with GSA, in the range of 97%-99% compared to 95%-97% for SVM. On the other hand, GSA-SVM converges faster than GSA-ANN. Overall, it was found that the combination of both AI classifiers with GSA yields better results than several well-known metaheuristic techniques.
Affiliation(s)
- Hazlee Azil Illias
  - Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
  - Centre of Advanced Manufacturing & Material Processing (AMMP Centre), Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
- Ming Ming Lim
  - Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
- Ab Halim Abu Bakar
  - UM Power Energy Dedicated Advanced Centre (UMPEDAC), Level 4, Wisma R&D UM, Universiti Malaya, Kuala Lumpur, Malaysia
- Hazlie Mokhlis
  - Department of Electrical Engineering, Faculty of Engineering, Universiti Malaya, Kuala Lumpur, Malaysia
- Sanuri Ishak
  - TNB Research Sdn. Bhd., No. 1, Kawasan Institusi Penyelidikan, Kajang, Selangor, Malaysia
- Mohd Dzaki Mohd Amir
  - TNB Research Sdn. Bhd., No. 1, Kawasan Institusi Penyelidikan, Kajang, Selangor, Malaysia

8. Lo-Thong O, Charton P, Cadet XF, Grondin-Perez B, Saavedra E, Damour C, Cadet F. Identification of flux checkpoints in a metabolic pathway through white-box, grey-box and black-box modeling approaches. Sci Rep 2020; 10:13446. [PMID: 32778715; PMCID: PMC7417601; DOI: 10.1038/s41598-020-70295-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/11/2019] [Accepted: 07/27/2020] [Indexed: 11/29/2022] Open
Abstract
Metabolic pathway modeling plays an increasing role in drug design by allowing better understanding of the underlying regulation and controlling networks in the metabolism of living organisms. However, despite rapid progress in this area, pathway modeling can become a real nightmare for researchers, notably when few experimental data are available or when the pathway is highly complex. Here, three different approaches were developed to model the second part of glycolysis of E. histolytica as an application example, and have succeeded in predicting the final pathway flux: one including detailed kinetic information (white-box), another with an added adjustment term (grey-box) and the last one using an artificial neural network method (black-box). Afterwards, each model was used for metabolic control analysis and flux control coefficient determination. The first two enzymes of this pathway are identified as the key enzymes playing a role in flux control. This study revealed the significance of the three methods for building suitable models adjusted to the available data in the field of metabolic pathway modeling, and could be useful to biologists and modelers.
Affiliation(s)
- Ophélie Lo-Thong
  - University of Paris, UMR_S1134, BIGR, Inserm, 75015, Paris, France
  - DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, 97715, Saint-Denis, France
- Philippe Charton
  - University of Paris, UMR_S1134, BIGR, Inserm, 75015, Paris, France
  - DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, 97715, Saint-Denis, France
- Xavier F Cadet
  - PEACCEL, Artificial Intelligence Department, 6 square Albin Cachot, box 42, 75013, Paris, France
- Brigitte Grondin-Perez
  - LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, 97444, St Denis cedex, France
- Emma Saavedra
  - Departamento de Bioquímica, Instituto Nacional de Cardiología Ignacio Chávez, 14080, Mexico City, Mexico
- Cédric Damour
  - LE2P, Laboratory of Energy, Electronics and Processes EA 4079, Faculty of Sciences and Technology, University of La Reunion, 97444, St Denis cedex, France
- Frédéric Cadet
  - University of Paris, UMR_S1134, BIGR, Inserm, 75015, Paris, France
  - DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, 97715, Saint-Denis, France

9. Damiani C, Gaglio D, Sacco E, Alberghina L, Vanoni M. Systems metabolomics: from metabolomic snapshots to design principles. Curr Opin Biotechnol 2020; 63:190-199. [PMID: 32278263; DOI: 10.1016/j.copbio.2020.02.013] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 01/20/2020] [Revised: 02/11/2020] [Accepted: 02/18/2020] [Indexed: 02/07/2023]
Abstract
Metabolomics is a rapidly expanding technology that finds increasing application in a variety of fields, from metabolic disorders to cancer, from nutrition and wellness to design and optimization of cell factories. The integration of metabolic snapshots with metabolic fluxes, physiological readouts, metabolic models, and knowledge-informed Artificial Intelligence tools is required to obtain a system-level understanding of metabolism. The emerging power of multi-omic approaches and the development of integrated experimental and computational tools, able to dissect metabolic features at cellular and subcellular resolution, provide unprecedented opportunities for understanding design principles of metabolic (dis)regulation and for the development of precision therapies in multifactorial diseases, such as cancer and neurodegenerative diseases.
Affiliation(s)
- Chiara Damiani
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
  - ISBE.IT, SYSBIO Centre of Systems Biology, Piazza della Scienza 2, Milan 20126, Italy
- Daniela Gaglio
  - ISBE.IT, SYSBIO Centre of Systems Biology, Piazza della Scienza 2, Milan 20126, Italy
  - Institute of Molecular Bioimaging and Physiology (IBFM), National Research Council (CNR), Segrate, Milan, Italy
- Elena Sacco
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
  - ISBE.IT, SYSBIO Centre of Systems Biology, Piazza della Scienza 2, Milan 20126, Italy
- Lilia Alberghina
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
  - ISBE.IT, SYSBIO Centre of Systems Biology, Piazza della Scienza 2, Milan 20126, Italy
- Marco Vanoni
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
  - ISBE.IT, SYSBIO Centre of Systems Biology, Piazza della Scienza 2, Milan 20126, Italy

10. A Machine Learning Approach for Efficient Selection of Enzyme Concentrations and Its Application for Flux Optimization. Catalysts 2020. [DOI: 10.3390/catal10030291] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/31/2023] Open
Abstract
The metabolic engineering of pathways has been used extensively to produce molecules of interest on an industrial scale. Methods like gene regulation or substrate channeling have helped to improve the desired product yield. Cell-free systems are used to overcome the weaknesses of engineered strains. One of the challenges in a cell-free system is selecting the optimized enzyme concentration for optimal yield. Here, a machine learning approach is used to select the enzyme concentration for the upper part of glycolysis. The artificial neural network (ANN) approach is known to be inefficient in extrapolating predictions outside the box: high predicted values will bump into a sort of “glass ceiling”. In order to explore this “glass ceiling” space, we developed a new methodology named glass ceiling ANN (GC-ANN). Principal component analysis (PCA) and data classification methods are used to derive a rule for a high flux, and ANN is used to predict the flux through the pathway using the input data of 121 balances of four enzymes in the upper part of glycolysis. The outcomes of this study are (i) in silico selection of optimum enzyme concentrations for a maximum flux through the pathway and (ii) experimental in vitro validation of the “out-of-the-box” fluxes predicted using this new approach. Surprisingly, flux improvements of up to 63% were obtained. Gratifyingly, these improvements are coupled with a cost decrease of up to 25% for the assay.