1
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India.
2
Noecker C, Turnbaugh PJ. Emerging tools and best practices for studying gut microbial community metabolism. Nat Metab 2024:10.1038/s42255-024-01074-z. [PMID: 38961185 DOI: 10.1038/s42255-024-01074-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 05/30/2024] [Indexed: 07/05/2024]
Abstract
The human gut microbiome vastly extends the set of metabolic reactions catalysed by our own cells, with far-reaching consequences for host health and disease. However, our knowledge of gut microbial metabolism relies on a handful of model organisms, limiting our ability to interpret and predict the metabolism of complex microbial communities. In this Perspective, we discuss emerging tools for analysing and modelling the metabolism of gut microorganisms and for linking microorganisms, pathways and metabolites at the ecosystem level, highlighting promising best practices for researchers. Continued progress in this area will also require infrastructure development to facilitate cross-disciplinary synthesis of scientific findings. Collectively, these efforts can enable a broader and deeper understanding of the workings of the gut ecosystem and open new possibilities for microbiome manipulation and therapy.
Affiliation(s)
- Cecilia Noecker
  - Department of Biological Sciences, Minnesota State University, Mankato, Mankato, MN, USA
  - Department of Microbiology & Immunology, University of California, San Francisco, San Francisco, CA, USA
- Peter J Turnbaugh
  - Department of Microbiology & Immunology, University of California, San Francisco, San Francisco, CA, USA.
  - Chan Zuckerberg Biohub-San Francisco, San Francisco, CA, USA.
3
Tarzi C, Zampieri G, Sullivan N, Angione C. Emerging methods for genome-scale metabolic modeling of microbial communities. Trends Endocrinol Metab 2024; 35:533-548. [PMID: 38575441 DOI: 10.1016/j.tem.2024.02.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/28/2024] [Accepted: 02/29/2024] [Indexed: 04/06/2024]
Abstract
Genome-scale metabolic models (GEMs) are consolidating as platforms for studying mixed microbial populations, by combining biological data and knowledge with mathematical rigor. However, deploying these models to answer research questions can be challenging due to the increasing number of available computational tools, the lack of universal standards, and their inherent limitations. Here, we present a comprehensive overview of foundational concepts for building and evaluating genome-scale models of microbial communities. We then compare tools in terms of requirements, capabilities, and applications. Next, we highlight the current pitfalls and open challenges to consider when adopting existing tools and developing new ones. Our compendium can be relevant for the expanding community of modelers, both at the entry and experienced levels.
Affiliation(s)
- Chaimaa Tarzi
  - School of Computing, Engineering and Digital Technologies, Teesside University, Southfield Rd, Middlesbrough, TS1 3BX, North Yorkshire, UK
- Guido Zampieri
  - Department of Biology, University of Padova, Padova, 35122, Veneto, Italy
- Neil Sullivan
  - Complement Genomics Ltd, Station Rd, Lanchester, Durham, DH7 0EX, County Durham, UK
- Claudio Angione
  - School of Computing, Engineering and Digital Technologies, Teesside University, Southfield Rd, Middlesbrough, TS1 3BX, North Yorkshire, UK; Centre for Digital Innovation, Teesside University, Southfield Rd, Middlesbrough, TS1 3BX, North Yorkshire, UK; National Horizons Centre, Teesside University, 38 John Dixon Ln, Darlington, DL1 1HG, North Yorkshire, UK.
4
Goshisht MK. Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges. ACS OMEGA 2024; 9:9921-9945. [PMID: 38463314 PMCID: PMC10918679 DOI: 10.1021/acsomega.3c05913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 01/19/2024] [Accepted: 01/30/2024] [Indexed: 03/12/2024]
Abstract
Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being probed frequently with time. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising unrivaled experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, best experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe different challenges causing hurdles in the progress of ML/DL and synthetic biology along with their solutions.
Affiliation(s)
- Manoj Kumar Goshisht
  - Department of Chemistry, Natural and Applied Sciences, University of Wisconsin-Green Bay, Green Bay, Wisconsin 54311-7001, United States
5
Tubergen PJ, Medlock G, Moore A, Zhang X, Papin JA, Danna CH. A computational model of Pseudomonas syringae metabolism unveils a role for branched-chain amino acids in Arabidopsis leaf colonization. PLoS Comput Biol 2023; 19:e1011651. [PMID: 38150474 PMCID: PMC10775980 DOI: 10.1371/journal.pcbi.1011651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 01/09/2024] [Accepted: 11/02/2023] [Indexed: 12/29/2023] Open
Abstract
Bacterial pathogens adapt their metabolism to the plant environment to successfully colonize their hosts. In our efforts to uncover the metabolic pathways that contribute to the colonization of Arabidopsis thaliana leaves by Pseudomonas syringae pv tomato DC3000 (Pst DC3000), we created iPst19, an ensemble of 100 genome-scale network reconstructions of Pst DC3000 metabolism. We developed a novel approach for gene essentiality screens, leveraging the predictive power of iPst19 to identify core and ancillary condition-specific essential genes. Constraining the metabolic flux of iPst19 with Pst DC3000 gene expression data obtained from naïve-infected or pre-immunized-infected plants revealed changes in bacterial metabolism imposed by plant immunity. Machine learning analysis revealed that among other amino acids, branched-chain amino acids (BCAAs) metabolism significantly contributed to the overall metabolic status of each gene-expression-contextualized iPst19 simulation. These predictions were tested and confirmed experimentally. Pst DC3000 growth and gene expression analysis showed that BCAAs suppress virulence gene expression in vitro without affecting bacterial growth. In planta, however, an excess of BCAAs suppresses the expression of virulence genes at the early stages of infection and significantly impairs the colonization of Arabidopsis leaves. Our findings suggest that BCAA catabolism is necessary to express virulence and colonize the host. Overall, this study provides valuable insights into how plant immunity impacts Pst DC3000 metabolism, and how bacterial metabolism impacts the expression of virulence.
Affiliation(s)
- Philip J. Tubergen
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Greg Medlock
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
- Anni Moore
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
- Xiaomu Zhang
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
- Jason A. Papin
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
- Cristian H. Danna
  - Department of Biology, University of Virginia, Charlottesville, Virginia, United States of America
6
Procopio A, Cesarelli G, Donisi L, Merola A, Amato F, Cosentino C. Combined mechanistic modeling and machine-learning approaches in systems biology - A systematic literature review. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2023; 240:107681. [PMID: 37385142 DOI: 10.1016/j.cmpb.2023.107681] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/17/2023] [Revised: 06/14/2023] [Accepted: 06/14/2023] [Indexed: 07/01/2023]
Abstract
BACKGROUND AND OBJECTIVE: Mechanistic-based Model simulations (MM) are an effective approach commonly employed, for research and learning purposes, to better investigate and understand the inherent behavior of biological systems. Recent advancements in modern technologies and the large availability of omics data allowed the application of Machine Learning (ML) techniques to different research fields, including systems biology. However, the availability of information regarding the analyzed biological context, sufficient experimental data, as well as the degree of computational complexity, represent some of the issues that both MMs and ML techniques could present individually. For this reason, recently, several studies suggest overcoming or significantly reducing these drawbacks by combining the above-mentioned two methods. In the wake of the growing interest in this hybrid analysis approach, with the present review, we want to systematically investigate the studies available in the scientific literature in which both MMs and ML have been combined to explain biological processes at genomics, proteomics, and metabolomics levels, or the behavior of entire cellular populations.
METHODS: Elsevier Scopus®, Clarivate Web of Science™ and National Library of Medicine PubMed® databases were enquired using the queries reported in Table 1, resulting in 350 scientific articles.
RESULTS: Only 14 of the 350 documents returned by the comprehensive search conducted on the three major online databases met our search criteria, i.e. present a hybrid approach consisting of the synergistic combination of MMs and ML to treat a particular aspect of systems biology.
CONCLUSIONS: Despite the recent interest in this methodology, from a careful analysis of the selected papers, it emerged how examples of integration between MMs and ML are already present in systems biology, highlighting the great potential of this hybrid approach at both micro and macro biological scales.
Affiliation(s)
- Anna Procopio
  - Department of Experimental and Clinical Medicine, Università degli Studi Magna Græcia, Catanzaro, 88100, Italy
- Giuseppe Cesarelli
  - Department of Electrical Engineering and Information Technology, Università degli Studi di Napoli Federico II, Napoli, 80125, Italy
- Leandro Donisi
  - Department of Advanced Medical and Surgical Sciences, Università della Campania Luigi Vanvitelli, Napoli, 80138, Italy
- Alessio Merola
  - Department of Experimental and Clinical Medicine, Università degli Studi Magna Græcia, Catanzaro, 88100, Italy
- Francesco Amato
  - Department of Electrical Engineering and Information Technology, Università degli Studi di Napoli Federico II, Napoli, 80125, Italy.
- Carlo Cosentino
  - Department of Experimental and Clinical Medicine, Università degli Studi Magna Græcia, Catanzaro, 88100, Italy.
7
Liyanaarachchi VC, Nishshanka GKSH, Nimarshana PHV, Chang JS, Ariyadasa TU, Nagarajan D. Modeling of astaxanthin biosynthesis via machine learning, mathematical and metabolic network modeling. Crit Rev Biotechnol 2023:1-22. [PMID: 37587012 DOI: 10.1080/07388551.2023.2237183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Revised: 05/04/2023] [Accepted: 06/17/2023] [Indexed: 08/18/2023]
Abstract
Natural astaxanthin is synthesized by diverse organisms including: bacteria, fungi, microalgae, and plants involving complex cellular processes, which depend on numerous interrelated parameters. Nonetheless, existing knowledge regarding astaxanthin biosynthesis and the conditions influencing astaxanthin accumulation is fairly limited. Thus, manipulation of the growth conditions to achieve desired biomass and astaxanthin yields can be a complicated process requiring cost-intensive and time-consuming experiment-based research. As a potential solution, modeling and simulation of biological systems have recently emerged, allowing researchers to predict/estimate astaxanthin production dynamics in selected organisms. Moreover, mathematical modeling techniques would enable further optimization of astaxanthin synthesis in a shorter period of time, ultimately contributing to a notable reduction in production costs. Thus, the present review comprehensively discusses existing mathematical modeling techniques which simulate the bioaccumulation of astaxanthin in diverse organisms. Associated challenges, solutions, and future perspectives are critically analyzed and presented.
Affiliation(s)
- P H Viraj Nimarshana
  - Department of Mechanical Engineering, Faculty of Engineering, University of Moratuwa, Moratuwa, Sri Lanka
- Jo-Shu Chang
  - Department of Chemical Engineering, National Cheng Kung University, Tainan, Taiwan
  - Department of Chemical and Materials Engineering, Tunghai University, Taichung, Taiwan
  - Research Center for Smart Sustainable Circular Economy, Tunghai University, Taichung, Taiwan
  - Department of Chemical Engineering and Materials Science, Yuan Ze University, Chung-Li, Taiwan
- Thilini U Ariyadasa
  - Department of Chemical and Process Engineering, Faculty of Engineering, University of Moratuwa, Moratuwa, Sri Lanka
- Dillirani Nagarajan
  - Department of Chemical Engineering, National Cheng Kung University, Tainan, Taiwan
8
Bartmanski BJ, Rocha M, Zimmermann-Kogadeeva M. Recent advances in data- and knowledge-driven approaches to explore primary microbial metabolism. Curr Opin Chem Biol 2023; 75:102324. [PMID: 37207402 PMCID: PMC10410306 DOI: 10.1016/j.cbpa.2023.102324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 04/15/2023] [Accepted: 04/18/2023] [Indexed: 05/21/2023]
Abstract
With the rapid progress in metabolomics and sequencing technologies, more data on the metabolome of single microbes and their communities become available, revealing the potential of microorganisms to metabolize a broad range of chemical compounds. The analysis of microbial metabolomics datasets remains challenging since it inherits the technical challenges of metabolomics analysis, such as compound identification and annotation, while harboring challenges in data interpretation, such as distinguishing metabolite sources in mixed samples. This review outlines the recent advances in computational methods to analyze primary microbial metabolism: knowledge-based approaches that take advantage of metabolic and molecular networks and data-driven approaches that employ machine/deep learning algorithms in combination with large-scale datasets. These methods aim at improving metabolite identification and disentangling reciprocal interactions between microbes and metabolites. We also discuss the perspective of combining these approaches and further developments required to advance the investigation of primary metabolism in mixed microbial samples.
Affiliation(s)
- Miguel Rocha
  - Centre of Biological Engineering, University of Minho, Campus of Gualtar, Braga, Portugal
9
Sen P, Orešič M. Integrating Omics Data in Genome-Scale Metabolic Modeling: A Methodological Perspective for Precision Medicine. Metabolites 2023; 13:855. [PMID: 37512562 PMCID: PMC10383060 DOI: 10.3390/metabo13070855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 07/11/2023] [Accepted: 07/17/2023] [Indexed: 07/30/2023] Open
Abstract
Recent advancements in omics technologies have generated a wealth of biological data. Integrating these data within mathematical models is essential to fully leverage their potential. Genome-scale metabolic models (GEMs) provide a robust framework for studying complex biological systems. GEMs have significantly contributed to our understanding of human metabolism, including the intrinsic relationship between the gut microbiome and the host metabolism. In this review, we highlight the contributions of GEMs and discuss the critical challenges that must be overcome to ensure their reproducibility and enhance their prediction accuracy, particularly in the context of precision medicine. We also explore the role of machine learning in addressing these challenges within GEMs. The integration of omics data with GEMs has the potential to lead to new insights, and to advance our understanding of molecular mechanisms in human health and disease.
Affiliation(s)
- Partho Sen
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland
  - School of Medical Sciences, Faculty of Medicine and Health, Örebro University, 702 81 Örebro, Sweden
- Matej Orešič
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, FI-20520 Turku, Finland
  - School of Medical Sciences, Faculty of Medicine and Health, Örebro University, 702 81 Örebro, Sweden
10
Molversmyr H, Øyås O, Rotnes F, Vik JO. Extracting functionally accurate context-specific models of Atlantic salmon metabolism. NPJ Syst Biol Appl 2023; 9:19. [PMID: 37244928 DOI: 10.1038/s41540-023-00280-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 05/05/2023] [Indexed: 05/29/2023] Open
Abstract
Constraint-based models (CBMs) are used to study metabolic network structure and function in organisms ranging from microbes to multicellular eukaryotes. Published CBMs are usually generic rather than context-specific, meaning that they do not capture differences in reaction activities, which, in turn, determine metabolic capabilities, between cell types, tissues, environments, or other conditions. Only a subset of a CBM's metabolic reactions and capabilities are likely to be active in any given context, and several methods have therefore been developed to extract context-specific models from generic CBMs through integration of omics data. We tested the ability of six model extraction methods (MEMs) to create functionally accurate context-specific models of Atlantic salmon using a generic CBM (SALARECON) and liver transcriptomics data from contexts differing in water salinity (life stage) and dietary lipids. Three MEMs (iMAT, INIT, and GIMME) outperformed the others in terms of functional accuracy, which we defined as the extracted models' ability to perform context-specific metabolic tasks inferred directly from the data, and one MEM (GIMME) was faster than the others. Context-specific versions of SALARECON consistently outperformed the generic version, showing that context-specific modeling better captures salmon metabolism. Thus, we demonstrate that results from human studies also hold for a non-mammalian animal and major livestock species.
Affiliation(s)
- Håvard Molversmyr
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Ove Øyås
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Filip Rotnes
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Jon Olav Vik
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway.
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway.
11
Chen C, Liao C, Liu YY. Teasing out missing reactions in genome-scale metabolic networks through hypergraph learning. Nat Commun 2023; 14:2375. [PMID: 37185345 PMCID: PMC10130184 DOI: 10.1038/s41467-023-38110-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/25/2022] [Accepted: 04/14/2023] [Indexed: 05/17/2023] Open
Abstract
GEnome-scale Metabolic models (GEMs) are powerful tools to predict cellular metabolism and physiological states in living organisms. However, due to our imperfect knowledge of metabolic processes, even highly curated GEMs have knowledge gaps (e.g., missing reactions). Existing gap-filling methods typically require phenotypic data as input to tease out missing reactions. We still lack a computational method for rapid and accurate gap-filling of metabolic networks before experimental data is available. Here we present a deep learning-based method - CHEbyshev Spectral HyperlInk pREdictor (CHESHIRE) - to predict missing reactions in GEMs purely from metabolic network topology. We demonstrate that CHESHIRE outperforms other topology-based methods in predicting artificially removed reactions over 926 high- and intermediate-quality GEMs. Furthermore, CHESHIRE is able to improve the phenotypic predictions of 49 draft GEMs for fermentation products and amino acids secretions. Both types of validation suggest that CHESHIRE is a powerful tool for GEM curation to reveal unknown links between reactions and observed metabolic phenotypes.
Affiliation(s)
- Can Chen
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Chen Liao
  - Program for Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Yang-Yu Liu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA.
  - Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, 61801, USA.
12
Lee CY, Dillard LR, Papin JA, Arnold KB. New perspectives into the vaginal microbiome with systems biology. Trends Microbiol 2023; 31:356-368. [PMID: 36272885 DOI: 10.1016/j.tim.2022.09.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/11/2022] [Revised: 09/19/2022] [Accepted: 09/21/2022] [Indexed: 10/28/2022]
Abstract
The vaginal microbiome (VMB) is critical to female reproductive health; however, the mechanisms associated with optimal and non-optimal states remain poorly understood due to the complex community structure and dynamic nature. Quantitative systems biology techniques applied to the VMB have improved understanding of community composition and function using primarily statistical methods. In contrast, fewer mechanistic models that use a priori knowledge of VMB features to develop predictive models have been implemented despite their use for microbiomes at other sites, including the gastrointestinal tract. Here, we explore systems biology approaches that have been applied in the VMB, highlighting successful techniques and discussing new directions that hold promise for improving understanding of health and disease.
Affiliation(s)
- Christina Y Lee
  - Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, USA
- Lillian R Dillard
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA; Department of Biochemistry & Molecular Genetics, University of Virginia, Charlottesville, VA, USA
- Jason A Papin
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
- Kelly B Arnold
  - Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI, USA.
13
Bi X, Cheng Y, Xu X, Lv X, Liu Y, Li J, Du G, Chen J, Ledesma-Amaro R, Liu L. etiBsu1209: A comprehensive multiscale metabolic model for Bacillus subtilis. Biotechnol Bioeng 2023; 120:1623-1639. [PMID: 36788025 DOI: 10.1002/bit.28355] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/13/2022] [Revised: 12/08/2022] [Accepted: 02/13/2023] [Indexed: 02/16/2023]
Abstract
Genome-scale metabolic models (GEMs) have been widely used to guide the computational design of microbial cell factories, and to date, seven GEMs have been reported for Bacillus subtilis, a model gram-positive microorganism widely used in bioproduction of functional nutraceuticals and food ingredients. However, none of them are widely used because they often lead to erroneous predictions due to their low predictive power and lack of information on regulatory mechanisms. In this work, we constructed a new version of GEM for B. subtilis (iBsu1209), which contains 1209 genes, 1595 metabolites, and 1948 reactions. We applied machine learning to fill gaps, which formed a relatively complete metabolic network able to predict with high accuracy (89.3%) the growth of 1209 mutants under 12 different culture conditions. In addition, we developed a visualization and code-free software, Model Tool, for multiconstraints model reconstruction and analysis. We used this software to construct etiBsu1209, a multiscale model that integrates enzymatic constraints, thermodynamic constraints, and transcriptional regulatory networks. Furthermore, we used etiBsu1209 to guide a metabolic engineering strategy (knocking out fabI and yfkN genes) for the overproduction of nutraceutical menaquinone-7, and the titer increased to 153.94 mg/L, 2.2-times that of the parental strain. To the best of our knowledge, etiBsu1209 is the first comprehensive multiscale model for B. subtilis and can serve as a solid basis for rational computational design of B. subtilis cell factories for bioproduction.
Affiliation(s)
- Xinyu Bi
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Yang Cheng
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Xianhao Xu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Xueqin Lv
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Yanfeng Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Jianghua Li
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Guocheng Du
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Jian Chen
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
- Long Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China; Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi, China
14
|
Aminian-Dehkordi J, Valiei A, Mofrad MRK. Emerging computational paradigms to address the complex role of gut microbial metabolism in cardiovascular diseases. Front Cardiovasc Med 2022; 9:987104. [PMID: 36299869 PMCID: PMC9589059 DOI: 10.3389/fcvm.2022.987104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
The human gut microbiota and its associated perturbations are implicated in a variety of cardiovascular diseases (CVDs). There is evidence that the structure and metabolic composition of the gut microbiome and some of its metabolites have mechanistic associations with several CVDs. Nevertheless, there is a need to unravel metabolic behavior and underlying mechanisms of microbiome-host interactions. This need is even more highlighted when considering that microbiome-secreted metabolites contributing to CVDs are the subject of intensive research to develop new prevention and therapeutic techniques. In addition to the application of high-throughput data used in microbiome-related studies, advanced computational tools enable us to integrate omics into different mathematical models, including constraint-based models, dynamic models, agent-based models, and machine learning tools, to build a holistic picture of metabolic pathological mechanisms. In this article, we aim to review and introduce state-of-the-art mathematical models and computational approaches addressing the link between the microbiome and CVDs.
Affiliation(s)
- Mohammad R. K. Mofrad
  - Department of Bioengineering and Mechanical Engineering, University of California, Berkeley, Berkeley, CA, United States
|
15
|
Beardall WA, Stan GB, Dunlop MJ. Deep Learning Concepts and Applications for Synthetic Biology. GEN Biotechnology 2022; 1:360-371. [PMID: 36061221 PMCID: PMC9428732 DOI: 10.1089/genbio.2022.0017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/27/2022] [Accepted: 07/14/2022] [Indexed: 12/24/2022]
Abstract
Synthetic biology has a natural synergy with deep learning. It can be used to generate large data sets to train models, for example by using DNA synthesis, and deep learning models can be used to inform design, such as by generating novel parts or suggesting optimal experiments to conduct. Recently, research at the interface of engineering biology and deep learning has highlighted this potential through successes including the design of novel biological parts, protein structure prediction, automated analysis of microscopy data, optimal experimental design, and biomolecular implementations of artificial neural networks. In this review, we present an overview of synthetic biology-relevant classes of data and deep learning architectures. We also highlight emerging studies in synthetic biology that capitalize on deep learning to enable novel understanding and design, and discuss challenges and future opportunities in this space.
Affiliation(s)
- William A.V. Beardall
  - Department of Bioengineering, Imperial College London, London, United Kingdom
  - Imperial College Centre of Excellence in Synthetic Biology, Imperial College London, London, United Kingdom
- Guy-Bart Stan
  - Department of Bioengineering, Imperial College London, London, United Kingdom
  - Imperial College Centre of Excellence in Synthetic Biology, Imperial College London, London, United Kingdom
- Mary J. Dunlop
  - Department of Biomedical Engineering, Boston University, Boston, Massachusetts, USA
  - Biological Design Center, Boston University, Boston, Massachusetts, USA
|
16
|
Liao X, Ma H, Tang YJ. Artificial intelligence: a solution to involution of design–build–test–learn cycle. Curr Opin Biotechnol 2022; 75:102712. [DOI: 10.1016/j.copbio.2022.102712] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/22/2021] [Revised: 02/05/2022] [Accepted: 03/01/2022] [Indexed: 01/08/2023]
|
17
|
Lv X, Hueso-Gil A, Bi X, Wu Y, Liu Y, Liu L, Ledesma-Amaro R. New synthetic biology tools for metabolic control. Curr Opin Biotechnol 2022; 76:102724. [PMID: 35489308 DOI: 10.1016/j.copbio.2022.102724] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 12/07/2021] [Revised: 02/28/2022] [Accepted: 03/20/2022] [Indexed: 11/29/2022]
Abstract
In industrial bioprocesses, microbial metabolism dictates the product yields, and therefore, our capacity to control it has an enormous potential to help us move towards a bio-based economy. The rapid development of multiomics data has accelerated our systematic understanding of complex metabolic regulatory mechanisms, which allow us to develop tools to manipulate them. In the last few years, machine learning-based metabolic modeling, Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) derived synthetic biology tools, and synthetic genetic circuits have been widely used to control the metabolism of microorganisms, manipulate gene expression, and build synthetic pathways for bioproduction. This review describes the latest developments for metabolic control, and focuses on the trends and challenges of metabolic engineering strategies.
Affiliation(s)
- Xueqin Lv
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Angeles Hueso-Gil
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London SW72AZ, UK
- Xinyu Bi
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Yaokang Wu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Yanfeng Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Long Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China; Science Center for Future Foods, Jiangnan University, Wuxi 214122, China
- Rodrigo Ledesma-Amaro
  - Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London SW72AZ, UK
|
18
|
Passi A, Tibocha-Bonilla JD, Kumar M, Tec-Campos D, Zengler K, Zuniga C. Genome-Scale Metabolic Modeling Enables In-Depth Understanding of Big Data. Metabolites 2021; 12:14. [PMID: 35050136 PMCID: PMC8778254 DOI: 10.3390/metabo12010014] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 11/15/2021] [Revised: 12/18/2021] [Accepted: 12/20/2021] [Indexed: 11/16/2022] Open
Abstract
Genome-scale metabolic models (GEMs) enable the mathematical simulation of the metabolism of archaea, bacteria, and eukaryotic organisms. GEMs quantitatively define a relationship between genotype and phenotype by contextualizing different types of Big Data (e.g., genomics, metabolomics, and transcriptomics). In this review, we analyze the available Big Data useful for metabolic modeling and compile the available GEM reconstruction tools that integrate Big Data. We also discuss recent applications in industry and research that include predicting phenotypes, elucidating metabolic pathways, producing industry-relevant chemicals, identifying drug targets, and generating knowledge to better understand host-associated diseases. In addition to the up-to-date review of GEMs currently available, we assessed a plethora of tools for developing new GEMs that include macromolecular expression and dynamic resolution. Finally, we provide a perspective in emerging areas, such as annotation, data managing, and machine learning, in which GEMs will play a key role in the further utilization of Big Data.
Affiliation(s)
- Anurag Passi
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
- Juan D. Tibocha-Bonilla
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
- Manish Kumar
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
- Diego Tec-Campos
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
  - Facultad de Ingeniería Química, Campus de Ciencias Exactas e Ingenierías, Universidad Autónoma de Yucatán, Merida 97203, Yucatan, Mexico
- Karsten Zengler
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093-0412, USA
  - Center for Microbiome Innovation, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0403, USA
- Cristal Zuniga
  - Department of Pediatrics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0760, USA
|
19
|
Bacteriophage self-counting in the presence of viral replication. Proc Natl Acad Sci U S A 2021; 118:2104163118. [PMID: 34916284 DOI: 10.1073/pnas.2104163118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 11/04/2021] [Indexed: 11/18/2022] Open
Abstract
When host cells are in low abundance, temperate bacteriophages opt for dormant (lysogenic) infection. Phage lambda implements this strategy by increasing the frequency of lysogeny at higher multiplicity of infection (MOI). However, it remains unclear how the phage reliably counts infecting viral genomes even as their intracellular number increases because of replication. By combining theoretical modeling with single-cell measurements of viral copy number and gene expression, we find that instead of hindering lambda's decision, replication facilitates it. In a nonreplicating mutant, viral gene expression simply scales with MOI rather than diverging into lytic (virulent) and lysogenic trajectories. A similar pattern is followed during early infection by wild-type phage. However, later in the infection, the modulation of viral replication by the decision genes amplifies the initially modest gene expression differences into divergent trajectories. Replication thus ensures the optimal decision: lysis upon single-phage infection and lysogeny at higher MOI.
|
20
|
Ankrah NYD, Bernstein DB, Biggs M, Carey M, Engevik M, García-Jiménez B, Lakshmanan M, Pacheco AR, Sulheim S, Medlock GL. Enhancing Microbiome Research through Genome-Scale Metabolic Modeling. mSystems 2021; 6:e0059921. [PMID: 34904863 PMCID: PMC8670372 DOI: 10.1128/msystems.00599-21] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 01/02/2023] Open
Abstract
Construction and analysis of genome-scale metabolic models (GEMs) is a well-established systems biology approach that can be used to predict metabolic and growth phenotypes. The ability of GEMs to produce mechanistic insight into microbial ecological processes makes them appealing tools that can open a range of exciting opportunities in microbiome research. Here, we briefly outline these opportunities, present current rate-limiting challenges for the trustworthy application of GEMs to microbiome research, and suggest approaches for moving the field forward.
Affiliation(s)
- Nana Y. D. Ankrah
  - State University of New York at Plattsburgh, Plattsburgh, New York, USA
- Maureen Carey
  - University of Virginia, Charlottesville, Virginia, USA
- Melinda Engevik
  - Medical University of South Carolina, Charleston, South Carolina, USA
- Meiyappan Lakshmanan
  - Bioprocessing Technology Institute, Agency for Science, Technology and Research (A*STAR), Singapore
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore
|
21
|
Sun G, Ahn-Horst TA, Covert MW. The E. coli Whole-Cell Modeling Project. EcoSal Plus 2021; 9:eESP00012020. [PMID: 34242084 PMCID: PMC11163835 DOI: 10.1128/ecosalplus.esp-0001-2020] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/16/2020] [Accepted: 05/26/2021] [Indexed: 12/22/2022]
Abstract
The Escherichia coli whole-cell modeling project seeks to create the most detailed computational model of an E. coli cell in order to better understand and predict the behavior of this model organism. Details about the approach, framework, and current version of the model are discussed. Currently, the model includes the functions of 43% of characterized genes, with ongoing efforts to include additional data and mechanisms. As additional information is incorporated in the model, its utility and predictive power will continue to increase, which means that discovery efforts can be accelerated by community involvement in the generation and inclusion of data. This project will be an invaluable resource to the E. coli community that could be used to verify expected physiological behavior, to predict new outcomes and testable hypotheses for more efficient experimental design iterations, and to evaluate heterogeneous data sets in the context of each other through deep curation.
Affiliation(s)
- Gwanggyu Sun
  - Department of Bioengineering, Stanford University, Stanford, California, USA
- Travis A. Ahn-Horst
  - Department of Bioengineering, Stanford University, Stanford, California, USA
- Markus W. Covert
  - Department of Bioengineering, Stanford University, Stanford, California, USA
|
22
|
Ferreira M, Ventorim R, Almeida E, Silveira S, Silveira W. Protein Abundance Prediction Through Machine Learning Methods. J Mol Biol 2021; 433:167267. [PMID: 34563548 DOI: 10.1016/j.jmb.2021.167267] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/16/2021] [Revised: 09/09/2021] [Accepted: 09/17/2021] [Indexed: 10/20/2022]
Abstract
Proteins are responsible for most physiological processes, and their abundance provides crucial information for systems biology research. However, absolute protein quantification, as determined by mass spectrometry, still has limitations in capturing the protein pool. Protein abundance is impacted by translation kinetics, which rely on features of codons. In this study, we evaluated the effect of codon usage bias of genes on protein abundance. Notably, we observed differences regarding codon usage patterns between genes coding for highly abundant proteins and genes coding for less abundant proteins. Analysis of synonymous codon usage and evolutionary selection showed a clear split between the two groups. Our machine learning models predicted protein abundances from codon usage metrics with remarkable accuracy, achieving strong correlation with experimental data. Upon integration of the predicted protein abundance in enzyme-constrained genome-scale metabolic models, the simulated phenotypes closely matched experimental data, which demonstrates that our predictive models are valuable tools for systems metabolic engineering approaches.
Affiliation(s)
- Mauricio Ferreira
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. https://twitter.com/@mauriciomyces
- Rafaela Ventorim
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil
- Eduardo Almeida
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. https://twitter.com/@elm_almeida
- Sabrina Silveira
  - Department of Computer Science, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil. https://twitter.com/@sabrina_as
- Wendel Silveira
  - Department of Microbiology, Universidade Federal de Viçosa, Viçosa, MG 36570-900, Brazil
|
23
|
Giannari D, Ho CH, Mahadevan R. A gap-filling algorithm for prediction of metabolic interactions in microbial communities. PLoS Comput Biol 2021; 17:e1009060. [PMID: 34723959 PMCID: PMC8584699 DOI: 10.1371/journal.pcbi.1009060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2021] [Revised: 11/11/2021] [Accepted: 10/05/2021] [Indexed: 11/24/2022] Open
Abstract
The study of microbial communities and their interactions has attracted the interest of the scientific community, because of their potential for applications in biotechnology, ecology and medicine. The complexity of interspecies interactions, which are key for the macroscopic behavior of microbial communities, cannot be studied easily experimentally. For this reason, the modeling of microbial communities has begun to leverage the knowledge of established constraint-based methods, which have long been used for studying and analyzing the microbial metabolism of individual species based on genome-scale metabolic reconstructions of microorganisms. A main problem of genome-scale metabolic reconstructions is that they usually contain metabolic gaps due to genome misannotations and unknown enzyme functions. This problem is traditionally solved by using gap-filling algorithms that add biochemical reactions from external databases to the metabolic reconstruction, in order to restore model growth. However, gap-filling algorithms could evolve by taking into account metabolic interactions among species that coexist in microbial communities. In this work, a gap-filling method that resolves metabolic gaps at the community level was developed. The efficacy of the algorithm was tested by analyzing its ability to resolve metabolic gaps on a synthetic community of auxotrophic Escherichia coli strains. Subsequently, the algorithm was applied to resolve metabolic gaps and predict metabolic interactions in a community of Bifidobacterium adolescentis and Faecalibacterium prausnitzii, two species present in the human gut microbiota, and in an experimentally studied community of Dehalobacter and Bacteroidales species of the ACT-3 community. The community gap-filling method can facilitate the improvement of metabolic models and the identification of metabolic interactions that are difficult to identify experimentally in microbial communities.
Affiliation(s)
- Dafni Giannari
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada
- Radhakrishnan Mahadevan
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada
  - The Institute of Biomaterials & Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
|
24
|
Khaleghi MK, Savizi ISP, Lewis NE, Shojaosadati SA. Synergisms of machine learning and constraint-based modeling of metabolism for analysis and optimization of fermentation parameters. Biotechnol J 2021; 16:e2100212. [PMID: 34390201 DOI: 10.1002/biot.202100212] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/23/2021] [Revised: 08/10/2021] [Accepted: 08/11/2021] [Indexed: 11/06/2022]
Abstract
Recent noteworthy advances in the development of high-performing microbial and mammalian strains have enabled the sustainable production of bio-economically valuable substances such as bio-compounds, biofuels, and biopharmaceuticals. However, to obtain an industrially viable mass-production scheme, much time and effort are required. The robust and rational design of fermentation processes requires analysis and optimization of different extracellular conditions and medium components, which have a massive effect on growth and productivity. In this regard, knowledge- and data-driven modeling methods have received much attention. Constraint-based modeling (CBM) is a knowledge-driven mathematical approach that has been widely used in fermentation analysis and optimization due to its capabilities of predicting the cellular phenotype from genotype through high-throughput means. On the other hand, machine learning (ML) is a data-driven statistical method that identifies the data patterns within sophisticated biological systems and processes, where there is inadequate knowledge to represent underlying mechanisms. Furthermore, ML models are becoming a viable complement to constraint-based models in a reciprocal manner when one is used as a pre-step of another. As a result, a more predictable model is produced. This review highlights the applications of CBM and ML independently and the combination of these two approaches for analyzing and optimizing fermentation parameters.
Affiliation(s)
- Mohammad Karim Khaleghi
  - Biotechnology Department, Faculty of Chemical Engineering, Tarbiat Modares University, Tehran, Iran
- Iman Shahidi Pour Savizi
  - Biotechnology Department, Faculty of Chemical Engineering, Tarbiat Modares University, Tehran, Iran
- Nathan E Lewis
  - Department of Bioengineering, University of California, San Diego, USA; Department of Pediatrics, University of California, San Diego, USA
- Seyed Abbas Shojaosadati
  - Biotechnology Department, Faculty of Chemical Engineering, Tarbiat Modares University, Tehran, Iran
|
25
|
Carey MA, Dräger A, Beber ME, Papin JA, Yurkovich JT. Community standards to facilitate development and address challenges in metabolic modeling. Mol Syst Biol 2021; 16:e9235. [PMID: 32845080 PMCID: PMC8411906 DOI: 10.15252/msb.20199235] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 01/09/2023] Open
Abstract
Standardization of data and models facilitates effective communication, especially in computational systems biology. However, both the development and consistent use of standards and resources remain challenging. As a result, the amount, quality, and format of the information contained within systems biology models are not consistent and therefore present challenges for widespread use and communication. Here, we focused on these standards, resources, and challenges in the field of constraint-based metabolic modeling by conducting a community-wide survey. We used this feedback to (i) outline the major challenges that our field faces and to propose solutions and (ii) identify a set of features that defines what a "gold standard" metabolic network reconstruction looks like concerning content, annotation, and simulation capabilities. We anticipate that this community-driven outline will help the long-term development of community-inspired resources as well as produce high-quality, accessible models within our field. More broadly, we hope that these efforts can serve as blueprints for other computational modeling communities to ensure the continued development of both practical, usable standards and reproducible, knowledge-rich models.
Affiliation(s)
- Maureen A Carey
  - Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA
- Andreas Dräger
  - Computational Systems Biology of Infection and Antimicrobial-Resistant Pathogens, Institute for Biomedical Informatics (IBMI), University of Tübingen, Tübingen, Germany; Department of Computer Science, University of Tübingen, Tübingen, Germany; German Center for Infection Research (DZIF), partner site Tübingen, Tübingen, Germany
- Moritz E Beber
  - Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Denmark
- Jason A Papin
  - Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA; Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
|
26
|
Chiappino-Pepe A, Hatzimanikatis V. PhenoMapping: a protocol to map cellular phenotypes to metabolic bottlenecks, identify conditional essentiality, and curate metabolic models. STAR Protoc 2021; 2:100280. [PMID: 33532729 PMCID: PMC7829271 DOI: 10.1016/j.xpro.2020.100280] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023] Open
Abstract
Targeted identification of cellular processes responsible for a phenotype is of major importance in guiding efforts in bioengineering and medicine. Genome-scale metabolic models (GEMs) are widely used to integrate various types of omics data and study the cellular physiology under different conditions. Here, we present PhenoMapping, a protocol that uses GEMs, omics, and phenotypic data to map cellular processes and observed phenotypes. PhenoMapping also classifies genes as conditionally and unconditionally essential and guides a comprehensive curation of GEMs. For complete details on the use and execution of this protocol, please refer to Stanway et al. (2019) and Krishnan et al. (2020).
Affiliation(s)
- Anush Chiappino-Pepe
  - Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
- Vassily Hatzimanikatis
  - Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
|
27
|
Zimmermann J, Kaleta C, Waschina S. gapseq: informed prediction of bacterial metabolic pathways and reconstruction of accurate metabolic models. Genome Biol 2021; 22:81. [PMID: 33691770 PMCID: PMC7949252 DOI: 10.1186/s13059-021-02295-1] [Citation(s) in RCA: 86] [Impact Index Per Article: 28.7] [Received: 05/27/2020] [Accepted: 02/10/2021] [Indexed: 12/21/2022] Open
Abstract
Genome-scale metabolic models of microorganisms are powerful frameworks to predict phenotypes from an organism's genotype. While manual reconstructions are laborious, automated reconstructions often fail to recapitulate known metabolic processes. Here we present gapseq (https://github.com/jotech/gapseq), a new tool to predict metabolic pathways and automatically reconstruct microbial metabolic models using a curated reaction database and a novel gap-filling algorithm. On the basis of scientific literature and experimental data for 14,931 bacterial phenotypes, we demonstrate that gapseq outperforms state-of-the-art tools in predicting enzyme activity, carbon source utilisation, fermentation products, and metabolic interactions within microbial communities.
Affiliation(s)
- Johannes Zimmermann
  - Christian-Albrechts-University Kiel, Institute of Experimental Medicine, Research Group Medical Systems Biology, Michaelis-Str. 5, Kiel, 24105 Germany
- Christoph Kaleta
  - Christian-Albrechts-University Kiel, Institute of Experimental Medicine, Research Group Medical Systems Biology, Michaelis-Str. 5, Kiel, 24105 Germany
- Silvio Waschina
  - Christian-Albrechts-University Kiel, Institute of Experimental Medicine, Research Group Medical Systems Biology, Michaelis-Str. 5, Kiel, 24105 Germany
  - Christian-Albrechts-University Kiel, Institute of Human Nutrition and Food Science, Nutriinformatics, Heinrich-Hecht-Platz 10, Kiel, 24118 Germany
|
28
|
Bernstein DB, Sulheim S, Almaas E, Segrè D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol 2021; 22:64. [PMID: 33602294 PMCID: PMC7890832 DOI: 10.1186/s13059-021-02289-z] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 07/22/2020] [Accepted: 02/04/2021] [Indexed: 02/07/2023] Open
Abstract
The reconstruction and analysis of genome-scale metabolic models constitutes a powerful systems biology approach, with applications ranging from basic understanding of genotype-phenotype mapping to solving biomedical and environmental problems. However, the biological insight obtained from these models is limited by multiple heterogeneous sources of uncertainty, which are often difficult to quantify. Here we review the major sources of uncertainty and survey existing approaches developed for representing and addressing them. A unified formal characterization of these uncertainties through probabilistic approaches and ensemble modeling will facilitate convergence towards consistent reconstruction pipelines, improved data integration algorithms, and more accurate assessment of predictive capacity.
Affiliation(s)
- David B Bernstein
- Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Snorre Sulheim
- Bioinformatics Program, Boston University, Boston, MA, USA
- Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Department of Biotechnology and Nanomedicine, SINTEF Industry, Trondheim, Norway
- Eivind Almaas
- Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Daniel Segrè
- Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Department of Biology and Department of Physics, Boston University, Boston, MA, USA
|
29
|
Fang X, Lloyd CJ, Palsson BO. Reconstructing organisms in silico: genome-scale models and their emerging applications. Nat Rev Microbiol 2020; 18:731-743. [PMID: 32958892 PMCID: PMC7981288 DOI: 10.1038/s41579-020-00440-4] [Citation(s) in RCA: 111] [Impact Index Per Article: 27.8] [Accepted: 08/17/2020] [Indexed: 02/06/2023]
Abstract
Escherichia coli is considered to be the best-known microorganism given the large number of published studies detailing its genes, its genome and the biochemical functions of its molecular components. This vast literature has been systematically assembled into a reconstruction of the biochemical reaction networks that underlie E. coli's functions, a process which is now being applied to an increasing number of microorganisms. Genome-scale reconstructed networks are organized and systematized knowledge bases that have multiple uses, including conversion into computational models that interpret and predict phenotypic states and the consequences of environmental and genetic perturbations. These genome-scale models (GEMs) now enable us to develop pan-genome analyses that provide mechanistic insights, detail the selection pressures on proteome allocation and address stress phenotypes. In this Review, we first discuss the overall development of GEMs and their applications. Next, we review the evolution of the most complete GEM that has been developed to date: the E. coli GEM. Finally, we explore three emerging areas in genome-scale modelling of microbial phenotypes: collections of strain-specific models, metabolic and macromolecular expression models, and simulation of stress responses.
Affiliation(s)
- Xin Fang
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Colton J Lloyd
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Bernhard O Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark
|
30
|
Systematically gap-filling the genome-scale metabolic model of CHO cells. Biotechnol Lett 2020; 43:73-87. [PMID: 33040240 DOI: 10.1007/s10529-020-03021-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/04/2020] [Accepted: 10/03/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE: Chinese hamster ovary (CHO) cells are the leading cell factories for producing recombinant proteins in the biopharmaceutical industry. In this regard, constraint-based metabolic models are useful platforms to perform computational analysis of cell metabolism. These models need to be regularly updated in order to include the latest biochemical data of the cells, and to increase their predictive power. Here, we provide an update to iCHO1766, the metabolic model of CHO cells. RESULTS: We expanded the existing model of Chinese hamster metabolism with the help of four gap-filling approaches, leading to the addition of 773 new reactions and 335 new genes. We incorporated these into an updated genome-scale metabolic network model of CHO cells, named iCHO2101. In this updated model, the number of reactions and pathways capable of carrying flux is substantially increased. CONCLUSIONS: The present CHO model is an important step towards more complete metabolic models of CHO cells.
|
31
|
Sen P, Lamichhane S, Mathema VB, McGlinchey A, Dickens AM, Khoomrung S, Orešič M. Deep learning meets metabolomics: a methodological perspective. Brief Bioinform 2020; 22:1531-1542. [PMID: 32940335 DOI: 10.1093/bib/bbaa204] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 05/04/2020] [Revised: 08/08/2020] [Accepted: 08/10/2020] [Indexed: 12/15/2022] Open
Abstract
Deep learning (DL), an emerging area of investigation in the fields of machine learning and artificial intelligence, has markedly advanced over the past years. DL techniques are being applied to assist medical professionals and researchers in improving clinical diagnosis, disease prediction and drug discovery. It is expected that DL will help to provide actionable knowledge from a variety of 'big data', including metabolomics data. In this review, we discuss the applicability of DL to metabolomics, while presenting and discussing several examples from recent research. We emphasize the use of DL in tackling bottlenecks in metabolomics data acquisition, processing, metabolite identification, as well as in metabolic phenotyping and biomarker discovery. Finally, we discuss how DL is used in genome-scale metabolic modelling and in interpretation of metabolomics data. The DL-based approaches discussed here may assist computational biologists with the integration, prediction and drawing of statistical inference about biological outcomes, based on metabolomics data.
Affiliation(s)
- Partho Sen
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland; School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Santosh Lamichhane
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Vivek B Mathema
- Metabolomics and Systems Biology, Department of Biochemistry, and Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand
- Aidan McGlinchey
- School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
- Alex M Dickens
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland
- Sakda Khoomrung
- Metabolomics and Systems Biology, Department of Biochemistry, and Siriraj Metabolomics and Phenomics Center, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok 10700, Thailand; Center for Innovation in Chemistry (PERCH), Faculty of Science, Mahidol University, Rama 6 Road, Bangkok 10400, Thailand
- Matej Orešič
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, 20520 Turku, Finland; School of Medical Sciences, Örebro University, 702 81 Örebro, Sweden
|
32
|
Kotidis P, Kontoravdi C. Harnessing the potential of artificial neural networks for predicting protein glycosylation. Metab Eng Commun 2020; 10:e00131. [PMID: 32489858 PMCID: PMC7256630 DOI: 10.1016/j.mec.2020.e00131] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 02/21/2020] [Revised: 05/06/2020] [Accepted: 05/06/2020] [Indexed: 12/16/2022] Open
Abstract
Kinetic models offer incomparable insight on cellular mechanisms controlling protein glycosylation. However, their ability to reproduce site-specific glycoform distributions depends on accurate estimation of a large number of protein-specific kinetic parameters and prior knowledge of enzyme and transport protein levels in the Golgi membrane. Herein we propose an artificial neural network (ANN) for protein glycosylation and apply this to four recombinant glycoproteins produced in Chinese hamster ovary (CHO) cells, two monoclonal antibodies and two fusion proteins. We demonstrate that the ANN model accurately predicts site-specific glycoform distributions of up to eighteen glycan species with an average absolute error of 1.1%, correctly reproducing the effect of metabolic perturbations as part of a hybrid, kinetic/ANN, glycosylation model (HyGlycoM), as well as the impact of manganese supplementation and glycosyltransferase knock out experiments as a stand-alone machine learning algorithm. These results showcase the potential of machine learning and hybrid approaches for rapidly developing performance-driven models of protein glycosylation.
|
33
|
Medusa: Software to build and analyze ensembles of genome-scale metabolic network reconstructions. PLoS Comput Biol 2020; 16:e1007847. [PMID: 32348298 PMCID: PMC7213742 DOI: 10.1371/journal.pcbi.1007847] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/07/2019] [Revised: 05/11/2020] [Accepted: 04/03/2020] [Indexed: 11/19/2022] Open
Abstract
Uncertainty in the structure and parameters of networks is ubiquitous across computational biology. In constraint-based reconstruction and analysis of metabolic networks, this uncertainty is present both during the reconstruction of networks and in simulations performed with them. Here, we present Medusa, a Python package for the generation and analysis of ensembles of genome-scale metabolic network reconstructions. Medusa builds on the COBRApy package for constraint-based reconstruction and analysis by compressing a set of models into a compact ensemble object, providing functions for the generation of ensembles using experimental data, and extending constraint-based analyses to ensemble scale. We demonstrate how Medusa can be used to generate ensembles and perform ensemble simulations, and how machine learning can be used in conjunction with Medusa to guide the curation of genome-scale metabolic network reconstructions. Medusa is available under the permissive MIT license from the Python Packaging Index (https://pypi.org) and from github (https://github.com/opencobra/Medusa), and comprehensive documentation is available at https://medusa.readthedocs.io/en/latest.
|