1
Lai PT, Coudert E, Aimo L, Axelsen K, Breuza L, de Castro E, Feuermann M, Morgat A, Pourcel L, Pedruzzi I, Poux S, Redaschi N, Rivoire C, Sveshnikova A, Wei CH, Leaman R, Luo L, Lu Z, Bridge A. EnzChemRED, a rich enzyme chemistry relation extraction dataset. Sci Data 2024; 11:982. [PMID: 39251610 PMCID: PMC11384730 DOI: 10.1038/s41597-024-03835-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2024] [Accepted: 08/23/2024] [Indexed: 09/11/2024] Open
Abstract
Expert curation is essential to capture knowledge of enzyme functions from the scientific literature in FAIR open knowledgebases but cannot keep pace with the rate of new discoveries and new publications. In this work we present EnzChemRED, for Enzyme Chemistry Relation Extraction Dataset, a new training and benchmarking dataset to support the development of Natural Language Processing (NLP) methods such as (large) language models that can assist enzyme curation. EnzChemRED consists of 1,210 expert curated PubMed abstracts where enzymes and the chemical reactions they catalyze are annotated using identifiers from the protein knowledgebase UniProtKB and the chemical ontology ChEBI. We show that fine-tuning language models with EnzChemRED significantly boosts their ability to identify proteins and chemicals in text (86.30% F1 score) and to extract the chemical conversions (86.66% F1 score) and the enzymes that catalyze those conversions (83.79% F1 score). We apply our methods to abstracts at PubMed scale to create a draft map of enzyme functions in literature to guide curation efforts in UniProtKB and the reaction knowledgebase Rhea.
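The F1 scores quoted in this abstract are harmonic means of precision and recall. The sketch below shows that computation on sets of extracted items. The function name and the toy (substrate, product) pairs are illustrative assumptions; they are not taken from EnzChemRED or its evaluation scripts.

```python
def precision_recall_f1(predicted: set, gold: set) -> tuple:
    """Score a set of predicted items against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (substrate, product) conversion pairs.
# These are placeholders, not real ChEBI identifiers.
gold_pairs = {
    ("chem_1", "chem_2"),
    ("chem_3", "chem_4"),
    ("chem_5", "chem_6"),
    ("chem_7", "chem_8"),
}
predicted_pairs = {
    ("chem_1", "chem_2"),
    ("chem_3", "chem_4"),
    ("chem_5", "chem_6"),
    ("chem_9", "chem_10"),  # a false positive
}

precision, recall, f1 = precision_recall_f1(predicted_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

With three of four predictions correct and three of four gold pairs found, precision, recall and F1 all come out at 0.75.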
Grants
- U24 HG007822 NHGRI NIH HHS
- NIH Intramural Research Program, National Library of Medicine
- Expert curation and evaluation of EnzChemRED at Swiss-Prot were supported by the Swiss Federal Government through the State Secretariat for Education, Research and Innovation (SERI) and the National Human Genome Research Institute (NHGRI), Office of Director [OD/DPCPSI/ODSS], National Institute of Allergy and Infectious Diseases (NIAID), National Institute on Aging (NIA), National Institute of General Medical Sciences (NIGMS), National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Eye Institute (NEI), National Cancer Institute (NCI), National Heart, Lung, and Blood Institute (NHLBI) of the National Institutes of Health [U24HG007822], and by the European Union's Horizon Europe Framework Programme (grant number 101080997), supported in Switzerland through the State Secretariat for Education, Research and Innovation (SERI).
- Fundamental Research Funds for the Central Universities [DUT23RC(3)014 to L.L.]
Affiliation(s)
- Po-Ting Lai
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Elisabeth Coudert
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Lucila Aimo
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Kristian Axelsen
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Lionel Breuza
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Edouard de Castro
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Marc Feuermann
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Anne Morgat
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Lucille Pourcel
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Ivo Pedruzzi
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Sylvain Poux
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Nicole Redaschi
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Catherine Rivoire
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Anastasia Sveshnikova
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
- Chih-Hsuan Wei
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Robert Leaman
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Ling Luo
  - School of Computer Science and Technology, Dalian University of Technology, 116024, Dalian, China
- Zhiyong Lu
  - National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, 20894, USA
- Alan Bridge
  - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, CH-1211, Geneva, 4, Switzerland
2
Huang W, Yang F, Zhang Q, Liu J. A dual-scale fused hypergraph convolution-based hyperedge prediction model for predicting missing reactions in genome-scale metabolic networks. Brief Bioinform 2024; 25:bbae383. [PMID: 39101499 PMCID: PMC11299038 DOI: 10.1093/bib/bbae383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 06/24/2024] [Accepted: 07/23/2024] [Indexed: 08/06/2024] Open
Abstract
Genome-scale metabolic models (GEMs) are powerful tools for predicting cellular metabolic and physiological states. However, GEMs still have missing reactions because of incomplete knowledge. Recent gap-filling methods directly predict missing reactions without relying on phenotypic data. However, they do not differentiate between substrates and products when constructing the prediction models, which affects the predictive performance of the models. In this paper, we propose DSHCNet, a hyperedge prediction model based on dual-scale fused hypergraph convolution that distinguishes substrates from products, for inferring missing reactions and thereby filling gaps in the GEM. First, we model each hyperedge as a heterogeneous complete graph and then decompose it into three subgraphs at both homogeneous and heterogeneous scales. Then we design two graph convolution-based models to extract features of the vertices at the two scales, respectively; these features are then fused via the attention mechanism. Finally, the features of all vertices are further pooled to generate the representative feature of the hyperedge. The strategy of graph decomposition in DSHCNet enables the vertices to engage in message passing independently at both scales, thereby enhancing the capability of information propagation and making the obtained product and substrate features more distinguishable. The experimental results show that the average recovery rate of missing reactions obtained by DSHCNet is at least 11.7% higher than that of the state-of-the-art methods, and that the gap-filled GEMs based on our DSHCNet model achieve the best prediction performance, demonstrating the superiority of our method.
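To make the substrate/product distinction concrete, the sketch below treats a reaction as a hyperedge over its metabolites and splits it by role. The split into substrate-substrate, product-product and substrate-product edges is one plausible reading of the "three subgraphs" mentioned in the abstract; it is an assumption, not DSHCNet's actual implementation. The function names and toy reactions are likewise hypothetical.

```python
from itertools import combinations, product


def split_roles(reaction: dict) -> tuple:
    """Split a {metabolite: stoichiometric coefficient} map into roles.

    Negative coefficients mark substrates, positive ones mark products.
    """
    substrates = frozenset(m for m, coeff in reaction.items() if coeff < 0)
    products = frozenset(m for m, coeff in reaction.items() if coeff > 0)
    return substrates, products


def decompose(reaction: dict) -> dict:
    """Decompose the complete graph on a hyperedge into three edge sets.

    Assumed reading: two homogeneous subgraphs (within substrates, within
    products) and one heterogeneous subgraph (between the two roles).
    """
    substrates, products = split_roles(reaction)
    return {
        "substrate-substrate": set(combinations(sorted(substrates), 2)),
        "product-product": set(combinations(sorted(products), 2)),
        "substrate-product": set(product(sorted(substrates), sorted(products))),
    }


# Two toy reactions over the same metabolites but with different roles.
reaction_1 = {"A": -1, "B": -1, "C": 1}  # A + B -> C
reaction_2 = {"A": -1, "B": 1, "C": 1}   # A -> B + C

# An undirected hyperedge cannot tell them apart; the role split can.
print(set(reaction_1) == set(reaction_2))
print(split_roles(reaction_1) == split_roles(reaction_2))
print(decompose(reaction_1))
```

The two reactions share the vertex set {A, B, C}, so a role-blind hyperedge representation treats them as identical, which is the limitation the abstract attributes to earlier methods.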
Affiliation(s)
- Weihong Huang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
- Feng Yang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
- Qiang Zhang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
- Juan Liu
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
3
Tarzi C, Zampieri G, Sullivan N, Angione C. Emerging methods for genome-scale metabolic modeling of microbial communities. Trends Endocrinol Metab 2024; 35:533-548. [PMID: 38575441 DOI: 10.1016/j.tem.2024.02.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/28/2024] [Accepted: 02/29/2024] [Indexed: 04/06/2024]
Abstract
Genome-scale metabolic models (GEMs) are consolidating as platforms for studying mixed microbial populations, by combining biological data and knowledge with mathematical rigor. However, deploying these models to answer research questions can be challenging due to the increasing number of available computational tools, the lack of universal standards, and their inherent limitations. Here, we present a comprehensive overview of foundational concepts for building and evaluating genome-scale models of microbial communities. We then compare tools in terms of requirements, capabilities, and applications. Next, we highlight the current pitfalls and open challenges to consider when adopting existing tools and developing new ones. Our compendium can be relevant for the expanding community of modelers, both at the entry and experienced levels.
Affiliation(s)
- Chaimaa Tarzi
  - School of Computing, Engineering and Digital Technologies, Teesside University, Southfield Rd, Middlesbrough, TS1 3BX, North Yorkshire, UK
- Guido Zampieri
  - Department of Biology, University of Padova, Padova, 35122, Veneto, Italy
- Neil Sullivan
  - Complement Genomics Ltd, Station Rd, Lanchester, Durham, DH7 0EX, County Durham, UK
- Claudio Angione
  - School of Computing, Engineering and Digital Technologies, Teesside University, Southfield Rd, Middlesbrough, TS1 3BX, North Yorkshire, UK; Centre for Digital Innovation, Teesside University, Southfield Rd, Middlesbrough, TS1 3BX, North Yorkshire, UK; National Horizons Centre, Teesside University, 38 John Dixon Ln, Darlington, DL1 1HG, North Yorkshire, UK
4
Umasekar S, Virivinti N. Advances in modeling techniques for the production and purification of biomolecules: A comprehensive review. J Chromatogr B Analyt Technol Biomed Life Sci 2024; 1232:123945. [PMID: 38113723 DOI: 10.1016/j.jchromb.2023.123945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 10/17/2023] [Accepted: 11/28/2023] [Indexed: 12/21/2023]
Abstract
In response to the growing demand for therapeutic biomolecules, there is a need for continuous and cost-effective bio-separation techniques to enhance extraction yield and efficiency. Aqueous biphasic extractive fermentation has emerged as an integrated downstream processing technique, offering selective partitioning, high productivity, and preservation of biomolecule integrity. However, the dynamic nature of this technique requires a comprehensive understanding of the underlying separation mechanisms. Unfortunately, the analysis of parameters influencing this dynamic behavior can be challenging due to limited resources and time. To address this, mathematical modeling approaches can be employed to minimize the tedious trial-and-error experimentation process. This review article presents mathematical modeling approaches for both upstream and downstream processing techniques, focusing on the production of biomolecules which can be used in pharmaceutical industries in a cost-effective manner. By leveraging mathematical models, researchers can optimize the production and purification processes, leading to improved efficiency and processing cost reduction in biomolecule production.
Affiliation(s)
- Srimathi Umasekar
  - Department of Chemical Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu 620015, India
- Nagajyothi Virivinti
  - Department of Chemical Engineering, National Institute of Technology Tiruchirappalli, Tiruchirappalli, Tamil Nadu 620015, India
5
Carter EL, Constantinidou C, Alam MT. Applications of genome-scale metabolic models to investigate microbial metabolic adaptations in response to genetic or environmental perturbations. Brief Bioinform 2023; 25:bbad439. [PMID: 38048080 PMCID: PMC10694557 DOI: 10.1093/bib/bbad439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 09/21/2023] [Accepted: 11/08/2023] [Indexed: 12/05/2023] Open
Abstract
Microorganisms regularly encounter environmental perturbations, which require metabolic adaptations so that the organism can survive in the new conditions. Various experimental and computational approaches have been used to study the mechanisms of metabolic adaptation in such conditions. Genome-scale metabolic models (GEMs) are one of the most powerful approaches to study metabolism, providing a platform to study the systems-level adaptations of an organism to different environments, which could otherwise be infeasible experimentally. In this review, we describe the application of GEMs in understanding how microbes reprogram their metabolic system as a result of environmental variation. In particular, we provide the details of metabolic model reconstruction approaches, various algorithms and tools for model simulation, consequences of genetic perturbations, integration of '-omics' datasets for creating context-specific models, and their application in studying metabolic adaptation due to changes in environmental conditions.
Affiliation(s)
- Elena Lucy Carter
  - Warwick Medical School, University of Warwick, Coventry, CV4 7HL, UK
6
Hackmann TJ, Zhang B. The phenotype and genotype of fermentative prokaryotes. Sci Adv 2023; 9:eadg8687. [PMID: 37756392 PMCID: PMC10530074 DOI: 10.1126/sciadv.adg8687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Accepted: 08/25/2023] [Indexed: 09/29/2023]
Abstract
Fermentation is a type of metabolism pervasive in oxygen-deprived environments. Despite its importance, we know little about the range and traits of organisms that carry out this metabolism. Our study addresses this gap with a comprehensive analysis of the phenotype and genotype of fermentative prokaryotes. We assembled a dataset with phenotypic records of 8350 organisms plus 4355 genomes and 13.6 million genes. Our analysis reveals fermentation is both widespread (in ~30% of prokaryotes) and complex (forming ~300 combinations of metabolites). Furthermore, it points to previously uncharacterized proteins involved in this metabolism. Previous studies suggest that metabolic pathways for fermentation are well understood, but metabolic models built in our study show gaps in our knowledge. This study demonstrates the complexity of fermentation while showing that there is still much to learn about this metabolism. All resources in our study can be explored by the scientific community with an online, interactive tool.
Affiliation(s)
- Bo Zhang
  - Department of Chemical Engineering, University of California, Santa Barbara, CA, USA
7
Nègre D, Larhlimi A, Bertrand S. Reconciliation and evolution of Penicillium rubens genome-scale metabolic networks-What about specialised metabolism? PLoS One 2023; 18:e0289757. [PMID: 37647283 PMCID: PMC10468094 DOI: 10.1371/journal.pone.0289757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 07/24/2023] [Indexed: 09/01/2023] Open
Abstract
In recent years, genome sequencing of filamentous fungi has revealed a high proportion of specialised metabolites with growing pharmaceutical interest. However, detecting such metabolites through in silico genome analysis does not necessarily guarantee their expression under laboratory conditions. One plausible strategy for enabling their production lies in modifying the growth conditions, but devising a comprehensive experimental design that tests different culture environments is time-consuming and expensive. Therefore, using in silico modelling, such as a Genome-Scale Metabolic Network (GSMN), as a preliminary step represents a promising approach to predicting and understanding the observed specialised metabolite production in a given organism. To address these questions, we reconstructed a new high-quality GSMN for the Penicillium rubens Wisconsin 54-1255 strain, a commonly used model organism. Our reconstruction, iPrub22, adheres to current convention standards and quality criteria, incorporating updated functional annotations, orthology searches with different GSMN templates, data from previous reconstructions, and manual curation steps targeting primary and specialised metabolites. With a MEMOTE score of 74% and a metabolic coverage of 45%, iPrub22 includes 5,192 unique metabolites interconnected by 5,919 reactions, of which 5,033 are supported by at least one genomic sequence. Of the metabolites present in iPrub22, 13% are categorised as belonging to specialised metabolism. While our high-quality GSMN provides a valuable resource for investigating known phenotypes expressed in P. rubens, our analysis identifies bottlenecks related, in particular, to the definition of what is a specialised metabolite, which requires consensus within the scientific community. It also points out the necessity of accessible, standardised and exhaustive databases of specialised metabolites. These questions must be addressed to fully unlock the potential of natural product production in P. rubens and other filamentous fungi. Our work represents a foundational step towards the objective of rationalising the production of natural products through GSMN modelling.
Affiliation(s)
- Delphine Nègre
  - Nantes Université, Institut des Substances et Organismes de la Mer, ISOMer, Nantes, France
  - Nantes Université, École Centrale Nantes, CNRS, Nantes, France
- Samuel Bertrand
  - Nantes Université, Institut des Substances et Organismes de la Mer, ISOMer, Nantes, France
8
Jenior ML, Glass EM, Papin JA. Reconstructor: a COBRApy compatible tool for automated genome-scale metabolic network reconstruction with parsimonious flux-based gap-filling. Bioinformatics 2023; 39:btad367. [PMID: 37279743 PMCID: PMC10275916 DOI: 10.1093/bioinformatics/btad367] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/19/2022] [Revised: 05/22/2023] [Accepted: 06/02/2023] [Indexed: 06/08/2023] Open
Abstract
MOTIVATION: Genome-scale metabolic network reconstructions (GENREs) are valuable for understanding cellular metabolism in silico. Several tools exist for automatic GENRE generation. However, these tools frequently (i) do not readily integrate with some of the widely-used suites of packaged methods available for network analysis, (ii) lack effective network curation tools, (iii) are not sufficiently user-friendly, and (iv) often produce low-quality draft reconstructions. RESULTS: Here, we present Reconstructor, a user-friendly, COBRApy-compatible tool that produces high-quality draft reconstructions with reaction and metabolite naming conventions that are consistent with the ModelSEED biochemistry database and that includes a gap-filling technique based on the principles of parsimony. Reconstructor can generate SBML GENREs from three input types: annotated protein .fasta sequences (Type 1 input), a BLASTp output (Type 2), or an existing SBML GENRE that can be further gap-filled (Type 3). While Reconstructor can be used to create GENREs of any species, we demonstrate the utility of Reconstructor with bacterial reconstructions. We demonstrate how Reconstructor readily generates high-quality GENREs that capture strain, species, and higher taxonomic differences in functional metabolism of bacteria and are useful for further biological discovery. AVAILABILITY AND IMPLEMENTATION: The Reconstructor Python package is freely available for download. Complete installation and usage instructions and benchmarking data are available at http://github.com/emmamglass/reconstructor.
Affiliation(s)
- Matthew L Jenior
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States
- Emma M Glass
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States
- Jason A Papin
  - Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States
  - Department of Medicine, Division of Infectious Diseases & International Health, University of Virginia, Charlottesville, Virginia, United States
  - Department of Biochemistry & Molecular Genetics, University of Virginia, Charlottesville, Virginia, United States
9
Chen C, Liao C, Liu YY. Teasing out missing reactions in genome-scale metabolic networks through hypergraph learning. Nat Commun 2023; 14:2375. [PMID: 37185345 PMCID: PMC10130184 DOI: 10.1038/s41467-023-38110-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/25/2022] [Accepted: 04/14/2023] [Indexed: 05/17/2023] Open
Abstract
GEnome-scale Metabolic models (GEMs) are powerful tools to predict cellular metabolism and physiological states in living organisms. However, due to our imperfect knowledge of metabolic processes, even highly curated GEMs have knowledge gaps (e.g., missing reactions). Existing gap-filling methods typically require phenotypic data as input to tease out missing reactions. We still lack a computational method for rapid and accurate gap-filling of metabolic networks before experimental data are available. Here we present a deep learning-based method, CHEbyshev Spectral HyperlInk pREdictor (CHESHIRE), to predict missing reactions in GEMs purely from metabolic network topology. We demonstrate that CHESHIRE outperforms other topology-based methods in predicting artificially removed reactions over 926 high- and intermediate-quality GEMs. Furthermore, CHESHIRE is able to improve the phenotypic predictions of 49 draft GEMs for the secretion of fermentation products and amino acids. Both types of validation suggest that CHESHIRE is a powerful tool for GEM curation to reveal unknown links between reactions and observed metabolic phenotypes.
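The "artificially removed reactions" validation described above can be sketched as a hold-out experiment: hide some reactions, rank a candidate pool, and count how many hidden reactions surface near the top. The scorer below (mean metabolite degree in the remaining network) is a deliberately naive stand-in and is not CHESHIRE. All names, the toy network and the decoy reactions are hypothetical.

```python
from collections import Counter


def metabolite_degrees(reactions: dict) -> Counter:
    """Count how many remaining reactions each metabolite takes part in."""
    degrees = Counter()
    for metabolites in reactions.values():
        degrees.update(metabolites)
    return degrees


def rank_candidates(candidates: dict, degrees: Counter) -> list:
    """Rank candidate reactions by mean metabolite degree (toy scorer)."""
    def score(name: str) -> float:
        metabolites = candidates[name]
        return sum(degrees[m] for m in metabolites) / len(metabolites)

    # Highest score first; ties are broken by name for determinism.
    return sorted(candidates, key=lambda name: (-score(name), name))


def recovery_rate(ranked: list, removed: set, k: int) -> float:
    """Fraction of removed reactions found among the top-k candidates."""
    return len(set(ranked[:k]) & removed) / len(removed)


# Toy network: each reaction is the set of metabolites it involves.
remaining = {"R1": {"A", "B"}, "R2": {"B", "C"}, "R3": {"C", "D"}, "R4": {"A", "D"}}
held_out = {"H1": {"A", "C"}, "H2": {"B", "D"}}  # artificially removed
decoys = {"N1": {"A", "X"}, "N2": {"Y", "Z"}}    # fake reactions

candidates = {**held_out, **decoys}
ranked = rank_candidates(candidates, metabolite_degrees(remaining))
print(ranked, recovery_rate(ranked, set(held_out), k=2))
```

In practice the candidate pool would come from a universal reaction database and the scorer would be a trained model; only the evaluation loop carries over from this sketch.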
Affiliation(s)
- Can Chen
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Chen Liao
  - Program for Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Yang-Yu Liu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
  - Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, 61801, USA
10
Cheng Y, Bi X, Xu Y, Liu Y, Li J, Du G, Lv X, Liu L. Machine learning for metabolic pathway optimization: A review. Comput Struct Biotechnol J 2023; 21:2381-2393. [PMID: 38213889 PMCID: PMC10781721 DOI: 10.1016/j.csbj.2023.03.045] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/15/2022] [Revised: 03/24/2023] [Accepted: 03/25/2023] [Indexed: 03/29/2023] Open
Abstract
Optimizing the metabolic pathways of microbial cell factories is essential for establishing viable biotechnological production processes. However, due to the limited understanding of the complex setup of cellular machinery, building efficient microbial cell factories remains tedious and time-consuming. Machine learning (ML), a powerful tool capable of identifying patterns within large datasets, has been used to analyze biological datasets generated using various high-throughput technologies to build data-driven models for complex bioprocesses. In addition, ML can also be integrated with Design-Build-Test-Learn to accelerate development. This review focuses on recent ML applications in genome-scale metabolic model construction, multistep pathway optimization, rate-limiting enzyme engineering, and gene regulatory element designing. In addition, we have discussed some limitations of these methods as well as potential solutions.
Affiliation(s)
- Yang Cheng
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Xinyu Bi
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Yameng Xu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Yanfeng Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Jianghua Li
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Guocheng Du
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Xueqin Lv
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
- Long Liu
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, Jiangnan University, Wuxi 214122, China
  - Science Center for Future Foods, Ministry of Education, Jiangnan University, Wuxi 214122, China
11
Context-Specific Genome-Scale Metabolic Modelling and Its Application to the Analysis of COVID-19 Metabolic Signatures. Metabolites 2023; 13:metabo13010126. [PMID: 36677051 PMCID: PMC9866716 DOI: 10.3390/metabo13010126] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/22/2022] [Revised: 12/27/2022] [Accepted: 01/10/2023] [Indexed: 01/19/2023] Open
Abstract
Genome-scale metabolic models (GEMs) have found numerous applications in different domains, ranging from biotechnology to systems medicine. Herein, we overview the most popular algorithms for the automated reconstruction of context-specific GEMs using high-throughput experimental data. Moreover, we describe different datasets applied in the process, and protocols that can be used to further automate the model reconstruction and validation. Finally, we describe recent COVID-19 applications of context-specific GEMs, focusing on the analysis of metabolic implications, identification of biomarkers and potential drug targets.
12
Strain B, Morrissey J, Antonakoudis A, Kontoravdi C. Genome-scale models as a vehicle for knowledge transfer from microbial to mammalian cell systems. Comput Struct Biotechnol J 2023; 21:1543-1549. [PMID: 36879884 PMCID: PMC9984296 DOI: 10.1016/j.csbj.2023.02.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/15/2022] [Revised: 02/06/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023] Open
Abstract
With the plethora of omics data becoming available for mammalian cell and, increasingly, human cell systems, genome-scale metabolic models (GEMs) have emerged as a useful tool for their organisation and analysis. The systems biology community has developed an array of tools for the solution, interrogation and customisation of GEMs as well as algorithms that enable the design of cells with desired phenotypes based on the multi-omics information contained in these models. However, these tools have largely found application in microbial cell systems, which benefit from smaller model size and ease of experimentation. Herein, we discuss the major outstanding challenges in the use of GEMs as a vehicle for accurately analysing data for mammalian cell systems and transferring methodologies that would enable their use to design strains and processes. We provide insights on the opportunities and limitations of applying GEMs to human cell systems for advancing our understanding of health and disease. We further propose their integration with data-driven tools and their enrichment with cellular functions beyond metabolism, which would, in theory, more accurately describe how resources are allocated intracellularly.
Affiliation(s)
- Benjamin Strain
  - Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- James Morrissey
  - Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
- Cleo Kontoravdi
  - Department of Chemical Engineering, Imperial College London, London SW7 2AZ, United Kingdom
13
Aminian-Dehkordi J, Valiei A, Mofrad MRK. Emerging computational paradigms to address the complex role of gut microbial metabolism in cardiovascular diseases. Front Cardiovasc Med 2022; 9:987104. [PMID: 36299869 PMCID: PMC9589059 DOI: 10.3389/fcvm.2022.987104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022] Open
Abstract
The human gut microbiota and its associated perturbations are implicated in a variety of cardiovascular diseases (CVDs). There is evidence that the structure and metabolic composition of the gut microbiome and some of its metabolites have mechanistic associations with several CVDs. Nevertheless, there is a need to unravel metabolic behavior and underlying mechanisms of microbiome-host interactions. This need is even more highlighted when considering that microbiome-secreted metabolites contributing to CVDs are the subject of intensive research to develop new prevention and therapeutic techniques. In addition to the application of high-throughput data used in microbiome-related studies, advanced computational tools enable us to integrate omics into different mathematical models, including constraint-based models, dynamic models, agent-based models, and machine learning tools, to build a holistic picture of metabolic pathological mechanisms. In this article, we aim to review and introduce state-of-the-art mathematical models and computational approaches addressing the link between the microbiome and CVDs.
Affiliation(s)
- Mohammad R. K. Mofrad
- Department of Bioengineering and Mechanical Engineering, University of California, Berkeley, Berkeley, CA, United States
14
Nursimulu N, Moses AM, Parkinson J. Architect: A tool for aiding the reconstruction of high-quality metabolic models through improved enzyme annotation. PLoS Comput Biol 2022; 18:e1010452. [PMID: 36074804 PMCID: PMC9488769 DOI: 10.1371/journal.pcbi.1010452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Revised: 09/20/2022] [Accepted: 07/29/2022] [Indexed: 11/19/2022] Open
Abstract
Constraint-based modeling is a powerful framework for studying cellular metabolism, with applications ranging from predicting growth rates and optimizing production of high value metabolites to identifying enzymes in pathogens that may be targeted for therapeutic interventions. Results from modeling experiments can be affected at least in part by the quality of the metabolic models used. Reconstructing a metabolic network manually can produce a high-quality metabolic model but is a time-consuming task. At the same time, current methods for automating the process typically transfer metabolic function based on sequence similarity, a process known to produce many false positives. We created Architect, a pipeline for automatic metabolic model reconstruction from protein sequences. First, it performs enzyme annotation through an ensemble approach, whereby a likelihood score is computed for an EC prediction based on predictions from existing tools; for this step, our method shows both increased precision and recall compared to individual tools. Next, Architect uses these annotations to construct a high-quality metabolic network which is then gap-filled based on likelihood scores from the ensemble approach. The resulting metabolic model is output in SBML format, suitable for constraints-based analyses. Through comparisons of enzyme annotations and curated metabolic models, we demonstrate improved performance of Architect over other state-of-the-art tools, notably with higher precision and recall on the eukaryote C. elegans and when compared to UniProt annotations in two bacterial species. Code for Architect is available at https://github.com/ParkinsonLab/Architect. For ease-of-use, Architect can be readily set up and utilized using its Docker image, maintained on Docker Hub.
Affiliation(s)
- Nirvana Nursimulu
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Program in Molecular Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
- Alan M. Moses
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Department of Cell & Systems Biology, University of Toronto, Toronto, Ontario, Canada
- John Parkinson
- Program in Molecular Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
15
Garza DR, von Meijenfeldt FAB, van Dijk B, Boleij A, Huynen MA, Dutilh BE. Nutrition or nature: using elementary flux modes to disentangle the complex forces shaping prokaryote pan-genomes. BMC Ecol Evol 2022; 22:101. [PMID: 35974327 PMCID: PMC9382767 DOI: 10.1186/s12862-022-02052-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2021] [Accepted: 07/22/2022] [Indexed: 11/15/2022] Open
Abstract
Background: Microbial pan-genomes are shaped by a complex combination of stochastic and deterministic forces. Even closely related genomes exhibit extensive variation in their gene content. Understanding what drives this variation requires exploring the interactions of gene products with each other and with the organism’s external environment. However, to date, conceptual models of pan-genome dynamics often represent genes as independent units and provide limited information about their mechanistic interactions. Results: We simulated the stochastic process of gene-loss using the pooled genome-scale metabolic reaction networks of 46 taxonomically diverse bacterial and archaeal families as proxies for their pan-genomes. The frequency by which reactions are retained in functional networks when stochastic gene loss is simulated in diverse environments allowed us to disentangle the metabolic reactions whose presence depends on the metabolite composition of the external environment (constrained by “nutrition”) from those that are independent of the environment (constrained by “nature”). By comparing the frequency of reactions from the first group with their observed frequencies in bacterial and archaeal families, we predicted the metabolic niches that shaped the genomic composition of these lineages. Moreover, we found that the lineages that were shaped by a more diverse metabolic niche also occur in more diverse biomes as assessed by global environmental sequencing datasets. Conclusion: We introduce a computational framework for analyzing and interpreting pan-reactomes that provides novel insights into the ecological and evolutionary drivers of pan-genome dynamics. Supplementary Information: The online version contains supplementary material available at 10.1186/s12862-022-02052-3.
16
Beilsmith K, Henry CS, Seaver SMD. Genome-scale modeling of the primary-specialized metabolism interface. Curr Opin Plant Biol 2022; 68:102244. [PMID: 35714443 DOI: 10.1016/j.pbi.2022.102244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2021] [Revised: 04/21/2022] [Accepted: 05/07/2022] [Indexed: 06/15/2023]
Abstract
Environmental challenges and development require plants to reallocate resources between primary and specialized metabolites to survive. Genome-scale metabolic models, which map carbon flux through metabolic pathways, are a valuable tool in the study of tradeoffs that arise at this interface. Due to annotation gaps, models that characterize all the enzymatic steps in individual specialized pathways and their linkages to each other and to central carbon metabolism are difficult to construct. Recent studies have successfully curated subsystems of specialized metabolism and characterized the interfaces where flux is diverted to the precursors of glucosinolates, terpenes, and anthocyanins. Although advances in metabolite profiling can help to constrain models at this interface, quantitative analysis remains challenging because of the different timescales on which specialized metabolites from constitutive and reactive pathways accumulate.
Affiliation(s)
- Kathleen Beilsmith
- Data Science and Learning Division, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, IL 60439, USA
- Christopher S Henry
- Data Science and Learning Division, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, IL 60439, USA
- Samuel M D Seaver
- Data Science and Learning Division, Argonne National Laboratory, 9700 S. Cass Avenue, Lemont, IL 60439, USA
17
Amara A, Frainay C, Jourdan F, Naake T, Neumann S, Novoa-del-Toro EM, Salek RM, Salzer L, Scharfenberg S, Witting M. Networks and Graphs Discovery in Metabolomics Data Analysis and Interpretation. Front Mol Biosci 2022; 9:841373. [PMID: 35350714 PMCID: PMC8957799 DOI: 10.3389/fmolb.2022.841373] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 12/22/2021] [Accepted: 02/18/2022] [Indexed: 01/19/2023] Open
Abstract
Both targeted and untargeted mass spectrometry-based metabolomics approaches are used to understand the metabolic processes taking place in various organisms, from prokaryotes, plants, fungi to animals and humans. Untargeted approaches allow to detect as many metabolites as possible at once, identify unexpected metabolic changes, and characterize novel metabolites in biological samples. However, the identification of metabolites and the biological interpretation of such large and complex datasets remain challenging. One approach to address these challenges is considering that metabolites are connected through informative relationships. Such relationships can be formalized as networks, where the nodes correspond to the metabolites or features (when there is no or only partial identification), and edges connect nodes if the corresponding metabolites are related. Several networks can be built from a single dataset (or a list of metabolites), where each network represents different relationships, such as statistical (correlated metabolites), biochemical (known or putative substrates and products of reactions), or chemical (structural similarities, ontological relations). Once these networks are built, they can subsequently be mined using algorithms from network (or graph) theory to gain insights into metabolism. For instance, we can connect metabolites based on prior knowledge on enzymatic reactions, then provide suggestions for potential metabolite identifications, or detect clusters of co-regulated metabolites. In this review, we first aim at settling a nomenclature and formalism to avoid confusion when referring to different networks used in the field of metabolomics. Then, we present the state of the art of network-based methods for mass spectrometry-based metabolomics data analysis, as well as future developments expected in this area. 
We cover the use of networks applications using biochemical reactions, mass spectrometry features, chemical structural similarities, and correlations between metabolites. We also describe the application of knowledge networks such as metabolic reaction networks. Finally, we discuss the possibility of combining different networks to analyze and interpret them simultaneously.
Affiliation(s)
- Adam Amara
- Section of Nutrition and Metabolism, International Agency for Research on Cancer (IARC-WHO), Lyon, France
- Clément Frainay
- Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Fabien Jourdan
- Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, France
- Thomas Naake
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
- Steffen Neumann
- Bioinformatics and Scientific Data, Leibniz Institute of Plant Biochemistry, Halle (Saale), Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Elva María Novoa-del-Toro
- Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Liesa Salzer
- Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, Neuherberg, Germany
- Sarah Scharfenberg
- Bioinformatics and Scientific Data, Leibniz Institute of Plant Biochemistry, Halle (Saale), Germany
- Michael Witting
- Metabolomics and Proteomics Core, Helmholtz Zentrum München, Neuherberg, Germany
- Chair of Analytical Food Chemistry, TUM School of Life Sciences, Freising, Germany
18
Kuriya Y, Inoue M, Yamamoto M, Murata M, Araki M. Knowledge extraction from literature and enzyme sequences complements FBA analysis in metabolic engineering. Biotechnol J 2021; 16:e2000443. [PMID: 34516717 DOI: 10.1002/biot.202000443] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/10/2020] [Revised: 09/01/2021] [Accepted: 09/10/2021] [Indexed: 11/10/2022]
Abstract
Flux balance analysis (FBA) using genome-scale metabolic model (GSM) is a useful method for improving the bio-production of useful compounds. However, FBA often does not impose important constraints such as nutrients uptakes, by-products excretions and gases (oxygen and carbon dioxide) transfers. Furthermore, important information on metabolic engineering such as enzyme amounts, activities, and characteristics caused by gene expression and enzyme sequences is basically not included in GSM. Therefore, simple FBA is often not sufficient to search for metabolic manipulation strategies that are useful for improving the production of target compounds. In this study, we proposed a method using literature and enzyme search to complement the FBA-based metabolic manipulation strategies. As a case study, this method was applied to shikimic acid production by Corynebacterium glutamicum to verify its usefulness. As unique strategies in literature-mining, overexpression of the transcriptional regulator SugR and gene disruption related to by-products productions were complemented. In the search for alternative enzyme sequences, it was suggested that those candidates are searched for from various species based on features captured by deep learning, which are not simply homologous to amino acid sequences of the base enzymes.
Affiliation(s)
- Yuki Kuriya
- Graduate School of Medicine, Kyoto University, Kyoto, Kyoto, Japan
- Mai Inoue
- Graduate School of Science, Technology and Innovation, Kobe University, Kobe, Hyogo, Japan
- Masaki Yamamoto
- Graduate School of Science, Technology and Innovation, Kobe University, Kobe, Hyogo, Japan
- Masahiro Murata
- Graduate School of Medicine, Kyoto University, Kyoto, Kyoto, Japan
- Michihiro Araki
- Graduate School of Medicine, Kyoto University, Kyoto, Kyoto, Japan
- Graduate School of Science, Technology and Innovation, Kobe University, Kobe, Hyogo, Japan
- Artificial Intelligence Center for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition, Shinjuku-ku, Tokyo, Japan
19
Carey MA, Dräger A, Beber ME, Papin JA, Yurkovich JT. Community standards to facilitate development and address challenges in metabolic modeling. Mol Syst Biol 2021; 16:e9235. [PMID: 32845080 PMCID: PMC8411906 DOI: 10.15252/msb.20199235] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 01/09/2023] Open
Abstract
Standardization of data and models facilitates effective communication, especially in computational systems biology. However, both the development and consistent use of standards and resources remain challenging. As a result, the amount, quality, and format of the information contained within systems biology models are not consistent and therefore present challenges for widespread use and communication. Here, we focused on these standards, resources, and challenges in the field of constraint-based metabolic modeling by conducting a community-wide survey. We used this feedback to (i) outline the major challenges that our field faces and to propose solutions and (ii) identify a set of features that defines what a "gold standard" metabolic network reconstruction looks like concerning content, annotation, and simulation capabilities. We anticipate that this community-driven outline will help the long-term development of community-inspired resources as well as produce high-quality, accessible models within our field. More broadly, we hope that these efforts can serve as blueprints for other computational modeling communities to ensure the continued development of both practical, usable standards and reproducible, knowledge-rich models.
Affiliation(s)
- Maureen A Carey
- Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA
- Andreas Dräger
- Computational Systems Biology of Infection and Antimicrobial-Resistant Pathogens, Institute for Biomedical Informatics (IBMI), University of Tübingen, Tübingen, Germany
- Department of Computer Science, University of Tübingen, Tübingen, Germany
- German Center for Infection Research (DZIF), partner site Tübingen, Tübingen, Germany
- Moritz E Beber
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Denmark
- Jason A Papin
- Division of Infectious Diseases and International Health, Department of Medicine, University of Virginia, Charlottesville, VA, USA
- Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
20
Ibrahim M, Raajaraam L, Raman K. Modelling microbial communities: Harnessing consortia for biotechnological applications. Comput Struct Biotechnol J 2021; 19:3892-3907. [PMID: 34584635 PMCID: PMC8441623 DOI: 10.1016/j.csbj.2021.06.048] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 04/02/2021] [Revised: 06/29/2021] [Accepted: 06/29/2021] [Indexed: 02/06/2023] Open
Abstract
Microbes propagate and thrive in complex communities, and there are many benefits to studying and engineering microbial communities instead of single strains. Microbial communities are being increasingly leveraged in biotechnological applications, as they present significant advantages such as the division of labour and improved substrate utilisation. Nevertheless, they also present some interesting challenges to surmount for the design of efficient biotechnological processes. In this review, we discuss key principles of microbial interactions, followed by a deep dive into genome-scale metabolic models, focussing on a vast repertoire of constraint-based modelling methods that enable us to characterise and understand the metabolic capabilities of microbial communities. Complementary approaches to model microbial communities, such as those based on graph theory, are also briefly discussed. Taken together, these methods provide rich insights into the interactions between microbes and how they influence microbial community productivity. We finally overview approaches that allow us to generate and test numerous synthetic community compositions, followed by tools and methodologies that can predict effective genetic interventions to further improve the productivity of communities. With impending advancements in high-throughput omics of microbial communities, the stage is set for the rapid expansion of microbial community engineering, with a significant impact on biotechnological processes.
Affiliation(s)
- Maziya Ibrahim
- Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
- Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai 600 036, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Lavanya Raajaraam
- Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
- Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai 600 036, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
- Karthik Raman
- Bhupat and Jyoti Mehta School of Biosciences, Department of Biotechnology, Indian Institute of Technology (IIT) Madras, Chennai 600 036, India
- Centre for Integrative Biology and Systems Medicine (IBSE), IIT Madras, Chennai 600 036, India
- Robert Bosch Centre for Data Science and Artificial Intelligence (RBCDSAI), IIT Madras, Chennai 600 036, India
21
Iablokov SN, Novichkov PS, Osterman AL, Rodionov DA. Binary Metabolic Phenotypes and Phenotype Diversity Metrics for the Functional Characterization of Microbial Communities. Front Microbiol 2021; 12:653314. [PMID: 34113324 PMCID: PMC8185038 DOI: 10.3389/fmicb.2021.653314] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/14/2021] [Accepted: 04/06/2021] [Indexed: 01/08/2023] Open
Abstract
The profiling of 16S rRNA revolutionized the exploration of microbiomes, allowing to describe community composition by enumerating relevant taxa and their abundances. However, taxonomic profiles alone lack interpretability in terms of bacterial metabolism, and their translation into functional characteristics of microbiomes is a challenging task. This bottom-up approach minimally requires a reference collection of major metabolic traits deduced from the complete genomes of individual organisms, an accurate method of projecting these traits from a reference collection to the analyzed amplicon sequence variants (ASVs), and, ultimately, an approach to a microbiome-wide aggregation of predicted individual traits into physiologically relevant cumulative metrics to characterize and compare multiple microbiome samples. In this study, we extended a previously introduced computational approach for the functional profiling of complex microbial communities, which is based on the concept of binary metabolic phenotypes encoding the presence ("1") or absence ("0") of various measurable physiological properties in individual organisms that are termed phenotype carriers or non-carriers, respectively. Derived from complete genomes via metabolic reconstruction, binary phenotypes provide a foundation for the prediction of functional traits for each ASV identified in a microbiome sample. Here, we introduced three distinct mapping schemes for a microbiome-wide phenotype prediction and assessed their accuracy on the 16S datasets of mock bacterial communities representing human gut microbiome (HGM) as well as on two large HGM datasets, the American Gut Project and the UK twins study. The 16S sequence-based scheme yielded a more accurate phenotype predictions, while the taxonomy-based schemes demonstrated a reasonable performance to warrant their application for other types of input data (e.g., from shotgun metagenomics or qPCR). 
In addition to the abundance-weighted Community Phenotype Indices (CPIs) reflecting the fractional representation of various phenotype carriers in microbiome samples, we employ metrics capturing the diversity of phenotype carriers, Phenotype Alpha Diversity (PAD) and Phenotype Beta Diversity (PBD). In combination with CPI, PAD allows to classify the robustness of metabolic phenotypes by their anticipated stability in the face of potential environmental perturbations. PBD provides a promising approach for detecting the metabolic features potentially contributing to disease-associated metabolic traits as illustrated by a comparative analysis of HGM samples from healthy and Crohn's disease cohorts.
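The abundance-weighted index described in this abstract can be illustrated with a minimal sketch. It assumes a CPI is simply the share of a sample's total abundance contributed by phenotype carriers; the exact formula used by the authors may differ. The function name, the dictionary layout, the ASV labels, and the example phenotype are all hypothetical, and ASVs without a phenotype assignment are treated as non-carriers purely for illustration.

```python
def community_phenotype_index(abundances, carriers):
    """Abundance-weighted fraction of phenotype carriers in one sample.

    abundances: dict mapping ASV id -> abundance (hypothetical input layout)
    carriers:   dict mapping ASV id -> 1 (carrier) or 0 (non-carrier)
    """
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("sample has no abundance to weight by")
    # Assumption: ASVs missing from `carriers` count as non-carriers.
    carried = sum(
        count for asv, count in abundances.items() if carriers.get(asv, 0) == 1
    )
    return carried / total


# Hypothetical sample with three ASVs and one binary phenotype.
sample = {"asv_a": 50, "asv_b": 30, "asv_c": 20}
phenotype = {"asv_a": 1, "asv_b": 0, "asv_c": 1}
cpi = community_phenotype_index(sample, phenotype)
print(cpi)
```

Under these assumptions the index lies between 0 (no carriers) and 1 (every read belongs to a carrier), which is what makes it comparable across samples of different sequencing depth.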
Affiliation(s)
- Stanislav N. Iablokov
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Andrei L. Osterman
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, United States
- Dmitry A. Rodionov
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, United States
22
Villanova V, Singh D, Pagliardini J, Fell D, Le Monnier A, Finazzi G, Poolman M. Boosting Biomass Quantity and Quality by Improved Mixotrophic Culture of the Diatom Phaeodactylum tricornutum. Front Plant Sci 2021; 12:642199. [PMID: 33897733 PMCID: PMC8063856 DOI: 10.3389/fpls.2021.642199] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/15/2020] [Accepted: 02/22/2021] [Indexed: 06/12/2023]
Abstract
Diatoms are photoautotrophic unicellular algae and are among the most abundant, adaptable, and diverse marine phytoplankton. They are extremely interesting not only for their ecological role but also as potential feedstocks for sustainable biofuels and high-value commodities such as omega fatty acids, because of their capacity to accumulate lipids. However, the cultivation of microalgae on an industrial scale requires higher cell densities and lipid accumulation than those found in nature to make the process economically viable. One of the known ways to induce lipid accumulation in Phaeodactylum tricornutum is nitrogen deprivation, which comes at the expense of growth inhibition and lower cell density. Thus, alternative ways need to be explored to enhance the lipid production as well as biomass density to make them sustainable at industrial scale. In this study, we have used experimental and metabolic modeling approaches to optimize the media composition, in terms of elemental composition, organic and inorganic carbon sources, and light intensity, that boost both biomass quality and quantity of P. tricornutum. Eventually, the optimized conditions were scaled-up to 2 L photobioreactors, where a better system control (temperature, pH, light, aeration/mixing) allowed a further improvement of the biomass capacity of P. tricornutum to 12 g/L.
Affiliation(s)
- Valeria Villanova
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden
- Laboratoire de Physiologie Cellulaire et Végétale, Université Grenoble Alpes (UGA), Centre National de la Recherche Scientifique (CNRS), Commissariat à l'Énergie Atomique et aux Énergies Alternatives (CEA), Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Interdisciplinary Research Institute of Grenoble, CEA Grenoble, Grenoble, France
- Fermentalg SA, Libourne, France
- Dipali Singh
- Microbes in the Food Chain, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Cell System Modelling Group, Oxford Brookes University, Oxford, United Kingdom
- David Fell
- Cell System Modelling Group, Oxford Brookes University, Oxford, United Kingdom
- Giovanni Finazzi
- Laboratoire de Physiologie Cellulaire et Végétale, Université Grenoble Alpes (UGA), Centre National de la Recherche Scientifique (CNRS), Commissariat à l'Énergie Atomique et aux Énergies Alternatives (CEA), Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement, Interdisciplinary Research Institute of Grenoble, CEA Grenoble, Grenoble, France
- Mark Poolman
- Cell System Modelling Group, Oxford Brookes University, Oxford, United Kingdom
23
Chiappino-Pepe A, Hatzimanikatis V. PhenoMapping: a protocol to map cellular phenotypes to metabolic bottlenecks, identify conditional essentiality, and curate metabolic models. STAR Protoc 2021; 2:100280. [PMID: 33532729 PMCID: PMC7829271 DOI: 10.1016/j.xpro.2020.100280] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/06/2023] Open
Abstract
Targeted identification of cellular processes responsible for a phenotype is of major importance in guiding efforts in bioengineering and medicine. Genome-scale metabolic models (GEMs) are widely used to integrate various types of omics data and study the cellular physiology under different conditions. Here, we present PhenoMapping, a protocol that uses GEMs, omics, and phenotypic data to map cellular processes and observed phenotypes. PhenoMapping also classifies genes as conditionally and unconditionally essential and guides a comprehensive curation of GEMs. For complete details on the use and execution of this protocol, please refer to Stanway et al. (2019) and Krishnan et al. (2020).
Affiliation(s)
- Anush Chiappino-Pepe
- Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
- Vassily Hatzimanikatis
- Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
24
Bernstein DB, Sulheim S, Almaas E, Segrè D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol 2021; 22:64. [PMID: 33602294 PMCID: PMC7890832 DOI: 10.1186/s13059-021-02289-z] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 07/22/2020] [Accepted: 02/04/2021] [Indexed: 02/07/2023] Open
Abstract
The reconstruction and analysis of genome-scale metabolic models constitutes a powerful systems biology approach, with applications ranging from basic understanding of genotype-phenotype mapping to solving biomedical and environmental problems. However, the biological insight obtained from these models is limited by multiple heterogeneous sources of uncertainty, which are often difficult to quantify. Here we review the major sources of uncertainty and survey existing approaches developed for representing and addressing them. A unified formal characterization of these uncertainties through probabilistic approaches and ensemble modeling will facilitate convergence towards consistent reconstruction pipelines, improved data integration algorithms, and more accurate assessment of predictive capacity.
Affiliation(s)
- David B Bernstein
- Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Snorre Sulheim
- Bioinformatics Program, Boston University, Boston, MA, USA
- Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Department of Biotechnology and Nanomedicine, SINTEF Industry, Trondheim, Norway
- Eivind Almaas
- Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Daniel Segrè
- Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Bioinformatics Program, Boston University, Boston, MA, USA
- Department of Biology and Department of Physics, Boston University, Boston, MA, USA
25
Systematically gap-filling the genome-scale metabolic model of CHO cells. Biotechnol Lett 2020; 43:73-87. [PMID: 33040240 DOI: 10.1007/s10529-020-03021-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/04/2020] [Accepted: 10/03/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE: Chinese hamster ovary (CHO) cells are the leading cell factories for producing recombinant proteins in the biopharmaceutical industry. In this regard, constraint-based metabolic models are useful platforms to perform computational analysis of cell metabolism. These models need to be regularly updated in order to include the latest biochemical data of the cells, and to increase their predictive power. Here, we provide an update to iCHO1766, the metabolic model of CHO cells. RESULTS: We expanded the existing model of Chinese hamster metabolism with the help of four gap-filling approaches, leading to the addition of 773 new reactions and 335 new genes. We incorporated these into an updated genome-scale metabolic network model of CHO cells, named iCHO2101. In this updated model, the number of reactions and pathways capable of carrying flux is substantially increased. CONCLUSIONS: The present CHO model is an important step towards more complete metabolic models of CHO cells.
26
Rana P, Berry C, Ghosh P, Fong SS. Recent advances on constraint-based models by integrating machine learning. Curr Opin Biotechnol 2020; 64:85-91. [DOI: 10.1016/j.copbio.2019.11.007] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 08/31/2019] [Revised: 11/04/2019] [Accepted: 11/06/2019] [Indexed: 01/06/2023]
27
Egan S, Fukatsu T, Francino MP. Opportunities and Challenges to Microbial Symbiosis Research in the Microbiome Era. Front Microbiol 2020; 11:1150. [PMID: 32612581 PMCID: PMC7308722 DOI: 10.3389/fmicb.2020.01150] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/17/2019] [Accepted: 05/06/2020] [Indexed: 01/04/2023] Open
Affiliation(s)
- Suhelen Egan
- Centre for Marine Science and Innovation (CMSI), School of Biological, Earth and Environmental Sciences (BEES), UNSW Sydney, Sydney, NSW, Australia
- Takema Fukatsu
- Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
- M Pilar Francino
- Joint Research Unit in Genomics and Health, Fundació per al Foment de la Investigació Sanitária i Biomèdica de la Comunitat Valenciana (FISABIO)/Institut de Biologia Integrativa de Sistemes (Universitat de València i Consejo Superior de Investigaciones Científicas), València, Spain; CIBER en Epidemiología y Salud Pública, Madrid, Spain
28
Zamani Amirzakaria J, Malboobi MA, Marashi SA, Lohrasebi T. In silico prediction of enzymatic reactions catalyzed by acid phosphatases. J Biomol Struct Dyn 2020; 39:3900-3911. [PMID: 32615050 DOI: 10.1080/07391102.2020.1785943] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]
Abstract
In the present work, we describe a methodology for predicting an enzymatic reaction for which no experimental data are available except for a gene sequence. As a challenging case, we developed the method for identifying the putative substrates of monoester phosphatases, commonly known as acid phosphatases, which have no strong substrate specificity. Finding a preferred substrate for each one is an important task for unraveling the pathways involved in plant phosphate metabolism. Using an Arabidopsis thaliana haloacid dehalogenase (HAD)-related acid phosphatase, HRP9, with an experimentally known structure and preferred substrate as a reference, we first predicted the 3D structure of HRP1 for subsequent analysis. Then, molecular docking was used to find the best interaction between the protein and a ligand from a set of possible substrates compiled from genome-scale metabolic networks of A. thaliana, based on binding energy, binding mode and the distance between the phosphoric ester and the cofactor, Mg2+, localized in the active site of HRP1. Molecular dynamics simulation confirmed a stable protein-ligand complex model. Our analysis predicted that HRP1 preferably binds pyridoxamine-5'-phosphate (PMP). Thus, it is deduced that the conversion of PMP to pyridoxamine must be catalyzed by HRP1. This procedure is expected to provide a reliable pipeline for predicting the enzymatic reactions catalyzed by acid phosphatases. Taken as a whole, it could be applied to the discovery of interacting ligands, inhibitors and interacting proteins, which would limit laboratory work, or used for gap filling in biosystems. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Javad Zamani Amirzakaria
- Department of Plant Biotechnology, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
- Mohammad Ali Malboobi
- Department of Plant Biotechnology, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
- Sayed-Amir Marashi
- Department of Biotechnology, Faculty of Science, University of Tehran, Tehran, Iran
- Tahmineh Lohrasebi
- Department of Plant Biotechnology, National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
29
Using automated reasoning to explore the metabolism of unconventional organisms: a first step to explore host-microbial interactions. Biochem Soc Trans 2020; 48:901-913. [PMID: 32379295 DOI: 10.1042/bst20190667] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/16/2020] [Revised: 04/01/2020] [Accepted: 04/03/2020] [Indexed: 01/24/2023]
Abstract
Systems modelled in the context of molecular and cellular biology are difficult to represent with a single calibrated numerical model. Flux optimisation hypotheses have shown tremendous promise to accurately predict bacterial metabolism but they require a precise understanding of metabolic reactions occurring in the considered species. Unfortunately, this information may not be available for more complex organisms or non-cultured microorganisms such as those evidenced in microbiomes with metagenomic techniques. In both cases, flux optimisation techniques may not be applicable to elucidate systems functioning. In this context, we describe how automatic reasoning allows relevant features of an unconventional biological system to be identified despite a lack of data. A particular focus is put on the use of Answer Set Programming, a logic programming paradigm with combinatorial optimisation functionalities. We describe its usage to over-approximate metabolic responses of biological systems and solve gap-filling problems. In this review, we compare steady-states and Boolean abstractions of metabolic models and illustrate their complementarity via applications to the metabolic analysis of macro-algae. Ongoing applications of this formalism explore the emerging field of systems ecology, notably elucidating interactions between a consortium of microbes and a host organism. As the first step in this field, we will illustrate how the reduction in microbiotas according to expected metabolic phenotypes can be addressed with gap-filling problems.
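The Boolean abstraction described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than Answer Set Programming, and it assumes the common producibility rule that a metabolite is producible if it is a seed or if some reaction has all of its reactants producible. The function names and the toy network are hypothetical and are not taken from the review.

```python
def scope(seeds, reactions):
    """Return the set of metabolites producible from `seeds`.

    `reactions` maps a reaction name to a (reactants, products) pair of sets.
    A reaction fires once all of its reactants are producible.
    """
    producible = set(seeds)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for reactants, products in reactions.values():
            if reactants <= producible and not products <= producible:
                producible |= products
                changed = True
    return producible


def unproducible_targets(seeds, reactions, targets):
    """Targets that cannot be reached from the seeds: the gaps to fill."""
    return set(targets) - scope(seeds, reactions)


# Hypothetical toy network: D needs C, and C needs E, which is not a seed.
toy_reactions = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"B", "C"}, {"D"}),
    "r3": ({"E"}, {"C"}),
}
gaps = unproducible_targets({"A"}, toy_reactions, {"B", "D"})
```

In this toy network the only gap is D. Adding a candidate reaction that makes C from A restores its producibility, which is the kind of question a gap-filling problem asks of a reaction database.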
30
Hashemi A. CRISPR-Cas9/CRISPRi tools for cell factory construction in E. coli. World J Microbiol Biotechnol 2020; 36:96. [PMID: 32583135 DOI: 10.1007/s11274-020-02872-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/25/2020] [Accepted: 06/19/2020] [Indexed: 12/26/2022]
Abstract
The innovative CRISPR-Cas-based genome editing technology offers advantages such as high efficiency and specificity as well as ease of handling. Both aspects of the CRISPR-Cas9 system, genetic engineering and gene regulation, are applicable to the construction of microbial cell factories. As one of the most extensively used cell factories, E. coli has been engineered to produce various high value-added chemical compounds such as pharmaceuticals, biochemicals, and biofuels. Therefore, to improve the production of valuable metabolites, many investigations have focused on CRISPR-Cas-based metabolic engineering of this host. In the current review, the biology underlying the CRISPR-Cas9 system is briefly explained, and the applications of CRISPR-Cas9/CRISPRi tools for cell factory construction in E. coli are then considered.
Affiliation(s)
- Atieh Hashemi
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Shahid Beheshti University of Medical Sciences, No. 2660, Vali-e-Asr Ave, Tehran, Iran.
31
Ong WK, Midford PE, Karp PD. Taxonomic weighting improves the accuracy of a gap-filling algorithm for metabolic models. Bioinformatics 2020; 36:1823-1830. [PMID: 31688932 PMCID: PMC7523652 DOI: 10.1093/bioinformatics/btz813] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2019] [Revised: 08/29/2019] [Accepted: 10/31/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: The increasing availability of annotated genome sequences enables construction of genome-scale metabolic networks, which are useful tools for studying organisms of interest. However, due to incomplete genome annotations, draft metabolic models contain gaps that must be filled in a time-consuming process before they are usable. Optimization-based algorithms that fill these gaps have been developed; however, gap-filling algorithms show significant error rates and often introduce incorrect reactions. RESULTS: Here, we present a new gap-filling method that computes the costs of candidate gap-filling reactions from a universal reaction database (MetaCyc) based on taxonomic information. When gap-filling a metabolic model for an organism M (such as Escherichia coli), the cost for reaction R is based on the frequency with which R occurs in other organisms within the phylum of M (in this case, Proteobacteria). The assumption behind this method is that different taxonomic groups are biased toward using different metabolic reactions. Evaluation of the new gap-filler on randomly degraded variants of the EcoCyc metabolic model for E. coli showed an increase in the average F1-score to 99.0 (when using the variable weights by frequency method at the phylum level), compared to 91.0 using the previous MetaFlux gap-filler and 80.3 using a basic gap-filler. Evaluation on two other microbial metabolic models showed similar improvements. AVAILABILITY AND IMPLEMENTATION: The Pathway Tools software (including MetaFlux) is free for academic use and is available at http://pathwaytools.com. Additional code for reproducing the results presented here is available at www.ai.sri.com/pkarp/pubs/taxgap/supplementary.zip. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
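The idea of deriving a reaction's gap-filling cost from its frequency within a phylum can be sketched as follows. The abstract does not give the cost formula, so the linear mapping, the cost bounds and all names below are assumptions made for illustration only; this is not the MetaFlux implementation.

```python
def taxonomic_costs(candidate_reactions, phylum_models, min_cost=1.0, max_cost=10.0):
    """Assign each candidate reaction a cost that falls as its frequency rises.

    `phylum_models` is a list of sets, one per organism in the phylum, each
    holding the reaction IDs present in that organism's model.
    The linear frequency-to-cost mapping is an assumption of this sketch.
    """
    n_models = len(phylum_models)
    costs = {}
    for rxn in candidate_reactions:
        if n_models:
            freq = sum(rxn in model for model in phylum_models) / n_models
        else:
            freq = 0.0  # no taxonomic evidence: fall back to the maximum cost
        costs[rxn] = max_cost - (max_cost - min_cost) * freq
    return costs


# Hypothetical reaction IDs and models for four organisms in one phylum.
models = [{"R1", "R2"}, {"R1"}, {"R1", "R3"}, {"R1"}]
costs = taxonomic_costs(["R1", "R2", "R4"], models)
```

Here R1, present in every model, receives the minimum cost, while R4, never observed in the phylum, receives the maximum, so a cost-minimizing gap filler would prefer reactions that are common among related organisms.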
Affiliation(s)
- Wai Kit Ong
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
- Peter E Midford
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
- Peter D Karp
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
32
Norsigian CJ, Fang X, Seif Y, Monk JM, Palsson BO. A workflow for generating multi-strain genome-scale metabolic models of prokaryotes. Nat Protoc 2020; 15:1-14. [PMID: 31863076 PMCID: PMC7017905 DOI: 10.1038/s41596-019-0254-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 05/16/2019] [Accepted: 10/08/2019] [Indexed: 11/09/2022]
Abstract
Genome-scale models (GEMs) of bacterial strains' metabolism have been formulated and used over the past 20 years. Recently, with the number of genome sequences exponentially increasing, multi-strain GEMs have proved valuable to define the properties of a species. Here, through four major stages, we extend the original Protocol used to generate a GEM for a single strain to enable multi-strain GEMs: (i) obtain or generate a high-quality model of a reference strain; (ii) compare the genome sequence between a reference strain and target strains to generate a homology matrix; (iii) generate draft strain-specific models from the homology matrix; and (iv) manually curate draft models. These multi-strain GEMs can be used to study pan-metabolic capabilities and strain-specific differences across a species, thus providing insights into its range of lifestyles. Unlike the original Protocol, this procedure is scalable and can be partly automated with the Supplementary Jupyter notebook Tutorial. This Protocol Extension joins the ranks of other comparable methods for generating models such as CarveMe and KBase. This extension of the original Protocol takes on the order of weeks to multiple months to complete depending on the availability of a suitable reference model.
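Stages (ii) and (iii) of this workflow, going from a homology matrix to a draft strain-specific model, can be sketched in a few lines. The data structures and gene and reaction IDs below are hypothetical. The sketch also simplifies gene-reaction rules by treating all genes of a reaction as alternatives, whereas real rules can require several genes together, so it should be read as an illustration rather than the Protocol's procedure.

```python
def draft_strain_model(reference_reactions, homology_row):
    """Derive a draft reaction set for one target strain.

    `reference_reactions` maps reaction ID -> set of reference gene IDs
    (an empty set means the reaction has no gene association).
    `homology_row` maps reference gene ID -> True if the target strain
    has a homolog (one row of the homology matrix).
    Assumption: any one homologous gene is enough to keep a reaction.
    """
    draft = {}
    for rxn_id, genes in reference_reactions.items():
        kept_genes = {g for g in genes if homology_row.get(g, False)}
        if not genes or kept_genes:
            draft[rxn_id] = kept_genes
    return draft


# Hypothetical reference model and homology calls for one target strain.
reference = {
    "RXN_A": {"g1"},
    "RXN_B": {"g2", "g3"},
    "RXN_EXCHANGE": set(),
    "RXN_C": {"g4"},
}
homology = {"g1": True, "g2": False, "g3": True, "g4": False}
draft = draft_strain_model(reference, homology)
```

RXN_C is dropped because its only gene has no homolog in the target strain. Such automated drafts are the input to the manual curation stage (iv).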
Affiliation(s)
- Charles J Norsigian
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Xin Fang
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Yara Seif
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Jonathan M Monk
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Bernhard O Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA.
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Lyngby, Denmark.
33
Nègre D, Aite M, Belcour A, Frioux C, Brillet-Guéguen L, Liu X, Bordron P, Godfroy O, Lipinska AP, Leblanc C, Siegel A, Dittami SM, Corre E, Markov GV. Genome-Scale Metabolic Networks Shed Light on the Carotenoid Biosynthesis Pathway in the Brown Algae Saccharina japonica and Cladosiphon okamuranus. Antioxidants (Basel) 2019; 8:E564. [PMID: 31744163 PMCID: PMC6912245 DOI: 10.3390/antiox8110564] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/14/2019] [Revised: 11/13/2019] [Accepted: 11/15/2019] [Indexed: 12/20/2022] Open
Abstract
Understanding growth mechanisms in brown algae is a current scientific and economic challenge that can benefit from the modeling of their metabolic networks. The sequencing of the genomes of Saccharina japonica and Cladosiphon okamuranus has provided the necessary data for the reconstruction of Genome-Scale Metabolic Networks (GSMNs). To investigate the metabolic capabilities of these two algae, we used the same in silico method deployed for the GSMN reconstruction of Ectocarpus siliculosus. Integrating metabolic profiling data from the literature, we provided functional GSMNs composed of an average of 2230 metabolites and 3370 reactions. Based on these GSMNs and previously published work, we propose a model for the biosynthetic pathways of the main carotenoids in these two algae. We highlight, on the one hand, the reactions and enzymes that have been preserved through evolution and, on the other hand, the specificities related to brown algae. Our data further indicate that, if abscisic acid is produced by Saccharina japonica, its biosynthesis pathway seems to differ in its final steps from that described in land plants. Thus, our work illustrates the potential of GSMN reconstructions for formalizing hypotheses that can be further tested using targeted biochemical approaches.
Affiliation(s)
- Delphine Nègre
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Sorbonne Université, CNRS, Plateforme ABiMS (FR2424), Station Biologique de Roscoff, 29680 Roscoff, France
- Groupe Mer, Molécules, Santé-EA 2160, UFR des Sciences Pharmaceutiques et Biologiques, Université de Nantes, 9, Rue Bias, 44035 Nantes, France
- Méziane Aite
- Université de Rennes 1, Institute for Research in IT and Random Systems (IRISA), Equipe Dyliss, 35052 Rennes, France
- Arnaud Belcour
- Université de Rennes 1, Institute for Research in IT and Random Systems (IRISA), Equipe Dyliss, 35052 Rennes, France
- Clémence Frioux
- Université de Rennes 1, Institute for Research in IT and Random Systems (IRISA), Equipe Dyliss, 35052 Rennes, France
- Quadram Institute, Colney Lane, Norwich NR4 7UQ, UK
- Loraine Brillet-Guéguen
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Sorbonne Université, CNRS, Plateforme ABiMS (FR2424), Station Biologique de Roscoff, 29680 Roscoff, France
- Xi Liu
- Sorbonne Université, CNRS, Plateforme ABiMS (FR2424), Station Biologique de Roscoff, 29680 Roscoff, France
- Philippe Bordron
- Sorbonne Université, CNRS, Plateforme ABiMS (FR2424), Station Biologique de Roscoff, 29680 Roscoff, France
- Olivier Godfroy
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Agnieszka P. Lipinska
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Catherine Leblanc
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Anne Siegel
- Université de Rennes 1, Institute for Research in IT and Random Systems (IRISA), Equipe Dyliss, 35052 Rennes, France
- Simon M. Dittami
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
- Erwan Corre
- Sorbonne Université, CNRS, Plateforme ABiMS (FR2424), Station Biologique de Roscoff, 29680 Roscoff, France
- Gabriel V. Markov
- Sorbonne Université, CNRS, Integrative Biology of Marine Models (LBI2M), Station Biologique de Roscoff (SBR), 29680 Roscoff, France
34
Abstract
Streptococcus mutans is a Gram-positive bacterium that thrives under acidic conditions and is a primary cause of tooth decay (dental caries). To better understand the metabolism of S. mutans on a systematic level, we manually constructed a genome-scale metabolic model of the S. mutans type strain UA159. The model, called iSMU, contains 675 reactions involving 429 metabolites and the products of 493 genes. We validated iSMU by comparing simulations with growth experiments in defined medium. The model simulations matched experimental results for 17 of 18 carbon source utilization assays and 47 of 49 nutrient depletion assays. We also simulated the effects of single gene deletions. The model's predictions agreed with 78.1% and 84.4% of the gene essentiality predictions from two experimental data sets. Our manually curated model is more accurate than S. mutans models generated from automated reconstruction pipelines and more complete than other manually curated models. We used iSMU to generate hypotheses about the S. mutans metabolic network. Subsequent genetic experiments confirmed that (i) S. mutans catabolizes sorbitol via a sorbitol-6-phosphate 2-dehydrogenase (SMU_308) and (ii) the Leloir pathway is required for growth on complex carbohydrates such as raffinose. We believe the iSMU model is an important resource for understanding the metabolism of S. mutans and guiding future experiments. IMPORTANCE: Tooth decay is the most prevalent chronic disease in the United States. Decay is caused by the bacterium Streptococcus mutans, an oral pathogen that ferments sugars into tooth-destroying lactic acid. We constructed a complete metabolic model of S. mutans to systematically investigate how the bacterium grows. The model provides a valuable resource for understanding and targeting S. mutans' ability to outcompete other species in the oral microbiome.
35
Frioux C, Fremy E, Trottier C, Siegel A. Scalable and exhaustive screening of metabolic functions carried out by microbial consortia. Bioinformatics 2019; 34:i934-i943. [PMID: 30423063 PMCID: PMC6129287 DOI: 10.1093/bioinformatics/bty588] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/16/2022] Open
Abstract
Motivation: The selection of species exhibiting metabolic behaviors of interest is a challenging step when switching from the investigation of a large microbiota to the study of functions effectiveness. Approaches based on a compartmentalized framework are not scalable. The output of scalable approaches based on a non-compartmentalized modeling may be so large that it has neither been explored nor handled so far. Results: We present the Miscoto tool to facilitate the selection of a community optimizing a desired function in a microbiome by reporting several possibilities which can be then sorted according to biological criteria. Communities are exhaustively identified using logical programming and by combining the non-compartmentalized and the compartmentalized frameworks. The benchmarking of 4.9 million metabolic functions associated with the Human Microbiome Project shows that Miscoto is suited to screen and classify metabolic producibility in terms of feasibility, functional redundancy and cooperation processes involved. As an illustration of a host-microbial system, screening the Recon 2.2 human metabolism highlights the role of different consortia within a family of 773 intestinal bacteria. Availability and implementation: Miscoto source code, instructions for use and examples are available at: https://github.com/cfrioux/miscoto.
Affiliation(s)
- Enora Fremy
- Univ Rennes, Inria, CNRS, IRISA, Rennes, France
- Anne Siegel
- Univ Rennes, Inria, CNRS, IRISA, Rennes, France
36
Wilken SE, Swift CL, Podolsky IA, Lankiewicz TS, Seppälä S, O'Malley MA. Linking ‘omics’ to function unlocks the biotech potential of non-model fungi. Curr Opin Syst Biol 2019. [DOI: 10.1016/j.coisb.2019.02.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/24/2022]
37
Abstract
Metabolomic data is the youngest of the high-throughput data types; however, it is potentially one of the most informative, as it provides a direct, quantitative biochemical phenotype. There are a number of ways in which metabolomic data can be analyzed in systems biology; however, the thermodynamic and kinetic relevance of these data cannot be overstated. Genome-scale metabolic network reconstructions provide a natural context to incorporate metabolomic data in order to provide insight into the condition-specific kinetic characteristics of metabolic networks. Herein we discuss how metabolomic data can be incorporated into constraint-based models in a flexible framework that enables scaling from small pathways to cell-scale models, while being able to accommodate coarse-grained to more detailed, allosteric interactions, all using the well-known principle of mass action.
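The principle of mass action named in this abstract can be made concrete with a short sketch. It assumes a single reversible reaction with rate constants and measured concentrations supplied by the caller. The function name, parameter names and numerical values are hypothetical and are not taken from the article.

```python
from math import prod


def mass_action_flux(k_forward, k_reverse, reactants, products):
    """Net rate of a reversible reaction under mass-action kinetics.

    `reactants` and `products` are lists of (concentration, stoichiometric
    coefficient) pairs. The net flux is
        k_forward * prod(reactant^coeff) - k_reverse * prod(product^coeff).
    """
    forward = k_forward * prod(conc ** coeff for conc, coeff in reactants)
    reverse = k_reverse * prod(conc ** coeff for conc, coeff in products)
    return forward - reverse


# Hypothetical reaction A + B <-> C with illustrative concentrations.
net_flux = mass_action_flux(
    k_forward=2.0,
    k_reverse=1.0,
    reactants=[(0.5, 1), (0.4, 1)],
    products=[(0.1, 1)],
)
```

A positive value means the reaction runs in the forward direction at the given concentrations. The value is zero when the concentrations satisfy the equilibrium ratio k_forward / k_reverse, which is how measured metabolite levels constrain the kinetic parameters.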
38
Karp PD, Weaver D, Latendresse M. How accurate is automated gap filling of metabolic models? BMC Syst Biol 2018; 12:73. [PMID: 29914471 PMCID: PMC6006690 DOI: 10.1186/s12918-018-0593-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 02/27/2018] [Accepted: 05/31/2018] [Indexed: 12/20/2022]
Abstract
Background: Reaction gap filling is a computational technique for proposing the addition of reactions to genome-scale metabolic models to permit those models to run correctly. Gap filling completes what are otherwise incomplete models that lack fully connected metabolic networks. The models are incomplete because they are derived from annotated genomes in which not all enzymes have been identified. Here we compare the results of applying an automated likelihood-based gap filler within the Pathway Tools software with the results of manually gap filling the same metabolic model. Both gap-filling exercises were applied to the same genome-derived qualitative metabolic reconstruction for Bifidobacterium longum subsp. longum JCM 1217, and to the same modeling conditions: anaerobic growth under four nutrients producing 53 biomass metabolites. Results: The solution computed by the gap-filling program GenDev contained 12 reactions, but closer examination showed that solution was not minimal; two of the twelve reactions can be removed to yield a set of ten reactions that enable model growth. The manually curated solution contained 13 reactions, eight of which were shared with the 12-reaction computed solution. Thus, GenDev achieved recall of 61.5% and precision of 66.6%. These results suggest that although computational gap fillers are populating metabolic models with significant numbers of correct reactions, automatically gap-filled metabolic models also contain significant numbers of incorrect reactions. Conclusions: Our conclusion is that manual curation of gap-filler results is needed to obtain high-accuracy models. Many of the differences between the manual and automatic solutions resulted from using expert biological knowledge to direct the choice of reactions within the curated solution, such as reactions specific to the anaerobic lifestyle of B. longum.
Electronic supplementary material The online version of this article (10.1186/s12918-018-0593-7) contains supplementary material, which is available to authorized users.
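The recall and precision figures in this abstract follow directly from the reaction counts it reports: 8 shared reactions, 12 computed and 13 manually curated. A minimal sketch of that arithmetic is shown below. The reaction IDs are placeholders because the abstract does not list the reactions.

```python
def precision_recall(predicted, reference):
    """Precision and recall of a predicted reaction set against a reference."""
    predicted, reference = set(predicted), set(reference)
    shared = len(predicted & reference)
    return shared / len(predicted), shared / len(reference)


# Placeholder IDs reproducing only the counts given in the abstract.
shared = {f"shared_{i}" for i in range(8)}
computed = shared | {f"auto_only_{i}" for i in range(4)}      # 12 reactions
manual = shared | {f"manual_only_{i}" for i in range(5)}      # 13 reactions
precision, recall = precision_recall(computed, manual)
```

Recall is 8/13, about 61.5%, matching the abstract. Precision is 8/12, which is 66.7% when rounded; the reported 66.6% is presumably the same value truncated.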
Affiliation(s)
- Peter D Karp
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, 94025, USA.
- Daniel Weaver
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, 94025, USA
- Mario Latendresse
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, 94025, USA
39
Constraint-based modeling in microbial food biotechnology. Biochem Soc Trans 2018; 46:249-260. [PMID: 29588387 PMCID: PMC5906707 DOI: 10.1042/bst20170268] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/19/2018] [Revised: 03/01/2018] [Accepted: 03/02/2018] [Indexed: 12/19/2022]
Abstract
Genome-scale metabolic network reconstruction offers a means to leverage the value of the exponentially growing genomics data and integrate it with other biological knowledge in a structured format. Constraint-based modeling (CBM) enables both the qualitative and quantitative analyses of the reconstructed networks. The rapid advancements in these areas can benefit both the industrial production of microbial food cultures and their application in food processing. CBM provides several avenues for improving our mechanistic understanding of physiology and genotype–phenotype relationships. This is essential for the rational improvement of industrial strains, which can further be facilitated through various model-guided strain design approaches. CBM of microbial communities offers a valuable tool for the rational design of defined food cultures, where it can catalyze hypothesis generation and provide unintuitive rationales for the development of enhanced community phenotypes and, consequently, novel or improved food products. In the industrial-scale production of microorganisms for food cultures, CBM may enable a knowledge-driven bioprocess optimization by rationally identifying strategies for growth and stability improvement. Through these applications, we believe that CBM can become a powerful tool for guiding the areas of strain development, culture development and process optimization in the production of food cultures. Nevertheless, in order to make the correct choice of the modeling framework for a particular application and to interpret model predictions in a biologically meaningful manner, one should be aware of the current limitations of CBM.