1. Huang W, Yang F, Zhang Q, Liu J. A dual-scale fused hypergraph convolution-based hyperedge prediction model for predicting missing reactions in genome-scale metabolic networks. Brief Bioinform 2024; 25:bbae383. [PMID: 39101499 PMCID: PMC11299038 DOI: 10.1093/bib/bbae383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2024] [Revised: 06/24/2024] [Accepted: 07/23/2024] [Indexed: 08/06/2024] Open
Abstract
Genome-scale metabolic models (GEMs) are powerful tools for predicting cellular metabolic and physiological states. However, there are still missing reactions in GEMs due to incomplete knowledge. Recent gap-filling methods suggest directly predicting missing reactions without relying on phenotypic data. However, they do not differentiate between substrates and products when constructing the prediction models, which affects the predictive performance of the models. In this paper, we propose a hyperedge prediction model that distinguishes substrates and products based on dual-scale fused hypergraph convolution, DSHCNet, for inferring the missing reactions to effectively fill gaps in the GEM. First, we model each hyperedge as a heterogeneous complete graph and then decompose it into three subgraphs at both homogeneous and heterogeneous scales. Then we design two graph convolution-based models to, respectively, extract features of the vertices in two scales, which are then fused via the attention mechanism. Finally, the features of all vertices are further pooled to generate the representative feature of the hyperedge. The strategy of graph decomposition in DSHCNet enables the vertices to engage in message passing independently at both scales, thereby enhancing the capability of information propagation and making the obtained product and substrate features more distinguishable. The experimental results show that the average recovery rate of missing reactions obtained by DSHCNet is at least 11.7% higher than that of the state-of-the-art methods, and that the gap-filled GEMs based on our DSHCNet model achieve the best prediction performance, demonstrating the superiority of our method.
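To make the substrate/product distinction in this abstract concrete, the following is a minimal sketch of how a reaction can be encoded as a hyperedge. It is not the DSHCNet implementation; the function name `incidence_matrices` and the toy reactions are assumptions introduced only for illustration. It shows that an unsigned incidence matrix cannot tell a reaction from its reverse, whereas a signed one (-1 for substrates, +1 for products) can.

```python
def incidence_matrices(reactions):
    """Build unsigned and signed incidence matrices for a reaction hypergraph.

    `reactions` maps a reaction id to (substrates, products), both lists of
    metabolite names. Rows are metabolites, columns are reactions (hyperedges).
    Hypothetical helper, not part of any published tool.
    """
    metabolites = sorted({m for subs, prods in reactions.values() for m in subs + prods})
    row = {m: i for i, m in enumerate(metabolites)}
    rxn_ids = list(reactions)
    unsigned = [[0] * len(rxn_ids) for _ in metabolites]
    signed = [[0] * len(rxn_ids) for _ in metabolites]
    for j, rid in enumerate(rxn_ids):
        subs, prods = reactions[rid]
        for m in subs:
            unsigned[row[m]][j] = 1   # metabolite takes part in the reaction
            signed[row[m]][j] = -1    # ... as a substrate
        for m in prods:
            unsigned[row[m]][j] = 1
            signed[row[m]][j] = 1     # ... as a product
    return metabolites, rxn_ids, unsigned, signed


# Toy example: R2 is the reverse of R1.
toy = {"R1": (["A", "B"], ["C"]), "R2": (["C"], ["A", "B"])}
mets, rxns, unsigned, signed = incidence_matrices(toy)
same_unsigned = all(r[0] == r[1] for r in unsigned)  # columns are identical
same_signed = all(r[0] == r[1] for r in signed)      # columns differ
```

In this toy case the two hyperedges are indistinguishable without the sign, which is the limitation the abstract attributes to earlier topology-based methods.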
Affiliation(s)
- Weihong Huang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
- Feng Yang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
- Qiang Zhang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
- Juan Liu
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, Hubei 430072, China
2. Moyer DC, Reimertz J, Segrè D, Fuxman Bass JI. Semi-Automatic Detection of Errors in Genome-Scale Metabolic Models. bioRxiv 2024:2024.06.24.600481. [PMID: 38979177 PMCID: PMC11230171 DOI: 10.1101/2024.06.24.600481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background: Genome-Scale Metabolic Models (GSMMs) are used for numerous tasks requiring computational estimates of metabolic fluxes, from predicting novel drug targets to engineering microbes to produce valuable compounds. A key limiting step in most applications of GSMMs is ensuring their representation of the target organism's metabolism is complete and accurate. Identifying and visualizing errors in GSMMs is complicated by the fact that they contain thousands of densely interconnected reactions. Furthermore, many errors in GSMMs only become apparent when considering pathways of connected reactions collectively, as opposed to examining reactions individually. Results: We present Metabolic Accuracy Check and Analysis Workflow (MACAW), a collection of algorithms for detecting errors in GSMMs. The relative frequencies of errors we detect in manually curated GSMMs appear to reflect the different approaches used to curate them. Changing the method used to automatically create a GSMM from a particular organism's genome can have a larger impact on the kinds of errors in the resulting GSMM than using the same method with a different organism's genome. Our algorithms are particularly capable of identifying errors that are only apparent at the pathway level, including loops, and nontrivial cases of dead ends. Conclusions: MACAW is capable of identifying inaccuracies of varying severity in a wide range of GSMMs. Correcting these errors can measurably improve the predictive capacity of a GSMM. The relative prevalence of each type of error we identify in a large collection of GSMMs could help shape future efforts for further automation of error correction and GSMM creation.
3. Cruz F, Capela J, Ferreira EC, Rocha M, Dias O. BioISO: An Objective-Oriented Application for Assisting the Curation of Genome-Scale Metabolic Models. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:215-226. [PMID: 38170658 DOI: 10.1109/tcbb.2023.3339972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
As the reconstruction of Genome-Scale Metabolic Models (GEMs) becomes standard practice in systems biology, the number of organisms having at least one metabolic model is peaking at an unprecedented scale. The automation of laborious tasks, such as gap-finding and gap-filling, allowed the development of GEMs for poorly described organisms. However, the quality of these models can be compromised by the automation of several steps, which may lead to erroneous phenotype simulations. Biological networks constraint-based In Silico Optimisation (BioISO) is a computational tool aimed at accelerating the reconstruction of GEMs. This tool facilitates manual curation steps by reducing the large search spaces often met when debugging in silico biological models. BioISO uses a recursive relation-like algorithm and Flux Balance Analysis (FBA) to evaluate and guide debugging of in silico phenotype simulations. The potential of BioISO to guide the debugging of model reconstructions was showcased and compared with the results of two other state-of-the-art gap-filling tools (Meneco and fastGapFill). In this assessment, BioISO is better suited to reducing the search space for errors and gaps in metabolic networks by identifying smaller ratios of dead-end metabolites. Furthermore, BioISO was used as Meneco's gap-finding algorithm to reduce the number of proposed solutions for filling the gaps.
4. Chen C, Liao C, Liu YY. Teasing out missing reactions in genome-scale metabolic networks through hypergraph learning. Nat Commun 2023; 14:2375. [PMID: 37185345 PMCID: PMC10130184 DOI: 10.1038/s41467-023-38110-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/25/2022] [Accepted: 04/14/2023] [Indexed: 05/17/2023] Open
Abstract
GEnome-scale Metabolic models (GEMs) are powerful tools to predict cellular metabolism and physiological states in living organisms. However, due to our imperfect knowledge of metabolic processes, even highly curated GEMs have knowledge gaps (e.g., missing reactions). Existing gap-filling methods typically require phenotypic data as input to tease out missing reactions. We still lack a computational method for rapid and accurate gap-filling of metabolic networks before experimental data is available. Here we present a deep learning-based method - CHEbyshev Spectral HyperlInk pREdictor (CHESHIRE) - to predict missing reactions in GEMs purely from metabolic network topology. We demonstrate that CHESHIRE outperforms other topology-based methods in predicting artificially removed reactions over 926 high- and intermediate-quality GEMs. Furthermore, CHESHIRE is able to improve the phenotypic predictions of 49 draft GEMs for fermentation products and amino acids secretions. Both types of validation suggest that CHESHIRE is a powerful tool for GEM curation to reveal unknown links between reactions and observed metabolic phenotypes.
Affiliation(s)
- Can Chen
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
- Chen Liao
  - Program for Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Yang-Yu Liu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, 02115, USA
  - Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign, IL, 61801, USA
5. Network Reconstruction and Modelling Made Reproducible with moped. Metabolites 2022; 12:metabo12040275. [PMID: 35448462 PMCID: PMC9032245 DOI: 10.3390/metabo12040275] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/14/2021] [Revised: 02/24/2022] [Accepted: 03/15/2022] [Indexed: 11/23/2022] Open
Abstract
Mathematical modeling of metabolic networks is a powerful approach to investigate the underlying principles of metabolism and growth. Such approaches include, among others, differential-equation-based modeling of metabolic systems, constraint-based modeling and metabolic network expansion of metabolic networks. Most of these methods are well established and are implemented in numerous software packages, but these are scattered between different programming languages, packages and syntaxes. This complicates establishing straightforward pipelines integrating model construction and simulation. We present a Python package moped that serves as an integrative hub for reproducible construction, modification, curation and analysis of metabolic models. moped supports draft reconstruction of models directly from genome/proteome sequences and pathway/genome databases utilizing GPR annotations, providing a completely reproducible model construction and curation process within executable Python scripts. Alternatively, existing models published in SBML format can be easily imported. Models are represented as Python objects, for which a wide spectrum of easy-to-use modification and analysis methods exist. The model structure can be manually altered by adding, removing or modifying reactions, and gap-filling reactions can be found and inspected. This greatly supports the development of draft models, as well as the curation and testing of models. Moreover, moped provides several analysis methods, in particular including the calculation of biosynthetic capacities using metabolic network expansion. The integration with other Python-based tools is facilitated through various model export options. For example, a model can be directly converted into a CobraPy object for constraint-based analyses. moped is a fully documented and expandable Python package. We demonstrate the capability to serve as a hub for integrating reproducible model construction and curation, database import, metabolic network expansion and export for constraint-based analyses.
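The metabolic network expansion mentioned in this abstract can be illustrated with a short sketch. This is not moped's API; the function `network_expansion`, the data layout, and the toy reactions are assumptions made for illustration. Starting from a seed set of metabolites, reactions whose substrates are all available are fired repeatedly until no new metabolite appears; the resulting set is usually called the scope.

```python
def network_expansion(reactions, seed):
    """Return the scope of `seed`: every metabolite reachable by repeatedly
    firing reactions whose substrates are all already available.

    `reactions` maps a reaction id to (substrates, products).
    Hypothetical helper for illustration; treats reactions as irreversible.
    """
    scope = set(seed)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions.values():
            # A reaction can fire only when every substrate is in the scope.
            if set(substrates) <= scope and not set(products) <= scope:
                scope |= set(products)
                changed = True
    return scope


toy_network = {
    "R1": (["glc"], ["g6p"]),
    "R2": (["g6p"], ["f6p"]),
    "R3": (["x"], ["y"]),  # never fires: "x" is not reachable from the seed
}
scope = network_expansion(toy_network, {"glc"})
```

Metabolites that are expected but absent from the scope point to candidate gaps in a draft model.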
6. Giannari D, Ho CH, Mahadevan R. A gap-filling algorithm for prediction of metabolic interactions in microbial communities. PLoS Comput Biol 2021; 17:e1009060. [PMID: 34723959 PMCID: PMC8584699 DOI: 10.1371/journal.pcbi.1009060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/04/2021] [Revised: 11/11/2021] [Accepted: 10/05/2021] [Indexed: 11/24/2022] Open
Abstract
The study of microbial communities and their interactions has attracted the interest of the scientific community, because of their potential for applications in biotechnology, ecology and medicine. The complexity of interspecies interactions, which are key for the macroscopic behavior of microbial communities, cannot be studied easily experimentally. For this reason, the modeling of microbial communities has begun to leverage the knowledge of established constraint-based methods, which have long been used for studying and analyzing the microbial metabolism of individual species based on genome-scale metabolic reconstructions of microorganisms. A main problem of genome-scale metabolic reconstructions is that they usually contain metabolic gaps due to genome misannotations and unknown enzyme functions. This problem is traditionally solved by using gap-filling algorithms that add biochemical reactions from external databases to the metabolic reconstruction, in order to restore model growth. However, gap-filling algorithms could evolve by taking into account metabolic interactions among species that coexist in microbial communities. In this work, a gap-filling method that resolves metabolic gaps at the community level was developed. The efficacy of the algorithm was tested by analyzing its ability to resolve metabolic gaps on a synthetic community of auxotrophic Escherichia coli strains. Subsequently, the algorithm was applied to resolve metabolic gaps and predict metabolic interactions in a community of Bifidobacterium adolescentis and Faecalibacterium prausnitzii, two species present in the human gut microbiota, and in an experimentally studied community of Dehalobacter and Bacteroidales species of the ACT-3 community. The community gap-filling method can facilitate the improvement of metabolic models and the identification of metabolic interactions that are difficult to identify experimentally in microbial communities.
Affiliation(s)
- Dafni Giannari
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada
- Radhakrishnan Mahadevan
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario, Canada
  - The Institute of Biomaterials & Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada
7. Moyer D, Pacheco AR, Bernstein DB, Segrè D. Stoichiometric Modeling of Artificial String Chemistries Reveals Constraints on Metabolic Network Structure. J Mol Evol 2021; 89:472-483. [PMID: 34230992 PMCID: PMC8318951 DOI: 10.1007/s00239-021-10018-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/27/2020] [Accepted: 06/12/2021] [Indexed: 11/15/2022]
Abstract
Uncovering the general principles that govern the structure of metabolic networks is key to understanding the emergence and evolution of living systems. Artificial chemistries can help illuminate this problem by enabling the exploration of chemical reaction universes that are constrained by general mathematical rules. Here, we focus on artificial chemistries in which strings of characters represent simplified molecules, and string concatenation and splitting represent possible chemical reactions. We developed a novel Python package, ARtificial CHemistry NEtwork Toolbox (ARCHNET), to study string chemistries using tools from the field of stoichiometric constraint-based modeling. In addition to exploring the topological characteristics of different string chemistry networks, we developed a network-pruning algorithm that can generate minimal metabolic networks capable of producing a specified set of biomass precursors from a given assortment of environmental nutrients. We found that the composition of these minimal metabolic networks was influenced more strongly by the metabolites in the biomass reaction than the identities of the environmental nutrients. This finding has important implications for the reconstruction of organismal metabolic networks and could help us better understand the rise and evolution of biochemical organization. More generally, our work provides a bridge between artificial chemistries and stoichiometric modeling, which can help address a broad range of open questions, from the spontaneous emergence of an organized metabolism to the structure of microbial communities.
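As a rough illustration of the string-chemistry idea in this abstract, the sketch below enumerates every concatenation reaction over a small alphabet. It does not use ARCHNET and does not reproduce its interface; the function names and the `max_len` parameter are assumptions chosen for this example. Each reaction joins two strings into one, and reading it backwards gives the corresponding splitting reaction.

```python
from itertools import product


def enumerate_strings(alphabet, max_len):
    """All non-empty strings over `alphabet` with at most `max_len` characters."""
    return [
        "".join(chars)
        for n in range(1, max_len + 1)
        for chars in product(alphabet, repeat=n)
    ]


def concatenation_reactions(alphabet, max_len):
    """Set of (substrates, product) pairs, where the product is the two
    substrates joined in some order. Substrates are stored sorted so that
    the same reaction found from either side is counted once.
    """
    molecules = enumerate_strings(alphabet, max_len)
    reactions = set()
    for left in molecules:
        for right in molecules:
            if len(left) + len(right) <= max_len:
                reactions.add((tuple(sorted((left, right))), left + right))
    return reactions


small = concatenation_reactions("ab", 2)
# Contains, for example, (("a", "b"), "ab") and (("a", "b"), "ba").
```

A stoichiometric matrix for constraint-based analysis could then be assembled from these pairs, with one column per reaction.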
Affiliation(s)
- Devlin Moyer
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
- Alan R Pacheco
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Biological Design Center, Boston University, Boston, MA, 02215, USA
- David B Bernstein
  - Biological Design Center, Boston University, Boston, MA, 02215, USA
  - Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
- Daniel Segrè
  - Bioinformatics Program, Boston University, Boston, MA, 02215, USA
  - Department of Biology, Boston University, Boston, MA, 02215, USA
  - Biological Design Center, Boston University, Boston, MA, 02215, USA
  - Department of Biomedical Engineering, Boston University, Boston, MA, 02215, USA
  - Department of Physics, Boston University, Boston, MA, 02215, USA
8. Bernstein DB, Sulheim S, Almaas E, Segrè D. Addressing uncertainty in genome-scale metabolic model reconstruction and analysis. Genome Biol 2021; 22:64. [PMID: 33602294 PMCID: PMC7890832 DOI: 10.1186/s13059-021-02289-z] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 07/22/2020] [Accepted: 02/04/2021] [Indexed: 02/07/2023] Open
Abstract
The reconstruction and analysis of genome-scale metabolic models constitutes a powerful systems biology approach, with applications ranging from basic understanding of genotype-phenotype mapping to solving biomedical and environmental problems. However, the biological insight obtained from these models is limited by multiple heterogeneous sources of uncertainty, which are often difficult to quantify. Here we review the major sources of uncertainty and survey existing approaches developed for representing and addressing them. A unified formal characterization of these uncertainties through probabilistic approaches and ensemble modeling will facilitate convergence towards consistent reconstruction pipelines, improved data integration algorithms, and more accurate assessment of predictive capacity.
Affiliation(s)
- David B Bernstein
  - Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
- Snorre Sulheim
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Biotechnology and Nanomedicine, SINTEF Industry, Trondheim, Norway
- Eivind Almaas
  - Department of Biotechnology and Food Science, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
  - K.G. Jebsen Center for Genetic Epidemiology, NTNU - Norwegian University of Science and Technology, Trondheim, Norway
- Daniel Segrè
  - Department of Biomedical Engineering and Biological Design Center, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
  - Department of Biology and Department of Physics, Boston University, Boston, MA, USA
9. Xie Y, Chen L, Sun T, Zhang W. Deciphering and engineering high-light tolerant cyanobacteria for efficient photosynthetic cell factories. Chin J Chem Eng 2021. [DOI: 10.1016/j.cjche.2020.11.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
10. Belcour A, Frioux C, Aite M, Bretaudeau A, Hildebrand F, Siegel A. Metage2Metabo, microbiota-scale metabolic complementarity for the identification of key species. eLife 2020; 9:e61968. [PMID: 33372654 PMCID: PMC7861615 DOI: 10.7554/elife.61968] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/10/2020] [Accepted: 12/25/2020] [Indexed: 12/13/2022] Open
Abstract
To capture the functional diversity of microbiota, one must identify metabolic functions and species of interest within hundreds or thousands of microorganisms. We present Metage2Metabo (M2M), a resource that meets the need for de novo functional screening of genome-scale metabolic networks (GSMNs) at the scale of a metagenome, and the identification of critical species with respect to metabolic cooperation. M2M comprises a flexible pipeline for the characterisation of individual metabolisms and collective metabolic complementarity. In addition, M2M identifies key species that are meaningful members of the community for functions of interest. We demonstrate that M2M is applicable to collections of genomes as well as metagenome-assembled genomes, permits an efficient GSMN reconstruction with Pathway Tools, and assesses the cooperation potential between species. M2M identifies key organisms by reducing the complexity of a large-scale microbiota into minimal communities with equivalent properties, suitable for further analyses.
Affiliation(s)
- Clémence Frioux
  - Univ Rennes, Inria, CNRS, IRISA, Rennes, France
  - Inria Bordeaux Sud-Ouest, Talence, France
  - Gut Microbes and Health, Quadram Institute, Norwich, United Kingdom
  - Digital Biology, Earlham Institute, Norwich, United Kingdom
- Anthony Bretaudeau
  - Univ Rennes, Inria, CNRS, IRISA, Rennes, France
  - Inria, UMR IGEPP, BioInformatics Platform for Agroecosystems Arthropods (BIPAA), Rennes, France
  - Inria, IRISA, GenOuest Core Facility, Rennes, France
- Falk Hildebrand
  - Gut Microbes and Health, Quadram Institute, Norwich, United Kingdom
  - Digital Biology, Earlham Institute, Norwich, United Kingdom
11. Hanna EM, Zhang X, Eide M, Fallahi S, Furmanek T, Yadetie F, Zielinski DC, Goksøyr A, Jonassen I. ReCodLiver0.9: Overcoming Challenges in Genome-Scale Metabolic Reconstruction of a Non-model Species. Front Mol Biosci 2020; 7:591406. [PMID: 33324679 PMCID: PMC7726423 DOI: 10.3389/fmolb.2020.591406] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/04/2020] [Accepted: 10/22/2020] [Indexed: 12/13/2022] Open
Abstract
The availability of genome sequences, annotations, and knowledge of the biochemistry underlying metabolic transformations has led to the generation of metabolic network reconstructions for a wide range of organisms in bacteria, archaea, and eukaryotes. When modeled using mathematical representations, a reconstruction can simulate underlying genotype-phenotype relationships. Accordingly, genome-scale metabolic models (GEMs) can be used to predict the response of organisms to genetic and environmental variations. A bottom-up reconstruction procedure typically starts by generating a draft model from existing annotation data on a target organism. For model species, this part of the process can be straightforward, due to the abundant organism-specific biochemical data. However, the process becomes complicated for non-model less-annotated species. In this paper, we present a draft liver reconstruction, ReCodLiver0.9, of Atlantic cod (Gadus morhua), a non-model teleost fish, as a practicable guide for cases with comparably few resources. Although the reconstruction is considered a draft version, we show that it already has utility in elucidating metabolic response mechanisms to environmental toxicants by mapping gene expression data of exposure experiments to the resulting model.
Affiliation(s)
- Eileen Marie Hanna
  - Department of Computer Science and Mathematics, Lebanese American University, Byblos, Lebanon
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Xiaokang Zhang
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
- Marta Eide
  - Department of Biological Sciences, University of Bergen, Bergen, Norway
- Shirin Fallahi
  - Department of Mathematics, University of Bergen, Bergen, Norway
- Fekadu Yadetie
  - Department of Biological Sciences, University of Bergen, Bergen, Norway
- Daniel Craig Zielinski
  - Department of Bioengineering, University of California, San Diego, La Jolla, CA, United States
- Anders Goksøyr
  - Department of Biological Sciences, University of Bergen, Bergen, Norway
- Inge Jonassen
  - Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
12. Systematically gap-filling the genome-scale metabolic model of CHO cells. Biotechnol Lett 2020; 43:73-87. [PMID: 33040240 DOI: 10.1007/s10529-020-03021-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/04/2020] [Accepted: 10/03/2020] [Indexed: 10/23/2022]
Abstract
OBJECTIVE: Chinese hamster ovary (CHO) cells are the leading cell factories for producing recombinant proteins in the biopharmaceutical industry. In this regard, constraint-based metabolic models are useful platforms to perform computational analysis of cell metabolism. These models need to be regularly updated in order to include the latest biochemical data of the cells, and to increase their predictive power. Here, we provide an update to iCHO1766, the metabolic model of CHO cells. RESULTS: We expanded the existing model of Chinese hamster metabolism with the help of four gap-filling approaches, leading to the addition of 773 new reactions and 335 new genes. We incorporated these into an updated genome-scale metabolic network model of CHO cells, named iCHO2101. In this updated model, the number of reactions and pathways capable of carrying flux is substantially increased. CONCLUSIONS: The present CHO model is an important step towards more complete metabolic models of CHO cells.
13. Ong WK, Midford PE, Karp PD. Taxonomic weighting improves the accuracy of a gap-filling algorithm for metabolic models. Bioinformatics 2020; 36:1823-1830. [PMID: 31688932 PMCID: PMC7523652 DOI: 10.1093/bioinformatics/btz813] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2019] [Revised: 08/29/2019] [Accepted: 10/31/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: The increasing availability of annotated genome sequences enables construction of genome-scale metabolic networks, which are useful tools for studying organisms of interest. However, due to incomplete genome annotations, draft metabolic models contain gaps that must be filled in a time-consuming process before they are usable. Optimization-based algorithms that fill these gaps have been developed; however, gap-filling algorithms show significant error rates and often introduce incorrect reactions. RESULTS: Here, we present a new gap-filling method that computes the costs of candidate gap-filling reactions from a universal reaction database (MetaCyc) based on taxonomic information. When gap-filling a metabolic model for an organism M (such as Escherichia coli), the cost for reaction R is based on the frequency with which R occurs in other organisms within the phylum of M (in this case, Proteobacteria). The assumption behind this method is that different taxonomic groups are biased toward using different metabolic reactions. Evaluation of the new gap-filler on randomly degraded variants of the EcoCyc metabolic model for E. coli showed an increase in the average F1-score to 99.0 (when using the variable weights by frequency method at the phylum level), compared to 91.0 using the previous MetaFlux gap-filler and 80.3 using a basic gap-filler. Evaluation on two other microbial metabolic models showed similar improvements. AVAILABILITY AND IMPLEMENTATION: The Pathway Tools software (including MetaFlux) is free for academic use and is available at http://pathwaytools.com. Additional code for reproducing the results presented here is available at www.ai.sri.com/pkarp/pubs/taxgap/supplementary.zip. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
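The frequency-based costing described in this abstract can be sketched as follows. The abstract does not give the cost formula, so the linear discount below, the parameters `base_cost` and `max_discount`, and both function names are assumptions for illustration only; they are not the MetaFlux implementation. The idea shown is that a candidate reaction seen in many organisms of the same phylum becomes cheaper to add during gap-filling.

```python
def reaction_frequencies(phylum_models):
    """Fraction of organisms in the phylum whose model contains each reaction.

    `phylum_models` maps an organism name to an iterable of reaction ids.
    """
    counts = {}
    for reactions in phylum_models.values():
        for rxn in set(reactions):
            counts[rxn] = counts.get(rxn, 0) + 1
    n_organisms = len(phylum_models)
    return {rxn: count / n_organisms for rxn, count in counts.items()}


def candidate_costs(candidates, frequencies, base_cost=1.0, max_discount=0.9):
    """Assumed cost rule: a linear discount proportional to phylum frequency.

    Reactions never seen in the phylum keep the full `base_cost`.
    """
    return {
        rxn: base_cost * (1.0 - max_discount * frequencies.get(rxn, 0.0))
        for rxn in candidates
    }


phylum = {"org1": ["R1", "R2"], "org2": ["R1"]}
freqs = reaction_frequencies(phylum)
costs = candidate_costs(["R1", "R2", "R3"], freqs)
# R1 (present in both organisms) is cheapest; R3 (absent) keeps the base cost.
```

An optimization-based gap-filler would then minimize the summed cost of the reactions it adds, which biases the solution toward reactions common in related organisms.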
Affiliation(s)
- Wai Kit Ong
  - Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
- Peter E Midford
  - Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
- Peter D Karp
  - Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA 94025, USA
14. Distributed flux balance analysis simulations of serial biomass fermentation by two organisms. PLoS One 2020; 15:e0227363. [PMID: 31945096 PMCID: PMC6964848 DOI: 10.1371/journal.pone.0227363] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/04/2019] [Accepted: 12/17/2019] [Indexed: 12/16/2022] Open
Abstract
Intelligent biorefinery design that addresses both the composition of the biomass feedstock as well as fermentation microorganisms could benefit from dedicated tools for computational simulation and computer-assisted optimization. Here we present the BioLego Vn2.0 framework, based on Microsoft Azure Cloud, which supports large-scale simulations of biomass serial fermentation processes by two different organisms. BioLego enables the simultaneous analysis of multiple fermentation scenarios and the comparison of fermentation potential of multiple feedstock compositions. Thanks to the effective use of cloud computing it further allows resource intensive analysis and exploration of media and organism modifications. We use BioLego to obtain biological and validation results, including (1) exploratory search for the optimal utilization of corn biomasses (corn cobs, corn fiber and corn stover) in fermentation biorefineries; (2) analysis of the possible effects of changes in the composition of K. alvarezi biomass on the ethanol production yield in an anaerobic two-step process (S. cerevisiae followed by E. coli); (3) analysis of the impact, on the estimated ethanol production yield, of knocking out single organism reactions either in one or in both organisms in an anaerobic two-step fermentation process of Ulva sp. into ethanol (S. cerevisiae followed by E. coli); and (4) comparison of several experimentally measured ethanol fermentation rates with the predictions of BioLego.
15. Vijayakumar S, Conway M, Lió P, Angione C. Seeing the wood for the trees: a forest of methods for optimization and omic-network integration in metabolic modelling. Brief Bioinform 2019; 19:1218-1235. [PMID: 28575143 DOI: 10.1093/bib/bbx053] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 01/13/2017] [Indexed: 11/13/2022] Open
Abstract
Metabolic modelling has entered a mature phase with dozens of methods and software implementations available to the practitioner and the theoretician. It is not easy for a modeller to be able to see the wood (or the forest) for the trees. Driven by this analogy, we here present a 'forest' of principal methods used for constraint-based modelling in systems biology. This provides a tree-based view of methods available to prospective modellers, also available in interactive version at http://modellingmetabolism.net, where it will be kept updated with new methods after the publication of the present manuscript. Our updated classification of existing methods and tools highlights the most promising in the different branches, with the aim to develop a vision of how existing methods could hybridize and become more complex. We then provide the first hands-on tutorial for multi-objective optimization of metabolic models in R. We finally discuss the implementation of multi-view machine learning approaches in poly-omic integration. Throughout this work, we demonstrate the optimization of trade-offs between multiple metabolic objectives, with a focus on omic data integration through machine learning. We anticipate that the combination of a survey, a perspective on multi-view machine learning and a step-by-step R tutorial should be of interest for both the beginner and the advanced user.
Affiliation(s)
- Max Conway
- Computer Laboratory, University of Cambridge, UK
- Pietro Lió
- Computer Laboratory, University of Cambridge, UK
- Claudio Angione
- Department of Computer Science and Information Systems, Teesside University, UK
16
Cho JS, Gu C, Han TH, Ryu JY, Lee SY. Reconstruction of context-specific genome-scale metabolic models using multiomics data to study metabolic rewiring. Curr Opin Syst Biol 2019. [DOI: 10.1016/j.coisb.2019.02.009] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 02/07/2023]
17
Santos-Merino M, Singh AK, Ducat DC. New Applications of Synthetic Biology Tools for Cyanobacterial Metabolic Engineering. Front Bioeng Biotechnol 2019; 7:33. [PMID: 30873404 PMCID: PMC6400836 DOI: 10.3389/fbioe.2019.00033] [Citation(s) in RCA: 114] [Impact Index Per Article: 22.8] [Received: 11/30/2018] [Accepted: 02/05/2019] [Indexed: 01/25/2023] Open
Abstract
Cyanobacteria are promising microorganisms for sustainable biotechnologies, yet unlocking their potential requires radical re-engineering and application of cutting-edge synthetic biology techniques. In recent years, the available devices and strategies for modifying cyanobacteria have been increasing, including advances in the design of genetic promoters, ribosome binding sites, riboswitches, reporter proteins, modular vector systems, and markerless selection systems. Because of these new toolkits, cyanobacteria have been successfully engineered to express heterologous pathways for the production of a wide variety of valuable compounds. Cyanobacterial strains with the potential to be used in real-world applications will require the refinement of genetic circuits used to express the heterologous pathways and development of accurate models that predict how these pathways can be best integrated into the larger cellular metabolic network. Herein, we review advances that have been made to translate synthetic biology tools into cyanobacterial model organisms and summarize experimental and in silico strategies that have been employed to increase their bioproduction potential. Despite the advances in synthetic biology and metabolic engineering during the last years, it is clear that still further improvements are required if cyanobacteria are to be competitive with heterotrophic microorganisms for the bioproduction of added-value compounds.
Affiliation(s)
- María Santos-Merino
- MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI, United States
- Amit K. Singh
- MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI, United States
- Daniel C. Ducat
- MSU-DOE Plant Research Laboratory, Michigan State University, East Lansing, MI, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, United States
18
Vitkin E, Solomon O, Sultan S, Yakhini Z. Genome-wide analysis of fitness data and its application to improve metabolic models. BMC Bioinformatics 2018; 19:368. [PMID: 30305012 PMCID: PMC6180484 DOI: 10.1186/s12859-018-2341-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/03/2018] [Accepted: 08/28/2018] [Indexed: 11/17/2022] Open
Abstract
Background: Synthetic biology and related techniques enable genome-scale high-throughput investigation of the effect on organism fitness of different gene knock-downs/outs and of other modifications of genomic sequence. Results: We develop statistical and computational pipelines and frameworks for analyzing high-throughput fitness data over a genome-scale set of sequence variants. Analyzing data from a high-throughput knock-down/knock-out bacterial study, we investigate differences and determinants of the effect on fitness in different conditions. Comparing fitness vectors of genes, across tens of conditions, we observe that fitness consequences strongly depend on genomic location and more weakly depend on gene sequence similarity and on functional relationships. In analyzing promoter sequences, we identified motifs associated with conditions studied in bacterial media such as Casaminos, D-glucose, Sucrose, and other sugars and amino-acid sources. We also use fitness data to infer genes associated with orphan metabolic reactions in the iJO1366 E. coli metabolic model. To do this, we developed a new computational method that integrates gene fitness and gene expression profiles within a given reaction network neighborhood to associate this reaction with a set of genes that potentially encode the catalyzing proteins. We then apply this approach to predict candidate genes for 107 orphan reactions in iJO1366. Furthermore, we validate our methodology with known reactions using a leave-one-out approach. Specifically, using top-20 candidates selected based on combined fitness and expression datasets, we correctly reconstruct 39.7% of the reactions, as compared to 33% based on fitness and to 26% based on expression separately, and to 4.02% as a random baseline. Our model improvement results include a novel association of a gene to an orphan cytosine nucleosidation reaction.
Conclusion: Our pipeline for metabolic modeling shows a clear benefit of using fitness data for predicting genes of orphan reactions. Along with the analysis pipelines we developed, it can be used to analyze similar high-throughput data.
Affiliation(s)
- Edward Vitkin
- Department of Computer Science, Technion, Haifa, Israel
- Oz Solomon
- Faculty of Biotechnology and Food Engineering, Technion, Haifa, Israel; School of Computer Science, The Interdisciplinary Center, Herzliya, Israel
- Sharon Sultan
- School of Computer Science, The Interdisciplinary Center, Herzliya, Israel
- Zohar Yakhini
- Department of Computer Science, Technion, Haifa, Israel; School of Computer Science, The Interdisciplinary Center, Herzliya, Israel
19
Kim H, Kim S, Yoon SH. Metabolic network reconstruction and phenome analysis of the industrial microbe, Escherichia coli BL21(DE3). PLoS One 2018; 13:e0204375. [PMID: 30240424 PMCID: PMC6150544 DOI: 10.1371/journal.pone.0204375] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 07/02/2018] [Accepted: 09/06/2018] [Indexed: 01/25/2023] Open
Abstract
Escherichia coli BL21(DE3) is an industrial model microbe for the mass-production of bioproducts such as biofuels, biorefineries, and recombinant proteins. However, despite its important role in scientific research and biotechnological applications, a high-quality metabolic network model for metabolic engineering is yet to be developed. Here, we present the comprehensive metabolic network model of E. coli BL21(DE3), named iHK1487, based on the latest genome reannotation and phenome analysis. The metabolic model consists of 1,164 unique metabolites, 2,701 metabolic reactions, and 1,487 genes. The model was validated and improved by comparing the simulation results with phenome data from phenotype microarray tests. Previous transcriptome profile data was incorporated during model reconstruction, and flux prediction was simulated using the model. iHK1487 was simulated to explore the metabolic features of BL21(DE3) such as broad spectrum amino acid utilization and enhanced flux through the upper glycolytic pathway and TCA cycle. iHK1487 will contribute to systematic understanding of cellular physiology and metabolism of E. coli BL21(DE3) and highlight its biotechnological applications.
Affiliation(s)
- Hanseol Kim
- Department of Bioscience and Biotechnology, Konkuk University, Seoul, Republic of Korea
- Sinyeon Kim
- Department of Bioscience and Biotechnology, Konkuk University, Seoul, Republic of Korea
- Sung Ho Yoon
- Department of Bioscience and Biotechnology, Konkuk University, Seoul, Republic of Korea
20
Sun T, Li S, Song X, Diao J, Chen L, Zhang W. Toolboxes for cyanobacteria: Recent advances and future direction. Biotechnol Adv 2018; 36:1293-1307. [DOI: 10.1016/j.biotechadv.2018.04.007] [Citation(s) in RCA: 78] [Impact Index Per Article: 13.0] [Received: 01/23/2018] [Revised: 04/09/2018] [Accepted: 04/26/2018] [Indexed: 12/20/2022]
21
Karp PD, Weaver D, Latendresse M. How accurate is automated gap filling of metabolic models? BMC Syst Biol 2018; 12:73. [PMID: 29914471 PMCID: PMC6006690 DOI: 10.1186/s12918-018-0593-7] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 02/27/2018] [Accepted: 05/31/2018] [Indexed: 12/20/2022]
Abstract
Background: Reaction gap filling is a computational technique for proposing the addition of reactions to genome-scale metabolic models to permit those models to run correctly. Gap filling completes what are otherwise incomplete models that lack fully connected metabolic networks. The models are incomplete because they are derived from annotated genomes in which not all enzymes have been identified. Here we compare the results of applying an automated likelihood-based gap filler within the Pathway Tools software with the results of manually gap filling the same metabolic model. Both gap-filling exercises were applied to the same genome-derived qualitative metabolic reconstruction for Bifidobacterium longum subsp. longum JCM 1217, and to the same modeling conditions: anaerobic growth under four nutrients producing 53 biomass metabolites. Results: The solution computed by the gap-filling program GenDev contained 12 reactions, but closer examination showed that solution was not minimal; two of the twelve reactions can be removed to yield a set of ten reactions that enable model growth. The manually curated solution contained 13 reactions, eight of which were shared with the 12-reaction computed solution. Thus, GenDev achieved recall of 61.5% and precision of 66.6%. These results suggest that although computational gap fillers are populating metabolic models with significant numbers of correct reactions, automatically gap-filled metabolic models also contain significant numbers of incorrect reactions. Conclusions: Our conclusion is that manual curation of gap-filler results is needed to obtain high-accuracy models. Many of the differences between the manual and automatic solutions resulted from using expert biological knowledge to direct the choice of reactions within the curated solution, such as reactions specific to the anaerobic lifestyle of B. longum.
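The recall and precision reported in this abstract follow from three counts it gives: 12 computed reactions, 13 manually curated reactions, and 8 shared between them. The sketch below recomputes them from sets; the reaction identifiers are hypothetical placeholders, and only the set sizes are taken from the abstract.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of a predicted reaction set against a reference set."""
    predicted, reference = set(predicted), set(reference)
    shared = predicted & reference
    precision = len(shared) / len(predicted) if predicted else 0.0
    recall = len(shared) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reaction IDs; only the sizes (12 computed, 13 curated, 8 shared)
# come from the abstract.
shared = {f"RXN-shared-{i}" for i in range(8)}
computed = shared | {f"RXN-auto-{i}" for i in range(4)}    # 12 reactions
curated = shared | {f"RXN-manual-{i}" for i in range(5)}   # 13 reactions

precision, recall = precision_recall(computed, curated)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
```

Here 8/13 gives the 61.5% recall, and 8/12 is 66.7% when rounded; the 66.6% in the abstract appears to be the same ratio truncated rather than rounded.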
Affiliation(s)
- Peter D Karp
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, 94025, USA
- Daniel Weaver
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, 94025, USA
- Mario Latendresse
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, 94025, USA
22
Latendresse M, Karp PD. Evaluation of reaction gap-filling accuracy by randomization. BMC Bioinformatics 2018; 19:53. [PMID: 29444634 PMCID: PMC5813426 DOI: 10.1186/s12859-018-2050-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 11/29/2017] [Accepted: 01/31/2018] [Indexed: 12/18/2022] Open
Abstract
Background: Completion of genome-scale flux-balance models using computational reaction gap-filling is a widely used approach, but its accuracy is not well known. Results: We report on computational experiments of reaction gap filling in which we generated degraded versions of the EcoCyc-20.0-GEM model by randomly removing flux-carrying reactions from a growing model. We gap-filled the degraded models and compared the resulting gap-filled models with the original model. Gap-filling was performed by the Pathway Tools MetaFlux software using its General Development Mode (GenDev) and its Fast Development Mode (FastDev). We explored 12 GenDev variants including two linear solvers (SCIP and CPLEX) for solving the Mixed Integer Linear Programming (MILP) problems for gap filling; three different sets of linear constraints were applied; and two MILP methods were implemented. We compared these 13 variants according to accuracy, speed, and amount of information returned to the user. Conclusions: We observed large variation among the performance of the 13 gap-filling variants. Although no variant was best in all dimensions, we found one variant that was fast, accurate, and returned more information to the user. Some gap-filling variants were inaccurate, producing solutions that were non-minimum or invalid (did not enable model growth). The best GenDev variant showed a best average precision of 87% and a best average recall of 61%. FastDev showed an average precision of 71% and an average recall of 59%. Thus, using the most accurate variant, approximately 13% of the gap-filled reactions were incorrect (were not the reactions removed from the model), and 39% of gap-filled reactions were not found, suggesting that curation is still an important aspect of metabolic-model development.
Affiliation(s)
- Mario Latendresse
- SRI International/Artificial Intelligence Center, 333 Ravenswood Ave, Menlo Park, 94025, USA
- Peter D Karp
- SRI International/Artificial Intelligence Center, 333 Ravenswood Ave, Menlo Park, 94025, USA
23
Advances in gap-filling genome-scale metabolic models and model-driven experiments lead to novel metabolic discoveries. Curr Opin Biotechnol 2017; 51:103-108. [PMID: 29278837 DOI: 10.1016/j.copbio.2017.12.012] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 09/30/2017] [Revised: 12/08/2017] [Accepted: 12/08/2017] [Indexed: 12/18/2022]
Abstract
With rapid improvements in next-generation sequencing technologies, our knowledge about metabolism of many organisms is rapidly increasing. However, gaps in metabolic networks exist due to incomplete knowledge (e.g., missing reactions, unknown pathways, unannotated and misannotated genes, promiscuous enzymes, and underground metabolic pathways). In this review, we discuss recent advances in gap-filling algorithms based on genome-scale metabolic models and the importance of both high-throughput experiments and detailed biochemical characterization, which work in concert with in silico methods, to allow a more accurate and comprehensive understanding of metabolism.
24
Pan S, Nikolakakis K, Adamczyk PA, Pan M, Ruby EG, Reed JL. Model-enabled gene search (MEGS) allows fast and direct discovery of enzymatic and transport gene functions in the marine bacterium Vibrio fischeri. J Biol Chem 2017; 292:10250-10261. [PMID: 28446608 DOI: 10.1074/jbc.m116.763193] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/13/2016] [Revised: 04/23/2017] [Indexed: 12/23/2022] Open
Abstract
Whereas genomes can be rapidly sequenced, the functions of many genes are incompletely or erroneously annotated because of a lack of experimental evidence or prior functional knowledge in sequence databases. To address this weakness, we describe here a model-enabled gene search (MEGS) approach that (i) identifies metabolic functions either missing from an organism's genome annotation or incorrectly assigned to an ORF by using discrepancies between metabolic model predictions and experimental culturing data; (ii) designs functional selection experiments for these specific metabolic functions; and (iii) selects a candidate gene(s) responsible for these functions from a genomic library and directly interrogates this gene's function experimentally. To discover gene functions, MEGS uses genomic functional selections instead of relying on correlations across large experimental datasets or sequence similarity as do other approaches. When applied to the bioluminescent marine bacterium Vibrio fischeri, MEGS successfully identified five genes that are responsible for four metabolic and transport reactions whose absence from a draft metabolic model of V. fischeri caused inaccurate modeling of high-throughput experimental data. This work demonstrates that MEGS provides a rapid and efficient integrated computational and experimental approach for annotating metabolic genes, including those that have previously been uncharacterized or misannotated.
Affiliation(s)
- Shu Pan
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706
- Kiel Nikolakakis
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706
- Paul A Adamczyk
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706
- Min Pan
- School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China
- Edward G Ruby
- Pacific Biosciences Research Center, University of Hawaii, Manoa, Hawaii 96813
- Jennifer L Reed
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53706
25
Liu L, Zhang Z, Sheng T, Chen M. DEF: an automated dead-end filling approach based on quasi-endosymbiosis. Bioinformatics 2017; 33:405-413. [PMID: 28171511 DOI: 10.1093/bioinformatics/btw604] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/08/2015] [Revised: 06/27/2016] [Accepted: 09/16/2016] [Indexed: 11/15/2022] Open
Abstract
Motivation: Gap filling for the reconstruction of metabolic networks aims to restore the connectivity of metabolites by finding high-confidence reactions that could be missing in the target organism. Current methods for gap filling either fall into the network topology or have limited capability in finding missing reactions that are indirectly related to dead-end metabolites but of biological importance to the target model. Results: We present an automated dead-end filling (DEF) approach, which is derived from the wisdom of endosymbiosis theory, to fill gaps by finding the most efficient dead-end utilization paths in a constructed quasi-endosymbiosis model. The recalls of reactions and dead ends of DEF reach around 73% and 86%, respectively. This method is capable of finding indirectly dead-end-related reactions with biological importance for the target organism and is applicable to any given metabolic model. In the E. coli iJR904 model, for instance, about 42% of the dead-end metabolites were fixed by our proposed method. Availability and Implementation: DEF is publicly available at http://bis.zju.edu.cn/DEF/. Contact: mchen@zju.edu.cn. Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lili Liu
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
- Zijun Zhang
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China; Department of Bioinformatics, University of California, Los Angeles, Los Angeles, CA, USA
- Taotao Sheng
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
- Ming Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, China
26
Prigent S, Frioux C, Dittami SM, Thiele S, Larhlimi A, Collet G, Gutknecht F, Got J, Eveillard D, Bourdon J, Plewniak F, Tonon T, Siegel A. Meneco, a Topology-Based Gap-Filling Tool Applicable to Degraded Genome-Wide Metabolic Networks. PLoS Comput Biol 2017; 13:e1005276. [PMID: 28129330 PMCID: PMC5302834 DOI: 10.1371/journal.pcbi.1005276] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 10/16/2015] [Revised: 02/10/2017] [Accepted: 11/30/2016] [Indexed: 11/18/2022] Open
Abstract
Increasing amounts of sequence data are becoming available for a wide range of non-model organisms. Investigating and modelling the metabolic behaviour of those organisms is highly relevant to understand their biology and ecology. As sequences are often incomplete and poorly annotated, draft networks of their metabolism largely suffer from incompleteness. Appropriate gap-filling methods to identify and add missing reactions are therefore required to address this issue. However, current tools rely on phenotypic or taxonomic information, or are very sensitive to the stoichiometric balance of metabolic reactions, especially concerning the co-factors. This type of information is often not available or at least prone to errors for newly-explored organisms. Here we introduce Meneco, a tool dedicated to the topological gap-filling of genome-scale draft metabolic networks. Meneco reformulates gap-filling as a qualitative combinatorial optimization problem, omitting constraints raised by the stoichiometry of a metabolic network considered in other methods, and solves this problem using Answer Set Programming. Run on several artificial test sets gathering 10,800 degraded Escherichia coli networks Meneco was able to efficiently identify essential reactions missing in networks at high degradation rates, outperforming the stoichiometry-based tools in scalability. To demonstrate the utility of Meneco we applied it to two case studies. Its application to recent metabolic networks reconstructed for the brown algal model Ectocarpus siliculosus and an associated bacterium Candidatus Phaeomarinobacter ectocarpi revealed several candidate metabolic pathways for algal-bacterial interactions. Then Meneco was used to reconstruct, from transcriptomic and metabolomic data, the first metabolic network for the microalga Euglena mutabilis. 
These two case studies show that Meneco is a versatile tool to complete draft genome-scale metabolic networks produced from heterogeneous data, and to suggest relevant reactions that explain the metabolic capacity of a biological system.
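One common way to express a purely topological (stoichiometry-free) producibility criterion is as a fixed point: a metabolite counts as producible if it is a seed, or if it is a product of a reaction whose reactants are all producible. The sketch below illustrates that criterion on a hypothetical toy network and shows how adding one candidate reaction restores a target. It is an illustration only: the function names and metabolite names are invented here, and it does not reproduce Meneco's Answer Set Programming encoding or its search for minimal completions.

```python
def producible_scope(reactions, seeds):
    """Metabolites topologically reachable from the seeds.

    reactions: dict mapping a reaction name to (reactants, products).
    A reaction is considered usable once all of its reactants are in the scope.
    """
    scope = set(seeds)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions.values():
            if set(reactants) <= scope and not set(products) <= scope:
                scope |= set(products)
                changed = True
    return scope


def unproducible_targets(reactions, seeds, targets):
    """Targets that cannot be reached from the seeds in the given network."""
    return set(targets) - producible_scope(reactions, seeds)


# Hypothetical toy network: the draft lacks the step from g6p to f6p.
draft = {
    "r1": (["glc"], ["g6p"]),
    "r3": (["f6p"], ["pyr"]),
}
candidate = {"r2": (["g6p"], ["f6p"])}  # reaction taken from a reference database
seeds, targets = {"glc"}, {"pyr"}

before = unproducible_targets(draft, seeds, targets)
after = unproducible_targets({**draft, **candidate}, seeds, targets)
print(before, after)
```

A gap-filling tool built on this criterion would search the reference database for a small set of such candidate reactions that makes every target producible; the exhaustive search itself is omitted here.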
Affiliation(s)
- Sylvain Prigent
- Institute for Research in IT and Random Systems - IRISA, Université de Rennes 1, Rennes, France
- Department of Biology and Biological Engineering, Chalmers University of Technology, Göteborg, Sweden
- Irisa, CNRS, Rennes, France
- Dyliss, Inria, Rennes, France
- Clémence Frioux
- Institute for Research in IT and Random Systems - IRISA, Université de Rennes 1, Rennes, France
- Irisa, CNRS, Rennes, France
- Dyliss, Inria, Rennes, France
- Simon M. Dittami
- Sorbonne Universités, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, Roscoff, France
- Abdelhalim Larhlimi
- Computer Science Laboratory of Nantes Atlantique - LINA UMR6241, Université de Nantes, Nantes, France
- Guillaume Collet
- Institute for Research in IT and Random Systems - IRISA, Université de Rennes 1, Rennes, France
- Irisa, CNRS, Rennes, France
- Dyliss, Inria, Rennes, France
- Fabien Gutknecht
- Molecular Genetics, Genomics and Microbiology - GMGM, Université de Strasbourg, Strasbourg, France
- Jeanne Got
- Institute for Research in IT and Random Systems - IRISA, Université de Rennes 1, Rennes, France
- Irisa, CNRS, Rennes, France
- Dyliss, Inria, Rennes, France
- Damien Eveillard
- Computer Science Laboratory of Nantes Atlantique - LINA UMR6241, Université de Nantes, Nantes, France
- Jérémie Bourdon
- Computer Science Laboratory of Nantes Atlantique - LINA UMR6241, Université de Nantes, Nantes, France
- Frédéric Plewniak
- Molecular Genetics, Genomics and Microbiology - GMGM, Université de Strasbourg, Strasbourg, France
- GMGM, CNRS, Strasbourg, France
- Thierry Tonon
- Sorbonne Universités, UPMC Univ Paris 06, CNRS, UMR 8227, Integrative Biology of Marine Models, Station Biologique de Roscoff, Roscoff, France
- Anne Siegel
- Institute for Research in IT and Random Systems - IRISA, Université de Rennes 1, Rennes, France
- Irisa, CNRS, Rennes, France
- Dyliss, Inria, Rennes, France
27
Fu W, Chaiboonchoe A, Khraiwesh B, Nelson DR, Al-Khairy D, Mystikou A, Alzahmi A, Salehi-Ashtiani K. Algal Cell Factories: Approaches, Applications, and Potentials. Mar Drugs 2016; 14:md14120225. [PMID: 27983586 PMCID: PMC5192462 DOI: 10.3390/md14120225] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 10/31/2016] [Revised: 12/02/2016] [Accepted: 12/05/2016] [Indexed: 12/26/2022] Open
Abstract
With the advent of modern biotechnology, microorganisms from diverse lineages have been used to produce bio-based feedstocks and bioactive compounds. Many of these compounds are currently commodities of interest, in a variety of markets and their utility warrants investigation into improving their production through strain development. In this review, we address the issue of strain improvement in a group of organisms with strong potential to be productive “cell factories”: the photosynthetic microalgae. Microalgae are a diverse group of phytoplankton, involving polyphyletic lineage such as green algae and diatoms that are commonly used in the industry. The photosynthetic microalgae have been under intense investigation recently for their ability to produce commercial compounds using only light, CO2, and basic nutrients. However, their strain improvement is still a relatively recent area of work that is under development. Importantly, it is only through appropriate engineering methods that we may see the full biotechnological potential of microalgae come to fruition. Thus, in this review, we address past and present endeavors towards the aim of creating productive algal cell factories and describe possible advantageous future directions for the field.
Affiliation(s)
- Weiqi Fu
- Division of Science and Math, New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Amphun Chaiboonchoe
- Division of Science and Math, New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Basel Khraiwesh
- Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- David R Nelson
- Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Dina Al-Khairy
- Division of Science and Math, New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Alexandra Mystikou
- Division of Science and Math, New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Amnah Alzahmi
- Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Kourosh Salehi-Ashtiani
- Division of Science and Math, New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
- Center for Genomics and Systems Biology (CGSB), New York University Abu Dhabi, P.O. Box 129188 Saadiyat Island, Abu Dhabi, UAE.
28
Missing gene identification using functional coherence scores. Sci Rep 2016; 6:31725. [PMID: 27552989 PMCID: PMC4995438 DOI: 10.1038/srep31725] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/21/2016] [Accepted: 07/22/2016] [Indexed: 11/18/2022] Open
Abstract
Reconstructing metabolic and signaling pathways is an effective way of interpreting a genome sequence. A challenge in a pathway reconstruction is that often genes in a pathway cannot be easily found, reflecting current imperfect information of the target organism. In this work, we developed a new method for finding missing genes, which integrates multiple features, including gene expression, phylogenetic profile, and function association scores. Particularly, for considering function association between candidate genes and neighboring proteins to the target missing gene in the network, we used Co-occurrence Association Score (CAS) and PubMed Association Score (PAS), which are designed for capturing functional coherence of proteins. We showed that adding CAS and PAS substantially improve the accuracy of identifying missing genes in the yeast enzyme-enzyme network compared to the cases when only the conventional features, gene expression, phylogenetic profile, were used. Finally, it was also demonstrated that the accuracy improves by considering indirect neighbors to the target enzyme position in the network using a proper network-topology-based weighting scheme.
29
Cuevas DA, Edirisinghe J, Henry CS, Overbeek R, O’Connell TG, Edwards RA. From DNA to FBA: How to Build Your Own Genome-Scale Metabolic Model. Front Microbiol 2016; 7:907. [PMID: 27379044 PMCID: PMC4911401 DOI: 10.3389/fmicb.2016.00907] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 01/29/2016] [Accepted: 05/27/2016] [Indexed: 11/19/2022] Open
Abstract
Microbiological studies are increasingly relying on in silico methods to perform exploration and rapid analysis of genomic data, and functional genomics studies are supplemented by the new perspectives that genome-scale metabolic models offer. A mathematical model consisting of a microbe's entire metabolic map can be rapidly determined from whole-genome sequencing and annotating the genomic material encoded in its DNA. Flux-balance analysis (FBA), a linear programming technique that uses metabolic models to predict the phenotypic responses imposed by environmental elements and factors, is the leading method to simulate and manipulate cellular growth in silico. However, the process of creating an accurate model to use in FBA consists of a series of steps involving a multitude of connections between bioinformatics databases, enzyme resources, and metabolic pathways. We present the methodology and procedure to obtain a metabolic model using PyFBA, an extensible Python-based open-source software package aimed to provide a platform where functional annotations are used to build metabolic models (http://linsalrob.github.io/PyFBA). Backed by the Model SEED biochemistry database, PyFBA contains methods to reconstruct a microbe's metabolic map, run FBA upon different media conditions, and gap-fill its metabolism. The extensibility of PyFBA facilitates novel techniques in creating accurate genome-scale metabolic models.
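For readers unfamiliar with FBA, the linear program the abstract refers to is usually written in the following standard form. The symbols are not taken from the abstract and are assumptions of this sketch: $S$ is the $m \times n$ stoichiometric matrix, $v$ the vector of reaction fluxes, $c$ the objective weights (typically selecting the biomass reaction), and $l$, $u$ the lower and upper flux bounds set by the medium and by reaction reversibility.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{n}} \quad & c^{\top} v \\
\text{subject to} \quad & S v = 0, \\
& l \le v \le u .
\end{aligned}
```

The constraint $S v = 0$ expresses the steady-state assumption that each internal metabolite is produced as fast as it is consumed. Changing the media conditions amounts to changing the bounds on the exchange fluxes.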
Affiliation(s)
- Daniel A. Cuevas
- Computational Science Research Center, San Diego State University, San Diego, CA, USA
| | - Janaka Edirisinghe
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
| | - Chris S. Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, USA
| | - Ross Overbeek
- Fellowship for Interpretation of Genomes, Burr Ridge, IL, USA
| | - Taylor G. O’Connell
- Biological and Medical Informatics Research Center, San Diego State University, San Diego, CA, USA
| | - Robert A. Edwards
- Computational Science Research Center, San Diego State University, San Diego, CA, USA
- Biological and Medical Informatics Research Center, San Diego State University, San Diego, CA, USA
- Department of Computer Science, San Diego State University, San Diego, CA, USA
- Department of Biology, San Diego State University, San Diego, CA, USA
| |
|
30
|
Jiang R, Linzon Y, Vitkin E, Yakhini Z, Chudnovsky A, Golberg A. Thermochemical hydrolysis of macroalgae Ulva for biorefinery: Taguchi robust design method. Sci Rep 2016; 6:27761. [PMID: 27291594 PMCID: PMC4904202 DOI: 10.1038/srep27761] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/28/2016] [Accepted: 05/18/2016] [Indexed: 11/09/2022] Open
Abstract
Understanding the impact of all process parameters on the efficiency of biomass hydrolysis and on the final yield of products is critical to biorefinery design. Using Taguchi orthogonal array experimental design and Partial Least Squares Regression, we investigated the impact of change and the comparative significance of thermochemical process temperature, treatment time, %Acid and %Solid load on carbohydrate release from green macroalgae of the genus Ulva, a promising biorefinery feedstock. The average density of the hydrolysate was determined using a new microelectromechanical optical resonator mass sensor. In addition, using Flux Balance Analysis techniques, we compared the potential fermentation yields of these hydrolysate products using metabolic models of Escherichia coli, Saccharomyces cerevisiae wild type, Saccharomyces cerevisiae RN1016 with xylose isomerase and Clostridium acetobutylicum. We found that %Acid plays the most significant role and treatment time the least significant role in affecting the monosaccharides released from Ulva biomass. We also found that, within the tested range of parameters, hydrolysis at 121 °C for 30 min with 2% Acid and 15% Solids could lead to the highest conversion yields: 54.134–57.500 g ethanol kg−1 Ulva dry weight by S. cerevisiae RN1016 with xylose isomerase. Our results support optimized marine algae utilization process design and will enable smart energy harvesting by thermochemical hydrolysis.
Affiliation(s)
- Rui Jiang
- The Porter School of Environmental Studies, Tel Aviv University, Tel Aviv, Israel
| | - Yoav Linzon
- Department of Mechanical Engineering, Tel Aviv University, Tel Aviv, Israel
| | - Edward Vitkin
- Department of Computer Science, Technion - Israel Institute of Technology, Haifa, Israel
| | - Zohar Yakhini
- Department of Computer Science, Technion - Israel Institute of Technology, Haifa, Israel
| | - Alexandra Chudnovsky
- Department of Geography and Human Environment, Enviro-Digital Lab, Tel Aviv University, Israel
| | - Alexander Golberg
- The Porter School of Environmental Studies, Tel Aviv University, Tel Aviv, Israel
| |
|
31
|
Tobalina L, Pey J, Rezola A, Planes FJ. Assessment of FBA Based Gene Essentiality Analysis in Cancer with a Fast Context-Specific Network Reconstruction Method. PLoS One 2016; 11:e0154583. [PMID: 27145226 PMCID: PMC4856428 DOI: 10.1371/journal.pone.0154583] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/15/2015] [Accepted: 04/15/2016] [Indexed: 01/28/2023] Open
Abstract
MOTIVATION Gene Essentiality Analysis based on Flux Balance Analysis (FBA-based GEA) is a promising tool for the identification of novel metabolic therapeutic targets in cancer. The reconstruction of cancer-specific metabolic networks, typically based on gene expression data, constitutes a sensitive step in this approach. However, to our knowledge, no extensive assessment of the influence of the reconstruction process on the obtained results has been carried out to date. RESULTS In this article, we aim to study context-specific networks and their FBA-based GEA results for the identification of cancer-specific metabolic essential genes. To that end, we used gene expression datasets from the Cancer Cell Line Encyclopedia (CCLE), evaluating the results obtained in 174 cancer cell lines. In order to observe the effect of cancer-specific expression data more clearly, we performed the same analysis using randomly generated expression patterns. Our computational analysis showed some essential genes that are fairly common in the reconstructions derived from both gene expression and randomly generated data. However, we also found a subset of essential genes, though of limited size, that are very rare in the randomly generated networks but recurrent in the sample-derived networks and, thus, would presumably constitute relevant drug targets for further analysis. In addition, we compared the in silico results to high-throughput gene silencing experiments from Project Achilles, with conflicting results, which leads us to raise several questions, particularly regarding the strong influence of the selected biomass reaction on the obtained results. Notwithstanding, using previous literature in cancer research, we evaluated the most relevant of our targets in three different cancer cell lines, two derived from Glioblastoma Multiforme and one from Non-Small Cell Lung Cancer, finding that some of the predictions are on the right track.
Affiliation(s)
- Luis Tobalina
- CEIT and Tecnun (University of Navarra), Manuel de Lardizábal 15, 20018, San Sebastian, Spain
| | - Jon Pey
- CEIT and Tecnun (University of Navarra), Manuel de Lardizábal 15, 20018, San Sebastian, Spain
| | - Alberto Rezola
- CEIT and Tecnun (University of Navarra), Manuel de Lardizábal 15, 20018, San Sebastian, Spain
| | - Francisco J. Planes
- CEIT and Tecnun (University of Navarra), Manuel de Lardizábal 15, 20018, San Sebastian, Spain
| |
|
32
|
Mohammadi R, Fallah-Mehrabadi J, Bidkhori G, Zahiri J, Niroomand MJ, Masoudi-Nejad A. A systems biology approach to reconcile metabolic network models with application to Synechocystis sp. PCC 6803 for biofuel production. Mol Biosyst 2016; 12:2552-61. [DOI: 10.1039/c6mb00119j] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 01/03/2023]
Abstract
Metabolic network models can be optimized for the production of desired materials like biofuels.
Affiliation(s)
- Reza Mohammadi
- Laboratory of Systems Biology and Bioinformatics (LBB)
- Institute of Biochemistry and Biophysics
- University of Tehran
- Tehran
- Iran
| | | | | | - Javad Zahiri
- Bioinformatics and Computational Omics Lab (BioCOOL)
- Department of Biophysics
- Faculty of Biological Sciences
- Tarbiat Modares University
- Tehran
| | - Mohammad Javad Niroomand
- Learning Intelligent Systems Lab
- School of Electrical and Computer Engineering
- University of Tehran
- Tehran
- Iran
| | - Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB)
- Institute of Biochemistry and Biophysics
- University of Tehran
- Tehran
- Iran
| |
|
33
|
Ponce-de-Leon M, Calle-Espinosa J, Peretó J, Montero F. Consistency Analysis of Genome-Scale Models of Bacterial Metabolism: A Metamodel Approach. PLoS One 2015; 10:e0143626. [PMID: 26629901 PMCID: PMC4668087 DOI: 10.1371/journal.pone.0143626] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/02/2015] [Accepted: 11/06/2015] [Indexed: 01/10/2023] Open
Abstract
Genome-scale metabolic models usually contain inconsistencies that manifest as blocked reactions and gap metabolites. To detect recurrent inconsistencies in metabolic models, a large-scale analysis was performed using a previously published dataset of 130 genome-scale models. The results showed that a large number of reactions (~22%) are blocked in all the models where they are present. To unravel the nature of such inconsistencies, a metamodel was constructed by joining the 130 models in a single network. This metamodel was manually curated using the unconnected modules approach and was then used as a reference network to perform gap-filling on each individual genome-scale model. Finally, a set of 36 models that had not been considered during the construction of the metamodel was used, as a proof of concept, to extend the metamodel with new biochemical information and to assess its impact on gap-filling results. The analysis performed on the metamodel allowed us to conclude that: 1) the recurrent inconsistencies found in the models were already present in the metabolic database used during the reconstruction process; 2) the presence of inconsistencies in a metabolic database can be propagated to the reconstructed models; 3) there are reactions not manifested as blocked which are active as a consequence of some classes of artifacts; and 4) the results of an automatic gap-filling are highly dependent on the consistency and completeness of the metamodel or metabolic database used as the reference network. In conclusion, consistency analysis should be applied to metabolic databases in order to detect and fill gaps as well as to detect and remove artifacts and redundant information.
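To make the terms "gap metabolite" and "blocked reaction" concrete, the following is a minimal sketch and not the method of the paper above. It uses a purely topological rule: a metabolite that is only produced or only consumed is a gap, and any reaction touching a gap cannot carry steady-state flux. The function names and the toy network are hypothetical, and a full consistency analysis would rely on flux-based (linear programming) checks, which can flag reactions that this rule misses.

```python
def find_gap_metabolites(reactions):
    """Return metabolites that are only produced or only consumed.

    `reactions` maps a reaction name to (stoichiometry, reversible), where
    stoichiometry maps metabolite -> coefficient (negative = consumed).
    This input layout is an assumption made for this sketch.
    """
    produced, consumed = set(), set()
    for stoich, reversible in reactions.values():
        for met, coef in stoich.items():
            if coef == 0:
                continue
            # A reversible reaction can both produce and consume its metabolites.
            if reversible or coef > 0:
                produced.add(met)
            if reversible or coef < 0:
                consumed.add(met)
    # Symmetric difference: present on one side only.
    return produced ^ consumed


def find_blocked_reactions(reactions):
    """Iteratively remove reactions that touch a gap metabolite.

    Removing a reaction can turn further metabolites into gaps, so the
    check is repeated until nothing changes (topological approximation).
    """
    active = dict(reactions)
    blocked = set()
    while True:
        gaps = find_gap_metabolites(active)
        newly_blocked = {
            name
            for name, (stoich, _) in active.items()
            if any(coef != 0 and met in gaps for met, coef in stoich.items())
        }
        if not newly_blocked:
            return blocked
        blocked |= newly_blocked
        for name in newly_blocked:
            del active[name]


# Hypothetical toy network: A is taken up, C is secreted, E has no outlet.
toy = {
    "EX_A": ({"A": 1}, False),
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "EX_C": ({"C": -1}, False),
    "R3": ({"B": -1, "D": 1}, False),
    "R4": ({"D": -1, "E": 1}, False),
}

print(sorted(find_gap_metabolites(toy)))
print(sorted(find_blocked_reactions(toy)))
```

In the toy network only E is a gap at first, but blocking R4 leaves D without a consumer, so R3 is blocked in the next pass. This cascade is one reason a single inconsistency in a reference database can propagate through the models reconstructed from it.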
Affiliation(s)
- Miguel Ponce-de-Leon
- Departamento de Bioquímica y Biología Molecular I, Facultad de Ciencias Químicas, Universidad Complutense de Madrid, Ciudad Universitaria, Madrid 28045, Spain
| | - Jorge Calle-Espinosa
- Departamento de Bioquímica y Biología Molecular I, Facultad de Ciencias Químicas, Universidad Complutense de Madrid, Ciudad Universitaria, Madrid 28045, Spain
| | - Juli Peretó
- Departament de Bioquímica i Biologia Molecular and Institut Cavanilles de Biodiversitat i Biologia Evolutiva, Universitat de València, C/José Beltrán 2, Paterna 46980, Spain
| | - Francisco Montero
- Departamento de Bioquímica y Biología Molecular I, Facultad de Ciencias Químicas, Universidad Complutense de Madrid, Ciudad Universitaria, Madrid 28045, Spain
| |
|
34
|
Krumholz EW, Libourel IGL. Sequence-based Network Completion Reveals the Integrality of Missing Reactions in Metabolic Networks. J Biol Chem 2015; 290:19197-207. [PMID: 26041773 DOI: 10.1074/jbc.m114.634121] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/19/2014] [Indexed: 11/06/2022] Open
Abstract
Genome-scale metabolic models are central in connecting genotypes to metabolic phenotypes. However, even for well studied organisms, such as Escherichia coli, draft networks do not contain a complete biochemical network. Missing reactions are referred to as gaps. These gaps need to be filled to enable functional analysis, and gap-filling choices influence model predictions. To investigate whether functional networks existed where all gap-filling reactions were supported by sequence similarity to annotated enzymes, four draft networks were supplemented with all reactions from the Model SEED database for which minimal sequence similarity was found in their genomes. Quadratic programming revealed that the number of reactions that could partake in a gap-filling solution was vast: 3,270 in the case of E. coli, where 72% of the metabolites in the draft network could connect a gap-filling solution. Nonetheless, no network could be completed without the inclusion of orphaned enzymes, suggesting that parts of the biochemistry integral to biomass precursor formation are uncharacterized. However, many gap-filling reactions were well determined, and the resulting networks showed improved prediction of gene essentiality compared with networks generated through canonical gap filling. In addition, gene essentiality predictions that were sensitive to poorly determined gap-filling reactions were of poor quality, suggesting that damage to the network structure resulting from the inclusion of erroneous gap-filling reactions may be predictable.
Affiliation(s)
| | - Igor G L Libourel
- From the Department of Plant Biology and the Biotechnology Institute, University of Minnesota, Saint Paul, Minnesota 55108
| |
|
35
|
Monk J, Nogales J, Palsson BO. Optimizing genome-scale network reconstructions. Nat Biotechnol 2014; 32:447-52. [PMID: 24811519 DOI: 10.1038/nbt.2870] [Citation(s) in RCA: 138] [Impact Index Per Article: 15.3] [Indexed: 12/29/2022]
Affiliation(s)
- Jonathan Monk
- [1] Department of Bioengineering, University of California, San Diego, La Jolla, California, USA. [2]
| | - Juan Nogales
- [1] Department of Bioengineering, University of California, San Diego, La Jolla, California, USA. [2] Department of Environmental Biology, Centro de Investigaciones Biológicas, CSIC, Madrid, Spain. [3]
| | - Bernhard O Palsson
- Department of Bioengineering, University of California, San Diego, La Jolla, California, USA
| |
|
36
|
Tobalina L, Bargiela R, Pey J, Herbst FA, Lores I, Rojo D, Barbas C, Peláez AI, Sánchez J, von Bergen M, Seifert J, Ferrer M, Planes FJ. Context-specific metabolic network reconstruction of a naphthalene-degrading bacterial community guided by metaproteomic data. Bioinformatics 2015; 31:1771-9. [PMID: 25618865 DOI: 10.1093/bioinformatics/btv036] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 07/14/2014] [Accepted: 01/18/2015] [Indexed: 12/21/2022]
Abstract
MOTIVATION With the advent of meta-'omics' data, the use of metabolic networks for the functional analysis of microbial communities became possible. However, while network-based methods are widely developed for single organisms, their application to bacterial communities is currently limited. RESULTS Herein, we provide a novel, context-specific reconstruction procedure based on metaproteomic and taxonomic data. Without previous knowledge of high-quality, genome-scale metabolic networks for each member of a bacterial community, we propose a meta-network approach, where the expression levels and taxonomic assignments of proteins are used as the most relevant clues for inferring an active set of reactions. Our approach was applied to draft the context-specific metabolic networks of two different naphthalene-enriched communities derived from an anthropogenically influenced, polyaromatic hydrocarbon-contaminated soil, with (CN2) or without (CN1) bio-stimulation. We were able to capture the overall functional differences between the two conditions at the metabolic level and predict an important activity for the fluorobenzoate degradation pathway in CN1 and for geraniol metabolism in CN2. Experimental validation was conducted, and good agreement with our computational predictions was observed. We also hypothesize different pathway organizations at the organismal level, which is relevant to disentangling the role of each member in the communities. The approach presented here can be easily transferred to the analysis of genomic, transcriptomic and metabolomic data.
Affiliation(s)
- Luis Tobalina
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Rafael Bargiela
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Jon Pey
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Florian-Alexander Herbst
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Iván Lores
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - David Rojo
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Coral Barbas
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Ana I Peláez
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Jesús Sánchez
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Martin von Bergen
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Jana Seifert
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Manuel Ferrer
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| | - Francisco J Planes
- CEIT and Tecnun (University of Navarra), San Sebastián, Spain, CSIC, Institute of Catalysis, Madrid, Spain, Helmholtz Centre for Environmental Research, Department of Proteomics, Leipzig, Germany, Área de Microbiología, IUBA, Universidad de Oviedo, Oviedo, Spain, Centro de Metabolómica y Bioanálisis (CEMBIO), Facultad de Farmacia, Universidad CEU San Pablo, Campus Monteprincipe, Boadilla del Monte, Madrid, Spain, Department of Metabolomics, UFZ-Helmholtz-Zentrum für Umweltforschung GmbH, Leipzig, Germany and Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
| |
|
37
|
Gudmundsson S, Nogales J. Cyanobacteria as photosynthetic biocatalysts: a systems biology perspective. Mol Biosyst 2015; 11:60-70. [DOI: 10.1039/c4mb00335g] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Indexed: 12/16/2022]
Abstract
A review of cyanobacterial biocatalysts highlighting their metabolic features that argues for the need for systems-level metabolic engineering.
Affiliation(s)
| | - Juan Nogales
- Department of Environmental Biology
- Centro de Investigaciones Biológicas-CSIC
- 28040 Madrid
- Spain
| |
|
38
|
Benedict MN, Mundy MB, Henry CS, Chia N, Price ND. Likelihood-based gene annotations for gap filling and quality assessment in genome-scale metabolic models. PLoS Comput Biol 2014; 10:e1003882. [PMID: 25329157 PMCID: PMC4199484 DOI: 10.1371/journal.pcbi.1003882] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 04/21/2014] [Accepted: 08/25/2014] [Indexed: 12/27/2022] Open
Abstract
Genome-scale metabolic models provide a powerful means to harness information from genomes to deepen biological insights. With exponentially increasing sequencing capacity, there is an enormous need for automated reconstruction techniques that can provide more accurate models in a short time frame. Current methods for automated metabolic network reconstruction rely on gene and reaction annotations to build draft metabolic networks and algorithms to fill gaps in these networks. However, automated reconstruction is hampered by database inconsistencies, incorrect annotations, and gap filling largely without considering genomic information. Here we develop an approach for applying genomic information to predict alternative functions for genes and estimate their likelihoods from sequence homology. We show that computed likelihood values were significantly higher for annotations found in manually curated metabolic networks than those that were not. We then apply these alternative functional predictions to estimate reaction likelihoods, which are used in a new gap filling approach called likelihood-based gap filling to predict more genomically consistent solutions. To validate the likelihood-based gap filling approach, we applied it to models where essential pathways were removed, finding that likelihood-based gap filling identified more biologically relevant solutions than parsimony-based gap filling approaches. We also demonstrate that models gap filled using likelihood-based gap filling provide greater coverage and genomic consistency with metabolic gene functions compared to parsimony-based approaches. Interestingly, despite these findings, we found that likelihoods did not significantly affect consistency of gap filled models with Biolog and knockout lethality data. 
This indicates that the phenotype data alone cannot necessarily be used to discriminate between alternative solutions for gap filling and therefore, that the use of other information is necessary to obtain a more accurate network. All described workflows are implemented as part of the DOE Systems Biology Knowledgebase (KBase) and are publicly available via API or command-line web interface.
Affiliation(s)
- Matthew N. Benedict
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
| | - Michael B. Mundy
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, United States of America
| | - Christopher S. Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, Illinois, United States of America
| | - Nicholas Chia
- Center for Individualized Medicine, Mayo Clinic, Rochester, Minnesota, United States of America
- Department of Surgery, Mayo Clinic, Rochester, Minnesota, United States of America
- Department of Physiology and Bioengineering, Mayo Clinic, Rochester, Minnesota, United States of America
| | - Nathan D. Price
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Institute for Systems Biology, Seattle, Washington, United States of America
| |
|
39
|
Kim J, Reed JL. Refining metabolic models and accounting for regulatory effects. Curr Opin Biotechnol 2014; 29:34-8. [PMID: 24632483 DOI: 10.1016/j.copbio.2014.02.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/11/2013] [Revised: 01/23/2014] [Accepted: 02/13/2014] [Indexed: 11/28/2022]
Abstract
Advances in genome-scale metabolic modeling allow us to investigate and engineer metabolism at a systems level. Metabolic network reconstructions have been made for many organisms, and computational approaches have been developed to convert these reconstructions into predictive models. However, due to incomplete knowledge, these reconstructions often have missing or extraneous components and interactions, which can be identified by reconciling model predictions with experimental data. Recent studies have provided methods to further improve metabolic model predictions by incorporating transcriptional regulatory interactions and high-throughput omics data to yield context-specific metabolic models. Here we discuss recent approaches for resolving model-data discrepancies and building context-specific metabolic models. Once developed, highly accurate metabolic models can be used in a variety of biotechnology applications.
Affiliation(s)
- Joonhoon Kim
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, United States; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, United States
- Jennifer L Reed
- DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, United States; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, United States
|
40
|
Vlassis N, Pacheco MP, Sauter T. Fast reconstruction of compact context-specific metabolic network models. PLoS Comput Biol 2014; 10:e1003424. [PMID: 24453953 PMCID: PMC3894152 DOI: 10.1371/journal.pcbi.1003424] [Citation(s) in RCA: 150] [Impact Index Per Article: 15.0] [Received: 06/06/2013] [Accepted: 11/20/2013] [Indexed: 12/14/2022] Open
Abstract
Systemic approaches to the study of a biological cell or tissue rely increasingly on the use of context-specific metabolic network models. The reconstruction of such a model from high-throughput data can routinely involve large numbers of tests under different conditions and extensive parameter tuning, which calls for fast algorithms. We present fastcore, a generic algorithm for reconstructing context-specific metabolic network models from global genome-wide metabolic network models such as Recon X. fastcore takes as input a core set of reactions that are known to be active in the context of interest (e.g., cell or tissue), and it searches for a flux consistent subnetwork of the global network that contains all reactions from the core set and a minimal set of additional reactions. Our key observation is that a minimal consistent reconstruction can be defined via a set of sparse modes of the global network, and fastcore iteratively computes such a set via a series of linear programs. Experiments on liver data demonstrate speedups of several orders of magnitude, and significantly more compact reconstructions, over a rival method. Given its simplicity and its excellent performance, fastcore can form the backbone of many future metabolic network reconstruction algorithms.
Affiliation(s)
- Nikos Vlassis
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Luxembourg City, Luxembourg
- Maria Pires Pacheco
- Life Sciences Research Unit, University of Luxembourg, Luxembourg City, Luxembourg
- Thomas Sauter
- Life Sciences Research Unit, University of Luxembourg, Luxembourg City, Luxembourg
|
41
|
Mueller TJ, Berla BM, Pakrasi HB, Maranas CD. Rapid construction of metabolic models for a family of Cyanobacteria using a multiple source annotation workflow. BMC Syst Biol 2013; 7:142. [PMID: 24369854 PMCID: PMC3880981 DOI: 10.1186/1752-0509-7-142] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 09/25/2013] [Accepted: 12/19/2013] [Indexed: 12/02/2022]
Abstract
Background: Cyanobacteria are photoautotrophic prokaryotes that exhibit robust growth under diverse environmental conditions with minimal nutritional requirements. They can use solar energy to convert CO2 and other reduced carbon sources into biofuels and chemical products. The genus Cyanothece includes unicellular nitrogen-fixing cyanobacteria that have been shown to offer high levels of hydrogen production and nitrogen fixation. The reconstruction of quality genome-scale metabolic models for organisms with limited annotation resources remains a challenging task. Results: Here we reconstruct and subsequently analyze and compare the metabolism of five Cyanothece strains, namely Cyanothece sp. PCC 7424, 7425, 7822, 8801 and 8802, as the genome-scale metabolic reconstructions iCyc792, iCyn731, iCyj826, iCyp752, and iCyh755 respectively. We compare these phylogenetically related Cyanothece strains to assess their bio-production potential. A systematic workflow is introduced for integrating and prioritizing annotation information from the Universal Protein Resource (Uniprot), NCBI Protein Clusters, and the Rapid Annotations using Subsystems Technology (RAST) method. The genome-scale metabolic models include fully traced photosynthesis reactions and respiratory chains, as well as balanced reactions and GPR associations. Metabolic differences between the organisms are highlighted such as the non-fermentative pathway for alcohol production found in only Cyanothece 7424, 8801, and 8802. Conclusions: Our development workflow provides a path for constructing models using information from curated models of related organisms and reviewed gene annotations. This effort lays the foundation for the expedient construction of curated metabolic models for organisms that, while not being the target of comprehensive research, have a sequenced genome and are related to an organism with a curated metabolic model. Organism-specific models, such as the five presented in this paper, can be used to identify optimal genetic manipulations for targeted metabolite overproduction as well as to investigate the biology of diverse organisms.
Affiliation(s)
- Costas D Maranas
- Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania, USA
|
42
|
Tervo CJ, Reed JL. BioMog: a computational framework for the de novo generation or modification of essential biomass components. PLoS One 2013; 8:e81322. [PMID: 24339916 PMCID: PMC3855262 DOI: 10.1371/journal.pone.0081322] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 05/21/2013] [Accepted: 10/11/2013] [Indexed: 12/20/2022] Open
Abstract
The success of genome-scale metabolic modeling is contingent on a model's ability to accurately predict growth and metabolic behaviors. To date, little focus has been directed towards developing systematic methods of proposing, modifying and interrogating an organism's biomass requirements that are used in constraint-based models. To address this gap, the biomass modification and generation (BioMog) framework was created and used to generate lists of biomass components de novo, as well as to modify predefined biomass component lists, for models of Escherichia coli (iJO1366) and of Shewanella oneidensis (iSO783) from high-throughput growth phenotype and fitness datasets. BioMog's de novo biomass component lists included, either implicitly or explicitly, up to seventy percent of the components included in the predefined biomass equations, and the resulting de novo biomass equations outperformed the predefined biomass equations at qualitatively predicting mutant growth phenotypes by up to five percent. Additionally, the BioMog procedure can quantify how many experiments support or refute a particular metabolite's essentiality to a cell, and it facilitates the determination of inconsistent experiments and inaccurate reaction and/or gene-to-reaction associations. To further interrogate metabolite essentiality, the BioMog framework includes an experiment generation algorithm that allows for the design of experiments to test whether a metabolite is essential. Using BioMog, we correct experimental results relating to the essentiality of the thyA gene in E. coli, as well as perform knockout experiments supporting the essentiality of protoheme. With these capabilities, BioMog can be a valuable resource for analyzing growth phenotyping data and a component of a model developer's toolbox.
Affiliation(s)
- Christopher J. Tervo
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Jennifer L. Reed
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- * E-mail:
|
43
|
Montagud A, Gamermann D, Fernández de Córdoba P, Urchueguía JF. Synechocystis sp. PCC6803 metabolic models for the enhanced production of hydrogen. Crit Rev Biotechnol 2013; 35:184-98. [PMID: 24090244 DOI: 10.3109/07388551.2013.829799] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/22/2022]
Abstract
In the present economy, difficulty in accessing energy sources is a real obstacle to maintaining our current lifestyle. In fact, increasing interest has gathered around efficient strategies to use energy sources that do not generate high CO2 titers. Thus, science-funding agencies have invested more resources into research on hydrogen, among other biofuels, as an interesting energy vector. This article reviews present energy challenges and frames them within the present fuel usage landscape. Different strategies for hydrogen production are explained and evaluated. Focus is on biological hydrogen production; fermentation and photon-fuelled hydrogen production are compared. Mathematical models in biology can be used to assess, explore and design production strategies for industrially relevant metabolites, such as biofuels. We assess the diverse construction and uses of genome-scale metabolic models of the cyanobacterium Synechocystis sp. PCC6803 to efficiently obtain biofuels. This organism has been studied as a potential photon-fuelled production platform for its ability to grow from carbon dioxide, water and photons, on simple culture media. Finally, we review studies that propose production strategies to weigh this organism's viability as a biofuel production platform. Overall, the work presented in this review unveils the industrial capabilities of the cyanobacterium Synechocystis sp. PCC6803 to evolve interesting metabolites as a clean biofuel production platform.
Affiliation(s)
- Arnau Montagud
- Instituto Universitario de Matemática Pura y Aplicada, Universitat Politècnica de València, Valencia, Spain
|
44
|
Seifert J, Herbst FA, Halkjaer Nielsen P, Planes FJ, Jehmlich N, Ferrer M, von Bergen M. Bioinformatic progress and applications in metaproteogenomics for bridging the gap between genomic sequences and metabolic functions in microbial communities. Proteomics 2013; 13:2786-804. [PMID: 23625762 DOI: 10.1002/pmic.201200566] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 12/17/2012] [Revised: 03/07/2013] [Accepted: 03/28/2013] [Indexed: 11/06/2022]
Abstract
Metaproteomics of microbial communities promises to add functional information to the blueprint of genes derived from metagenomics. Right from its beginning, the achievements and developments in metaproteomics were closely interlinked with metagenomics. In addition, the evaluation, visualization, and interpretation of metaproteome data demanded developments in bioinformatics. This review will give an overview of recent strategies to use genomic data, either from public databases or from organism-specific genomes/metagenomes, to increase the number of identified proteins obtained by mass spectrometric measurements. We will review different published metaproteogenomic approaches with respect to the MS pipeline and the protein identification workflow used. Furthermore, different approaches to data visualization and strategies for phylogenetic interpretation of metaproteome data are discussed, as well as approaches for functional mapping of the results to the investigated biological systems. This information will in the end allow a comprehensive analysis of interactions and interdependencies within microbial communities.
Affiliation(s)
- Jana Seifert
- Department of Proteomics, UFZ-Helmholtz Centre for Environmental Research, Leipzig, Germany; Institute of Animal Nutrition, University of Hohenheim, Stuttgart, Germany
|
45
|
Dreyfuss JM, Zucker JD, Hood HM, Ocasio LR, Sachs MS, Galagan JE. Reconstruction and validation of a genome-scale metabolic model for the filamentous fungus Neurospora crassa using FARM. PLoS Comput Biol 2013; 9:e1003126. [PMID: 23935467 PMCID: PMC3730674 DOI: 10.1371/journal.pcbi.1003126] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 01/02/2013] [Accepted: 05/20/2013] [Indexed: 11/18/2022] Open
Abstract
The filamentous fungus Neurospora crassa played a central role in the development of twentieth-century genetics, biochemistry and molecular biology, and continues to serve as a model organism for eukaryotic biology. Here, we have reconstructed a genome-scale model of its metabolism. This model consists of 836 metabolic genes, 257 pathways, 6 cellular compartments, and is supported by extensive manual curation of 491 literature citations. To aid our reconstruction, we developed three optimization-based algorithms, which together comprise Fast Automated Reconstruction of Metabolism (FARM). These algorithms are: LInear MEtabolite Dilution Flux Balance Analysis (limed-FBA), which predicts flux while linearly accounting for metabolite dilution; One-step functional Pruning (OnePrune), which removes blocked reactions with a single compact linear program; and Consistent Reproduction Of growth/no-growth Phenotype (CROP), which reconciles differences between in silico and experimental gene essentiality faster than previous approaches. Against an independent test set of more than 300 essential/non-essential genes that were not used to train the model, the model displays 93% sensitivity and specificity. We also used the model to simulate the biochemical genetics experiments originally performed on Neurospora by comprehensively predicting nutrient rescue of essential genes and synthetic lethal interactions, and we provide detailed pathway-based mechanistic explanations of our predictions. Our model provides a reliable computational framework for the integration and interpretation of ongoing experimental efforts in Neurospora, and we anticipate that our methods will substantially reduce the manual effort required to develop high-quality genome-scale metabolic models for other organisms.
Few organisms have been as foundational to the development of modern genetics and cellular metabolism as Neurospora crassa. Given the wealth of knowledge available for this filamentous fungus, the effort required to manually curate a high-quality genome-scale metabolic reconstruction would be daunting. To aid the reconstruction process, we developed three optimization-based algorithms. The first algorithm predicts flux while linearly accounting for metabolite dilution; the second algorithm removes blocked reactions with one compact linear program; and the third algorithm reconciles differences between in silico predictions and experimental observations of mutant viability. We have used these algorithms to develop the first genome-scale metabolic model for Neurospora. We have validated the accuracy of our model against an independent test set of more than 300 growth/no-growth phenotypes, and our model displays 93% sensitivity and specificity. Simulating the biochemical genetics experiments originally performed on Neurospora, we comprehensively predicted essential genes, nutrient rescues of auxotroph mutants and synthetic lethal interactions. With these predictions, we provide potential mechanistic insight into known mutant phenotypes, and testable hypotheses for novel mutant phenotypes. The model, the algorithms and the testable hypotheses provide a computational foundation for the study of Neurospora crassa metabolism.
Affiliation(s)
- Jonathan M. Dreyfuss
- Graduate Program in Bioinformatics, Boston University, Boston, Massachusetts, United States of America
- Jeremy D. Zucker
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Tardigrade Biotechnologies, Jamaica Plain, Massachusetts, United States of America
- Heather M. Hood
- Institute of Environmental Health, Oregon Health & Science University, Portland, Oregon, United States of America
- Linda R. Ocasio
- Tardigrade Biotechnologies, Jamaica Plain, Massachusetts, United States of America
- Matthew S. Sachs
- Department of Biology, Texas A&M University, College Station, Texas, United States of America
- James E. Galagan
- Graduate Program in Bioinformatics, Boston University, Boston, Massachusetts, United States of America
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts, United States of America
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- * E-mail:
|
46
|
Knoop H, Gründel M, Zilliges Y, Lehmann R, Hoffmann S, Lockau W, Steuer R. Flux balance analysis of cyanobacterial metabolism: the metabolic network of Synechocystis sp. PCC 6803. PLoS Comput Biol 2013; 9:e1003081. [PMID: 23843751 PMCID: PMC3699288 DOI: 10.1371/journal.pcbi.1003081] [Citation(s) in RCA: 189] [Impact Index Per Article: 17.2] [Received: 09/15/2012] [Accepted: 04/15/2013] [Indexed: 12/18/2022] Open
Abstract
Cyanobacteria are versatile unicellular phototrophic microorganisms that are highly abundant in many environments. Owing to their capability to utilize solar energy and atmospheric carbon dioxide for growth, cyanobacteria are increasingly recognized as a prolific resource for the synthesis of valuable chemicals and various biofuels. To fully harness the metabolic capabilities of cyanobacteria necessitates an in-depth understanding of the metabolic interconversions taking place during phototrophic growth, as provided by genome-scale reconstructions of microbial organisms. Here we present an extended reconstruction and analysis of the metabolic network of the unicellular cyanobacterium Synechocystis sp. PCC 6803. Building upon several recent reconstructions of cyanobacterial metabolism, unclear reaction steps are experimentally validated and the functional consequences of unknown or dissenting pathway topologies are discussed. The updated model integrates novel results with respect to the cyanobacterial TCA cycle, an alleged glyoxylate shunt, and the role of photorespiration in cellular growth. Going beyond conventional flux-balance analysis, we extend the computational analysis to diurnal light/dark cycles of cyanobacterial metabolism.
Affiliation(s)
- Henning Knoop
- Humboldt-Universität zu Berlin, Institut für Theoretische Biologie, Berlin, Germany
- * E-mail: (HK); (RS)
- Marianne Gründel
- Humboldt-Universität zu Berlin, Institut für Biologie, Berlin, Germany
- Yvonne Zilliges
- Humboldt-Universität zu Berlin, Institut für Biologie, Berlin, Germany
- Robert Lehmann
- Humboldt-Universität zu Berlin, Institut für Theoretische Biologie, Berlin, Germany
- Sabrina Hoffmann
- Humboldt-Universität zu Berlin, Institut für Theoretische Biologie, Berlin, Germany
- Wolfgang Lockau
- Humboldt-Universität zu Berlin, Institut für Biologie, Berlin, Germany
- Ralf Steuer
- Humboldt-Universität zu Berlin, Institut für Theoretische Biologie, Berlin, Germany
- CzechGlobe - Global Change Research Center, Academy of Sciences of the Czech Republic, Brno, Czech Republic
- * E-mail: (HK); (RS)
|
47
|
Fong NL, Lerman JA, Lam I, Palsson BO, Charusanti P. Reconciling a Salmonella enterica metabolic model with experimental data confirms that overexpression of the glyoxylate shunt can rescue a lethal ppc deletion mutant. FEMS Microbiol Lett 2013; 342:62-9. [PMID: 23432746 DOI: 10.1111/1574-6968.12109] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 01/26/2013] [Revised: 02/18/2013] [Accepted: 02/18/2013] [Indexed: 11/27/2022] Open
Abstract
The in silico reconstruction of metabolic networks has become an effective and useful systems biology approach to predict and explain many different cellular phenotypes. When simulation outputs do not match experimental data, the source of the inconsistency can often be traced to incomplete biological information that is consequently not captured in the model. To address this problem, general approaches continue to be needed that can suggest experimentally testable hypotheses to reconcile inconsistencies between simulation and experimental data. Here, we present such an approach that focuses specifically on correcting cases in which experimental data show a particular gene to be essential but model simulations do not. We use metabolic models to predict efficient compensatory pathways, after which cloning and overexpression of these pathways are performed to investigate whether they restore growth and to help determine why these compensatory pathways are not active in mutant cells. We demonstrate this technique for a ppc knockout of Salmonella enterica serovar Typhimurium; the inability of cells to route flux through the glyoxylate shunt when ppc is removed was correctly identified by our approach as the cause of the discrepancy. These results demonstrate the feasibility of our approach to drive biological discovery while simultaneously refining metabolic network reconstructions.
Affiliation(s)
- Nicole L Fong
- Department of Bioengineering, University of California, San Diego, CA 92093-0412, USA
|