1
Qian J, Ye C. Development and applications of genome-scale metabolic network models. Advances in Applied Microbiology 2024; 126:1-26. [PMID: 38637105 DOI: 10.1016/bs.aambs.2024.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2024]
Abstract
The genome-scale metabolic network model (GSMM) is an effective tool for characterizing the gene-protein-reaction relationship throughout the metabolic pathways of an organism. By combining various algorithms, a GSMM can simulate the influence of a specific environment on the physiological state of cells, optimize the culture conditions of strains, and predict targets of genetic modification for targeted strain engineering. In this review, we summarize the whole process of model building, survey the tools that may be involved in that process, and explain the role of various algorithms in model analysis. We also summarize the applications of GSMMs to network characteristics, cell phenotypes, and metabolic engineering. Finally, we discuss the current challenges facing GSMMs.
Affiliation(s)
- Jinyi Qian
- Ministry of Education Key Laboratory of NSLSCS, Nanjing Normal University, Nanjing, PR China
- Chao Ye
- Ministry of Education Key Laboratory of NSLSCS, Nanjing Normal University, Nanjing, PR China; School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, Nanjing, PR China.

2
Siddharth T, Lewis NE. Predicting pathways for old and new metabolites through clustering. J Theor Biol 2024; 578:111684. [PMID: 38048983 PMCID: PMC11139542 DOI: 10.1016/j.jtbi.2023.111684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 11/17/2023] [Accepted: 11/29/2023] [Indexed: 12/06/2023]
Abstract
The diverse metabolic pathways are fundamental to all living organisms, as they harvest energy, synthesize biomass components, produce molecules to interact with the microenvironment, and neutralize toxins. While the discovery of new metabolites and pathways continues, the prediction of pathways for new metabolites can be challenging. It can take vast amounts of time to elucidate pathways for new metabolites; thus, according to HMDB (Human Metabolome Database), only 60% of metabolites get assigned to pathways. Here, we present an approach to identify pathways based on metabolite structure. We extracted 201 features from SMILES annotations and identified new metabolites from PubMed abstracts and HMDB. After applying clustering algorithms to both groups of features, we quantified correlations between metabolites, and found the clusters accurately linked 92% of known metabolites to their respective pathways. Thus, this approach could be valuable for predicting metabolic pathways for new metabolites.
Affiliation(s)
- Thiru Siddharth
- Department of Computer Science and Engineering, Indian Institute of Information Technology, Bhopal, MP 462003, India
- Nathan E Lewis
- Department of Pediatrics and Bioengineering, University of California San Diego, La Jolla, CA 92093, USA.

3
Loira N, Zhukova A, Sherman DJ. Pantograph: A template-based method for genome-scale metabolic model reconstruction. J Bioinform Comput Biol 2015; 13:1550006. [PMID: 25572717 DOI: 10.1142/s0219720015500067] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 12/25/2022]
Abstract
Genome-scale metabolic models are a powerful tool to study the inner workings of biological systems and to guide applications. The advent of cheap sequencing has brought the opportunity to create metabolic maps of biotechnologically interesting organisms. While this drives the development of new methods and automatic tools, network reconstruction remains a time-consuming process where extensive manual curation is required. This curation introduces specific knowledge about the modeled organism, either explicitly in the form of molecular processes, or indirectly in the form of annotations of the model elements. Paradoxically, this knowledge is usually lost when reconstruction of a different organism is started. We introduce the Pantograph method for metabolic model reconstruction. This method combines a template reaction knowledge base, orthology mappings between two organisms, and experimental phenotypic evidence, to build a genome-scale metabolic model for a target organism. Our method infers implicit knowledge from annotations in the template, and rewrites these inferences to include them in the resulting model of the target organism. The generated model is well suited for manual curation. Scripts for evaluating the model with respect to experimental data are automatically generated, to aid curators in iterative improvement. We present an implementation of the Pantograph method, as a toolbox for genome-scale model reconstruction, curation and validation. This open source package can be obtained from: http://pathtastic.gforge.inria.fr.
Affiliation(s)
- Nicolas Loira
- Center for Mathematical Modeling and Center for Genome Regulation, Universidad de Chile, Beauchef 851, Piso7, Santiago, Chile

4
Zhukova A, Sherman DJ. Knowledge-based generalization of metabolic networks: A practical study. J Bioinform Comput Biol 2014; 12:1441001. [DOI: 10.1142/s0219720014410017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
The complex process of genome-scale metabolic network reconstruction involves semi-automatic reaction inference, analysis, and refinement through curation by human experts. Unfortunately, decisions by experts are hampered by the complexity of the network, which can mask errors in the inferred network. In order to aid an expert in making sense out of the thousands of reactions in the organism's metabolism, we developed a method for knowledge-based generalization that provides a higher-level view of the network, highlighting the particularities and essential structure, while hiding the details. In this study, we show the application of this generalization method to 1,286 metabolic networks of organisms in Path2Models that describe fatty acid metabolism. We compare the generalised networks and show that we successfully highlight the aspects that are important for their curation and comparison.
Affiliation(s)
- Anna Zhukova
- Inria/Université Bordeaux/CNRS Joint Project-Team, MAGNOME F-33405 Talence, France
- David J. Sherman
- Inria/Université Bordeaux/CNRS Joint Project-Team, MAGNOME F-33405 Talence, France

5
The RAVEN toolbox and its use for generating a genome-scale metabolic model for Penicillium chrysogenum. PLoS Comput Biol 2013; 9:e1002980. [PMID: 23555215 PMCID: PMC3605104 DOI: 10.1371/journal.pcbi.1002980] [Citation(s) in RCA: 267] [Impact Index Per Article: 24.3] [Received: 09/11/2012] [Accepted: 01/24/2013] [Indexed: 01/06/2023]
Abstract
We present the RAVEN (Reconstruction, Analysis and Visualization of Metabolic Networks) Toolbox: a software suite that allows for semi-automated reconstruction of genome-scale models. It makes use of published models and/or the KEGG database, coupled with extensive gap-filling and quality control features. The software suite also contains methods for visualizing simulation results and omics data, as well as a range of methods for performing simulations and analyzing the results. The software is a useful tool for system-wide data analysis in a metabolic context and for streamlined reconstruction of metabolic networks based on protein homology. The RAVEN Toolbox workflow was applied in order to reconstruct a genome-scale metabolic model for the important microbial cell factory Penicillium chrysogenum Wisconsin54-1255. The model was validated in a bibliomic study of in total 440 references, and it comprises 1471 unique biochemical reactions and 1006 ORFs. It was then used to study the roles of ATP and NADPH in the biosynthesis of penicillin, and to identify potential metabolic engineering targets for maximization of penicillin production. Genome-scale models (GEMs) are large stoichiometric models of cell metabolism, where the goal is to incorporate every metabolic transformation that an organism can perform. Such models have been extensively used for the study of bacterial metabolism, in particular for metabolic engineering purposes. More recently, the use of GEMs for eukaryotic organisms has become increasingly widespread. Since these models typically involve thousands of metabolic reactions, the reconstruction and validation of them can be a very complex task. We have developed a software suite, RAVEN Toolbox, which aims at automating parts of the reconstruction process in order to allow for faster reconstruction of high-quality GEMs. 
The software is particularly well suited for reconstruction of models for eukaryotic organisms, due to how it deals with sub-cellular localization of reactions. We used the software for reconstructing a model of the filamentous fungus Penicillium chrysogenum, the organism used in penicillin production and an important microbial cell factory. The resulting model was validated through an extensive literature survey and by comparison with published fermentation data. The model was used for the identification of transcriptionally regulated metabolic bottlenecks in order to increase the yield in penicillin fermentations. In this paper we present the RAVEN Toolbox and the GEM for P. chrysogenum.

6
Arakawa K, Tomita M. Merging multiple omics datasets in silico: statistical analyses and data interpretation. Methods Mol Biol 2013; 985:459-70. [PMID: 23417818 DOI: 10.1007/978-1-62703-299-5_23] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Abstract
By the combinations of high-throughput analytical technologies in the fields of transcriptomics, proteomics, and metabolomics, we are now able to gain comprehensive and quantitative snapshots of the intracellular processes. Dynamic intracellular activities and their regulations can be elucidated by systematic observation of these multi-omics data. On the other hand, careful statistical analysis is necessary for such integration, since each of the omics layers as well as the specific analytical methodologies harbor different levels of noise and variations. Moreover, interpretation of such multitude of data requires an intuitive pathway context. Here we describe such statistical methods for the integration and comparison of multi-omics data, as well as the computational methods for pathway reconstruction, ID conversion, mapping, and visualization that play key roles for the efficient study of multi-omics information.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa, Japan.

7
Abstract
Metabolism can be defined as the complete set of chemical reactions that occur in living organisms in order to maintain life. Enzymes are the main players in this process, as they are responsible for catalyzing the chemical reactions. The enzyme-reaction relationships can be used to reconstruct a network of reactions, which yields a model of metabolism. A genome-scale metabolic network of the chemical reactions that take place inside a living organism is reconstructed primarily from the information present in its genome and in the literature. Reconstruction involves steps such as functional annotation of the genome, identification of the associated reactions and determination of their stoichiometry, assignment of localization, determination of the biomass composition, estimation of energy requirements, and definition of model constraints. This information can be integrated into a stoichiometric model of metabolism, which can be used for detailed analysis of the metabolic potential of the organism using constraint-based modeling approaches and hence is valuable in understanding its metabolic capabilities.
Affiliation(s)
- Gino J E Baart
- VIB Department of Plant Systems Biology/Department of Biology, Protistology and Aquatic Ecology, Ghent University, Ghent, Belgium.
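
The stoichiometric model mentioned in the abstract for this entry can be illustrated with a minimal sketch. The three-reaction toy network, the matrix `S`, and the function names below are hypothetical and are not taken from the cited chapter; the sketch only shows the steady-state mass balance S·v = 0 that constraint-based approaches typically impose.

```python
# Hypothetical toy network (not from the cited chapter):
#   R1: -> A      (uptake)
#   R2: A -> B    (conversion)
#   R3: B ->      (export)
# Rows are metabolites, columns are reactions.
S = [
    [1, -1, 0],   # metabolite A
    [0, 1, -1],   # metabolite B
]


def mass_balance(S, v):
    """Return S . v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


def is_steady_state(S, v, tol=1e-9):
    """True when every metabolite is balanced (S . v = 0 within tol)."""
    return all(abs(x) <= tol for x in mass_balance(S, v))


if __name__ == "__main__":
    print(is_steady_state(S, [2.0, 2.0, 2.0]))  # balanced flux vector
    print(mass_balance(S, [2.0, 1.0, 1.0]))     # A accumulates here
```

A real genome-scale model would add flux bounds and an objective (for example, biomass formation) on top of this balance; those parts are omitted here.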

8
Han TL, Cannon RD, Villas-Bôas SG. The metabolic basis of Candida albicans morphogenesis and quorum sensing. Fungal Genet Biol 2011; 48:747-63. [DOI: 10.1016/j.fgb.2011.04.002] [Citation(s) in RCA: 115] [Impact Index Per Article: 8.8] [Received: 07/23/2009] [Revised: 03/07/2011] [Accepted: 04/05/2011] [Indexed: 12/15/2022]

9
Dale JM, Popescu L, Karp PD. Machine learning methods for metabolic pathway prediction. BMC Bioinformatics 2010; 11:15. [PMID: 20064214 PMCID: PMC3146072 DOI: 10.1186/1471-2105-11-15] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.6] [Received: 08/03/2009] [Accepted: 01/08/2010] [Indexed: 12/29/2022]
Abstract
Background: A key challenge in systems biology is the reconstruction of an organism's metabolic network from its genome sequence. One strategy for addressing this problem is to predict which metabolic pathways, from a reference database of known pathways, are present in the organism, based on the annotated genome of the organism.
Results: To quantitatively validate methods for pathway prediction, we developed a large "gold standard" dataset of 5,610 pathway instances known to be present or absent in curated metabolic pathway databases for six organisms. We defined a collection of 123 pathway features, whose information content we evaluated with respect to the gold standard. Feature data were used as input to an extensive collection of machine learning (ML) methods, including naïve Bayes, decision trees, and logistic regression, together with feature selection and ensemble methods. We compared the ML methods to the previous PathoLogic algorithm for pathway prediction using the gold standard dataset. We found that ML-based prediction methods can match the performance of the PathoLogic algorithm. PathoLogic achieved an accuracy of 91% and an F-measure of 0.786. The ML-based prediction methods achieved accuracy as high as 91.2% and F-measure as high as 0.787. The ML-based methods output a probability for each predicted pathway, whereas PathoLogic does not, which provides more information to the user and facilitates filtering of predicted pathways.
Conclusions: ML methods for pathway prediction perform as well as existing methods, and have qualitative advantages in terms of extensibility, tunability, and explainability. More advanced prediction methods and/or more sophisticated input features may improve the performance of ML methods. However, pathway prediction performance appears to be limited largely by the ability to correctly match enzymes to the reactions they catalyze based on genome annotations.
Affiliation(s)
- Joseph M Dale
- Bioinformatics Research Group, SRI International, 333 Ravenswood Ave, Menlo Park, CA 94025, USA
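
The accuracy and F-measure reported in the abstract for this entry are standard binary-classification metrics. As a minimal sketch, they can be computed from confusion-matrix counts as shown below. The function name and the counts in the usage example are hypothetical and do not reproduce the paper's gold-standard data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F-measure (F1).

    tp/fp/fn/tn are counts of pathway predictions classed as true
    positive, false positive, false negative and true negative.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f_measure = 0.0
    else:
        f_measure = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Because absent pathways usually outnumber present ones, accuracy alone can look high even when many present pathways are missed, which is one reason to report an F-measure alongside it.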

10
Schneider J, Vorhölter FJ, Trost E, Blom J, Musa Y, Neuweger H, Niehaus K, Schatschneider S, Tauch A, Goesmann A. CARMEN - Comparative Analysis and in silico Reconstruction of organism-specific MEtabolic Networks. Genetics and Molecular Research 2010; 9:1660-72. [DOI: 10.4238/vol9-3gmr901] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/03/2022]

11
Melzer G, Esfandabadi ME, Franco-Lara E, Wittmann C. Flux Design: In silico design of cell factories based on correlation of pathway fluxes to desired properties. BMC Systems Biology 2009; 3:120. [PMID: 20035624 PMCID: PMC2808316 DOI: 10.1186/1752-0509-3-120] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Received: 06/04/2009] [Accepted: 12/25/2009] [Indexed: 02/07/2023]
Abstract
Background: The identification of genetic target genes is a key step for rational engineering of production strains towards bio-based chemicals, fuels or therapeutics. This is often a difficult task, because superior production performance typically requires a combination of multiple targets, whereby the complex metabolic networks complicate straightforward identification. Recent attempts towards target prediction mainly focus on the prediction of gene deletion targets and therefore can cover only a part of the genetic modifications proven valuable in metabolic engineering. Efficient in silico methods for simultaneous genome-scale identification of targets to be amplified or deleted are still lacking.
Results: Here we propose the identification of targets via flux correlation to a chosen objective flux as an approach towards improved biotechnological production strains with optimally designed fluxes. The approach, which we name Flux Design, computes elementary modes and, by searching through the modes, identifies targets to be amplified (positive correlation) or down-regulated (negative correlation). Supported by statistical evaluation, a target potential is attributed to the identified reactions in a quantitative manner. Based on systems-wide models of the industrial microorganisms Corynebacterium glutamicum and Aspergillus niger, more than 20,000 modes were obtained for each case, differing strongly in production performance and intracellular fluxes. For lysine production in C. glutamicum, the identified targets closely matched reported successful metabolic engineering strategies. In addition, simulations revealed insights, e.g. into the flexibility of energy metabolism. For enzyme production in A. niger, flux correlation analysis suggested a number of targets, including non-obvious ones. The relevance of most targets depended on the metabolic state of the cell and also on the carbon source.
Conclusions: Objective flux correlation analysis provided a detailed insight into the metabolic networks of industrially relevant prokaryotic and eukaryotic microorganisms. It was shown that capacity, pathway usage, and relevant genetic targets for optimal production partly depend on the network structure and the metabolic state of the cell, which should be considered in future metabolic engineering strategies. The presented strategy can be generally used to identify priority-sorted amplification and deletion targets for metabolic engineering purposes under various conditions and is thus a useful strategy to incorporate into efficient strain and bioprocess optimization.
Affiliation(s)
- Guido Melzer
- Institute of Biochemical Engineering, Technische Universität Braunschweig, Gaussstr 17, 38106 Braunschweig, Germany.
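
The flux-correlation idea summarized in the abstract for this entry can be sketched as follows. This is an illustrative simplification, not the authors' implementation: the four toy "modes", the reaction names, the `threshold` parameter, and the plain Pearson correlation are all assumptions, and the paper's statistical evaluation and quantitative target potential are not reproduced.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def classify_targets(modes, objective, threshold=0.7):
    """Label reactions whose flux correlates with the objective flux.

    modes: list of dicts mapping reaction name -> flux in one mode.
    threshold: hypothetical cutoff on |r|; not a value from the paper.
    """
    obj = [m[objective] for m in modes]
    targets = {}
    for rxn in modes[0]:
        if rxn == objective:
            continue
        r = pearson([m[rxn] for m in modes], obj)
        if r >= threshold:
            targets[rxn] = "amplify"          # positive correlation
        elif r <= -threshold:
            targets[rxn] = "down-regulate"    # negative correlation
    return targets


if __name__ == "__main__":
    # Hypothetical flux values for four modes.
    modes = [
        {"product": 0.0, "R_a": 1.0, "R_b": 4.0, "R_c": 2.0},
        {"product": 1.0, "R_a": 2.0, "R_b": 3.0, "R_c": 2.0},
        {"product": 2.0, "R_a": 3.0, "R_b": 2.0, "R_c": 2.0},
        {"product": 3.0, "R_a": 4.0, "R_b": 1.0, "R_c": 2.0},
    ]
    print(classify_targets(modes, "product"))
```

In this toy data `R_a` rises with the product flux, `R_b` falls, and `R_c` is constant, so only the first two are flagged. Computing the elementary modes themselves is a separate step that this sketch takes as given.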

12
Duan Y, Zhou L, Hall DG, Li W, Doddapaneni H, Lin H, Liu L, Vahling CM, Gabriel DW, Williams KP, Dickerman A, Sun Y, Gottwald T. Complete genome sequence of citrus huanglongbing bacterium, 'Candidatus Liberibacter asiaticus' obtained through metagenomics. Molecular Plant-Microbe Interactions 2009; 22:1011-20. [PMID: 19589076 DOI: 10.1094/mpmi-22-8-1011] [Citation(s) in RCA: 339] [Impact Index Per Article: 22.6] [Indexed: 05/19/2023]
Abstract
Citrus huanglongbing is the most destructive disease of citrus worldwide. It is spread by citrus psyllids and is associated with a low-titer, phloem-limited infection by any of three uncultured species of alpha-Proteobacteria, 'Candidatus Liberibacter asiaticus', 'Ca. L. americanus', and 'Ca. L. africanus'. A complete circular 'Ca. L. asiaticus' genome has been obtained by metagenomics, using the DNA extracted from a single 'Ca. L. asiaticus'-infected psyllid. The 1.23-Mb genome has an average 36.5% GC content. Annotation revealed a high percentage of genes involved in both cell motility (4.5%) and active transport in general (8.0%), which may contribute to its virulence. 'Ca. L. asiaticus' appears to have a limited ability for aerobic respiration and is likely auxotrophic for at least five amino acids. Consistent with its intracellular nature, 'Ca. L. asiaticus' lacks type III and type IV secretion systems as well as typical free-living or plant-colonizing extracellular degradative enzymes. 'Ca. L. asiaticus' appears to have all type I secretion system genes needed for both multidrug efflux and toxin effector secretion. Multi-protein phylogenetic analysis confirmed 'Ca. L. asiaticus' as an early-branching and highly divergent member of the family Rhizobiaceae. This is the first genome sequence of an uncultured alpha-proteobacterium that is both an intracellular plant pathogen and an insect symbiont.

13
Durot M, Bourguignon PY, Schachter V. Genome-scale models of bacterial metabolism: reconstruction and applications. FEMS Microbiol Rev 2009; 33:164-90. [PMID: 19067749 PMCID: PMC2704943 DOI: 10.1111/j.1574-6976.2008.00146.x] [Citation(s) in RCA: 195] [Impact Index Per Article: 13.0] [Received: 07/30/2008] [Revised: 10/22/2008] [Accepted: 10/22/2008] [Indexed: 12/16/2022]
Abstract
Genome-scale metabolic models bridge the gap between genome-derived biochemical information and metabolic phenotypes in a principled manner, providing a solid interpretative framework for experimental data related to metabolic states, and enabling simple in silico experiments with whole-cell metabolism. Models have been reconstructed for almost 20 bacterial species, so far mainly through expert curation efforts integrating information from the literature with genome annotation. A wide variety of computational methods exploiting metabolic models have been developed and applied to bacteria, yielding valuable insights into bacterial metabolism and evolution, and providing a sound basis for computer-assisted design in metabolic engineering. Recent advances in computational systems biology and high-throughput experimental technologies pave the way for the systematic reconstruction of metabolic models from genomes of new species, and a corresponding expansion of the scope of their applications. In this review, we provide an introduction to the key ideas of metabolic modeling, survey the methods, and resources that enable model reconstruction and refinement, and chart applications to the investigation of global properties of metabolic systems, the interpretation of experimental results, and the re-engineering of their biochemical capabilities.
Affiliation(s)
- Maxime Durot
- Genoscope (CEA) and UMR 8030 CNRS-Genoscope-Université d'Evry, Evry, France

14
Sun J, Lu X, Rinas U, Zeng AP. Metabolic peculiarities of Aspergillus niger disclosed by comparative metabolic genomics. Genome Biol 2008; 8:R182. [PMID: 17784953 PMCID: PMC2375020 DOI: 10.1186/gb-2007-8-9-r182] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 02/27/2007] [Revised: 07/13/2007] [Accepted: 09/04/2007] [Indexed: 11/10/2022]
Abstract
A genome-scale metabolic network and an in-depth genomic comparison of Aspergillus niger with seven other fungi is presented, revealing more than 1,100 enzyme-coding genes that are unique to A. niger.
Background: Aspergillus niger is an important industrial microorganism for the production of both metabolites, such as citric acid, and proteins, such as fungal enzymes or heterologous proteins. Despite its extensive industrial applications, the genetic inventory of this fungus is only partially understood. The recently released genome sequence opens a new horizon for both scientific studies and biotechnological applications.
Results: Here, we present the first genome-scale metabolic network for A. niger and an in-depth genomic comparison of this species to seven other fungi to disclose its metabolic peculiarities. The raw genomic sequences of A. niger ATCC 9029 were first annotated. The reconstructed metabolic network is based on the annotation of two A. niger genomes, CBS 513.88 and ATCC 9029, including enzymes with 988 unique EC numbers, 2,443 reactions and 2,349 metabolites. More than 1,100 enzyme-coding genes are unique to A. niger in comparison to the other seven fungi. For example, we identified additional copies of genes such as those encoding alternative mitochondrial oxidoreductase and citrate synthase in A. niger, which might contribute to the high citric acid production efficiency of this species. Moreover, nine genes were identified as encoding enzymes with EC numbers exclusively found in A. niger, mostly involved in the biosynthesis of complex secondary metabolites and degradation of aromatic compounds.
Conclusion: The genome-level reconstruction of the metabolic network and genome-based metabolic comparison disclose peculiarities of A. niger highly relevant to its biotechnological applications and should contribute to future rational metabolic design and systems biology studies of this black mold and related species.
Affiliation(s)
- Jibin Sun
- Helmholtz Centre for Infection Research, Inhoffenstr., 38124 Braunschweig, Germany
- Xin Lu
- Helmholtz Centre for Infection Research, Inhoffenstr., 38124 Braunschweig, Germany
- Ursula Rinas
- Helmholtz Centre for Infection Research, Inhoffenstr., 38124 Braunschweig, Germany
- An Ping Zeng
- Helmholtz Centre for Infection Research, Inhoffenstr., 38124 Braunschweig, Germany
- Hamburg University of Technology, Institute of Bioprocess and Biosystems Engineering, Denickestr., 21071 Hamburg, Germany

15
Zhang Q, Teng H, Sun Y, Xiu Z, Zeng A. Metabolic flux and robustness analysis of glycerol metabolism in Klebsiella pneumoniae. Bioprocess Biosyst Eng 2007; 31:127-35. [PMID: 17713793 DOI: 10.1007/s00449-007-0155-7] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Received: 05/19/2007] [Accepted: 08/02/2007] [Indexed: 11/24/2022]
Abstract
Knowledge of the mechanism of flux distribution benefits the understanding of cell physiology and the regulation of metabolism. In this study, fluxes measured under steady-state conditions were used to estimate intracellular fluxes and to assess, by metabolic flux analysis, the robustness of branch points in the anaerobic glycerol metabolism of Klebsiella pneumoniae for the production of 1,3-propanediol. The biomass concentration increased as NADH(2)/NAD(+) decreased at low initial glycerol concentration, and the trend was reversed at high initial glycerol concentration. The flux distribution revealed that the branch points of glycerol and dihydroxyacetone phosphate were rigid to the environmental conditions. However, the pyruvate and acetyl coenzyme A metabolisms gave cells the flexibility to regulate the energy and intermediate fluxes under various environmental conditions. Additionally, the formation rate of ethanol and the ratio of pyruvate dehydrogenase to pyruvate formate lyase fluctuated visibly at high glycerol uptake rates.
Affiliation(s)
- Qingrui Zhang
- Department of Bioscience and Biotechnology, Dalian University of Technology, Linggong Road 2, Dalian, 116023, People's Republic of China

16
Toward the automated generation of genome-scale metabolic networks in the SEED. BMC Bioinformatics 2007; 8:139. [PMID: 17462086 PMCID: PMC1868769 DOI: 10.1186/1471-2105-8-139] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.5] [Received: 11/22/2006] [Accepted: 04/26/2007] [Indexed: 12/13/2022]
Abstract
Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process.
Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis.
Conclusion: Our method sets the stage for the automated generation of substantially complete metabolic networks for over 400 complete genome sequences currently in the SEED. With each genome that is processed using our tools, the database of common components grows to cover more of the diversity of metabolic pathways. This increases the likelihood that components of reaction networks for subsequently processed genomes can be retrieved from the database, rather than assembled and verified manually.

17
Deckwer WD, Jahn D, Hempel D, Zeng AP. Systems Biology Approaches to Bioprocess Development. Eng Life Sci 2006. [DOI: 10.1002/elsc.200620153] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]

18
Sun J, Wang W, Hundertmark C, Zeng AP, Jahn D, Deckwer WD. A protein database constructed from low-coverage genomic sequence of Bacillus megaterium and its use for accelerated proteomic analysis. J Biotechnol 2006; 124:486-95. [PMID: 16567015 DOI: 10.1016/j.jbiotec.2006.01.033] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 10/26/2005] [Revised: 12/27/2005] [Accepted: 01/13/2006] [Indexed: 11/18/2022]
Abstract
Peptide mass fingerprint (PMF) matching is a high-throughput method used for protein spot identification in connection with two-dimensional gel electrophoresis (2DE). However, the success of PMF matching largely depends on whether the proteins to be identified exist in the database searched. Consequently, it is often necessary to apply other more sophisticated but also time-consuming technologies to generate sequence-tags for definitive protein identification. On the other hand, modern sequencing technologies are generating a large quantity of DNA sequences, first in unfinished form or with low genome coverage due to the time-consuming and thus limiting steps of finishing and annotation. We recently started to sequence the genome of Bacillus megaterium DSM 319, a bacterium of industrial interest. In this study, we demonstrate that a protein database generated from merely three-fold coverage, unfinished genomic sequences of this bacterium allows a fast and reliable protein spot identification solely based on PMF from high-throughput MALDI-TOF MS analysis. We further show that the strain-specific protein database from low coverage genomic sequence greatly outperforms the commonly used cross-species databases constructed from 13 completely sequenced Bacillus strains for protein spot identification via PMF.
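The dependence of PMF matching on the searched database can be illustrated with a minimal sketch. It is not the authors' pipeline: the protein names and sequences are hypothetical, the cleavage rule (after K or R, not before P) is the common trypsin convention, and neutral monoisotopic masses are compared directly instead of the [M+H]+ values a MALDI-TOF instrument would report. It assumes Python 3.7 or later, where `re.split` accepts zero-width matches.

```python
import re

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to every free peptide


def tryptic_peptides(sequence):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def pmf_score(observed, sequence, tol=0.2):
    """Count observed masses that match a theoretical peptide within tol Da."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


def best_match(observed, database, tol=0.2):
    """Return the database entry with the most matched peaks."""
    return max(database, key=lambda name: pmf_score(observed, database[name], tol))


# Hypothetical two-protein database; the "spot" is simulated from protA.
database = {"protA": "MKAPRGLSEKWTR", "protB": "MSTNPKDDLLR"}
observed = [peptide_mass(p) for p in tryptic_peptides(database["protA"])]
print(best_match(observed, database))
```

If `protA` were absent from `database`, no entry would score above zero, which is the situation a strain-specific database is meant to avoid.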
Affiliation(s)
- Jibin Sun
- Group Systems Biology, GBF-German Research Center for Biotechnology, Mascheroder Weg 1, 38124 Braunschweig, Germany
19
Notebaart RA, van Enckevort FHJ, Francke C, Siezen RJ, Teusink B. Accelerating the reconstruction of genome-scale metabolic networks. BMC Bioinformatics 2006; 7:296. [PMID: 16772023 PMCID: PMC1550432 DOI: 10.1186/1471-2105-7-296] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.8] [Received: 02/15/2006] [Accepted: 06/13/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND The genomic information of a species allows for the genome-scale reconstruction of its metabolic capacity. Such a metabolic reconstruction gives support to metabolic engineering, but also to integrative bioinformatics and visualization. Sequence-based automatic reconstructions require extensive manual curation, which can be very time-consuming. Therefore, we present a method to accelerate the time-consuming process of network reconstruction for a query species. The method exploits the availability of well-curated metabolic networks and uses high-resolution predictions of gene equivalency between species, allowing the transfer of gene-reaction associations from curated networks. RESULTS We have evaluated the method using Lactococcus lactis IL1403, for which a genome-scale metabolic network was published recently. We recovered most of the gene-reaction associations (i.e. 74 - 85%) which are incorporated in the published network. Moreover, we predicted over 200 additional genes to be associated to reactions, including genes with unknown function, genes for transporters and genes with specific metabolic reactions, which are good candidates for an extension to the previously published network. In a comparison of our developed method with the well-established approach Pathologic, we predicted 186 additional genes to be associated to reactions. We also predicted a relatively high number of complete conserved protein complexes, which are derived from curated metabolic networks, illustrating the potential predictive power of our method for protein complexes. CONCLUSION We show that our methodology can be applied to accelerate the reconstruction of genome-scale metabolic networks by taking optimal advantage of existing, manually curated networks. As orthology detection is the first step in the method, only the translated open reading frames (ORFs) of a newly sequenced genome are necessary to reconstruct a metabolic network. 
As more manually curated metabolic networks become available in the near future, the usefulness of our method in network prediction is likely to increase.
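The transfer of gene-reaction associations through predicted gene equivalency can be sketched as follows. This is a simplified illustration, not the published method: all gene and reaction identifiers are hypothetical, the ortholog map is assumed to be one-to-one, and a protein complex is transferred only when every one of its genes has an ortholog in the query species.

```python
def transfer_associations(curated, orthologs):
    """Map gene-reaction associations from a curated network to a query species.

    curated:   reaction id -> list of gene sets (one set per protein complex)
    orthologs: reference gene -> equivalent query gene
    """
    transferred = {}
    for reaction, complexes in curated.items():
        kept = [
            frozenset(orthologs[g] for g in genes)
            for genes in complexes
            if all(g in orthologs for g in genes)  # keep complete complexes only
        ]
        if kept:
            transferred[reaction] = kept
    return transferred


def gene_reaction_pairs(network):
    """Flatten a network into a set of (gene, reaction) pairs."""
    return {
        (gene, reaction)
        for reaction, complexes in network.items()
        for genes in complexes
        for gene in genes
    }


def recovered_fraction(predicted, published):
    """Fraction of published gene-reaction pairs that were also predicted."""
    published_pairs = gene_reaction_pairs(published)
    return len(gene_reaction_pairs(predicted) & published_pairs) / len(published_pairs)


# Hypothetical curated reference network and ortholog predictions.
curated = {
    "R1": [{"refA"}, {"refB"}],   # two isoenzymes
    "R2": [{"refC", "refD"}],     # one two-subunit complex
    "R3": [{"refE"}],
}
orthologs = {"refA": "qry1", "refC": "qry2", "refD": "qry3"}

predicted = transfer_associations(curated, orthologs)

# Hypothetical published network of the query species, used for evaluation.
published = {"R1": [{"qry1"}], "R2": [{"qry2", "qry3"}], "R3": [{"qry9"}]}
print(recovered_fraction(predicted, published))
```

The recovered fraction computed here plays the same role as the 74 - 85% figure reported in the abstract, although the toy data say nothing about the real value.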
Affiliation(s)
- Richard A Notebaart
- Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen, P.O.Box 9010, 6500GL Nijmegen, The Netherlands
- Frank HJ van Enckevort
- Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen, P.O.Box 9010, 6500GL Nijmegen, The Netherlands
- NIZO food research BV, P.O.Box 20, 6710BA, Ede, The Netherlands
- Present address: Friesland Foods Corporate Research, Deventer, The Netherlands
- Christof Francke
- Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen, P.O.Box 9010, 6500GL Nijmegen, The Netherlands
- Wageningen Center for Food Sciences, P.O.Box 557, 6700AN Wageningen, The Netherlands
- Roland J Siezen
- Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen, P.O.Box 9010, 6500GL Nijmegen, The Netherlands
- NIZO food research BV, P.O.Box 20, 6710BA, Ede, The Netherlands
- Wageningen Center for Food Sciences, P.O.Box 557, 6700AN Wageningen, The Netherlands
- Bas Teusink
- Center for Molecular and Biomolecular Informatics, Radboud University Nijmegen, P.O.Box 9010, 6500GL Nijmegen, The Netherlands
- NIZO food research BV, P.O.Box 20, 6710BA, Ede, The Netherlands
- Wageningen Center for Food Sciences, P.O.Box 557, 6700AN Wageningen, The Netherlands
20
Arakawa K, Yamada Y, Shinoda K, Nakayama Y, Tomita M. GEM System: automatic prototyping of cell-wide metabolic pathway models from genomes. BMC Bioinformatics 2006; 7:168. [PMID: 16553966 PMCID: PMC1435936 DOI: 10.1186/1471-2105-7-168] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 10/09/2005] [Accepted: 03/23/2006] [Indexed: 11/18/2022]
Abstract
Background Successful realization of a "systems biology" approach to analyzing cells is a grand challenge for our understanding of life. However, current modeling approaches to cell simulation are labor-intensive, manual affairs, and therefore constitute a major bottleneck in the evolution of computational cell biology. Results We developed the Genome-based Modeling (GEM) System for the purpose of automatically prototyping simulation models of cell-wide metabolic pathways from genome sequences and other public biological information. Models generated by the GEM System include an entire Escherichia coli metabolism model comprising 968 reactions of 1195 metabolites, achieving 100% coverage when compared with the KEGG database, 92.38% with the EcoCyc database, and 95.06% with iJR904 genome-scale model. Conclusion The GEM System prototypes qualitative models to reduce the labor-intensive tasks required for systems biology research. Models of over 90 bacterial genomes are available at our web site.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
- Yohei Yamada
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
- Kosaku Shinoda
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
- Yoichi Nakayama
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
- Masaru Tomita
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
21
Deckwer WD, Hempel D, Zeng AP, Jahn D. Systembiotechnologische Ansätze zur Prozessentwicklung. Chem Ing Tech 2006. [DOI: 10.1002/cite.200500156] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
22
Gaur M, Choudhury D, Prasad R. Complete inventory of ABC proteins in human pathogenic yeast, Candida albicans. J Mol Microbiol Biotechnol 2006; 9:3-15. [PMID: 16254441 DOI: 10.1159/000088141] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.1] [Indexed: 12/19/2022]
Abstract
The recent completion of the sequencing project of the opportunistic human pathogenic yeast, Candida albicans (http://www.ncbi.nlm.nih.gov/), led us to analyze and classify its ATP-binding cassette (ABC) proteins, which constitute one of the largest superfamilies of proteins. Some of its members are multidrug transporters responsible for the commonly encountered problem of antifungal resistance. TBLASTN searches together with domain analysis identified 81 nucleotide-binding domains, which belong to 51 different putative open reading frames. Considering that each allelic pair represents a single ABC protein of the Candida genome, the total number of putative members of this superfamily is 28. Domain organization, sequence-based analysis and self-organizing map-based clustering led to the classification of Candida ABC proteins into 6 distinct subfamilies. Each subfamily from C. albicans has an equivalent in Saccharomyces cerevisiae suggesting a close evolutionary relationship between the two yeasts. Our searches also led to the identification of a new motif to each subfamily in Candida that could be used to identify sequences from the corresponding subfamily in other organisms. It is hoped that the inventory of Candida ABC transporters thus created will provide new insights into the role of ABC proteins in antifungal resistance as well as help in the functional characterization of the superfamily of these proteins.
Affiliation(s)
- Manisha Gaur
- School of Life Sciences, Jawaharlal Nehru University, New Delhi, India
23
Sun J, Gunzer F, Westendorf AM, Buer J, Scharfe M, Jarek M, Gössling F, Blöcker H, Zeng AP. Genomic peculiarity of coding sequences and metabolic potential of probiotic Escherichia coli strain Nissle 1917 inferred from raw genome data. J Biotechnol 2005; 117:147-61. [PMID: 15823404 DOI: 10.1016/j.jbiotec.2005.01.008] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Received: 10/08/2004] [Revised: 12/16/2004] [Accepted: 01/07/2005] [Indexed: 10/25/2022]
Abstract
Probiotic Escherichia coli strain Nissle 1917 (O6:K5:H1) is a commensal E. coli isolate that has a long tradition in medicine for the treatment of various intestinal disorders in humans. To elucidate the molecular basis of its probiotic nature, we started sequencing the genome of this organism with a whole-genome shotgun approach. A 7.8-fold coverage of the genomic sequence has been generated and is now in the finishing stage. To exploit the genome data as early as possible and to generate hypotheses for functional studies, the unfinished sequencing data were analyzed in this work using a new method [Sun, J., Zeng, A.P., 2004. IdentiCS--identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated low-coverage bacterial genome sequence. BMC Bioinformatics 5, 112] which is particularly suitable for the prediction of coding sequences (CDSs) from unannotated genome sequence. The CDSs predicted for E. coli Nissle 1917 were compared with those of all five other sequenced E. coli strains (E. coli K-12 MG1655, E. coli K-12 W3110, E. coli CFT073, EHEC O157:H7 EDL933 and EHEC O157:H7 Sakai) published to date. Five thousand one hundred and ninety-two CDSs were predicted for E. coli Nissle 1917, of which 1065 were assigned with enzyme EC numbers. The comparison of all predicted CDSs of E. coli Nissle 1917 to the other E. coli strains revealed 108 CDSs specific for this isolate. They are organized as four big genome islands and many other smaller gene clusters. Based on CDSs with EC numbers for enzymes, the potential metabolic network of Nissle 1917 was reconstructed and compared to those of the other five E. coli strains. Overall, the comparative genomic analysis sheds light on the genomic peculiarity of the probiotic E. coli strain Nissle 1917 and is helpful for designing further functional studies long before the sequencing project is completely finished.
Affiliation(s)
- Jibin Sun
- GBF - German Research Centre for Biotechnology, Experimental Bioinformatics, Mascheroder Weg 1, D-38124 Braunschweig, Germany
24
Arakawa K, Suzuki H, Fujishima K, Fujimoto K, Ueda S, Matsui M, Tomita M. A Comprehensive Software Suite for the Analysis of cDNAs. Genomics Proteomics Bioinformatics 2005; 3:179-88. [PMID: 16487083 PMCID: PMC5172547 DOI: 10.1016/s1672-0229(05)03023-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
We have developed a comprehensive software suite for bioinformatics research of cDNAs; it is aimed at rapid characterization of the features of genes and the proteins they code. Methods implemented include the detection of translation initiation and termination signals, statistical analysis of codon usage, comparative study of amino acid composition, comparative modeling of the structures of product proteins, prediction of alternative splice forms, and metabolic pathway reconstruction. The software package is freely available under the GNU General Public License at http://www.g-language.org/data/cdna/.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan.
25
Combined Literature Mining and Gene Expression Analysis for Modeling Neuro-endocrine-immune Interactions. Lecture Notes in Computer Science 2005. [DOI: 10.1007/11538356_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/25/2023]