1
Chang D, Wang C, Ndayisenga F, Yu Z. Mutations in adaptively evolved Escherichia coli LGE2 facilitated the cost-effective upgrading of undetoxified bio-oil to bioethanol fuel. Bioresour Bioprocess 2021; 8:105. [PMID: 38650237 PMCID: PMC10991953 DOI: 10.1186/s40643-021-00459-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/10/2021] [Accepted: 10/11/2021] [Indexed: 11/10/2022] Open
Abstract
Levoglucosan is a promising sugar present in lignocellulose pyrolysis bio-oil, which is a renewable and environmentally friendly source for various value-added products. Although many microbial catalysts have been engineered to produce biofuels and chemicals from levoglucosan, these biocatalysts can utilize only pure levoglucosan and are inhibited by the compounds that co-exist with levoglucosan in bio-oil, which has greatly limited their industrial-scale application in lignocellulose biorefinery. In this study, the previously engineered Escherichia coli LGE2 was evolved for enhanced inhibitor tolerance by long-term adaptive evolution under the stress of multiple inhibitors, and a stable mutant, E. coli-H, was obtained after ~374 generations. In bio-oil media with an extremely acidic pH of 3.1, E. coli-H, with its high inhibitor tolerance, exhibited levoglucosan consumption and ethanol production comparable to the control, while the growth of the non-evolved strain was completely blocked even when the pH was adjusted to 7.0. Finally, E. coli-H produced 8.4 g/L ethanol in undetoxified bio-oil media containing ~2.0% (w/v) levoglucosan, reaching 82% of the theoretical yield. Whole-genome re-sequencing to monitor the acquisition of mutations identified 4 new mutations within the global regulatory genes rssB, yqhA, and basR, and the -10 box of the putative promoter of the yqhD-dgkA operon. Notably, yqhA was revealed for the first time as a gene responsible for inhibitor tolerance. All of the mutations contributed to improved fitness, with the basR mutation contributing greatly to the fitness improvement of E. coli-H.
This study, for the first time, generated an inhibitor-tolerant levoglucosan-utilizing strain that could produce cost-effective bioethanol from toxic bio-oil without a detoxification process, and provided important experimental evidence and valuable genetic and protein-level information for the development of other robust microbial platforms involved in lignocellulose biorefining processes.
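The abstract does not state how the 82% figure was computed. The sketch below is one way to check the arithmetic, assuming ~2.0% (w/v) means about 20 g/L of substrate and a stoichiometric maximum of 2 mol ethanol per mol of C6 sugar. The function name and the choice of mass basis are illustrative assumptions, not taken from the paper.

```python
# Molar masses in g/mol (standard values).
M_ETHANOL = 46.07
M_GLUCOSE = 180.16
M_LEVOGLUCOSAN = 162.14  # 1,6-anhydroglucose, C6H10O5


def theoretical_ethanol(substrate_g_per_l, substrate_molar_mass):
    """Maximum ethanol titre (g/L), assuming 2 mol ethanol per mol of C6 sugar."""
    return substrate_g_per_l * 2 * M_ETHANOL / substrate_molar_mass


substrate = 20.0  # g/L; assumption: ~2.0% (w/v) taken as exactly 20 g/L
ethanol = 8.4     # g/L, as reported in the abstract

# Assumption: yield expressed on a glucose-equivalent mass basis (~0.511 g/g).
fraction_glucose_basis = ethanol / theoretical_ethanol(substrate, M_GLUCOSE)
# Alternative: yield expressed on a levoglucosan mass basis (~0.568 g/g).
fraction_lg_basis = ethanol / theoretical_ethanol(substrate, M_LEVOGLUCOSAN)

print(f"glucose basis: {fraction_glucose_basis:.0%}")
print(f"levoglucosan basis: {fraction_lg_basis:.0%}")
```

Under these assumptions the glucose-equivalent basis gives roughly 82%, which is consistent with the reported value; the levoglucosan mass basis gives a lower figure. Which basis the authors actually used cannot be confirmed from the abstract alone.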
Affiliation(s)
- Dongdong Chang
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China
- Cong Wang
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China
- Fabrice Ndayisenga
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China
- Zhisheng Yu
  - College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, People's Republic of China
  - RCEES-IMCAS-UCAS Joint-Lab of Microbial Technology for Environmental Science, Beijing, 100085, People's Republic of China
2
Tabssum F, Ahmad QUA, Qazi JI. DNA sequenced based bacterial taxonomy should entail decisive phenotypic remarks: Towards a balanced approach. J Basic Microbiol 2018; 58:918-927. [PMID: 30144131 DOI: 10.1002/jobm.201800319] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/23/2018] [Revised: 07/13/2018] [Accepted: 07/16/2018] [Indexed: 11/11/2022]
Abstract
Phenotypic characteristics, while complementing 16S rRNA gene sequencing in identifying bacteria, become decisive in resolving conflicts in which a given DNA sequence shows equal % similarity to more than one classified microorganism. "Phenotypic light" may also indicate the right direction when a new species' 16S rDNA sequence is under consideration. In fact, 16S rRNA gene sequences indicate either that a novel species has been isolated or that the test organism is identified. In each case additional tests are required for resolving different issues. Predictions of microbial phenotypes from metagenomic data depend heavily on our knowledge of expressed genes. Thus a renaissance of microbial phenotypic characterization is likely to emerge on a par with genotypic signatures. The interplay of these and other complementary levels of analysis is likely to lead to DNA barcoding for microorganisms, as it has provided efficient methods for species-level identification of animals and plants. In this review, an attempt has been made to convey to the reader the importance of the interplay of genotypic and phenotypic characteristics of bacteria for the development of comprehensive and more stable classification schemes. It is expected that future valid classification schemes will be based on the phenetic relationships of microorganisms.
Affiliation(s)
- Fouzia Tabssum
  - Department of Zoology, University of the Punjab, Lahore, Pakistan
- Javed Iqbal Qazi
  - Department of Zoology, University of the Punjab, Lahore, Pakistan
3
Computational Techniques for a Comprehensive Understanding of Different Genotype-Phenotype Factors in Biological Systems and Their Applications. Synth Biol (Oxf) 2018. [DOI: 10.1007/978-981-10-8693-9_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
4
Blank CE, Cui H, Moore LR, Walls RL. MicrO: an ontology of phenotypic and metabolic characters, assays, and culture media found in prokaryotic taxonomic descriptions. J Biomed Semantics 2016; 7:18. [PMID: 27076900 PMCID: PMC4830071 DOI: 10.1186/s13326-016-0060-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/03/2015] [Accepted: 04/02/2016] [Indexed: 12/03/2022] Open
Abstract
Background MicrO is an ontology of microbiological terms, including prokaryotic qualities and processes, material entities (such as cell components), chemical entities (such as microbiological culture media and medium ingredients), and assays. The ontology was built to support the ongoing development of a natural language processing algorithm, MicroPIE (or, Microbial Phenomics Information Extractor). During the MicroPIE design process, we realized there was a need for a prokaryotic ontology which would capture the evolutionary diversity of phenotypes and metabolic processes across the tree of life, capture the diversity of synonyms and information contained in the taxonomic literature, and relate microbiological entities and processes to terms in a large number of other ontologies, most particularly the Gene Ontology (GO), the Phenotypic Quality Ontology (PATO), and the Chemical Entities of Biological Interest (ChEBI). We thus constructed MicrO to be rich in logical axioms and synonyms gathered from the taxonomic literature. Results MicrO currently has ~14,550 classes (~2,550 of which are new, the remainder being microbiologically-relevant classes imported from other ontologies), connected by ~24,130 logical axioms (5,446 of which are new), and is available at http://purl.obolibrary.org/obo/MicrO.owl and on the project website at https://github.com/carrineblank/MicrO. MicrO has been integrated into the OBO Foundry Library (http://www.obofoundry.org/ontology/micro.html), so that other ontologies can borrow and re-use classes. Term requests and user feedback can be made using MicrO's Issue Tracker in GitHub. We designed MicrO such that it can support the ongoing and future development of algorithms that can leverage the controlled vocabulary and logical inference power provided by the ontology.
Conclusions By connecting microbial classes with large numbers of chemical entities, material entities, biological processes, molecular functions, and qualities using a dense array of logical axioms, we intend MicrO to be a powerful new tool to increase the computing power of bioinformatics tools such as the automated text mining of prokaryotic taxonomic descriptions using natural language processing. We also intend MicrO to support the development of new bioinformatics tools that aim to develop new connections between microbial phenotypes and genotypes (i.e., the gene content in genomes). Future ontology development will include incorporation of pathogenic phenotypes and prokaryotic habitats.
Affiliation(s)
- Carrine E Blank
  - Department of Geosciences, University of Montana, Missoula, MT 59812 USA
- Hong Cui
  - School of Information, University of Arizona, Tucson, AZ 85719 USA
- Lisa R Moore
  - Department of Biological Sciences, University of Southern Maine, Portland, ME 04104 USA
5
Luby E, Ibekwe AM, Zilles J, Pruden A. Molecular Methods for Assessment of Antibiotic Resistance in Agricultural Ecosystems: Prospects and Challenges. J Environ Qual 2016; 45:441-453. [PMID: 27065390 DOI: 10.2134/jeq2015.07.0367] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Indexed: 06/05/2023]
Abstract
Agricultural ecosystems are of special interest for monitoring the potential for antibiotic resistance to spread through the environment and contribute to human exposure. Molecular methods, which target DNA, RNA, and other molecular components of bacterial cells, present certain advantages for characterizing and quantifying markers of antibiotic resistance and their horizontal gene transfer. These include rapid, unambiguous detection of targets; consistent results; and avoidance of culture bias. However, molecular methods are also subject to limitations that are not always clearly addressed or taken into consideration in the interpretation of scientific data. In particular, DNA-based methods do not directly assess viability or presence within an intact bacterial host, but such information may be inferred based on appropriate experimental design or in concert with complementary methods. The purpose of this review is to provide an overview of existing molecular methods for tracking antibiotic resistance in agricultural ecosystems, to define their strengths and weaknesses, and to recommend a path forward for future applications of molecular methods and standardized reporting in the literature. This will guide research along the farm-to-fork continuum and support comparability of the growing number of studies in the literature in a manner that informs management decisions and policy development.
6
Oberhardt MA, Zarecki R, Reshef L, Xia F, Duran-Frigola M, Schreiber R, Henry CS, Ben-Tal N, Dwyer DJ, Gophna U, Ruppin E. Systems-Wide Prediction of Enzyme Promiscuity Reveals a New Underground Alternative Route for Pyridoxal 5'-Phosphate Production in E. coli. PLoS Comput Biol 2016; 12:e1004705. [PMID: 26821166 PMCID: PMC4731195 DOI: 10.1371/journal.pcbi.1004705] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/17/2015] [Accepted: 12/14/2015] [Indexed: 11/18/2022] Open
Abstract
Recent insights suggest that non-specific and/or promiscuous enzymes are common and active across life. Understanding the role of such enzymes is an important open question in biology. Here we develop a genome-wide method, PROPER, that uses a permissive PSI-BLAST approach to predict promiscuous activities of metabolic genes. Enzyme promiscuity is typically studied experimentally using multicopy suppression, in which over-expression of a promiscuous 'replacer' gene rescues lethality caused by inactivation of a 'target' gene. We use PROPER to predict multicopy suppression in Escherichia coli, achieving highly significant overlap with published cases (hypergeometric p = 4.4e-13). We then validate three novel predicted target-replacer gene pairs in new multicopy suppression experiments. We next go beyond PROPER and develop a network-based approach, GEM-PROPER, that integrates PROPER with genome-scale metabolic modeling to predict promiscuous replacements via alternative metabolic pathways. GEM-PROPER predicts a new indirect replacer (thiG) for an essential enzyme (pdxB) in production of pyridoxal 5'-phosphate (the active form of Vitamin B6), which we validate experimentally via multicopy suppression. We perform a structural analysis of thiG to determine its potential promiscuous active site, which we validate experimentally by inactivating the pertaining residues and showing a loss of replacer activity. Thus, this study is a successful example where a computational investigation leads to a network-based identification of an indirect promiscuous replacement of a key metabolic enzyme, which would have been extremely difficult to identify directly.
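The overlap significance quoted above is a hypergeometric test. As a minimal sketch of how such a p-value is computed, the function below evaluates the upper tail of the hypergeometric distribution using only the standard library. The counts are hypothetical placeholders chosen for illustration; they are not the numbers behind the reported p = 4.4e-13.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement from a
    population of N items, K of which are 'successes'."""
    total = comb(N, n)
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only.
N = 1000  # all candidate target-replacer gene pairs
K = 40    # pairs with published multicopy suppression
n = 50    # pairs predicted by the method
k = 10    # predicted pairs that are also published

p_value = hypergeom_sf(k, N, K, n)
print(f"p = {p_value:.3g}")
```

A small p-value here means that an overlap of at least k pairs would rarely arise if the predictions were drawn at random from the candidate pool.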
Affiliation(s)
- Matthew A. Oberhardt
  - School of Computer Sciences and Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
  - Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, Maryland, United States of America
- Raphy Zarecki
  - School of Computer Sciences and Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Leah Reshef
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Fangfang Xia
  - Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, United States of America
- Miquel Duran-Frigola
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), Barcelona, Spain
- Rachel Schreiber
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Christopher S. Henry
  - Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, Illinois, United States of America
- Nir Ben-Tal
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Daniel J. Dwyer
  - Department of Cell Biology and Molecular Genetics, Institute for Physical Science and Technology, Department of Bioengineering, Maryland Pathogen Research Institute, University of Maryland, College Park, Maryland, United States of America
- Uri Gophna
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Eytan Ruppin
  - School of Computer Sciences and Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, Maryland, United States of America
7
Abstract
Most natural microbial systems have evolved to function in environments with temporal and spatial variations. A major limitation to understanding such complex systems is the lack of mathematical modelling frameworks that connect the genomes of individual species and temporal and spatial variations in the environment to system behaviour. The goal of this review is to introduce the emerging field of spatiotemporal metabolic modelling based on genome-scale reconstructions of microbial metabolism. The extension of flux balance analysis (FBA) to account for both temporal and spatial variations in the environment is termed spatiotemporal FBA (SFBA). Following a brief overview of FBA and its established dynamic extension, the SFBA problem is introduced and recent progress is described. Three case studies are reviewed to illustrate the current state-of-the-art and possible future research directions are outlined. The author posits that SFBA is the next frontier for microbial metabolic modelling and a rapid increase in methods development and system applications is anticipated.
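For orientation, the standard FBA problem and its dynamic extension can be written compactly as below. The notation is an assumption introduced here, not taken from the review: S is the stoichiometric matrix, v the flux vector, c the objective weights, X the biomass concentration, \mu the growth rate returned by the FBA solution, and C_i the extracellular concentration of metabolite i with exchange flux v_i.

```latex
% Steady-state FBA: a linear program over the flux vector v
\begin{align}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S v = 0, \\
& v^{\min} \le v \le v^{\max}
\end{align}

% Dynamic extension: the FBA solution drives extracellular mass balances,
% and the uptake bounds in v^{min}, v^{max} may depend on the current C_i
\begin{align}
\frac{dX}{dt} &= \mu X, &
\frac{dC_i}{dt} &= v_i X
\end{align}
```

A spatiotemporal formulation would add transport terms to these balances so that X and C_i vary with position as well as time; the abstract does not specify their form, so none is written out here.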
Affiliation(s)
- Michael A Henson
  - Department of Chemical Engineering, University of Massachusetts, Amherst, MA 01003, U.S.A.
8
Harnessing the landscape of microbial culture media to predict new organism-media pairings. Nat Commun 2015; 6:8493. [PMID: 26460590 PMCID: PMC4633754 DOI: 10.1038/ncomms9493] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Received: 01/21/2015] [Accepted: 08/27/2015] [Indexed: 01/01/2023] Open
Abstract
Culturing microorganisms is a critical step in understanding and utilizing microbial life. Here we map the landscape of existing culture media by extracting natural-language media recipes into a Known Media Database (KOMODO), which includes >18,000 strain-media combinations, >3300 media variants and compound concentrations (the entire collection of the Leibniz Institute DSMZ repository). Using KOMODO, we show that although media are usually tuned for individual strains using biologically common salts, trace metals and vitamins/cofactors are the most differentiating components between defined media of strains within a genus. We leverage KOMODO to predict new organism-media pairings using a transitivity property (74% growth in new in vitro experiments) and a phylogeny-based collaborative filtering tool (83% growth in new in vitro experiments and stronger growth on predicted well-scored versus poorly scored media). These resources are integrated into a web-based platform that predicts media given an organism's 16S rDNA sequence, facilitating future cultivation efforts.
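The abstract does not define the transitivity property. The sketch below shows one plausible reading, stated as an assumption: if two organisms are both known to grow on some shared medium, each is predicted to grow on the other's remaining known media. The function name and the toy strain and medium labels are hypothetical and are not taken from KOMODO.

```python
from collections import defaultdict


def predict_by_transitivity(grows_on):
    """grows_on maps organism -> set of media known to support its growth.
    Returns organism -> set of newly predicted media (assumed rule: organisms
    that share a medium are predicted to grow on each other's media)."""
    # Invert the mapping: medium -> organisms known to grow on it.
    organisms_on = defaultdict(set)
    for organism, media in grows_on.items():
        for medium in media:
            organisms_on[medium].add(organism)

    predictions = {}
    for organism, media in grows_on.items():
        candidates = set()
        for medium in media:
            for neighbour in organisms_on[medium]:
                candidates |= grows_on[neighbour]
        # Keep only pairings that are not already known.
        predictions[organism] = candidates - media
    return predictions


# Toy data, for illustration only.
known = {
    "strain_A": {"medium_1", "medium_2"},
    "strain_B": {"medium_1"},
    "strain_C": {"medium_3"},
}
predicted = predict_by_transitivity(known)
print(predicted)
```

In this toy example only strain_B receives a new prediction (medium_2), because it shares medium_1 with strain_A; strain_C shares no medium with the others and gets none.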
9
Khurana JK, Reeder JE, Shrimpton AE, Thakar J. GESPA: classifying nsSNPs to predict disease association. BMC Bioinformatics 2015. [PMID: 26206375 PMCID: PMC4513380 DOI: 10.1186/s12859-015-0673-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus determining the clinical significance of each nsSNP is of great importance. Potential detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time consuming. Existing computational methods lack accuracy and features to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs. RESULTS GESPA is a user-friendly software package for classifying disease association of nsSNPs. It allows flexibility in acceptable input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from medical literature. The development and testing of GESPA was performed using the humsavar, ClinVar and humvar datasets. Additionally, GESPA also predicts the disease phenotype associated with a nsSNP with high accuracy, a feature unavailable in existing software. GESPA's overall accuracy exceeds existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data. CONCLUSIONS GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa.
Affiliation(s)
- Jay K Khurana
  - Department of Urology, SUNY Upstate Medical University, Syracuse, NY, USA
- Jay E Reeder
  - Department of Obstetrics and Gynecology, University of Rochester, Rochester, NY, USA
- Antony E Shrimpton
  - Department of Pathology, SUNY Upstate Medical University, Syracuse, NY, USA
- Juilee Thakar
  - Department of Microbiology and Immunology, University of Rochester, Rochester, NY, USA
  - Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY, USA
10
Krumholz EW, Libourel IGL. Sequence-based Network Completion Reveals the Integrality of Missing Reactions in Metabolic Networks. J Biol Chem 2015; 290:19197-207. [PMID: 26041773 DOI: 10.1074/jbc.m114.634121] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/19/2014] [Indexed: 11/06/2022] Open
Abstract
Genome-scale metabolic models are central in connecting genotypes to metabolic phenotypes. However, even for well studied organisms, such as Escherichia coli, draft networks do not contain a complete biochemical network. Missing reactions are referred to as gaps. These gaps need to be filled to enable functional analysis, and gap-filling choices influence model predictions. To investigate whether functional networks existed where all gap-filling reactions were supported by sequence similarity to annotated enzymes, four draft networks were supplemented with all reactions from the Model SEED database for which minimal sequence similarity was found in their genomes. Quadratic programming revealed that the number of reactions that could partake in a gap-filling solution was vast: 3,270 in the case of E. coli, where 72% of the metabolites in the draft network could connect a gap-filling solution. Nonetheless, no network could be completed without the inclusion of orphaned enzymes, suggesting that parts of the biochemistry integral to biomass precursor formation are uncharacterized. However, many gap-filling reactions were well determined, and the resulting networks showed improved prediction of gene essentiality compared with networks generated through canonical gap filling. In addition, gene essentiality predictions that were sensitive to poorly determined gap-filling reactions were of poor quality, suggesting that damage to the network structure resulting from the inclusion of erroneous gap-filling reactions may be predictable.
Affiliation(s)
- Igor G L Libourel
  - Department of Plant Biology and the Biotechnology Institute, University of Minnesota, Saint Paul, Minnesota 55108
11
Zarecki R, Oberhardt MA, Reshef L, Gophna U, Ruppin E. A novel nutritional predictor links microbial fastidiousness with lowered ubiquity, growth rate, and cooperativeness. PLoS Comput Biol 2014; 10:e1003726. [PMID: 25033033 PMCID: PMC4102436 DOI: 10.1371/journal.pcbi.1003726] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 03/27/2014] [Accepted: 06/02/2014] [Indexed: 02/01/2023] Open
Abstract
Understanding microbial nutritional requirements is a key challenge in microbiology. Here we leverage the recent availability of thousands of automatically generated genome-scale metabolic models to develop a predictor of microbial minimal medium requirements, which we apply to thousands of species to study the relationship between their nutritional requirements and their ecological and genomic traits. We first show that nutritional requirements are more similar among species that co-habit many ecological niches. We then reveal three fundamental characteristics of microbial fastidiousness (i.e., complex and specific nutritional requirements): (1) more fastidious microorganisms tend to be more ecologically limited; (2) fastidiousness is positively associated with smaller genomes and smaller metabolic networks; and (3) more fastidious species grow more slowly and have less ability to cooperate with other species than more metabolically versatile organisms. These associations reflect the adaptation of fastidious microorganisms to unique niches with few cohabitating species. They also explain how non-fastidious species inhabit many ecological niches with high abundance rates. Taken together, these results advance our understanding of microbial nutrition on a large scale, by presenting new nutrition-related associations that govern the distribution of microorganisms in nature.
Understanding microbial nutrition is critical for understanding microbial life, and thus has a major influence in many areas of biology. In recent years, the traditional methods of studying microbial nutrition, which rely on culturing bacteria and assessing their nutritional needs through extensive experiments, have been augmented by the development of genome-scale metabolic models, which enable in-depth analysis and prediction of nutrition for a few well-studied organisms. Recently, a pipeline was developed for generating genome-scale metabolic models automatically (the SEED). Here, we leverage models built from this pipeline in order to develop a novel predictor of microbial minimal medium requirements, which we then apply broadly for thousands of microbes across the tree of life. We first show that nutritional requirements are more similar among microorganisms that co-habit many ecological niches. We then use our medium predictions to examine the fastidiousness of organisms (i.e., their need for complex/specific media), and suggest an explanation for certain observed features of microbial abundance patterns. This study is one of the first to leverage genome-scale models on a large (>1000 species) scale, and sets the potential for a new host of strategies for understanding microbial nutrition and ecology in the future.
Affiliation(s)
- Raphy Zarecki
  - School of Computer Sciences & Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Matthew A. Oberhardt
  - School of Computer Sciences & Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Leah Reshef
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Uri Gophna
  - Department of Molecular Microbiology and Biotechnology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Eytan Ruppin
  - School of Computer Sciences & Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
12
de Crécy-Lagard V. Variations in metabolic pathways create challenges for automated metabolic reconstructions: Examples from the tetrahydrofolate synthesis pathway. Comput Struct Biotechnol J 2014; 10:41-50. [PMID: 25210598 PMCID: PMC4151868 DOI: 10.1016/j.csbj.2014.05.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 11/03/2022] Open
Abstract
The availability of thousands of sequenced genomes has revealed the diversity of biochemical solutions to similar chemical problems. Even for molecules at the heart of metabolism, such as cofactors, the pathway enzymes first discovered in model organisms like Escherichia coli or Saccharomyces cerevisiae are often not universally conserved. Tetrahydrofolate (THF) (or its close relative tetrahydromethanopterin) is a universal and essential C1-carrier that most microbes and plants synthesize de novo. The THF biosynthesis pathway and enzymes are, however, not universal and alternate solutions are found for most steps, making this pathway a challenge to annotate automatically in many genomes. Comparing THF pathway reconstructions and functional annotations of a chosen set of folate synthesis genes in specific prokaryotes revealed the strengths and weaknesses of different microbial annotation platforms. This analysis revealed that most current platforms fail in metabolic reconstruction of variant pathways. However, all the pieces are in place to quickly correct these deficiencies if the different databases were built on each other's strengths.
Affiliation(s)
- Valérie de Crécy-Lagard
  - Department of Microbiology and Cell Science and Genetics Institute, University of Florida, Gainesville, FL, United States
13
High-throughput comparison, functional annotation, and metabolic modeling of plant genomes using the PlantSEED resource. Proc Natl Acad Sci U S A 2014; 111:9645-50. [PMID: 24927599 DOI: 10.1073/pnas.1401329111] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Indexed: 11/18/2022] Open
Abstract
The increasing number of sequenced plant genomes is placing new demands on the methods applied to analyze, annotate, and model these genomes. Today's annotation pipelines result in inconsistent gene assignments that complicate comparative analyses and prevent efficient construction of metabolic models. To overcome these problems, we have developed the PlantSEED, an integrated, metabolism-centric database to support subsystems-based annotation and metabolic model reconstruction for plant genomes. PlantSEED combines SEED subsystems technology, first developed for microbial genomes, with refined protein families and biochemical data to assign fully consistent functional annotations to orthologous genes, particularly those encoding primary metabolic pathways. Seamless integration with its parent, the prokaryotic SEED database, makes PlantSEED a unique environment for cross-kingdom comparative analysis of plant and bacterial genomes. The consistent annotations imposed by PlantSEED permit rapid reconstruction and modeling of primary metabolism for all plant genomes in the database. This feature opens the unique possibility of model-based assessment of the completeness and accuracy of gene annotation and thus allows computational identification of genes and pathways that are restricted to certain genomes or need better curation. We demonstrate the PlantSEED system by producing consistent annotations for 10 reference genomes. We also produce a functioning metabolic model for each genome, gapfilling to identify missing annotations and proposing gene candidates for missing annotations. Models are built around an extended biomass composition representing the most comprehensive published to date. To our knowledge, our models are the first to be published for seven of the genomes analyzed.
14
Shestov AA, Barker B, Gu Z, Locasale JW. Computational approaches for understanding energy metabolism. Wiley Interdiscip Rev Syst Biol Med 2013; 5:733-50. [PMID: 23897661 PMCID: PMC3906216 DOI: 10.1002/wsbm.1238] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
There has been a surge of interest in understanding the regulation of metabolic networks involved in disease in recent years. Quantitative models are increasingly being used to interrogate the metabolic pathways that are contained within this complex disease biology. At the core of this effort is the mathematical modeling of central carbon metabolism involving glycolysis and the citric acid cycle (referred to as energy metabolism). Here, we discuss several approaches used to quantitatively model metabolic pathways relating to energy metabolism and discuss their formalisms, successes, and limitations.
Affiliation(s)
- Brandon Barker
- Division of Nutritional Sciences, Cornell University, Ithaca NY 14850
- Tri-Institutional Field of Computational Biology and Medicine, Cornell University, Ithaca NY 14850
- Zhenglong Gu
- Division of Nutritional Sciences, Cornell University, Ithaca NY 14850
- Tri-Institutional Field of Computational Biology and Medicine, Cornell University, Ithaca NY 14850
- Jason W Locasale
- Division of Nutritional Sciences, Cornell University, Ithaca NY 14850
- Tri-Institutional Field of Computational Biology and Medicine, Cornell University, Ithaca NY 14850
|
15
|
Konwar KM, Hanson NW, Pagé AP, Hallam SJ. MetaPathways: a modular pipeline for constructing pathway/genome databases from environmental sequence information. BMC Bioinformatics 2013; 14:202. [PMID: 23800136 PMCID: PMC3695837 DOI: 10.1186/1471-2105-14-202] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 01/24/2013] [Accepted: 06/13/2013] [Indexed: 02/01/2023] Open
Abstract
Background A central challenge to understanding the ecological and biogeochemical roles of microorganisms in natural and human engineered ecosystems is the reconstruction of metabolic interaction networks from environmental sequence information. The dominant paradigm in metabolic reconstruction is to assign functional annotations using BLAST. Functional annotations are then projected onto symbolic representations of metabolism in the form of KEGG pathways or SEED subsystems. Results Here we present MetaPathways, an open source pipeline for pathway inference that uses the PathoLogic algorithm to map functional annotations onto the MetaCyc collection of reactions and pathways, and construct environmental Pathway/Genome Databases (ePGDBs) compatible with the editing and navigation features of Pathway Tools. The pipeline accepts assembled or unassembled nucleotide sequences, performs quality assessment and control, predicts and annotates noncoding genes and open reading frames, and produces inputs to PathoLogic. In addition to constructing ePGDBs, MetaPathways uses MLTreeMap to build phylogenetic trees for selected taxonomic anchor and functional gene markers, converts General Feature Format (GFF) files into concatenated GenBank files for ePGDB construction based on third-party annotations, and generates useful file formats including Sequin files for direct GenBank submission and gene feature tables summarizing annotations, MLTreeMap trees, and ePGDB pathway coverage summaries for statistical comparisons. Conclusions MetaPathways provides users with a modular annotation and analysis pipeline for predicting metabolic interaction networks from environmental sequence information using an alternative to KEGG pathways and SEED subsystems mapping. It is extensible to genomic and transcriptomic datasets from a wide range of sequencing platforms, and generates useful data products for microbial community structure and function analysis. 
The MetaPathways software package, installation instructions, and example data can be obtained from http://hallam.microbiology.ubc.ca/MetaPathways.
Affiliation(s)
- Kishori M Konwar
- Department of Microbiology & Immunology, University of British Columbia, Vancouver, BC V6T1Z3, Canada.
|
16
|
Boeker M, Jansen L, Grewe N, Röhl J, Schober D, Seddig-Raufie D, Schulz S. Effects of guideline-based training on the quality of formal ontologies: a randomized controlled trial. PLoS One 2013; 8:e61425. [PMID: 23667440 PMCID: PMC3646875 DOI: 10.1371/journal.pone.0061425] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/14/2012] [Accepted: 03/09/2013] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND The importance of ontologies in the biomedical domain is generally recognized. However, their quality is often too poor for large-scale use in critical applications, at least partially due to insufficient training of ontology developers. OBJECTIVE To show the efficacy of guideline-based ontology development training on the performance of ontology developers. The hypothesis was that students who received training on top-level ontologies and design patterns perform better than those who only received training in the basic principles of formal ontology engineering. METHODS A curriculum was implemented based on a guideline for ontology design. A randomized controlled trial on the efficacy of this curriculum was performed with 24 students from bioinformatics and related fields. After joint training on the fundamentals of ontology development the students were randomly allocated to two groups. During the intervention, each group received training on different topics in ontology development. In the assessment phase, all students were asked to solve modeling problems on topics taught differentially in the intervention phase. Primary outcome was the similarity of the students' ontology artefacts compared with gold standard ontologies developed by the authors before the experiment; secondary outcome was the intra-group similarity of group members' ontologies. RESULTS The experiment showed no significant effect of the guideline-based training on the performance of ontology developers (a) the ontologies developed after specific training were only slightly but not significantly closer to the gold standard ontologies than the ontologies developed without prior specific training; (b) although significant differences for certain ontologies were detected, the intra-group similarity was not consistently influenced in one direction by the differential training. 
CONCLUSION Methodologically limited, this study cannot be interpreted as a general failure of a guideline-based approach to ontology development. Further research is needed to increase insight into whether specific development guidelines and practices in ontology design are effective.
Affiliation(s)
- Martin Boeker
- Institute of Medical Biometry and Medical Informatics, Albert-Ludwigs University Freiburg, Freiburg, Germany.
|
17
|
Dinsdale EA, Edwards RA, Bailey BA, Tuba I, Akhter S, McNair K, Schmieder R, Apkarian N, Creek M, Guan E, Hernandez M, Isaacs K, Peterson C, Regh T, Ponomarenko V. Multivariate analysis of functional metagenomes. Front Genet 2013; 4:41. [PMID: 23579547 PMCID: PMC3619665 DOI: 10.3389/fgene.2013.00041] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 01/21/2013] [Accepted: 03/06/2013] [Indexed: 12/21/2022] Open
Abstract
Metagenomics is a primary tool for the description of microbial and viral communities. The sheer magnitude of the data generated in each metagenome makes identifying key differences in the function and taxonomy between communities difficult to elucidate. Here we discuss the application of seven different data mining and statistical analyses by comparing and contrasting the metabolic functions of 212 microbial metagenomes within and between 10 environments. Not all approaches are appropriate for all questions, and researchers should decide which approach addresses their questions. This work demonstrated the use of each approach: for example, random forests provided a robust and enlightening description of both the clustering of metagenomes and the metabolic processes that were important in separating microbial communities from different environments. All analyses identified that the presence of phage genes within the microbial community was a predictor of whether the microbial community was host-associated or free-living. Several analyses identified the subtle differences that occur with environments, such as those seen in different regions of the marine environment.
|
18
|
Udatha DBRKG, Rasmussen S, Sicheritz-Pontén T, Panagiotou G. Targeted metabolic engineering guided by computational analysis of single-nucleotide polymorphisms (SNPs). Methods Mol Biol 2013; 985:409-428. [PMID: 23417815 DOI: 10.1007/978-1-62703-299-5_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
The non-synonymous SNPs, the so-called non-silent SNPs, which are single-nucleotide variations in the coding regions that give "birth" to amino acid mutations, are often involved in the modulation of protein function. Understanding the effect of individual amino acid mutations on a protein/enzyme function or stability is useful for altering its properties for a wide variety of engineering studies. Since measuring the effects of amino acid mutations experimentally is a laborious process, a variety of computational methods have been discussed here that aid to extract direct genotype to phenotype information.
Affiliation(s)
- D B R K Gupta Udatha
- Department of Chemical and Biological Engineering, Industrial Biotechnology, Chalmers University of Technology, Gothenburg, Sweden
|
19
|
SEED servers: high-performance access to the SEED genomes, annotations, and metabolic models. PLoS One 2012; 7:e48053. [PMID: 23110173 PMCID: PMC3480482 DOI: 10.1371/journal.pone.0048053] [Citation(s) in RCA: 147] [Impact Index Per Article: 12.3] [Received: 08/03/2012] [Accepted: 09/19/2012] [Indexed: 01/13/2023] Open
Abstract
The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. 
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users.
|
20
|
Crécy-Lagard VD, Phillips G, Grochowski LL, Yacoubi BE, Jenney F, Adams MWW, Murzin AG, White RH. Comparative genomics guided discovery of two missing archaeal enzyme families involved in the biosynthesis of the pterin moiety of tetrahydromethanopterin and tetrahydrofolate. ACS Chem Biol 2012; 7:1807-16. [PMID: 22931285 PMCID: PMC3500442 DOI: 10.1021/cb300342u] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 12/03/2022]
Abstract
C-1 carriers are essential cofactors in all domains of life, and in Archaea, these can be derivatives of tetrahydromethanopterin (H4-MPT) or tetrahydrofolate (H4-folate). Their synthesis requires 6-hydroxymethyl-7,8-dihydropterin diphosphate (6-HMDP) as the precursor, but the nature of pathways that lead to its formation were unknown until the recent discovery of the GTP cyclohydrolase IB/MptA family that catalyzes the first step, the conversion of GTP to dihydroneopterin 2′,3′-cyclic phosphate or 7,8-dihydroneopterin triphosphate [El Yacoubi, B.; et al. (2006) J. Biol. Chem., 281, 37586–37593 and Grochowski, L. L.; et al. (2007) Biochemistry 46, 6658–6667]. Using a combination of comparative genomics analyses, heterologous complementation tests, and in vitro assays, we show that the archaeal protein families COG2098 and COG1634 specify two of the missing 6-HMDP synthesis enzymes. Members of the COG2098 family catalyze the formation of 6-hydroxymethyl-7,8-dihydropterin from 7,8-dihydroneopterin, while members of the COG1634 family catalyze the formation of 6-HMDP from 6-hydroxymethyl-7,8-dihydropterin. The discovery of these missing genes solves a long-standing mystery and provides novel examples of convergent evolutions where proteins of dissimilar architectures perform the same biochemical function.
Affiliation(s)
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science, University of Florida, P.O. Box 110700, Gainesville, Florida 32611-0700, United States
- Gabriela Phillips
- Department of Microbiology and Cell Science, University of Florida, P.O. Box 110700, Gainesville, Florida 32611-0700, United States
- Laura L. Grochowski
- Department of Biochemistry (0308), Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
- Basma El Yacoubi
- Department of Microbiology and Cell Science, University of Florida, P.O. Box 110700, Gainesville, Florida 32611-0700, United States
- Francis Jenney
- Department of Basic Sciences, Georgia Campus, Philadelphia College of Osteopathic Medicine, Suwanee, Georgia 30024, United States
- Michael W. W. Adams
- Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia 30602, United States
- Alexey G. Murzin
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, U.K.
- Robert H. White
- Department of Biochemistry (0308), Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
|
21
|
Abstract
Until recently, the focus in dental research has been on studying a small fraction of the oral microbiome, the so-called opportunistic pathogens. With the advent of next-generation sequencing (NGS) technologies, researchers now have the tools that allow for profiling of the microbiomes and metagenomes at unprecedented depths. The major advantages of NGS are the high throughput and the fact that specific taxa do not need to be targeted. The relatively low cost and the availability of sequencing facilities have contributed to nearly exponential growth of NGS datasets. The quality and interpretation of the NGS data could be undermined at numerous steps, from sample collection, storage, and DNA extraction to PCR bias, sequencing errors, choice of algorithms for data processing, and statistical analyses. Making sense out of this data deluge is and will be the major challenge. The community analyses based on systems ecology principles will bring us closer to an understanding of the underlying forces that facilitate the stability (or imbalance) of the microbiome. The next logical step will take us beyond the microbiome. The integration of bacterial, viral, fungal "meta-omes" such as the meta-transcriptome, meta-proteome, and meta-metabolome, together with the host as a major co-factor, should be the ultimate goal in unraveling the complexity of the oral interactome.
Affiliation(s)
- E Zaura
- Department of Preventive Dentistry, Academic Centre for Dentistry Amsterdam, University of Amsterdam and VU University Amsterdam, Netherlands.
|
22
|
Larsen PE, Gibbons SM, Gilbert JA. Modeling microbial community structure and functional diversity across time and space. FEMS Microbiol Lett 2012; 332:91-8. [PMID: 22553907 PMCID: PMC3396557 DOI: 10.1111/j.1574-6968.2012.02588.x] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 12/30/2011] [Revised: 04/16/2012] [Accepted: 04/18/2012] [Indexed: 12/21/2022] Open
Abstract
Microbial communities exhibit exquisitely complex structure. Many aspects of this complexity, from the number of species to the total number of interactions, are currently very difficult to examine directly. However, extraordinary efforts are being made to make these systems accessible to scientific investigation. While recent advances in high-throughput sequencing technologies have improved accessibility to the taxonomic and functional diversity of complex communities, monitoring the dynamics of these systems over time and space - using appropriate experimental design - is still expensive. Fortunately, modeling can be used as a lens to focus low-resolution observations of community dynamics to enable mathematical abstractions of functional and taxonomic dynamics across space and time. Here, we review the approaches for modeling bacterial diversity at both the very large and the very small scales at which microbial systems interact with their environments. We show that modeling can help to connect biogeochemical processes to specific microbial metabolic pathways.
|
23
|
Krumholz EW, Yang H, Weisenhorn P, Henry CS, Libourel IGL. Genome-wide metabolic network reconstruction of the picoalga Ostreococcus. JOURNAL OF EXPERIMENTAL BOTANY 2012; 63:2353-2362. [PMID: 22207618 DOI: 10.1093/jxb/err407] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 05/31/2023]
Abstract
The green picoalga Ostreococcus is emerging as a simple plant model organism, and two species, O. lucimarinus and O. tauri, have now been sequenced and annotated manually. To evaluate the completeness of the metabolic annotation of both species, metabolic networks of O. lucimarinus and O. tauri were reconstructed from the KEGG database, thermodynamically constrained, elementally balanced, and functionally evaluated. The draft networks contained extensive gaps and, in the case of O. tauri, no biomass components could be produced due to an incomplete Calvin cycle. To find and remove gaps from the networks, an extensive reference biochemical reaction database was assembled using a stepwise approach that minimized the inclusion of microbial reactions. Gaps were then removed from both Ostreococcus networks using two existing gap-filling methodologies. In the first method, a bottom-up approach, a minimal list of reactions was added to each model to enable the production of all metabolites included in our biomass equation. In the second method, a top-down approach, all reactions in the reference database were added to the target networks and subsequently trimmed away based on the sequence alignment scores of identified orthologues. Because current gap-filling methods do not produce unique solutions, a quality metric that includes a weighting for phylogenetic distance and sequence similarity was developed to distinguish between gap-filling results automatically. The draft O. lucimarinus and O. tauri networks required the addition of 56 and 70 reactions, respectively, in order to produce the same biomass precursor metabolites that were produced by our plant reference database.
Affiliation(s)
- Elias W Krumholz
- Department of Plant Biology, University of Minnesota, Saint Paul, MN 55108, USA
|