1
|
Genome-scale metabolic reconstruction of 7,302 human microorganisms for personalized medicine. Nat Biotechnol 2023; 41:1320-1331. [PMID: 36658342 PMCID: PMC10497413 DOI: 10.1038/s41587-022-01628-0] [Citation(s) in RCA: 40] [Impact Index Per Article: 40.0] [Received: 11/09/2021] [Accepted: 11/30/2022] [Indexed: 01/21/2023]
Abstract
The human microbiome influences the efficacy and safety of a wide variety of commonly prescribed drugs. Designing precision medicine approaches that incorporate microbial metabolism would require strain- and molecule-resolved, scalable computational modeling. Here, we extend our previous resource of genome-scale metabolic reconstructions of human gut microorganisms with a greatly expanded version. AGORA2 (assembly of gut organisms through reconstruction and analysis, version 2) accounts for 7,302 strains, includes strain-resolved drug degradation and biotransformation capabilities for 98 drugs, and was extensively curated based on comparative genomics and literature searches. The microbial reconstructions performed very well against three independently assembled experimental datasets with an accuracy of 0.72 to 0.84, surpassing other reconstruction resources and predicted known microbial drug transformations with an accuracy of 0.81. We demonstrate that AGORA2 enables personalized, strain-resolved modeling by predicting the drug conversion potential of the gut microbiomes from 616 patients with colorectal cancer and controls, which greatly varied between individuals and correlated with age, sex, body mass index and disease stages. AGORA2 serves as a knowledge base for the human microbiome and paves the way to personalized, predictive analysis of host-microbiome metabolic interactions.
|
2
|
A functional microbiome catalog crowdsourced from North American rivers. bioRxiv 2023:2023.07.22.550117. [PMID: 37502915 PMCID: PMC10370164 DOI: 10.1101/2023.07.22.550117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Predicting elemental cycles and maintaining water quality under increasing anthropogenic influence requires understanding the spatial drivers of river microbiomes. However, the unifying microbial determinants governing river biogeochemistry are hindered by a lack of genome-resolved functional insights and sampling across multiple rivers. Here we employed a community science effort to accelerate the sampling of river microbiomes to create the Genome Resolved Open Watersheds database (GROWdb). This resource profiled the identity, distribution, function, and expression of thousands of microbial genomes across rivers covering 90% of United States watersheds. We identified the most cosmopolitan microbiome members, while also revealing local drivers of strain endemism across ecological dimensions. We provide the first evidence that microbial functional trait expression followed the tenets of the River Continuum Concept, suggesting the structure and function of river microbiomes is predictable. GROWdb is a publicly available resource that paves the way for watershed predictive modeling and microbiome-based management practices.
|
3
|
kb_DRAM: annotation and metabolic profiling of genomes with DRAM in KBase. Bioinformatics 2023; 39:btad110. [PMID: 36857575 PMCID: PMC10068739 DOI: 10.1093/bioinformatics/btad110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2022] [Revised: 12/30/2022] [Accepted: 02/28/2023] [Indexed: 03/03/2023]
Abstract
Microbial genome annotation is the process of identifying structural and functional elements in DNA sequences and subsequently attaching biological information to those elements. DRAM is a tool developed to annotate bacterial, archaeal, and viral genomes derived from pure cultures or metagenomes. DRAM goes beyond traditional annotation tools by distilling multiple gene annotations to genome-level summaries of functional potential. Despite these benefits, a downside of DRAM is the requirement of large computational resources, which limits its accessibility. Further, it did not integrate with downstream metabolic modeling tools that require genome annotation. To alleviate these constraints, DRAM and the viral counterpart, DRAM-v, are now available and integrated with the freely accessible KBase cyberinfrastructure. With kb_DRAM, users can generate DRAM annotations and functional summaries from microbial or viral genomes in a point-and-click interface, as well as generate genome-scale metabolic models from DRAM annotations. Availability and implementation: For kb_DRAM users, the kb_DRAM apps on KBase can be found in the catalog at https://narrative.kbase.us/#catalog/modules/kb_DRAM, and a tutorial workflow with all documentation is available at https://narrative.kbase.us/narrative/129480. For kb_DRAM developers, software is available at https://github.com/shafferm/kb_DRAM.
|
4
|
Application of the Metabolic Modeling Pipeline in KBase to Categorize Reactions, Predict Essential Genes, and Predict Pathways in an Isolate Genome. Methods Mol Biol 2022; 2349:291-320. [PMID: 34719000 DOI: 10.1007/978-1-0716-1585-0_13] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
The DOE Systems Biology Knowledgebase (KBase) platform offers a range of powerful tools for the reconstruction, refinement, and analysis of genome-scale metabolic models built from microbial isolate genomes. In this chapter, we describe and demonstrate these tools in action with an analysis of isoprene production in the Bacillus subtilis DSM genome. Two different methods are applied to build initial metabolic models for the DSM genome, then the models are gapfilled in three different growth conditions. Next, flux balance analysis (FBA) and flux variability analysis (FVA) techniques are applied to both study the growth of these models in minimal media and classify reactions within each model based on essentiality and functionality. The models are applied with the FBA method to predict essential genes, which are then compared to an updated list of essential genes obtained for B. subtilis 168, a very similar strain to the DSM isolate. The models are also applied to simulate Biolog growth conditions, and these results are compared with Biolog data collected for B. subtilis 168. Finally, the DSM metabolic models are applied to explore the pathways and genes responsible for producing isoprene in this strain. These studies demonstrate the accuracy and utility of models generated from the KBase pipelines, as well as exploring the tools available for analyzing these models.
|
5
|
|
6
|
Abstract
The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth's continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.
|
7
|
The ModelSEED Biochemistry Database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes. Nucleic Acids Res 2021; 49:D575-D588. [PMID: 32986834 PMCID: PMC7778927 DOI: 10.1093/nar/gkaa746] [Citation(s) in RCA: 86] [Impact Index Per Article: 28.7] [Received: 06/11/2020] [Revised: 08/25/2020] [Accepted: 09/24/2020] [Indexed: 12/31/2022]
Abstract
For over 10 years, ModelSEED has been a primary resource for the construction of draft genome-scale metabolic models based on annotated microbial or plant genomes. Now being released, the biochemistry database serves as the foundation of biochemical data underlying ModelSEED and KBase. The biochemistry database embodies several properties that, taken together, distinguish it from other published biochemistry resources by: (i) including compartmentalization, transport reactions, charged molecules and proton balancing on reactions; (ii) being extensible by the user community, with all data stored in GitHub; and (iii) design as a biochemical 'Rosetta Stone' to facilitate comparison and integration of annotations from many different tools and databases. The database was constructed by combining chemical data from many resources, applying standard transformations, identifying redundancies and computing thermodynamic properties. The ModelSEED biochemistry is continually tested using flux balance analysis to ensure the biochemical network is modeling-ready and capable of simulating diverse phenotypes. Ontologies can be designed to aid in comparing and reconciling metabolic reconstructions that differ in how they represent various metabolic pathways. ModelSEED now includes 33,978 compounds and 36,645 reactions, available as a set of extensible files on GitHub, and available to search at https://modelseed.org/biochem and KBase.
|
8
|
The ModelSEED Biochemistry Database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes. Nucleic Acids Res 2021; 49:D1555. [PMID: 33179751 DOI: 10.1101/2020.03.31.018663] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 05/21/2023]
Abstract
For over ten years, ModelSEED has been a primary resource for the construction of draft genome-scale metabolic models based on annotated microbial or plant genomes. Now being released, the biochemistry database serves as the foundation of biochemical data underlying ModelSEED and KBase. The biochemistry database embodies several properties that, taken together, distinguish it from other published biochemistry resources by: (i) including compartmentalization, transport reactions, charged molecules and proton balancing on reactions; (ii) being extensible by the user community, with all data stored in GitHub; and (iii) design as a biochemical “Rosetta Stone” to facilitate comparison and integration of annotations from many different tools and databases. The database was constructed by combining chemical data from many resources, applying standard transformations, identifying redundancies, and computing thermodynamic properties. The ModelSEED biochemistry is continually tested using flux balance analysis to ensure the biochemical network is modeling-ready and capable of simulating diverse phenotypes. Ontologies can be designed to aid in comparing and reconciling metabolic reconstructions that differ in how they represent various metabolic pathways. ModelSEED now includes 33,978 compounds and 36,645 reactions, available as a set of extensible files on GitHub, and available to search at https://modelseed.org and KBase.
|
9
|
The ModelSEED Biochemistry Database for the integration of metabolic annotations and the reconstruction, comparison and analysis of metabolic models for plants, fungi and microbes. Nucleic Acids Res 2020; 49:D1555. [PMID: 33179751 PMCID: PMC7778962 DOI: 10.1093/nar/gkaa1143] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/24/2022]
|
10
|
|
11
|
Reconstruction and Analysis of Central Metabolism in Microbes. Methods Mol Biol 2018; 1716:111-129. [PMID: 29222751 DOI: 10.1007/978-1-4939-7528-0_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
Genome-scale metabolic models (GEMs) generated from automated reconstruction pipelines often lack accuracy due to the need for extensive gapfilling and the inference of peripheral metabolic pathways based on lower-confidence annotations. The central carbon pathways and electron transport chains are among the most well-understood regions of microbial metabolism, and these pathways contribute significantly toward defining cellular behavior and growth conditions. Thus, it is often useful to construct a simplified core metabolic model (CMM) that comprises only the high-confidence central pathways. In this chapter, we discuss methods for producing core metabolic models (CMMs) based on genome annotations. With its reduced scope compared to GEMs, CMM reconstruction focuses on accurate representation of the central metabolic pathways related to energy biosynthesis and accurate energy yield predictions. We demonstrate the reconstruction and analysis of CMMs using the DOE Systems Biology Knowledgebase (KBase). The complete workflow is available at http://kbase.us/core-models/.
|
12
|
Evolution of substrate specificity in a retained enzyme driven by gene loss. eLife 2017; 6. [PMID: 28362260 PMCID: PMC5404923 DOI: 10.7554/elife.22679] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/25/2016] [Accepted: 03/25/2017] [Indexed: 12/13/2022]
Abstract
The connection between gene loss and the functional adaptation of retained proteins is still poorly understood. We apply phylogenomics and metabolic modeling to detect bacterial species that are evolving by gene loss, with the finding that Actinomycetaceae genomes from human cavities are undergoing sizable reductions, including loss of L-histidine and L-tryptophan biosynthesis. We observe that the dual-substrate phosphoribosyl isomerase A or priA gene, at which these pathways converge, appears to coevolve with the occurrence of trp and his genes. Characterization of a dozen PriA homologs shows that these enzymes adapt from bifunctionality in the largest genomes, to a monofunctional, yet not necessarily specialized, inefficient form in genomes undergoing reduction. These functional changes are accomplished via mutations, which result from relaxation of purifying selection, in residues structurally mapped after sequence and X-ray structural analyses. Our results show how gene loss can drive the evolution of substrate specificity from retained enzymes.
|
13
|
A novel signal transduction protein: Combination of solute binding and tandem PAS-like sensor domains in one polypeptide chain. Protein Sci 2017; 26:857-869. [PMID: 28168783 DOI: 10.1002/pro.3134] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/08/2016] [Revised: 01/27/2017] [Accepted: 02/02/2017] [Indexed: 11/07/2022]
Abstract
We report the structural and biochemical characterization of a novel periplasmic ligand-binding protein, Dret_0059, from Desulfohalobium retbaense DSM 5692, an organism isolated from Lake Retba, in Senegal. The structure of the protein consists of a unique combination of a periplasmic solute binding protein (SBP) domain at the N-terminal and a tandem PAS-like sensor domain at the C-terminal region. SBP domains are found ubiquitously, and their best known function is in solute transport across membranes. PAS-like sensor domains are commonly found in signal transduction proteins. These domains are widely observed as parts of many protein architectures and complexes but have not been observed previously within the same polypeptide chain. In the structure of Dret_0059, a ketoleucine moiety is bound to the SBP, whereas a cytosine molecule is bound in the distal PAS-like domain of the tandem PAS-like domain. Differential scanning fluorimetry supports the binding of ligands observed in the crystal structure. There is significant interaction between the SBP and tandem PAS-like domains, and it is possible that the binding of one ligand could have an effect on the binding of the other. We uncovered three other proteins with this structural architecture in the non-redundant sequence database, and predict that they too bind the same substrates. The genomic context of this protein did not offer any clues for its function. We did not find any biological process in which the two observed ligands are coupled. The protein Dret_0059 could be involved in either signal transduction or solute transport.
|
14
|
Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation. Front Microbiol 2016; 7:1819. [PMID: 27933038 PMCID: PMC5121216 DOI: 10.3389/fmicb.2016.01819] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/01/2015] [Accepted: 10/28/2016] [Indexed: 01/13/2023]
Abstract
Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. An important step toward meeting the challenge of understanding gene function and regulation is the identification of sets of genes that are always co-expressed. These gene sets, Atomic Regulons (ARs), represent fundamental units of function within a cell and could be used to associate genes of unknown function with cellular processes and to enable rational genetic engineering of cellular systems. Here, we describe an approach for inferring ARs that leverages large-scale expression data sets, gene context, and functional relationships among genes. We computed ARs for Escherichia coli based on 907 gene expression experiments and compared our results with gene clusters produced by two prevalent data-driven methods: hierarchical clustering and k-means clustering. We compared ARs and purely data-driven gene clusters to the curated set of regulatory interactions for E. coli found in RegulonDB, showing that ARs are more consistent with gold standard regulons than are data-driven gene clusters. We further examined the consistency of ARs and data-driven gene clusters in the context of gene interactions predicted by Context Likelihood of Relatedness (CLR) analysis, finding that the ARs show better agreement with CLR predicted interactions. We determined the impact of increasing amounts of expression data on AR construction and find that while more data improve ARs, it is not necessary to use the full set of gene expression experiments available for E. coli to produce high quality ARs. In order to explore the conservation of co-regulated gene sets across different organisms, we computed ARs for Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus, each of which represents increasing degrees of phylogenetic distance from E. coli. Comparison of the organism-specific ARs showed that the consistency of AR gene membership correlates with phylogenetic distance, but there is clear variability in the regulatory networks of closely related organisms. As large scale expression data sets become increasingly common for model and non-model organisms, comparative analyses of atomic regulons will provide valuable insights into fundamental regulatory modules used across the bacterial domain.
|
15
|
Modeling central metabolism and energy biosynthesis across microbial life. BMC Genomics 2016; 17:568. [PMID: 27502787 PMCID: PMC4977884 DOI: 10.1186/s12864-016-2887-8] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 10/27/2015] [Accepted: 07/06/2016] [Indexed: 12/22/2022]
Abstract
Background: Automatically generated bacterial metabolic models, and even some curated models, lack accuracy in predicting energy yields due to poor representation of key pathways in energy biosynthesis and the electron transport chain (ETC). Further compounding the problem, complex interlinking pathways in genome-scale metabolic models, and the need for extensive gapfilling to support complex biomass reactions, often result in predicting unrealistic yields or unrealistic physiological flux profiles. Results: To overcome this challenge, we developed methods and tools (http://coremodels.mcs.anl.gov) to build high quality core metabolic models (CMM) representing accurate energy biosynthesis based on a well studied, phylogenetically diverse set of model organisms. We compare these models to explore the variability of core pathways across all microbial life, and by analyzing the ability of our core models to synthesize ATP and essential biomass precursors, we evaluate the extent to which the core metabolic pathways and functional ETCs are known for all microbes. 6,600 (80 %) of our models were found to have some type of aerobic ETC, whereas 5,100 (62 %) have an anaerobic ETC, and 1,279 (15 %) do not have any ETC. Using our manually curated ETC and energy biosynthesis pathways with no gapfilling at all, we predict accurate ATP yields for nearly 5,586 (70 %) of the models under aerobic and anaerobic growth conditions. This study revealed gaps in our knowledge of the central pathways that result in 2,495 (30 %) CMMs being unable to produce ATP under any of the tested conditions. We then established a methodology for the systematic identification and correction of inconsistent annotations using core metabolic models coupled with phylogenetic analysis. Conclusions: We predict accurate energy yields based on our improved annotations in energy biosynthesis pathways and the implementation of diverse ETC reactions across the microbial tree of life. We highlighted missing annotations that were essential to energy biosynthesis in our models. We examine the diversity of these pathways across all microbial life and enable the scientific community to explore the analyses generated from this large-scale analysis of over 8000 microbial genomes.
|
16
|
Enabling comparative modeling of closely related genomes: example genus Brucella. 3 Biotech 2015; 5:101-105. [PMID: 28324362 PMCID: PMC4327756 DOI: 10.1007/s13205-014-0202-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/12/2013] [Accepted: 02/17/2014] [Indexed: 12/22/2022]
Abstract
For many scientific applications, it is highly desirable to be able to compare metabolic models of closely related genomes. In this short report, we attempt to raise awareness of the fact that taking annotated genomes from public repositories and using them for metabolic model reconstructions is far from trivial due to annotation inconsistencies. We propose a protocol for comparative analysis of metabolic models on closely related genomes, using fifteen strains of genus Brucella, which contains pathogens of both humans and livestock. This study led to the identification and subsequent correction of inconsistent annotations in the SEED database, as well as the identification of 31 biochemical reactions that are common to Brucella but were not originally identified by automated metabolic reconstructions. We are currently implementing this protocol for improving automated annotations within the SEED database and these improvements have been propagated into PATRIC, Model-SEED, KBase and RAST. This method is an enabling step for the future creation of consistent annotation systems and high-quality model reconstructions that will support accurate prediction of phenotypes such as pathogenicity, media requirements or type of respiration.
|
17
|
Comparative genomics of cultured and uncultured strains suggests genes essential for free-living growth of Liberibacter. PLoS One 2014; 9:e84469. [PMID: 24416233 PMCID: PMC3885570 DOI: 10.1371/journal.pone.0084469] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 05/09/2013] [Accepted: 11/21/2013] [Indexed: 12/28/2022]
Abstract
The full genomes of two uncultured plant pathogenic Liberibacter, Ca. Liberibacter asiaticus and Ca. Liberibacter solanacearum, are publicly available. Recently, the larger genome of a closely related cultured strain, Liberibacter crescens BT-1, was described. To gain insights into our current inability to culture most Liberibacter, a comparative genomics analysis was done based on the RAST, KEGG, and manual annotations of these three organisms. In addition, pathogenicity genes were examined in all three bacteria. Key deficiencies were identified in Ca. L. asiaticus and Ca. L. solanacearum that might suggest why these organisms have not yet been cultured. Over 100 genes involved in amino acid and vitamin synthesis were annotated exclusively in L. crescens BT-1. However, none of these deficiencies are limiting in the rich media used to date. Other genes exclusive to L. crescens BT-1 include those involved in cell division, the stringent response regulatory pathway, and multiple two component regulatory systems. These results indicate that L. crescens is capable of growth under a much wider range of conditions than the uncultured Liberibacter strains. No outstanding differences were noted in pathogenicity-associated systems, suggesting that L. crescens BT-1 may be a plant pathogen on an as yet unidentified host.
|
18
|
Abstract
Human UBIAD1 localizes to mitochondria and converts vitamin K(1) to vitamin K(2). Vitamin K(2) is best known as a cofactor in blood coagulation, but in bacteria it is a membrane-bound electron carrier. Whether vitamin K(2) exerts a similar carrier function in eukaryotic cells is unknown. We identified Drosophila UBIAD1/Heix as a modifier of pink1, a gene mutated in Parkinson's disease that affects mitochondrial function. We found that vitamin K(2) was necessary and sufficient to transfer electrons in Drosophila mitochondria. Heix mutants showed severe mitochondrial defects that were rescued by vitamin K(2), and, similar to ubiquinone, vitamin K(2) transferred electrons in Drosophila mitochondria, resulting in more efficient adenosine triphosphate (ATP) production. Thus, mitochondrial dysfunction was rescued by vitamin K(2) that serves as a mitochondrial electron carrier, helping to maintain normal ATP production.
|
19
|
Genome sequences of the biotechnologically important Bacillus megaterium strains QM B1551 and DSM319. J Bacteriol 2011; 193:4199-213. [PMID: 21705586 PMCID: PMC3147683 DOI: 10.1128/jb.00449-11] [Citation(s) in RCA: 120] [Impact Index Per Article: 9.2] [Received: 04/02/2011] [Accepted: 06/10/2011] [Indexed: 11/20/2022]
Abstract
Bacillus megaterium is deep-rooted in the Bacillus phylogeny, making it an evolutionarily key species and of particular importance in understanding genome evolution, dynamics, and plasticity in the bacilli. B. megaterium is a commercially available, nonpathogenic host for the biotechnological production of several substances, including vitamin B(12), penicillin acylase, and amylases. Here, we report the analysis of the first complete genome sequences of two important B. megaterium strains, the plasmidless strain DSM319 and QM B1551, which harbors seven indigenous plasmids. The 5.1-Mbp chromosome carries approximately 5,300 genes, while QM B1551 plasmids represent a combined 417 kb and 523 genes, one of the largest plasmid arrays sequenced in a single bacterial strain. We have documented extensive gene transfer between the plasmids and the chromosome. Each strain carries roughly 300 strain-specific chromosomal genes that account for differences in their experimentally confirmed phenotypes. B. megaterium is able to synthesize vitamin B(12) through an oxygen-independent adenosylcobalamin pathway, which together with other key energetic and metabolic pathways has now been fully reconstructed. Other novel genes include a second ftsZ gene, which may be responsible for the large cell size of members of this species, as well as genes for gas vesicles, a second β-galactosidase gene, and most but not all of the genes needed for genetic competence. Comprehensive analyses of the global Bacillus gene pool showed that only an asymmetric region around the origin of replication was syntenic across the genus. This appears to be a characteristic feature of the Bacillus spp. genome architecture and may be key to their sporulating lifestyle.
|