1
|
A fast comparative genome browser for diverse bacteria and archaea. PLoS One 2024; 19:e0301871. [PMID: 38593165 PMCID: PMC11003636 DOI: 10.1371/journal.pone.0301871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 03/22/2024] [Indexed: 04/11/2024] Open
Abstract
Genome sequencing has revealed an incredible diversity of bacteria and archaea, but there are no fast and convenient tools for browsing across these genomes. It is cumbersome to view the prevalence of homologs for a protein of interest, or the gene neighborhoods of those homologs, across the diversity of the prokaryotes. We developed a web-based tool, fast.genomics, that uses two strategies to support fast browsing across the diversity of prokaryotes. First, the database of genomes is split up. The main database contains one representative from each of the 6,377 genera that have a high-quality genome, and additional databases for each taxonomic order contain up to 10 representatives of each species. Second, homologs of proteins of interest are identified quickly by using accelerated searches, usually in a few seconds. Once homologs are identified, fast.genomics can quickly show their prevalence across taxa, view their neighboring genes, or compare the prevalence of two different proteins. Fast.genomics is available at https://fast.genomics.lbl.gov.
|
2
|
Evaluating E. coli genome-scale metabolic model accuracy with high-throughput mutant fitness data. Mol Syst Biol 2023; 19:e11566. [PMID: 37888487 DOI: 10.15252/msb.202311566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 09/23/2023] [Accepted: 10/05/2023] [Indexed: 10/28/2023] Open
Abstract
The Escherichia coli genome-scale metabolic model (GEM) is an exemplar systems biology model for the simulation of cellular metabolism. Experimental validation of model predictions is essential to pinpoint uncertainty and ensure continued development of accurate models. Here, we quantified the accuracy of four subsequent E. coli GEMs using published mutant fitness data across thousands of genes and 25 different carbon sources. This evaluation demonstrated the utility of the area under a precision-recall curve relative to alternative accuracy metrics. An analysis of errors in the latest (iML1515) model identified several vitamins/cofactors that are likely available to mutants despite being absent from the experimental growth medium and highlighted isoenzyme gene-protein-reaction mapping as a key source of inaccurate predictions. A machine learning approach further identified metabolic fluxes through hydrogen ion exchange and specific central metabolism branch points as important determinants of model accuracy. This work outlines improved practices for the assessment of GEM accuracy with high-throughput mutant fitness data and highlights promising areas for future model refinement in E. coli and beyond.
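As an illustration of the metric named in this abstract, the following is a minimal sketch of one common estimator of the area under a precision-recall curve (step-wise average precision). The function name, the label convention (1 for the positive class), and the example values are assumptions made for illustration; they are not taken from the paper, which may define the positive class and compute the curve differently.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 0/1 values, where 1 marks the positive class (an assumed convention).
    scores: predicted scores; higher means more likely positive.
    Tied scores are not treated specially in this sketch.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank predictions from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_pos += 1
            # Precision at the rank where recall increases by 1/total_pos.
            area += true_pos / rank
    return area / total_pos


# Hypothetical example: four genes, two of which are true positives.
example = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Unlike overall accuracy, this quantity depends only on how the positive cases are ranked, which is one reason it is often preferred when the two classes are imbalanced.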
|
3
|
Taxonomic and environmental distribution of bacterial amino acid auxotrophies. Nat Commun 2023; 14:7608. [PMID: 37993466 PMCID: PMC10665431 DOI: 10.1038/s41467-023-43435-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/07/2023] [Accepted: 11/08/2023] [Indexed: 11/24/2023] Open
Abstract
Many microorganisms are auxotrophic (unable to synthesize the compounds they require for growth). With this work, we quantify the prevalence of amino acid auxotrophies across a broad diversity of bacteria and habitats. We predicted the amino acid biosynthetic capabilities of 26,277 unique bacterial genomes spanning 12 phyla using a metabolic pathway model validated with empirical data. Amino acid auxotrophy is widespread across bacterial phyla, but we conservatively estimate that the majority of taxa (78.4%) are able to synthesize all amino acids. Our estimates indicate that amino acid auxotrophies are more prevalent among obligate intracellular parasites and in free-living taxa with genomic attributes characteristic of 'streamlined' life history strategies. We predicted the amino acid biosynthetic capabilities of bacterial communities found in 12 unique habitats to investigate environmental associations with auxotrophy, using data compiled from 3813 samples spanning major aquatic, terrestrial, and engineered environments. Auxotrophic taxa were more abundant in host-associated environments (including the human oral cavity and gut) and in fermented food products, with auxotrophic taxa being relatively rare in soil and aquatic systems. Overall, this work contributes to a more complete understanding of amino acid auxotrophy across the bacterial tree of life and the ecological contexts in which auxotrophy can be a successful strategy.
|
4
|
Large-scale genetic characterization of the model sulfate-reducing bacterium, Desulfovibrio vulgaris Hildenborough. Front Microbiol 2023; 14:1095191. [PMID: 37065130 PMCID: PMC10102598 DOI: 10.3389/fmicb.2023.1095191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 03/10/2023] [Indexed: 04/03/2023] Open
Abstract
Sulfate-reducing bacteria (SRB) are obligate anaerobes that can couple their growth to the reduction of sulfate. Despite the importance of SRB to global nutrient cycles and their damage to the petroleum industry, our molecular understanding of their physiology remains limited. To systematically provide new insights into SRB biology, we generated a randomly barcoded transposon mutant library in the model SRB Desulfovibrio vulgaris Hildenborough (DvH) and used this genome-wide resource to assay the importance of its genes under a range of metabolic and stress conditions. In addition to defining the essential gene set of DvH, we identified a conditional phenotype for 1,137 non-essential genes. Through examination of these conditional phenotypes, we were able to gain a number of novel insights into the molecular biology of DvH, including how this bacterium synthesizes vitamins. For example, we identified DVU0867 as an atypical L-aspartate decarboxylase required for the synthesis of pantothenic acid, provided the first experimental evidence that biotin synthesis in DvH occurs via a specialized acyl carrier protein and without methyl esters, and demonstrated that the uncharacterized dehydrogenase DVU0826:DVU0827 is necessary for the synthesis of pyridoxal phosphate. In addition, we used the mutant fitness data to identify genes involved in the assimilation of diverse nitrogen sources and gained insights into the mechanism of inhibition by chlorate and molybdate. Our large-scale fitness dataset and RB-TnSeq mutant library are community-wide resources that can be used to generate further testable hypotheses about the gene functions of this environmentally and industrially important group of bacteria.
|
5
|
Filling gaps in bacterial catabolic pathways with computation and high-throughput genetics. PLoS Genet 2022; 18:e1010156. [PMID: 35417463 PMCID: PMC9007349 DOI: 10.1371/journal.pgen.1010156] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/23/2021] [Accepted: 03/18/2022] [Indexed: 12/02/2022] Open
Abstract
To discover novel catabolic enzymes and transporters, we combined high-throughput genetic data from 29 bacteria with an automated tool to find gaps in their catabolic pathways. GapMind for carbon sources automatically annotates the uptake and catabolism of 62 compounds in bacterial and archaeal genomes. For the compounds that are utilized by the 29 bacteria, we systematically examined the gaps in GapMind's predicted pathways, and we used the mutant fitness data to find additional genes that were involved in their utilization. We identified novel pathways or enzymes for the utilization of glucosamine, citrulline, myo-inositol, lactose, and phenylacetate, and we annotated 299 diverged enzymes and transporters. We also curated 125 proteins from published reports. For the 29 bacteria with genetic data, GapMind finds high-confidence paths for 85% of utilized carbon sources. In diverse bacteria and archaea, 38% of utilized carbon sources have high-confidence paths, which was improved from 27% by incorporating the fitness-based annotations and our curation. GapMind for carbon sources is available as a web server (http://papers.genomics.lbl.gov/carbon) and takes just 30 seconds for the typical genome.
|
6
|
Systematic discovery of pseudomonad genetic factors involved in sensitivity to tailocins. THE ISME JOURNAL 2021; 15:2289-2305. [PMID: 33649553 PMCID: PMC8319346 DOI: 10.1038/s41396-021-00921-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/30/2020] [Revised: 01/14/2021] [Accepted: 02/01/2021] [Indexed: 12/13/2022]
Abstract
Tailocins are bactericidal protein complexes produced by a wide variety of bacteria that kill closely related strains and may play a role in microbial community structure. Thanks to their high specificity, tailocins have been proposed as precision antibacterial agents for therapeutic applications. Compared to tailed phages, with which they share an evolutionary and morphological relationship, bacterially produced tailocins kill their host upon production but producing strains display resistance to self-intoxication. Though lipopolysaccharide (LPS) has been shown to act as a receptor for tailocins, the breadth of factors involved in tailocin sensitivity, and the mechanisms behind resistance to self-intoxication, remain unclear. Here, we employed genome-wide screens in four non-model pseudomonads to identify mutants with altered fitness in the presence of tailocins produced by closely related pseudomonads. Our mutant screens identified O-antigen composition and display as most important in defining sensitivity to our tailocins. In addition, the screens suggest LPS thinning as a mechanism by which resistant strains can become more sensitive to tailocins. We validate many of these novel findings, and extend these observations of tailocin sensitivity to 130 genome-sequenced pseudomonads. This work offers insights into tailocin-bacteria interactions, informing the potential use of tailocins in microbiome manipulation and antibacterial therapy.
|
7
|
Functional genetics of human gut commensal Bacteroides thetaiotaomicron reveals metabolic requirements for growth across environments. Cell Rep 2021; 34:108789. [PMID: 33657378 PMCID: PMC8121099 DOI: 10.1016/j.celrep.2021.108789] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 01/03/2019] [Revised: 11/30/2020] [Accepted: 02/03/2021] [Indexed: 12/12/2022] Open
Abstract
Harnessing the microbiota for beneficial outcomes is limited by our poor understanding of the constituent bacteria, as the functions of most of their genes are unknown. Here, we measure the growth of a barcoded transposon mutant library of the gut commensal Bacteroides thetaiotaomicron on 48 carbon sources, in the presence of 56 stress-inducing compounds, and during mono-colonization of gnotobiotic mice. We identify 516 genes with a specific phenotype under only one or a few conditions, enabling informed predictions of gene function. For example, we identify a glycoside hydrolase important for growth on type I rhamnogalacturonan, a DUF4861 protein for glycosaminoglycan utilization, a 3-keto-glucoside hydrolase for disaccharide utilization, and a tripartite multidrug resistance system specifically for bile salt tolerance. Furthermore, we show that B. thetaiotaomicron uses alternative enzymes for synthesizing nitrogen-containing metabolic precursors based on ammonium availability and that these enzymes are used differentially in vivo in a diet-dependent manner.
|
8
|
Abstract
Although most organisms synthesize methionine from homocysteine and methyl folates, some have “core” methionine synthases that lack folate-binding domains and use other methyl donors. In vitro, the characterized core synthases use methylcobalamin as a methyl donor, but in vivo, they probably rely on corrinoid (vitamin B12-binding) proteins. We identified four families of core methionine synthases that are distantly related to each other (under 30% pairwise amino acid identity). From the characterized enzymes, we identified the families MesA, which is found in methanogens, and MesB, which is found in anaerobic bacteria and archaea with the Wood-Ljungdahl pathway. A third uncharacterized family, MesC, is found in anaerobic archaea that have the Wood-Ljungdahl pathway and lack known forms of methionine synthase. We predict that most members of the MesB and MesC families accept methyl groups from the iron-sulfur corrinoid protein of that pathway. The fourth family, MesD, is found only in aerobic bacteria. Using transposon mutants and complementation, we show that MesD does not require 5-methyltetrahydrofolate or cobalamin. Instead, MesD requires an uncharacterized protein family (DUF1852) and oxygen for activity.

Methionine is one of the amino acids that make up proteins, and the final step in methionine synthesis is the transfer of a methyl group. In most organisms, the methyl group is obtained from methyl folates, but some anaerobic bacteria and archaea are thought to use corrinoid (vitamin B12-binding) proteins instead. By analyzing the sequences of the potential methionine synthases across the genomes of diverse bacteria and archaea, we identified four families of folate-independent methionine synthases. For three of these families, we can use co-occurrence with corrinoid proteins to predict their likely partners. We show that the fourth family does not require vitamin B12; instead, it obtains methyl groups from an oxygen-dependent partner protein. Our results will help us understand the growth requirements of diverse bacteria and archaea.
|
9
|
Abstract
Bacteriophages (phages) are critical players in the dynamics and function of microbial communities and drive processes as diverse as global biogeochemical cycles and human health. Phages tend to be predators finely tuned to attack specific hosts, even down to the strain level, which in turn defend themselves using an array of mechanisms. However, to date, efforts to rapidly and comprehensively identify bacterial host factors important in phage infection and resistance have yet to be fully realized. Here, we globally map the host genetic determinants involved in resistance to 14 phylogenetically diverse double-stranded DNA phages using two model Escherichia coli strains (K-12 and BL21) with known sequence divergence to demonstrate strain-specific differences. Using genome-wide loss-of-function and gain-of-function genetic technologies, we are able to confirm previously described phage receptors as well as uncover a number of previously unknown host factors that confer resistance to one or more of these phages. We uncover differences in resistance factors that strongly align with the susceptibility of K-12 and BL21 to specific phage. We also identify both phage-specific mechanisms, such as the unexpected role of cyclic-di-GMP in host sensitivity to phage N4, and more generic defenses, such as the overproduction of colanic acid capsular polysaccharide that defends against a wide array of phages. Our results indicate that host responses to phages can occur via diverse cellular mechanisms. Our systematic and high-throughput genetic workflow to characterize phage-host interaction determinants can be extended to diverse bacteria to generate datasets that allow predictive models of how phage-mediated selection will shape bacterial phenotype and evolution. The results of this study and future efforts to map the phage resistance landscape will lead to new insights into the coevolution of hosts and their phage, which can ultimately be used to design better phage therapeutic treatments and tools for precision microbiome engineering.
|
10
|
Selective carbon sources influence the end products of microbial nitrate respiration. THE ISME JOURNAL 2020; 14:2034-2045. [PMID: 32372050 PMCID: PMC7368043 DOI: 10.1038/s41396-020-0666-7] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 01/06/2020] [Revised: 03/31/2020] [Accepted: 04/22/2020] [Indexed: 11/09/2022]
Abstract
Respiratory and catabolic genes are differentially distributed across microbial genomes. Thus, specific carbon sources may favor different respiratory processes. We profiled the influence of 94 carbon sources on the end products of nitrate respiration in microbial enrichment cultures from diverse terrestrial environments. We found that some carbon sources consistently favor dissimilatory nitrate reduction to ammonium (DNRA/nitrate ammonification) while other carbon sources favor nitrite accumulation or denitrification. For an enrichment culture from aquatic sediment, we sequenced the genomes of the most abundant strains, matched these genomes to 16S rDNA exact sequence variants (ESVs), and used 16S rDNA amplicon sequencing to track the differential enrichment of functionally distinct ESVs on different carbon sources. We found that changes in the abundances of strains with different genetic potentials for nitrite accumulation, DNRA or denitrification were correlated with the nitrite or ammonium concentrations in the enrichment cultures recovered on different carbon sources. Specifically, we found that either L-sorbose or D-cellobiose enriched for a Klebsiella nitrite accumulator, other sugars enriched for an Escherichia nitrate ammonifier, and citrate or formate enriched for a Pseudomonas denitrifier and a Sulfurospirillum nitrate ammonifier. Our results add important nuance to the current paradigm that higher concentrations of carbon will always favor DNRA over denitrification or nitrite accumulation, and we propose that, in some cases, carbon composition can be as important as carbon concentration in determining nitrate respiratory end products. Furthermore, our approach can be extended to other environments and metabolisms to characterize how selective parameters influence microbial community composition, gene content, and function.
|
11
|
Abstract
GapMind is a Web-based tool for annotating amino acid biosynthesis in bacteria and archaea (http://papers.genomics.lbl.gov/gaps). GapMind incorporates many variant pathways and 130 different reactions, and it analyzes a genome in just 15 s. To avoid error-prone transitive annotations, GapMind relies primarily on a database of experimentally characterized proteins. GapMind correctly handles fusion proteins and split proteins, which often cause errors for best-hit approaches. To improve GapMind's coverage, we examined genetic data from 35 bacteria that grow in defined media without amino acids, and we filled many gaps in amino acid biosynthesis pathways. For example, we identified additional genes for arginine synthesis with succinylated intermediates in Bacteroides thetaiotaomicron, and we propose that Dyella japonica synthesizes tyrosine from phenylalanine. Nevertheless, for many bacteria and archaea that grow in minimal media, genes for some steps still cannot be identified. To help interpret potential gaps, GapMind checks if they match known gaps in related microbes that can grow in minimal media. GapMind should aid the identification of microbial growth requirements.

IMPORTANCE Many microbes can make all of the amino acids (the building blocks of proteins). In principle, we should be able to predict which amino acids a microbe can make, and which it requires as nutrients, by checking its genome sequence for all of the necessary genes. However, in practice, it is difficult to check for all of the alternative pathways. Furthermore, new pathways and enzymes are still being discovered. We built an automated tool, GapMind, to annotate amino acid biosynthesis in bacterial and archaeal genomes. We used GapMind to list gaps: cases where a microbe makes an amino acid but a complete pathway cannot be identified in its genome. We used these gaps, together with data from mutants, to identify new pathways and enzymes. However, for most bacteria and archaea, we still do not know how they can make all of the amino acids.
|
12
|
Correction: Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics. PLoS Genet 2019; 15:e1008106. [PMID: 30943208 PMCID: PMC6447180 DOI: 10.1371/journal.pgen.1008106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
|
13
|
A new family of transcriptional regulators of tungstoenzymes and molybdate/tungstate transport. Environ Microbiol 2019; 21:784-799. [PMID: 30536693 DOI: 10.1111/1462-2920.14500] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/24/2018] [Revised: 12/03/2018] [Accepted: 12/07/2018] [Indexed: 11/30/2022]
Abstract
Bacterial genes for molybdenum-containing and tungsten-containing enzymes are often differentially regulated depending on the metal availability in the environment. Here, we describe a new family of transcription factors with an unusual DNA-binding domain related to excisionases of bacteriophages. These transcription factors are associated with genes for various molybdate-specific and tungstate-specific transport systems as well as molybdo/tungsto-enzymes in a wide range of bacterial genomes. We used a combination of computational and experimental techniques to study a member of the TF family, named TaoR (for tungsten-containing aldehyde oxidoreductase regulator). In Desulfovibrio vulgaris Hildenborough, a model bacterium for sulfate reduction studies, TaoR activates expression of the aldehyde oxidoreductase gene aor and represses the tungsten-specific ABC-type transporter genes tupABC under tungsten-replete conditions. TaoR binding sites at the aor promoter were identified by electrophoretic mobility shift assay and DNase I footprinting. We also reconstructed TaoR regulons in 45 Deltaproteobacteria using a comparative genomics approach and predicted target genes for TaoR family members in other Proteobacteria and Firmicutes.
|
14
|
The selective pressures on the microbial community in a metal-contaminated aquifer. ISME JOURNAL 2018; 13:937-949. [PMID: 30523276 PMCID: PMC6461962 DOI: 10.1038/s41396-018-0328-1] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 09/10/2018] [Revised: 11/12/2018] [Accepted: 11/22/2018] [Indexed: 12/25/2022]
Abstract
In many environments, toxic compounds restrict which microorganisms persist. However, in complex mixtures of inhibitory compounds, it is challenging to determine which specific compounds cause changes in abundance and prevent some microorganisms from growing. We focused on a contaminated aquifer in Oak Ridge, Tennessee, USA that has large gradients of pH and widely varying concentrations of uranium, nitrate, and many other inorganic ions. In the most contaminated wells, the microbial community is enriched in the Rhodanobacter genus. Rhodanobacter abundance is positively correlated with low pH and high concentrations of uranium and 13 other ions and we sought to determine which of these ions are selective pressures that favor the growth of Rhodanobacter over other taxa. Of these ions, low pH and high UO₂²⁺, Mn²⁺, Al³⁺, Cd²⁺, Zn²⁺, Co²⁺, and Ni²⁺ are both (a) selectively inhibitory of a Pseudomonas isolate from an uncontaminated well vs. a Rhodanobacter isolate from a contaminated well, and (b) reach toxic concentrations (for the Pseudomonas isolate) in the Rhodanobacter-dominated wells. We used mixtures of ions to simulate the groundwater conditions in the most contaminated wells and verified that few isolates aside from Rhodanobacter can tolerate these eight ions. These results clarify which ions are likely causal factors that impact the microbial community at this field site and are not merely correlated with taxonomic shifts. Furthermore, our general high-throughput approach can be applied to other environments, isolates, and conditions to systematically help identify selective pressures on microbial communities.
|
15
|
Mutant phenotypes for thousands of bacterial genes of unknown function. Nature 2018; 557:503-509. [PMID: 29769716 DOI: 10.1038/s41586-018-0124-0] [Citation(s) in RCA: 285] [Impact Index Per Article: 47.5] [Received: 10/05/2016] [Accepted: 04/09/2018] [Indexed: 01/25/2023]
Abstract
One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
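The idea of associating two genes because they have similar mutant phenotypes can be sketched as a correlation between their fitness profiles across conditions. The code below is only an illustration under stated assumptions: the use of Pearson correlation, the function names, the gene names, and the fitness values are all hypothetical and are not taken from the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def most_similar_gene(fitness, query):
    """Return (gene, correlation) for the gene whose profile best matches query.

    fitness: dict mapping a gene name to its list of fitness values,
    one value per condition, with conditions in the same order for every gene.
    """
    others = [gene for gene in fitness if gene != query]
    if not others:
        raise ValueError("need at least one other gene to compare against")
    best = max(others, key=lambda gene: pearson(fitness[query], fitness[gene]))
    return best, pearson(fitness[query], fitness[best])


# Hypothetical fitness values for three genes across four conditions.
profiles = {
    "geneA": [0.0, -2.0, 0.0, -3.0],
    "geneB": [0.1, -1.9, 0.2, -2.8],
    "geneC": [-1.0, 0.0, -2.0, 0.1],
}
partner, r = most_similar_gene(profiles, "geneA")
```

In this toy data, geneA and geneB are impaired in the same two conditions, so their profiles correlate strongly, whereas geneC shows the opposite pattern.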
|
16
|
Filling gaps in bacterial amino acid biosynthesis pathways with high-throughput genetics. PLoS Genet 2018; 14:e1007147. [PMID: 29324779 PMCID: PMC5764234 DOI: 10.1371/journal.pgen.1007147] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 10/02/2017] [Accepted: 12/10/2017] [Indexed: 11/18/2022] Open
Abstract
For many bacteria with sequenced genomes, we do not understand how they synthesize some amino acids. This makes it challenging to reconstruct their metabolism, and has led to speculation that bacteria might be cross-feeding amino acids. We studied heterotrophic bacteria from 10 different genera that grow without added amino acids even though an automated tool predicts that the bacteria have gaps in their amino acid synthesis pathways. Across these bacteria, there were 11 gaps in their amino acid biosynthesis pathways that we could not fill using current knowledge. Using genome-wide mutant fitness data, we identified novel enzymes that fill 9 of the 11 gaps and hence explain the biosynthesis of methionine, threonine, serine, or histidine by bacteria from six genera. We also found that the sulfate-reducing bacterium Desulfovibrio vulgaris synthesizes homocysteine (which is a precursor to methionine) by using DUF39, NIL/ferredoxin, and COG2122 proteins, and that homoserine is not an intermediate in this pathway. Our results suggest that most free-living bacteria can likely make all 20 amino acids and illustrate how high-throughput genetics can uncover previously-unknown amino acid biosynthesis genes.

For a few bacteria, it is well known how they can make all 20 of the standard amino acids (the building blocks of proteins). For many other bacteria, their genome sequence implies that there are gaps in these biosynthetic pathways, so that the bacteria cannot make all of the amino acids and would need to take up some of them from their environment instead. But many bacteria can grow in minimal media (without any amino acids) despite these apparent gaps. We studied 10 bacteria with predicted gaps in amino acid biosynthesis that nevertheless grow in minimal media. Most of these gaps were spurious, but 11 of the gaps were genuine and could not be explained by current knowledge. Using high-throughput genetics, we systematically identified genes that were required for growth in minimal media and identified the biosynthetic genes that fill 9 of the 11 gaps. We hope that this approach can be applied to many more bacteria and will eventually allow us to accurately predict the nutritional requirements of a bacterium from its genome sequence.
|
17
|
A metabolic pathway for catabolizing levulinic acid in bacteria. Nat Microbiol 2017; 2:1624-1634. [PMID: 28947739 PMCID: PMC5705400 DOI: 10.1038/s41564-017-0028-z] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 04/24/2017] [Accepted: 08/16/2017] [Indexed: 12/21/2022]
Abstract
Microorganisms can catabolize a wide range of organic compounds and therefore have the potential to perform many industrially relevant bioconversions. One barrier to realizing the potential of biorefining strategies lies in our incomplete knowledge of metabolic pathways, including those that can be used to assimilate naturally abundant or easily generated feedstocks. For instance, levulinic acid (LA) is a carbon source that is readily obtainable as a dehydration product of lignocellulosic biomass and can serve as the sole carbon source for some bacteria. Yet, the genetics and structure of LA catabolism have remained unknown. Here, we report the identification and characterization of a seven-gene operon that enables LA catabolism in Pseudomonas putida KT2440. When the pathway was reconstituted with purified proteins, we observed the formation of four acyl-CoA intermediates, including a unique 4-phosphovaleryl-CoA and the previously observed 3-hydroxyvaleryl-CoA product. Using adaptive evolution, we obtained a mutant of Escherichia coli LS5218 with functional deletions of fadE and atoC that was capable of robust growth on LA when it expressed the five enzymes from the P. putida operon. This discovery will enable more efficient use of biomass hydrolysates and metabolic engineering to develop bioconversions using LA as a feedstock.
|
18
|
Abstract
We use simple models of the costs and benefits of microbial gene expression to show that changing a protein's expression away from its optimum by 2-fold should reduce fitness by at least [Formula: see text], where P is the fraction of the cell's protein that the gene accounts for. As microbial genes are usually expressed at above 5 parts per million, and effective population sizes are likely to be above 10^6, this implies that 2-fold changes to gene expression levels are under strong selection, as [Formula: see text], where Ne is the effective population size and s is the selection coefficient. Thus, most gene duplications should be selected against. On the other hand, we predict that for most genes, small changes in the expression will be effectively neutral.
|
19
|
Mechanisms of direct inhibition of the respiratory sulfate-reduction pathway by (per)chlorate and nitrate. ISME JOURNAL 2014; 9:1295-305. [PMID: 25405978 DOI: 10.1038/ismej.2014.216] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 07/05/2014] [Revised: 09/30/2014] [Accepted: 10/04/2014] [Indexed: 12/26/2022]
Abstract
We investigated perchlorate (ClO₄⁻) and chlorate (ClO₃⁻) (collectively (per)chlorate) in comparison with nitrate as potential inhibitors of sulfide (H₂S) production by mesophilic sulfate-reducing microorganisms (SRMs). We demonstrate the specificity and potency of (per)chlorate as direct SRM inhibitors in both pure cultures and undefined sulfidogenic communities. We demonstrate that (per)chlorate and nitrate are antagonistic inhibitors and that resistance is cross-inducible, implying that these compounds share at least one common mechanism of resistance. Using tagged-transposon pools, we identified genes responsible for sensitivity and resistance in Desulfovibrio alaskensis G20. We found that mutants in Dde_2702 (Rex), a repressor of the central sulfate-reduction pathway, were resistant to both (per)chlorate and nitrate. In general, Rex derepresses its regulon in response to increasing intracellular NADH:NAD⁺ ratios. In cells in which respiratory sulfate reduction is inhibited, NADH:NAD⁺ ratios should increase, leading to derepression of the sulfate-reduction pathway. In support of this, in (per)chlorate- or nitrate-stressed wild-type G20 we observed higher NADH:NAD⁺ ratios, increased transcripts and increased peptide counts for genes in the core Rex regulon. We conclude that one mode of (per)chlorate and nitrate toxicity is as direct inhibitors of the central sulfate-reduction pathway. Our results demonstrate that (per)chlorate are more potent inhibitors than nitrate in both pure cultures and communities, implying that they represent an attractive alternative for controlling sulfidogenesis in industrial ecosystems. Of these, perchlorate offers better application logistics because of its inhibitory potency, solubility, relative chemical stability, low affinity for mineral cations and high mobility in environmental systems.
|
20
|
The genetic basis of energy conservation in the sulfate-reducing bacterium Desulfovibrio alaskensis G20. Front Microbiol 2014; 5:577. [PMID: 25400629 PMCID: PMC4215793 DOI: 10.3389/fmicb.2014.00577] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 06/19/2014] [Accepted: 10/13/2014] [Indexed: 11/13/2022] Open
Abstract
Sulfate-reducing bacteria play major roles in the global carbon and sulfur cycles, but it remains unclear how reducing sulfate yields energy. To determine the genetic basis of energy conservation, we measured the fitness of thousands of pooled mutants of Desulfovibrio alaskensis G20 during growth in 12 different combinations of electron donors and acceptors. We show that ion pumping by the ferredoxin:NADH oxidoreductase Rnf is required whenever substrate-level phosphorylation is not possible. The uncharacterized complex Hdr/flox-1 (Dde_1207:13) is sometimes important alongside Rnf and may perform an electron bifurcation to generate more reduced ferredoxin from NADH to allow further ion pumping. Similarly, during the oxidation of malate or fumarate, the electron-bifurcating transhydrogenase NfnAB-2 (Dde_1250:1) is important and may generate reduced ferredoxin to allow additional ion pumping by Rnf. During formate oxidation, the periplasmic [NiFeSe] hydrogenase HysAB is required, which suggests that hydrogen forms in the periplasm, diffuses to the cytoplasm, and is used to reduce ferredoxin, thus providing a substrate for Rnf. During hydrogen utilization, the transmembrane electron transport complex Tmc is important and may move electrons from the periplasm into the cytoplasmic sulfite reduction pathway. Finally, mutants of many other putative electron carriers have no clear phenotype, which suggests that they are not important under our growth conditions, although we cannot rule out genetic redundancy.
|
21
|
Genetic basis for nitrate resistance in Desulfovibrio strains. Front Microbiol 2014; 5:153. [PMID: 24795702 PMCID: PMC4001038 DOI: 10.3389/fmicb.2014.00153] [Citation(s) in RCA: 162] [Impact Index Per Article: 16.2] [Received: 01/27/2014] [Accepted: 03/21/2014] [Indexed: 12/31/2022] Open
Abstract
Nitrate is an inhibitor of sulfate-reducing bacteria (SRB). In petroleum production sites, amendments of nitrate and nitrite are used to prevent SRB production of sulfide that causes souring of oil wells. A better understanding of nitrate stress responses in the model SRB, Desulfovibrio vulgaris Hildenborough and Desulfovibrio alaskensis G20, will strengthen predictions of environmental outcomes of nitrate application. Nitrate inhibition of SRB has historically been considered to result from the generation of small amounts of nitrite, to which SRB are quite sensitive. Here we explored the possibility that nitrate might inhibit SRB by a mechanism other than through nitrite inhibition. We found that nitrate-stressed D. vulgaris cultures grown in lactate-sulfate conditions eventually grew in the presence of high concentrations of nitrate, and their resistance continued through several subcultures. Nitrate consumption was not detected over the course of the experiment, suggesting adaptation to nitrate. With high-throughput genetic approaches employing TnLE-seq for D. vulgaris and a pooled mutant library of D. alaskensis, we determined the fitness of many transposon mutants of both organisms in nitrate stress conditions. We found that several mutants, including homologs present in both strains, had a greatly increased ability to grow in the presence of nitrate but not nitrite. The mutated genes conferring nitrate resistance included the gene encoding the putative Rex transcriptional regulator (DVU0916/Dde_2702), as well as a cluster of genes (DVU0251-DVU0245/Dde_0597-Dde_0605) that is poorly annotated. Follow-up studies with individual D. vulgaris transposon and deletion mutants confirmed high-throughput results. We conclude that, in D. vulgaris and D. alaskensis, nitrate resistance in wild-type cultures is likely conferred by spontaneous mutations. Furthermore, the mechanisms that confer nitrate resistance may be different from those that confer nitrite resistance.
|
22
|
The energy-conserving electron transfer system used by Desulfovibrio alaskensis strain G20 during pyruvate fermentation involves reduction of endogenously formed fumarate and cytoplasmic and membrane-bound complexes, Hdr-Flox and Rnf. Environ Microbiol 2014; 16:3463-86. [DOI: 10.1111/1462-2920.12405] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/03/2013] [Revised: 01/08/2014] [Accepted: 01/13/2014] [Indexed: 12/01/2022]
|
23
|
Control of methionine metabolism by the SahR transcriptional regulator in Proteobacteria. Environ Microbiol 2013; 16:1-8. [PMID: 24118949 DOI: 10.1111/1462-2920.12273] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/14/2023]
Abstract
Sulphur is an essential element in metabolism. The sulphur-containing amino acid methionine is a metabolic precursor for S-adenosylmethionine (SAM), which serves as a coenzyme for ubiquitous methyltransferases. Recycling of organic sulphur compounds, e.g. via the SAM cycle, is an important metabolic process that needs to be tightly regulated. Knowledge about transcriptional regulation of these processes is still limited for many free-living bacteria. We identified a novel transcription factor, SahR, from the ArsR family that controls the SAM cycle genes in diverse microorganisms from soil and aquatic ecosystems. By using comparative genomics, we predicted SahR-binding DNA motifs and reconstructed SahR regulons in the genomes of 62 Proteobacteria. The conserved core of SahR regulons includes all enzymes required for the SAM cycle: the SAH hydrolase AhcY, the methionine biosynthesis enzymes MetE/MetH and MetF, and the SAM synthetase MetK. By using a combination of experimental techniques, we validated the SahR regulon in the sulphate-reducing Deltaproteobacterium Desulfovibrio alaskensis. SahR functions as a negative regulator that responds to S-adenosylhomocysteine (SAH). An elevated SAH level in the cell dissociates SahR from its DNA operators and induces the expression of SAM cycle genes. The effector-sensing domain in SahR is related to SAM-dependent methylases that are able to tightly bind SAH. SahR represents a novel type of transcriptional regulator for the control of sulphur amino acid metabolism.
|
24
|
Dissecting a complex chemical stress: chemogenomic profiling of plant hydrolysates. Mol Syst Biol 2013; 9:674. [PMID: 23774757 PMCID: PMC3964314 DOI: 10.1038/msb.2013.30] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Received: 11/30/2012] [Accepted: 05/12/2013] [Indexed: 11/09/2022] Open
Abstract
Complex chemical stress arises during the production of biofuels. Large-scale mutant fitness profiling was used to identify bacterial and yeast tolerance genes and to model fitness in a complex hydrolysate mixture. The resulting model can be used to engineer more tolerant strains.
Genome-wide fitness profiling was used to identify plant hydrolysate tolerance genes in Zymomonas mobilis and Saccharomyces cerevisiae. We modeled fitness in hydrolysate as a mixture of fitness in its components. Outliers in our model led to the identification of a previously unknown component of hydrolysate. Overexpression of a Z. mobilis tolerance gene of unknown function improved ethanol productivity in plant hydrolysate.
The efficient production of biofuels from cellulosic feedstocks will require the efficient fermentation of the sugars in hydrolyzed plant material. Unfortunately, plant hydrolysates also contain many compounds that inhibit microbial growth and fermentation. We used DNA-barcoded mutant libraries to identify genes that are important for hydrolysate tolerance in both Zymomonas mobilis (44 genes) and Saccharomyces cerevisiae (99 genes). Overexpression of a Z. mobilis tolerance gene of unknown function (ZMO1875) improved its specific ethanol productivity 2.4-fold in the presence of miscanthus hydrolysate. However, a mixture of 37 hydrolysate-derived inhibitors was not sufficient to explain the fitness profile of plant hydrolysate. To deconstruct the fitness profile of hydrolysate, we profiled the 37 inhibitors against a library of Z. mobilis mutants and we modeled fitness in hydrolysate as a mixture of fitness in its components. By examining outliers in this model, we identified methylglyoxal as a previously unknown component of hydrolysate. Our work provides a general strategy to dissect how microbes respond to a complex chemical stress and should enable further engineering of hydrolysate tolerance.
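The idea of modeling fitness in a mixture as a combination of fitness in its components, and then inspecting outliers, can be illustrated with a toy least-squares fit. This is only a sketch under stated assumptions: the abstract does not give the model's exact form, so the two-component linear model, the function names, the gene names, the fitness values, and the residual threshold below are all hypothetical.

```python
def fit_two_component_mixture(f_mix, f_a, f_b):
    """Least-squares weights (w_a, w_b) such that f_mix ~ w_a*f_a + w_b*f_b.

    Solves the 2x2 normal equations directly (no intercept term).
    """
    saa = sum(a * a for a in f_a)
    sbb = sum(b * b for b in f_b)
    sab = sum(a * b for a, b in zip(f_a, f_b))
    sam = sum(a * m for a, m in zip(f_a, f_mix))
    sbm = sum(b * m for b, m in zip(f_b, f_mix))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("component fitness profiles are collinear")
    w_a = (sam * sbb - sbm * sab) / det
    w_b = (sbm * saa - sam * sab) / det
    return w_a, w_b


def find_outliers(genes, f_mix, f_a, f_b, threshold=1.0):
    """Return genes whose mixture fitness is poorly explained by the fit."""
    w_a, w_b = fit_two_component_mixture(f_mix, f_a, f_b)
    residuals = [m - (w_a * a + w_b * b) for m, a, b in zip(f_mix, f_a, f_b)]
    return [g for g, r in zip(genes, residuals) if abs(r) > threshold]


# Hypothetical per-gene fitness values (negative = mutant grows worse).
genes = ["gene1", "gene2", "gene3", "gene4", "gene5"]
fit_inhibitor_a = [-2.0, 0.0, -1.0, 0.0, 0.0]
fit_inhibitor_b = [0.0, -3.0, -1.0, 0.0, 0.0]
# gene5 is sick in the mixture but in neither component alone.
fit_mixture = [-1.0, -1.5, -1.0, 0.0, -2.0]

weights = fit_two_component_mixture(fit_mixture, fit_inhibitor_a, fit_inhibitor_b)
outliers = find_outliers(genes, fit_mixture, fit_inhibitor_a, fit_inhibitor_b)
print(weights, outliers)
```

In this toy data, the gene with a large residual is the kind of outlier that would hint at a component missing from the model.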
|
25
|
Indirect and suboptimal control of gene expression is widespread in bacteria. Mol Syst Biol 2013; 9:660. [PMID: 23591776 PMCID: PMC3658271 DOI: 10.1038/msb.2013.16] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.9] [Received: 11/27/2012] [Accepted: 03/13/2013] [Indexed: 11/09/2022] Open
Abstract
Gene regulation in bacteria is usually described as an adaptive response to an environmental change so that genes are expressed when they are required. We instead propose that most genes are under indirect control: their expression responds to signal(s) that are not directly related to the genes' function. Indirect control should perform poorly in artificial conditions, and we show that gene regulation is often maladaptive in the laboratory. In Shewanella oneidensis MR-1, 24% of genes are detrimental to fitness in some conditions, and detrimental genes tend to be highly expressed instead of being repressed when not needed. In diverse bacteria, there is little correlation between when genes are important for optimal growth or fitness and when those genes are upregulated. Two common types of indirect control are constitutive expression and regulation by growth rate; these occur for genes with diverse functions and often seem to be suboptimal. Because genes that have closely related functions can have dissimilar expression patterns, regulation may be suboptimal in the wild as well as in the laboratory.
|
26
|
Metabolic footprinting of mutant libraries to map metabolite utilization to genotype. ACS Chem Biol 2013; 8:189-99. [PMID: 23082955 DOI: 10.1021/cb300477w] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 01/07/2023]
Abstract
The discrepancy between the pace of sequencing and functional characterization of genomes is a major challenge in understanding complex microbial metabolic processes and metabolic interactions in the environment. Here, we identified and validated genes related to the utilization of specific metabolites in bacteria by profiling metabolite utilization in libraries of mutant strains. Untargeted mass spectrometry based metabolomics was used to identify metabolites utilized by Escherichia coli and Shewanella oneidensis MR-1. Targeted high-throughput metabolite profiling of spent media of 8042 individual mutant strains was performed to link utilization to specific genes. Using this approach we identified genes of known function as well as novel transport proteins and enzymes required for the utilization of tested metabolites. Specific examples include two subunits of a predicted ABC transporter encoded by the genes SO1043 and SO1044 required for the utilization of citrulline and a predicted histidase encoded by the gene SO3057 required for the utilization of ergothioneine by S. oneidensis. In vitro assays with purified proteins showed substrate specificity of SO3057 toward ergothioneine and histidine betaine in contrast to substrate specificity of a paralogous histidase SO0098 toward histidine. This generally applicable, high-throughput workflow has the potential both to discover novel metabolic capabilities of microorganisms and to identify the corresponding genes.
|
27
|
Evidence-based annotation of gene function in Shewanella oneidensis MR-1 using genome-wide fitness profiling across 121 conditions. PLoS Genet 2011; 7:e1002385. [PMID: 22125499 PMCID: PMC3219624 DOI: 10.1371/journal.pgen.1002385] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.9] [Received: 08/09/2011] [Accepted: 09/30/2011] [Indexed: 11/21/2022] Open
Abstract
Most genes in bacteria are experimentally uncharacterized and cannot be annotated with a specific function. Given the great diversity of bacteria and the ease of genome sequencing, high-throughput approaches to identify gene function experimentally are needed. Here, we use pools of tagged transposon mutants in the metal-reducing bacterium Shewanella oneidensis MR-1 to probe the mutant fitness of 3,355 genes in 121 diverse conditions including different growth substrates, alternative electron acceptors, stresses, and motility. We find that 2,350 genes have a pattern of fitness that is significantly different from random and 1,230 of these genes (37% of our total assayed genes) have enough signal to show strong biological correlations. We find that genes in all functional categories have phenotypes, including hundreds of hypotheticals, and that potentially redundant genes (over 50% amino acid identity to another gene in the genome) are also likely to have distinct phenotypes. Using fitness patterns, we were able to propose specific molecular functions for 40 genes or operons that lacked specific annotations or had incomplete annotations. In one example, we demonstrate that the previously hypothetical gene SO_3749 encodes a functional acetylornithine deacetylase, thus filling a missing step in S. oneidensis metabolism. Additionally, we demonstrate that the orphan histidine kinase SO_2742 and orphan response regulator SO_2648 form a signal transduction pathway that activates expression of acetyl-CoA synthase and is required for S. oneidensis to grow on acetate as a carbon source. Lastly, we demonstrate that gene expression and mutant fitness are poorly correlated and that mutant fitness generates more confident predictions of gene function than does gene expression. The approach described here can be applied generally to create large-scale gene-phenotype maps for evidence-based annotation of gene function in prokaryotes. 
Many computationally predicted gene annotations in bacteria are incomplete or wrong. Consequently, experimental methods to systematically determine gene function in bacteria are required. Here, we describe a genetic approach to meet this challenge. We constructed a large transposon mutant library in the metal-reducing bacterium Shewanella oneidensis MR-1 and profiled the fitness of this collection in more than 100 diverse experimental conditions. In addition to identifying a phenotype for more than 2,000 genes, we demonstrate that mutant fitness profiles can be used to assign “evidence-based” gene annotations for enzymes, signaling proteins, transporters, and transcription factors, a subset of which we verify experimentally.
|
28
|
Systematic mapping of two component response regulators to gene targets in a model sulfate reducing bacterium. Genome Biol 2011; 12:R99. [PMID: 21992415 PMCID: PMC3333781 DOI: 10.1186/gb-2011-12-10-r99] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 04/11/2011] [Revised: 07/23/2011] [Accepted: 10/12/2011] [Indexed: 01/26/2023] Open
Abstract
Background: Two component regulatory systems are the primary form of signal transduction in bacteria. Although genomic binding sites have been determined for several eukaryotic and bacterial transcription factors, comprehensive identification of gene targets of two component response regulators remains challenging due to the lack of knowledge of the signals required for their activation. We focused our study on Desulfovibrio vulgaris Hildenborough, a sulfate reducing bacterium that encodes unusually diverse and largely uncharacterized two component signal transduction systems. Results: We report the first systematic mapping of the genes regulated by all transcriptionally acting response regulators in a single bacterium. Our results enabled functional predictions for several response regulators and include key processes of carbon, nitrogen and energy metabolism, cell motility and biofilm formation, and responses to stresses such as nitrite, low potassium and phosphate starvation. Our study also led to the prediction of new genes and regulatory networks, which found corroboration in a compendium of transcriptome data available for D. vulgaris. For several regulators we predicted and experimentally verified the binding site motifs, most of which were discovered as part of this study. Conclusions: The gene targets identified for the response regulators allowed strong functional predictions to be made for the corresponding two component systems. By tracking the D. vulgaris regulators and their motifs outside the Desulfovibrio spp. we provide testable hypotheses regarding the functions of orthologous regulators in other organisms. The in vitro array based method optimized here is generally applicable for the study of such systems in all organisms.
|
29
|
Abstract
Systems-level analyses of non-model microorganisms are limited by the existence of numerous uncharacterized genes and a corresponding over-reliance on automated computational annotations. One solution to this challenge is to disrupt gene function using DNA tag technology, which has been highly successful in parallelizing reverse genetics in Saccharomyces cerevisiae and has led to discoveries in gene function, genetic interactions and drug mechanism of action. To extend the yeast DNA tag methodology to a wide variety of microorganisms and applications, we have created a universal, sequence-verified TagModule collection. A hallmark of the 4280 TagModules is that they are cloned into a Gateway entry vector, thus facilitating rapid transfer to any compatible genetic system. Here, we describe the application of the TagModules to rapidly generate tagged mutants by transposon mutagenesis in the metal-reducing bacterium Shewanella oneidensis MR-1 and the pathogenic yeast Candida albicans. Our results demonstrate the optimal hybridization properties of the TagModule collection, the flexibility in applying the strategy to diverse microorganisms and the biological insights that can be gained from fitness profiling tagged mutant collections. The publicly available TagModule collection is a platform-independent resource for the functional genomics of a wide range of microbial systems in the post-genome era.
|
30
|
Impact of elevated nitrate on sulfate-reducing bacteria: a comparative study of Desulfovibrio vulgaris. ISME JOURNAL 2010; 4:1386-97. [PMID: 20445634 DOI: 10.1038/ismej.2010.59] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 12/27/2022]
Abstract
Sulfate-reducing bacteria have been extensively studied for their potential in heavy-metal bioremediation. However, the occurrence of elevated nitrate in contaminated environments has been shown to inhibit sulfate reduction activity. Although the inhibition has been suggested to result from competition with nitrate-reducing bacteria, the possibility of direct inhibition of sulfate reducers by elevated nitrate needs to be explored. Using Desulfovibrio vulgaris as a model sulfate-reducing bacterium, functional genomics analysis reveals that osmotic stress contributed to growth inhibition by nitrate, as shown by the upregulation of the glycine/betaine transporter genes and the relief of nitrate inhibition by osmoprotectants. The observation that significant growth inhibition was effected by 70 mM NaNO₃ but not by 70 mM NaCl suggests the presence of inhibitory mechanisms in addition to osmotic stress. The differential expression of genes characteristic of nitrite stress responses, such as the hybrid cluster protein gene, under nitrate stress conditions further indicates that the nitrate stress response of D. vulgaris was linked to components of both osmotic and nitrite stress responses. The involvement of the oxidative stress response pathway, however, might be the result of a more general stress response. Given the low similarities between the response profiles to nitrate and other stresses, less-defined stress response pathways could also be important in nitrate stress, which might involve a shift in energy metabolism. Nitrite, which is inhibitory to sulfate-reducing bacteria, is produced as a metabolic intermediate of microbial nitrate reduction. The involvement of the nitrite stress response upon exposure to nitrate may therefore provide detoxification mechanisms for nitrite and may enhance the survival of sulfate-reducing bacteria in environments with elevated nitrate levels.
|
31
|
FastTree 2--approximately maximum-likelihood trees for large alignments. PLoS One 2010; 5:e9490. [PMID: 20224823 PMCID: PMC2835736 DOI: 10.1371/journal.pone.0009490] [Citation(s) in RCA: 8217] [Impact Index Per Article: 586.9] [Received: 11/25/2009] [Accepted: 02/09/2010] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: We recently described FastTree, a tool for inferring phylogenies for alignments with up to hundreds of thousands of sequences. Here, we describe improvements to FastTree that improve its accuracy without sacrificing scalability. METHODOLOGY/PRINCIPAL FINDINGS: Where FastTree 1 used nearest-neighbor interchanges (NNIs) and the minimum-evolution criterion to improve the tree, FastTree 2 adds minimum-evolution subtree-pruning-regrafting (SPRs) and maximum-likelihood NNIs. FastTree 2 uses heuristics to restrict the search for better trees and estimates a rate of evolution for each site (the "CAT" approximation). Nevertheless, for both simulated and genuine alignments, FastTree 2 is slightly more accurate than a standard implementation of maximum-likelihood NNIs (PhyML 3 with default settings). Although FastTree 2 is not quite as accurate as methods that use maximum-likelihood SPRs, most of the splits that disagree are poorly supported, and for large alignments, FastTree 2 is 100-1,000 times faster. FastTree 2 inferred a topology and likelihood-based local support values for 237,882 distinct 16S ribosomal RNAs on a desktop computer in 22 hours and 5.8 gigabytes of memory. CONCLUSIONS/SIGNIFICANCE: FastTree 2 allows the inference of maximum-likelihood phylogenies for huge alignments. FastTree 2 is freely available at http://www.microbesonline.org/fasttree.
|
32
|
Abstract
Since 2003, MicrobesOnline (http://www.microbesonline.org) has been providing a community resource for comparative and functional genome analysis. The portal includes over 1000 complete genomes of bacteria, archaea and fungi and thousands of expression microarrays from diverse organisms ranging from model organisms such as Escherichia coli and Saccharomyces cerevisiae to environmental microbes such as Desulfovibrio vulgaris and Shewanella oneidensis. To assist in annotating genes and in reconstructing their evolutionary history, MicrobesOnline includes a comparative genome browser based on phylogenetic trees for every gene family as well as a species tree. To identify co-regulated genes, MicrobesOnline can search for genes based on their expression profile, and provides tools for identifying regulatory motifs and seeing if they are conserved. MicrobesOnline also includes fast phylogenetic profile searches, comparative views of metabolic pathways, operon predictions, a workbench for sequence analysis and integration with RegTransBase and other microbial genome resources. The next update of MicrobesOnline will contain significant new functionality, including comparative analysis of metagenomic sequence data. Programmatic access to the database, along with source code and documentation, is available at http://microbesonline.org/programmers.html.
|
33
|
FastTree: computing large minimum evolution trees with profiles instead of a distance matrix. Mol Biol Evol 2009; 26:1641-50. [PMID: 19377059 PMCID: PMC2693737 DOI: 10.1093/molbev/msp077] [Citation(s) in RCA: 3106] [Impact Index Per Article: 207.1] [Indexed: 02/01/2023] Open
Abstract
Gene families are growing rapidly, but standard methods for inferring phylogenies do not scale to alignments with over 10,000 sequences. We present FastTree, a method for constructing large phylogenies and for estimating their reliability. Instead of storing a distance matrix, FastTree stores sequence profiles of internal nodes in the tree. FastTree uses these profiles to implement Neighbor-Joining and uses heuristics to quickly identify candidate joins. FastTree then uses nearest neighbor interchanges to reduce the length of the tree. For an alignment with $N$ sequences, $L$ sites, and $a$ different characters, a distance matrix requires $O(N^2)$ space and $O(N^2 L)$ time, but FastTree requires just $O(NLa + N\sqrt{N})$ memory and $O(N\sqrt{N}\log(N)La)$ time. To estimate the tree's reliability, FastTree uses local bootstrapping, which gives another 100-fold speedup over a distance matrix. For example, FastTree computed a tree and support values for 158,022 distinct 16S ribosomal RNAs in 17 h and 2.4 GB of memory. Just computing pairwise Jukes–Cantor distances and storing them, without inferring a tree or bootstrapping, would require 17 h and 50 GB of memory. In simulations, FastTree was slightly more accurate than Neighbor-Joining, BIONJ, or FastME; on genuine alignments, FastTree's topologies had higher likelihoods. FastTree is available at http://microbesonline.org/fasttree.
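The 50 GB figure quoted above for storing pairwise distances can be sanity-checked with a little arithmetic. The sketch below assumes each unordered pair of sequences is stored once as a 4-byte floating-point value; that storage layout and the function name are assumptions made for illustration, not details taken from the abstract.

```python
def pairwise_distance_bytes(n_sequences: int, bytes_per_distance: int = 4) -> int:
    """Bytes needed to store one distance per unordered pair of sequences."""
    n_pairs = n_sequences * (n_sequences - 1) // 2  # N choose 2
    return n_pairs * bytes_per_distance


n_16s = 158_022  # number of distinct 16S rRNAs in the example from the abstract
matrix_gb = pairwise_distance_bytes(n_16s) / 1e9
print(f"Distance matrix: about {matrix_gb:.0f} GB")
```

Under these assumptions the quadratic storage cost lands close to the 50 GB quoted, compared with the 2.4 GB reported for FastTree's profile-based approach.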
|
34
|
Abstract
Background: All-versus-all BLAST, which searches for homologous pairs of sequences in a database of proteins, is used to identify potential orthologs, to find new protein families, and to provide rapid access to these homology relationships. As DNA sequencing accelerates and data sets grow, all-versus-all BLAST has become computationally demanding. Methodology/Principal Findings: We present FastBLAST, a heuristic replacement for all-versus-all BLAST that relies on alignments of proteins to known families, obtained from tools such as PSI-BLAST and HMMer. FastBLAST avoids most of the work of all-versus-all BLAST by taking advantage of these alignments and by clustering similar sequences. FastBLAST runs in two stages: the first stage identifies additional families and aligns them, and the second stage quickly identifies the homologs of a query sequence, based on the alignments of the families, before generating pairwise alignments. On 6.53 million proteins from the non-redundant Genbank database (“NR”), FastBLAST identifies new families 25 times faster than all-versus-all BLAST. Once the first stage is completed, FastBLAST identifies homologs for the average query in less than 5 seconds (8.6 times faster than BLAST) and gives nearly identical results. For hits above 70 bits, FastBLAST identifies 98% of the top 3,250 hits per query. Conclusions/Significance: FastBLAST enables research groups that do not have supercomputers to analyze large protein sequence data sets. FastBLAST is open source software and is available at http://microbesonline.org/fastblast.
|
35
|
Orthologous transcription factors in bacteria have different functions and regulate different genes. PLoS Comput Biol 2007; 3:1739-50. [PMID: 17845071 PMCID: PMC1971122 DOI: 10.1371/journal.pcbi.0030175] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Received: 02/02/2007] [Accepted: 07/25/2007] [Indexed: 11/21/2022] Open
Abstract
Transcription factors (TFs) form large paralogous gene families and have complex evolutionary histories. Here, we ask whether putative orthologs of TFs, from bidirectional best BLAST hits (BBHs), are evolutionary orthologs with conserved functions. We show that BBHs of TFs from distantly related bacteria are usually not evolutionary orthologs. Furthermore, the false orthologs usually respond to different signals and regulate distinct pathways, while the few BBHs that are evolutionary orthologs do have conserved functions. To test the conservation of regulatory interactions, we analyze expression patterns. We find that regulatory relationships between TFs and their regulated genes are usually not conserved for BBHs in Escherichia coli K12 and Bacillus subtilis. Even in the much more closely related bacteria Vibrio cholerae and Shewanella oneidensis MR-1, predicting regulation from E. coli BBHs has high error rates. Using gene–regulon correlations, we identify genes whose expression pattern differs between E. coli and S. oneidensis. Using literature searches and sequence analysis, we show that these changes in expression patterns reflect changes in gene regulation, even for evolutionary orthologs. We conclude that the evolution of bacterial regulation should be analyzed with phylogenetic trees, rather than BBHs, and that bacterial regulatory networks evolve more rapidly than previously thought.
Living organisms use transcription factors (TFs) to control the production of proteins. For example, the bacterium E. coli contains a TF that prevents it from making enzymes that degrade lactose when lactose is absent. Bacterial genomes encode a huge diversity of TFs, and except in a few well-studied organisms, the function of these TFs is not known. To predict the function of a TF, biologists often search for a similar TF, from another organism, that has been characterized. It is generally believed that orthologous TFs (TFs that are derived from the organisms' common ancestor) will have conserved functions. The authors show that a commonly used method to identify orthologous TFs gives misleading results when applied to distantly related bacteria: the “orthologous” TFs are evolutionarily distant, they sense different signals, and they regulate different pathways. Biologists often predict, more specifically, that orthologous TFs will regulate orthologous genes. However, the authors show that even in more closely related bacteria, where the orthologous TFs do have conserved functions, these specific predictions are often incorrect. It seems that gene regulation in bacteria evolves rapidly, and it will be difficult to predict regulation in diverse bacteria from our knowledge of a few well-studied bacteria.
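For readers unfamiliar with the term, bidirectional best hits can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the score dictionaries stand in for parsed BLAST output, and the gene names and bit scores are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict mapping (query, subject) -> bit score (hypothetical input format).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy scores, invented for illustration only.
a_vs_b = {("a1", "b1"): 200.0, ("a1", "b2"): 150.0,
          ("a2", "b2"): 90.0, ("a3", "b1"): 120.0}
b_vs_a = {("b1", "a1"): 198.0, ("b1", "a3"): 120.0,
          ("b2", "a1"): 151.0, ("b2", "a2"): 90.0}

bbh_pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(bbh_pairs)
```

In this toy input only a1 and b1 choose each other; a2 and a3 each have a best hit that prefers a1. A reciprocal pair of this kind is what the abstract calls a putative ortholog, and the abstract's point is that such a pair need not be an evolutionary ortholog.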
|
36
|
Abstract
Operons are a major feature of all prokaryotic genomes, but how and why operon structures vary is not well understood. To elucidate the life-cycle of operons, we compared gene order between Escherichia coli K12 and its relatives and identified the recently formed and destroyed operons in E. coli. This allowed us to determine how operons form, how they become closely spaced, and how they die. Our findings suggest that operon evolution may be driven by selection on gene expression patterns. First, both operon creation and operon destruction lead to large changes in gene expression patterns. For example, the removal of lysA and ruvA from ancestral operons that contained essential genes allowed their expression to respond to lysine levels and DNA damage, respectively. Second, some operons have undergone accelerated evolution, with multiple new genes being added during a brief period. Third, although genes within operons are usually closely spaced because of a neutral bias toward deletion and because of selection against large overlaps, genes in highly expressed operons tend to be widely spaced because of regulatory fine-tuning by intervening sequences. Although operon evolution may be adaptive, it need not be optimal: new operons often comprise functionally unrelated genes that were already in proximity before the operon formed.
|
37
|
|
38
|
OpWise: operons aid the identification of differentially expressed genes in bacterial microarray experiments. BMC Bioinformatics 2006; 7:19. [PMID: 16412220 PMCID: PMC1397872 DOI: 10.1186/1471-2105-7-19] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 07/16/2005] [Accepted: 01/13/2006] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND: Differentially expressed genes are typically identified by analyzing the variation between replicate measurements. These procedures implicitly assume that there are no systematic errors in the data even though several sources of systematic error are known.
RESULTS: OpWise estimates the amount of systematic error in bacterial microarray data by assuming that genes in the same operon have matching expression patterns. OpWise then performs a Bayesian analysis of a linear model to estimate significance. In simulations, OpWise corrects for systematic error and is robust to deviations from its assumptions. In several bacterial data sets, significant amounts of systematic error are present, and replicate-based approaches overstate the confidence of the changers dramatically, while OpWise does not. Finally, OpWise can identify additional changers by assigning genes higher confidence if they are consistent with other genes in the same operon.
CONCLUSION: Although microarray data can contain large amounts of systematic error, operons provide an external standard and allow for reasonable estimates of significance. OpWise is available at http://microbesonline.org/OpWise.
|
39
|
Abstract
At present, hundreds of microbial genomes have been sequenced, and hundreds more are currently in the pipeline. The Virtual Institute for Microbial Stress and Survival has developed a publicly available suite of Web-based comparative genomic tools (http://www.microbesonline.org) designed to facilitate multispecies comparison among prokaryotes. Highlights of the MicrobesOnline Web site include operon and regulon predictions, a multispecies genome browser, a multispecies Gene Ontology browser, a comparative KEGG metabolic pathway viewer, a Bioinformatics Workbench for in-depth sequence analysis, and Gene Carts that allow users to save genes of interest for further study while they browse. In addition, we provide an interface for genome annotation, which like all of the tools reported here, is freely available to the scientific community.
|
40
|
Interruptions in gene expression drive highly expressed operons to the leading strand of DNA replication. Nucleic Acids Res 2005; 33:3224-34. [PMID: 15942025 PMCID: PMC1143696 DOI: 10.1093/nar/gki638] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.9] [Indexed: 01/04/2023] Open
Abstract
In bacteria, most genes are on the leading strand of replication, a phenomenon attributed to collisions between the DNA and RNA polymerases. In Escherichia coli, these collisions slow the movement of the replication fork through actively transcribed genes only if they are coded on the lagging strand. For genes on both strands, however, these collisions sever nascent transcripts and interrupt gene expression. Based on these observations, we propose a new theory to explain strand bias: genes whose expression is important for fitness are selected to the leading strand because this reduces the duration of these interruptions. Our theory predicts that multi-gene operons, which are subject to longer interruptions, should be more strongly selected to the leading strand than singleton transcripts. We show that this is true even after controlling for the tendency for essential genes, which are strongly biased to the leading strand, to occur in operons. Our theory also predicts that other factors that are associated with strand bias should have stronger effects for genes that are in operons. We find that expression level and phylogenetic ubiquity are correlated with strand bias for both essential and non-essential genes, but only for genes in operons.
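The notion of a gene being "on the leading strand" can be made concrete with a small sketch. It assumes a circular chromosome replicated bidirectionally from a single origin to a single terminus, and it assumes that "+" strand genes are transcribed in the direction of increasing coordinates; the genome length, origin, terminus, and gene list below are hypothetical, and this is not the analysis used in the paper.

```python
def on_leading_strand(position, strand, origin, terminus, genome_length):
    """True if transcription is co-directional with the replication fork.

    Assumes a circular genome, one origin, one terminus, and that "+" genes
    are transcribed toward increasing coordinates (illustrative assumptions).
    """
    # Distances measured from the origin in the direction of increasing coordinates.
    gene_offset = (position - origin) % genome_length
    terminus_offset = (terminus - origin) % genome_length
    # On this arm the fork moves toward increasing coordinates; on the other arm
    # it moves toward decreasing coordinates.
    fork_moves_forward = gene_offset < terminus_offset
    return (strand == "+") == fork_moves_forward


# Hypothetical toy genome: (gene midpoint, strand).
GENOME_LENGTH, ORIGIN, TERMINUS = 1000, 0, 500
genes = [(100, "+"), (300, "+"), (700, "-"), (800, "+")]

leading_flags = [
    on_leading_strand(pos, strand, ORIGIN, TERMINUS, GENOME_LENGTH)
    for pos, strand in genes
]
leading_fraction = sum(leading_flags) / len(genes)
print(leading_flags, leading_fraction)
```

A strand-bias comparison like the one described in the abstract would compute this fraction separately for different gene classes, for example genes in operons versus singleton transcripts.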
|
41
|
Operon formation is driven by co-regulation and not by horizontal gene transfer. Genome Res 2005; 15:809-19. [PMID: 15930492 PMCID: PMC1142471 DOI: 10.1101/gr.3368805] [Citation(s) in RCA: 117] [Impact Index Per Article: 6.2] [Received: 11/15/2004] [Accepted: 03/16/2005] [Indexed: 11/24/2022]
Abstract
The organization of bacterial genes into operons was originally ascribed to the benefits of co-regulation. More recently, the "selfish operon" model, in which operons are formed by repeated gain and loss of genes, was proposed. Indeed, operons are often subject to horizontal gene transfer (HGT). On the other hand, non-HGT genes are particularly likely to be in operons. To clarify whether HGT is involved in operon formation, we identified recently formed operons in Escherichia coli K12. We show that genes that have homologs in distantly related bacteria but not in close relatives of E. coli--indicating HGT--form new operons at about the same rates as native genes. Furthermore, genes in new operons are no more likely than other genes to have phylogenetic trees that are inconsistent with the species tree. In contrast, essential genes and ubiquitous genes without paralogs--genes believed to undergo HGT rarely--often form new operons. We conclude that HGT is not a cause of operon formation but instead promotes the prevalence of pre-existing operons. To explain operon formation, we propose that new operons reduce the amount of regulatory information required to specify optimal expression patterns and infer that operons should be more likely to evolve than independent promoters when regulation is complex. Consistent with this hypothesis, operons have greater amounts of conserved regulatory sequences than do individually transcribed genes.
|
42
|
Abstract
We combine comparative genomic measures and the distance separating adjacent genes to predict operons in 124 completely sequenced prokaryotic genomes. Our method automatically tailors itself to each genome using sequence information alone, and thus can be applied to any prokaryote. For Escherichia coli K12 and Bacillus subtilis, our method is 85 and 83% accurate, respectively, which is similar to the accuracy of methods that use the same features but are trained on experimentally characterized transcripts. In Halobacterium NRC-1 and in Helicobacter pylori, our method correctly infers that genes in operons are separated by shorter distances than they are in E. coli, and its predictions using distance alone are more accurate than distance-only predictions trained on a database of E. coli transcripts. We use microarray data from six phylogenetically diverse prokaryotes to show that combining intergenic distance with comparative genomic measures further improves accuracy and that our method is broadly effective. Finally, we survey operon structure across 124 genomes, and find several surprises: H. pylori has many operons, contrary to previous reports; Bacillus anthracis has an unusual number of pseudogenes within conserved operons; and Synechocystis PCC 6803 has many operons even though it has unusually wide spacings between conserved adjacent genes.
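The "distance separating adjacent genes" feature can be illustrated with a short sketch. It only computes gaps between adjacent same-strand genes and applies a fixed cutoff; the `Gene` record, the coordinates, and the 50 bp cutoff are hypothetical, and the fixed cutoff is a deliberately naive stand-in for the genome-specific model the abstract describes.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def same_strand_gaps(genes):
    """Return (left, right, gap_bp) for adjacent genes on the same strand.

    A negative gap means the two genes overlap.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        if left.strand == right.strand:
            gaps.append((left.name, right.name, right.start - left.end - 1))
    return gaps


# Hypothetical annotations, invented for illustration.
genes = [
    Gene("g1", 100, 400, "+"),
    Gene("g2", 410, 900, "+"),
    Gene("g3", 1200, 1500, "-"),
    Gene("g4", 1497, 2000, "-"),
    Gene("g5", 2300, 2600, "+"),
    Gene("g6", 3000, 3300, "+"),
]

gaps = same_strand_gaps(genes)

MAX_GAP_BP = 50  # arbitrary illustrative cutoff, not a value from the paper
candidate_pairs = [(a, b) for a, b, gap in gaps if gap <= MAX_GAP_BP]
print(gaps, candidate_pairs)
```

Because typical spacing within operons differs between genomes (the abstract notes shorter distances in Halobacterium NRC-1 and H. pylori than in E. coli), a single cutoff like this would not transfer well across species, which is the motivation for tailoring the distance model to each genome.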
|
43
|
Finding coexpressed genes in counts-based data: an improved measure with validation experiments. Bioinformatics 2004; 20:945-52. [PMID: 14751974 DOI: 10.1093/bioinformatics/bth011] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Expressed sequence tag (EST) data reflects variation in gene expression, but previous methods for finding coexpressed genes in EST data are subject to bias and vastly overstate the statistical significance of putatively coexpressed genes.
RESULTS: We introduce a new method (LNP) that reports reasonable p-values and also detects more biological relationships in human dbEST than do previous methods. In simulations with human dbEST library sizes, previous methods report p-values as low as 10^-30 on 1/1000 uncorrelated pairs, while LNP reports significance correctly. We validate the analysis on real human genes by comparing coexpressed pairs to gene ontology annotations and find that LNP is more sensitive than the three previous methods. We also find a small but statistically significant level of coexpression between interacting proteins relative to randomized controls. The LNP method is based on a log-normal prior on the distribution of expression levels.
|
44
|
|