1
|
Efficient and Robust Search of Microbial Genomes via Phylogenetic Compression. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.04.15.536996. [PMID: 37131636 PMCID: PMC10153118 DOI: 10.1101/2023.04.15.536996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Comprehensive collections approaching millions of sequenced genomes have become central information sources in the life sciences. However, the rapid growth of these collections has made it effectively impossible to search these data using tools such as BLAST and its successors. Here, we present a technique called phylogenetic compression, which uses evolutionary history to guide compression and efficiently search large collections of microbial genomes using existing algorithms and data structures. We show that, when applied to modern diverse collections approaching millions of genomes, lossless phylogenetic compression improves the compression ratios of assemblies, de Bruijn graphs, and k-mer indexes by one to two orders of magnitude. Additionally, we develop a pipeline for a BLAST-like search over these phylogeny-compressed reference data, and demonstrate it can align genes, plasmids, or entire sequencing experiments against all sequenced bacteria until 2019 on ordinary desktop computers within a few hours. Phylogenetic compression has broad applications in computational biology and may provide a fundamental design principle for future genomics infrastructure.
|
2
|
Tracking SARS-CoV-2 variants of concern in wastewater: an assessment of nine computational tools using simulated genomic data. Microb Genom 2024; 10. [PMID: 38785221 DOI: 10.1099/mgen.0.001249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024] Open
Abstract
Wastewater-based surveillance (WBS) is an important epidemiological and public health tool for tracking pathogens across the scale of a building, neighbourhood, city, or region. WBS gained widespread adoption globally during the SARS-CoV-2 pandemic for estimating community infection levels by qPCR. Sequencing pathogen genes or genomes from wastewater adds information about pathogen genetic diversity, which can be used to identify viral lineages (including variants of concern) that are circulating in a local population. Capturing the genetic diversity by WBS sequencing is not trivial, as wastewater samples often contain a diverse mixture of viral lineages with real mutations and sequencing errors, which must be deconvoluted computationally from short sequencing reads. In this study we assess nine different computational tools that have recently been developed to address this challenge. We simulated 100 wastewater sequence samples consisting of SARS-CoV-2 BA.1, BA.2, and Delta lineages, in various mixtures, as well as a Delta-Omicron recombinant and a synthetic 'novel' lineage. Most tools performed well in identifying the true lineages present and estimating their relative abundances and were generally robust to variation in sequencing depth and read length. While many tools identified lineages present down to 1 % frequency, results were more reliable above a 5 % threshold. The presence of an unknown synthetic lineage, which represents an unclassified SARS-CoV-2 lineage, increases the error in relative abundance estimates of other lineages, but the magnitude of this effect was small for most tools. The tools also varied in how they labelled novel synthetic lineages and recombinants. While our simulated dataset represents just one of many possible use cases for these methods, we hope it helps users understand potential sources of error or bias in wastewater sequencing analysis and to appreciate the commonalities and differences across methods.
|
3
|
Diverse and abundant phages exploit conjugative plasmids. Nat Commun 2024; 15:3197. [PMID: 38609370 PMCID: PMC11015023 DOI: 10.1038/s41467-024-47416-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Accepted: 03/28/2024] [Indexed: 04/14/2024] Open
Abstract
Phages exert profound evolutionary pressure on bacteria by interacting with receptors on the cell surface to initiate infection. While the majority of phages use chromosomally encoded cell surface structures as receptors, plasmid-dependent phages exploit plasmid-encoded conjugation proteins, making their host range dependent on horizontal transfer of the plasmid. Despite their unique biology and biotechnological significance, only a small number of plasmid-dependent phages have been characterized. Here we systematically search for new plasmid-dependent phages targeting IncP and IncF plasmids using a targeted discovery platform, and find that they are common and abundant in wastewater, and largely unexplored in terms of their genetic diversity. Plasmid-dependent phages are enriched in non-canonical types of phages, and all but one of the 65 phages we isolated were non-tailed, and members of the lipid-containing tectiviruses, ssDNA filamentous phages or ssRNA phages. We show that plasmid-dependent tectiviruses exhibit profound differences in their host range which is associated with variation in the phage holin protein. Despite their relatively high abundance in wastewater, plasmid-dependent tectiviruses are missed by metaviromic analyses, underscoring the continued importance of culture-based phage discovery. Finally, we identify a tailed phage dependent on the IncF plasmid, and find related structural genes in phages that use the orthogonal type 4 pilus as a receptor, highlighting the evolutionarily promiscuous use of these distinct contractile structures by multiple groups of phages. Taken together, these results indicate plasmid-dependent phages play an under-appreciated evolutionary role in constraining horizontal gene transfer via conjugative plasmids.
|
4
|
Changing fitness effects of mutations through long-term bacterial evolution. Science 2024; 383:eadd1417. [PMID: 38271521 DOI: 10.1126/science.add1417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 12/12/2023] [Indexed: 01/27/2024]
Abstract
The distribution of fitness effects of new mutations shapes evolution, but it is challenging to observe how it changes as organisms adapt. Using Escherichia coli lineages spanning 50,000 generations of evolution, we quantify the fitness effects of insertion mutations in every gene. Macroscopically, the fraction of deleterious mutations changed little over time whereas the beneficial tail declined sharply, approaching an exponential distribution. Microscopically, changes in individual gene essentiality and deleterious effects often occurred in parallel; altered essentiality is only partly explained by structural variation. The identity and effect sizes of beneficial mutations changed rapidly over time, but many targets of selection remained predictable because of the importance of loss-of-function mutations. Taken together, these results reveal the dynamic, but statistically predictable, nature of mutational fitness effects.
|
5
|
Diverse and abundant phages exploit conjugative plasmids. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.19.532758. [PMID: 36993299 PMCID: PMC10055259 DOI: 10.1101/2023.03.19.532758] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/31/2023]
Abstract
Phages exert profound evolutionary pressure on bacteria by interacting with receptors on the cell surface to initiate infection. While the majority of phages use chromosomally-encoded cell surface structures as receptors, plasmid-dependent phages exploit plasmid-encoded conjugation proteins, making their host range dependent on horizontal transfer of the plasmid. Despite their unique biology and biotechnological significance, only a small number of plasmid-dependent phages have been characterized. Here we systematically search for new plasmid-dependent phages targeting IncP and IncF plasmids using a targeted discovery platform, and find that they are common and abundant in wastewater, and largely unexplored in terms of their genetic diversity. Plasmid-dependent phages are enriched in non-canonical types of phages, and all but one of the 64 phages we isolated were non-tailed, and members of the lipid-containing tectiviruses, ssDNA filamentous phages or ssRNA phages. We show that plasmid-dependent tectiviruses exhibit profound differences in their host range which is associated with variation in the phage holin protein. Despite their relatively high abundance in wastewater, plasmid-dependent tectiviruses are missed by metaviromic analyses, underscoring the continued importance of culture-based phage discovery. Finally, we identify a tailed phage dependent on the IncF plasmid, and find related structural genes in phages that use the orthogonal type 4 pilus as a receptor, highlighting the promiscuous use of these distinct contractile structures by multiple groups of phages. Taken together, these results indicate plasmid-dependent phages play an under-appreciated evolutionary role in constraining horizontal gene transfer via conjugative plasmids.
|
6
|
Resolving Deleterious and Near-Neutral Effects Requires Different Pooled Fitness Assay Designs. J Mol Evol 2023; 91:325-333. [PMID: 37160452 DOI: 10.1007/s00239-023-10110-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Accepted: 04/06/2023] [Indexed: 05/11/2023]
Abstract
Pooled sequencing-based fitness assays are a powerful and widely used approach to quantifying fitness of thousands of genetic variants in parallel. Despite the throughput of such assays, they are prone to biases in fitness estimates, and errors in measurements are typically larger for deleterious fitness effects, relative to neutral effects. In practice, designing pooled fitness assays involves tradeoffs between the number of timepoints, the sequencing depth, and other parameters to gain as much information as possible within a feasible experiment. Here, we combined simulations and reanalysis of an existing experimental dataset to explore how assay parameters impact measurements of near-neutral and deleterious fitness effects using a standard fitness estimator. We found that sequencing multiple timepoints at relatively modest depth improved estimates of near-neutral fitness effects, but systematically biased measurements of deleterious effects. We showed that, for a fixed total number of reads, deeper sequencing at fewer timepoints improved resolution of deleterious fitness effects. Our results highlight a tradeoff between measurement of deleterious and near-neutral effect sizes for a fixed amount of data and suggest that fitness assay design should be tuned for fitness effects that are relevant to the specific biological question.
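The abstract above does not say which "standard fitness estimator" was used. As a rough illustration only, one common choice in pooled assays is the slope of the log ratio of variant to reference read counts over time. The sketch below assumes that estimator; the function name `log_ratio_fitness` and the `pseudocount` parameter are hypothetical and not taken from the paper.

```python
import math


def log_ratio_fitness(variant_counts, reference_counts, generations, pseudocount=0.5):
    """Estimate relative fitness as the least-squares slope of
    ln(variant / reference) against generations.

    Assumption: this log-ratio regression stands in for the unspecified
    "standard fitness estimator"; it is not the paper's confirmed method.
    """
    if not (len(variant_counts) == len(reference_counts) == len(generations)):
        raise ValueError("all inputs must have the same length")
    if len(generations) < 2:
        raise ValueError("at least two timepoints are required")

    # Log ratio at each timepoint; the pseudocount guards against zero counts,
    # which are frequent for strongly deleterious variants at late timepoints.
    log_ratios = [
        math.log((v + pseudocount) / (r + pseudocount))
        for v, r in zip(variant_counts, reference_counts)
    ]

    # Ordinary least-squares slope of log ratio versus time.
    n = len(generations)
    mean_t = sum(generations) / n
    mean_y = sum(log_ratios) / n
    sxx = sum((t - mean_t) ** 2 for t in generations)
    if sxx == 0:
        raise ValueError("timepoints must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(generations, log_ratios))
    return sxy / sxx


# A variant whose ratio to the reference halves every generation.
slope = log_ratio_fitness([800, 400, 200], [800, 800, 800], [0, 1, 2], pseudocount=0)
```

With low counts the pseudocount dominates the log ratio, which is one simple way to see why deleterious variants that drop out of the pool are hard to measure at modest sequencing depth.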
|
7
|
A swapped genetic code prevents viral infections and gene transfer. Nature 2023; 615:720-727. [PMID: 36922599 PMCID: PMC10151025 DOI: 10.1038/s41586-023-05824-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 07/08/2022] [Accepted: 02/10/2023] [Indexed: 03/17/2023]
Abstract
Engineering the genetic code of an organism has been proposed to provide a firewall from natural ecosystems by preventing viral infections and gene transfer [1-6]. However, numerous viruses and mobile genetic elements encode parts of the translational apparatus [7-9], potentially rendering a genetic-code-based firewall ineffective. Here we show that such mobile transfer RNAs (tRNAs) enable gene transfer and allow viral replication in Escherichia coli despite the genome-wide removal of 3 of the 64 codons and the previously essential cognate tRNA and release factor genes. We then establish a genetic firewall by discovering viral tRNAs that provide exceptionally efficient codon reassignment allowing us to develop cells bearing an amino acid-swapped genetic code that reassigns two of the six serine codons to leucine during translation. This amino acid-swapped genetic code renders cells resistant to viral infections by mistranslating viral proteomes and prevents the escape of synthetic genetic information by engineered reliance on serine codons to produce leucine-requiring proteins. As these cells may have a selective advantage over wild organisms due to virus resistance, we also repurpose a third codon to biocontain this virus-resistant host through dependence on an amino acid not found in nature [10]. Our results may provide the basis for a general strategy to make any organism safely resistant to all natural viruses and prevent genetic information flow into and out of genetically modified organisms.
|
8
|
Mycobacterial nucleoid-associated protein Lsr2 is required for productive mycobacteriophage infection. Nat Microbiol 2023; 8:695-710. [PMID: 36823286 PMCID: PMC10066036 DOI: 10.1038/s41564-023-01333-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/15/2022] [Accepted: 01/23/2023] [Indexed: 02/25/2023]
Abstract
Mycobacteriophages are a diverse group of viruses infecting Mycobacterium with substantial therapeutic potential. However, as this potential becomes realized, the molecular details of phage infection and mechanisms of resistance remain ill-defined. Here we use live-cell fluorescence microscopy to visualize the spatiotemporal dynamics of mycobacteriophage infection in single cells and populations, showing that infection is dependent on the host nucleoid-associated Lsr2 protein. Mycobacteriophages preferentially adsorb at Mycobacterium smegmatis sites of new cell wall synthesis and following DNA injection, Lsr2 reorganizes away from host replication foci to establish zones of phage DNA replication (ZOPR). Cells lacking Lsr2 proceed through to cell lysis when infected but fail to generate consecutive phage bursts that trigger epidemic spread of phage particles to neighbouring cells. Many mycobacteriophages code for their own Lsr2-related proteins, and although their roles are unknown, they do not rescue the loss of host Lsr2.
|
9
|
Lineage abundance estimation for SARS-CoV-2 in wastewater using transcriptome quantification techniques. Genome Biol 2022; 23:236. [PMID: 36348471 PMCID: PMC9643916 DOI: 10.1186/s13059-022-02805-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/03/2021] [Accepted: 10/25/2022] [Indexed: 11/09/2022] Open
Abstract
Effectively monitoring the spread of SARS-CoV-2 mutants is essential to efforts to counter the ongoing pandemic. Predicting lineage abundance from wastewater, however, is technically challenging. We show that by sequencing SARS-CoV-2 RNA in wastewater and applying algorithms initially used for transcriptome quantification, we can estimate lineage abundance in wastewater samples. We find high variability in signal among individual samples, but the overall trends match those observed from sequencing clinical samples. Thus, while clinical sequencing remains a more sensitive technique for population surveillance, wastewater sequencing can be used to monitor trends in mutant prevalence in situations where clinical sequencing is unavailable.
|
10
|
Decreased thermal niche breadth as a trade-off of antibiotic resistance. THE ISME JOURNAL 2022; 16:1843-1852. [PMID: 35422477 PMCID: PMC9213455 DOI: 10.1038/s41396-022-01235-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/18/2021] [Revised: 03/03/2022] [Accepted: 03/31/2022] [Indexed: 01/24/2023]
Abstract
Evolutionary theory predicts that adaptations, including antibiotic resistance, should come with associated fitness costs; yet, many resistance mutations seemingly contradict this prediction by inducing no growth rate deficit. However, most growth assays comparing sensitive and resistant strains have been performed under a narrow range of environmental conditions, which do not reflect the variety of contexts that a pathogenic bacterium might encounter when causing infection. We hypothesized that reduced niche breadth, defined as diminished growth across a diversity of environments, can be a cost of antibiotic resistance. Specifically, we test whether chloramphenicol-resistant Escherichia coli incur disproportionate growth deficits in novel thermal conditions. Here we show that chloramphenicol-resistant bacteria have greater fitness costs at novel temperatures than their antibiotic-sensitive ancestors. In several cases, we observed no resistance cost in growth rate at the historic temperature but saw diminished growth at warmer and colder temperatures. These results were consistent across various genetic mechanisms of resistance. Thus, we propose that decreased thermal niche breadth is an under-documented fitness cost of antibiotic resistance. Furthermore, these results demonstrate that the cost of antibiotic resistance shifts rapidly as the environment changes; these context-dependent resistance costs should select for the rapid gain and loss of resistance as an evolutionary strategy.
|
11
|
Prophages encode phage-defense systems with cognate self-immunity. Cell Host Microbe 2021; 29:1620-1633.e8. [PMID: 34597593 PMCID: PMC8585504 DOI: 10.1016/j.chom.2021.09.002] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 07/23/2020] [Revised: 02/23/2021] [Accepted: 09/03/2021] [Indexed: 12/18/2022]
Abstract
Temperate phages are pervasive in bacterial genomes, existing as vertically inherited islands termed prophages. Prophages are vulnerable to predation of their host bacterium by exogenous phages. Here, we identify BstA, a family of prophage-encoded phage-defense proteins in diverse Gram-negative bacteria. BstA localizes to sites of exogenous phage DNA replication and mediates abortive infection, suppressing the competing phage epidemic. During lytic replication, the BstA-encoding prophage is not itself inhibited by BstA due to self-immunity conferred by the anti-BstA (aba) element, a short stretch of DNA within the bstA locus. Inhibition of phage replication by distinct BstA proteins from Salmonella, Klebsiella, and Escherichia prophages is generally interchangeable, but each possesses a cognate aba element. The specificity of the aba element ensures that immunity is exclusive to the replicating prophage, preventing exploitation by variant BstA-encoding phages. The BstA protein allows prophages to defend host cells against exogenous phage attack without sacrificing the ability to replicate lytically. BstA is an abortive infection protein found in prophages of Gram-negative bacteria. aba, a short DNA sequence within the bstA locus, acts as a self-immunity element. aba gives BstA-encoding prophages immunity to BstA-driven abortive infection. Variant BstA proteins have distinct and cognate aba elements.
|
12
|
Variant abundance estimation for SARS-CoV-2 in wastewater using RNA-Seq quantification. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2021:2021.08.31.21262938. [PMID: 34494031 PMCID: PMC8423229 DOI: 10.1101/2021.08.31.21262938] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/14/2022]
Abstract
Effectively monitoring the spread of SARS-CoV-2 variants is essential to efforts to counter the ongoing pandemic. Wastewater monitoring of SARS-CoV-2 RNA has proven an effective and efficient technique to approximate COVID-19 case rates in the population. Predicting variant abundances from wastewater, however, is technically challenging. Here we show that by sequencing SARS-CoV-2 RNA in wastewater and applying computational techniques initially used for RNA-Seq quantification, we can estimate the abundance of variants in wastewater samples. We show by sequencing samples from wastewater and clinical isolates in Connecticut U.S.A. between January and April 2021 that the temporal dynamics of variant strains broadly correspond. We further show that this technique can be used with other wastewater sequencing techniques by expanding to samples taken across the United States in a similar timeframe. We find high variability in signal among individual samples, and limited ability to detect the presence of variants with clinical frequencies <10%; nevertheless, the overall trends match what we observed from sequencing clinical samples. Thus, while clinical sequencing remains a more sensitive technique for population surveillance, wastewater sequencing can be used to monitor trends in variant prevalence in situations where clinical sequencing is unavailable or impractical.
|
13
|
Abstract
de Bruijn graphs play an essential role in bioinformatics, yet they lack a universal scalable representation. Here, we introduce simplitigs as a compact, efficient, and scalable representation, and ProphAsm, a fast algorithm for their computation. For the example of assemblies of model organisms and two bacterial pan-genomes, we compare simplitigs to unitigs, the best existing representation, and demonstrate that simplitigs provide a substantial improvement in the cumulative sequence length and their number. When combined with the commonly used Burrows-Wheeler Transform index, simplitigs reduce memory, and index loading and query times, as demonstrated with large-scale examples of GenBank bacterial pan-genomes.
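As a rough illustration of the idea behind simplitigs, the sketch below greedily grows a sequence from an arbitrary seed k-mer, extending it to the right and then to the left while unused k-mers remain, so that every k-mer ends up in exactly one output string. It is a simplified toy and not the ProphAsm implementation: the function name `greedy_simplitigs` is hypothetical, and, unlike real genomic tools, the sketch treats k-mers as plain forward-strand strings and ignores reverse complements.

```python
def greedy_simplitigs(kmers, k):
    """Cover a set of k-mers with strings in which each k-mer occurs exactly once.

    Simplifying assumption: forward strand only (no reverse complements).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    remaining = set(kmers)
    result = []
    while remaining:
        # Start a new simplitig from an arbitrary unused k-mer.
        contig = remaining.pop()

        # Extend to the right while some unused k-mer overlaps by k-1 bases.
        extended = True
        while extended:
            extended = False
            suffix = contig[-(k - 1):]
            for base in "ACGT":
                candidate = suffix + base
                if candidate in remaining:
                    remaining.remove(candidate)
                    contig += base
                    extended = True
                    break

        # Extend to the left in the same way.
        extended = True
        while extended:
            extended = False
            prefix = contig[:k - 1]
            for base in "ACGT":
                candidate = base + prefix
                if candidate in remaining:
                    remaining.remove(candidate)
                    contig = base + contig
                    extended = True
                    break

        result.append(contig)
    return result


# Toy input: all 4-mers of a short sequence.
k = 4
sequence = "ACGTTGCAAGGCTA"
input_kmers = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
simplitigs = greedy_simplitigs(input_kmers, k)
```

Because each output string of length n carries n - k + 1 k-mers, the cumulative length equals the number of k-mers plus (k - 1) times the number of strings, so fewer, longer strings give a more compact representation.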
|
14
|
Barcoded microbial system for high-resolution object provenance. Science 2020; 368:1135-1140. [PMID: 32499444 DOI: 10.1126/science.aba5584] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 12/13/2019] [Accepted: 03/31/2020] [Indexed: 02/22/2024]
Abstract
Determining where an object has been is a fundamental challenge for human health, commerce, and food safety. Location-specific microbes in principle offer a cheap and sensitive way to determine object provenance. We created a synthetic, scalable microbial spore system that identifies object provenance in under 1 hour at meter-scale resolution and near single-spore sensitivity and can be safely introduced into and recovered from the environment. This system solves the key challenges in object provenance: persistence in the environment, scalability, rapid and facile decoding, and biocontainment. Our system is compatible with SHERLOCK, a Cas13a RNA-guided nucleic acid detection assay, facilitating its implementation in a wide range of applications.
|
15
|
Escape mutations circumvent a tradeoff between resistance to a beta-lactam and resistance to a beta-lactamase inhibitor. Nat Commun 2020; 11:2029. [PMID: 32332717 PMCID: PMC7181632 DOI: 10.1038/s41467-020-15666-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/24/2019] [Accepted: 03/13/2020] [Indexed: 11/09/2022] Open
Abstract
Beta-lactamase inhibitors are increasingly used to counteract antibiotic resistance mediated by beta-lactamase enzymes. These inhibitors compete with the beta-lactam antibiotic for the same binding site on the beta-lactamase, thus generating an evolutionary tradeoff: mutations that increase the enzyme's beta-lactamase activity tend to increase also its susceptibility to the inhibitor. Here, we investigate how common and accessible are mutants that escape this adaptive tradeoff. Screening a deep mutant library of the blaampC beta-lactamase gene of Escherichia coli, we identified mutations that allow growth at beta-lactam concentrations far exceeding those inhibiting growth of the wildtype strain, even in the presence of the enzyme inhibitor (avibactam). These escape mutations are rare and drug-specific, and some combinations of avibactam with beta-lactam drugs appear to prevent such escape phenotypes. Our results, showing differential adaptive potential of blaampC to combinations of avibactam and different beta-lactam antibiotics, suggest that it may be possible to identify treatments that are more resilient to evolution of resistance.
|
16
|
Rapid inference of antibiotic resistance and susceptibility by genomic neighbour typing. Nat Microbiol 2020; 5:455-464. [PMID: 32042129 PMCID: PMC7044115 DOI: 10.1038/s41564-019-0656-6] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 08/06/2019] [Accepted: 12/06/2019] [Indexed: 11/09/2022]
Abstract
Surveillance of drug-resistant bacteria is essential for healthcare providers to deliver effective empirical antibiotic therapy. However, traditional molecular epidemiology does not typically occur on a timescale that could affect patient treatment and outcomes. Here, we present a method called 'genomic neighbour typing' for inferring the phenotype of a bacterial sample by identifying its closest relatives in a database of genomes with metadata. We show that this technique can infer antibiotic susceptibility and resistance for both Streptococcus pneumoniae and Neisseria gonorrhoeae. We implemented this with rapid k-mer matching, which, when used on Oxford Nanopore MinION data, can run in real time. This resulted in the determination of resistance within 10 min (91% sensitivity and 100% specificity for S. pneumoniae and 81% sensitivity and 100% specificity for N. gonorrhoeae from isolates with a representative database) of starting sequencing, and within 4 h of sample collection (75% sensitivity and 100% specificity for S. pneumoniae) for clinical metagenomic sputum samples. This flexible approach has wide application for pathogen surveillance and may be used to greatly accelerate appropriate empirical antibiotic treatment.
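The core idea of neighbour typing, as described above, is to find the closest relative of a sample in a database of genomes with metadata and to report that relative's phenotype. The toy sketch below does this by counting shared k-mers between reads and each reference. It is only an illustration under stated assumptions: the names `kmer_set` and `nearest_neighbour`, the tiny sequences, and the plain hit-count scoring are all hypothetical, and the published method's actual index and scoring are not reproduced here.

```python
def kmer_set(sequence, k):
    """Return the set of all k-mers in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def nearest_neighbour(reads, database, k=5):
    """Pick the reference sharing the most k-mers with the reads.

    `database` maps a reference name to (sequence, phenotype).
    Returns (best_name, phenotype_of_best, scores).
    """
    # Pre-compute the k-mer set of every reference genome.
    index = {name: kmer_set(seq, k) for name, (seq, _) in database.items()}
    scores = {name: 0 for name in database}

    # Reads are processed one at a time, so scores could be inspected
    # as sequencing data arrive.
    for read in reads:
        for kmer in kmer_set(read, k):
            for name, ref_kmers in index.items():
                if kmer in ref_kmers:
                    scores[name] += 1

    best = max(scores, key=scores.get)
    return best, database[best][1], scores


# Hypothetical two-genome database with made-up sequences and phenotypes.
toy_database = {
    "ref_S": ("ACGTACGTTAGC", "susceptible"),
    "ref_R": ("TTGACCGGATAC", "resistant"),
}
toy_reads = ["GACCGGA", "CGGATAC"]
best_name, predicted, hit_counts = nearest_neighbour(toy_reads, toy_database, k=5)
```

The prediction is only as good as the database: if no close relative with known phenotype is present, the best match can still be distant, which is why the abstract qualifies its accuracy figures with "a representative database".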
|
17
|
The fitness landscape of the African Salmonella Typhimurium ST313 strain D23580 reveals unique properties of the pBT1 plasmid. PLoS Pathog 2019; 15:e1007948. [PMID: 31560731 PMCID: PMC6785131 DOI: 10.1371/journal.ppat.1007948] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/22/2019] [Revised: 10/09/2019] [Accepted: 08/30/2019] [Indexed: 12/13/2022] Open
Abstract
We have used a transposon insertion sequencing (TIS) approach to establish the fitness landscape of the African Salmonella enterica serovar Typhimurium ST313 strain D23580, to complement our previous comparative genomic and functional transcriptomic studies. We used a genome-wide transposon library with insertions every 10 nucleotides to identify genes required for survival and growth in vitro and during infection of murine macrophages. The analysis revealed genomic regions important for fitness under two in vitro growth conditions. Overall, 724 coding genes were required for optimal growth in LB medium, and 851 coding genes were required for growth in SPI-2-inducing minimal medium. These findings were consistent with the essentiality analyses of other S. Typhimurium ST19 and S. Typhi strains. The global mutagenesis approach also identified 60 sRNAs and 413 intergenic regions required for growth in at least one in vitro growth condition. By infecting murine macrophages with the transposon library, we identified 68 genes that were required for intra-macrophage replication but did not impact fitness in vitro. None of these genes were unique to S. Typhimurium D23580, consistent with a high conservation of gene function between S. Typhimurium ST313 and ST19 and suggesting that novel virulence factors are not involved in the interaction of strain D23580 with murine macrophages. We discovered that transposon insertions rarely occurred in many pBT1 plasmid-encoded genes (36), compared with genes carried by the pSLT-BT virulence plasmid and other bacterial plasmids. The key essential protein encoded by pBT1 is a cysteinyl-tRNA synthetase, and our enzymological analysis revealed that the plasmid-encoded CysRSpBT1 had a lower ability to charge tRNA than the chromosomally-encoded CysRSchr enzyme. 
The presence of aminoacyl-tRNA synthetases in plasmids from a range of Gram-negative and Gram-positive bacteria suggests that plasmid-encoded essential genes are more common than had been appreciated.
|
18
|
|
19
|
Engineered bacteria can function in the mammalian gut long-term as live diagnostics of inflammation. Nat Biotechnol 2017; 35:653-658. [PMID: 28553941 DOI: 10.1038/nbt.3879] [Citation(s) in RCA: 206] [Impact Index Per Article: 29.4] [Received: 09/20/2016] [Accepted: 04/07/2017] [Indexed: 02/07/2023]
Abstract
Bacteria can be engineered to function as diagnostics or therapeutics in the mammalian gut but commercial translation of technologies to accomplish this has been hindered by the susceptibility of synthetic genetic circuits to mutation and unpredictable function during extended gut colonization. Here, we report stable, engineered bacterial strains that maintain their function for 6 months in the mouse gut. We engineered a commensal murine Escherichia coli strain to detect tetrathionate, which is produced during inflammation. Using our engineered diagnostic strain, which retains memory of exposure in the gut for analysis by fecal testing, we detected tetrathionate in both infection-induced and genetic mouse models of inflammation over 6 months. The synthetic genetic circuits in the engineered strain were genetically stable and functioned as intended over time. The durable performance of these strains confirms the potential of engineered bacteria as living diagnostics.
|
20
|
Spatiotemporal microbial evolution on antibiotic landscapes. Science 2016; 353:1147-1151. [PMID: 27609891 DOI: 10.1126/science.aag0822] [Citation(s) in RCA: 294] [Impact Index Per Article: 42.0] [Received: 05/06/2016] [Accepted: 07/28/2016] [Indexed: 01/03/2023]
Abstract
A key aspect of bacterial survival is the ability to evolve while migrating across spatially varying environmental challenges. Laboratory experiments, however, often study evolution in well-mixed systems. Here, we introduce an experimental device, the microbial evolution and growth arena (MEGA)-plate, in which bacteria spread and evolved on a large antibiotic landscape (120 × 60 centimeters) that allowed visual observation of mutation and selection in a migrating bacterial front. While resistance increased consistently, multiple coexisting lineages diversified both phenotypically and genotypically. Analyzing mutants at and behind the propagating front, we found that evolution is not always led by the most resistant mutants; highly resistant mutants may be trapped behind more sensitive lineages. The MEGA-plate provides a versatile platform for studying microbial adaption and directly visualizing evolutionary dynamics.
|
21
|
Barcode extension for analysis and reconstruction of structures. Nat Commun 2017; 8:14698. [PMID: 28287117 PMCID: PMC5355802 DOI: 10.1038/ncomms14698] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/27/2016] [Accepted: 01/24/2017] [Indexed: 12/17/2022] Open
Abstract
Collections of DNA sequences can be rationally designed to self-assemble into predictable three-dimensional structures. The geometric and functional diversity of DNA nanostructures created to date has been enhanced by improvements in DNA synthesis and computational design. However, existing methods for structure characterization typically image the final product or laboriously determine the presence of individual, labelled strands using gel electrophoresis. Here we introduce a new method of structure characterization that uses barcode extension and next-generation DNA sequencing to quantitatively measure the incorporation of every strand into a DNA nanostructure. By quantifying the relative abundances of distinct DNA species in product and monomer bands, we can study the influence of geometry and sequence on assembly. We have tested our method using 2D and 3D DNA brick and DNA origami structures. Our method is general and should be extensible to a wide variety of DNA nanostructures. Techniques for structural characterization and quantification of DNA origami are still poorly developed, despite advances in other aspects of DNA nanotechnology. Here, the authors combine barcoding and next generation sequencing to simultaneously image and quantify self-assembled DNA nanostructures.
|
22
|
Abstract
Antibiotic treatment has two conflicting effects: the desired, immediate effect of inhibiting bacterial growth and the undesired, long-term effect of promoting the evolution of resistance. Although these contrasting outcomes seem inextricably linked, recent work has revealed several ways by which antibiotics can be combined to inhibit bacterial growth while, counterintuitively, selecting against resistant mutants. Decoupling treatment efficacy from the risk of resistance can be achieved by exploiting specific interactions between drugs, and the ways in which resistance mutations to a given drug can modulate these interactions or increase the sensitivity of the bacteria to other compounds. Although their practical application requires much further development and validation, and relies on advances in genomic diagnostics, these discoveries suggest novel paradigms that may restrict or even reverse the evolution of resistance.
|
23
|
Delayed commitment to evolutionary fate in antibiotic resistance fitness landscapes. Nat Commun 2015; 6:7385. [PMID: 26060115 DOI: 10.1038/ncomms8385] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Received: 05/06/2014] [Accepted: 05/01/2015] [Indexed: 11/09/2022] Open
Abstract
Predicting evolutionary paths to antibiotic resistance is key for understanding and controlling drug resistance. When considering a single final resistant genotype, epistatic contingencies among mutations restrict evolution to a small number of adaptive paths. Less attention has been given to multi-peak landscapes, and while specific peaks can be favoured, it is unknown whether and how early a commitment to final fate is made. Here we characterize a multi-peaked adaptive landscape for trimethoprim resistance by constructing all combinatorial alleles of seven resistance-conferring mutations in dihydrofolate reductase. We observe that epistatic interactions increase rather than decrease the accessibility of each peak; while they restrict the number of direct paths, they generate more indirect paths, where mutations are adaptively gained and later adaptively lost or changed. This enhanced accessibility allows evolution to proceed through many adaptive steps while delaying commitment to genotypic fate, hindering our ability to predict or control evolutionary outcomes.
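The combinatorics behind this abstract can be illustrated with a small sketch. Seven mutations give 2^7 = 128 genotypes and 7! = 5040 direct paths (each mutation gained exactly once) from the wild type to the full mutant. The code below counts how many of those direct paths are adaptive, meaning fitness rises at every step. It is a minimal illustration under stated assumptions: the function name `count_direct_paths`, the bitmask encoding, and both toy fitness functions are hypothetical and are not taken from the paper, and the indirect paths described in the abstract (mutations gained and later lost) are not modeled.

```python
from functools import lru_cache

def count_direct_paths(n_mutations, fitness):
    """Count direct adaptive paths from genotype 0 to the full mutant.

    Genotypes are bitmasks: bit i set means mutation i is present.
    A direct path gains one mutation per step and never loses one;
    it is adaptive if fitness strictly increases at every step.
    """
    full = (1 << n_mutations) - 1

    @lru_cache(maxsize=None)
    def paths_from(genotype):
        if genotype == full:
            return 1
        total = 0
        for i in range(n_mutations):
            bit = 1 << i
            if not genotype & bit:          # mutation i not yet present
                nxt = genotype | bit
                if fitness(nxt) > fitness(genotype):
                    total += paths_from(nxt)
        return total

    return paths_from(0)

# Toy landscape 1 (hypothetical): purely additive, no epistasis.
def additive(genotype):
    return bin(genotype).count("1")

# Toy landscape 2 (hypothetical): two mutations with sign epistasis.
# Mutation 0 alone is deleterious, so only the path via mutation 1 is adaptive.
SIGN_EPISTASIS = {0b00: 1.0, 0b01: 0.5, 0b10: 2.0, 0b11: 3.0}

def epistatic(genotype):
    return SIGN_EPISTASIS[genotype]
```

On the additive toy landscape every ordering of the mutations is adaptive, whereas the sign-epistatic toy landscape blocks one of its two direct paths, which is the sense in which epistasis restricts direct paths.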
|
24
|
Abstract
Whole-genome sequencing has become an indispensable tool of modern biology. However, the cost of sample preparation relative to the cost of sequencing remains high, especially for small genomes where the former is dominant. Here we present a protocol for rapid and inexpensive preparation of hundreds of multiplexed genomic libraries for Illumina sequencing. By carrying out the Nextera tagmentation reaction in small volumes, replacing costly reagents with cheaper equivalents, and omitting unnecessary steps, we achieve a cost of library preparation of $8 per sample, approximately 6 times cheaper than the standard Nextera XT protocol. Furthermore, our procedure takes less than 5 hours for 96 samples. Several hundred samples can then be pooled on the same HiSeq lane via custom barcodes. Our method will be useful for re-sequencing of microbial or viral genomes, including those from evolution experiments, genetic screens, and environmental samples, as well as for other sequencing applications including large amplicon, open chromosome, artificial chromosomes, and RNA sequencing.
|
25
|
The Separatrix Algorithm for synthesis and analysis of stochastic simulations with applications in disease modeling. PLoS One 2014; 9:e103467. [PMID: 25078087 PMCID: PMC4117517 DOI: 10.1371/journal.pone.0103467] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/16/2013] [Accepted: 07/03/2014] [Indexed: 11/18/2022] Open
Abstract
Decision makers in epidemiology and other disciplines are faced with the daunting challenge of designing interventions that will be successful with high probability and robust against a multitude of uncertainties. To facilitate the decision-making process in the context of a goal-oriented objective (e.g., eradicate polio by [Formula: see text]), stochastic models can be used to map the probability of achieving the goal as a function of parameters. Each run of a stochastic model can be viewed as a Bernoulli trial in which "success" is returned if and only if the goal is achieved in simulation. However, each run can take a significant amount of time to complete, and many replicates are required to characterize each point in parameter space, so specialized algorithms are required to locate desirable interventions. To address this need, we present the Separatrix Algorithm, which strategically locates parameter combinations that are expected to achieve the goal with a user-specified probability of success (e.g., 95%). Technically, the algorithm iteratively combines density-corrected binary kernel regression with a novel information-gathering experiment design to produce results that are asymptotically correct and work well in practice. The Separatrix Algorithm is demonstrated on several test problems, and on a detailed individual-based simulation of malaria.
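The idea of treating each simulation run as a Bernoulli trial and smoothing the outcomes over parameter space can be sketched as follows. This is a simplified illustration only: it uses a plain Gaussian-kernel (Nadaraya-Watson) estimate in one dimension, not the density-corrected regression or the experiment design described in the abstract. The function names, the bandwidth, the grid on [0, 1], and the sample data are all hypothetical.

```python
import math

def kernel_success_probability(x, samples, bandwidth=0.1):
    """Kernel-weighted estimate of P(success | parameter = x).

    samples: iterable of (parameter_value, outcome) pairs, where outcome
    is 1 if the simulated goal was achieved and 0 otherwise.
    """
    num = 0.0
    den = 0.0
    for xi, yi in samples:
        # Gaussian kernel weight: nearby runs count more than distant ones.
        w = math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
        num += w * yi
        den += w
    return num / den if den > 0 else float("nan")

def locate_threshold(samples, target=0.95, grid_size=101, bandwidth=0.1):
    """Return the first grid point in [0, 1] whose estimated success
    probability reaches `target`, or None if no grid point does."""
    for i in range(grid_size):
        x = i / (grid_size - 1)
        if kernel_success_probability(x, samples, bandwidth) >= target:
            return x
    return None

# Hypothetical outcomes: low parameter values fail, high values succeed.
runs = [(0.0, 0), (0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1), (1.0, 1)]
threshold = locate_threshold(runs, target=0.95)
```

A real application would choose where to run the next simulation adaptively rather than scanning a fixed grid, which is the part this sketch leaves out.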
|
26
|
Abstract
MOTIVATION: The exponential growth of protein sequence databases has increasingly made the fundamental question of searching for homologs a computational bottleneck. The amount of unique data, however, is not growing nearly as fast; we can exploit this fact to greatly accelerate homology search. Accelerating programs in the popular PSI/DELTA-BLAST family of tools will speed up not only homology search directly but also the huge collection of other current programs that primarily interact with large protein databases via precisely these tools. RESULTS: We introduce a suite of homology search tools, powered by compressively accelerated protein BLAST (CaBLASTP), which are significantly faster than, and comparably accurate to, all known state-of-the-art tools, including HHblits, DELTA-BLAST and PSI-BLAST. Further, our tools are implemented in a manner that allows direct substitution into existing analysis pipelines. The key idea is a local similarity-based compression scheme that allows us to operate directly on the compressed data. Importantly, CaBLASTP's runtime scales almost linearly in the amount of unique data, as opposed to current BLASTP variants, which scale linearly in the size of the full protein database being searched. Our compressive algorithms will speed up many tasks, such as protein structure prediction and orthology mapping, which rely heavily on homology search. AVAILABILITY: CaBLASTP is available under the GNU Public License at http://cablastp.csail.mit.edu/. CONTACT: bab@mit.edu.
|
27
|
|
28
|
Abstract
We describe IsoBase, a database identifying functionally related proteins across five major eukaryotic model organisms: Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Mus musculus and Homo sapiens. Nearly all existing algorithms for orthology detection are based on sequence comparison. Although these have been successful in orthology prediction to some extent, we seek to go beyond these methods by integrating sequence data and protein–protein interaction (PPI) networks to help identify true functionally related proteins. With that motivation, we introduce IsoBase, the first publicly available ortholog database that focuses on functionally related proteins. The groupings were computed using the IsoRankN algorithm, which uses spectral methods to combine sequence and PPI data and produce clusters of functionally related proteins. These clusters compare favorably with those from existing approaches: proteins within an IsoBase cluster are more likely to share similar Gene Ontology (GO) annotation. A total of 48,120 proteins were clustered into 12,693 functionally related groups. The IsoBase database may be browsed for functionally related proteins across two or more species and may also be queried by accession numbers, species-specific identifiers, gene name or keyword. The database is freely available for download at http://isobase.csail.mit.edu/.
|
29
|
Abstract
MOTIVATION: With the increasing availability of large protein-protein interaction networks, the question of protein network alignment is becoming central to systems biology. Network alignment is further delineated into two sub-problems: local alignment, which finds small conserved motifs across networks, and global alignment, which attempts to find a best mapping between all nodes of the two networks. In this article, our aim is to improve upon existing global alignment results. Better network alignment will enable, among other things, more accurate identification of functional orthologs across species. RESULTS: We introduce IsoRankN (IsoRank-Nibble), a global multiple-network alignment tool based on spectral clustering on the induced graph of pairwise alignment scores. IsoRankN outperforms existing algorithms for global network alignment in coverage and consistency on multiple alignments of the five available eukaryotic networks. Being based on spectral methods, IsoRankN is both error tolerant and computationally efficient. AVAILABILITY: Our software is freely available for non-commercial purposes on request from http://isorank.csail.mit.edu/.
|
30
|
Conserved quantities and adaptation to the edge of chaos. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2006; 73:056210. [PMID: 16803029 DOI: 10.1103/physreve.73.056210] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 05/28/2005] [Revised: 03/09/2006] [Indexed: 05/10/2023]
Abstract
Certain dynamical systems, such as the shift map and the logistic map, have an edge of chaos in their parameter spaces. On one side of this edge, the dynamics are chaotic for many parameter values; on the other side, they are periodic. We find that discrete-time dynamical systems with wavelet-filtered feedback from the dynamical variable to the parameters are attracted to a narrow parameter range near the edge of chaos, the periodic boundary regime. We show that the migration from the chaotic regime to the periodic boundary regime can be attributed to a conserved quantity, and find that such adaptation to the edge of chaos is accompanied by a depopulation of the chaotic regime. We use this conserved quantity to determine the location of the periodic boundary regime and show that its size is proportional to the size of the feedback. Further, we compute the dynamics of the probability density for the parameter for a specific example.
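The distinction between the periodic and chaotic sides of the edge can be made concrete for the logistic map x -> r x (1 - x) by estimating its Lyapunov exponent, which is negative for periodic dynamics and positive for chaotic dynamics. The sketch below is a minimal illustration of that standard diagnostic only. It does not implement the wavelet-filtered feedback or the conserved quantity from the abstract, and the function name, iteration counts, and starting value are assumptions chosen for the example.

```python
import math

def lyapunov_logistic(r, n_transient=1000, n_iter=5000, x0=0.4):
    """Estimate the Lyapunov exponent of the logistic map x -> r*x*(1-x).

    The exponent is the orbit average of log|f'(x)|, with f'(x) = r*(1 - 2x).
    """
    x = x0
    # Discard the transient so the average is taken on the attractor.
    for _ in range(n_transient):
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        x = r * x * (1.0 - x)
        deriv = abs(r * (1.0 - 2.0 * x))
        # Guard against log(0) if the orbit lands exactly on x = 0.5.
        total += math.log(max(deriv, 1e-300))
    return total / n_iter

periodic_exponent = lyapunov_logistic(3.2)  # period-2 regime
chaotic_exponent = lyapunov_logistic(3.9)   # chaotic regime
```

Scanning r between these two values and looking for the sign change of the exponent gives a rough numerical picture of where periodic windows and chaotic bands meet.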
|