1
Valencia-Toxqui G, Ramsey J. How to introduce a new bacteriophage on the block: a short guide to phage classification. J Virol 2024:e0182123. [PMID: 39264154 DOI: 10.1128/jvi.01821-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/13/2024] Open
Abstract
Bacteriophage (phage) studies established the field of molecular biology and continue to propel life science research forward due to their diversity, abundance, and potential applications. In this Gem article, we orient newcomers to four common ways phages are currently classified: infection cycle, morphology, taxonomy, and supergroup. By using these classifications, researchers can determine where any novel phage fits into the scheme of the known "phage-verse".
Affiliation(s)
- Guadalupe Valencia-Toxqui
- Department of Biology, Center for Phage Technology, Texas A&M University, College Station, Texas, USA
- Jolene Ramsey
- Department of Biology, Center for Phage Technology, Texas A&M University, College Station, Texas, USA
2
Tokuda M, Shintani M. Microbial evolution through horizontal gene transfer by mobile genetic elements. Microb Biotechnol 2024; 17:e14408. [PMID: 38226780 PMCID: PMC10832538 DOI: 10.1111/1751-7915.14408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 12/20/2023] [Accepted: 01/02/2024] [Indexed: 01/17/2024] Open
Abstract
Mobile genetic elements (MGEs) are crucial for horizontal gene transfer (HGT) in bacteria and facilitate their rapid evolution and adaptation. MGEs include plasmids, integrative and conjugative elements, transposons, insertion sequences and bacteriophages. Notably, the spread of antimicrobial resistance genes (ARGs), which poses a serious threat to public health, is primarily attributable to HGT through MGEs. This mini-review aims to provide an overview of the mechanisms by which MGEs mediate HGT in microbes. Specifically, the behaviour of conjugative plasmids in different environments and conditions is discussed, and recent methodologies for tracing the dynamics of MGEs are summarised. A comprehensive understanding of the mechanisms underlying HGT and the role of MGEs in bacterial evolution and adaptation is important to develop strategies to combat the spread of ARGs.
Affiliation(s)
- Maho Tokuda
- Department of Environment and Energy Systems, Graduate School of Science and Technology, Shizuoka University, Hamamatsu, Japan
- Masaki Shintani
- Department of Environment and Energy Systems, Graduate School of Science and Technology, Shizuoka University, Hamamatsu, Japan
- Research Institute of Green Science and Technology, Shizuoka University, Hamamatsu, Japan
- Japan Collection of Microorganisms, RIKEN BioResource Research Center, Ibaraki, Japan
- Graduate School of Integrated Science and Technology, Shizuoka University, Hamamatsu, Japan
3
Arnau V, Díaz-Villanueva W, Mifsut Benet J, Villasante P, Beamud B, Mompó P, Sanjuan R, González-Candelas F, Domingo-Calap P, Džunková M. Inference of the Life Cycle of Environmental Phages from Genomic Signature Distances to Their Hosts. Viruses 2023; 15:v15051196. [PMID: 37243281 DOI: 10.3390/v15051196] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/14/2023] [Revised: 05/17/2023] [Accepted: 05/17/2023] [Indexed: 05/28/2023] Open
Abstract
The environmental impact of uncultured phages is shaped by their preferred life cycle (lytic or lysogenic). However, our ability to predict it is very limited. We aimed to discriminate between lytic and lysogenic phages by comparing the similarity of their genomic signatures to those of their hosts, reflecting their co-evolution. We tested two approaches: (1) similarities of tetramer relative frequencies, (2) alignment-free comparisons based on exact k = 14 oligonucleotide matches. First, we explored 5126 reference bacterial host strains and 284 associated phages and found an approximate threshold for distinguishing lysogenic and lytic phages using both oligonucleotide-based methods. The analysis of 6482 plasmids revealed the potential for horizontal gene transfer between different host genera and, in some cases, distant bacterial taxa. Subsequently, we experimentally analyzed combinations of 138 Klebsiella pneumoniae strains and their 41 phages and found that the phages with the largest number of interactions with these strains in the laboratory had the shortest genomic distances to K. pneumoniae. We then applied our methods to 24 single-cells from a hot spring biofilm containing 41 uncultured phage-host pairs, and the results were compatible with the lysogenic life cycle of phages detected in this environment. In conclusion, oligonucleotide-based genome analysis methods can be used for predictions of (1) life cycles of environmental phages, (2) phages with the broadest host range in culture collections, and (3) potential horizontal gene transfer by plasmids.
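The two oligonucleotide-based comparisons named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and both the Euclidean distance and the "fraction of shared k-mers" score are assumptions, since the abstract does not state which similarity measure was used.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"


def kmer_frequencies(seq, k=4):
    """Relative frequency of every k-mer over ACGT.

    Windows containing other symbols (e.g. N) are skipped.
    """
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product(BASES, repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return {kmer: (n / total if total else 0.0) for kmer, n in counts.items()}


def signature_distance(seq_a, seq_b, k=4):
    """Euclidean distance between two k-mer frequency profiles (assumed metric)."""
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    return sqrt(sum((fa[m] - fb[m]) ** 2 for m in fa))


def shared_kmer_fraction(query, reference, k=14):
    """Fraction of k-mers in `query` that occur exactly in `reference`."""
    query, reference = query.upper(), reference.upper()
    ref_kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    query_kmers = [query[i:i + k] for i in range(len(query) - k + 1)]
    if not query_kmers:
        return 0.0
    return sum(1 for m in query_kmers if m in ref_kmers) / len(query_kmers)
```

Under this sketch, a phage whose tetramer profile lies close to that of its host (small `signature_distance`), or that shares many exact 14-mers with it, would be the kind of candidate the abstract associates with a lysogenic life cycle; any decision threshold would have to be calibrated on reference data.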
Affiliation(s)
- Vicente Arnau
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
- Wladimiro Díaz-Villanueva
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
- Jorge Mifsut Benet
- Department of Space, Earth and Environment, Chalmers University of Technology, 41296 Gothenburg, Sweden
- Beatriz Beamud
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- Paula Mompó
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- Rafael Sanjuan
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Fernando González-Candelas
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
- Pilar Domingo-Calap
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Mária Džunková
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
4
de la Fuente R, Díaz-Villanueva W, Arnau V, Moya A. Genomic Signature in Evolutionary Biology: A Review. BIOLOGY 2023; 12:biology12020322. [PMID: 36829597 PMCID: PMC9953303 DOI: 10.3390/biology12020322] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 02/11/2023] [Accepted: 02/13/2023] [Indexed: 02/19/2023]
Abstract
Organisms are unique physical entities in which information is stored and continuously processed. The digital nature of DNA sequences enables the construction of a dynamic information reservoir. However, the distinction between the hardware and software components in the information flow is crucial to identify the mechanisms generating specific genomic signatures. In this work, we perform a bibliometric analysis to identify the different purposes of looking for particular patterns in DNA sequences associated with a given phenotype. This study has enabled us to make a conceptual breakdown of the genomic signature and differentiate the leading applications. On the one hand, it refers to gene expression profiling associated with a biological function, which may be shared across taxa. This signature is the focus of study in precision medicine. On the other hand, it also refers to characteristic patterns in species-specific DNA sequences. This interpretation plays a key role in comparative genomics, identifying evolutionary relationships. Looking at the relevant studies in our bibliographic database, we highlight the main factors causing heterogeneities in genome composition and how they can be quantified. All these findings lead us to reformulate some questions relevant to evolutionary biology.
Affiliation(s)
- Rebeca de la Fuente
- Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Wladimiro Díaz-Villanueva
- Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Vicente Arnau
- Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Andrés Moya
- Institute of Integrative Systems Biology (I2Sysbio), University of Valencia and Spanish Research Council (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
5
Karaynir A, Salih H, Bozdoğan B, Güçlü Ö, Keskin D. Isolation and characterization of Brochothrix phage ADU4. Virus Res 2022; 321:198902. [PMID: 36064042 DOI: 10.1016/j.virusres.2022.198902] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/30/2022] [Revised: 07/22/2022] [Accepted: 08/25/2022] [Indexed: 12/24/2022]
Abstract
B. thermosphacta is a psychrotrophic bacterium that often forms the predominant part of the spoilage microflora of aerobically and anaerobically stored meats. Bacteriophages are natural enemies of bacteria and their potential for use in environmentally friendly biocontrol of specific pathogens in food is being intensively studied. In this study, we reported the isolation and characterization of the newly isolated lytic Brochothrix phage ADU4, which is capable of infecting the B. thermosphacta bacterium. For the characterization of Brochothrix phage ADU4; host range, multiplicity of infection values (MOI), phage growth parameters (latent period and burst size), stability at various temperatures and pH, reduction growth of bacteria, effect on biofilm, and molecular characterization were investigated. The spot-test analysis showed positivity with B. thermosphacta strains, while no infection was observed in any other species and genera of bacteria tested. The optimal MOI value of the phage was determined as 0.1. The phage latent period and burst sizes were 40-50 min and 311 PFU/ml per infected host cell, respectively by one-step growth curve analysis. Brochothrix phage ADU4 reduced bacteria immediately after infection, which is shown by optical density (OD) measurement and colony counting (<10 CFU/ml) for 3 days. The degradation of B. thermosphacta in biofilm by Brochothrix phage ADU4 was analyzed and it was found that high titer phage breakdown the existing biofilm and also persistently inhibited biofilm formation. Brochothrix phage ADU4 genome was found to be 127,819 bp, and GC content 41.65%. The genome contains 217 putative open reading frames (ORFs), 4 tRNAs, and additionally, no known virulence and antibiotic resistance genes (AMR) were identified. Brochothrix phage ADU4 showed a high identity (96.09%) to the A9 phage that belongs to the Herelleviridae family. 
Nevertheless, the assembly module and its surrounding region appeared less conserved, and some DNA fragments in the Brochothrix phage ADU4 genome were not found in the A9 genome and vice versa. A9 contains TnpB, a transposase accessory protein involved in lysogenicity, which is absent in Brochothrix phage ADU4. In contrast, Brochothrix phage ADU4 had auxiliary metabolic genes (AMG) mostly carried by lytic phages. All these results showed that the Brochothrix phage ADU4 has excellent properties such as strong antibacterial activity, short latent period, high burst size, stability in different conditions, inhibition of biofilms, and absence of virulence and AMR genes. Based on all these features, this newly isolated phage is promising to control B. thermosphacta contamination in meat and meat products, and has the potential to be used alone or in combination with phage cocktails.
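Two of the quantities reported in this abstract, GC content and burst size, are simple ratios. The sketch below shows one common way to compute them; the function names are hypothetical, and the burst-size formula (plateau titer divided by the number of infected cells) is an assumed textbook convention rather than the authors' stated procedure.

```python
def gc_content(seq):
    """Percentage of G and C among the A/C/G/T bases of a sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def burst_size(plateau_pfu_per_ml, infected_cells_per_ml):
    """Average number of progeny phages released per infected cell.

    Assumes both values come from the same one-step growth experiment and
    are expressed per ml, so the result is a dimensionless count per cell.
    """
    if infected_cells_per_ml <= 0:
        raise ValueError("infected_cells_per_ml must be positive")
    return plateau_pfu_per_ml / infected_cells_per_ml
```

Because both arguments of `burst_size` are concentrations, the volume cancels and the result is expressed in PFU per infected cell.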
Affiliation(s)
- Abdulkerim Karaynir
- Recombinant DNA and Recombinant Protein Research Center (REDPROM), Aydın Adnan Menderes University, Turkey
- Hanife Salih
- Recombinant DNA and Recombinant Protein Research Center (REDPROM), Aydın Adnan Menderes University, Turkey
- Bülent Bozdoğan
- Recombinant DNA and Recombinant Protein Research Center (REDPROM), Aydın Adnan Menderes University, Turkey; Medicine Faculty, Department of Medical Microbiology, Aydın Adnan Menderes University, Turkey
- Özgür Güçlü
- Recombinant DNA and Recombinant Protein Research Center (REDPROM), Aydın Adnan Menderes University, Turkey; Sultanhisar Vocational School, Aydın Adnan Menderes University, Köşk- AYDIN, Turkey
- Dilek Keskin
- Recombinant DNA and Recombinant Protein Research Center (REDPROM), Aydın Adnan Menderes University, Turkey; Köşk Vocational School, Aydın Adnan Menderes University, Köşk- AYDIN, Turkey.
6
Wu S, Fang Z, Tan J, Li M, Wang C, Guo Q, Xu C, Jiang X, Zhu H. DeePhage: distinguishing virulent and temperate phage-derived sequences in metavirome data with a deep learning approach. Gigascience 2021; 10:giab056. [PMID: 34498685 PMCID: PMC8427542 DOI: 10.1093/gigascience/giab056] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND Prokaryotic viruses referred to as phages can be divided into virulent and temperate phages. Distinguishing virulent and temperate phage-derived sequences in metavirome data is important for elucidating their different roles in interactions with bacterial hosts and regulation of microbial communities. However, there is no experimental or computational approach to effectively classify their sequences in culture-independent metavirome. We present a new computational method, DeePhage, which can directly and rapidly judge each read or contig as a virulent or temperate phage-derived fragment. FINDINGS DeePhage uses a "one-hot" encoding form to represent DNA sequences in detail. Sequence signatures are detected via a convolutional neural network to obtain valuable local features. The accuracy of DeePhage on 5-fold cross-validation reaches as high as 89%, nearly 10% and 30% higher than that of 2 similar tools, PhagePred and PHACTS. On real metavirome, DeePhage correctly predicts the highest proportion of contigs when using BLAST as annotation, without apparent preferences. Besides, DeePhage reduces running time vs PhagePred and PHACTS by 245 and 810 times, respectively, under the same computational configuration. By direct detection of the temperate viral fragments from metagenome and metavirome, we furthermore propose a new strategy to explore phage transformations in the microbial community. The ability to detect such transformations provides us a new insight into the potential treatment for human disease. CONCLUSIONS DeePhage is a novel tool developed to rapidly and efficiently identify 2 kinds of phage fragments especially for metagenomics analysis. DeePhage is freely available via http://cqb.pku.edu.cn/ZhuLab/DeePhage or https://github.com/shufangwu/DeePhage.
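The "one-hot" encoding mentioned in this abstract can be illustrated with a short sketch. It is not DeePhage's own code: the column order A, C, G, T and the all-zero row for ambiguous bases are assumptions made here for illustration, and the function name is hypothetical.

```python
# Assumed column order: A, C, G, T.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def one_hot_encode(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Any symbol other than A/C/G/T (for example N) becomes an all-zero row,
    which is an assumption of this sketch.
    """
    return [ONE_HOT.get(base, (0, 0, 0, 0)) for base in seq.upper()]
```

A read of length L becomes an L x 4 matrix, which is the kind of input a one-dimensional convolutional network can scan for local sequence signatures.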
Affiliation(s)
- Shufang Wu
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Zhencheng Fang
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Jie Tan
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Mo Li
- Peking University-Tsinghua University - National Institute of Biological Sciences (PTN) joint PhD program, School of Life Sciences, Peking University, Beijing 100871, Beijing, China
- Chunhui Wang
- Peking University-Tsinghua University - National Institute of Biological Sciences (PTN) joint PhD program, School of Life Sciences, Peking University, Beijing 100871, Beijing, China
- Qian Guo
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, GA 30332, Atlanta, USA
- Congmin Xu
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, GA 30332, Atlanta, USA
- Xiaoqing Jiang
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Huaiqiu Zhu
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, College of Engineering, Peking University, Beijing 100871, Beijing, China
- Center for Quantitative Biology, Peking University, Beijing 100871, Beijing, China
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, GA 30332, Atlanta, USA
- Institute of Medical Technology, Peking University Health Science Center, Beijing 100191, Beijing, China
7
Nami Y, Imeni N, Panahi B. Application of machine learning in bacteriophage research. BMC Microbiol 2021; 21:193. [PMID: 34174831 PMCID: PMC8235560 DOI: 10.1186/s12866-021-02256-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/13/2021] [Accepted: 06/08/2021] [Indexed: 12/20/2022] Open
Abstract
Phages are one of the key components in the structure, dynamics, and interactions of microbial communities in different bins. They have a clear impact on human health and the food industry. Bacteriophage characterization using in vitro approaches is time-consuming, costly, and laborious. On the other hand, with the advent of new high-throughput sequencing technology, the development of a powerful computational framework to characterize the newly identified bacteriophages is inevitable for future research. Machine learning includes powerful techniques that enable the analysis of complex datasets for knowledge discovery and pattern recognition. In this study, we have conducted a comprehensive review of how machine learning methods, using different types of features, have been applied in various aspects of bacteriophage research, including automated curation, identification, classification, host species recognition, virion protein identification, and life cycle prediction. Moreover, potential limitations and advantages of the developed frameworks are discussed.
Affiliation(s)
- Yousef Nami
- Department of Food Biotechnology, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran
- Nazila Imeni
- Young Researchers and Elite Club, Marand Branch, Islamic Azad University, Marand, Iran
- Bahman Panahi
- Department of Genomics, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran.
8
Tariq MA, Newberry F, Haagmans R, Booth C, Wileman T, Hoyles L, Clokie MRJ, Ebdon J, Carding SR. Genome Characterization of a Novel Wastewater Bacteroides fragilis Bacteriophage (vB_BfrS_23) and its Host GB124. Front Microbiol 2020; 11:583378. [PMID: 33193224 PMCID: PMC7644841 DOI: 10.3389/fmicb.2020.583378] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/14/2020] [Accepted: 10/05/2020] [Indexed: 12/31/2022] Open
Abstract
Bacteroides spp. are part of the human intestinal microbiota but can under some circumstances become clinical pathogens. Phages are a potentially valuable therapeutic treatment option for many pathogens, but phage therapy for pathogenic Bacteroides spp. including Bacteroides fragilis is currently limited to three genome-sequenced phages. Here we describe the isolation from sewage wastewater and genome of a lytic phage, vB_BfrS_23, that infects and kills B. fragilis strain GB124. Transmission electron microscopy identified this phage as a member of the Siphoviridae family. The phage is stable when held at temperatures of 4 and 60°C for 1 h. It has a very narrow host range, only infecting one host from a panel of B. fragilis strains (n = 8). Whole-genome sequence analyses of vB_BfrS_23 determined it is a double-stranded DNA phage and is circularly permuted, with a genome of 48,011 bp. The genome encodes 73 putative open reading frames. We also sequenced the host bacterium, B. fragilis GB124 (5.1 Mb), which has two plasmids of 43,923 and 4,138 bp. Although this phage is host specific, its isolation together with the detailed characterization of the host B. fragilis GB124 featured in this study represent a useful starting point from which to facilitate the future development of highly specific therapeutic agents. Furthermore, the phage could be a novel tool in determining water (and water reuse) treatment efficacy, and for identifying human fecal transmission pathways within contaminated environmental waters and foodstuffs.
Affiliation(s)
- Mohammad A. Tariq
- Gut Microbes and Health Research Programme, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Fiona Newberry
- Gut Microbes and Health Research Programme, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Norwich Medical School, University of East Anglia, Norwich, United Kingdom
- Rik Haagmans
- Gut Microbes and Health Research Programme, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Norwich Medical School, University of East Anglia, Norwich, United Kingdom
- Catherine Booth
- Gut Microbes and Health Research Programme, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Tom Wileman
- Gut Microbes and Health Research Programme, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Norwich Medical School, University of East Anglia, Norwich, United Kingdom
- Lesley Hoyles
- Department of Biosciences, Nottingham Trent University, Nottingham, United Kingdom
- Martha R. J. Clokie
- Department of Genetics and Genome Biology, Leicester University, Leicester, United Kingdom
- James Ebdon
- Environment and Public Health Research Group, School of Environment and Technology, University of Brighton, Brighton, United Kingdom
- Simon R. Carding
- Gut Microbes and Health Research Programme, Quadram Institute Biosciences, Norwich Research Park, Norwich, United Kingdom
- Norwich Medical School, University of East Anglia, Norwich, United Kingdom
9
Deaton J, Yu FB, Quake SR. Mini-Metagenomics and Nucleotide Composition Aid the Identification and Host Association of Novel Bacteriophage Sequences. Adv Biosyst 2019; 3:e1900108. [PMID: 32648690 DOI: 10.1002/adbi.201900108] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/12/2019] [Revised: 07/10/2019] [Indexed: 11/07/2022]
Abstract
A broad spectrum of metagenomic and single cell sequencing techniques have become popular for dissecting environmental microbial diversity, leading to the characterization of thousands of novel microbial lineages. In addition to recovering bacterial and archaeal genomes, metagenomic assembly can also produce genomes of viruses that infect microbial cells. Because of their diversity, lack of marker genes, and small genome size, identifying novel bacteriophage sequences from metagenomic data is often challenging, especially when the objective is to establish phage-host relationships. The present work describes a computational approach that uses supervised learning to classify metagenomic contigs as phage or non-phage as well as assigning phage taxonomy based on tetranucleotide frequencies. Furthermore, the method assigns phage-host relationships using co-occurrence statistics derived from a recently developed mini-metagenomic experimental technique. This work evaluates method performance at identifying viral contigs and predicting taxonomic classification using publicly available references. Then, using two mini-metagenomic datasets, over 100 novel phage contigs from hot spring samples of Yellowstone National Park are identified and assigned to putative microbial hosts. Results of this work demonstrate the value of combining viral sequence identification with mini-metagenomic experimental methods to understand the microbial ecosystem.
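The idea of assigning hosts from co-occurrence across mini-metagenomic sub-samples can be sketched as follows. The abstract does not say which co-occurrence statistic was used, so the Jaccard index here is a stand-in chosen for illustration, and the function names and input layout (sets of sub-sample identifiers per contig) are hypothetical.

```python
def cooccurrence_score(phage_samples, host_samples):
    """Jaccard index of the sub-samples in which two contigs were detected."""
    phage_samples, host_samples = set(phage_samples), set(host_samples)
    union = phage_samples | host_samples
    if not union:
        return 0.0
    return len(phage_samples & host_samples) / len(union)


def best_host(phage_samples, hosts):
    """Return (host_name, score) for the host with the highest score.

    `hosts` maps a host genome name to the sub-sample ids where it appears.
    """
    scored = (
        (name, cooccurrence_score(phage_samples, samples))
        for name, samples in hosts.items()
    )
    return max(scored, key=lambda pair: pair[1])
```

A phage contig that appears in the same sub-samples as a given host genome, and rarely elsewhere, scores close to 1; a real analysis would also need a significance test to rule out chance co-occurrence.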
Affiliation(s)
- Jonathan Deaton
- Department of Bioengineering, Stanford University, 443 Via Ortega, Stanford, CA, 94305, USA
- Feiqiao Brian Yu
- Department of Bioengineering, Stanford University, 443 Via Ortega, Stanford, CA, 94305, USA; Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA, 94158, USA
- Stephen R Quake
- Department of Bioengineering, Stanford University, 443 Via Ortega, Stanford, CA, 94305, USA; Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA, 94158, USA
10
Classifying the Unclassified: A Phage Classification Method. Viruses 2019; 11:v11020195. [PMID: 30813498 PMCID: PMC6409715 DOI: 10.3390/v11020195] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 12/21/2018] [Revised: 02/06/2019] [Accepted: 02/20/2019] [Indexed: 01/21/2023] Open
Abstract
This work reports the method ClassiPhage to classify phage genomes using sequence derived taxonomic features. ClassiPhage uses a set of phage specific Hidden Markov Models (HMMs) generated from clusters of related proteins. The method was validated on all publicly available genomes of phages that are known to infect Vibrionaceae. The phages belong to the well-described phage families of Myoviridae, Podoviridae, Siphoviridae, and Inoviridae. The achieved classification is consistent with the assignments of the International Committee on Taxonomy of Viruses (ICTV), all tested phages were assigned to the corresponding group of the ICTV-database. In addition, 44 out of 58 genomes of Vibrio phages not yet classified could be assigned to a phage family. The remaining 14 genomes may represent phages of new families or subfamilies. Comparative genomics indicates that the ability of the approach to identify and classify phages is correlated to the conserved genomic organization. ClassiPhage classifies phages exclusively based on genome sequence data and can be applied on distinct phage genomes as well as on prophage regions within host genomes. Possible applications include (a) classifying phages from assembled metagenomes; and (b) the identification and classification of integrated prophages and the splitting of phage families into subfamilies.
11
Shigella Phages Isolated during a Dysentery Outbreak Reveal Uncommon Structures and Broad Species Diversity. J Virol 2018; 92:JVI.02117-17. [PMID: 29437962 DOI: 10.1128/jvi.02117-17] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 12/05/2017] [Accepted: 01/09/2018] [Indexed: 12/17/2022] Open
Abstract
In 2016, Michigan experienced the largest outbreak of shigellosis, a type of bacillary dysentery caused by Shigella spp., since 1988. Following this outbreak, we isolated 16 novel Shigella-infecting bacteriophages (viruses that infect bacteria) from environmental water sources. Most well-known bacteriophages infect the common laboratory species Escherichia coli and Salmonella enterica, and these phages have built the foundation of molecular and bacteriophage biology. Until now, comparatively few bacteriophages were known to infect Shigella spp., which are close relatives of E. coli. We present a comprehensive analysis of these phages' host ranges, genomes, and structures, revealing genome sizes and capsid properties that are shared by very few previously described phages. After sequencing, a majority of the Shigella phages were found to have genomes of an uncommon size, shared by only 2% of all reported phage genomes. To investigate the structural implications of this unusual genome size, we used cryo-electron microscopy to resolve their capsid structures. We determined that these bacteriophage capsids have similarly uncommon geometry. Only two other viruses with this capsid structure have been described. Since most well-known bacteriophages infect Escherichia or Salmonella, our understanding of bacteriophages has been limited to a subset of well-described systems. Continuing to isolate phages using nontraditional strains of bacteria can fill gaps that currently exist in bacteriophage biology. In addition, the prevalence of Shigella phages during a shigellosis outbreak may suggest a potential impact of human health epidemics on local microbial communities. IMPORTANCE: Shigella spp. bacteria are causative agents of dysentery and affect more than 164 million people worldwide every year. Despite the need to combat antibiotic-resistant Shigella strains, relatively few Shigella-infecting bacteriophages have been described. By specifically looking for Shigella-infecting phages, this work has identified new isolates that (i) may be useful to combat Shigella infections and (ii) fill gaps in our knowledge of bacteriophage biology. The rare qualities of these new isolates emphasize the importance of isolating phages on "nontraditional" laboratory strains of bacteria to more fully understand both the basic biology and diversity of bacteriophages.
|
12
|
Hurwitz BL, Ponsero A, Thornton J, U'Ren JM. Phage hunters: Computational strategies for finding phages in large-scale 'omics datasets. Virus Res 2017; 244:110-115. [PMID: 29100906 DOI: 10.1016/j.virusres.2017.10.019] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 08/05/2017] [Revised: 10/27/2017] [Accepted: 10/30/2017] [Indexed: 01/26/2023]
Abstract
A plethora of tools exist for identifying phage sequences in bacterial genomes, single cell amplified genomes, and host-associated and environmental metagenomes. Yet because the genetics of phages and their hosts are closely intertwined, distinguishing viral from bacterial signal remains an ongoing challenge. Further, the size, quantity and fragmentary nature of modern 'omics datasets usher in a new set of computational challenges. Here, we detail the promises and pitfalls of using currently available gene-centric or k-mer based tools for identifying prophage sequences in genomes and prophage and viral contigs in metagenomes. Each of these methods offers a unique piece of the puzzle in elucidating the intriguing signatures of phage-host coevolution.
Affiliation(s)
- Bonnie L Hurwitz
- Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson, AZ 85719, United States; BIO5 Research Institute, University of Arizona, Tucson, AZ 85719, United States.
| | - Alise Ponsero
- Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson, AZ 85719, United States
| | - James Thornton
- Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson, AZ 85719, United States
| | - Jana M U'Ren
- Department of Agricultural and Biosystems Engineering, University of Arizona, Tucson, AZ 85719, United States; BIO5 Research Institute, University of Arizona, Tucson, AZ 85719, United States
| |
|
13
|
van Zyl LJ, Nemavhulani S, Cass J, Cowan DA, Trindade M. Three novel bacteriophages isolated from the East African Rift Valley soda lakes. Virol J 2016; 13:204. [PMID: 27912769 PMCID: PMC5135824 DOI: 10.1186/s12985-016-0656-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/29/2016] [Accepted: 11/21/2016] [Indexed: 12/21/2022] Open
Abstract
Background: Soda lakes are unique environments in terms of their physical characteristics and the biology they harbour. Although well studied with respect to their microbial composition, their viral compositions have not, and consequently few bacteriophages that infect bacteria from haloalkaline environments have been described. Methods: Bacteria were isolated from sediment samples of lakes Magadi and Shala. Three phages were isolated on two different Bacillus species and one Paracoccus species using agar overlays. The growth characteristics of each phage in its host were investigated and the genome sequences determined and analysed by comparison with known phages. Results: Phage Shbh1 belongs to the family Myoviridae while Mgbh1 and Shpa belong to the Siphoviridae family. Tetranucleotide usage frequencies and G + C content suggest that Shbh1 and Mgbh1 do not regularly infect, and have therefore not evolved with, the hosts they were isolated on here. Shbh1 was shown capable of infecting two different Bacillus species from the two different lakes, demonstrating its potential broad-host range. Comparative analysis of their genome sequence with known phages revealed that, although novel, Shbh1 does share substantial amino acid similarity with previously described Bacillus infecting phages (Grass, phiNIT1 and phiAGATE) and belongs to the Bastille group, while Mgbh1 and Shpa are highly novel. Conclusion: The addition of these phages to current databases should help with metagenome/metavirome annotation efforts. We describe a highly novel Paracoccus infecting virus (Shpa) which together with NgoΦ6 and vB_PmaS_IMEP1 is one of only three phages known to infect Paracoccus species but does not show similarity to these phages.
Affiliation(s)
- Leonardo Joaquim van Zyl
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville, Cape Town, 7535, South Africa.
| | - Shonisani Nemavhulani
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville, Cape Town, 7535, South Africa
| | - James Cass
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville, Cape Town, 7535, South Africa
| | - Donald Arthur Cowan
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville, Cape Town, 7535, South Africa; Department of Genetics, University of Pretoria, Pretoria, 0002, South Africa
| | - Marla Trindade
- Institute for Microbial Biotechnology and Metagenomics (IMBM), Department of Biotechnology, University of the Western Cape, Robert Sobukwe Road, Bellville, Cape Town, 7535, South Africa
| |
|
14
|
Ahlgren NA, Ren J, Lu YY, Fuhrman JA, Sun F. Alignment-free $d_2^*$ oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences. Nucleic Acids Res 2016; 45:39-53. [PMID: 27899557 PMCID: PMC5224470 DOI: 10.1093/nar/gkw1002] [Citation(s) in RCA: 167] [Impact Index Per Article: 20.9] [Received: 07/12/2016] [Accepted: 10/31/2016] [Indexed: 01/17/2023] Open
Abstract
Viruses and their host genomes often share similar oligonucleotide frequency (ONF) patterns, which can be used to predict the host of a given virus by finding the host with the greatest ONF similarity. We comprehensively compared 11 ONF metrics using several k-mer lengths for predicting host taxonomy from among ∼32 000 prokaryotic genomes for 1427 virus isolate genomes whose true hosts are known. The background-subtracting measure $d_2^*$ at k = 6 gave the highest host prediction accuracy (33%, genus level) with reasonable computational times. Requiring a maximum dissimilarity score for making predictions (thresholding) and taking the consensus of the 30 most similar hosts further improved accuracy. Using a previous dataset of 820 bacteriophage and 2699 bacterial genomes, $d_2^*$ host prediction accuracies with thresholding and consensus methods (genus-level: 64%) exceeded previous Euclidean distance ONF (32%) or homology-based (22-62%) methods. When applied to metagenomically-assembled marine SUP05 viruses and the human gut virus crAssphage, $d_2^*$-based predictions overlapped (i.e. some same, some different) with the previously inferred hosts of these viruses. The extent of overlap improved when only using host genomes or metagenomic contigs from the same habitat or samples as the query viruses. The $d_2^*$ ONF method will greatly improve the characterization of novel, metagenomic viruses.
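The oligonucleotide-frequency comparison summarized in this abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes an order-0 (independent base) background model for the expected k-mer counts, which may differ from the background used in the paper, and the function names (`kmer_counts`, `standardized_profile`, `d2star_dissimilarity`, `predict_host`) are hypothetical. Thresholding and consensus over the 30 most similar hosts are not shown.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"

def kmer_counts(seq, k):
    """Count k-mers made only of A/C/G/T; windows with other symbols are skipped."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if all(c in BASES for c in word):
            counts[word] += 1
    return counts

def standardized_profile(seq, k):
    """Background-subtracted k-mer profile: (observed - expected) / sqrt(expected).

    Assumption: expected counts come from an order-0 model built from the
    sequence's own base frequencies.
    """
    seq = seq.upper()
    counts = kmer_counts(seq, k)
    n_words = sum(counts.values())
    base_counts = Counter(c for c in seq if c in BASES)
    n_bases = sum(base_counts.values())
    profile = {}
    for word in map("".join, product(BASES, repeat=k)):
        prob = 1.0
        for c in word:
            prob *= base_counts[c] / n_bases if n_bases else 0.0
        expected = n_words * prob
        profile[word] = (counts[word] - expected) / sqrt(expected) if expected > 0 else 0.0
    return profile

def d2star_dissimilarity(seq_a, seq_b, k=6):
    """Return 0.5 * (1 - cosine similarity) of the two standardized profiles."""
    pa = standardized_profile(seq_a, k)
    pb = standardized_profile(seq_b, k)
    dot = sum(pa[w] * pb[w] for w in pa)
    norm_a = sqrt(sum(v * v for v in pa.values()))
    norm_b = sqrt(sum(v * v for v in pb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.5  # assumption: treat an empty profile as uninformative
    return 0.5 * (1.0 - dot / (norm_a * norm_b))

def predict_host(virus_seq, host_seqs, k=6):
    """Pick the host name whose genome is least dissimilar to the virus."""
    return min(host_seqs, key=lambda name: d2star_dissimilarity(virus_seq, host_seqs[name], k))
```

Here `host_seqs` is a dictionary mapping a host label to its genome sequence; the dissimilarity lies between 0 (identical standardized profiles) and 1.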
Affiliation(s)
- Nathan A Ahlgren
- Department of Biological Sciences, University of Southern California, 3616 Trousdale Pkwy, Los Angeles, CA 90089, USA
| | - Jie Ren
- Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
| | - Yang Young Lu
- Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA
| | - Jed A Fuhrman
- Department of Biological Sciences, University of Southern California, 3616 Trousdale Pkwy, Los Angeles, CA 90089, USA
| | - Fengzhu Sun
- Department of Biological Sciences, University of Southern California, 3616 Trousdale Pkwy, Los Angeles, CA 90089, USA; Molecular and Computational Biology Program, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA; Center for Computational Systems Biology, Fudan University, Shanghai 200433, China
| |
|
15
|
Karamichalis R, Kari L, Konstantinidis S, Kopecki S, Solis-Reyes S. Additive methods for genomic signatures. BMC Bioinformatics 2016; 17:313. [PMID: 27549194 PMCID: PMC4994249 DOI: 10.1186/s12859-016-1157-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 05/13/2016] [Accepted: 07/19/2016] [Indexed: 01/09/2023] Open
Abstract
Background: Studies exploring the potential of Chaos Game Representations (CGR) of genomic sequences to act as “genomic signatures” (to be species- and genome-specific) showed that CGR patterns of nuclear and organellar DNA sequences of the same organism can be very different. While the hypothesis that CGRs of mitochondrial DNA sequences can act as genomic signatures was validated for a snapshot of all sequenced mitochondrial genomes available in the NCBI GenBank sequence database, to our knowledge no such extensive analysis of CGRs of nuclear DNA sequences exists to date. Results: We analyzed an extensive dataset, totalling 1.45 gigabase pairs, of nuclear/nucleoid genomic sequences (nDNA) from 42 different organisms, spanning all major kingdoms of life. Our computational experiments indicate that CGR signatures of nDNA of two different origins cannot always be differentiated, especially if they originate from closely-related species such as H. sapiens and P. troglodytes or E. coli and E. fergusonii. To address this issue, we propose the general concept of additive DNA signature of a set (collection) of DNA sequences. One particular instance, the composite DNA signature, combines information from nDNA fragments and organellar (mitochondrial, chloroplast, or plasmid) genomes. We demonstrate that, in this dataset, composite DNA signatures originating from two different organisms can be differentiated in all cases, including those where the use of CGR signatures of nDNA failed or was inconclusive. Another instance, the assembled DNA signature, combines information from many short DNA subfragments (e.g., 100 basepairs) of a given DNA fragment, to produce its signature. We show that an assembled DNA signature has the same distinguishing power as a conventionally computed CGR signature, while using shorter contiguous sequences and potentially less sequence information. 
Conclusions: Our results suggest that, while CGR signatures of nDNA cannot always play the role of genomic signatures, composite and assembled DNA signatures (separately or in combination) could potentially be used instead. Such additive signatures could be used, e.g., with raw unassembled next-generation sequencing (NGS) read data, when high-quality sequencing data is not available, or to complement information obtained by other methods of species identification or classification.
Affiliation(s)
- Rallis Karamichalis
- Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada
| | - Lila Kari
- School of Computing Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada; Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada
| | - Stavros Konstantinidis
- Department of Mathematics and Computing Science, Saint Mary's University, Halifax NS, Canada
| | - Steffen Kopecki
- Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada; Department of Mathematics and Computing Science, Saint Mary's University, Halifax NS, Canada
| | - Stephen Solis-Reyes
- Department of Computer Science, University of Western Ontario, London ON, N6A 5B7, Canada
| |
|
16
|
Karamichalis R, Kari L, Konstantinidis S, Kopecki S. An investigation into inter- and intragenomic variations of graphic genomic signatures. BMC Bioinformatics 2015; 16:246. [PMID: 26249837 PMCID: PMC4527362 DOI: 10.1186/s12859-015-0655-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/19/2014] [Accepted: 06/30/2015] [Indexed: 11/30/2022] Open
Abstract
Background: Motivated by the general need to identify and classify species based on molecular evidence, genome comparisons have been proposed that are based on measuring mostly Euclidean distances between Chaos Game Representation (CGR) patterns of genomic DNA sequences. Results: We provide, on an extensive dataset and using several different distances, confirmation of the hypothesis that CGR patterns are preserved along a genomic DNA sequence, and are different for DNA sequences originating from genomes of different species. This finding lends support to the theory that CGRs of genomic sequences can act as graphic genomic signatures. In particular, we compare the CGR patterns of over five hundred different 150,000 bp genomic sequences spanning one complete chromosome from each of six organisms, representing all kingdoms of life: H. sapiens (Animalia; chromosome 21), S. cerevisiae (Fungi; chromosome 4), A. thaliana (Plantae; chromosome 1), P. falciparum (Protista; chromosome 14), E. coli (Bacteria - full genome), and P. furiosus (Archaea - full genome). To maximize the diversity within each species, we also analyze the interrelationships within a set of over five hundred 150,000 bp genomic sequences sampled from the entire aforementioned genomes. Lastly, we provide some preliminary evidence of this method’s ability to classify genomic DNA sequences at lower taxonomic levels by comparing sequences sampled from the entire genome of H. sapiens (class Mammalia, order Primates) and of M. musculus (class Mammalia, order Rodentia), for a total length of approximately 174 million basepairs analyzed. We compute pairwise distances between CGRs of these genomic sequences using six different distances, and construct Molecular Distance Maps, which visualize all sequences as points in a two-dimensional or three-dimensional space, to simultaneously display their interrelationships. 
Conclusion: Our analysis confirms, for this dataset, that CGR patterns of DNA sequences from the same genome are in general quantitatively similar, while being different for DNA sequences from genomes of different species. Our assessment of the performance of the six distances analyzed uses three different quality measures and suggests that several distances outperform the Euclidean distance, which has so far been almost exclusively used for such studies.
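As a rough illustration of the Chaos Game Representation and the Euclidean comparison mentioned in this abstract, the sketch below builds a CGR at resolution 2^k × 2^k and compares two such grids. Several points are assumptions rather than details from the paper: the corner assignment (A bottom-left, C top-left, G top-right, T bottom-right), the normalization of counts to frequencies, and the function names `cgr_grid` and `euclidean_distance`. The other five distances evaluated by the authors are not shown.

```python
from math import sqrt

# Assumed corner convention as (x, y) bits: A=(0,0), C=(0,1), G=(1,1), T=(1,0).
CORNER_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}

def cgr_grid(seq, k):
    """Return a 2^k x 2^k grid of counts; grid[row][col], with row 0 at y = 0.

    The chaos game moves halfway toward the corner of each base. At a fixed
    resolution this is a right shift (halving) plus setting the top bit
    (the corner), so the cell depends only on the last k bases.
    """
    size = 1 << k
    grid = [[0] * size for _ in range(size)]
    col = row = run = 0
    for base in seq.upper():
        if base not in CORNER_BITS:
            col = row = run = 0  # restart after an ambiguous symbol
            continue
        bx, by = CORNER_BITS[base]
        col = (col >> 1) | (bx << (k - 1))
        row = (row >> 1) | (by << (k - 1))
        run += 1
        if run >= k:  # count only once k valid bases have been read
            grid[row][col] += 1
    return grid

def euclidean_distance(grid_a, grid_b):
    """Euclidean distance between two CGR grids after normalizing to frequencies."""
    def normalized(grid):
        total = sum(map(sum, grid))
        return [[v / total if total else 0.0 for v in r] for r in grid]
    a, b = normalized(grid_a), normalized(grid_b)
    return sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))
```

Pairwise distances computed this way for a set of sequences could then be passed to a multidimensional scaling routine to obtain a map of the kind the abstract describes.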
Affiliation(s)
- Rallis Karamichalis
- Department of Computer Science, University of Western Ontario, London, ON, Canada.
| | - Lila Kari
- Department of Computer Science, University of Western Ontario, London, ON, Canada.
| | - Stavros Konstantinidis
- Department of Mathematics and Computing Science, Saint Mary's University, Halifax, NS, Canada.
| | - Steffen Kopecki
- Department of Computer Science, University of Western Ontario, London, ON, Canada; Department of Mathematics and Computing Science, Saint Mary's University, Halifax, NS, Canada.
| |
|
17
|
Kari L, Hill KA, Sayem AS, Karamichalis R, Bryans N, Davis K, Dattani NS. Mapping the space of genomic signatures. PLoS One 2015; 10:e0119815. [PMID: 26000734 PMCID: PMC4441465 DOI: 10.1371/journal.pone.0119815] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 06/24/2014] [Accepted: 01/16/2015] [Indexed: 01/01/2023] Open
Abstract
We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. 
This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal, the Denisovan, and the chimp), and that the sequence most different from it in this dataset belongs to a cucumber.
Affiliation(s)
- Lila Kari
- Department of Computer Science, University of Western Ontario, London, Ontario, Canada
| | - Kathleen A. Hill
- Department of Computer Science, University of Western Ontario, London, Ontario, Canada
- Department of Biology, University of Western Ontario, London, Ontario, Canada
| | - Abu S. Sayem
- Department of Computer Science, University of Western Ontario, London, Ontario, Canada
| | - Rallis Karamichalis
- Department of Computer Science, University of Western Ontario, London, Ontario, Canada
| | - Nathaniel Bryans
- Department of Computer Science, University of Western Ontario, London, Ontario, Canada
| | - Katelyn Davis
- Department of Biology, University of Western Ontario, London, Ontario, Canada
| | - Nikesh S. Dattani
- Physical and Theoretical Chemistry Laboratory, Department of Chemistry, Oxford University, Oxford, United Kingdom
| |
|
18
|
Urbieta MS, Donati ER, Chan KG, Shahar S, Sin LL, Goh KM. Thermophiles in the genomic era: Biodiversity, science, and applications. Biotechnol Adv 2015; 33:633-47. [PMID: 25911946 DOI: 10.1016/j.biotechadv.2015.04.007] [Citation(s) in RCA: 75] [Impact Index Per Article: 8.3] [Received: 09/17/2014] [Revised: 12/18/2014] [Accepted: 04/14/2015] [Indexed: 01/30/2023]
Abstract
Thermophiles and hyperthermophiles are present in various regions of the Earth, including volcanic environments, hot springs, mud pots, fumaroles, geysers, coastal thermal springs, and even deep-sea hydrothermal vents. They are also found in man-made environments, such as heated compost facilities, reactors, and spray dryers. Thermophiles, hyperthermophiles, and their bioproducts facilitate various industrial, agricultural, and medicinal applications and offer potential solutions to environmental damages and the demand for biofuels. Intensified efforts to sequence the entire genome of hyperthermophiles and thermophiles are increasing rapidly, as evidenced by the fact that over 120 complete genome sequences of the hyperthermophiles Aquificae, Thermotogae, Crenarchaeota, and Euryarchaeota are now available. In this review, we summarise the major current applications of thermophiles and thermozymes. In addition, emphasis is placed on recent progress in understanding the biodiversity, genomes, transcriptomes, metagenomes, and single-cell sequencing of thermophiles in the genomic era.
Affiliation(s)
- M Sofía Urbieta
- CINDEFI (CCT La Plata-CONICET, UNLP), Facultad de Ciencias Exactas, Universidad Nacional de La Plata, Calle 47 y 115, 1900 La Plata, Argentina
| | - Edgardo R Donati
- CINDEFI (CCT La Plata-CONICET, UNLP), Facultad de Ciencias Exactas, Universidad Nacional de La Plata, Calle 47 y 115, 1900 La Plata, Argentina
| | - Kok-Gan Chan
- Division of Genetics and Molecular Biology, Institute of Biological Sciences, Faculty of Science, University of Malaya, 50603 Kuala Lumpur, Malaysia
| | - Saleha Shahar
- Faculty of Biosciences and Medical Engineering, Universiti Teknologi Malaysia, 81310 Johor Bahru, Malaysia
| | - Lee Li Sin
- Faculty of Biosciences and Medical Engineering, Universiti Teknologi Malaysia, 81310 Johor Bahru, Malaysia
| | - Kian Mau Goh
- Faculty of Biosciences and Medical Engineering, Universiti Teknologi Malaysia, 81310 Johor Bahru, Malaysia.
| |
|
19
|
Prabhakaran R, Chithambaram S, Xia X. Escherichia coli and Staphylococcus phages: effect of translation initiation efficiency on differential codon adaptation mediated by virulent and temperate lifestyles. J Gen Virol 2015; 96:1169-1179. [PMID: 25614589 PMCID: PMC4631060 DOI: 10.1099/vir.0.000050] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 10/23/2014] [Accepted: 01/11/2015] [Indexed: 12/19/2022] Open
Abstract
Rapid biosynthesis is key to the success of bacteria and viruses. Highly expressed genes in bacteria exhibit a strong codon bias corresponding to the differential availability of tRNAs. However, a large clade of lambdoid coliphages exhibits relatively poor codon adaptation to the host translation machinery, in contrast to other coliphages that exhibit strong codon adaptation to the host. Three possible explanations were previously proposed but dismissed: (1) the phage-borne tRNA genes that reduce the dependence of phage translation on host tRNAs, (2) lack of time needed for evolving codon adaptation due to recent host switching, and (3) strong strand asymmetry with biased mutation disrupting codon adaptation. Here, we examined the possibility that phages with relatively poor codon adaptation have poor translation initiation which would weaken the selection on codon adaptation. We measured translation initiation by: (1) the strength and position of the Shine-Dalgarno (SD) sequence, and (2) the stability of the secondary structure of sequences flanking the SD and start codon known to affect accessibility of the SD sequence and start codon. Phage genes with strong codon adaptation had significantly stronger SD sequences than those with poor codon adaptation. The former also had significantly weaker secondary structure in sequences flanking the SD sequence and start codon than the latter. Thus, lambdoid phages do not exhibit strong codon adaptation because they have relatively inefficient translation initiation and would benefit little from increased elongation efficiency. We also provided evidence suggesting that phage lifestyle (virulent versus temperate) affected selection intensity on the efficiency of translation initiation and elongation.
Affiliation(s)
- Ramanandan Prabhakaran
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, PO Box 450, Station A, Ottawa, Ontario K1N 6N5, Canada
| | - Shivapriya Chithambaram
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, PO Box 450, Station A, Ottawa, Ontario K1N 6N5, Canada
| | - Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, PO Box 450, Station A, Ottawa, Ontario K1N 6N5, Canada
- Correspondence: Xuhua Xia
| |
|
20
|
Ogilvie LA, Bowler LD, Caplin J, Dedi C, Diston D, Cheek E, Taylor H, Ebdon JE, Jones BV. Genome signature-based dissection of human gut metagenomes to extract subliminal viral sequences. Nat Commun 2014; 4:2420. [PMID: 24036533 PMCID: PMC3778543 DOI: 10.1038/ncomms3420] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 04/16/2013] [Accepted: 08/08/2013] [Indexed: 12/20/2022] Open
Abstract
Bacterial viruses (bacteriophages) have a key role in shaping the development and functional outputs of host microbiomes. Although metagenomic approaches have greatly expanded our understanding of the prokaryotic virosphere, additional tools are required for the phage-oriented dissection of metagenomic data sets, and host-range affiliation of recovered sequences. Here we demonstrate the application of a genome signature-based approach to interrogate conventional whole-community metagenomes and access subliminal, phylogenetically targeted, phage sequences present within. We describe a portion of the biological dark matter extant in the human gut virome, and bring to light a population of potentially gut-specific Bacteroidales-like phage, poorly represented in existing virus-like particle-derived viral metagenomes. These predominantly temperate phage were shown to encode functions of direct relevance to human health in the form of antibiotic resistance genes, and provided evidence for the existence of putative ‘viral-enterotypes’ among this fraction of the human gut virome. Bacteriophages have a significant impact on microbial ecosystems, but additional tools are needed to assess viral communities. Ogilvie et al. present a new strategy to extract viral sequences from metagenomic data sets, and present new insights on their function in the gut ecosystem.
Affiliation(s)
- Lesley A Ogilvie
- Centre for Biomedical and Health Science Research, School of Pharmacy and Biomolecular Sciences, University of Brighton, Brighton BN2 4GJ, UK
| |
|
21
|
Chithambaram S, Prabhakaran R, Xia X. Differential codon adaptation between dsDNA and ssDNA phages in Escherichia coli. Mol Biol Evol 2014; 31:1606-17. [PMID: 24586046 PMCID: PMC4032129 DOI: 10.1093/molbev/msu087] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Indexed: 12/21/2022] Open
Abstract
Because phages use their host translation machinery, their codon usage should evolve toward that of highly expressed host genes. We used two indices to measure codon adaptation of phages to their host, rRSCU (the correlation in relative synonymous codon usage [RSCU] between phages and their host) and Codon Adaptation Index (CAI) computed with highly expressed host genes as the reference set (because phage translation depends on host translation machinery). These indices used for this purpose are appropriate only when hosts exhibit little mutation bias, so only phages parasitizing Escherichia coli were included in the analysis. For double-stranded DNA (dsDNA) phages, both rRSCU and CAI decrease with increasing number of transfer RNA genes encoded by the phage genome. rRSCU is greater for dsDNA phages than for single-stranded DNA (ssDNA) phages, and the low rRSCU values are mainly due to poor concordance in RSCU values for Y-ending codons between ssDNA phages and the E. coli host, consistent with the predicted effect of C→T mutation bias in the ssDNA phages. Strong C→T mutation bias would improve codon adaptation in codon families (e.g., Gly) where U-ending codons are favored over C-ending codons (“U-friendly” codon families) by highly expressed host genes but decrease codon adaptation in other codon families where highly expressed host genes favor C-ending codons against U-ending codons (“U-hostile” codon families). It is remarkable that ssDNA phages with increasing C→T mutation bias also increased the usage of codons in the “U-friendly” codon families, thereby achieving CAI values almost as large as those of dsDNA phages. This represents a new type of codon adaptation.
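The RSCU-based index described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses the standard genetic code, treats RSCU for a codon as its count divided by the mean count across its synonymous family, skips single-codon families (Met, Trp) and families with no observations, and computes a plain Pearson correlation over the codons defined in both sets. The names `rscu` and `r_rscu` are hypothetical, and CAI is not implemented here.

```python
from collections import Counter, defaultdict
from itertools import product
from math import sqrt

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group sense codons by the amino acid they encode.
FAMILIES = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        FAMILIES[aa].append(codon)

def rscu(coding_seqs):
    """Relative synonymous codon usage for in-frame coding sequences."""
    counts = Counter()
    for cds in coding_seqs:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    values = {}
    for aa, codons in FAMILIES.items():
        total = sum(counts[c] for c in codons)
        if len(codons) < 2 or total == 0:
            continue  # no synonymous choice, or family not observed
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values

def r_rscu(phage_cds, host_cds):
    """Pearson correlation of RSCU values between phage and host gene sets."""
    a, b = rscu(phage_cds), rscu(host_cds)
    shared = sorted(set(a) & set(b))
    if len(shared) < 2:
        raise ValueError("too few shared codons to correlate")
    xs = [a[c] for c in shared]
    ys = [b[c] for c in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("RSCU values are constant; correlation undefined")
    return cov / sqrt(vx * vy)
```

In use, `host_cds` would hold the highly expressed host genes chosen as the reference set, and `phage_cds` the coding sequences of one phage genome.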
Affiliation(s)
- Shivapriya Chithambaram
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, Ontario, Canada
| | - Ramanandan Prabhakaran
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, Ontario, Canada
| | - Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, Ontario, Canada
| |
|
22
|
Schoenfeld TW, Murugapiran SK, Dodsworth JA, Floyd S, Lodes M, Mead DA, Hedlund BP. Lateral gene transfer of family A DNA polymerases between thermophilic viruses, aquificae, and apicomplexa. Mol Biol Evol 2013; 30:1653-64. [PMID: 23608703 DOI: 10.1093/molbev/mst078] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022] Open
Abstract
Bioinformatics and functional screens identified a group of Family A-type DNA Polymerase (polA) genes encoded by viruses inhabiting circumneutral and alkaline hot springs in Yellowstone National Park and the US Great Basin. The proteins encoded by these viral polA genes (PolAs) shared no significant sequence similarity with any known viral proteins but were remarkably similar to PolAs encoded by two of three families of the bacterial phylum Aquificae and by several apicoplast-targeted PolA-like proteins found in the eukaryotic phylum Apicomplexa, which includes the obligate parasites Plasmodium, Babesia, and Toxoplasma. The viral gene products share signature elements previously associated only with Aquificae and Apicomplexa PolA-like proteins and were similar to proteins encoded by prophage elements of a variety of otherwise unrelated Bacteria, each of which additionally encoded a prototypical bacterial PolA. Unique among known viral DNA polymerases, the viral PolA proteins of this study share with the Apicomplexa proteins large amino-terminal domains with putative helicase/primase elements but low primary sequence similarity. The genomic context and distribution, phylogeny, and biochemistry of these PolA proteins suggest that thermophilic viruses transferred polA genes to the Apicomplexa, likely through secondary endosymbiosis of a virus-infected proto-apicoplast, and to the common ancestor of two of three Aquificae families, where they displaced the orthologous cellular polA gene. On the basis of biochemical activity, gene structure, and sequence similarity, we speculate that the xenologous viral-type polA genes may have functions associated with diversity-generating recombination in both Bacteria and Apicomplexa.
|
23
|
Ogilvie LA, Caplin J, Dedi C, Diston D, Cheek E, Bowler L, Taylor H, Ebdon J, Jones BV. Comparative (meta)genomic analysis and ecological profiling of human gut-specific bacteriophage φB124-14. PLoS One 2012; 7:e35053. [PMID: 22558115 PMCID: PMC3338817 DOI: 10.1371/journal.pone.0035053] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 11/23/2011] [Accepted: 03/08/2012] [Indexed: 12/30/2022] Open
Abstract
Bacteriophage associated with the human gut microbiome are likely to have an important impact on community structure and function, and provide a wealth of biotechnological opportunities. Despite this, knowledge of the ecology and composition of bacteriophage in the gut bacterial community remains poor, with few well characterized gut-associated phage genomes currently available. Here we describe the identification and in-depth (meta)genomic, proteomic, and ecological analysis of a human gut-specific bacteriophage (designated φB124-14). In doing so we illuminate a fraction of the biological dark matter extant in this ecosystem and its surrounding eco-genomic landscape, identifying a novel and uncharted bacteriophage gene-space in this community. φB124-14 infects only a subset of closely related gut-associated Bacteroides fragilis strains, and the circular genome encodes functions previously found to be rare in viral genomes and human gut viral metagenome sequences, including those which potentially confer advantages upon phage and/or host bacteria. Comparative genomic analyses revealed φB124-14 is most closely related to φB40-8, the only other publicly available Bacteroides sp. phage genome, whilst comparative metagenomic analysis of both phage failed to identify any homologous sequences in 136 non-human gut metagenomic datasets searched, supporting the human gut-specific nature of this phage. Moreover, a potential geographic variation in the carriage of these and related phage was revealed by analysis of their distribution and prevalence within 151 human gut microbiomes and viromes from Europe, America and Japan. Finally, ecological profiling of φB124-14 and φB40-8 was undertaken using both gene-centric alignment-driven phylogenetic analyses and alignment-free gene-independent approaches. This not only verified the human gut-specific nature of both phage, but also indicated that these phage populate a distinct and unexplored ecological landscape within the human gut microbiome.
Affiliation(s)
- Lesley A. Ogilvie
- Centre for Biomedical and Health Science Research, School of Pharmacy and Biomolecular Sciences, University of Brighton, Brighton, United Kingdom
- Jonathan Caplin
- School of Environment and Technology, University of Brighton, Brighton, United Kingdom
- Cinzia Dedi
- Centre for Biomedical and Health Science Research, School of Pharmacy and Biomolecular Sciences, University of Brighton, Brighton, United Kingdom
- David Diston
- School of Environment and Technology, University of Brighton, Brighton, United Kingdom
- Elizabeth Cheek
- School of Computing, Engineering and Mathematics, University of Brighton, Brighton, United Kingdom
- Lucas Bowler
- Sussex Proteomics Centre, University of Sussex, Brighton, United Kingdom
- Huw Taylor
- School of Environment and Technology, University of Brighton, Brighton, United Kingdom
- James Ebdon
- School of Environment and Technology, University of Brighton, Brighton, United Kingdom
- Brian V. Jones
- Centre for Biomedical and Health Science Research, School of Pharmacy and Biomolecular Sciences, University of Brighton, Brighton, United Kingdom
|
24
|
High diversity and novel species of Pseudomonas aeruginosa bacteriophages. Appl Environ Microbiol 2012; 78:4510-5. [PMID: 22504803 DOI: 10.1128/aem.00065-12] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Indexed: 12/12/2022] Open
Abstract
The diversity of Pseudomonas aeruginosa bacteriophages was investigated using a collection of 68 phages isolated from Central Mexico. Most of the phages carried double-stranded DNA (dsDNA) genomes and were classified into 12 species. Comparison of the genomes of selected archetypal phages with extant sequences in GenBank resulted in the identification of six novel species. This finding increased the group diversity by ~30%. The great diversity of phage species could be related to the ubiquitous nature of P. aeruginosa.
|
25
|
Dutta C, Paul S. Microbial lifestyle and genome signatures. Curr Genomics 2012; 13:153-62. [PMID: 23024607 PMCID: PMC3308326 DOI: 10.2174/138920212799860698] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.6] [Received: 08/18/2011] [Revised: 09/13/2011] [Accepted: 09/28/2011] [Indexed: 12/29/2022] Open
Abstract
Microbes are known for their unique ability to adapt to varying lifestyles and environments, even extreme or adverse ones. The genomic architecture of a microbe may bear the signatures not only of its phylogenetic position, but also of the kind of lifestyle to which it is adapted. The present review aims to provide an account of the specific genome signatures observed in microbes acclimatized to distinct lifestyles or ecological niches. Niche-specific signatures identified at different levels of microbial genome organization, such as base composition, GC-skew, purine-pyrimidine ratio, dinucleotide abundance, codon bias, and oligonucleotide composition, are discussed. Among the specific cases highlighted in the review are the phenomena of genome shrinkage in obligatory host-restricted microbes, genome expansion in strictly intra-amoebal pathogens, strand-specific codon usage in intracellular species, acquisition of genome islands in pathogenic or symbiotic organisms, discriminatory genomic traits of marine microbes with distinct trophic strategies, and conspicuous sequence features of certain extremophiles, such as those adapted to high temperature or high salinity.
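As a rough illustration of a few of the sequence-level signatures named in this abstract (base composition, GC-skew, purine-pyrimidine ratio, dinucleotide abundance), the minimal Python sketch below computes them for a single DNA string. The function name `genome_signatures`, the skew definition (G - C)/(G + C), and the odds-ratio form of dinucleotide abundance are illustrative assumptions and are not taken from the review itself.

```python
from collections import Counter


def genome_signatures(seq: str) -> dict:
    """Compute simple composition-based signatures for one DNA sequence.

    Hypothetical helper for illustration only; definitions are assumptions.
    """
    seq = seq.upper()
    counts = Counter(seq)
    a, c, g, t = (counts[b] for b in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")

    # Base composition and strand-asymmetry measures.
    gc_content = (g + c) / total
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    pur_pyr = (a + g) / (c + t) if (c + t) else float("inf")

    # Dinucleotide relative abundance as an odds ratio:
    # rho_xy = f_xy / (f_x * f_y), using overlapping pairs of unambiguous bases.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    pairs = [p for p in pairs if set(p) <= set("ACGT")]
    pair_counts = Counter(pairs)
    n_pairs = len(pairs)
    freq = {b: counts[b] / total for b in "ACGT"}
    rho = {
        p: (n / n_pairs) / (freq[p[0]] * freq[p[1]])
        for p, n in pair_counts.items()
    }

    return {
        "gc_content": gc_content,
        "gc_skew": gc_skew,
        "purine_pyrimidine_ratio": pur_pyr,
        "dinucleotide_odds": rho,
    }


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    print(genome_signatures("GGGCAT"))
```

In practice such values are usually computed in sliding windows along a genome, or compared across genomes, rather than for one short string; the whole-sequence form here is only meant to make the definitions concrete.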
Affiliation(s)
- Chitra Dutta
- Structural Biology & Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700032, India
|
26
|
Mokili JL, Rohwer F, Dutilh BE. Metagenomics and future perspectives in virus discovery. Curr Opin Virol 2012; 2:63-77. [PMID: 22440968 PMCID: PMC7102772 DOI: 10.1016/j.coviro.2011.12.004] [Citation(s) in RCA: 401] [Impact Index Per Article: 33.4] [Received: 10/27/2011] [Revised: 12/09/2011] [Accepted: 12/12/2011] [Indexed: 01/21/2023]
Abstract
Monitoring the emergence and re-emergence of viral diseases with the goal of containing the spread of viral agents requires both adequate preparedness and quick response. Identifying the causative agent of a new epidemic is one of the most important steps for effective response to disease outbreaks. Traditionally, virus discovery required propagation of the virus in cell culture, a proven technique responsible for the identification of the vast majority of viruses known to date. However, many viruses cannot be easily propagated in cell culture, thus limiting our knowledge of viruses. Viral metagenomic analyses of environmental samples suggest that the field of virology has explored less than 1% of the extant viral diversity. In the last decade, the culture-independent and sequence-independent metagenomic approach has permitted the discovery of many viruses in a wide range of samples. Phylogenetically, some of these viruses are distantly related to previously discovered viruses. In addition, 60-99% of the sequences generated in different viral metagenomic studies are not homologous to known viruses. In this review, we discuss the advances in the area of viral metagenomics during the last decade and their relevance to virus discovery, clinical microbiology and public health. We discuss the potential of metagenomics for characterization of the normal viral population in a healthy community and identification of viruses that could pose a threat to humans through zoonosis. In addition, we propose a new model of Koch's postulates, named the 'Metagenomic Koch's Postulates'. Unlike the original Koch's postulates and the Molecular Koch's postulates as formulated by Falkow, the metagenomic Koch's postulates focus on the identification of metagenomic traits in disease cases, that is, traits that can be traced after healthy individuals have been exposed to the source of the suspected pathogen.
Affiliation(s)
- John L Mokili
- Department of Biology, San Diego State University, San Diego, CA 92182, USA.
|
27
|
McNair K, Bailey BA, Edwards RA. PHACTS, a computational approach to classifying the lifestyle of phages. Bioinformatics 2012; 28:614-8. [PMID: 22238260 PMCID: PMC3289917 DOI: 10.1093/bioinformatics/bts014] [Citation(s) in RCA: 171] [Impact Index Per Article: 14.3] [Indexed: 12/14/2022]
Abstract
Motivation: Bacteriophages have two distinct lifestyles: virulent and temperate. The virulent lifestyle has many implications for phage therapy, genomics and microbiology. The lifestyle of a newly sequenced phage is currently determined using standard culturing techniques. Such laboratory work is not only costly and time consuming, but also cannot be used on phage genomes constructed from environmental sequencing. Therefore, a computational method that utilizes the sequence data of phage genomes is needed. Results: Phage Classification Tool Set (PHACTS) utilizes a novel similarity algorithm and a supervised Random Forest classifier to predict whether the lifestyle of a phage, described by its proteome, is virulent or temperate. The similarity algorithm creates a training set from phages with known lifestyles and, along with the lifestyle annotation, trains a Random Forest to classify the lifestyle of a phage. PHACTS predictions are shown to have a 99% precision rate. Availability and implementation: PHACTS was implemented in the PERL programming language and utilizes the FASTA program (Pearson and Lipman, 1988) and the R programming language library ‘Random Forest’ (Liaw and Weiner, 2010). The PHACTS software is open source and is available as a downloadable stand-alone version or can be accessed online through a user-friendly web interface. The source code, help files and online version are available at http://www.phantome.org/PHACTS/. Contact: katelyn@rohan.sdsu.edu; redwards@sciences.sdsu.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Katelyn McNair
- Computational Science Research Center, Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182, USA.
|