1
Chakraborty J, Roy RP, Chatterjee R, Chaudhuri P. Performance assessment of genomic island prediction tools with an improved version of Design-Island. Comput Biol Chem 2022; 98:107698. [PMID: 35597186 DOI: 10.1016/j.compbiolchem.2022.107698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/09/2021] [Revised: 04/01/2022] [Accepted: 05/11/2022] [Indexed: 11/03/2022]
Abstract
Genomic Islands (GIs) play an important role in the evolution and adaptation of prokaryotes. The origin and extent of ecological diversity of prokaryotes can be analyzed by comparing GIs across closely or distantly related prokaryotes. Given the importance of GIs for studying bacterial evolution, several GI prediction tools have been developed. An unsupervised method, Design-Island, was developed to identify GIs using a Monte-Carlo statistical test on randomly selected segments of a chromosome. In the present study, Design-Island was modified by incorporating majority voting and multiple hypothesis testing correction. The performance of the modified version, Design-Island-II, was tested and compared with existing GI prediction tools. Performance assessment and benchmarking of GI prediction tools require an experimentally validated dataset, which is lacking. Therefore, different datasets, either generated or taken from the literature, were used to compare the sensitivity (SN), specificity (SP), precision (PPV) and accuracy (AC) of Design-Island-II. It showed substantial enhancement in terms of SN, SP, PPV and AC, and significantly reduced the computation time of the algorithm. The performance of Design-Island-II was also compared with several GI prediction tools using a curated dataset of putative horizontally transferred genes. Design-Island-II showed the highest sensitivity and F1 score, and comparable specificity, precision and accuracy, in comparison to the other available methods. IslandViewer4 and Islander outperformed all the available methods in terms of AC and PPV, respectively. Our study suggests that Design-Island-II, IslandViewer4 and GIHunter are among the top-performing GI prediction tools when both sensitivity and specificity are considered.
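The measures named in this abstract (SN, SP, PPV, AC and the F1 score) follow the standard confusion-matrix definitions. The sketch below is illustrative only: the function name `classification_metrics` and the idea of passing raw counts are assumptions, not taken from the paper, and the unit the study counted (gene, nucleotide or island) is not stated here.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics from raw counts.

    tp/fp/tn/fn are true/false positive/negative counts. What one
    "count" represents (gene, base, island) is left to the caller.
    """
    def ratio(num, den):
        # Return NaN rather than raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "SN": ratio(tp, tp + fn),                   # sensitivity (recall)
        "SP": ratio(tn, tn + fp),                   # specificity
        "PPV": ratio(tp, tp + fp),                  # precision
        "AC": ratio(tp + tn, tp + fp + tn + fn),    # accuracy
        "F1": ratio(2 * tp, 2 * tp + fp + fn),      # harmonic mean of SN and PPV
    }
```

Because F1 combines SN and PPV, a tool can lead on F1 while another tool leads on AC or PPV alone, which is consistent with the ranking described in the abstract.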
Affiliation(s)
- Joyeeta Chakraborty
  - Human Genetics Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700 108, India.
- Rudra Prasad Roy
  - Human Genetics Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700 108, India.
- Raghunath Chatterjee
  - Human Genetics Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700 108, India.
- Probal Chaudhuri
  - Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, 203 B T Road, Kolkata 700 108, India.

2
Gauthier CH, Abad L, Venbakkam AK, Malnak J, Russell D, Hatfull G. OUP accepted manuscript. Nucleic Acids Res 2022; 50:e75. [PMID: 35451479 PMCID: PMC9303363 DOI: 10.1093/nar/gkac273] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/20/2021] [Revised: 03/11/2022] [Accepted: 04/06/2022] [Indexed: 11/26/2022]
Abstract
Advances in genome sequencing have produced hundreds of thousands of bacterial genome sequences, many of which have integrated prophages derived from temperate bacteriophages. These prophages play key roles by influencing bacterial metabolism, pathogenicity, antibiotic resistance, and defense against viral attack. However, they vary considerably even among related bacterial strains, and they are challenging to identify computationally and to extract precisely for comparative genomic analyses. Here, we describe DEPhT, a multimodal tool for prophage discovery and extraction. It has three run modes that facilitate rapid screening of large numbers of bacterial genomes, precise extraction of prophage sequences, and prophage annotation. DEPhT uses genomic architectural features that discriminate between phage and bacterial sequences for efficient prophage discovery, and targeted homology searches for precise prophage extraction. DEPhT is designed for prophage discovery in Mycobacterium genomes but can be adapted broadly to other bacteria. We deploy DEPhT to demonstrate that prophages are prevalent in Mycobacterium strains but are absent not only from the few well-characterized Mycobacterium tuberculosis strains but also from all ∼30 000 sequenced M. tuberculosis strains.
Affiliation(s)
- Ananya K Venbakkam
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Julia Malnak
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Daniel A Russell
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Graham F Hatfull
  - To whom correspondence should be addressed. Tel: +1 412 624 6975;

3
Yu R, Zhang Y, Xu Y, Schwarz S, Li XS, Shang YH, Du XD. Emergence of a tet(M) Variant Conferring Resistance to Tigecycline in Streptococcus suis. Front Vet Sci 2021; 8:709327. [PMID: 34490399 PMCID: PMC8417041 DOI: 10.3389/fvets.2021.709327] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/29/2021] [Accepted: 07/27/2021] [Indexed: 11/19/2022]
Abstract
The aim of this study was to gain insight into the resistance determinants conferring resistance to tigecycline in Streptococcus (S.) suis and to investigate the genetic elements involved in their horizontal transfer. A total of 31 tetracycline-resistant S. suis isolates were screened for tigecycline resistance by broth microdilution. S. suis isolate SC128 was subjected to whole genome sequencing with particular reference to resistance determinants involved in tigecycline resistance. Transferability of genomic island (GI) GISsuSC128 was investigated by transformation. The roles of tet(L) or tet(M) in contributing to tigecycline resistance in S. suis were confirmed by transformation using different tet(L)- or tet(M)-carrying constructs. Only S. suis SC128 showed a tigecycline resistance phenotype. A tet(L)-tet(M) and catA8 co-carrying GISsuSC128 was identified in this isolate. After transfer of the novel GI into a susceptible recipient, this recipient showed the same tigecycline resistance phenotype. Further transfer experiments with specific tet(L)- or tet(M)-carrying constructs confirmed that only tet(M), but not tet(L), contributes to resistance to tigecycline. Protein sequence analysis identified a Tet(M) variant, which is responsible for tigecycline resistance in S. suis SC128. It displayed 94.8% amino acid identity with the reference Tet(M) of Enterococcus faecium DO plasmid 1. To the best of our knowledge, this is the first time that a tet(M) variant conferring resistance to tigecycline was identified in S. suis. Its location on a GI will accelerate its transmission among the S. suis population.
Affiliation(s)
- Rui Yu
  - College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
- Yue Zhang
  - College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
- Yindi Xu
  - Institute for Animal Husbandry and Veterinary Research, Henan Academy of Agricultural Sciences, Zhengzhou, China
- Stefan Schwarz
  - Department of Veterinary Medicine, Centre for Infection Medicine, Institute of Microbiology and Epizootics, Freie Universität Berlin, Berlin, Germany
- Xin-Sheng Li
  - College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
- Yan-Hong Shang
  - College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
- Xiang-Dang Du
  - College of Veterinary Medicine, Henan Agricultural University, Zhengzhou, China

4
Ibtehaz N, Ahmed I, Ahmed MS, Rahman MS, Azad RK, Bayzid MS. SSG-LUGIA: Single Sequence based Genome Level Unsupervised Genomic Island Prediction Algorithm. Brief Bioinform 2021; 22:6290171. [PMID: 34058749 DOI: 10.1093/bib/bbab116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2020] [Revised: 03/11/2021] [Accepted: 03/13/2021] [Indexed: 11/13/2022]
Abstract
BACKGROUND: Genomic Islands (GIs) are clusters of genes that are mobilized through horizontal gene transfer. GIs play a pivotal role in bacterial evolution as a mechanism of diversification and adaptation to different niches. Therefore, identification and characterization of GIs in bacterial genomes is important for understanding bacterial evolution. However, quantifying GIs is inherently difficult, and the existing methods suffer from low prediction accuracy and a precision-recall trade-off. Moreover, several of them are supervised in nature, and thus their application to newly sequenced genomes is hampered by their dependency on the functional annotation of existing genomes. RESULTS: We present SSG-LUGIA, a completely automated and unsupervised approach for identifying GIs and horizontally transferred genes. SSG-LUGIA is a novel method based on an unsupervised anomaly detection technique, accompanied by further refinement using cues from the signal processing literature. SSG-LUGIA leverages the atypical compositional biases of alien genes to localize GIs in prokaryotic genomes. SSG-LUGIA was assessed on a large benchmark dataset, 'IslandPick', and on a set of 15 well-studied genomes from the literature, followed by a thorough analysis of the well-understood Salmonella typhi CT18 genome. Furthermore, the efficacy of SSG-LUGIA in identifying horizontally transferred genes was evaluated on two additional bacterial genomes, namely those of Corynebacterium diphtheriae NCTC13129 and Pseudomonas aeruginosa LESB58. SSG-LUGIA was examined on draft genomes and was demonstrated to be efficient as an ensemble method. CONCLUSIONS: Our results indicate that SSG-LUGIA achieved superior performance in comparison to frequently used existing methods. Importantly, it yielded a better trade-off between precision and recall than the existing methods. Its nondependency on the functional annotation of genomes makes it suitable for analyzing newly sequenced, yet uncharacterized genomes. Thus, our study is a significant advance in the identification of GIs and horizontally transferred genes. SSG-LUGIA is available as open source software at https://nibtehaz.github.io/SSG-LUGIA/.
Affiliation(s)
- Ishtiaque Ahmed
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Md Sabbir Ahmed
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- M Sohel Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Rajeev K Azad
  - Department of Biological Sciences and BioDiscovery Institute, University of North Texas, Denton, TX, USA
  - Department of Mathematics, University of North Texas, Denton, TX, USA
- Md Shamsuzzoha Bayzid
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh

5
Xie G, Fair JM. Hidden Markov Model: a shortest unique representative approach to detect the protein toxins, virulence factors and antibiotic resistance genes. BMC Res Notes 2021; 14:122. [PMID: 33785071 PMCID: PMC8011099 DOI: 10.1186/s13104-021-05531-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/13/2020] [Accepted: 03/15/2021] [Indexed: 12/04/2022]
Abstract
Objective: Currently, next generation sequencing (NGS) is widely used to decode potential novel or variant pathogens both in emergent outbreaks and in routine clinical practice. However, the efficient identification of novel or diverged pathogenomic compositions remains a big challenge. This is especially true for short DNA sequence fragments from NGS, since sequence similarity searching is vulnerable to false negatives or false positives, such as mismatching or matching with unrelated proteins. Therefore, this study aimed to establish a bioinformatics approach that can generate unique motif sequences for profile searching, resulting in high specificity and sensitivity. Results: In this study, we introduced a Shortest Unique Representative Hidden Markov Model (HMM) approach to identify bacterial toxin, virulence factor (VF), and antimicrobial resistance (AR) genes in short sequence reads. We first construct unique representative domain sequences of toxin genes, VFs, and ARs to avoid potential false positives, and then use HMMs to accurately identify potential toxin, VF, and AR fragments. The benchmark shows this approach can achieve relatively high specificity and sensitivity if the appropriate cutoff value is applied. Our approach can be used to recognize the protein sequences of known toxins and pathogens, identify their common characteristics and then search for similar sequences in other organisms.
Affiliation(s)
- Gary Xie
  - Biosecurity & Public Health, Los Alamos National Laboratory, Mailstop M888, Los Alamos, NM, 87545, USA.
- Jeanne M Fair
  - Biosecurity & Public Health, Los Alamos National Laboratory, Mailstop M888, Los Alamos, NM, 87545, USA

6
Sass K, Güllert S, Streit WR, Perner M. A hydrogen-oxidizing bacterium enriched from the open ocean resembling a symbiont. Environ Microbiol Rep 2020; 12:396-405. [PMID: 32338395 DOI: 10.1111/1758-2229.12847] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/14/2019] [Revised: 03/31/2020] [Accepted: 04/21/2020] [Indexed: 06/11/2023]
Abstract
A new autotrophic hydrogen-oxidizing Chromatiaceae bacterium, namely bacterium CTD079, was enriched from a water column sample at 1500 m water depth in the southern Pacific Ocean. Based on the phylogeny of 16S rRNA genes, it was closely related to a scaly snail endosymbiont (99.2% DNA sequence identity) whose host so far is only known to colonize hydrothermal vents along the Indian ridge. The average nucleotide identity between the genomes of CTD079 and the snail endosymbiont was 91%. The observed differences likely reflect adaptations to their specific habitats. For example, CTD079 encodes additional enzymes like the formate dehydrogenase increasing the organism's spectrum of energy generation pathways. Other additional physiological features of CTD079 included the increase of viral defence strategies, secretion systems and specific transporters for essential elements. These important genome characteristics suggest an adaptation to life in the open ocean.
Affiliation(s)
- Katharina Sass
  - Molecular Biology of Microbial Consortia, Universität Hamburg, Hamburg, Germany
  - Microbiology and Biotechnology, Universität Hamburg, Hamburg, Germany
- Simon Güllert
  - Microbiology and Biotechnology, Universität Hamburg, Hamburg, Germany
- Wolfgang R Streit
  - Microbiology and Biotechnology, Universität Hamburg, Hamburg, Germany
- Mirjam Perner
  - Molecular Biology of Microbial Consortia, Universität Hamburg, Hamburg, Germany

7
Bertelli C, Tilley KE, Brinkman FSL. Microbial genomic island discovery, visualization and analysis. Brief Bioinform 2020; 20:1685-1698. [PMID: 29868902 PMCID: PMC6917214 DOI: 10.1093/bib/bby042] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 02/23/2018] [Revised: 04/30/2018] [Indexed: 12/27/2022]
Abstract
Horizontal gene transfer (also called lateral gene transfer) is a major mechanism for microbial genome evolution, enabling rapid adaptation and survival in specific niches. Genomic islands (GIs), commonly defined as clusters of bacterial or archaeal genes of probable horizontal origin, are of particular medical, environmental and/or industrial interest, as they disproportionately encode virulence factors and some antimicrobial resistance genes and may harbor entire metabolic pathways that confer a specific adaptation (solvent resistance, symbiosis properties, etc.). As large-scale analyses of microbial genomes increase, such as for genomic epidemiology investigations of infectious disease outbreaks in public health, there is increased appreciation of the need to accurately predict and track GIs. Over the past decade, numerous computational tools have been developed to tackle the challenges inherent in accurate GI prediction. We review here the main types of GI prediction methods and discuss their advantages and limitations for a routine analysis of microbial genomes in this era of rapid whole-genome sequencing. An assessment is provided of 20 GI prediction software methods that use sequence-composition bias to identify the GIs, using a reference GI data set from 104 genomes obtained using an independent comparative genomics approach. Finally, we present guidelines to assist researchers in effectively identifying these key genomic regions.
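As a rough illustration of what "sequence-composition bias" means in practice, the sketch below flags genome windows whose GC content deviates strongly from the genome-wide average. It is a generic toy, not the method of any tool assessed in the review: the function names, the non-overlapping window scheme, and the default `window` and `z_cutoff` values are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def atypical_windows(genome, window=1000, z_cutoff=2.0):
    """Return (start, end, gc) for non-overlapping windows whose GC content
    lies more than z_cutoff standard deviations from the mean over windows.

    Hypothetical helper for illustration; real tools use richer signals
    (codon usage, k-mer frequencies, mobility genes) and merge windows.
    """
    starts = list(range(0, len(genome) - window + 1, window))
    gcs = [gc_fraction(genome[s:s + window]) for s in starts]
    if len(gcs) < 2:
        return []
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        return []  # perfectly uniform composition, nothing stands out
    return [(s, s + window, gc)
            for s, gc in zip(starts, gcs)
            if abs(gc - mu) / sigma > z_cutoff]
```

A single-feature threshold like this tends to miss islands whose composition has drifted toward that of the host, which is one reason reviews of this kind compare composition-based tools against references built by comparative genomics.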
Affiliation(s)
- Claire Bertelli
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Keith E Tilley
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Fiona S L Brinkman
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada

9
De Oliveira DMP, Forde BM, Kidd TJ, Harris PNA, Schembri MA, Beatson SA, Paterson DL, Walker MJ. Antimicrobial Resistance in ESKAPE Pathogens. Clin Microbiol Rev 2020; 33:e00181-19. [PMID: 32404435 PMCID: PMC7227449 DOI: 10.1128/cmr.00181-19] [Citation(s) in RCA: 908] [Impact Index Per Article: 227.0] [Indexed: 02/08/2023]
Abstract
Antimicrobial-resistant ESKAPE ( Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae, Acinetobacter baumannii, Pseudomonas aeruginosa, and Enterobacter species) pathogens represent a global threat to human health. The acquisition of antimicrobial resistance genes by ESKAPE pathogens has reduced the treatment options for serious infections, increased the burden of disease, and increased death rates due to treatment failure and requires a coordinated global response for antimicrobial resistance surveillance. This looming health threat has restimulated interest in the development of new antimicrobial therapies, has demanded the need for better patient care, and has facilitated heightened governance over stewardship practices.
Affiliation(s)
- David M P De Oliveira
  - School of Chemistry and Molecular Biosciences, The University of Queensland, QLD, Australia
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
- Brian M Forde
  - School of Chemistry and Molecular Biosciences, The University of Queensland, QLD, Australia
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
- Timothy J Kidd
  - School of Chemistry and Molecular Biosciences, The University of Queensland, QLD, Australia
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
- Patrick N A Harris
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
  - UQ Centre for Clinical Research, The University of Queensland, QLD, Australia
- Mark A Schembri
  - School of Chemistry and Molecular Biosciences, The University of Queensland, QLD, Australia
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
- Scott A Beatson
  - School of Chemistry and Molecular Biosciences, The University of Queensland, QLD, Australia
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
- David L Paterson
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia
  - UQ Centre for Clinical Research, The University of Queensland, QLD, Australia
- Mark J Walker
  - School of Chemistry and Molecular Biosciences, The University of Queensland, QLD, Australia
  - Australian Infectious Diseases Research Centre, The University of Queensland, QLD, Australia

10
Barros-Carvalho GA, Van Sluys MA, Lopes FM. An Efficient Approach to Explore and Discriminate Anomalous Regions in Bacterial Genomes Based on Maximum Entropy. J Comput Biol 2017; 24:1125-1133. [PMID: 28570142 DOI: 10.1089/cmb.2017.0042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Recently, there has been an increase in the number of whole bacterial genomes sequenced, mainly due to advances in next-generation sequencing technologies. In the face of this, there is a need to provide new analytical alternatives that can keep pace with this advance. Given our current knowledge about the genomic plasticity of bacteria and that plastic genomic regions can uncover important features of these microorganisms, our goal was to develop a fast methodology based on maximum entropy (ME) to guide the researcher to regions that could be prioritized during the analysis. This methodology was compared with other available methods. In addition, ME was applied to eight different bacterial genera. The methodology consists of two main steps: processing the nucleotide sequence and ME calculation. We applied ME to Xanthomonas axonopodis pv. citri 306 (XAC) and Xanthomonas campestris pv. campestris ATCC 33913 (XCC), both of which have their anomalous regions well documented. We then compared our results against those from Alien Hunter, HGT-DB, Islander, IslandPath, and SIGI-HMM. ME was shown to be superior in terms of efficiency and analysis duration. In addition, ME needs only the genome sequence in FASTA format as input. The proposed strategy based on ME is able to help in bacterial genome exploration. This is a simple and fast strategy for individual genomes in comparison with other available methods, without relying on previous annotation and alignments. This methodology can also be a new option in the early stages of analysis of newly sequenced bacterial genomes.
Affiliation(s)
- Gesiele Almeida Barros-Carvalho
  - Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil
  - GaTE Lab, Department of Botany, Institute of Bioscience, University of São Paulo, São Paulo, Brazil
- Marie-Anne Van Sluys
  - GaTE Lab, Department of Botany, Institute of Bioscience, University of São Paulo, São Paulo, Brazil

11
Cao MD, Nguyen SH, Ganesamoorthy D, Elliott AG, Cooper MA, Coin LJM. Scaffolding and completing genome assemblies in real-time with nanopore sequencing. Nat Commun 2017; 8:14515. [PMID: 28218240 PMCID: PMC5321748 DOI: 10.1038/ncomms14515] [Citation(s) in RCA: 73] [Impact Index Per Article: 10.4] [Received: 07/11/2016] [Accepted: 01/06/2017] [Indexed: 01/10/2023]
Abstract
Third generation sequencing technologies provide the opportunity to improve genome assemblies by generating long reads spanning most repeat sequences. However, current analysis methods require substantial amounts of sequence data and computational resources to overcome the high error rates. Furthermore, they can only perform analysis after sequencing has completed, resulting in either over-sequencing, or in a low quality assembly due to under-sequencing. Here we present npScarf, which can scaffold and complete short read assemblies while the long read sequencing run is in progress. It reports assembly metrics in real-time so the sequencing run can be terminated once an assembly of sufficient quality is obtained. In assembling four bacterial and one eukaryotic genomes, we show that npScarf can construct more complete and accurate assemblies while requiring less sequencing data and computational resources than existing methods. Our approach offers a time- and resource-effective strategy for completing short read assemblies.
Affiliation(s)
- Minh Duc Cao
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, Queensland 4072 Australia
- Son Hoang Nguyen
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, Queensland 4072 Australia
- Devika Ganesamoorthy
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, Queensland 4072 Australia
- Alysha G. Elliott
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, Queensland 4072 Australia
- Matthew A. Cooper
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, Queensland 4072 Australia
- Lachlan J. M. Coin
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, Queensland 4072 Australia

12
Lu B, Leong HW. Computational methods for predicting genomic islands in microbial genomes. Comput Struct Biotechnol J 2016; 14:200-6. [PMID: 27293536 PMCID: PMC4887561 DOI: 10.1016/j.csbj.2016.05.001] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 01/31/2016] [Revised: 05/01/2016] [Accepted: 05/03/2016] [Indexed: 11/02/2022]
Abstract
Clusters of genes acquired by lateral gene transfer in microbial genomes are broadly referred to as genomic islands (GIs). GIs often carry genes important for genome evolution and adaptation to niches, such as genes involved in pathogenesis and antibiotic resistance. Therefore, GI prediction has gradually become an important part of microbial genome analysis. Despite inherent difficulties in identifying GIs, many computational methods have been developed and show good performance. In this mini-review, we first summarize the general challenges in predicting GIs. Then we group existing GI detection methods by their input, briefly describe representative methods in each group, and discuss their advantages as well as limitations. Finally, we look into the potential improvements for better GI prediction.
Affiliation(s)
- Bingxin Lu
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, Singapore 117417, Republic of Singapore
- Hon Wai Leong
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, Singapore 117417, Republic of Singapore

13
Hudson CM, Lau BY, Williams KP. Islander: a database of precisely mapped genomic islands in tRNA and tmRNA genes. Nucleic Acids Res 2014; 43:D48-53. [PMID: 25378302 PMCID: PMC4383910 DOI: 10.1093/nar/gku1072] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Indexed: 11/26/2022]
Abstract
Genomic islands are mobile DNAs that are major agents of bacterial and archaeal evolution. Integration into prokaryotic chromosomes usually occurs site-specifically at tRNA or tmRNA gene (together, tDNA) targets, catalyzed by tyrosine integrases. This splits the target gene, yet sequences within the island restore the disrupted gene; the regenerated target and its displaced fragment precisely mark the endpoints of the island. We applied this principle to search for islands in genomic DNA sequences. Our algorithm identifies tDNAs, finds fragments of those tDNAs in the same replicon and removes unlikely candidate islands through a series of filters. A search for islands in 2168 whole prokaryotic genomes produced 3919 candidates. The website Islander (recently moved to http://bioinformatics.sandia.gov/islander/) presents these precisely mapped candidate islands, the gene content and the island sequence. The algorithm further insists that each island encode an integrase, and attachment site sequence identity is carefully noted; therefore, the database also serves in the study of integrase site-specificity and its evolution.
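The principle described in this abstract (an intact tDNA at one end of the island and a displaced fragment of the same tDNA at the other) can be sketched as a simple string search. The code below is a deliberately simplified toy and not the Islander algorithm: the function name, the exact-match requirement, the forward-strand-only search, the fragment length and the size limits are all assumptions, and the integrase and other filters mentioned in the abstract are omitted.

```python
def candidate_islands(replicon, tdna_end, frag_len=15,
                      min_size=50, max_size=200_000):
    """Toy search for displaced copies of a tDNA's 3' end.

    replicon : DNA sequence of one replicon (forward strand only).
    tdna_end : 0-based, end-exclusive coordinate of the intact tDNA.
    Returns (start, end) intervals running from the start of the tDNA's
    3'-terminal block to the end of each downstream exact copy of it.
    """
    block_start = tdna_end - frag_len
    fragment = replicon[block_start:tdna_end]
    candidates = []
    pos = replicon.find(fragment, tdna_end)   # search downstream only
    while pos != -1:
        end = pos + frag_len
        # Stand-in for the real filters: keep plausible sizes only.
        if min_size <= end - block_start <= max_size:
            candidates.append((block_start, end))
        pos = replicon.find(fragment, pos + 1)
    return candidates
```

Requiring an exact repeat keeps the example short; the abstract notes that attachment-site sequence identity is recorded, which suggests the real matches need not be perfect.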
Affiliation(s)
- Corey M Hudson: Sandia National Laboratories, Department of Systems Biology, Livermore, CA 94550, USA
- Britney Y Lau: Sandia National Laboratories, Department of Systems Biology, Livermore, CA 94550, USA
- Kelly P Williams: Sandia National Laboratories, Department of Systems Biology, Livermore, CA 94550, USA
14
The nitrogen-fixation island insertion site is conserved in diazotrophic Pseudomonas stutzeri and Pseudomonas sp. isolated from distal and close geographical regions. PLoS One 2014; 9:e105837. [PMID: 25251496 PMCID: PMC4174501 DOI: 10.1371/journal.pone.0105837] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 06/17/2013] [Accepted: 07/29/2014] [Indexed: 11/19/2022]
Abstract
The presence of nitrogen fixers within the genus Pseudomonas has been established, and so far most isolated strains are phylogenetically affiliated with Pseudomonas stutzeri. A gene ortholog neighborhood analysis of the nitrogen fixation island (NFI) in four diazotrophic P. stutzeri strains and Pseudomonas azotifigens revealed that all are flanked by genes coding for cobalamin synthase (cobS) and glutathione peroxidase (gshP). The putative NFIs lack all the features characterizing a mobilizable genomic island. Nevertheless, bioinformatic analysis of the P. stutzeri DSM 4166 NFI demonstrated the presence of short inverted and/or direct repeats within both flanking regions. The other P. stutzeri strains carry only one set of repeats. The genetic diversity of eleven diazotrophic Pseudomonas isolates was also investigated. Multilocus sequence typing grouped nine isolates with P. stutzeri and two isolates in a separate clade. A Rep-PCR fingerprinting analysis grouped the eleven isolates into four distinct genotypes. We also provided evidence that the putative NFI in our diazotrophic Pseudomonas isolates is flanked by cobS and gshP genes. Furthermore, we demonstrated that the putative NFI of Pseudomonas sp. Gr65 is flanked by inverted repeats identical to those found in P. stutzeri DSM 4166, while the other P. stutzeri isolates harbor the repeats located in the intergenic region between cobS and glutaredoxin genes, as in the case of P. stutzeri A1501. Taken together, these data suggest that all putative NFIs of diazotrophic Pseudomonas isolates are anchored in an intergenic region between cobS and gshP genes and that their flanking regions are designated by distinct repeat patterns. Moreover, the presence of almost identical NFIs in diazotrophic Pseudomonas strains isolated from distal geographical locations around the world suggested that this horizontal gene transfer event may have taken place early in evolution.
15
Hudson CM, Lau BY, Williams KP. Ends of the line for tmRNA-SmpB. Front Microbiol 2014; 5:421. [PMID: 25165464 PMCID: PMC4131195 DOI: 10.3389/fmicb.2014.00421] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 06/01/2014] [Accepted: 07/24/2014] [Indexed: 11/22/2022]
Abstract
Genes for the RNA tmRNA and protein SmpB, partners in the trans-translation process that rescues stalled ribosomes, have previously been found in all bacteria and some organelles. During a major update of The tmRNA Website (relocated to http://bioinformatics.sandia.gov/tmrna), including addition of an SmpB sequence database, we found some bacteria that lack functionally significant regions of SmpB. Three groups with reduced genomes have lost the central loop of SmpB, which is thought to improve alanylation and EF-Tu activation: Carsonella, Hodgkinia, and the hemoplasmas (hemotropic Mycoplasma). Carsonella has also lost the SmpB C-terminal tail, thought to stimulate the decoding center of the ribosome. We validate recent identification of tmRNA homologs in oomycete mitochondria by finding partner genes from oomycete nuclei that target SmpB to the mitochondrion. We have moreover identified through exhaustive search a small number of complete, but often highly derived, bacterial genomes that appear to lack a functional copy of either the tmRNA or SmpB gene (but not both). One Carsonella isolate exhibits complete degradation of the tmRNA gene sequence yet its smpB shows no evidence for relaxed selective constraint, relative to other genes in the genome. After loss of the SmpB central loop in the hemoplasmas, one subclade apparently lost tmRNA. Carsonella also exhibits gene overlap such that tmRNA maturation should produce a non-stop smpB mRNA. At least some of the tmRNA/SmpB-deficient strains appear to further lack the ArfA and ArfB backup systems for ribosome rescue. The most frequent neighbors of smpB are the tmRNA gene, a ratA/rnfH unit, and the gene for RNaseR, a known physical and functional partner of tmRNA-SmpB.
Affiliation(s)
- Corey M Hudson: Sandia National Laboratories, Department of Systems Biology, Livermore, CA, USA
- Britney Y Lau: Sandia National Laboratories, Department of Systems Biology, Livermore, CA, USA
- Kelly P Williams: Sandia National Laboratories, Department of Systems Biology, Livermore, CA, USA
16
Resistance determinants and mobile genetic elements of an NDM-1-encoding Klebsiella pneumoniae strain. PLoS One 2014; 9:e99209. [PMID: 24905728 PMCID: PMC4048246 DOI: 10.1371/journal.pone.0099209] [Citation(s) in RCA: 102] [Impact Index Per Article: 10.2] [Received: 02/04/2014] [Accepted: 05/12/2014] [Indexed: 01/12/2023]
Abstract
Multidrug-resistant Enterobacteriaceae are emerging as a serious infectious disease challenge. These strains can accumulate many antibiotic resistance genes through horizontal transfer of genetic elements, those for β-lactamases being of particular concern. Some β-lactamases are active on a broad spectrum of β-lactams including the last-resort carbapenems. The gene for the broad-spectrum and carbapenem-active metallo-β-lactamase NDM-1 is rapidly spreading. We present the complete genome of Klebsiella pneumoniae ATCC BAA-2146, the first U.S. isolate found to encode NDM-1, and describe its repertoire of antibiotic-resistance genes and mutations, including genes for eight β-lactamases and 15 additional antibiotic-resistance enzymes. To elucidate the evolution of this rich repertoire, the mobile elements of the genome were characterized, including four plasmids with varying degrees of conservation and mosaicism and eleven chromosomal genomic islands. One island was identified by a novel phylogenomic approach, which further indicated the cps-lps polysaccharide synthesis locus, where operon translocation and fusion were noted. Unique plasmid segments and mosaic junctions were identified. Plasmid-borne blaCTX-M-15 was transposed recently to the chromosome by ISEcp1. None of the eleven full copies of IS26, the most frequent IS element in the genome, had the expected 8-bp direct repeat of the integration target sequence, suggesting that each copy underwent homologous recombination subsequent to its last transposition event. Comparative analysis likewise indicates IS26 as a frequent recombinational junction between plasmid ancestors, and also indicates a resolvase site. In one novel use of high-throughput sequencing, homologously recombinant subpopulations of the bacterial culture were detected. In a second novel use, circular transposition intermediates were detected for the novel insertion sequence ISKpn21 of the ISNCY family, suggesting that it uses the two-step transposition mechanism of IS3. Robust genome-based phylogeny showed that a unified Klebsiella cluster contains Enterobacter aerogenes and Raoultella, suggesting the latter genus should be abandoned.
17
Gupta A, Kapil R, Dhakan DB, Sharma VK. MP3: a software tool for the prediction of pathogenic proteins in genomic and metagenomic data. PLoS One 2014; 9:e93907. [PMID: 24736651 PMCID: PMC3988012 DOI: 10.1371/journal.pone.0093907] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 12/24/2013] [Accepted: 03/10/2014] [Indexed: 11/24/2022]
Abstract
The identification of virulent proteins in any de-novo sequenced genome is useful in estimating its pathogenic ability and understanding the mechanism of pathogenesis. Similarly, the identification of such proteins could be valuable in comparing the metagenome of healthy and diseased individuals and estimating the proportion of pathogenic species. However, the common challenge in both the above tasks is the identification of virulent proteins since a significant proportion of genomic and metagenomic proteins are novel and yet unannotated. The currently available tools which carry out the identification of virulent proteins provide limited accuracy and cannot be used on large datasets. Therefore, we have developed an MP3 standalone tool and web server for the prediction of pathogenic proteins in both genomic and metagenomic datasets. MP3 is developed using an integrated Support Vector Machine (SVM) and Hidden Markov Model (HMM) approach to carry out highly fast, sensitive and accurate prediction of pathogenic proteins. It displayed Sensitivity, Specificity, MCC and accuracy values of 92%, 100%, 0.92 and 96%, respectively, on blind dataset constructed using complete proteins. On the two metagenomic blind datasets (Blind A: 51-100 amino acids and Blind B: 30-50 amino acids), it displayed Sensitivity, Specificity, MCC and accuracy values of 82.39%, 97.86%, 0.80 and 89.32% for Blind A and 71.60%, 94.48%, 0.67 and 81.86% for Blind B, respectively. In addition, the performance of MP3 was validated on selected bacterial genomic and real metagenomic datasets. To our knowledge, MP3 is the only program that specializes in fast and accurate identification of partial pathogenic proteins predicted from short (100-150 bp) metagenomic reads and also performs exceptionally well on complete protein sequences. MP3 is publicly available at http://metagenomics.iiserb.ac.in/mp3/index.php.
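The four figures quoted in this abstract (sensitivity, specificity, MCC and accuracy) all derive from a binary confusion matrix. The Python sketch below shows the standard definitions; the function name and the counts in the usage note are hypothetical and are not taken from the MP3 paper.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

For instance, `classification_metrics(tp=92, fp=0, tn=100, fn=8)` uses assumed counts for a balanced set of 100 positives and 100 negatives; it gives a sensitivity of 0.92, a specificity of 1.0, an accuracy of 0.96 and an MCC of about 0.92.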
Affiliation(s)
- Ankit Gupta: MetaInformatics Laboratory, Metagenomics and Systems Biology Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Madhya Pradesh, India
- Rohan Kapil: MetaInformatics Laboratory, Metagenomics and Systems Biology Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Madhya Pradesh, India
- Darshan B. Dhakan: MetaInformatics Laboratory, Metagenomics and Systems Biology Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Madhya Pradesh, India
- Vineet K. Sharma: MetaInformatics Laboratory, Metagenomics and Systems Biology Group, Department of Biological Sciences, Indian Institute of Science Education and Research Bhopal, Madhya Pradesh, India
18
Identifying pathogenicity islands in bacterial pathogenomics using computational approaches. Pathogens 2014; 3:36-56. [PMID: 25437607 PMCID: PMC4235732 DOI: 10.3390/pathogens3010036] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 11/30/2013] [Revised: 12/30/2013] [Accepted: 01/07/2014] [Indexed: 12/22/2022]
Abstract
High-throughput sequencing technologies have made it possible to study bacteria by analyzing their genome sequences. For instance, comparative genome sequence analyses can reveal phenomena such as gene loss, gene gain, or gene exchange in a genome. By analyzing pathogenic bacterial genomes, we can discover that pathogenic genomic regions in many pathogenic bacteria are horizontally transferred from other bacteria; these regions are known as pathogenicity islands (PAIs). PAIs have some detectable properties, such as having genomic signatures different from the rest of the host genome, and containing mobility genes so that they can be integrated into the host genome. In this review, we will discuss various pathogenicity island-associated features and current computational approaches for the identification of PAIs. Existing pathogenicity island databases and related computational resources will also be discussed, so that researchers may find them useful for studies of bacterial evolution and pathogenicity mechanisms.
19
Lee CC, Chen YPP, Yao TJ, Ma CY, Lo WC, Lyu PC, Tang CY. GI-POP: A combinational annotation and genomic island prediction pipeline for ongoing microbial genome projects. Gene 2013; 518:114-23. [DOI: 10.1016/j.gene.2012.11.063] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 10/19/2012] [Accepted: 11/27/2012] [Indexed: 10/27/2022]
20
Stability of a Pseudomonas putida KT2440 bacteriophage-carried genomic island and its impact on rhizosphere fitness. Appl Environ Microbiol 2012; 78:6963-74. [PMID: 22843519 DOI: 10.1128/aem.00901-12] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]
Abstract
The stability of seven genomic islands of Pseudomonas putida KT2440 with predicted potential for mobilization was studied in bacterial populations associated with the rhizosphere of corn plants by multiplex PCR. DNA rearrangements were detected for only one of them (GI28), which was lost at high frequency. This genomic island of 39.4 kb, with 53 open reading frames, shows the characteristic organization of genes belonging to tailed phages. We present evidence indicating that it corresponds to the lysogenic state of a functional bacteriophage that we have designated Pspu28. Integrated and rarely excised forms of Pspu28 coexist in KT2440 populations. Pspu28 is self-transmissible, and an excisionase is essential for its removal from the bacterial chromosome. The excised Pspu28 forms a circular element that can integrate into the chromosome at a specific location, att sites containing a 17-bp direct repeat sequence. Excision/insertion of Pspu28 alters the promoter sequence and changes the expression level of PP_1531, which encodes a predicted arsenate reductase. Finally, we show that the presence of Pspu28 in the lysogenic state has a negative effect on bacterial fitness in the rhizosphere under conditions of intraspecific competition, thus explaining why clones having lost this mobile element are recovered from that environment.
21
Toussaint A, Chandler M. Prokaryote genome fluidity: toward a system approach of the mobilome. Methods Mol Biol 2012; 804:57-80. [PMID: 22144148 DOI: 10.1007/978-1-61779-361-5_4] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 01/24/2023]
Abstract
The importance of horizontal/lateral gene transfer (LGT) in shaping the genomes of prokaryotic organisms has been recognized in recent years as a result of analysis of the increasing number of available genome sequences. LGT is largely due to the transfer and recombination activities of mobile genetic elements (MGEs). Bacterial and archaeal genomes are mosaics of vertically and horizontally transmitted DNA segments. This generates reticulate relationships between members of the prokaryotic world that are better represented by networks than by "classical" phylogenetic trees. In this review we summarize the nature and activities of MGEs, and the problems that presently limit their analysis on a large scale. We propose routes to improve their annotation in the flow of genomic and metagenomic sequences that currently exist and those that become available. We describe network analysis of evolutionary relationships among some MGE categories and sketch out possible developments of this type of approach to get more insight into the role of the mobilome in bacterial adaptation and evolution.
Affiliation(s)
- Ariane Toussaint: Laboratoire de Bioinformatique des Génomes et des Réseaux, Université Libre de Bruxelles, Bruxelles, Belgium
22
Soares SC, Abreu VAC, Ramos RTJ, Cerdeira L, Silva A, Baumbach J, Trost E, Tauch A, Hirata R, Mattos-Guaraldi AL, Miyoshi A, Azevedo V. PIPS: pathogenicity island prediction software. PLoS One 2012; 7:e30848. [PMID: 22355329 PMCID: PMC3280268 DOI: 10.1371/journal.pone.0030848] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 07/12/2011] [Accepted: 12/22/2011] [Indexed: 01/08/2023]
Abstract
The adaptability of pathogenic bacteria to hosts is influenced by the genomic plasticity of the bacteria, which can be increased by such mechanisms as horizontal gene transfer. Pathogenicity islands play a major role in this type of gene transfer because they are large, horizontally acquired regions that harbor clusters of virulence genes that mediate the adhesion, colonization, invasion, immune system evasion, and toxigenic properties of the acceptor organism. Currently, pathogenicity islands are mainly identified in silico based on various characteristic features: (1) deviations in codon usage, G+C content or dinucleotide frequency and (2) insertion sequences and/or tRNA genetic flanking regions together with transposase coding genes. Several computational techniques for identifying pathogenicity islands exist. However, most of these techniques are only directed at the detection of horizontally transferred genes and/or the absence of certain genomic regions of the pathogenic bacterium in closely related non-pathogenic species. Here, we present a novel software suite designed for the prediction of pathogenicity islands (pathogenicity island prediction software, or PIPS). In contrast to other existing tools, our approach is capable of utilizing multiple features for pathogenicity island detection in an integrative manner. We show that PIPS provides better accuracy than other available software packages. As an example, we used PIPS to study the veterinary pathogen Corynebacterium pseudotuberculosis, in which we identified seven putative pathogenicity islands.
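One of the compositional features listed in this abstract, deviation in G+C content, can be illustrated with a sliding-window scan. The Python sketch below is a generic illustration and not the PIPS implementation; the function names, window size, step and threshold are assumptions chosen for the example.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def gc_deviant_windows(genome, window=1000, step=500, threshold=0.1):
    """Return (start, end, deviation) for windows whose G+C fraction
    differs from the genome-wide fraction by at least `threshold`."""
    genome_gc = gc_fraction(genome)
    flagged = []
    for start in range(0, len(genome) - window + 1, step):
        deviation = gc_fraction(genome[start:start + window]) - genome_gc
        if abs(deviation) >= threshold:
            flagged.append((start, start + window, deviation))
    return flagged
```

A compositional flag of this kind is only one line of evidence; the abstract stresses combining it with other features such as codon usage, flanking tRNA genes and transposase genes.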
Affiliation(s)
- Siomar C. Soares: Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Vinícius A. C. Abreu: Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Louise Cerdeira: Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
- Artur Silva: Department of Genetics, Federal University of Pará, Belém, Pará, Brazil
- Jan Baumbach: Department of Computer Science, Max-Planck-Institut für Informatik, Saarbrücken, Saarland, Germany
- Eva Trost: Center for Biotechnology, Bielefeld University, Bielefeld, Nordrhein-Westfalen, Germany
- Andreas Tauch: Center for Biotechnology, Bielefeld University, Bielefeld, Nordrhein-Westfalen, Germany
- Raphael Hirata: Microbiology and Immunology Discipline, Medical Sciences Faculty, State University of Rio de Janeiro, Rio de Janeiro, Brazil
- Ana L. Mattos-Guaraldi: Microbiology and Immunology Discipline, Medical Sciences Faculty, State University of Rio de Janeiro, Rio de Janeiro, Brazil
- Anderson Miyoshi: Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Vasco Azevedo: Department of General Biology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil; Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
23
Song L, Pan Y, Chen S, Zhang X. Structural characteristics of genomic islands associated with GMP synthases as integration hotspot among sequenced microbial genomes. Comput Biol Chem 2012; 36:62-70. [PMID: 22306813 DOI: 10.1016/j.compbiolchem.2012.01.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 03/15/2011] [Revised: 12/23/2011] [Accepted: 01/02/2012] [Indexed: 11/18/2022]
Abstract
tRNA, tmRNA and some small RNA genes are recognized as general integration hotspots of genomic islands (GIs). The GMP synthase gene (guaA) has been identified for the first time as an insertion hotspot of foreign DNA fragments. Thirty-four islands integrated into guaA genes were identified in the 987 completely sequenced archaeal and bacterial genomes. These alien islands were widely distributed within host strains belonging to Proteobacteria, Firmicutes and Actinobacteria. The analysis of structural characteristics of these GIs is important for further determination of island mobility and transfer into suitable hosts. The putative functional integrases encoded by guaA-associated islands were mainly phage P4 integrases, followed by phage PhiLC3 integrases. Interestingly, island-encoded AlpA is close to the P4 integrase and is deduced to be the positive transcriptional regulatory factor of the P4 integrase, while the XRE protein is close to the PhiLC3 integrase and may be the negative transcriptional regulatory factor of the PhiLC3 integrase. An 8-bp consensus sequence (5'-GAGTGGGA-3') within the direct repeats of these GIs is the cutting site of the P4 integrases encoded by guaA-associated islands, in which the third nucleotide (G) is the key site. The large-scale investigation of the content of GMP synthase gene hotspots may be useful to find important functional islands within members of many key bacterial species and to transfer useful islands into more suitable hosts.
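Locating the 8-bp consensus sequence quoted in this abstract (5'-GAGTGGGA-3') in a candidate sequence can be sketched in a few lines of Python. The helper names are hypothetical, matching is exact only, and nothing here reproduces the authors' analysis; a hit on its own does not show that a site is a functional integrase cutting site.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing A, C, G, T."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def find_motif(seq, motif="GAGTGGGA"):
    """Return sorted (position, strand) hits for exact motif matches.

    Positions are 0-based offsets on the given (forward) sequence;
    '-' hits are where the reverse complement of the motif occurs.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        pos = seq.find(pattern)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(pattern, pos + 1)
    return sorted(hits)
```

Overlapping occurrences are reported because the search restarts one base after each hit.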
Affiliation(s)
- Lei Song: State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, PR China
24
Chronology and pattern of integration of tandem genomic islands associated with the tmRNA gene in Escherichia coli and Salmonella enterica. CHINESE SCIENCE BULLETIN-CHINESE 2011. [DOI: 10.1007/s11434-011-4749-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
25
Chakraborty A, Ghosh S, Chowdhary G, Maulik U, Chakrabarti S. DBETH: a Database of Bacterial Exotoxins for Human. Nucleic Acids Res 2011; 40:D615-20. [PMID: 22102573 PMCID: PMC3244994 DOI: 10.1093/nar/gkr942] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 01/02/2023]
Abstract
Pathogenic bacteria produce protein toxins to survive in the hostile environments defined by the host's defense systems and immune response. Recent progress in high-throughput genome sequencing and structure determination techniques has contributed to a better understanding of the mechanisms of action of bacterial toxins at the cellular and molecular levels leading to pathogenicity. It is fair to assume that with time more and more unknown toxins will emerge, not only through the discovery of newer species but also due to the genetic rearrangement of existing bacterial genomes. Hence, it is crucial to organize a systematic compilation and subsequent analyses of the inherent features of known bacterial toxins. We developed a Database for Bacterial ExoToxins (DBETH, http://www.hpppi.iicb.res.in/btox/), which contains sequence, structure, interaction network and analytical results for 229 toxins categorized within 24 mechanistic and activity types from 26 bacterial genera. The main objective of this database is to provide a comprehensive knowledgebase for human pathogenic bacterial toxins where various important sequence, structure and physico-chemical property based analyses are provided. Further, we have developed a prediction server attached to this database which aims to identify bacterial toxin-like sequences either by establishing homology with known toxin sequences/domains or by classifying bacterial toxin-specific features using support vector-based machine learning techniques.
Affiliation(s)
- Abhijit Chakraborty: Department of Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, Council for Scientific and Industrial Research, Jadavpur University, Kolkata, WB 700 032, India
26
Zhu B, Zhou S, Lou M, Zhu J, Li B, Xie G, Jin G, De Mot R. Characterization and inference of gene gain/loss along Burkholderia evolutionary history. Evol Bioinform Online 2011; 7:191-200. [PMID: 22084562 PMCID: PMC3210638 DOI: 10.4137/ebo.s7510] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Abstract
A comparative analysis of 60 complete Burkholderia genomes was conducted to obtain insight into the evolutionary history behind the diversity and pathogenicity at species level. A concatenated multiprotein phyletic pattern and a dataset with Burkholderia clusters of orthologous genes (BuCOGs) were constructed. The extent of horizontal gene transfer (HGT) was assessed using a Markov-based probabilistic method. A reconstruction of the history of gene gains and losses shows that more than half of the Burkholderia gene families are inferred to have experienced HGT at least once during their evolution. Further analysis revealed that the number of gene gains and losses was correlated with branch length. Genomic island (GEI) analysis based on evolutionary history reconstruction not only revealed that most genes in ancient GEIs were gained but also suggested that the fraction of the genome located in GEIs is higher in the small chromosomes than in the large chromosomes in Burkholderia. The mapping of coexpressed genes onto biological pathway schemes revealed that pathogenicity of Burkholderia strains is probably mainly determined by the genes gained in their ancestor. Taken together, our results strongly support that gene gain and loss, especially in ancient evolutionary history, play an important role in strain divergence, pathogenicity determinants of Burkholderia and GEI formation.
Affiliation(s)
- Bo Zhu: State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Shengli Zhou: Environmental Monitoring Center of Zhejiang Province, Hangzhou 310015, China
- Miaomiao Lou: State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Jun Zhu: Institute of Bioinformatics, Zhejiang University, Hangzhou 310029, China
- Bin Li: State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- Guanlin Xie: State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China
- GuLei Jin: State Key Laboratory of Rice Biology and Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Ministry of Agriculture, Institute of Biotechnology, Zhejiang University, Hangzhou 310029, China; Institute of Bioinformatics, Zhejiang University, Hangzhou 310029, China
- René De Mot: Centre of Microbial and Plant Genetics, Katholieke Universiteit Leuven, 3001 Heverlee-Leuven 3001, Belgium
27
Bezuidt O, Pierneef R, Mncube K, Lima-Mendez G, Reva ON. Mainstreams of horizontal gene exchange in enterobacteria: consideration of the outbreak of enterohemorrhagic E. coli O104:H4 in Germany in 2011. PLoS One 2011; 6:e25702. [PMID: 22022434 PMCID: PMC3195076 DOI: 10.1371/journal.pone.0025702] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 07/14/2011] [Accepted: 09/08/2011] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Escherichia coli O104:H4 caused a severe outbreak in Europe in 2011. The strain TY-2482 sequenced from this outbreak allowed the discovery of its closest relatives but failed to resolve the ways in which it originated and evolved. Given this, may we expect similar outbreaks to occur recurrently or spontaneously in the future? The inability to answer these questions shows the limitations of current comparative and evolutionary genomics methods. PRINCIPAL FINDINGS: The study revealed oscillations of gene exchange in enterobacteria, which originated from marine γ-Proteobacteria. These mobile genetic elements have become recombination hotspots and effective 'vehicles' ensuring a wide distribution of successful combinations of fitness and virulence genes among enterobacteria. Two remarkable peculiarities of the strain TY-2482 and its relatives were observed: i) retention of genetic primitiveness by these strains, as they somehow avoided the main fluxes of horizontal gene transfer which effectively penetrated other enterobacteria; ii) acquisition of antibiotic resistance genes in a plasmid genomic island of β-Proteobacteria origin which ontologically is unrelated to the predominant genomic islands of enterobacteria. CONCLUSIONS: Oscillations of horizontal gene exchange activity were reported, which result from a counterbalance between the acquired resistance of bacteria towards existing mobile vectors and the generation of new vectors in the environmental microflora. We hypothesized that TY-2482 may originate from a genetically primitive lineage of E. coli that evolved in confined geographical areas and was brought by human migration or cattle trade onto an intersection of several independent streams of horizontal gene exchange. Development of a system for monitoring new and highly active gene exchange events was proposed.
Affiliation(s)
- Oliver Bezuidt: Bioinformatics and Computational Biology Unit, Department of Biochemistry, University of Pretoria, Pretoria, South Africa
- Rian Pierneef: Bioinformatics and Computational Biology Unit, Department of Biochemistry, University of Pretoria, Pretoria, South Africa
- Kingdom Mncube: Bioinformatics and Computational Biology Unit, Department of Biochemistry, University of Pretoria, Pretoria, South Africa
- Gipsi Lima-Mendez: Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe), Université Libre de Bruxelles, Bruxelles, Belgium
- Oleg N. Reva: Bioinformatics and Computational Biology Unit, Department of Biochemistry, University of Pretoria, Pretoria, South Africa
28
Integrative analysis of transcriptome and genome indicates two potential genomic islands are associated with pathogenesis of Mycobacterium tuberculosis. Gene 2011; 489:21-9. [PMID: 21924330 DOI: 10.1016/j.gene.2011.08.019] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/17/2011] [Revised: 07/20/2011] [Accepted: 08/26/2011] [Indexed: 11/20/2022]
Abstract
Mycobacterium tuberculosis (M.tb) is a successful human pathogen that is widely prevalent throughout the world. Genomic islands (GIs) are thought to be related to pathogenicity. In this study, we predicted two potential genomic islands in the M.tb genome, named GI-1 and GI-2. The genes belonging to the PE_PGRS family in GI-1 and the genes involved in sulfolipid-1 (SL-1) synthesis in GI-2 appear to be strongly associated with M.tb pathogenesis. Sequence analysis revealed that the five PGRS genes are more polymorphic than other PGRS members in fully virulent M.tb complex strains at a significance level of 0.01, but not in attenuated strains. Expression analysis of microarray data collected from the literature showed that GI-1 genes, especially Rv3508, might be correlated with the response to the inhibition of aerobic respiration. Microarray analysis also showed that SL-1 cluster genes are drastically down-regulated in attenuated strains relative to fully virulent strains. We speculated that the effect of SL-1 on M.tb pathogenicity could be associated with long-term survival and the establishment of persistence during infection. Additionally, the gene Rv3508 in GI-1 was under positive selection. Rv3508 may be involved in the response of M.tb to the inhibition of aerobic respiration by low oxygen or the drug PA-824, and this may be a common feature of genes in GI-1. These findings may provide some novel insights into M.tb physiology and pathogenesis.
|
29
|
Roos TE, van Passel MWJ. A quantitative account of genomic island acquisitions in prokaryotes. BMC Genomics 2011; 12:427. [PMID: 21864345 PMCID: PMC3176501 DOI: 10.1186/1471-2164-12-427] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 06/12/2011] [Accepted: 08/24/2011] [Indexed: 12/15/2022]
Abstract
Background: Microbial genomes do not merely evolve through the slow accumulation of mutations, but also, and often more dramatically, by taking up new DNA in a process called horizontal gene transfer. These innovation leaps in the acquisition of new traits can take place via the introgression of single genes, but also through the acquisition of large gene clusters, which are termed Genomic Islands. Since only a small proportion of all the DNA diversity has been sequenced, it can be hard to find the appropriate donors for acquired genes via sequence alignments from databases. In contrast, relative oligonucleotide frequencies represent a remarkably stable genomic signature in prokaryotes, which facilitates compositional comparisons as an alignment-free alternative for phylogenetic relatedness. In this project, we test whether Genomic Islands identified in individual bacterial genomes have a similar genomic signature, in terms of relative dinucleotide frequencies, and can therefore be expected to originate from a common donor species. Results: When multiple Genomic Islands are present within a single genome, we find that up to 28% of these are compositionally very similar to each other, indicative of frequent recurring acquisitions from the same donor to the same acceptor. Conclusions: This represents the first quantitative assessment of common directional transfer events in prokaryotic evolutionary history. We suggest that many of the resident Genomic Islands per prokaryotic genome originated from the same source, which may have implications with respect to their regulatory interactions, and for the elucidation of the common origins of these acquired gene clusters.
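As an illustration of what a comparison of relative dinucleotide frequencies can look like, the sketch below computes a Karlin-style signature (observed dinucleotide frequency divided by the product of the two mononucleotide frequencies) and the mean absolute difference between two signatures. This is only a minimal sketch under stated assumptions: the abstract does not give the exact formula or distance used, the function names are hypothetical, and only one strand is counted.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Relative dinucleotide frequencies f_xy / (f_x * f_y) on one strand.

    Assumption: a Karlin-style odds ratio; the cited study may use a
    different normalisation (for example, strand-symmetrised counts).
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    total_mono = sum(mono.values())
    total_di = sum(di.values())
    if total_mono == 0 or total_di == 0:
        raise ValueError("sequence needs at least two adjacent A/C/G/T bases")
    signature = {}
    for x, y in product(BASES, repeat=2):
        fx = mono[x] / total_mono
        fy = mono[y] / total_mono
        fxy = di[x + y] / total_di
        # A dinucleotide whose bases never occur gets a ratio of 0.
        signature[x + y] = fxy / (fx * fy) if fx > 0 and fy > 0 else 0.0
    return signature


def signature_distance(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)
```

Under this sketch, two islands with a small `signature_distance` would be candidates for a shared donor; any cutoff for "compositionally very similar" would have to come from the paper itself.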
Affiliation(s)
- Tom E Roos
- Genomics Coordination Center, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
|
30
|
Song L, Zhang X. Accurate localization and excision of genomic islands in four strains of Pseudomonas aeruginosa and Pseudomonas fluorescens. ACTA ACUST UNITED AC 2011. [DOI: 10.1007/s11434-011-4410-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
|
31
|
Li M, Shen X, Yan J, Han H, Zheng B, Liu D, Cheng H, Zhao Y, Rao X, Wang C, Tang J, Hu F, Gao GF. GI-type T4SS-mediated horizontal transfer of the 89K pathogenicity island in epidemic Streptococcus suis serotype 2. Mol Microbiol 2011; 79:1670-83. [PMID: 21244532 PMCID: PMC3132442 DOI: 10.1111/j.1365-2958.2011.07553.x] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Indexed: 11/27/2022]
Abstract
Pathogenicity islands (PAIs), a distinct type of genomic island (GI), play important roles in the rapid adaptation and increased virulence of pathogens. 89K is a newly identified PAI in epidemic Streptococcus suis isolates that are related to the two recent large-scale outbreaks of human infection in China. However, its mechanism of evolution and contribution to the epidemic spread of S. suis 2 remain unknown. In this study, the potential for mobilization of 89K was evaluated, and its putative transfer mechanism was investigated. We report that 89K can spontaneously excise to form an extrachromosomal circular product. The precise excision is mediated by an 89K-borne integrase through site-specific recombination, with help from an excisionase. The 89K excision intermediate acts as a substrate for lateral transfer to non-89K S. suis 2 recipients, where it reintegrates site-specifically into the target site. The conjugal transfer of 89K occurred via a GI type IV secretion system (T4SS) encoded in 89K, at a frequency of 10^-6 transconjugants per donor. This is the first demonstration of horizontal transfer of a Gram-positive PAI mediated by a GI-type T4SS. We propose that these genetic events are important in the emergence, pathogenesis and persistence of epidemic S. suis 2 strains.
Affiliation(s)
- Ming Li
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
|
32
|
Abstract
Because the properties of horizontally-transferred genes will reflect the mutational proclivities of their donor genomes, they often show atypical compositional properties relative to native genes. Parametric methods use these discrepancies to identify bacterial genes recently acquired by horizontal transfer. However, compositional patterns of native genes vary stochastically, leaving no clear boundary between typical and atypical genes. As a result, while strongly atypical genes are readily identified as alien, genes of ambiguous character are poorly classified when a single threshold separates typical and atypical genes. This limitation affects all parametric methods that examine genes independently, and escaping it requires the use of additional genomic information. We propose that the performance of all parametric methods can be improved by using a multiple-threshold approach. First, strongly atypical alien genes and strongly typical native genes would be identified using conservative thresholds. Genes with ambiguous compositional features would then be classified by examining gene context, including the class (native or alien) of flanking genes. By including additional genomic information in a multiple-threshold framework, we observed a remarkable improvement in the performance of several popular, but algorithmically distinct, methods for alien gene detection.
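The multiple-threshold idea described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the per-gene atypicality scores, the two threshold values, the function name and the rule for resolving ambiguous genes (alien only when the nearest confidently classified gene on each side is alien) are all assumptions made for the example.

```python
def classify_genes(scores, low, high):
    """Label genes as 'native' or 'alien' using two thresholds plus context.

    scores: atypicality score per gene, in chromosomal order (hypothetical).
    low:    scores at or below this are confidently native.
    high:   scores at or above this are confidently alien.
    """
    if low > high:
        raise ValueError("low threshold must not exceed high threshold")

    # Pass 1: conservative thresholds; ambiguous genes stay None.
    confident = [
        "alien" if s >= high else "native" if s <= low else None
        for s in scores
    ]

    # Pass 2: resolve ambiguous genes from their flanking context.
    resolved = list(confident)
    for i, label in enumerate(confident):
        if label is not None:
            continue
        left = next(
            (confident[j] for j in range(i - 1, -1, -1)
             if confident[j] is not None),
            None,
        )
        right = next(
            (confident[j] for j in range(i + 1, len(confident))
             if confident[j] is not None),
            None,
        )
        # Assumed rule: alien only if both nearest confident flanks are alien.
        resolved[i] = "alien" if left == "alien" and right == "alien" else "native"
    return resolved
```

For example, with `low=0.3` and `high=0.8`, a gene scoring 0.5 that sits between two genes scoring 0.9 and 0.95 is labelled alien, whereas the same score next to a confidently native gene is labelled native.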
Affiliation(s)
- Rajeev K Azad
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
|
33
|
Shrivastava S, Reddy CVSK, Mande SS. INDeGenIUS, a new method for high-throughput identification of specialized functional islands in completely sequenced organisms. J Biosci 2011; 35:351-64. [PMID: 20826944 DOI: 10.1007/s12038-010-0040-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 01/13/2023]
Abstract
Genomic islands (GIs) are regions in the genome which are believed to have been acquired via horizontal gene transfer events and are thus likely to be compositionally distinct from the rest of the genome. The majority of the genes located in a GI encode a particular function. Depending on the genes they encode, GIs can be classified into various categories, such as 'metabolic islands', 'symbiotic islands', 'resistance islands', 'pathogenicity islands', etc. Computational GI detection is well established, and many algorithms are available. We present a new method termed Improved N-mer based Detection of Genomic Islands Using Sequence-clustering (INDeGenIUS) for the identification of GIs. This method was applied to 400 completely sequenced species belonging to proteobacteria. Based on the genes encoded in the identified GIs, the GIs were grouped into 6 categories: metabolic islands, symbiotic islands, resistance islands, secretion islands, pathogenicity islands and motility islands. Several new islands of interest which had been missed by earlier algorithms were picked up as GIs by INDeGenIUS. The present algorithm has potential application in the identification of functionally relevant GIs in the large number of genomes that are being sequenced. Investigation of the predicted GIs in pathogens may lead to identification of potential drug/vaccine candidates.
Affiliation(s)
- Sakshi Shrivastava
- Bio-Sciences Division, Innovation Labs, Tata Consultancy Services, 1 Software Units Layout, Hyderabad 500 081, India
|
34
|
Chain PS, Xie G, Starkenburg SR, Scholz MB, Beckloff N, Lo CC, Davenport KW, Reitenga KG, Daligault HE, Detter JC, Freitas TA, Gleasner CD, Green LD, Han CS, McMurry KK, Meincke LJ, Shen X, Zeytun A. Genomics for Key Players in the N Cycle. Methods Enzymol 2011; 496:289-318. [DOI: 10.1016/b978-0-12-386489-5.00012-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/05/2023]
|
35
|
|
36
|
Touzain F, Denamur E, Médigue C, Barbe V, El Karoui M, Petit MA. Small variable segments constitute a major type of diversity of bacterial genomes at the species level. Genome Biol 2010; 11:R45. [PMID: 20433696 PMCID: PMC2884548 DOI: 10.1186/gb-2010-11-4-r45] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 10/29/2009] [Revised: 03/15/2010] [Accepted: 04/30/2010] [Indexed: 01/17/2023]
Abstract
BACKGROUND: Analysis of large scale diversity in bacterial genomes has mainly focused on elements such as pathogenicity islands, or more generally, genomic islands. These comprise numerous genes and confer important phenotypes, which are present or absent depending on strains. We report that despite this widely accepted notion, most diversity at the species level is composed of much smaller DNA segments, 20 to 500 bp in size, which we call microdiversity. RESULTS: We performed a systematic analysis of the variable segments detected by multiple whole genome alignments at the DNA level on three species for which the greatest number of genomes have been sequenced: Escherichia coli, Staphylococcus aureus, and Streptococcus pyogenes. Among the numerous sites of variability, 62 to 73% were loci of microdiversity, many of which were located within genes. They contribute to phenotypic variations, as 3 to 6% of all genes harbor microdiversity, and 1 to 9% of total genes are located downstream from a microdiversity locus. Microdiversity loci are particularly abundant in genes encoding membrane proteins. In-depth analysis of the E. coli alignments shows that most of the diversity does not correspond to known mobile or repeated elements, and it is likely that they were generated by illegitimate recombination. An intriguing class of microdiversity includes small blocks of highly diverged sequences, whose origin is discussed. CONCLUSIONS: This analysis uncovers the importance of this small-sized genome diversity, which we expect to be present in a wide range of bacteria, and possibly also in many eukaryotic genomes.
Affiliation(s)
- Fabrice Touzain
- INRA, UMR1319, Micalis, Bat 222, Jouy en Josas, 78350, France
|
37
|
The pheV phenylalanine tRNA gene in Klebsiella pneumoniae clinical isolates is an integration hotspot for possible niche-adaptation genomic islands. Curr Microbiol 2010; 60:210-6. [PMID: 19921332 DOI: 10.1007/s00284-009-9526-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/02/2009] [Accepted: 10/14/2009] [Indexed: 10/20/2022]
Abstract
Horizontally acquired genomic islands may allow bacteria to conquer and colonize previously uncharted niches. Four Klebsiella pneumoniae tRNA gene insertion hotspots (arg6, asn34, met56, and pheV) in 101 clinical isolates derived from blood, sputum, wound, bile or urine specimens were screened by long-range PCR for the presence or absence of integrated islands. The pheV phenylalanine tRNA gene was the most frequently occupied site and harbored at least three entirely distinct types of islands: (1) KpGI-1, a 3.7 kb island coding for four proteins, three of which showed high similarity to two hypothetical proteins and a Gcn5-related N-acetyltransferase in Salmonella enterica, (2) KpGI-2, a 6.4 kb island coding for five proteins including a truncated phage-like integrase, two helicase-related proteins, and a homolog of the functionally elusive Fic protein, and (3) KpGI-3, a 12.6 kb island which carried seven fimbriae-related genes, first identified in MGH78578. Consistent with the niche-adaptation hypothesis, KpGI-1-like islands which coded for the putative acetyltransferase were significantly over-represented in sputum isolates as compared to urine (P < 0.001), blood (P < 0.05) or bile (P < 0.05) derived isolates. Despite the unique nature of KpGI-2, likely homologs of orf5_KpGI-2 that coded for Fic were also found at undefined locations in six other clinical isolates, though none possessed the other KpGI-2 genes. We propose that the pheV-associated islands described in this study may contribute to fine tuning and adaptation of K. pneumoniae strains toward preferred infection and/or colonization pathways.
|
38
|
Mann S, Li J, Chen YPP. Insights into bacterial genome composition through variable target GC content profiling. J Comput Biol 2010; 17:79-96. [PMID: 20078399 DOI: 10.1089/cmb.2009.0058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
This study presents a new computational method for guanine (G) and cytosine (C), or GC, content profiling based on the idea of multiple resolution sampling (MRS). The benefit of our new approach over existing techniques follows from its ability to locate significant regions without prior knowledge of the sequence or of the features being sought. The use of MRS has provided novel insights into bacterial genome composition. Key findings include those that are related to the core composition of bacterial genomes, to the identification of large genomic islands (in Enterobacterial genomes), and to the identification of surface protein determinants in human pathogenic organisms (e.g., Staphylococcus genomes). We observed that bacterial surface binding proteins maintain abnormal GC content, potentially pointing to a viral origin. This study has demonstrated that GC content holds a high informational worth and hints at many underlying evolutionary processes. For online Supplementary Material, see www.liebertonline.com.
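The general idea of profiling GC content at several resolutions can be illustrated with plain sliding windows of different sizes. The sketch below is not the MRS algorithm from the paper, whose details are not given in the abstract; the function names, the choice of window sizes and the half-window step are assumptions made for illustration only.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def gc_profile(seq, window, step):
    """GC fraction of each full window, as (start, gc) pairs."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def multi_resolution_profile(seq, windows):
    """One GC profile per window size, each using a half-window step."""
    return {w: gc_profile(seq, w, max(1, w // 2)) for w in windows}
```

Windows whose GC fraction departs strongly from `gc_fraction` of the whole sequence at several window sizes would be the regions worth inspecting; how "strongly" is defined is left open here.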
Affiliation(s)
- Scott Mann
- Faculty of Science and Technology, Deakin University, Melbourne, Victoria, Australia
|
39
|
Mann S, Chen YPP. Bacterial genomic G+C composition-eliciting environmental adaptation. Genomics 2010; 95:7-15. [DOI: 10.1016/j.ygeno.2009.09.002] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.7] [Received: 05/13/2009] [Revised: 08/18/2009] [Accepted: 09/01/2009] [Indexed: 01/12/2023]
|
40
|
Abstract
This chapter gives an overview of the most commonly used biological databases of nucleic acid sequences and their structures. We cover general sequence databases, databases for specific DNA features, noncoding RNA sequences, and RNA secondary and tertiary structures.
|
41
|
Innovation for ascertaining genomic islands in PAO1 and PA14 of Pseudomonas aeruginosa. ACTA ACUST UNITED AC 2009. [DOI: 10.1007/s11434-009-0598-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/26/2022]
|
42
|
Gao J, Chen LL. Theoretical methods for identifying important functional genes in bacterial genomes. Res Microbiol 2009; 161:1-8. [PMID: 19900539 DOI: 10.1016/j.resmic.2009.10.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 05/22/2009] [Revised: 10/05/2009] [Accepted: 10/21/2009] [Indexed: 12/30/2022]
Abstract
Some functional genes, such as essential genes, highly expressed genes and horizontally transferred genes, play important roles in the survival and pathogenicity of bacteria. This review attempts to summarize current computational methods in identifying the above functional genes from bacterial genomes, which is of significant importance in exploring the bacterial genomes.
Affiliation(s)
- Junxiang Gao
- School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, PR China
|
43
|
Ramsay JP, Sullivan JT, Jambari N, Ortori CA, Heeb S, Williams P, Barrett DA, Lamont IL, Ronson CW. A LuxRI-family regulatory system controls excision and transfer of the Mesorhizobium loti strain R7A symbiosis island by activating expression of two conserved hypothetical genes. Mol Microbiol 2009; 73:1141-55. [PMID: 19682258 DOI: 10.1111/j.1365-2958.2009.06843.x] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Indexed: 01/16/2023]
Abstract
The symbiosis island ICEMlSym(R7A) of Mesorhizobium loti R7A is an integrative and conjugative element (ICE) that carries genes required for a nitrogen-fixing symbiosis with Lotus species. ICEMlSym(R7A) encodes homologues (TraR, TraI1 and TraI2) of proteins that regulate plasmid transfer by quorum sensing in rhizobia and agrobacteria. Introduction of traR cloned on a plasmid induced excision of ICEMlSym(R7A) in all cells, a 1000-fold increase in the production of 3-oxo-C6-homoserine lactone (3-oxo-C6-HSL) and a 40-fold increase in conjugative transfer. These effects were dependent on traI1 but not traI2. Induction of expression from the traI1 and traI2 promoters required the presence of plasmid-borne traR and either traI1 or 100 pM 3-oxo-C6-HSL, suggesting that traR expression or TraR activity is repressed in wild-type cells by a mechanism that can be overcome by additional copies of traR. The traI2 gene formed an operon with hypothetical genes msi172 and msi171 that were essential for ICEMlSym(R7A) excision and transfer. Our data suggest that derepressed TraR in conjunction with TraI1-synthesized 3-oxo-C6-HSL regulates excision and transfer of ICEMlSym(R7A) through expression of msi172 and msi171. Homologues of msi172 and msi171 were present on putative ICEs in several alpha-proteobacteria, indicating a conserved role in ICE excision and transfer.
Affiliation(s)
- Joshua P Ramsay
- Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand
|
44
|
Rocco F, De Gregorio E, Colonna B, Di Nocera PP. Stenotrophomonas maltophilia genomes: a start-up comparison. Int J Med Microbiol 2009; 299:535-46. [PMID: 19574092 DOI: 10.1016/j.ijmm.2009.05.004] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Received: 03/25/2009] [Revised: 05/06/2009] [Accepted: 05/21/2009] [Indexed: 10/20/2022]
Abstract
The whole DNA sequences of 2 Stenotrophomonas maltophilia strains isolated from the blood of a cancer patient (K279a) and the poplar Populus trichocarpa (R551-3) have been compared. The 2 chromosomes exhibit extensive synteny, but each is punctuated by about 40 genomic islands (GEIs), which vary in size from 3 to 70kb, and may encode up to about 50 proteins. A large set of smaller DNA sequences, encoding strain-specific 'solo' orfs, contributes to genetic heterogeneity in a significant manner. S. maltophilia GEIs potentially encode several proteins mediating interactions with the environment such as transmembrane proteins, haemagglutinins, components of type I and IV secretion systems, and efflux proteins having a role in metal and/or drug resistance. The presence of specific GEIs in the S. maltophilia population was monitored by PCR and slot-blot analyses. Data suggest that some islands are present at sites different from those identified in K279a and that alternative islands may be integrated at mapped sites.
Affiliation(s)
- Francesco Rocco
- Dipartimento di Biologia e Patologia Cellulare e Molecolare, Università Federico II, 80131 Napoli, Italy
|
45
|
Cortez D, Forterre P, Gribaldo S. A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes. Genome Biol 2009; 10:R65. [PMID: 19531232 PMCID: PMC2718499 DOI: 10.1186/gb-2009-10-6-r65] [Citation(s) in RCA: 95] [Impact Index Per Article: 6.3] [Received: 03/25/2009] [Revised: 06/04/2009] [Accepted: 06/16/2009] [Indexed: 11/10/2022]
Abstract
A large-scale survey of potential recently acquired integrative elements in 119 archaeal and bacterial genomes reveals that many recently acquired genes have originated from integrative elements. Background: Archaeal and bacterial genomes contain a number of genes of foreign origin that arose from recent horizontal gene transfer, but the role of integrative elements (IEs), such as viruses, plasmids, and transposable elements, in this process has not been extensively quantified. Moreover, it is not known whether IEs play an important role in the origin of ORFans (open reading frames without matches in current sequence databases), whose proportion remains stable despite the growing number of complete sequenced genomes. Results: We have performed a large-scale survey of potential recently acquired IEs in 119 archaeal and bacterial genomes. We developed an accurate in silico Markov model-based strategy to identify clusters of genes that show atypical sequence composition (clusters of atypical genes or CAGs) and are thus likely to be recently integrated foreign elements, including IEs. Our method identified a high number of new CAGs. Probabilistic analysis of gene content indicates that 56% of these new CAGs are likely IEs, whereas only 7% likely originated via horizontal gene transfer from distant cellular sources. Thirty-four percent of CAGs remain unassigned, which may reflect a still poor sampling of IEs associated with bacterial and archaeal diversity. Moreover, our study contributes to the issue of the origin of ORFans, because 39% of these are found inside CAGs, many of which likely represent recently acquired IEs. Conclusions: Our results strongly indicate that archaeal and bacterial genomes contain an impressive proportion of recently acquired foreign genes (including ORFans) coming from a still largely unexplored reservoir of IEs.
Affiliation(s)
- Diego Cortez
- Institut Pasteur, Département de Microbiologie, Unité de Biologie Moléculaire du Gène chez les Extrêmophiles, Paris, France.
|
46
|
Analysis of ten Brucella genomes reveals evidence for horizontal gene transfer despite a preferred intracellular lifestyle. J Bacteriol 2009; 191:3569-79. [PMID: 19346311 DOI: 10.1128/jb.01767-08] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Indexed: 01/21/2023]
Abstract
The facultative intracellular bacterial pathogen Brucella infects a wide range of warm-blooded land and marine vertebrates and causes brucellosis. Currently, there are nine recognized Brucella species based on host preferences and phenotypic differences. The availability of 10 different genomes consisting of two chromosomes and representing six of the species allowed for a detailed comparison among themselves and relatives in the order Rhizobiales. Phylogenomic analysis of ortholog families shows limited divergence but distinct radiations, producing four clades as follows: Brucella abortus-Brucella melitensis, Brucella suis-Brucella canis, Brucella ovis, and Brucella ceti. In addition, Brucella phylogeny does not appear to reflect the phylogeny of Brucella species' preferred hosts. About 4.6% of protein-coding genes seem to be pseudogenes, which is a relatively large fraction. Only B. suis 1330 appears to have an intact beta-ketoadipate pathway, responsible for utilization of plant-derived compounds. In contrast, this pathway in the other species is highly pseudogenized and consistent with the "domino theory" of gene death. There are distinct shared anomalous regions (SARs) found in both chromosomes as the result of horizontal gene transfer unique to Brucella and not shared with its closest relative Ochrobactrum, a soil bacterium, suggesting their acquisition occurred in spite of a predominantly intracellular lifestyle. In particular, SAR 2-5 appears to have been acquired by Brucella after it became intracellular. The SARs contain many genes, including those involved in O-polysaccharide synthesis and type IV secretion, which if mutated or absent significantly affect the ability of Brucella to survive intracellularly in the infected host.
|
47
|
Pavlović-Lazetić GM, Mitić NS, Beljanski MV. n-Gram characterization of genomic islands in bacterial genomes. Comput Methods Programs Biomed 2009; 93:241-56. [PMID: 19101056 PMCID: PMC7185697 DOI: 10.1016/j.cmpb.2008.10.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/20/2008] [Revised: 09/10/2008] [Accepted: 10/21/2008] [Indexed: 05/27/2023]
Abstract
The paper presents a novel, n-gram-based method for analysis of bacterial genome segments known as genomic islands (GIs). Identification of GIs in bacterial genomes is an important task since many of them represent inserts that may contribute to bacterial evolution and pathogenesis. In order to characterize and distinguish GIs from the rest of the genome, binary classification of islands based on n-gram frequency distribution has been performed. It consists of testing the agreement of the islands' n-gram frequency distributions with those of the complete genome and the backbone sequence. In addition, a statistic based on the maximal order Markov model is used to identify significantly overrepresented and underrepresented n-grams in islands. The results may be used as a basis for Zipf-like analysis suggesting that some of the n-grams are overrepresented in a subset of islands and underrepresented in the backbone, or vice versa, thus complementing the binary classification. The method is applied to strain-specific regions in the Escherichia coli O157:H7 EDL933 genome (O-islands), resulting in two groups of O-islands with different n-gram characteristics. It refines a characterization based on other compositional features such as G+C content and codon usage, and may help in identification of GIs, and also in research and development of adequate drugs targeting virulence genes in them.
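To make the over- and under-representation idea concrete, the sketch below compares the observed count of an n-gram with the count expected under a maximal order Markov model, using the standard estimator E(w1..wn) = N(w1..wn-1) · N(w2..wn) / N(w2..wn-1). It is an illustration only: the abstract does not state the exact statistic or significance test used in the paper, and the function names are hypothetical.

```python
from collections import Counter


def count_ngrams(seq, n):
    """Counts of all overlapping n-grams in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def expected_count(seq, word):
    """Expected count of word under a maximal order (n-2) Markov model."""
    word = word.upper()
    n = len(word)
    if n < 3:
        raise ValueError("the estimator needs n-grams of length 3 or more")
    shorter = count_ngrams(seq, n - 1)
    core = count_ngrams(seq, n - 2)[word[1:-1]]
    if core == 0:
        return 0.0
    return shorter[word[:-1]] * shorter[word[1:]] / core


def observed_over_expected(seq, word):
    """Ratio above 1 suggests over-representation, below 1 under-representation."""
    word = word.upper()
    expected = expected_count(seq, word)
    if expected == 0:
        return 0.0
    return count_ngrams(seq, len(word))[word] / expected
```

Computing this ratio separately for an island and for the backbone sequence gives a simple way to see which n-grams behave differently in the two; judging whether a difference is significant would need the test described in the paper.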
|
48
|
Langille MGI, Brinkman FSL. IslandViewer: an integrated interface for computational identification and visualization of genomic islands. Bioinformatics 2009; 25:664-5. [PMID: 19151094 PMCID: PMC2647836 DOI: 10.1093/bioinformatics/btp030] [Citation(s) in RCA: 334] [Impact Index Per Article: 22.3] [Indexed: 11/14/2022]
Abstract
Summary: Genomic islands (clusters of genes of probable horizontal origin; GIs) play a critical role in medically important adaptations of bacteria. Recently, several computational methods have been developed to predict GIs that utilize either sequence composition bias or comparative genomics approaches. IslandViewer is a web accessible application that provides the first user-friendly interface for obtaining precomputed GI predictions, or predictions from user-inputted sequence, using the most accurate methods for genomic island prediction: IslandPick, IslandPath-DIMOB and SIGI-HMM. The graphical interface allows easy viewing and downloading of island data in multiple formats, at both the chromosome and gene level, for method-specific, or overlapping, GI predictions. Availability: The IslandViewer web service is available at http://www.pathogenomics.sfu.ca/islandviewer and the source code is freely available under the GNU GPL license. Contact: brinkman@sfu.ca
Affiliation(s)
- Morgan G I Langille
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
|
49
|
Mitić NS, Pavlović-Lažetić GM, Beljanski MV. Could n-gram analysis contribute to genomic island determination? J Biomed Inform 2008; 41:936-43. [DOI: 10.1016/j.jbi.2008.03.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 09/17/2007] [Revised: 03/13/2008] [Accepted: 03/13/2008] [Indexed: 11/28/2022]
|
50
|
Uchiyama I. Multiple genome alignment for identifying the core structure among moderately related microbial genomes. BMC Genomics 2008; 9:515. [PMID: 18976470 PMCID: PMC2615449 DOI: 10.1186/1471-2164-9-515] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 08/05/2008] [Accepted: 10/31/2008] [Indexed: 12/04/2022]
Abstract
Background: Identifying the set of intrinsically conserved genes, or the genomic core, among related genomes is crucial for understanding prokaryotic genomes where horizontal gene transfers are common. Although core genome identification appears to be obvious among very closely related genomes, it becomes more difficult when more distantly related genomes are compared. Here, we consider the core structure as a set of sufficiently long segments in which gene orders are conserved so that they are likely to have been inherited mainly through vertical transfer, and developed a method for identifying the core structure by finding the order of pre-identified orthologous groups (OGs) that maximally retains the conserved gene orders. Results: The method was applied to genome comparisons of two well-characterized families, Bacillaceae and Enterobacteriaceae, and identified their core structures comprising 1438 and 2125 OGs, respectively. The core sets contained most of the essential genes and their related genes, which were primarily included in the intersection of the two core sets comprising around 700 OGs. The definition of the genomic core based on gene order conservation was demonstrated to be more robust than the simpler approach based only on gene conservation. We also investigated the core structures in terms of G+C content homogeneity and phylogenetic congruence, and found that the core genes primarily exhibited the expected characteristic, i.e., being indigenous and sharing the same history, more than the non-core genes. Conclusion: The results demonstrate that our strategy of genome alignment based on gene order conservation can provide an effective approach to identify the genomic core among moderately related microbial genomes.
Affiliation(s)
- Ikuo Uchiyama
- Department of Theoretical Biology, National Institute for Basic Biology, National Institutes of Natural Sciences, Okazaki, Aichi, Japan.
|