1. Ibtehaz N, Ahmed I, Ahmed MS, Rahman MS, Azad RK, Bayzid MS. SSG-LUGIA: Single Sequence based Genome Level Unsupervised Genomic Island Prediction Algorithm. Brief Bioinform 2021; 22:6290171. [PMID: 34058749 DOI: 10.1093/bib/bbab116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2020] [Revised: 03/11/2021] [Accepted: 03/13/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Genomic Islands (GIs) are clusters of genes that are mobilized through horizontal gene transfer. GIs play a pivotal role in bacterial evolution as a mechanism of diversification and adaptation to different niches. Therefore, identification and characterization of GIs in bacterial genomes is important for understanding bacterial evolution. However, quantifying GIs is inherently difficult, and the existing methods suffer from low prediction accuracy and a precision-recall trade-off. Moreover, several of them are supervised in nature, and thus their application to newly sequenced genomes is hampered by their dependency on the functional annotation of existing genomes.
RESULTS We present SSG-LUGIA, a completely automated and unsupervised approach for identifying GIs and horizontally transferred genes. SSG-LUGIA is a novel method based on an unsupervised anomaly detection technique, accompanied by further refinement using cues from the signal processing literature. SSG-LUGIA leverages the atypical compositional biases of the alien genes to localize GIs in prokaryotic genomes. SSG-LUGIA was assessed on a large benchmark dataset, 'IslandPick', and on a set of 15 well-studied genomes in the literature, followed by a thorough analysis on the well-understood Salmonella typhi CT18 genome. Furthermore, the efficacy of SSG-LUGIA in identifying horizontally transferred genes was evaluated on two additional bacterial genomes, namely, those of Corynebacterium diphtheriae NCTC13129 and Pseudomonas aeruginosa LESB58. SSG-LUGIA was examined on draft genomes and was demonstrated to be efficient as an ensemble method.
CONCLUSIONS Our results indicate that SSG-LUGIA achieved superior performance in comparison to frequently used existing methods. Importantly, it yielded a better trade-off between precision and recall than the existing methods. Its nondependency on the functional annotation of genomes makes it suitable for analyzing newly sequenced, yet uncharacterized genomes. Thus, our study is a significant advance in identification of GIs and horizontally transferred genes. SSG-LUGIA is available as an open source software at https://nibtehaz.github.io/SSG-LUGIA/.
Affiliation(s)
- Ishtiaque Ahmed
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Md Sabbir Ahmed
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- M Sohel Rahman
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
- Rajeev K Azad
  - Department of Biological Sciences and BioDiscovery Institute, University of North Texas, Denton, TX, USA; Department of Mathematics, University of North Texas, Denton, TX, USA
- Md Shamsuzzoha Bayzid
  - Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh
2. Saak CC, Dinh CB, Dutton RJ. Experimental approaches to tracking mobile genetic elements in microbial communities. FEMS Microbiol Rev 2020; 44:606-630. [PMID: 32672812 PMCID: PMC7476777 DOI: 10.1093/femsre/fuaa025] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/24/2020] [Accepted: 06/29/2020] [Indexed: 12/19/2022] Open
Abstract
Horizontal gene transfer is an important mechanism of microbial evolution and is often driven by the movement of mobile genetic elements between cells. Because microbes live within communities, various mechanisms of horizontal gene transfer and types of mobile elements can co-occur. However, the ways in which horizontal gene transfer impacts and is impacted by communities containing diverse mobile elements have been challenging to address. Thus, the field would benefit from incorporating community-level information and novel approaches alongside existing methods. Emerging technologies for tracking mobile elements and assigning them to host organisms provide promise for understanding the web of potential DNA transfers in diverse microbial communities more comprehensively. Compared to existing experimental approaches, chromosome conformation capture and methylome analyses have the potential to simultaneously study various types of mobile elements and their associated hosts. We also briefly discuss how fermented food microbiomes, given their experimental tractability and moderate species complexity, make ideal models to which to apply the techniques discussed herein and how they can be used to address outstanding questions in the field of horizontal gene transfer in microbial communities.
Affiliation(s)
- Christina C Saak
  - Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Cong B Dinh
  - Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Rachel J Dutton
  - Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
3. Bertelli C, Tilley KE, Brinkman FSL. Microbial genomic island discovery, visualization and analysis. Brief Bioinform 2020; 20:1685-1698. [PMID: 29868902 PMCID: PMC6917214 DOI: 10.1093/bib/bby042] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 02/23/2018] [Revised: 04/30/2018] [Indexed: 12/27/2022] Open
Abstract
Horizontal gene transfer (also called lateral gene transfer) is a major mechanism for microbial genome evolution, enabling rapid adaptation and survival in specific niches. Genomic islands (GIs), commonly defined as clusters of bacterial or archaeal genes of probable horizontal origin, are of particular medical, environmental and/or industrial interest, as they disproportionately encode virulence factors and some antimicrobial resistance genes and may harbor entire metabolic pathways that confer a specific adaptation (solvent resistance, symbiosis properties, etc.). As large-scale analyses of microbial genomes increase, such as for genomic epidemiology investigations of infectious disease outbreaks in public health, there is increased appreciation of the need to accurately predict and track GIs. Over the past decade, numerous computational tools have been developed to tackle the challenges inherent in accurate GI prediction. We review here the main types of GI prediction methods and discuss their advantages and limitations for a routine analysis of microbial genomes in this era of rapid whole-genome sequencing. An assessment is provided of 20 GI prediction software methods that use sequence-composition bias to identify the GIs, using a reference GI data set from 104 genomes obtained using an independent comparative genomics approach. Finally, we present guidelines to assist researchers in effectively identifying these key genomic regions.
Affiliation(s)
- Claire Bertelli
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Keith E Tilley
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
- Fiona S L Brinkman
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada
4. Lu B, Leong HW. GI-Cluster: Detecting genomic islands via consensus clustering on multiple features. J Bioinform Comput Biol 2018; 16:1840010. [DOI: 10.1142/s0219720018400103] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
The accurate detection of genomic islands (GIs) in microbial genomes is important for both evolutionary study and medical research, because GIs may promote genome evolution and contain genes involved in pathogenesis. Various computational methods have been developed to predict GIs over the years. However, most of them cannot make full use of GI-associated features to achieve desirable performance. Additionally, many methods cannot be directly applied to newly sequenced genomes. We develop a new method called GI-Cluster, which provides an effective way to integrate multiple GI-related features via consensus clustering. GI-Cluster does not require training datasets or existing genome annotations, but it can still achieve comparable or better performance than supervised learning methods in comprehensive evaluations. Moreover, GI-Cluster is widely applicable, either to complete and incomplete genomes or to initial GI predictions from other programs. GI-Cluster also provides plots to visualize the distribution of predicted GIs and related features. GI-Cluster is available at https://github.com/icelu/GI_Cluster.
Affiliation(s)
- Bingxin Lu
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, Singapore 117417, Republic of Singapore
- Hon Wai Leong
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, Singapore 117417, Republic of Singapore
5. Bush EC, Clark AE, DeRanek CA, Eng A, Forman J, Heath K, Lee AB, Stoebel DM, Wang Z, Wilber M, Wu H. xenoGI: reconstructing the history of genomic island insertions in clades of closely related bacteria. BMC Bioinformatics 2018; 19:32. [PMID: 29402213 PMCID: PMC5799925 DOI: 10.1186/s12859-018-2038-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/10/2017] [Accepted: 01/23/2018] [Indexed: 12/13/2022] Open
Abstract
Background Genomic islands play an important role in microbial genome evolution, providing a mechanism for strains to adapt to new ecological conditions. A variety of computational methods, both genome-composition based and comparative, have been developed to identify them. Some of these methods are explicitly designed to work in single strains, while others make use of multiple strains. In general, existing methods do not identify islands in the context of the phylogeny in which they evolved. Even multiple strain approaches are best suited to identifying genomic islands that are present in one strain but absent in others. They do not automatically recognize islands which are shared between some strains in the clade or determine the branch on which these islands inserted within the phylogenetic tree. Results We have developed a software package, xenoGI, that identifies genomic islands and maps their origin within a clade of closely related bacteria, determining which branch they inserted on. It takes as input a set of sequenced genomes and a tree specifying their phylogenetic relationships. Making heavy use of synteny information, the package builds gene families in a species-tree-aware way, and then attempts to combine into islands those families whose members are adjacent and whose most recent common ancestor is shared. The package provides a variety of text-based analysis functions, as well as the ability to export genomic islands into formats suitable for viewing in a genome browser. We demonstrate the capabilities of the package with several examples from enteric bacteria, including an examination of the evolution of the acid fitness island in the genus Escherichia. In addition we use output from simulations and a set of known genomic islands from the literature to show that xenoGI can accurately identify genomic islands and place them on a phylogenetic tree. 
Conclusions xenoGI is an effective tool for studying the history of genomic island insertions in a clade of microbes. It identifies genomic islands, and determines which branch they inserted on within the phylogenetic tree for the clade. Such information is valuable because it helps us understand the adaptive path that has produced living species.
Affiliation(s)
- Eliot C Bush
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
- Anne E Clark
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA; Current address: Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, 98195-5065, WA, USA
- Carissa A DeRanek
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
- Alexander Eng
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA; Current address: Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, 98195-5065, WA, USA
- Juliet Forman
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
- Kevin Heath
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA; Current address: Department of Biology and Biotechnology, Worcester Polytechnic Institute, 100 Institute Rd., Worcester, 01609, MA, USA
- Alexander B Lee
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA; Current address: Quantitative Biosciences Program, Georgia Institute of Technology, 837 State Street, Atlanta, 30332-0430, GA, USA
- Daniel M Stoebel
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
- Zunyan Wang
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
- Matthew Wilber
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
- Helen Wu
  - Department of Biology, Harvey Mudd College, 301 Platt Blvd., Claremont, 91711, CA, USA
6. Oliveira Alvarenga D, Moreira LM, Chandler M, Varani AM. A Practical Guide for Comparative Genomics of Mobile Genetic Elements in Prokaryotic Genomes. Methods Mol Biol 2018; 1704:213-242. [PMID: 29277867 DOI: 10.1007/978-1-4939-7463-4_7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Indexed: 12/13/2022]
Abstract
Mobile genetic elements (MGEs) are an important feature of prokaryote genomes but are seldom well annotated and, consequently, are often underestimated. MGEs include transposons (Tn), insertion sequences (ISs), prophages, genomic islands (GEIs), integrons, and integrative and conjugative elements (ICEs). They are intimately involved in genome evolution and promote phenomena such as genomic expansion and rearrangement, emergence of virulence and pathogenicity, and symbiosis. In spite of the annotation bottleneck, there are so far at least 75 different programs and databases dedicated to prokaryotic MGE analysis and annotation, and this number is rapidly growing. Here, we present a practical guide to explore, compare, and visualize prokaryote MGEs using a combination of available software and databases tailored to small scale genome analyses. This protocol can be coupled with expert MGE annotation and exploited for evolutionary and comparative genomic analyses.
Affiliation(s)
- Danillo Oliveira Alvarenga
  - Departamento de Tecnologia, Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista "Júlio de Mesquita Filho"-UNESP, Jaboticabal, SP, Brazil
- Leandro M Moreira
  - Departamento de Ciências Biológicas-Núcleo de Pesquisas em Ciências Biológicas-NUPEB, Universidade Federal de Ouro Preto, Ouro Preto, Minas Gerais, Brazil
- Mick Chandler
  - Laboratoire de Microbiologie et Génétique Moléculaires, CNRS 118, Route de Narbonne, 31062, Toulouse Cedex, France
- Alessandro M Varani
  - Departamento de Tecnologia, Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista "Júlio de Mesquita Filho"-UNESP, Jaboticabal, SP, Brazil
7. Lu B, Leong HW. Computational methods for predicting genomic islands in microbial genomes. Comput Struct Biotechnol J 2016; 14:200-6. [PMID: 27293536 PMCID: PMC4887561 DOI: 10.1016/j.csbj.2016.05.001] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 01/31/2016] [Revised: 05/01/2016] [Accepted: 05/03/2016] [Indexed: 11/02/2022] Open
Abstract
Clusters of genes acquired by lateral gene transfer in microbial genomes are broadly referred to as genomic islands (GIs). GIs often carry genes important for genome evolution and adaptation to niches, such as genes involved in pathogenesis and antibiotic resistance. Therefore, GI prediction has gradually become an important part of microbial genome analysis. Despite inherent difficulties in identifying GIs, many computational methods have been developed and show good performance. In this mini-review, we first summarize the general challenges in predicting GIs. Then we group existing GI detection methods by their input, briefly describe representative methods in each group, and discuss their advantages as well as limitations. Finally, we look into the potential improvements for better GI prediction.
Affiliation(s)
- Bingxin Lu
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, Singapore 117417, Republic of Singapore
- Hon Wai Leong
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, Singapore 117417, Republic of Singapore
8. Dhillon BK, Laird MR, Shay JA, Winsor GL, Lo R, Nizam F, Pereira SK, Waglechner N, McArthur AG, Langille MGI, Brinkman FSL. IslandViewer 3: more flexible, interactive genomic island discovery, visualization and analysis. Nucleic Acids Res 2015; 43:W104-8. [PMID: 25916842 PMCID: PMC4489224 DOI: 10.1093/nar/gkv401] [Citation(s) in RCA: 239] [Impact Index Per Article: 23.9] [Received: 03/03/2015] [Accepted: 04/15/2015] [Indexed: 01/12/2023] Open
Abstract
IslandViewer (http://pathogenomics.sfu.ca/islandviewer) is a widely used web-based resource for the prediction and analysis of genomic islands (GIs) in bacterial and archaeal genomes. GIs are clusters of genes of probable horizontal origin, and are of high interest since they disproportionately encode genes involved in medically and environmentally important adaptations, including antimicrobial resistance and virulence. We now report a major new release of IslandViewer, since the last release in 2013. IslandViewer 3 incorporates a completely new genome visualization tool, IslandPlot, enabling for the first time interactive genome analysis and gene search capabilities using synchronized circular, horizontal and vertical genome views. In addition, more curated virulence factors and antimicrobial resistance genes have been incorporated, and homologs of these genes identified in closely related genomes using strict filters. Pathogen-associated genes have been re-calculated for all pre-computed complete genomes. For user-uploaded genomes to be analysed, IslandViewer 3 can also now handle incomplete genomes, with an improved queuing system on compute nodes to handle user demand. Overall, IslandViewer 3 represents a significant new version of this GI analysis software, with features that may make it more broadly useful for general microbial genome analysis and visualization.
Affiliation(s)
- Bhavjinder K Dhillon
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Matthew R Laird
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Julie A Shay
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Geoffrey L Winsor
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Raymond Lo
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Fazmin Nizam
  - M.G. DeGroote Institute for Infectious Disease Research, Department of Biochemistry and Biomedical Sciences, DeGroote School of Medicine, McMaster University, Hamilton, ON L8S 4K1, Canada
- Sheldon K Pereira
  - M.G. DeGroote Institute for Infectious Disease Research, Department of Biochemistry and Biomedical Sciences, DeGroote School of Medicine, McMaster University, Hamilton, ON L8S 4K1, Canada
- Nicholas Waglechner
  - M.G. DeGroote Institute for Infectious Disease Research, Department of Biochemistry and Biomedical Sciences, DeGroote School of Medicine, McMaster University, Hamilton, ON L8S 4K1, Canada
- Andrew G McArthur
  - M.G. DeGroote Institute for Infectious Disease Research, Department of Biochemistry and Biomedical Sciences, DeGroote School of Medicine, McMaster University, Hamilton, ON L8S 4K1, Canada
- Morgan G I Langille
  - Department of Pharmacology, Dalhousie University, Halifax, NS B3H 4R2, Canada
- Fiona S L Brinkman
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
9. Identifying pathogenicity islands in bacterial pathogenomics using computational approaches. Pathogens 2014; 3:36-56. [PMID: 25437607 PMCID: PMC4235732 DOI: 10.3390/pathogens3010036] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 11/30/2013] [Revised: 12/30/2013] [Accepted: 01/07/2014] [Indexed: 12/22/2022] Open
Abstract
High-throughput sequencing technologies have made it possible to study bacteria through analyzing their genome sequences. For instance, comparative genome sequence analyses can reveal phenomena such as gene loss, gene gain, or gene exchange in a genome. By analyzing pathogenic bacterial genomes, we can discover that pathogenic genomic regions in many pathogenic bacteria are horizontally transferred from other bacteria; these regions are known as pathogenicity islands (PAIs). PAIs have some detectable properties, such as having different genomic signatures than the rest of the host genome, and containing mobility genes so that they can be integrated into the host genome. In this review, we will discuss various pathogenicity island-associated features and current computational approaches for the identification of PAIs. Existing pathogenicity island databases and related computational resources will also be discussed, so that researchers may find them useful for studies of bacterial evolution and pathogenicity mechanisms.
10. Dhillon BK, Chiu TA, Laird MR, Langille MGI, Brinkman FSL. IslandViewer update: Improved genomic island discovery and visualization. Nucleic Acids Res 2013; 41:W129-32. [PMID: 23677610 PMCID: PMC3692065 DOI: 10.1093/nar/gkt394] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Indexed: 11/16/2022] Open
Abstract
IslandViewer (http://pathogenomics.sfu.ca/islandviewer) is a web-accessible application for the computational prediction and analysis of genomic islands (GIs) in bacterial and archaeal genomes. GIs are clusters of genes of probable horizontal origin and are of high interest because they disproportionately encode virulence factors and other adaptations of medical, environmental and industrial interest. Many computational tools exist for the prediction of GIs, but three of the most accurate methods are available in integrated form via IslandViewer: IslandPath-DIMOB, SIGI-HMM and IslandPick. IslandViewer GI predictions are precomputed for all complete microbial genomes from National Center for Biotechnology Information, with an option to upload other genomes and/or perform customized analyses using different settings. Here, we report recent changes to the IslandViewer framework that have vastly improved its efficiency in handling an increasing number of users, plus better facilitate custom genome analyses. Users may also now overlay additional annotations such as virulence factors, antibiotic resistance genes and pathogen-associated genes on top of current GI predictions. Comparisons of GIs between user-selected genomes are now facilitated through a highly requested side-by-side viewer. IslandViewer improvements aim to provide a more flexible interface, coupled with additional highly relevant annotation information, to aid analysis of GIs in diverse microbial species.
Affiliation(s)
- Bhavjinder K Dhillon
  - Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia V5A 1S6, Canada