1
Yu J, Jia Y, Yu Q, Lin L, Li C, Chen B, Zhong P, Lin X, Li H, Sun Y, Zhong X, He Y, Huang X, Lin S, Pan Y. Deciphering complex antibiotic resistance patterns in Helicobacter pylori through whole genome sequencing and machine learning. Front Cell Infect Microbiol 2024; 13:1306368. [PMID: 38379956 PMCID: PMC10878306 DOI: 10.3389/fcimb.2023.1306368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Accepted: 12/06/2023] [Indexed: 02/22/2024] Open
Abstract
Introduction: Helicobacter pylori (H. pylori, Hp) affects billions of people worldwide. However, the emerging resistance of Hp to antibiotics challenges the effectiveness of current treatments. Investigating the genotype-phenotype connection for Hp using next-generation sequencing could enhance our understanding of this resistance.
Methods: In this study, we analyzed 52 Hp strains collected from various hospitals. The susceptibility of these strains to five antibiotics was assessed using the agar dilution assay. Whole-genome sequencing was then performed to screen the antimicrobial resistance (AMR) genotypes of these Hp strains. To model the relationship between drug resistance and genotype, we employed univariate statistical tests, unsupervised machine learning, and supervised machine learning techniques, including the development of support vector machine models.
Results: Our models for predicting amoxicillin resistance demonstrated 66% sensitivity and 100% specificity, while those for clarithromycin resistance showed 100% sensitivity and 100% specificity. These results outperformed the known resistance sites for amoxicillin (A1834G) and clarithromycin (A2147), which had sensitivities of 22.2% and 87%, and specificities of 100% and 96%, respectively.
Discussion: Our study demonstrates that predictive modeling using supervised learning algorithms with feature selection can yield diagnostic models with higher predictive power than models relying on individual single-nucleotide polymorphism (SNP) sites. This approach significantly contributes to enhancing the precision and effectiveness of antibiotic treatment strategies for Hp infections. The application of whole-genome sequencing for Hp presents a promising pathway for advancing personalized medicine in this context.
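The sensitivity and specificity figures quoted in this abstract follow the usual confusion-matrix definitions. As a minimal sketch, the calculation can be written as below; the labels are made up for illustration and are not the study's data, and the function name `sensitivity_specificity` is an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = resistant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Sensitivity: fraction of truly resistant strains that are called resistant.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of truly susceptible strains that are called susceptible.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical phenotypes (1 = resistant, 0 = susceptible) and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this toy case one resistant strain is missed and no susceptible strain is misclassified, which is the same pattern (imperfect sensitivity, perfect specificity) the abstract reports for amoxicillin.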
Affiliation(s)
- Jianwei Yu
  - Department of Gastroenterology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Yan Jia
  - Department of Gastroenterology, the 7th Medical Center of PLA General Hospital, Beijing, China
- Qichao Yu
  - Center for Systems Biology, Intelliphecy, Main Building, Beishan Industrial Zone, Shenzhen, Guangdong, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
- Lan Lin
  - Department of Gastroenterology, Xiamen Humanity Hospital, Xiamen, Fujian, China
- Chao Li
  - Department of Gastroenterology, Peking University Aerospace School of Clinical Medicine, Beijing, China
- Bowang Chen
  - Center for Systems Biology, Intelliphecy, Main Building, Beishan Industrial Zone, Shenzhen, Guangdong, China
  - Department of Data Science, Intelliphecy, Nanjing, Jiangsu, China
- Pingyu Zhong
  - Department of Gastroenterology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Xueqing Lin
  - Department of Gastroenterology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Huilan Li
  - Department of Nephrology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Yinping Sun
  - Department of Gastroenterology, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Xuejing Zhong
  - Department of Science and Education, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Yuqi He
  - Department of Gastroenterology, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, China
- Xiaoyun Huang
  - Center for Systems Biology, Intelliphecy, Main Building, Beishan Industrial Zone, Shenzhen, Guangdong, China
- Shuangming Lin
  - Department of Gastrointestinal Surgery, Longyan First Affiliated Hospital of Fujian Medical University, Longyan, Fujian, China
- Yuanming Pan
  - Cancer Research Center, Beijing Chest Hospital, Capital Medical University, Beijing Tuberculosis and Thoracic Tumor Research Institute, Beijing, China
2
Dutta A, McDonald BA, Croll D. Combined reference-free and multi-reference based GWAS uncover cryptic variation underlying rapid adaptation in a fungal plant pathogen. PLoS Pathog 2023; 19:e1011801. [PMID: 37972199 PMCID: PMC10688896 DOI: 10.1371/journal.ppat.1011801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 11/30/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023] Open
Abstract
Microbial pathogens often harbor substantial functional diversity driven by structural genetic variation. Rapid adaptation from such standing variation threatens global food security and human health. Genome-wide association studies (GWAS) provide a powerful approach to identify genetic variants underlying recent pathogen adaptation. However, the reliance on single reference genomes and single nucleotide polymorphisms (SNPs) obscures the true extent of adaptive genetic variation. Here, we show quantitatively how a combination of multiple reference genomes and reference-free approaches captures substantially more relevant genetic variation compared to single reference mapping. We performed reference-genome based association mapping across 19 reference-quality genomes covering the diversity of the species. We contrasted the results with a reference-free (i.e., k-mer) approach using raw whole-genome sequencing data in a panel of 145 strains collected across the global distribution range of the fungal wheat pathogen Zymoseptoria tritici. We mapped the genetic architecture of 49 life history traits including virulence, reproduction and growth in multiple stressful environments. The inclusion of additional reference genome SNP datasets provides a nearly linear increase in additional loci mapped through GWAS. Variants detected through the k-mer approach explained a higher proportion of phenotypic variation than a reference genome-based approach and revealed functionally confirmed loci that classic GWAS approaches failed to map. The power of GWAS in microbial pathogens can be significantly enhanced by comprehensively capturing structural genetic variation. Our approach is generalizable to a large number of species and will uncover novel mechanisms driving rapid adaptation of pathogens.
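To make the reference-free idea concrete, the sketch below builds a k-mer presence/absence table directly from reads, with no reference genome involved. It is a toy illustration under stated assumptions, not the authors' pipeline: the function names, the two-strain input, and `k = 4` are all hypothetical, and practical k-mer GWAS tools additionally handle reverse complements, sequencing errors, and much larger k.

```python
def kmers(seq, k):
    """Return the set of all k-mers found in one sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_absence(reads_by_strain, k):
    """Build a k-mer presence/absence table (assumes at least one strain).

    Returns (variable_kmers, matrix), where matrix[strain] is a 0/1 list
    aligned with variable_kmers. K-mers present in every strain are dropped
    because they cannot be associated with any phenotype difference.
    """
    per_strain = {}
    for strain, reads in reads_by_strain.items():
        found = set()
        for read in reads:
            found |= kmers(read, k)
        per_strain[strain] = found
    union = set().union(*per_strain.values())
    shared = set.intersection(*per_strain.values())
    variable = sorted(union - shared)
    matrix = {
        strain: [1 if km in found else 0 for km in variable]
        for strain, found in per_strain.items()
    }
    return variable, matrix


# Hypothetical reads: the two strains differ by a single base.
reads = {"strain_A": ["ACGTAC"], "strain_B": ["ACGTTC"]}
variable, matrix = presence_absence(reads, k=4)
print(variable)
print(matrix)
```

Each column of such a table can then be tested for association with a trait in the same way a SNP would be, which is why variation absent from a single reference genome remains visible to this approach.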
Affiliation(s)
- Anik Dutta
  - Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Bruce A. McDonald
  - Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Daniel Croll
  - Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
3
Graña-Miraglia L, Morales-Lizcano N, Wang PW, Hwang DM, Yau YCW, Waters VJ, Guttman DS. Predictive modeling of antibiotic eradication therapy success for new-onset Pseudomonas aeruginosa pulmonary infections in children with cystic fibrosis. PLoS Comput Biol 2023; 19:e1011424. [PMID: 37672526 PMCID: PMC10506723 DOI: 10.1371/journal.pcbi.1011424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 09/18/2023] [Accepted: 08/09/2023] [Indexed: 09/08/2023] Open
Abstract
Chronic Pseudomonas aeruginosa (Pa) lung infections are the leading cause of mortality among cystic fibrosis (CF) patients; therefore, the eradication of new-onset Pa lung infections is an important therapeutic goal that can have long-term health benefits. The use of early antibiotic eradication therapy (AET) has been shown to clear the majority of new-onset Pa infections, and it is hoped that identifying the underlying basis for AET failure will further improve treatment outcomes. Here we generated machine learning models to predict AET outcomes based on pathogen genomic data. We used a nested cross validation design, population structure control, and recursive feature selection to improve model performance and showed that incorporating population structure control was crucial for improving model interpretation and generalizability. Our best model, controlling for population structure and using only 30 recursively selected features, had an area under the curve of 0.87 for a holdout test dataset. The top-ranked features were generally associated with motility, adhesion, and biofilm formation.
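The point of a nested cross-validation design is that feature selection and tuning only ever see the outer-training samples, so the outer test fold gives an untouched performance estimate. The sketch below shows only the index bookkeeping, as a generic illustration rather than the authors' implementation; the function names, the interleaved fold assignment, and the sample and fold counts are assumptions.

```python
def kfold(indices, k):
    """Split a list of indices into k interleaved folds (no shuffling)."""
    return [indices[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Yield (outer_test, inner_splits) for a nested cross-validation.

    inner_splits is a list of (inner_train, inner_val) pairs built only from
    the outer-training samples, so the outer test fold is never used for
    feature selection or hyperparameter tuning.
    """
    all_idx = list(range(n_samples))
    for outer_test in kfold(all_idx, outer_k):
        held_out = set(outer_test)
        outer_train = [i for i in all_idx if i not in held_out]
        inner_splits = []
        for inner_val in kfold(outer_train, inner_k):
            val = set(inner_val)
            inner_train = [i for i in outer_train if i not in val]
            inner_splits.append((inner_train, inner_val))
        yield outer_test, inner_splits


# Hypothetical sizes: 12 isolates, 3 outer folds, 2 inner folds.
splits = list(nested_cv_splits(n_samples=12, outer_k=3, inner_k=2))
for outer_test, inner_splits in splits:
    print("outer test:", outer_test, "| inner splits:", len(inner_splits))
```

In a full pipeline, recursive feature selection would run inside each inner split, and population structure control would typically also require keeping closely related isolates in the same fold; neither is shown here.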
Affiliation(s)
- Lucía Graña-Miraglia
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Nadia Morales-Lizcano
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Pauline W. Wang
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
  - Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario, Canada
- David M. Hwang
  - Department of Laboratory Medicine and Pathobiology, Toronto, Ontario, Canada
  - Laboratory Medicine and Molecular Diagnostics, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Yvonne C. W. Yau
  - Department of Laboratory Medicine and Pathobiology, Toronto, Ontario, Canada
  - Department of Paediatric Laboratory Medicine, Division of Microbiology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Valerie J. Waters
  - Department of Pediatrics, Division of Infectious Diseases, The Hospital for Sick Children, Toronto, Ontario, Canada
  - Translational Medicine, Research Institute, Hospital for Sick Children, Toronto, Ontario, Canada
- David S. Guttman
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
  - Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario, Canada
4
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023] Open
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Martin H Rau
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Benjamín J Sánchez
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Kristian Jensen
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Ahmad A Zeidan
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
5
Iquebal MA, Jagannadham J, Jaiswal S, Prabha R, Rai A, Kumar D. Potential Use of Microbial Community Genomes in Various Dimensions of Agriculture Productivity and Its Management: A Review. Front Microbiol 2022; 13:708335. [PMID: 35655999 PMCID: PMC9152772 DOI: 10.3389/fmicb.2022.708335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/11/2021] [Accepted: 03/17/2022] [Indexed: 12/12/2022] Open
Abstract
Agricultural productivity is highly influenced by its associated microbial community. With advancements in omics technology, metagenomics is known to play a vital role in studies of the microbial world by unlocking the uncultured microbial populations present in the environment. Metagenomics is a diagnostic tool to target unique signature loci of plant and animal pathogens as well as beneficial microorganisms from samples. Here, we review various aspects of metagenomics, from experimental methods to techniques used for sequencing, as well as diversified computational resources, including databases and software tools. We examine in depth the application of metagenomics in agriculture across various areas, including pathogen and plant disease identification, disease resistance breeding, plant pest control, weed management, abiotic stress management, post-harvest management, discoveries in agriculture, sources of novel molecules/compounds, biosurfactants and natural products, identification of biosynthetic molecules, use in genetically modified crops, and antibiotic resistance genes. Metagenomics-wide association studies in agriculture, covering crop productivity rates, intercropping analysis, and the agronomic field, are also analyzed. This article is the first comprehensive study of its kind from an agricultural perspective, focusing on a wider range of applications of metagenomics and its association studies, together with their prospects.
Affiliation(s)
- Mir Asif Iquebal
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Jaisri Jagannadham
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sarika Jaiswal
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Ratna Prabha
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Anil Rai
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Dinesh Kumar
  - Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
  - School of Interdisciplinary and Applied Sciences, Central University of Haryana, Mahendergarh, Haryana, India
6
Allué-Guardia A, Koenig SSK, Martinez RA, Rodriguez AL, Bosilevac JM, Feng† P, Eppinger M. Pathogenomes and variations in Shiga toxin production among geographically distinct clones of Escherichia coli O113:H21. Microb Genom 2022; 8. [PMID: 35394418 PMCID: PMC9453080 DOI: 10.1099/mgen.0.000796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/15/2023] Open
Abstract
Infections with globally disseminated Shiga toxin-producing Escherichia coli (STEC) of the O113:H21 serotype can progress to severe clinical complications, such as hemolytic uremic syndrome (HUS). Two phylogeographically distinct clonal complexes have been established by multi locus sequence typing (MLST). Infections with ST-820 isolates circulating exclusively in Australia have caused severe human disease, such as HUS. Conversely, ST-223 isolates prevalent in the US and outside Australia seem to rarely cause severe human disease but are frequent contaminants. Following a genomic epidemiology approach, we wanted to gain insights into the underlying cause for this disparity. We examined the plasticity in the genome make-up and Shiga toxin production in a collection of 20 ST-820 and ST-223 strains isolated from produce, the bovine reservoir, and clinical cases. STEC are notorious for assembly into fragmented draft sequences when using short-read sequencing technologies due to the extensive and partly homologous phage complement. The application of long-read technology (LRT) sequencing yielded closed reference chromosomes and plasmids for two representative ST-820 and ST-223 strains. The established high-resolution framework, based on whole genome alignments, single nucleotide polymorphism (SNP)-typing and MLST, includes the chromosomes and plasmids of other publicly available O113:H21 sequences and allowed us to refine the phylogeographical boundaries of ST-820 and ST-223 complex isolates and to further identify a historic non-shigatoxigenic strain from Mexico as a quasi-intermediate. Plasmid comparison revealed strong correlations between the strains' featured pO113 plasmid genotypes and chromosomally inferred ST, which suggests coevolution of the chromosome and virulence plasmids. 
Our pathogenicity assessment revealed statistically significant differences in the Stx2a-production capabilities of ST-820 as compared to ST-223 strains under RecA-induced Stx phage mobilization, a condition that mimics Stx-phage induction. These observations suggest that ST-820 strains may confer an increased pathogenic potential in line with the strain-associated epidemiological metadata. Still, some of the tested ST-223 cultures sourced from contaminated produce or the bovine reservoir also produced Stx at levels comparable to those of ST-820 isolates, which calls for awareness and for continued surveillance of this lineage.
Affiliation(s)
- Anna Allué-Guardia
  - Department of Molecular Microbiology and Immunology, University of Texas at San Antonio, San Antonio, TX, USA
  - South Texas Center for Emerging Infectious Diseases (STCEID), San Antonio, TX, USA
- Sara S. K. Koenig
  - Department of Molecular Microbiology and Immunology, University of Texas at San Antonio, San Antonio, TX, USA
  - South Texas Center for Emerging Infectious Diseases (STCEID), San Antonio, TX, USA
- Ricardo A. Martinez
  - Department of Molecular Microbiology and Immunology, University of Texas at San Antonio, San Antonio, TX, USA
  - South Texas Center for Emerging Infectious Diseases (STCEID), San Antonio, TX, USA
- Armando L. Rodriguez
  - University of Texas at San Antonio, Research Computing Support Group, San Antonio, TX, USA
- Joseph M. Bosilevac
  - U.S. Department of Agriculture (USDA), Agricultural Research Service (ARS), Roman L. Hruska U.S. Meat Animal Research Center, Clay Center, NE, USA
- Peter Feng†
  - U.S. Food and Drug Administration (FDA), College Park, MD, USA
- Mark Eppinger
  - Department of Molecular Microbiology and Immunology, University of Texas at San Antonio, San Antonio, TX, USA
  - South Texas Center for Emerging Infectious Diseases (STCEID), San Antonio, TX, USA
  - *Correspondence: Mark Eppinger,
7
Allen JP, Snitkin E, Pincus NB, Hauser AR. Forest and Trees: Exploring Bacterial Virulence with Genome-wide Association Studies and Machine Learning. Trends Microbiol 2021; 29:621-633. [PMID: 33455849 PMCID: PMC8187264 DOI: 10.1016/j.tim.2020.12.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/28/2020] [Revised: 12/07/2020] [Accepted: 12/08/2020] [Indexed: 12/15/2022]
Abstract
The advent of inexpensive and rapid sequencing technologies has allowed bacterial whole-genome sequences to be generated at an unprecedented pace. This wealth of information has revealed an unanticipated degree of strain-to-strain genetic diversity within many bacterial species. Awareness of this genetic heterogeneity has corresponded with a greater appreciation of intraspecies variation in virulence. A number of comparative genomic strategies have been developed to link these genotypic and pathogenic differences with the aim of discovering novel virulence factors. Here, we review recent advances in comparative genomic approaches to identify bacterial virulence determinants, with a focus on genome-wide association studies and machine learning.
Affiliation(s)
- Jonathan P Allen
  - Department of Microbiology and Immunology, Loyola University Chicago Stritch School of Medicine, Maywood, IL 60153, USA
- Evan Snitkin
  - Department of Microbiology and Immunology, Department of Internal Medicine/Division of Infectious Diseases, University of Michigan, Ann Arbor, MI 48109, USA
- Nathan B Pincus
  - Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Alan R Hauser
  - Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA; Department of Medicine/Division of Infectious Diseases, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
8
Abstract
Alphaherpesviruses, as large double-stranded DNA viruses, were long considered to be genetically stable and to exist in a homogeneous state. Recently, the proliferation of high-throughput sequencing (HTS) and bioinformatics analysis has expanded our understanding of herpesvirus genomes and the variations found therein. Recent data indicate that herpesviruses exist as diverse populations, both in culture and in vivo, in a manner reminiscent of RNA viruses. In this review, we discuss the past, present, and potential future of alphaherpesvirus genomics, including the technical challenges that face the field. We also review how recent data has enabled genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures, including those introduced by cell culture. While we focus on the human alphaherpesviruses, we draw key insights from related veterinary species and from the beta- and gamma-subfamilies of herpesviruses. Promising technologies and potential future directions for herpesvirus genomics are highlighted as well, including the potential to link viral genetic differences to phenotypic and disease outcomes.
Affiliation(s)
- Chad V. Kuny
  - Departments of Biology, and Biochemistry and Molecular Biology, Center for Infectious Disease Dynamics, and the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, 16802, USA
- Moriah L. Szpara
  - Departments of Biology, and Biochemistry and Molecular Biology, Center for Infectious Disease Dynamics, and the Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, Pennsylvania, 16802, USA
9
Temporal encoding of bacterial identity and traits in growth dynamics. Proc Natl Acad Sci U S A 2020; 117:20202-20210. [PMID: 32747578 DOI: 10.1073/pnas.2008807117] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
In biology, it is often critical to determine the identity of an organism and phenotypic traits of interest. Whole-genome sequencing can be useful for this but has limited power for trait prediction. However, we can take advantage of the inherent information content of phenotypes to bypass these limitations. We demonstrate, in clinical and environmental bacterial isolates, that growth dynamics in standardized conditions can differentiate between genotypes, even among strains from the same species. We find that for pairs of isolates, there is little correlation between genetic distance, according to phylogenetic analysis, and phenotypic distance, as determined by growth dynamics. This absence of correlation underscores the challenge in using genomics to infer phenotypes and vice versa. Bypassing this complexity, we show that growth dynamics alone can robustly predict antibiotic responses. These findings are a foundation for a method to identify traits not easily traced to a genetic mechanism.
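The comparison described here, genetic distance against phenotypic distance over pairs of isolates, can be sketched as a correlation over pairwise distances. Everything in the example is an assumption made for illustration: the growth curves and genetic distances are invented, Euclidean distance between optical-density readings stands in for whatever phenotypic distance the authors used, and Pearson correlation is only one possible measure.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally sampled growth curves."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical optical-density readings at four time points per isolate.
curves = {
    "iso1": [0.05, 0.20, 0.60, 0.90],
    "iso2": [0.05, 0.10, 0.30, 0.80],
    "iso3": [0.05, 0.25, 0.65, 0.95],
}
# Hypothetical genetic distances (for example, from a phylogeny) per pair.
genetic = {
    ("iso1", "iso2"): 0.001,
    ("iso1", "iso3"): 0.020,
    ("iso2", "iso3"): 0.021,
}

pairs = list(combinations(sorted(curves), 2))
pheno_dist = [euclidean(curves[a], curves[b]) for a, b in pairs]
gen_dist = [genetic[pair] for pair in pairs]
r = pearson(gen_dist, pheno_dist)
print(f"correlation between genetic and phenotypic distance: {r:.2f}")
```

Because pairwise distances share isolates and are therefore not independent observations, significance for this kind of comparison is usually assessed with a permutation-based procedure such as a Mantel test rather than a standard correlation p-value.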
10
San JE, Baichoo S, Kanzi A, Moosa Y, Lessells R, Fonseca V, Mogaka J, Power R, de Oliveira T. Current Affairs of Microbial Genome-Wide Association Studies: Approaches, Bottlenecks and Analytical Pitfalls. Front Microbiol 2020; 10:3119. [PMID: 32082269 PMCID: PMC7002396 DOI: 10.3389/fmicb.2019.03119] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 10/03/2019] [Accepted: 12/24/2019] [Indexed: 12/12/2022] Open
Abstract
Microbial genome-wide association studies (mGWAS) are a new and exciting research field that is adapting human GWAS methods to understand how variations in microbial genomes affect host or pathogen phenotypes, such as drug resistance, virulence, host specificity and prognosis. Several computational tools and methods have been developed or adapted from human GWAS to facilitate the discovery of novel mutations and structural variations that are associated with the phenotypes of interest. However, no comprehensive, end-to-end, user-friendly tool is currently available. The development of a broadly applicable pipeline presents a real opportunity among computational biologists. Here, (i) we review the prominent and promising tools, (ii) discuss analytical pitfalls and bottlenecks in mGWAS, (iii) provide insights into the selection of appropriate tools, (iv) highlight the gaps that still need to be filled and how users and developers can work together to overcome these bottlenecks. Use of mGWAS research can inform drug repositioning decisions as well as accelerate the discovery and development of more effective vaccines and antimicrobials for pressing infectious diseases of global health significance, such as HIV, TB, influenza, and malaria.
Affiliation(s)
- James Emmanuel San
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Shakuntala Baichoo
  - Department of Digital Technologies, FoICDT, University of Mauritius, Réduit, Mauritius
- Aquillah Kanzi
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Yumna Moosa
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Richard Lessells
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Vagner Fonseca
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
  - Laboratório de Genética Celular e Molecular, ICB, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
- John Mogaka
  - Discipline of Public Health, University of Kwazulu-Natal, Durban, South Africa
- Robert Power
  - St Edmund Hall, Oxford University, Oxford, United Kingdom
- Tulio de Oliveira
  - Kwazulu-Natal Research and Innovation Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
  - Department of Global Health, University of Washington, Seattle, WA, United States
11
Escalas A, Hale L, Voordeckers JW, Yang Y, Firestone MK, Alvarez‐Cohen L, Zhou J. Microbial functional diversity: From concepts to applications. Ecol Evol 2019; 9:12000-12016. [PMID: 31695904 PMCID: PMC6822047 DOI: 10.1002/ece3.5670] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 03/05/2019] [Revised: 08/27/2019] [Accepted: 08/28/2019] [Indexed: 12/21/2022] Open
Abstract
Functional diversity is increasingly recognized by microbial ecologists as the essential link between biodiversity patterns and ecosystem functioning, determining the trophic relationships and interactions between microorganisms, their participation in biogeochemical cycles, and their responses to environmental changes. Consequently, its definition and quantification have practical and theoretical implications. In this opinion paper, we present a synthesis on the concept of microbial functional diversity from its definition to its application. Initially, we revisit the original definition of functional diversity, highlighting two fundamental aspects, the ecological unit under study and the functional traits used to characterize it. Then, we discuss how the particularities of the microbial world disallow the direct application of the concepts and tools developed for macroorganisms. Next, we provide a synthesis of the literature on the types of ecological units and functional traits available in microbial functional ecology. We also provide a list of more than 400 traits covering a wide array of environmentally relevant functions. Lastly, we provide examples of the use of functional diversity in microbial systems based on the different units and traits discussed herein. It is our hope that this paper will stimulate discussions and help the growing field of microbial functional ecology to realize a potential that thus far has only been attained in macrobial ecology.
Affiliation(s)
- Arthur Escalas
  - MARBEC, CNRS, Ifremer, IRD, University of Montpellier, Montpellier Cedex 5, France
  - Institute for Environmental Genomics and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, USA
- Lauren Hale
  - Water Management Research Unit, SJVASC, USDA‐ARS, Parlier, CA, USA
- Yunfeng Yang
  - State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
- Mary K. Firestone
  - Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA, USA
- Lisa Alvarez‐Cohen
  - Department of Civil and Environmental Engineering, University of California, Berkeley, CA, USA
- Jizhong Zhou
  - Institute for Environmental Genomics and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, OK, USA
  - State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing, China
  - Earth and Environmental Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
12
|
Vilne B, Meistere I, Grantiņa-Ieviņa L, Ķibilds J. Machine Learning Approaches for Epidemiological Investigations of Food-Borne Disease Outbreaks. Front Microbiol 2019; 10:1722. [PMID: 31447800 PMCID: PMC6691741 DOI: 10.3389/fmicb.2019.01722] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/07/2019] [Accepted: 07/12/2019] [Indexed: 12/14/2022] Open
Abstract
Foodborne diseases (FBDs) are infections of the gastrointestinal tract caused by foodborne pathogens (FBPs) such as bacteria [Salmonella, Listeria monocytogenes and Shiga toxin-producing E. coli (STEC)] and several viruses, but also parasites and some fungi. Artificial intelligence (AI) and its sub-discipline machine learning (ML) are re-emerging and gaining an ever increasing popularity in the scientific community and industry, and could lead to actionable knowledge in diverse ranges of sectors including epidemiological investigations of FBD outbreaks and antimicrobial resistance (AMR). As genotyping using whole-genome sequencing (WGS) is becoming more accessible and affordable, it is increasingly used as a routine tool for the detection of pathogens, and has the potential to differentiate between outbreak strains that are closely related, identify virulence/resistance genes and provide improved understanding of transmission events within hours to days. In most cases, the computational pipeline of WGS data analysis can be divided into four (though, not necessarily consecutive) major steps: de novo genome assembly, genome characterization, comparative genomics, and inference of phylogeny or phylogenomics. In each step, ML could be used to increase the speed and potentially the accuracy (provided increasing amounts of high-quality input data) of identification of the source of ongoing outbreaks, leading to more efficient treatment and prevention of additional cases. In this review, we explore whether ML or any other form of AI algorithms have already been proposed for the respective tasks and compare those with mechanistic model-based approaches.
Affiliation(s)
- Baiba Vilne
- Institute of Food Safety, Animal Health and Environment—“BIOR”, Riga, Latvia
- SIA net-OMICS, Riga, Latvia
- Irēna Meistere
- Institute of Food Safety, Animal Health and Environment—“BIOR”, Riga, Latvia
- Juris Ķibilds
- Institute of Food Safety, Animal Health and Environment—“BIOR”, Riga, Latvia
|
13
|
Uncovering carbohydrate metabolism through a genotype-phenotype association study of 56 lactic acid bacteria genomes. Appl Microbiol Biotechnol 2019; 103:3135-3152. [PMID: 30830251 PMCID: PMC6447522 DOI: 10.1007/s00253-019-09701-6] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 10/30/2018] [Revised: 02/14/2019] [Accepted: 02/14/2019] [Indexed: 11/09/2022]
Abstract
Owing to their unique potential to ferment carbohydrates, both homo- and heterofermentative lactic acid bacteria (LAB) are widely used in the food industry. Deciphering the genetic basis that determines the LAB fermentation type, and hence carbohydrate utilization, is paramount to optimize LAB industrial processes. Deep sequencing of 24 LAB species and comparison with 32 publicly available genome sequences provided a comparative data set including five major LAB genera for further analysis. Phylogenomic reconstruction confirmed Leuconostoc and Pediococcus species as independently emerging from the Lactobacillus genus, within one of the three phylogenetic clades identified. These clades partially grouped LABs according to their fermentation types, suggesting that some metabolic capabilities were independently acquired during LAB evolution. In order to apply a genome-wide association study (GWAS) at the multigene family level, utilization of 49 carbohydrates was also profiled for these 56 LAB species. GWAS results indicated that obligately heterofermentative species lack 1-phosphofructokinase, required for d-mannose degradation in the homofermentative pathway. Heterofermentative species were found to often contain the araBAD operon, involved in l-arabinose degradation, which is important for heterofermentation. Taken together, our results provide helpful insights into the genetic determinants of LAB carbohydrate metabolism and open the way for further experimental research aimed at validating the role of these candidate genes for industrial applications.
|
14
|
Wels M, Siezen R, van Hijum S, Kelly WJ, Bachmann H. Comparative Genome Analysis of Lactococcus lactis Indicates Niche Adaptation and Resolves Genotype/Phenotype Disparity. Front Microbiol 2019; 10:4. [PMID: 30766512 PMCID: PMC6365430 DOI: 10.3389/fmicb.2019.00004] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 10/26/2018] [Accepted: 01/07/2019] [Indexed: 01/21/2023] Open
Abstract
Lactococcus lactis is one of the most important micro-organisms in the dairy industry for the fermentation of cheese and buttermilk. Besides the conversion of lactose to lactate it is responsible for product properties such as flavor and texture, which are determined by volatile metabolites, proteolytic activity and exopolysaccharide production. While the species Lactococcus lactis consists of the two subspecies lactis and cremoris, their taxonomic position is confused by a group of strains that, despite having a cremoris genotype, display a lactis phenotype. Here we compared and analyzed the (draft) genomes of 43 L. lactis strains, of which 19 are of dairy and 24 are of non-dairy origin. Machine-learning algorithms facilitated the identification of orthologous groups of protein sequences (OGs) that are predictors for either the taxonomic position or the source of isolation. This allowed the unambiguous categorization of the genotype/phenotype disparity of ssp. lactis and ssp. cremoris strains. A detailed analysis of phenotypic properties including plasmid-encoded genes indicates evolutionary changes during niche adaptations. The results are consistent with the hypothesis that dairy isolates evolved from plant isolates. The analysis further suggests that genomes of cremoris phenotype strains are so eroded that they are restricted to a dairy environment. Overall the genome comparison of a diverse set of strains allowed the identification of niche and subspecies specific genes. This explains evolutionary relationships and will aid the identification and selection of industrial starter cultures.
Affiliation(s)
- Michiel Wels
- NIZO Food Research B.V., Ede, Netherlands
- TI Food and Nutrition, Wageningen, Netherlands
- Roland Siezen
- TI Food and Nutrition, Wageningen, Netherlands
- Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Microbial Bioinformatics, Ede, Netherlands
- Sacha van Hijum
- NIZO Food Research B.V., Ede, Netherlands
- TI Food and Nutrition, Wageningen, Netherlands
- Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Nijmegen, Netherlands
- Herwig Bachmann
- NIZO Food Research B.V., Ede, Netherlands
- TI Food and Nutrition, Wageningen, Netherlands
- Systems Bioinformatics, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
|
15
|
Asgari E, Garakani K, McHardy AC, Mofrad MRK. MicroPheno: predicting environments and host phenotypes from 16S rRNA gene sequencing using a k-mer based representation of shallow sub-samples. Bioinformatics 2018; 34:i32-i42. [PMID: 29950008 PMCID: PMC6022683 DOI: 10.1093/bioinformatics/bty296] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Indexed: 02/07/2023] Open
Abstract
Motivation Microbial communities play important roles in the function and maintenance of various biosystems, ranging from the human body to the environment. A major challenge in microbiome research is the classification of microbial communities of different environments or host phenotypes. The most common and cost-effective approach for such studies to date is 16S rRNA gene sequencing. Recent falls in sequencing costs have increased the demand for simple, efficient and accurate methods for rapid detection or diagnosis with proven applications in medicine, agriculture and forensic science. We describe a reference- and alignment-free approach for predicting environments and host phenotypes from 16S rRNA gene sequencing based on k-mer representations that benefits from a bootstrapping framework for investigating the sufficiency of shallow sub-samples. Deep learning methods as well as classical approaches were explored for predicting environments and host phenotypes. Results A k-mer distribution of shallow sub-samples outperformed Operational Taxonomic Unit (OTU) features in the tasks of body-site identification and Crohn's disease prediction. Aside from being more accurate, using k-mer features in shallow sub-samples (i) allows skipping computationally costly sequence alignments required in OTU-picking and (ii) provides a proof of concept for the sufficiency of shallow and short-length 16S rRNA sequencing for phenotype prediction. In addition, k-mer features predicted representative 16S rRNA gene sequences of 18 ecological environments, and 5 organismal environments with high macro-F1 scores of 0.88 and 0.87. For large datasets, deep learning outperformed classical methods such as Random Forest and Support Vector Machine. Availability and implementation The software and datasets are available at https://llp.berkeley.edu/micropheno. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ehsaneddin Asgari
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA, USA
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Kiavash Garakani
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA, USA
- Alice C McHardy
- Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
- Mohammad R K Mofrad
- Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA, USA
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Lab, Berkeley, CA, USA
|
16
|
Vale FF, Lehours P. Relating Phage Genomes to Helicobacter pylori Population Structure: General Steps Using Whole-Genome Sequencing Data. Int J Mol Sci 2018; 19:ijms19071831. [PMID: 29933614 PMCID: PMC6073503 DOI: 10.3390/ijms19071831] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 04/16/2018] [Revised: 05/30/2018] [Accepted: 06/15/2018] [Indexed: 12/19/2022] Open
Abstract
This review uses Helicobacter pylori, the gastric bacterium that colonizes the human stomach, to address how to obtain information about prophage biology from bacterial genomes. At a time when the number of available genomes is growing continuously, this review provides tools to explore genomes for the presence of prophages, other mobile genetic elements, and virulence factors. The review starts by covering the genetic diversity of H. pylori and then moves to the biological basis and the bioinformatics approaches used for studying H. pylori phage biology from genomes, and how this relates to the bacterial population structure. Aspects concerning H. pylori prophage biology, evolution and phylogeography are discussed.
Affiliation(s)
- Filipa F Vale
- Host-Pathogen Interactions Unit, Research Institute for Medicines (iMed-ULisboa), Faculdade de Farmácia, Universidade de Lisboa, 1649-003 Lisboa, Portugal.
- Philippe Lehours
- Laboratoire de Bacteriologie, Centre National de Référence des Campylobacters et Hélicobacters, Place Amélie Raba Léon, 33076 Bordeaux, France.
- INSERM U1053-UMR Bordeaux Research in Translational Oncology, BaRITOn, 33000 Bordeaux, France.
|
17
|
Wheeler NE, Gardner PP, Barquist L. Machine learning identifies signatures of host adaptation in the bacterial pathogen Salmonella enterica. PLoS Genet 2018; 14:e1007333. [PMID: 29738521 PMCID: PMC5940178 DOI: 10.1371/journal.pgen.1007333] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 12/01/2017] [Accepted: 03/24/2018] [Indexed: 11/18/2022] Open
Abstract
Emerging pathogens are a major threat to public health; however, understanding how pathogens adapt to new niches remains a challenge. New methods are urgently required to provide functional insights into pathogens from the massive genomic data sets now being generated from routine pathogen surveillance for epidemiological purposes. Here, we measure the burden of atypical mutations in protein coding genes across independently evolved Salmonella enterica lineages, and use these as input to train a random forest classifier to identify strains associated with extraintestinal disease. Members of the species fall along a continuum, from pathovars which cause gastrointestinal infection and low mortality, associated with a broad host-range, to those that cause invasive infection and high mortality, associated with a narrowed host range. Our random forest classifier learned to perfectly discriminate long-established gastrointestinal and invasive serovars of Salmonella. Additionally, it was able to discriminate recently emerged Salmonella Enteritidis and Typhimurium lineages associated with invasive disease in immunocompromised populations in sub-Saharan Africa, and within-host adaptation to invasive infection. We dissect the architecture of the model to identify the genes that were most informative of phenotype, revealing a common theme of degradation of metabolic pathways in extraintestinal lineages. This approach accurately identifies patterns of gene degradation and diversifying selection specific to invasive serovars that have been captured by more labour-intensive investigations, but can be readily scaled to larger analyses.
Affiliation(s)
- Nicole E. Wheeler
- Wellcome Sanger Institute, Hinxton, United Kingdom
- Biomolecular Interaction Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Paul P. Gardner
- Biomolecular Interaction Centre, School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Lars Barquist
- Institute for Molecular Infection Biology, University of Wuerzburg, Wuerzburg, Germany
- Helmholtz Institute for RNA-based Infection Research, Wuerzburg, Germany
|
18
|
Abstract
Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.
|
19
|
Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution. J Virol 2017; 92:JVI.00908-17. [PMID: 29046445 PMCID: PMC5730764 DOI: 10.1128/jvi.00908-17] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Indexed: 12/16/2022] Open
Abstract
Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well.
|
20
|
Baltrus DA, McCann HC, Guttman DS. Evolution, genomics and epidemiology of Pseudomonas syringae: Challenges in Bacterial Molecular Plant Pathology. Mol Plant Pathol 2017; 18:152-168. [PMID: 27798954 PMCID: PMC6638251 DOI: 10.1111/mpp.12506] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 03/24/2016] [Revised: 10/25/2016] [Accepted: 10/26/2016] [Indexed: 05/12/2023]
Abstract
A remarkable shift in our understanding of plant-pathogenic bacteria is underway. Until recently, nearly all research on phytopathogenic bacteria was focused on a small number of model strains, which provided a deep, but narrow, perspective on plant-microbe interactions. Advances in genome sequencing technologies have changed this by enabling the incorporation of much greater diversity into comparative and functional research. We are now moving beyond a typological understanding of a select collection of strains to a more generalized appreciation of the breadth and scope of plant-microbe interactions. The study of natural populations and evolution has particularly benefited from the expansion of genomic data. We are beginning to have a much deeper understanding of the natural genetic diversity, niche breadth, ecological constraints and defining characteristics of phytopathogenic species. Given this expanding genomic and ecological knowledge, we believe the time is ripe to evaluate what we know about the evolutionary dynamics of plant pathogens.
Affiliation(s)
- Honour C. McCann
- New Zealand Institute for Advanced Study, Massey University, Auckland 0632, New Zealand
- David S. Guttman
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 3B2, Canada
- Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON M5S 3B2, Canada
|
21
|
Brbić M, Piškorec M, Vidulin V, Kriško A, Šmuc T, Supek F. The landscape of microbial phenotypic traits and associated genes. Nucleic Acids Res 2016; 44:10074-10090. [PMID: 27915291 PMCID: PMC5137458 DOI: 10.1093/nar/gkw964] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 06/12/2016] [Revised: 09/21/2016] [Accepted: 10/11/2016] [Indexed: 12/31/2022] Open
Abstract
Bacteria and Archaea display a variety of phenotypic traits and can adapt to diverse ecological niches. However, systematic annotation of prokaryotic phenotypes is lacking. We have therefore developed ProTraits, a resource containing ∼545 000 novel phenotype inferences, spanning 424 traits assigned to 3046 bacterial and archaeal species. These annotations were assigned by a computational pipeline that associates microbes with phenotypes by text-mining the scientific literature and the broader World Wide Web, while also being able to define novel concepts from unstructured text. Moreover, the ProTraits pipeline assigns phenotypes by drawing extensively on comparative genomics, capturing patterns in gene repertoires, codon usage biases, proteome composition and co-occurrence in metagenomes. Notably, we find that gene synteny is highly predictive of many phenotypes, and highlight examples of gene neighborhoods associated with spore-forming ability. A global analysis of trait interrelatedness outlined clusters in the microbial phenotype network, suggesting common genetic underpinnings. Our extended set of phenotype annotations allows detection of 57 088 high confidence gene-trait links, which recover many known associations involving sporulation, flagella, catalase activity, aerobicity, photosynthesis and other traits. Over 99% of the commonly occurring gene families are involved in genetic interactions conditional on at least one phenotype, suggesting that epistasis has a major role in shaping microbial gene content.
Affiliation(s)
- Maria Brbić
- Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Matija Piškorec
- Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Vedrana Vidulin
- Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Anita Kriško
- Mediterranean Institute of Life Sciences, 21000 Split, Croatia
- Tomislav Šmuc
- Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- Fran Supek
- Division of Electronics, Ruder Boskovic Institute, 10000 Zagreb, Croatia
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08002 Barcelona, Spain
|
22
|
Abstract
The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts to develop a workflow that will meet the objectives and goals of HTS projects.
|
23
|
Koehorst JJ, Saccenti E, Schaap PJ, Martins Dos Santos VAP, Suarez-Diez M. Protein domain architectures provide a fast, efficient and scalable alternative to sequence-based methods for comparative functional genomics. F1000Res 2016; 5:1987. [PMID: 27703668 PMCID: PMC5031134 DOI: 10.12688/f1000research.9416.3] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Accepted: 06/26/2017] [Indexed: 11/20/2022] Open
Abstract
A functional comparative genome analysis is essential to understand the mechanisms underlying bacterial evolution and adaptation. Detection of functional orthologs using standard global sequence similarity methods faces several problems: the need to define arbitrary acceptance thresholds for similarity and alignment length, lateral gene acquisition, and the high computational cost of finding bi-directional best matches at a large scale. We investigated the use of protein domain architectures for large scale functional comparative analysis as an alternative method. The performance of both approaches was assessed through functional comparison of 446 bacterial genomes sampled at different taxonomic levels. We show that protein domain architectures provide a fast and efficient alternative to methods based on sequence similarity to identify groups of functionally equivalent proteins within and across taxonomic boundaries, and it is suitable for large scale comparative analysis. Running both methods in parallel pinpoints potential functional adaptations that may add to bacterial fitness.
Affiliation(s)
- Jasper J Koehorst
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Edoardo Saccenti
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Peter J Schaap
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Vitor A P Martins Dos Santos
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- LifeGlimmer GmbH, Berlin, Germany
- Maria Suarez-Diez
- Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
|
24
|
Martino ME, Bayjanov JR, Caffrey BE, Wels M, Joncour P, Hughes S, Gillet B, Kleerebezem M, van Hijum SA, Leulier F. Nomadic lifestyle of Lactobacillus plantarum revealed by comparative genomics of 54 strains isolated from different habitats. Environ Microbiol 2016; 18:4974-4989. [DOI: 10.1111/1462-2920.13455] [Citation(s) in RCA: 129] [Impact Index Per Article: 16.1] [Received: 05/30/2016] [Accepted: 07/13/2016] [Indexed: 01/24/2023]
Affiliation(s)
- Maria Elena Martino
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Ecole Normale Supérieure de Lyon, CNRS UMR 5242; Université Claude Bernard Lyon 1, Lyon France
- Jumamurat R. Bayjanov
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences; Radboud UMC, P.O. Box 9101 6500 HB Nijmegen The Netherlands
- Brian E. Caffrey
- Max Planck Institute for Molecular Genetics; Ihnestrasse 63-73 Berlin 14195 Germany
- Michiel Wels
- NIZO food research; P.O. Box 20, 6710 BA Ede The Netherlands
- Pauline Joncour
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Ecole Normale Supérieure de Lyon, CNRS UMR 5242; Université Claude Bernard Lyon 1, Lyon France
- Sandrine Hughes
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Ecole Normale Supérieure de Lyon, CNRS UMR 5242; Université Claude Bernard Lyon 1, Lyon France
- Benjamin Gillet
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Ecole Normale Supérieure de Lyon, CNRS UMR 5242; Université Claude Bernard Lyon 1, Lyon France
- Michiel Kleerebezem
- Host Microbe Interactomics Group, Wageningen University; De Elst 1 6708WD Wageningen The Netherlands
- Sacha A.F.T. van Hijum
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences; Radboud UMC, P.O. Box 9101 6500 HB Nijmegen The Netherlands
- NIZO food research; P.O. Box 20, 6710 BA Ede The Netherlands
- François Leulier
- Institut de Génomique Fonctionnelle de Lyon (IGFL), Ecole Normale Supérieure de Lyon, CNRS UMR 5242; Université Claude Bernard Lyon 1, Lyon France
|
25
|
Lobb B, Doxey AC. Novel function discovery through sequence and structural data mining. Curr Opin Struct Biol 2016; 38:53-61. [DOI: 10.1016/j.sbi.2016.05.017] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 03/19/2016] [Revised: 05/17/2016] [Accepted: 05/24/2016] [Indexed: 01/30/2023]
|
26
|
Blank CE, Cui H, Moore LR, Walls RL. MicrO: an ontology of phenotypic and metabolic characters, assays, and culture media found in prokaryotic taxonomic descriptions. J Biomed Semantics 2016; 7:18. [PMID: 27076900 PMCID: PMC4830071 DOI: 10.1186/s13326-016-0060-6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/03/2015] [Accepted: 04/02/2016] [Indexed: 12/03/2022] Open
Abstract
Background: MicrO is an ontology of microbiological terms, including prokaryotic qualities and processes, material entities (such as cell components), chemical entities (such as microbiological culture media and medium ingredients), and assays. The ontology was built to support the ongoing development of a natural language processing algorithm, MicroPIE (Microbial Phenomics Information Extractor). During the MicroPIE design process, we realized there was a need for a prokaryotic ontology that would capture the evolutionary diversity of phenotypes and metabolic processes across the tree of life, capture the diversity of synonyms and information contained in the taxonomic literature, and relate microbiological entities and processes to terms in a large number of other ontologies, most particularly the Gene Ontology (GO), the Phenotypic Quality Ontology (PATO), and the Chemical Entities of Biological Interest (ChEBI). We thus constructed MicrO to be rich in logical axioms and synonyms gathered from the taxonomic literature.
Results: MicrO currently has ~14,550 classes (~2,550 of which are new, the remainder being microbiologically relevant classes imported from other ontologies), connected by ~24,130 logical axioms (5,446 of which are new), and is available at http://purl.obolibrary.org/obo/MicrO.owl and on the project website at https://github.com/carrineblank/MicrO. MicrO has been integrated into the OBO Foundry Library (http://www.obofoundry.org/ontology/micro.html), so that other ontologies can borrow and re-use classes. Term requests and user feedback can be made using MicrO's Issue Tracker in GitHub. We designed MicrO such that it can support the ongoing and future development of algorithms that can leverage the controlled vocabulary and logical inference power provided by the ontology.
Conclusions: By connecting microbial classes with large numbers of chemical entities, material entities, biological processes, molecular functions, and qualities using a dense array of logical axioms, we intend MicrO to be a powerful new tool to increase the computing power of bioinformatics tools such as the automated text mining of prokaryotic taxonomic descriptions using natural language processing. We also intend MicrO to support the development of new bioinformatics tools that aim to develop new connections between microbial phenotypes and genotypes (i.e., the gene content in genomes). Future ontology development will include incorporation of pathogenic phenotypes and prokaryotic habitats.
Affiliation(s)
- Carrine E Blank: Department of Geosciences, University of Montana, Missoula, MT 59812 USA
- Hong Cui: School of Information, University of Arizona, Tucson, AZ 85719 USA
- Lisa R Moore: Department of Biological Sciences, University of Southern Maine, Portland, ME 04104 USA
27
From cultured to uncultured genome sequences: metagenomics and modeling microbial ecosystems. Cell Mol Life Sci 2015; 72:4287-308. [PMID: 26254872 PMCID: PMC4611022 DOI: 10.1007/s00018-015-2004-1] [Citation(s) in RCA: 79] [Impact Index Per Article: 8.8] [Received: 05/17/2015] [Revised: 07/23/2015] [Accepted: 07/28/2015] [Indexed: 12/30/2022]
Abstract
Microorganisms and the viruses that infect them are the most numerous biological entities on Earth and enclose its greatest biodiversity and genetic reservoir. With strength in their numbers, these microscopic organisms are major players in the cycles of energy and matter that sustain all life. Scientists have only scratched the surface of this vast microbial world through culture-dependent methods. Recent developments in generating metagenomes, large random samples of nucleic acid sequences isolated directly from the environment, are providing comprehensive portraits of the composition, structure, and functioning of microbial communities. Moreover, advances in metagenomic analysis have created the possibility of obtaining complete or nearly complete genome sequences from uncultured microorganisms, providing important means to study their biology, ecology, and evolution. Here we review some of the recent developments in the field of metagenomics, focusing on the discovery of genetic novelty and on methods for obtaining uncultured genome sequences, including through the recycling of previously published datasets. We also discuss how metagenomics has become a core scientific tool to characterize eco-evolutionary patterns of microbial ecosystems, thus allowing us to simultaneously discover new microbes and study their natural communities. We conclude by discussing general guidelines and challenges for modeling the interactions between uncultured microorganisms and viruses based on the information contained in their genome sequences. These models will significantly advance our understanding of the functioning of microbial ecosystems and the roles of microbes in the environment.
28
Alkema W, Boekhorst J, Wels M, van Hijum SAFT. Microbial bioinformatics for food safety and production. Brief Bioinform 2015; 17:283-92. [PMID: 26082168 PMCID: PMC4793891 DOI: 10.1093/bib/bbv034] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.4] [Received: 03/16/2015] [Indexed: 12/14/2022] Open
Abstract
Microbes play an important role in the production of fermented foods. Optimization of fermentation processes or starter culture production was traditionally a trial-and-error approach inspired by expert knowledge of the fermentation process. Current developments in high-throughput 'omics' technologies allow more rational approaches to improving fermentation processes from both the food functionality and the food safety perspectives. Here, the authors thematically review typical bioinformatics techniques and approaches to improve various aspects of the microbial production of fermented food products and food safety.
29
Scaria J, Suzuki H, Ptak CP, Chen JW, Zhu Y, Guo XK, Chang YF. Comparative genomic and phenomic analysis of Clostridium difficile and Clostridium sordellii, two related pathogens with differing host tissue preference. BMC Genomics 2015; 16:448. [PMID: 26059449 PMCID: PMC4462011 DOI: 10.1186/s12864-015-1663-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/17/2014] [Accepted: 05/29/2015] [Indexed: 01/05/2023] Open
Abstract
Background: Clostridium difficile and C. sordellii are two anaerobic, spore-forming, Gram-positive pathogens with a broad host range and the ability to cause lethal infections. Despite strong similarities between the two Clostridial strains, differences in their host tissue preference place C. difficile infections in the gastrointestinal tract and C. sordellii infections in soft tissues.
Results: In this study, to improve our understanding of C. sordellii and C. difficile virulence and pathogenesis, we have performed a comparative genomic and phenomic analysis of the two. The global phenomes of C. difficile and C. sordellii were compared using Biolog Phenotype microarrays. When compared to C. difficile, C. sordellii was found to better utilize more complex sources of carbon and nitrogen, including peptides. Phenotype microarray comparison also revealed that C. sordellii was better able to grow in acidic pH conditions. Using next-generation sequencing technology, we determined the draft genome of C. sordellii strain 8483 and performed comparative genome analysis with C. difficile and other Clostridial genomes. Comparative genome analysis revealed the presence of several enzymes, including the urease gene cluster, specific to the C. sordellii genome that confer expanded peptide utilization and survival at acidic pH.
Conclusions: The identified phenotypes of C. sordellii might be important in causing wound and vaginal infections. Proteins involved in the metabolic differences between C. sordellii and C. difficile should be targets for further studies aimed at understanding C. difficile and C. sordellii infection site specificity and pathogenesis.
Affiliation(s)
- Joy Scaria: Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA; Department of Veterinary and Biomedical Sciences, South Dakota State University, Brookings, SD, 57007, USA.
- Haruo Suzuki: Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA; Graduate School of Science and Engineering, Yamaguchi University, Yamaguchi, Japan.
- Christopher P Ptak: Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA.
- Jenn-Wei Chen: Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA.
- Yongzhang Zhu: Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA; Department of Medical Microbiology and Parasitology, Institutes of Medical Sciences, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Xiao-Kui Guo: Department of Medical Microbiology and Parasitology, Institutes of Medical Sciences, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China.
- Yung-Fu Chang: Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA.
30
Villaveces JM, Koti P, Habermann BH. Tools for visualization and analysis of molecular networks, pathways, and -omics data. Adv Appl Bioinform Chem 2015; 8:11-22. [PMID: 26082651 PMCID: PMC4461095 DOI: 10.2147/aabc.s63534] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 01/06/2023] Open
Abstract
Biological pathways have become the standard way to represent the coordinated reactions and actions of a series of molecules in a cell. A series of interconnected pathways is referred to as a biological network, which gives a more holistic view of the entanglement of cellular reactions. Biological pathways and networks are not only an appropriate way to visualize molecular reactions; they have also become a leading method in -omics data analysis and visualization. Here, we review a set of pathway and network visualization and analysis methods and take a look at potential future developments in the field.
Affiliation(s)
- Jose M Villaveces: Max Planck Institute of Biochemistry, Research Group Computational Biology, Martinsried, Germany
- Prasanna Koti: Max Planck Institute of Biochemistry, Research Group Computational Biology, Martinsried, Germany
- Bianca H Habermann: Max Planck Institute of Biochemistry, Research Group Computational Biology, Martinsried, Germany
31
Abstract
Fungi contribute extensively to a wide range of ecosystem processes, including decomposition of organic carbon, deposition of recalcitrant carbon, and transformations of nitrogen and phosphorus. In this review, we discuss the current knowledge about physiological and morphological traits of fungi that directly influence these processes, and we describe the functional genes that encode these traits. In addition, we synthesize information from 157 whole fungal genomes in order to determine relationships among selected functional genes within fungal taxa. Ecosystem-related traits varied most at relatively coarse taxonomic levels. For example, we found that the maximum amount of variance for traits associated with carbon mineralization, nitrogen and phosphorus cycling, and stress tolerance could be explained at the levels of order to phylum. Moreover, suites of traits tended to co-occur within taxa. Specifically, the genetic capacities for traits that improve stress tolerance (β-glucan synthesis, trehalose production, and cold-induced RNA helicases) were positively related to one another, and they were more evident in yeasts. Traits that regulate the decomposition of complex organic matter (lignin peroxidases, cellobiohydrolases, and crystalline cellulases) were also positively related, but they were more strongly associated with free-living filamentous fungi. Altogether, these relationships provide evidence for two functional groups: stress tolerators, which may contribute to soil carbon accumulation via the production of recalcitrant compounds; and decomposers, which may reduce soil carbon stocks. It is possible that ecosystem functions, such as soil carbon storage, may be mediated by shifts in the fungal community between stress tolerators and decomposers in response to environmental changes, such as drought and warming.
Affiliation(s)
- Kathleen K Treseder: Department of Ecology and Evolutionary Biology, University of California, Irvine, California, USA
- Jay T Lennon: Department of Biology, Indiana University, Bloomington, Indiana, USA
32
Amaral GRS, Campeão ME, Swings J, Thompson FL, Thompson CC. Finding diagnostic phenotypic features of Photobacterium in the genome sequences. Antonie van Leeuwenhoek 2015; 107:1351-8. [PMID: 25724129 DOI: 10.1007/s10482-015-0414-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/05/2014] [Accepted: 02/23/2015] [Indexed: 01/14/2023]
Abstract
Photobacterium species are ubiquitous in the aquatic environment and can be found in association with animal hosts, in both pathogenic and mutualistic relationships. The traditional phenotypic characterization of Photobacterium is expensive, time-consuming and restricted to a limited number of features. An alternative is to infer phenotypic information directly from whole genome sequences. The present study evaluates the usefulness of whole genome sequences as a source of phenotypic information and compares diagnostic phenotypes of the Photobacterium species from the literature with the predicted phenotypes obtained from whole genome sequences. All genes coding for the specific proteins involved in metabolic pathways responsible for positive phenotypes of the seventeen diagnostic features were found in the majority of the Photobacterium genomes. In the Photobacterium species that were negative for a given phenotype, one or more genes involved in the respective biochemical pathways were absent.
Affiliation(s)
- Gilda Rose S Amaral: Laboratory for Microbiology, Institute of Biology, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil
33
Microbial taxonomy in the post-genomic era: rebuilding from scratch? Arch Microbiol 2014; 197:359-70. [PMID: 25533848 DOI: 10.1007/s00203-014-1071-2] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 10/25/2014] [Revised: 12/04/2014] [Accepted: 12/05/2014] [Indexed: 12/20/2022]
Abstract
Microbial taxonomy should provide adequate descriptions of bacterial, archaeal, and eukaryotic microbial diversity in ecological, clinical, and industrial environments. Its cornerstone, the prokaryote species, has been re-evaluated twice. It is time to revisit polyphasic taxonomy, its principles, and its practice, including its underlying pragmatic species concept. Ultimately, we will be able to realize an old dream of our predecessor taxonomists and build a genomic-based microbial taxonomy, using standardized and automated curation of high-quality complete genome sequences as the new gold standard.
34
Dunne WM, van Belkum A. More Timely Antimicrobial Susceptibility Testing as a Tool in Combatting Antimicrobial Resistance in Clinically Relevant Microorganisms: Is There More than One Way to Skin a Cat? Clin Microbiol Newsl 2014. [DOI: 10.1016/j.clinmicnews.2014.09.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/01/2022]
35
Comparative genomics of 274 Vibrio cholerae genomes reveals mobile functions structuring three niche dimensions. BMC Genomics 2014; 15:654. [PMID: 25096633 PMCID: PMC4141962 DOI: 10.1186/1471-2164-15-654] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 02/20/2014] [Accepted: 07/30/2014] [Indexed: 01/23/2023] Open
Abstract
Background: Vibrio cholerae is a globally dispersed pathogen that has evolved with humans for centuries, but also includes non-pathogenic environmental strains. Here, we identify the genomic variability underlying this remarkable persistence across the three major niche dimensions: space, time, and habitat.
Results: Taking an innovative approach of genome-wide association applicable to microbial genomes (GWAS-M), we classify 274 complete V. cholerae genomes by niche, including 39 newly sequenced for this study with the Ion Torrent DNA-sequencing platform. Niche metadata were collected for each strain and analyzed together with comprehensive annotations of genetic and genomic attributes, including point mutations (single-nucleotide polymorphisms, SNPs), protein families, functions and prophages.
Conclusions: Our analysis revealed that genomic variations, in particular mobile functions including phages, prophages, transposable elements, and plasmids, underlie the metadata structuring in each of the three niche dimensions. This underscores the role of phages and mobile elements as the most rapidly evolving elements in bacterial genomes, creating local endemicity (space), leading to temporal divergence (time), and allowing the invasion of new habitats. Together, we take a data-driven approach for comparative functional genomics that exploits high-volume genome sequencing and annotation, in conjunction with novel statistical and machine learning analyses, to identify connections between genotype and phenotype on a genome-wide scale.
36
Lazarevic V, Francois P. Functional genomics of microbial pathogens. Brief Funct Genomics 2013. [DOI: 10.1093/bfgp/elt038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open