1. Tsoumtsa Meda L, Lagarde J, Guillier L, Roussel S, Douarre PE. Using GWAS and Machine Learning to Identify and Predict Genetic Variants Associated with Foodborne Bacteria Phenotypic Traits. Methods Mol Biol 2025; 2852:223-253. [PMID: 39235748 DOI: 10.1007/978-1-0716-4100-2_16] [Indexed: 09/06/2024]
Abstract
One of the main challenges in food microbiology is to prevent the risk of outbreaks by avoiding the distribution of food contaminated by bacteria. This requires constant monitoring of the circulating strains throughout the food production chain. Bacterial genomes contain signatures of natural evolution and adaptive markers that can be exploited to better understand the behavior of pathogens in the food industry. The monitoring of foodborne strains can therefore be facilitated by the use of these genomic markers, which can rapidly provide essential information on isolated strains, such as the source of contamination, risk of illness, potential for biofilm formation, and tolerance or resistance to biocides. The increasing availability of large genome datasets is enhancing the understanding of the genetic basis of complex traits such as host adaptation, virulence, and persistence. Genome-wide association studies have shown very promising results in the discovery of genomic markers that can be integrated into rapid detection tools. In addition, machine learning has successfully predicted phenotypes and classified important traits. Genome-wide association and machine learning tools therefore have the potential to support decision-making processes aimed at reducing the burden of foodborne diseases. The aim of this review chapter is to provide knowledge on the use of these two methods in food microbiology and to recommend their use in the field.
Affiliation(s)
- Landry Tsoumtsa Meda
  - ACTALIA, La Roche-sur-Foron, France
  - ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
- Jean Lagarde
  - ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
  - INRAE, Unit of Process Optimisation in Food, Agriculture and the Environment (UR OPAALE), Rennes, France
- Sophie Roussel
  - ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
- Pierre-Emmanuel Douarre
  - ANSES, Salmonella and Listeria Unit (USEL), University of Paris-Est, Maisons-Alfort Laboratory for Food Safety, Maisons-Alfort, France
2. Do DT, Yang MR, Vo TNS, Le NQK, Wu YW. Unitig-centered pan-genome machine learning approach for predicting antibiotic resistance and discovering novel resistance genes in bacterial strains. Comput Struct Biotechnol J 2024; 23:1864-1876. [PMID: 38707536 PMCID: PMC11067008 DOI: 10.1016/j.csbj.2024.04.035] [Received: 10/11/2023] [Revised: 04/13/2024] [Accepted: 04/13/2024] [Indexed: 05/07/2024] Open
Abstract
In current genomic research, the widely used methods for predicting antimicrobial resistance (AMR) often rely on prior knowledge of known AMR genes or reference genomes. However, these methods have limitations, potentially resulting in imprecise predictions owing to incomplete coverage of AMR mechanisms and genetic variations. To overcome these limitations, we propose a pan-genome-based machine learning approach to advance our understanding of AMR gene repertoires and uncover possible feature sets for precise AMR classification. By building compacted de Bruijn graphs (cDBGs) from thousands of genomes and collecting the presence/absence patterns of unique sequences (unitigs) for Pseudomonas aeruginosa, we determined that using machine learning models on unitig-centered pan-genomes showed significant promise for accurately predicting the antibiotic resistance or susceptibility of microbial strains. Applying a feature-selection-based machine learning algorithm led to satisfactory predictive performance for the training dataset (with an area under the receiver operating characteristic curve (AUC) of > 0.929) and an independent validation dataset (AUC, approximately 0.77). Furthermore, the selected unitigs revealed previously unidentified resistance genes, allowing for the expansion of the resistance gene repertoire to those that have not previously been described in the literature on antibiotic resistance. These results demonstrate that our proposed unitig-based pan-genome feature set was effective in constructing machine learning predictors that could accurately identify AMR pathogens. Gene sets extracted using this approach may offer valuable insights into expanding known AMR genes and forming new hypotheses to uncover the underlying mechanisms of bacterial AMR.
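The AUC figures quoted in this abstract measure how often a classifier scores a resistant strain above a susceptible one. As a minimal sketch, and not the authors' pipeline, the function below computes AUC directly from that pairwise definition. The function name `roc_auc`, the label encoding (1 = resistant, 0 = susceptible), and the toy scores are all hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for resistant, 0 for susceptible (encoding assumed here).
    scores: classifier outputs, higher meaning "more likely resistant".
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one strain of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for five strains (three resistant, two susceptible).
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.4, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

The quadratic loop is fine for small examples; a rank-based formulation would be the usual choice for thousands of genomes.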
Affiliation(s)
- Duyen Thi Do
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Ming-Ren Yang
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
  - Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Tran Nam Son Vo
  - Department of Business Administration, College of Management, Lunghwa University of Science and Technology, Taoyuan City, Taiwan
- Nguyen Quoc Khanh Le
  - Professional Master Program in Artificial Intelligence in Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Yu-Wei Wu
  - Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
  - Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
  - TMU Research Center for Digestive Medicine, Taipei Medical University, Taipei, Taiwan
3. Derelle R, von Wachsmann J, Mäklin T, Hellewell J, Russell T, Lalvani A, Chindelevitch L, Croucher NJ, Harris SR, Lees JA. Seamless, rapid, and accurate analyses of outbreak genomic data using split k-mer analysis. Genome Res 2024; 34:1661-1673. [PMID: 39406504 PMCID: PMC11529842 DOI: 10.1101/gr.279449.124] [Received: 04/08/2024] [Accepted: 09/16/2024] [Indexed: 11/01/2024]
Abstract
Sequence variation observed in populations of pathogens can be used for important public health and evolutionary genomic analyses, especially outbreak analysis and transmission reconstruction. Identifying this variation is typically achieved by aligning sequence reads to a reference genome, but this approach is susceptible to reference biases and requires careful filtering of called genotypes. There is a need for tools that can process this growing volume of bacterial genome data, providing rapid results, but that remain simple so they can be used without highly trained bioinformaticians, expensive data analysis, and long-term storage and processing of large files. Here we describe split k-mer analysis (SKA2), a method that supports both reference-free and reference-based mapping to quickly and accurately genotype populations of bacteria using sequencing reads or genome assemblies. SKA2 is highly accurate for closely related samples, and in outbreak simulations, we show superior variant recall compared with reference-based methods, with no false positives. SKA2 can also accurately map variants to a reference and be used with recombination detection methods to rapidly reconstruct vertical evolutionary history. SKA2 is many times faster than comparable methods and can be used to add new genomes to an existing call set, allowing sequential use without the need to reanalyze entire collections. With an inherent absence of reference bias, high accuracy, and a robust implementation, SKA2 has the potential to become the tool of choice for genotyping bacteria. SKA2 is implemented in Rust and is freely available as open-source software.
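The idea behind reference-free genotyping with split k-mers can be sketched in a few lines. The sketch assumes the usual description of a split k-mer as two flanking sequences around a single middle base: when two samples share the same flanks but differ in the middle base, that position is a candidate SNP. This is an illustration only and not SKA2's implementation; the function names and flank length are hypothetical, and reverse complements are ignored for brevity.

```python
def split_kmers(seq, flank=3):
    """Map (left flank, right flank) -> middle base for one sequence."""
    table = {}
    width = 2 * flank + 1
    for i in range(len(seq) - width + 1):
        left = seq[i:i + flank]
        middle = seq[i + flank]
        right = seq[i + flank + 1:i + width]
        key = (left, right)
        # The same flanks seen with two different middle bases are ambiguous.
        if key in table and table[key] != middle:
            table[key] = "N"
        else:
            table[key] = middle
    return table


def middle_base_differences(seq_a, seq_b, flank=3):
    """List (left, base_a, base_b, right) where shared flanks disagree."""
    a = split_kmers(seq_a, flank)
    b = split_kmers(seq_b, flank)
    differences = []
    for key in sorted(a.keys() & b.keys()):
        base_a, base_b = a[key], b[key]
        if "N" not in (base_a, base_b) and base_a != base_b:
            differences.append((key[0], base_a, base_b, key[1]))
    return differences


# Two toy sequences that differ at a single position (G versus C).
sample_a = "ACGTACGGTTCA"
sample_b = "ACGTACCGTTCA"
snps = middle_base_differences(sample_a, sample_b)
```

Because only flanks that match exactly are compared, the approach suits closely related samples, in line with the abstract's statement about accuracy for closely related genomes.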
Affiliation(s)
- Romain Derelle
  - NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W21PG, United Kingdom
- Johanna von Wachsmann
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Tommi Mäklin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
  - Department of Mathematics and Statistics, University of Helsinki, Helsinki 00014, Finland
- Joel Hellewell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- Timothy Russell
  - Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene & Tropical Medicine, London WC1E 7HT, United Kingdom
- Ajit Lalvani
  - NIHR Health Protection Research Unit in Respiratory Infections, National Heart and Lung Institute, Imperial College London, London W21PG, United Kingdom
- Leonid Chindelevitch
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, United Kingdom
- Nicholas J Croucher
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London W12 0BZ, United Kingdom
- Simon R Harris
  - Bill and Melinda Gates Foundation, Westminster, London SW1E 6AJ, United Kingdom
- John A Lees
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
4. Chen J, Du W, Li Y, Zhou H, Ouyang D, Yao Z, Fu J, Ye X. Genome-based model for differentiating between infection and carriage Staphylococcus aureus. Microbiol Spectr 2024; 12:e0049324. [PMID: 39248515 PMCID: PMC11448440 DOI: 10.1128/spectrum.00493-24] [Received: 02/23/2024] [Accepted: 08/02/2024] [Indexed: 09/10/2024] Open
Abstract
Staphylococcus aureus (S. aureus) is a clinically significant opportunistic pathogen, which can colonize multiple body sites in healthy individuals and cause various life-threatening diseases in both children and adults worldwide. The genetic backgrounds of S. aureus that cause infection versus asymptomatic carriage vary widely, but the potential genetic elements (k-mers) associated with S. aureus infection remain unknown, which leads to difficulties in differentiating infection isolates from harmless colonizers. Here, we investigate disease-associated k-mers by using a comprehensive genome-wide association study (GWAS) to compare the genetic variation of S. aureus isolates from clinical infection sites (272 isolates) with that of isolates from nasal carriage (240 isolates). This study uncovers consensus evidence that certain k-mers are overrepresented in infection isolates compared with carriage isolates, indicating the presence of specific genetic elements associated with S. aureus infection. Moreover, the random forest (RF) model achieved a classification accuracy of 77% for predicting disease status (infection vs carriage), with 68% accuracy for a single highest-ranked k-mer, providing a simple target for identifying high-risk genotypes. Our findings suggest that disease-causing S. aureus is a pathogenic subpopulation harboring unique genomic variation that promotes invasion and infection, providing novel targets for clinical interventions.
IMPORTANCE Defining the disease-causing isolates is the first step toward disease control. However, the disease-associated genetic elements of Staphylococcus aureus remain unknown, which leads to difficulties in differentiating infection isolates from harmless carriage isolates. Our comprehensive genome-wide association study (GWAS) found consensus evidence that certain genetic elements are overrepresented among infection isolates compared with carriage isolates, suggesting that the enrichment of disease-associated elements may promote infection. Notably, a single k-mer predictor achieved a high classification accuracy, which forms the basis for early diagnostics and interventions.
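A single k-mer predictor of the kind mentioned above can be illustrated as a presence/absence rule: an isolate is called "infection" if its genome contains the k-mer and "carriage" otherwise. The sketch below tallies the resulting 2x2 table and the classification accuracy. It is a toy illustration and not the study's model; the function name, the label strings, the k-mer, and the sequences are all hypothetical.

```python
def single_kmer_predictor(genomes, labels, kmer):
    """Score the rule 'k-mer present => infection' against known labels.

    Returns the confusion counts and the overall accuracy.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for genome, label in zip(genomes, labels):
        present = kmer in genome
        infected = label == "infection"
        if present and infected:
            counts["tp"] += 1
        elif present:
            counts["fp"] += 1
        elif infected:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    accuracy = (counts["tp"] + counts["tn"]) / len(labels)
    return counts, accuracy


# Hypothetical toy isolates; real genomes are millions of bases long.
toy_genomes = ["AAGTCCGT", "TTGTCCAA", "ACGTACGT", "GGGGTTTT", "CCGTCCAA"]
toy_labels = ["infection", "infection", "carriage", "carriage", "carriage"]
toy_counts, toy_accuracy = single_kmer_predictor(toy_genomes, toy_labels, "GTCC")
```

In this toy set the k-mer is overrepresented among infection isolates (2 of 2) relative to carriage isolates (1 of 3), which is the pattern an association test looks for before any classifier is trained.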
Affiliation(s)
- Jianyu Chen
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Wenyin Du
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Yuehe Li
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Huiliu Zhou
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Dejia Ouyang
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Zhenjiang Yao
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Jinjian Fu
  - Department of Laboratory Science, Maoming Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
  - Guangzhou Women and Children's Medical Center Liuzhou Hospital, Liuzhou, China
- Xiaohua Ye
  - Laboratory of Molecular Epidemiology, School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
5. Pham NP, Gingras H, Godin C, Feng J, Groppi A, Nikolski M, Leprohon P, Ouellette M. Holistic understanding of trimethoprim resistance in Streptococcus pneumoniae using an integrative approach of genome-wide association study, resistance reconstruction, and machine learning. mBio 2024; 15:e0136024. [PMID: 39120145 PMCID: PMC11389379 DOI: 10.1128/mbio.01360-24] [Received: 05/03/2024] [Accepted: 07/08/2024] [Indexed: 08/10/2024] Open
Abstract
Antimicrobial resistance (AMR) is a public health threat worldwide. Next-generation sequencing (NGS) has opened unprecedented opportunities to accelerate AMR mechanism discovery and diagnostics. Here, we present an integrative approach to investigate trimethoprim (TMP) resistance in the key pathogen Streptococcus pneumoniae. We explored a collection of 662 S. pneumoniae genomes by conducting a genome-wide association study (GWAS), followed by functional validation using resistance reconstruction experiments, combined with machine learning (ML) approaches to predict TMP minimum inhibitory concentration (MIC). Our study showed that multiple additive mutations in the folA and sulA loci are responsible for TMP non-susceptibility in S. pneumoniae and can be used as key features to build ML models for digital MIC prediction, reaching an average accuracy within ±1 twofold dilution factor of 86.3%. Our roadmap of in silico analysis, wet-lab validation, and diagnostic tool building could be adapted to explore AMR in other bacterium-antibiotic combinations.
IMPORTANCE In the age of next-generation sequencing (NGS), while data-driven methods such as genome-wide association study (GWAS) and machine learning (ML) excel at finding patterns, functional validation can be challenging due to the high numbers of candidate variants. We designed an integrative approach combining a GWAS on S. pneumoniae clinical isolates, followed by whole-genome transformation coupled with NGS to functionally characterize a large set of GWAS candidates. Our study validated several phenotypic folA mutations beyond the standard Ile100Leu mutation, and showed that the overexpression of the sulA locus produces trimethoprim (TMP) resistance in Streptococcus pneumoniae. These validated loci, when used to build ML models, were found to be the best inputs for predicting TMP minimum inhibitory concentrations. Integrative approaches can bridge the genotype-phenotype gap by providing biological insights that can be incorporated in ML models for accurate prediction of drug susceptibility.
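The metric "accuracy within ±1 twofold dilution" reported above counts a predicted MIC as correct when it lies no more than one doubling step from the measured value, that is, when the two values differ by at most 1 on a log2 scale. A minimal sketch of that calculation follows; the function name and the MIC values are hypothetical and are not data from the study.

```python
import math


def within_one_dilution_accuracy(true_mics, predicted_mics, tolerance=1.0):
    """Fraction of predictions within `tolerance` twofold dilutions.

    MICs are positive concentrations (for example in mg/L); one twofold
    dilution step corresponds to a difference of 1 on the log2 scale.
    """
    if len(true_mics) != len(predicted_mics) or not true_mics:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for true, predicted in zip(true_mics, predicted_mics):
        steps = abs(math.log2(predicted) - math.log2(true))
        # Small epsilon guards against floating-point rounding at the boundary.
        if steps <= tolerance + 1e-9:
            hits += 1
    return hits / len(true_mics)


# Hypothetical measured and predicted MICs for five isolates.
measured = [0.5, 1, 2, 8, 16]
predicted = [1, 1, 8, 4, 16]
toy_accuracy = within_one_dilution_accuracy(measured, predicted)
```

Only the third isolate (2 measured, 8 predicted) is off by two doubling steps, so four of the five toy predictions count as correct.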
Affiliation(s)
- Nguyen-Phuong Pham
  - Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
- Hélène Gingras
  - Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
- Chantal Godin
  - Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
- Jie Feng
  - State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Alexis Groppi
  - Bordeaux Bioinformatics Center and CNRS, Institut de Biochimie et Génétique Cellulaires (IBGC) UMR 5095, Université de Bordeaux, Bordeaux, France
- Macha Nikolski
  - Bordeaux Bioinformatics Center and CNRS, Institut de Biochimie et Génétique Cellulaires (IBGC) UMR 5095, Université de Bordeaux, Bordeaux, France
- Philippe Leprohon
  - Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
- Marc Ouellette
  - Centre de Recherche en Infectiologie du Centre de Recherche du CHU de Québec and Département de Microbiologie, Infectiologie et Immunologie, Faculté de Médecine, Université Laval, Québec City, Québec, Canada
6. Tam YL, Cameron S, Preston A, Cowley L. GWarrange: a pre- and post- genome-wide association studies pipeline for detecting phenotype-associated genome rearrangement events. Microb Genom 2024; 10:001268. [PMID: 38980151 PMCID: PMC11316554 DOI: 10.1099/mgen.0.001268] [Received: 01/18/2024] [Accepted: 06/17/2024] [Indexed: 07/10/2024] Open
Abstract
The use of k-mers to capture genetic variation in bacterial genome-wide association studies (bGWAS) has demonstrated its effectiveness in overcoming the plasticity of bacterial genomes by providing a comprehensive array of genetic variants in a genome set that is not confined to a single reference genome. However, little attempt has been made to interpret k-mers in the context of genome rearrangements, partly due to challenges in the exhaustive and high-throughput identification of genome structure and individual rearrangement events. Here, we present GWarrange, a pre- and post-bGWAS processing methodology that leverages the unique properties of k-mers to facilitate bGWAS for genome rearrangements. Repeat sequences are common instigators of genome rearrangements through intragenomic homologous recombination, and they are commonly found at rearrangement boundaries. Using whole-genome sequences, repeat sequences are replaced by short placeholder sequences, allowing the regions flanking repeats to be incorporated into relatively short k-mers. Then, locations of flanking regions in significant k-mers are mapped back to complete genome sequences to visualise genome rearrangements. Four case studies based on two bacterial species (Bordetella pertussis and Enterococcus faecium) and a simulated genome set are presented to demonstrate the ability to identify phenotype-associated rearrangements. GWarrange is available at https://github.com/DorothyTamYiLing/GWarrange.
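The repeat-replacement step described in this abstract can be illustrated with a toy example: once a long repeat is swapped for a short placeholder, a k-mer of modest length can contain sequence from both sides of the repeat, so the k-mer records which flanking regions are adjacent. The sketch below is an illustration under stated assumptions and does not reproduce GWarrange's code; the placeholder string `NNN`, the function names, and the sequences are hypothetical.

```python
def mask_repeat(genome, repeat, placeholder="NNN"):
    """Replace every exact copy of `repeat` with a short placeholder."""
    return genome.replace(repeat, placeholder)


def flank_spanning_kmers(masked, placeholder, k):
    """Return k-mers holding the placeholder plus sequence on both sides."""
    spanning = []
    for i in range(len(masked) - k + 1):
        kmer = masked[i:i + k]
        position = kmer.find(placeholder)
        # Keep k-mers with at least one base before and after the placeholder.
        if position > 0 and position + len(placeholder) < k:
            spanning.append(kmer)
    return spanning


# Toy genome: two unique flanks separated by a 12-base repeat.
toy_repeat = "ATATATATATAT"
toy_genome = "AAACCC" + toy_repeat + "GGGTTT"
toy_masked = mask_repeat(toy_genome, toy_repeat)  # "AAACCCNNNGGGTTT"
toy_kmers = flank_spanning_kmers(toy_masked, "NNN", k=9)
```

Without masking, a 9-mer could never reach across the 12-base repeat; after masking, k-mers such as `CCCNNNGGG` link the two flanks. If a rearrangement placed a different region next to the repeat, the set of spanning k-mers would change, which is the kind of signal an association test can pick up.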
Affiliation(s)
- Yi Ling Tam
  - The Milner Centre for Evolution and Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Sarah Cameron
  - The Milner Centre for Evolution and Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Andrew Preston
  - The Milner Centre for Evolution and Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Lauren Cowley
  - The Milner Centre for Evolution and Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
7. Baker M, Zhang X, Maciel-Guerra A, Babaarslan K, Dong Y, Wang W, Hu Y, Renney D, Liu L, Li H, Hossain M, Heeb S, Tong Z, Pearcy N, Zhang M, Geng Y, Zhao L, Hao Z, Senin N, Chen J, Peng Z, Li F, Dottorini T. Convergence of resistance and evolutionary responses in Escherichia coli and Salmonella enterica co-inhabiting chicken farms in China. Nat Commun 2024; 15:206. [PMID: 38182559 PMCID: PMC10770378 DOI: 10.1038/s41467-023-44272-1] [Received: 04/04/2023] [Accepted: 12/06/2023] [Indexed: 01/07/2024] Open
Abstract
Sharing of genetic elements among different pathogens and commensals inhabiting the same hosts and environments has significant implications for antimicrobial resistance (AMR), especially in settings with high antimicrobial exposure. We analysed 661 Escherichia coli and Salmonella enterica isolates collected within and across hosts and environments, in 10 Chinese chicken farms over 2.5 years using data-mining methods. Most isolates within the same hosts possessed the same clinically relevant AMR-carrying mobile genetic elements (plasmids: 70.6%, transposons: 78%), which also showed recent common evolution. Supervised machine learning classifiers revealed known and novel AMR-associated mutations and genes underlying resistance to 28 antimicrobials, primarily associated with resistance in E. coli and susceptibility in S. enterica. Many were essential and affected the same metabolic processes in both species, albeit with varying degrees of phylogenetic penetration. Multi-modal strategies are crucial to investigate the interplay of mobilome, resistance and metabolism in cohabiting bacteria, especially in ecological settings where community-driven resistance selection occurs.
Affiliation(s)
- Michelle Baker
  - School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Loughborough, Leicestershire, LE12 5RD, UK
- Xibin Zhang
  - Shandong New Hope Liuhe Group Co. Ltd. and Qingdao Key Laboratory of Animal Feed Safety, Qingdao, Shandong, 266000, P. R. China
- Alexandre Maciel-Guerra
  - School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Loughborough, Leicestershire, LE12 5RD, UK
- Kubra Babaarslan
  - School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Loughborough, Leicestershire, LE12 5RD, UK
- Yinping Dong
  - NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, P. R. China
- Wei Wang
  - NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, P. R. China
- Yujie Hu
  - NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, P. R. China
- David Renney
  - Nimrod Veterinary Products Limited, 2, Wychwood Court, Cotswold Business Village, Moreton-in-Marsh, GL56 0JQ, London, UK
- Longhai Liu
  - Shandong Kaijia Food Co. Ltd, Weifang, P. R. China
- Hui Li
  - Luoyang Center for Disease Control and Prevention, No. 9, Zhenghe Road, Luolong District, Luoyang City, Henan Province, Luolong, 471000, P. R. China
- Maqsud Hossain
  - School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Loughborough, Leicestershire, LE12 5RD, UK
- Stephan Heeb
  - School of Life Sciences, University of Nottingham, East Drive, Nottingham, Nottinghamshire, NG7 2RD, UK
- Zhiqin Tong
  - Luoyang Center for Disease Control and Prevention, No. 9, Zhenghe Road, Luolong District, Luoyang City, Henan Province, Luolong, 471000, P. R. China
- Nicole Pearcy
  - School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Loughborough, Leicestershire, LE12 5RD, UK
  - School of Life Sciences, University of Nottingham, East Drive, Nottingham, Nottinghamshire, NG7 2RD, UK
- Meimei Zhang
  - Liaoning Provincial Center for Disease Control and Prevention, No. 168, Jinfeng Street, Hunnan District, Shenyang City, Liaoning Province, 110072, P. R. China
- Yingzhi Geng
  - Liaoning Provincial Center for Disease Control and Prevention, No. 168, Jinfeng Street, Hunnan District, Shenyang City, Liaoning Province, 110072, P. R. China
- Li Zhao
  - Agricultural Biopharmaceutical Laboratory, College of Chemistry and Pharmaceutical Sciences, Qingdao Agricultural University, No. 700 Changcheng Road, Chengyang District, Qingdao City, Shandong Province, 266109, P. R. China
- Zhihui Hao
  - Chinese Veterinary Medicine Innovation Center, College of Veterinary Medicine, China Agricultural University, Haidian District, Beijing City, 100193, P. R. China
- Nicola Senin
  - Department of Engineering, University of Perugia, Perugia, I06125, Italy
- Junshi Chen
  - NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, P. R. China
- Zixin Peng
  - NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, P. R. China
- Fengqin Li
  - NHC Key Laboratory of Food Safety Risk Assessment, China National Center for Food Safety Risk Assessment, Beijing, 100021, P. R. China
- Tania Dottorini
  - School of Veterinary Medicine and Science, University of Nottingham, College Road, Sutton Bonington, Loughborough, Leicestershire, LE12 5RD, UK
  - Centre for Smart Food Research, Nottingham Ningbo China Beacons of Excellence Research and Innovation Institute, University of Nottingham Ningbo China, Ningbo, 315100, P. R. China
8. Yang S, Chen J, Fu J, Huang J, Li T, Yao Z, Ye X. Disease-Associated Streptococcus pneumoniae Genetic Variation. Emerg Infect Dis 2024; 30:39-49. [PMID: 38146979 PMCID: PMC10756394 DOI: 10.3201/eid3001.221927] [Indexed: 12/27/2023] Open
Abstract
Streptococcus pneumoniae is an opportunistic pathogen that causes substantial illness and death among children worldwide. The genetic backgrounds of pneumococci that cause infection versus asymptomatic carriage vary substantially. To determine the evolutionary mechanisms of opportunistic pathogenicity, we conducted a genomic surveillance study in China. We collected 783 S. pneumoniae isolates from infected and asymptomatic children. By using a 2-stage genomewide association study process, we compared genomic differences between infection and carriage isolates to address genomic variation associated with pathogenicity. We identified 8 consensus k-mers associated with adherence, antimicrobial resistance, and immune modulation, which were unevenly distributed in the infection isolates. Classification accuracy of the best k-mer predictor for S. pneumoniae infection was good, giving a simple target for predicting pathogenic isolates. Our findings suggest that S. pneumoniae pathogenicity is complex and multifactorial, and we provide genetic evidence for precise targeted interventions.
9. Yurtseven A, Buyanova S, Agrawal AA, Bochkareva OO, Kalinina OV. Machine learning and phylogenetic analysis allow for predicting antibiotic resistance in M. tuberculosis. BMC Microbiol 2023; 23:404. [PMID: 38124060 PMCID: PMC10731705 DOI: 10.1186/s12866-023-03147-7] [Received: 09/12/2023] [Accepted: 12/07/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND Antimicrobial resistance (AMR) poses a significant global health threat, and an accurate prediction of bacterial resistance patterns is critical for effective treatment and control strategies. In recent years, machine learning (ML) approaches have emerged as powerful tools for analyzing large-scale bacterial AMR data. However, ML methods often ignore evolutionary relationships among bacterial strains, which can greatly affect their performance, especially when the goal is to detect resistance-associated features. Genome-wide association study (GWAS) methods such as linear mixed models account for the evolutionary relationships in bacteria, but they uncover only highly significant variants that have already been reported in the literature. RESULTS In this work, we introduce a novel phylogeny-related parallelism score (PRPS), which measures whether a certain feature is correlated with the population structure of a set of samples. We demonstrate that PRPS can be used, in combination with SVM- and random forest-based models, to reduce the number of features in the analysis, while simultaneously increasing the models' performance. We applied our pipeline to publicly available AMR data from the PATRIC database for Mycobacterium tuberculosis against six common antibiotics. CONCLUSIONS Using our pipeline, we re-discovered known resistance-associated mutations as well as new candidate mutations that may be related to resistance and have not previously been reported in the literature. We demonstrated that taking phylogenetic relationships into account not only improves model performance, but also yields more biologically relevant top-contributing resistance markers.
Collapse
Affiliation(s)
- Alper Yurtseven
  - Department of Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Campus E8.1, Saarbrücken, 66123, Saarland, Germany
  - Graduate School of Computer Science, Saarland University, Saarbrücken, 66123, Saarland, Germany
- Sofia Buyanova
  - Institute of Science and Technology Austria (ISTA), Am Campus 1, Klosterneuburg, 3400, Austria
- Amay Ajaykumar Agrawal
  - Department of Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Campus E8.1, Saarbrücken, 66123, Saarland, Germany
  - Graduate School of Computer Science, Saarland University, Saarbrücken, 66123, Saarland, Germany
- Olga O Bochkareva
  - Institute of Science and Technology Austria (ISTA), Am Campus 1, Klosterneuburg, 3400, Austria
  - Centre for Microbiology and Environmental Systems Science, Division of Computational System Biology, University of Vienna, Djerassiplatz 1 A, Wien, 1030, Austria
- Olga V Kalinina
  - Department of Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Campus E8.1, Saarbrücken, 66123, Saarland, Germany
  - Graduate School of Computer Science, Saarland University, Saarbrücken, 66123, Saarland, Germany
  - Faculty of Medicine, Saarland University, Homburg, 66421, Saarland, Germany
10
Dutta A, McDonald BA, Croll D. Combined reference-free and multi-reference based GWAS uncover cryptic variation underlying rapid adaptation in a fungal plant pathogen. PLoS Pathog 2023; 19:e1011801. [PMID: 37972199 PMCID: PMC10688896 DOI: 10.1371/journal.ppat.1011801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 11/30/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023] Open
Abstract
Microbial pathogens often harbor substantial functional diversity driven by structural genetic variation. Rapid adaptation from such standing variation threatens global food security and human health. Genome-wide association studies (GWAS) provide a powerful approach to identify genetic variants underlying recent pathogen adaptation. However, the reliance on single reference genomes and single nucleotide polymorphisms (SNPs) obscures the true extent of adaptive genetic variation. Here, we show quantitatively how a combination of multiple reference genomes and reference-free approaches captures substantially more relevant genetic variation compared to single reference mapping. We performed reference-genome based association mapping across 19 reference-quality genomes covering the diversity of the species. We contrasted the results with a reference-free (i.e., k-mer) approach using raw whole-genome sequencing data in a panel of 145 strains collected across the global distribution range of the fungal wheat pathogen Zymoseptoria tritici. We mapped the genetic architecture of 49 life history traits including virulence, reproduction and growth in multiple stressful environments. The inclusion of additional reference genome SNP datasets provides a nearly linear increase in additional loci mapped through GWAS. Variants detected through the k-mer approach explained a higher proportion of phenotypic variation than a reference genome-based approach and revealed functionally confirmed loci that classic GWAS approaches failed to map. The power of GWAS in microbial pathogens can be significantly enhanced by comprehensively capturing structural genetic variation. Our approach is generalizable to a large number of species and will uncover novel mechanisms driving rapid adaptation of pathogens.
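The reference-free (k-mer) idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences, the choice of k = 4, and the use of an uncorrected Fisher exact test on a binary phenotype are all assumptions made for illustration, and a real analysis would also need to account for population structure and multiple testing.

```python
from math import comb


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo, hi = max(0, row1 - (n - col1)), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def kmer_association(strains, k):
    """Map each k-mer to a p-value for association with a binary phenotype.

    strains: dict of strain name -> (sequence, phenotype in {0, 1}).
    """
    profiles = {name: kmers(seq, k) for name, (seq, _) in strains.items()}
    results = {}
    for kmer in set().union(*profiles.values()):
        a = b = c = d = 0
        for name, (_, phenotype) in strains.items():
            present = kmer in profiles[name]
            if present and phenotype:
                a += 1  # k-mer present, trait present
            elif present:
                b += 1  # k-mer present, trait absent
            elif phenotype:
                c += 1  # k-mer absent, trait present
            else:
                d += 1  # k-mer absent, trait absent
        results[kmer] = fisher_two_sided(a, b, c, d)
    return results


# Hypothetical toy data: four short sequences with a made-up binary trait.
toy_strains = {
    "s1": ("ACGTACGGA", 1),
    "s2": ("ACGTACGGT", 1),
    "s3": ("ACGTTCGGA", 0),
    "s4": ("ACGTTCGGT", 0),
}
pvalues = kmer_association(toy_strains, 4)
```

Because the k-mers are taken directly from the sequences, no reference genome enters the calculation, which is the property the abstract contrasts with single-reference SNP mapping.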
Affiliation(s)
- Anik Dutta
  - Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Bruce A. McDonald
  - Plant Pathology, Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Daniel Croll
  - Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
11
Trinh P, Clausen DS, Willis AD. happi: a hierarchical approach to pangenomics inference. Genome Biol 2023; 24:214. [PMID: 37773075 PMCID: PMC10540326 DOI: 10.1186/s13059-023-03040-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 08/16/2023] [Indexed: 09/30/2023] Open
Abstract
Recovering metagenome-assembled genomes (MAGs) from shotgun sequencing data is an increasingly common task in microbiome studies, as MAGs provide deeper insight into the functional potential of both culturable and non-culturable microorganisms. However, metagenome-assembled genomes vary in quality and may contain omissions and contamination. These errors present challenges for detecting genes and comparing gene enrichment across sample types. To address this, we propose happi, an approach to testing hypotheses about gene enrichment that accounts for genome quality. We illustrate the advantages of happi over existing approaches using published Saccharibacteria MAGs, Streptococcus thermophilus MAGs, and via simulation.
Affiliation(s)
- Pauline Trinh
  - Department of Environmental & Occupational Health Sciences, University of Washington, Seattle, WA, USA
- David S Clausen
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Amy D Willis
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
12
Lemieux JE, Huang W, Hill N, Cerar T, Freimark L, Hernandez S, Luban M, Maraspin V, Bogovič P, Ogrinc K, Ruzič-Sabljič E, Lapierre P, Lasek-Nesselquist E, Singh N, Iyer R, Liveris D, Reed KD, Leong JM, Branda JA, Steere AC, Wormser GP, Strle F, Sabeti PC, Schwartz I, Strle K. Whole genome sequencing of human Borrelia burgdorferi isolates reveals linked blocks of accessory genome elements located on plasmids and associated with human dissemination. PLoS Pathog 2023; 19:e1011243. [PMID: 37651316 PMCID: PMC10470944 DOI: 10.1371/journal.ppat.1011243] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/28/2023] [Accepted: 06/13/2023] [Indexed: 09/02/2023] Open
Abstract
Lyme disease is the most common vector-borne disease in North America and Europe. The clinical manifestations of Lyme disease vary based on the genospecies of the infecting Borrelia burgdorferi spirochete, but the microbial genetic elements underlying these associations are not known. Here, we report the whole genome sequence (WGS) and analysis of 299 B. burgdorferi (Bb) isolates derived from patients in the Eastern and Midwestern US and Central Europe. We develop a WGS-based classification of Bb isolates, confirm and extend the findings of previous single- and multi-locus typing systems, define the plasmid profiles of human-infectious Bb isolates, annotate the core and strain-variable surface lipoproteome, and identify loci associated with disseminated infection. A core genome consisting of ~900 open reading frames and a core set of plasmids consisting of lp17, lp25, lp36, lp28-3, lp28-4, lp54, and cp26 are found in nearly all isolates. Strain-variable (accessory) plasmids and genes correlate strongly with phylogeny. Using genetic association study methods, we identify an accessory genome signature associated with dissemination in humans and define the individual plasmids and genes that make up this signature. Strains within the RST1/WGS A subgroup, particularly a subset marked by the OspC type A genotype, have increased rates of dissemination in humans. OspC type A strains possess a unique set of strongly linked genetic elements including the presence of lp56 and lp28-1 plasmids and a cluster of genes that may contribute to their enhanced virulence compared to other genotypes. These features of OspC type A strains reflect a broader paradigm across Bb isolates, in which near-clonal genotypes are defined by strain-specific clusters of linked genetic elements, particularly those encoding surface-exposed lipoproteins. These clusters of genes are maintained by strain-specific patterns of plasmid occupancy and are associated with the probability of invasive infection.
Affiliation(s)
- Jacob E. Lemieux
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Weihua Huang
  - New York Medical College, Valhalla, New York, United States of America
  - East Carolina University, Greenville, North Carolina, United States of America
- Nathan Hill
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Tjasa Cerar
  - University of Ljubljana, Ljubljana, Slovenia
- Lisa Freimark
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Sergio Hernandez
  - Wadsworth Center, New York State Department of Health, Albany, New York, United States of America
- Matteo Luban
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Vera Maraspin
  - University Medical Center Ljubljana, Ljubljana, Slovenia
- Petra Bogovič
  - University Medical Center Ljubljana, Ljubljana, Slovenia
- Pascal Lapierre
  - Wadsworth Center, New York State Department of Health, Albany, New York, United States of America
- Erica Lasek-Nesselquist
  - Wadsworth Center, New York State Department of Health, Albany, New York, United States of America
- Navjot Singh
  - Wadsworth Center, New York State Department of Health, Albany, New York, United States of America
- Radha Iyer
  - New York Medical College, Valhalla, New York, United States of America
- Dionysios Liveris
  - New York Medical College, Valhalla, New York, United States of America
- Kurt D. Reed
  - University of Wisconsin, Madison, Wisconsin, United States of America
- John M. Leong
  - Tufts University School of Medicine, Boston, Massachusetts, United States of America
- John A. Branda
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Allen C. Steere
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Gary P. Wormser
  - New York Medical College, Valhalla, New York, United States of America
- Franc Strle
  - University Medical Center Ljubljana, Ljubljana, Slovenia
- Pardis C. Sabeti
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - Harvard University, Cambridge, Massachusetts, United States of America
  - Harvard T.H.Chan School of Public Health, Boston, Massachusetts, United States of America
- Ira Schwartz
  - New York Medical College, Valhalla, New York, United States of America
- Klemen Strle
  - Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
  - Wadsworth Center, New York State Department of Health, Albany, New York, United States of America
  - Tufts University School of Medicine, Boston, Massachusetts, United States of America
13
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023] Open
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Martin H Rau
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Benjamín J Sánchez
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Kristian Jensen
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Ahmad A Zeidan
  - Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
14
Mehta RS, Petit RA, Read TD, Weissman DB. Detecting patterns of accessory genome coevolution in Staphylococcus aureus using data from thousands of genomes. BMC Bioinformatics 2023; 24:243. [PMID: 37296404 PMCID: PMC10251594 DOI: 10.1186/s12859-023-05363-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/02/2023] [Accepted: 05/26/2023] [Indexed: 06/12/2023] Open
Abstract
Bacterial genomes exhibit widespread horizontal gene transfer, resulting in highly variable genome content that complicates the inference of genetic interactions. In this study, we develop a method for detecting coevolving genes from large datasets of bacterial genomes based on pairwise comparisons of closely related individuals, analogous to a pedigree study in eukaryotic populations. We apply our method to pairs of genes from the Staphylococcus aureus accessory genome of over 75,000 annotated gene families using a database of over 40,000 whole genomes. We find many pairs of genes that appear to be gained or lost in a coordinated manner, as well as pairs where the gain of one gene is associated with the loss of the other. These pairs form networks of rapidly coevolving genes, primarily consisting of genes involved in virulence, mechanisms of horizontal gene transfer, and antibiotic resistance, particularly the SCCmec complex. While we focus on gene gain and loss, our method can also detect genes that tend to acquire substitutions in tandem, or genotype-phenotype or phenotype-phenotype coevolution. Finally, we present the R package DeCoTUR that allows for the computation of our method.
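The pairwise-comparison idea in this abstract can be sketched in a few lines. This is a deliberately simplified illustration, not the statistic implemented in the DeCoTUR R package: the function name, the toy presence/absence data, and the scoring rule (summing the product of presence differences over closely related genome pairs) are assumptions. A positive score suggests two genes tend to be gained or lost together; a negative score suggests the gain of one accompanies the loss of the other.

```python
from itertools import combinations


def coevolution_scores(presence, close_pairs):
    """Score every gene pair over a list of closely related genome pairs.

    presence: dict of genome name -> set of genes it carries.
    close_pairs: list of (genome_a, genome_b) tuples judged closely related.
    """
    genes = sorted(set().union(*presence.values()))
    scores = {}
    for g1, g2 in combinations(genes, 2):
        total = 0
        for a, b in close_pairs:
            # Presence difference is +1, 0, or -1 for each gene in this pair.
            d1 = (g1 in presence[a]) - (g1 in presence[b])
            d2 = (g2 in presence[a]) - (g2 in presence[b])
            # Same-direction differences add to the score, opposite ones subtract.
            total += d1 * d2
        scores[(g1, g2)] = total
    return scores


# Hypothetical toy data: three pairs of closely related genomes.
toy_presence = {
    "a1": {"geneX", "geneY"},
    "a2": {"geneZ"},
    "b1": {"geneZ"},
    "b2": {"geneX", "geneY"},
    "c1": {"geneX", "geneY", "geneZ"},
    "c2": {"geneX", "geneY", "geneZ"},
}
toy_pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]
toy_scores = coevolution_scores(toy_presence, toy_pairs)
```

Restricting the comparison to closely related pairs means that differences reflect recent gain or loss events rather than deep shared ancestry, which is the motivation the abstract gives for the pedigree-like design.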
Affiliation(s)
- Rohan S Mehta
  - Department of Physics, Emory University, Atlanta, GA, USA
- Robert A Petit
  - Division of Infectious Diseases, Department of Medicine, School of Medicine, Emory University, Atlanta, GA, USA
  - Wyoming Public Health Laboratory, Cheyenne, WY, USA
- Timothy D Read
  - Division of Infectious Diseases, Department of Medicine, School of Medicine, Emory University, Atlanta, GA, USA
  - Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA, USA
15
Artificial Intelligence for Antimicrobial Resistance Prediction: Challenges and Opportunities towards Practical Implementation. Antibiotics (Basel) 2023; 12:antibiotics12030523. [PMID: 36978390 PMCID: PMC10044311 DOI: 10.3390/antibiotics12030523] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 02/07/2023] [Revised: 03/01/2023] [Accepted: 03/03/2023] [Indexed: 03/08/2023] Open
Abstract
Antimicrobial resistance (AMR) is emerging as a potential threat to many lives worldwide. From a medical treatment point of view, it is very important to understand and apply effective strategies to counter the impact of AMR and its mutations. The application of artificial intelligence (AI), especially deep learning and machine learning, has opened a new direction in antimicrobial identification. Furthermore, the availability of huge amounts of data from multiple sources has made it more effective to use these AI techniques to gain insights into AMR, such as new genes, mutations, drug identification, and conditions favorable to spread. This paper therefore presents a review of state-of-the-art challenges and opportunities. These include input features that are of interest but challenging to use, state-of-the-art deep-learning/machine-learning models for robustness and high accuracy, and the challenges and prospects of applying these techniques for practical purposes. The paper concludes by encouraging the application of AI to the AMR sector with a view to practical diagnosis and treatment, since most studies are currently at early stages with minimal application in the practice of diagnosis and treatment of disease.
16
Newberry EA, Minsavage GV, Holland A, Jones JB, Potnis N. Genome-Wide Association to Study the Host-Specificity Determinants of Xanthomonas perforans. Phytopathology 2023; 113:400-412. [PMID: 36318253 DOI: 10.1094/phyto-08-22-0294-r] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/16/2023]
Abstract
Xanthomonas perforans and X. euvesicatoria are the causal agents of bacterial spot disease of tomato and pepper, endemic to the Southeastern United States. Although very closely related, the two bacterial species differ in host specificity, where X. perforans is the dominant pathogen of tomato and X. euvesicatoria that of pepper. This is in part due to the activity of avirulence proteins that are secreted by X. perforans strains and elicit effector-triggered immunity in pepper leaves, thereby restricting pathogen growth. In recent years, the emergence of several pepper-pathogenic X. perforans lineages has revealed variability within the bacterial species in the ability to multiply and cause disease in pepper, even in the absence of avirulence gene activity. Here, we investigated the basal evolutionary processes underlying the host range of this species using multiple genome-wide association analyses. Surprisingly, we identified two novel gene candidates that were significantly associated with pepper-pathogenic X. perforans and X. euvesicatoria. Both candidates were predicted to be involved in the transport/acquisition of nutrients common to the plant cell wall or apoplast. One was a TonB-dependent receptor, which was disrupted through independent mutations within the X. perforans lineage. The other was a proton/glutamate symporter, gltP, enriched with pepper-associated mutations near the promoter and start codon of the gene. Functional analysis of these candidates revealed that only the TonB-dependent receptor had a minor effect on the symptom development and growth of X. perforans in pepper leaves, indicating that pathogenicity to this host might have evolved independently within the bacterial species and is likely a complex, multigenic trait.
Affiliation(s)
- Eric A Newberry
  - Department of Entomology and Plant Pathology, Auburn University, AL 36849
- Auston Holland
  - Department of Entomology and Plant Pathology, Auburn University, AL 36849
- Jeffrey B Jones
  - Department of Plant Pathology, University of Florida, FL 32611
- Neha Potnis
  - Department of Entomology and Plant Pathology, Auburn University, AL 36849
17
Migration Rates on Swim Plates Vary between Escherichia coli Soil Isolates: Differences Are Associated with Variants in Metabolic Genes. Appl Environ Microbiol 2023; 89:e0172722. [PMID: 36695629 PMCID: PMC9972950 DOI: 10.1128/aem.01727-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023] Open
Abstract
This study investigates migration phenotypes of 265 Escherichia coli soil isolates from the Buffalo River basin in Minnesota, USA. Migration rates on semisolid tryptone swim plates ranged from nonmotile to 190% of the migration rate of a highly motile E. coli K-12 strain. The nonmotile isolate, LGE0550, had mutations in flagellar and chemotaxis genes, including two IS3 elements in the flagellin-encoding gene fliC. A genome-wide association study (GWAS), associating the migration rates with genetic variants in specific genes, yielded two metabolic variants (rygD-serA and metR-metE) with previous implications in chemotaxis. As a novel way of confirming GWAS results, we used minimal medium swim plates to confirm the associations. Other variants in metabolic genes and genes that are associated with biofilm were positively or negatively associated with migration rates. A determination of growth phenotypes on Biolog EcoPlates yielded differential growth for the 10 tested isolates on d-malic acid, putrescine, and d-xylose, all of which are important in the soil environment. IMPORTANCE: E. coli is a Gram-negative, facultative anaerobic bacterium whose life cycle includes extrahost environments in addition to human, animal, and plant hosts. The bacterium has the genomic capability of being motile. In this context, the significance of this study is severalfold: (i) the great diversity of migration phenotypes that we observed within our isolate collection supports previous ideas of soil promoting phenotypic heterogeneity (G. NandaKafle, A. A. Christie, S. Vilain, and V. S. Brözel, Front Microbiol 9:762, 2018, https://doi.org/10.3389/fmicb.2018.00762; Y. Somorin, F. Abram, F. Brennan, and C. O'Byrne, Appl Environ Microbiol 82:4628-4640, 2016, https://doi.org/10.1128/AEM.01175-16), (ii) such heterogeneity may facilitate bacterial growth in the many different soil niches, and (iii) such heterogeneity may enable the bacteria to interact with human, animal, and plant hosts.
18
Lemieux JE, Huang W, Hill N, Cerar T, Freimark L, Hernandez S, Luban M, Maraspin V, Bogovic P, Ogrinc K, Ruzic-Sabljic E, Lapierre P, Lasek-Nesselquist E, Singh N, Iyer R, Liveris D, Reed KD, Leong JM, Branda JA, Steere AC, Wormser GP, Strle F, Sabeti PC, Schwartz I, Strle K. Whole genome sequencing of Borrelia burgdorferi isolates reveals linked clusters of plasmid-borne accessory genome elements associated with virulence. bioRxiv 2023:2023.02.26.530159. [PMID: 36909473 PMCID: PMC10002713 DOI: 10.1101/2023.02.26.530159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2023]
Abstract
Lyme disease is the most common vector-borne disease in North America and Europe. The clinical manifestations of Lyme disease vary based on the genospecies of the infecting Borrelia burgdorferi spirochete, but the microbial genetic elements underlying these associations are not known. Here, we report the whole genome sequence (WGS) and analysis of 299 patient-derived B. burgdorferi sensu stricto (Bbss) isolates from patients in the Eastern and Midwestern US and Central Europe. We develop a WGS-based classification of Bbss isolates, confirm and extend the findings of previous single- and multi-locus typing systems, define the plasmid profiles of human-infectious Bbss isolates, annotate the core and strain-variable surface lipoproteome, and identify loci associated with disseminated infection. A core genome consisting of ∼800 open reading frames and a core set of plasmids consisting of lp17, lp25, lp36, lp28-3, lp28-4, lp54, and cp26 are found in nearly all isolates. Strain-variable (accessory) plasmids and genes correlate strongly with phylogeny. Using genetic association study methods, we identify an accessory genome signature associated with dissemination and define the individual plasmids and genes that make up this signature. Strains within the RST1/WGS A subgroup, particularly a subset marked by the OspC type A genotype, are associated with increased rates of dissemination. OspC type A strains possess a unique constellation of strongly linked genetic changes including the presence of lp56 and lp28-1 plasmids and a cluster of genes that may contribute to their enhanced virulence compared to other genotypes. The patterns of OspC type A strains typify a broader paradigm across Bbss isolates, in which genetic structure is defined by correlated groups of strain-variable genes located predominantly on plasmids, particularly for expression of surface-exposed lipoproteins. These clusters of genes are inherited in blocks through strain-specific patterns of plasmid occupancy and are associated with the probability of invasive infection.
Affiliation(s)
- Jacob E Lemieux
  - Massachusetts General Hospital, Harvard Medical School
  - Broad Institute of MIT and Harvard
- Weihua Huang
  - New York Medical College
  - East Carolina University
- Nathan Hill
  - Massachusetts General Hospital, Harvard Medical School
  - Broad Institute of MIT and Harvard
- Matteo Luban
  - Massachusetts General Hospital, Harvard Medical School
  - Broad Institute of MIT and Harvard
- John M Leong
  - Tufts University, Department of Molecular Biology and Microbiology
- John A Branda
  - Massachusetts General Hospital, Harvard Medical School
- Pardis C Sabeti
  - Massachusetts General Hospital, Harvard Medical School
  - Broad Institute of MIT and Harvard
  - Harvard University
  - Harvard T.H.Chan School of Public Health
- Klemen Strle
  - Massachusetts General Hospital, Harvard Medical School
  - Wadsworth Center
19
Genomic Characterization of Skin and Soft Tissue Streptococcus pyogenes Isolates from a Low-Income and a High-Income Setting. mSphere 2023; 8:e0046922. [PMID: 36507654 PMCID: PMC9942559 DOI: 10.1128/msphere.00469-22] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022] Open
Abstract
Streptococcus pyogenes is a leading cause of human morbidity and mortality, especially in resource-limited settings. The development of a vaccine against S. pyogenes is a global health priority to reduce the burden of postinfection rheumatic heart disease. To support this, molecular characterization of circulating S. pyogenes isolates is needed. We performed whole-genome analyses of S. pyogenes isolates from skin and soft tissue infections in Sukuta, The Gambia, a low-income country (LIC) in West Africa where there is a high burden of such infections. To act as a comparator to these LIC isolates, skin infection isolates from Sheffield, United Kingdom (a high-income country [HIC]), were also sequenced. The LIC isolates from The Gambia were genetically more diverse (46 emm types in 107 isolates) than the HIC isolates from Sheffield (23 emm types in 142 isolates), with only 7 overlapping emm types. Other molecular markers were shared, including a high prevalence of the skin infection-associated emm pattern D and the variable fibronectin-collagen-T antigen (FCT) types FCT-3 and FCT-4. Fewer of the Gambian LIC isolates carried prophage-associated superantigens (64%) and DNases (26%) than did the Sheffield HIC isolates (99% and 95%, respectively). We also identified streptococcin genes unique to 36% of the Gambian LIC isolates and a higher prevalence (48%) of glucuronic acid utilization pathway genes in the Gambian LIC isolates than in the Sheffield HIC isolates (26%). Comparison to a wider collection of HIC and LIC isolate genomes supported our findings of differing emm diversity and prevalence of bacterial factors. Our study provides insight into the genetics of LIC isolates and how they compare to HIC isolates. IMPORTANCE The global burden of rheumatic heart disease (RHD) has triggered a World Health Organization response to drive forward development of a vaccine against the causative human pathogen Streptococcus pyogenes. 
This burden stems primarily from low- and middle-income settings where there are high levels of S. pyogenes skin and soft tissue infections, which can lead to RHD. Our study provides much needed whole-genome-based molecular characterization of isolates causing skin infections in Sukuta, The Gambia, a low-income country (LIC) in West Africa where infection and RHD rates are high. Although we identified a greater level of diversity in these LIC isolates than in isolates from Sheffield, United Kingdom (a high-income country), there were some shared features. There were also some features that differed by geographical region, warranting further investigation into their contribution to infection. Our study has also contributed data essential for the development of a vaccine that would target geographically relevant strains.
20
Myers BK, Shin GY, Agarwal G, Stice SP, Gitaitis RD, Kvitko BH, Dutta B. Genome-wide association and dissociation studies in Pantoea ananatis reveal potential virulence factors affecting Allium porrum and Allium fistulosum × Allium cepa hybrid. Front Microbiol 2023; 13:1094155. [PMID: 36817114 PMCID: PMC9933511 DOI: 10.3389/fmicb.2022.1094155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Accepted: 12/30/2022] [Indexed: 02/05/2023] Open
Abstract
Pantoea ananatis is a member of a Pantoea species complex that causes center rot of bulb onions (Allium cepa) and also infects other Allium crops like leeks (Allium porrum), chives (Allium schoenoprasum), bunching onion or Welsh onion (Allium fistulosum), and garlic (Allium sativum). This pathogen relies on a chromosomal phosphonate biosynthetic gene cluster (HiVir) and a plasmid-borne thiosulfinate tolerance cluster (alt) for onion pathogenicity and virulence, respectively. However, pathogenicity and virulence factors associated with other Allium species remain unknown. We used phenotype-dependent genome-wide association (GWAS) and phenotype-independent gene-pair coincidence (GPC) analyses on a diverse panel of 92 P. ananatis strains, which were inoculated on A. porrum and A. fistulosum × A. cepa under greenhouse conditions. Phenotypic assays showed that, in general, these strains were more aggressive on A. fistulosum × A. cepa than on A. porrum. Of the 92 strains, only six showed highly aggressive foliar lesions on A. porrum compared to A. fistulosum × A. cepa. Conversely, nine strains showed highly aggressive foliar lesions on A. fistulosum × A. cepa compared to A. porrum. These results indicate that there are underlying genetic components in P. ananatis that may drive pathogenicity in these two Allium spp. Based on GWAS for foliar pathogenicity, 835 genes were associated with P. ananatis' pathogenicity on A. fistulosum × A. cepa whereas 243 genes were associated with bacterial pathogenicity on A. porrum. The HiVir as well as the alt gene clusters were identified among these genes. Besides the HiVir and the alt gene clusters that are known to contribute to pathogenicity and virulence from previous studies, genes annotated with functions related to stress responses, a potential toxin-antitoxin system, flagellar motility, quorum sensing, and a previously described phosphonoglycan biosynthesis (pgb) cluster were identified. The GPC analysis resulted in the identification of 165 individual genes sorted into 39 significant gene-pair association components and 255 genes sorted into 50 significant gene-pair dissociation components. Within the coincident gene clusters, several genes that occurred in the GWAS outputs were associated with each other but dissociated from genes that did not appear in their respective GWAS output. To focus on candidate genes that could explain the difference in virulence between hosts, a comparative genomics analysis was performed on five P. ananatis strains that were differentially pathogenic on A. porrum or A. fistulosum × A. cepa. Here, we found a putative type III secretion system and several other genes that occurred in the GWAS outputs for both Allium hosts. Further, we also demonstrated using mutational analysis that the pepM gene in the HiVir cluster is more important than the pepM gene in the pgb cluster for P. ananatis pathogenicity in A. fistulosum × A. cepa and A. porrum. Overall, our results support the view that P. ananatis may utilize a common set of genes or gene clusters to induce symptoms on A. fistulosum × A. cepa foliar tissue as well as A. cepa, but implicate additional genes for infection on A. porrum.
Affiliation(s)
- Brendon K. Myers
- Department of Plant Pathology, The University of Georgia, Tifton, GA, United States
- Gi Yoon Shin
- Department of Plant Pathology, The University of Georgia, Athens, GA, United States
- Gaurav Agarwal
- Department of Plant Pathology, The University of Georgia, Tifton, GA, United States
- Shaun P. Stice
- Department of Plant Pathology, The University of Georgia, Athens, GA, United States
- Ronald D. Gitaitis
- Department of Plant Pathology, The University of Georgia, Tifton, GA, United States
- Brian H. Kvitko
- Department of Plant Pathology, The University of Georgia, Athens, GA, United States
- Bhabesh Dutta
- Department of Plant Pathology, The University of Georgia, Tifton, GA, United States. Correspondence: Bhabesh Dutta
21
Li S, Wu J, Ma N, Liu W, Shao M, Ying N, Zhu L. Prediction of genome-wide imipenem resistance features in Klebsiella pneumoniae using machine learning. J Med Microbiol 2023; 72. [PMID: 36753438 DOI: 10.1099/jmm.0.001657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023]
Abstract
Introduction. The resistance rate of Klebsiella pneumoniae (K. pneumoniae) to imipenem is increasing year by year, and the imipenem resistance mechanism of K. pneumoniae is complex. Therefore, it is urgent to develop new strategies to explore the resistance mechanism of imipenem for its effective and accurate use in clinical practice. Hypothesis/Gap Statement. Machine learning could identify resistance features and biological processes that influence microbial resistance from whole-genome sequencing (WGS) data. Aims. This work aimed to predict imipenem resistance genetic features in K. pneumoniae from whole-genome k-mer features, and to analyse their function to understand the resistance mechanism. Methods. This study analysed WGS data of K. pneumoniae combined with imipenem resistance phenotypes, and established a K. pneumoniae imipenem genotype-phenotype model to predict resistance features using the chi-squared test and random forest. An external clinical dataset was used to verify the predictive power of the resistance features. Potential genes were identified by aligning the resistance features with the K. pneumoniae reference genome using blastn; the functions of the potential genes were further analysed with GO and KEGG analysis to explore resistance-related signalling pathways; and resistance sequence patterns were screened using streme software. Finally, the resistance features were combined and modelled with four machine-learning algorithms (logistic regression, SVM, GBDT and XGBoost) to evaluate their phenotype prediction ability. Results. A total of 16 670 imipenem resistance features were predicted from the genotype-phenotype model. Thirty potential genes were identified by annotating the resistance features and corresponded to known antibiotic-related genes (mdtM, dedA, rne, etc.). GO and KEGG pathway analyses indicated the possible association of imipenem resistance with metabolic processes and the cell membrane.
CRYCAGCDN and CGRDAAAN were found among the imipenem resistance features and were widely present in reported β-lactam resistance genes (bla SHV, bla CTX-M, bla TEM, etc.), and YCYAGCMCAST, with metabolic functions (organic substance metabolic process, nitrogen compound metabolic process and cellular metabolic process), was identified from the top 50 resistance features. The 25 resistance genes in the training dataset included 19 genes in the external dataset, which verified the accuracy of prediction. The area under curve values of logistic regression, SVM, GBDT and XGBoost were 0.965, 0.966, 0.969 and 0.969, respectively, indicating that the imipenem resistance features have strong predictive power. Conclusion. Machine-learning methods could effectively predict imipenem resistance features in K. pneumoniae and provide resistance sequence profiles for predicting resistance phenotype and exploring potential resistance mechanisms. This provides important insight into potential therapeutic strategies for K. pneumoniae resistance to imipenem and could speed up the application of machine learning in routine diagnosis.
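The chi-squared screening step mentioned in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy sequences, the labels, and the helper names `kmers`, `chi2_2x2` and `rank_kmers` are all assumptions made for illustration, and the random forest stage is omitted. Each k-mer is scored by a Pearson chi-squared statistic on a 2x2 table of presence/absence against resistant/susceptible.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers present in one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for the table [[a, b], [c, d]].

    No continuity correction is applied; degenerate tables score 0.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def rank_kmers(genomes, resistant, k):
    """Rank k-mers by association between presence and resistance."""
    present = [kmers(g, k) for g in genomes]
    n_res = sum(resistant)
    n_sus = len(resistant) - n_res
    scores = {}
    for kmer in set().union(*present):
        # a: resistant genomes carrying the k-mer, c: susceptible carriers
        a = sum(1 for p, r in zip(present, resistant) if r and kmer in p)
        c = sum(1 for p, r in zip(present, resistant) if not r and kmer in p)
        scores[kmer] = chi2_2x2(a, n_res - a, c, n_sus - c)
    # Highest statistic first; ties broken alphabetically for determinism
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data: two resistant and two susceptible "genomes"
toy_genomes = ["ACGTAC", "ACGTTT", "TTTTAC", "TTTTTT"]
toy_resistant = [True, True, False, False]
ranking = rank_kmers(toy_genomes, toy_resistant, k=3)
```

In a real analysis the statistic would be converted to a p-value and corrected for the very large number of k-mers tested; that step is left out here.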
Affiliation(s)
- Shanshan Li
- College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China
- Jun Wu
- Lin'an Center for Disease Control and Prevention, Lin'an, 311300, PR China
- Nan Ma
- College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China
- Wenjia Liu
- College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China; College of Electronics and Information Engineering, Hangzhou Dianzi University, Hangzhou 310018, PR China
- Mengjie Shao
- College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China
- Nanjiao Ying
- College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China; Institute of Biomedical Engineering and Instrument, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China
- Lei Zhu
- College of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China; Institute of Biomedical Engineering and Instrument, Hangzhou Dianzi University, Hangzhou, Zhejiang, 310018, PR China
22
Hilton B, Wilson DJ, O'Connell AM, Ironmonger D, Rudkin JK, Allen N, Oliver I, Wyllie DH. Laboratory diagnosed microbial infection in English UK Biobank participants in comparison to the general population. Sci Rep 2023; 13:496. [PMID: 36627297 PMCID: PMC9831014 DOI: 10.1038/s41598-022-20635-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/03/2022] [Accepted: 09/15/2022] [Indexed: 01/11/2023]
Abstract
Understanding of the genetic and environmental risk factors for serious bacterial infections in ageing populations remains incomplete. Utilising the UK Biobank (UKB), a prospective cohort study of 500,000 adults aged 40-69 years at recruitment (2006-2010), can help address this. Partial implementation of such a system helped groups around the world make rapid progress in understanding risk factors for SARS-CoV-2 infection and COVID-19, with insights appearing as early as May 2020. In principle, such approaches could also be used for bacterial isolations. Here we report feasibility testing of linking an England-wide dataset of microbial reporting to UKB participants, to enable characterisation of microbial infections within the UKB cohort. These records pertain mainly to bacterial isolations; SARS-CoV-2 isolations were not included. Microbiological infections occurring in patients in England, as recorded in the Public Health England second generation surveillance system (SGSS), were linked to UKB participants using pseudonymised identifiers. By January 2015, ascertainment of laboratory reports from UKB participants by SGSS was estimated at 98%. In 2015, 4.5% of English UKB participants had a positive microbiological isolate. Half of UKB isolates came from 12 laboratories, and 70% from 21 laboratories. Incidence rate ratios for microbial isolation, which is indicative of serious infection, from the UKB cohort relative to the comparably aged general population ranged from 0.6 to 1, compatible with the previously described healthy participant bias in UKB. Data on microbial isolations can be linked to UKB participants from January 2015 onwards. These linked data would offer new opportunities for research into the role of bacterial agents in health and disease in middle-to-old age.
Affiliation(s)
- Daniel J Wilson
- Nuffield Department of Population Health, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Justine K Rudkin
- Nuffield Department of Population Health, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Naomi Allen
- Nuffield Department of Population Health, Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- David H Wyllie
- UK Health Security Agency, London, UK.
- Nuffield Department of Medicine, University of Oxford, Oxford, UK.
23
Javkar K, Rand H, Strain E, Pop M. PRAWNS: compact pan-genomic features for whole-genome population genomics. Bioinformatics 2022; 39:6965020. [PMID: 36579850 PMCID: PMC9825322 DOI: 10.1093/bioinformatics/btac844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Revised: 11/09/2022] [Accepted: 12/28/2022] [Indexed: 12/30/2022]
Abstract
MOTIVATION Scientists seeking to understand the genomic basis of bacterial phenotypes, such as antibiotic resistance, today have access to an unprecedented number of complete and nearly complete genomes. Making sense of these data requires computational tools able to perform multiple-genome comparisons efficiently, yet currently available tools cannot scale beyond several tens of genomes. RESULTS We describe PRAWNS, an efficient and scalable tool for multiple-genome analysis. PRAWNS defines a concise set of genomic features (metablocks), as well as pairwise relationships between them, which can be used as a basis for large-scale genotype-phenotype association studies. We demonstrate the effectiveness of PRAWNS by identifying genomic regions associated with antibiotic resistance in Acinetobacter baumannii. AVAILABILITY AND IMPLEMENTATION PRAWNS is implemented in C++ and Python3, licensed under the GPLv3 license, and freely downloadable from GitHub (https://github.com/KiranJavkar/PRAWNS.git). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kiran Javkar
- Department of Computer Science, University of Maryland, College Park, MD 20742, USA; Joint Institute for Food Safety and Applied Nutrition, University of Maryland, College Park, MD 20740, USA
- Hugh Rand
- Center for Food Safety and Applied Nutrition, United States Food and Drug Administration, College Park, MD 20740, USA
- Errol Strain
- Center for Veterinary Medicine, United States Food and Drug Administration, Laurel, MD 20708, USA
- Mihai Pop
- To whom correspondence should be addressed.
24
Charity OJ, Acton L, Bawn M, Tassinari E, Thilliez G, Chattaway MA, Dallman TJ, Petrovska L, Kingsley RA. Increased phage resistance through lysogenic conversion accompanying emergence of monophasic Salmonella Typhimurium ST34 pandemic strain. Microb Genom 2022; 8:mgen000897. [PMID: 36382789 PMCID: PMC9836087 DOI: 10.1099/mgen.0.000897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Salmonella enterica serovar Typhimurium (S. Typhimurium) comprises a group of closely related human and animal pathogens that account for a large proportion of all Salmonella infections globally. The epidemiological record of S. Typhimurium in Europe is characterized by successive waves of dominant clones, each prevailing for approximately 10-15 years before replacement. Succession of epidemic clones may represent a moving target for interventions aimed at controlling the spread and impact of this pathogen on human and animal health. Here, we investigate the relationship of phage sensitivity and population structure of S. Typhimurium using data from the Anderson phage typing scheme. We observed greater resistance to phage predation of epidemic clones circulating in livestock over the past decades compared to variants with a restricted host range implicating increased resistance to phage in the emergence of epidemic clones of particular importance to human health. Emergence of monophasic S. Typhimurium ST34, the most recent dominant multidrug-resistant clone, was accompanied by increased resistance to phage predation during clonal expansion, in part by the acquisition of the mTmII prophage that may have contributed to the fitness of the strains that replaced ancestors lacking this prophage.
Affiliation(s)
- Oliver J. Charity
- Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK; University of East Anglia, Norwich NR4 7TJ, UK
- Luke Acton
- Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK; University of East Anglia, Norwich NR4 7TJ, UK
- Matt Bawn
- Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK; Earlham Institute, Norwich, NR4 7UZ, UK
- Eleonora Tassinari
- Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK; University of East Anglia, Norwich NR4 7TJ, UK
- Gaëtan Thilliez
- Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK
- Marie A. Chattaway
- Gastrointestinal Bacteria Reference Unit, UK Health Security Agency (UKHSA), London, NW9 5EQ, UK
- Timothy J. Dallman
- Gastrointestinal Bacteria Reference Unit, UK Health Security Agency (UKHSA), London, NW9 5EQ, UK
- Liljana Petrovska
- Animal & Plant Health Agency (APHA), Weybridge, London, KT15 3NB, UK
- Robert A. Kingsley
- Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK; University of East Anglia, Norwich NR4 7TJ, UK. Correspondence: Robert A. Kingsley
25
Kremer PHC, Ferwerda B, Bootsma HJ, Rots NY, Wijmenga-Monsuur AJ, Sanders EAM, Trzciński K, Wyllie AL, Turner P, van der Ende A, Brouwer MC, Bentley SD, van de Beek D, Lees JA. Pneumococcal genetic variability in age-dependent bacterial carriage. eLife 2022; 11:e69244. [PMID: 35881438 PMCID: PMC9395192 DOI: 10.7554/elife.69244] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/08/2021] [Accepted: 07/03/2022] [Indexed: 11/13/2022]
Abstract
The characteristics of pneumococcal carriage vary between infants and adults. Host immune factors have been shown to contribute to these age-specific differences, but the role of pathogen sequence variation is currently less well-known. Identification of age-associated pathogen genetic factors could lead to improved vaccine formulations. We therefore performed genome sequencing in a large carriage cohort of children and adults and combined this with data from an existing age-stratified carriage study. We compiled a dictionary of pathogen genetic variation, including serotype, strain, sequence elements, single-nucleotide polymorphisms (SNPs), and clusters of orthologous genes (COGs) for each cohort - all of which were used in a genome-wide association with host age. Age-dependent colonization showed weak evidence of being heritable in the first cohort (h² = 0.10, 95% CI 0.00-0.69) and stronger evidence in the second cohort (h² = 0.56, 95% CI 0.23-0.87). We found that serotypes and genetic background (strain) explained a proportion of the heritability in the first cohort (h²_serotype = 0.07, 95% CI 0.04-0.14 and h²_GPSC = 0.06, 95% CI 0.03-0.13) and the second cohort (h²_serotype = 0.11, 95% CI 0.05-0.21 and h²_GPSC = 0.20, 95% CI 0.12-0.31). In a meta-analysis of these cohorts, we found one candidate association (p = 1.2 × 10⁻⁹) upstream of an accessory Sec-dependent serine-rich glycoprotein adhesin. Overall, while we did find a small effect of pathogen genome variation on pneumococcal carriage between child and adult hosts, this was variable between populations and does not appear to be caused by strong effects of individual genes. This supports proposals for adaptive future vaccination strategies that are primarily targeted at dominant circulating serotypes and tailored to the composition of the pathogen populations.
Affiliation(s)
- Philip HC Kremer
- Department of Neurology, Amsterdam UMC, University of Amsterdam, Meibergdreef, Netherlands
- Bart Ferwerda
- Department of Neurology, Amsterdam UMC, University of Amsterdam, Meibergdreef, Netherlands
- Department of Clinical Epidemiology, Biostatistics and Bioinformatics, University of Amsterdam, Amsterdam, Netherlands
- Hester J Bootsma
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment, Bilthoven, Netherlands
- Nienke Y Rots
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment, Bilthoven, Netherlands
- Alienke J Wijmenga-Monsuur
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment, Bilthoven, Netherlands
- Elisabeth AM Sanders
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment, Bilthoven, Netherlands
- Department of Pediatric Immunology and Infectious D, Wilhelmina Children's Hospital, Utrecht, Netherlands
- Krzysztof Trzciński
- Department of Pediatric Immunology and Infectious D, Wilhelmina Children's Hospital, Utrecht, Netherlands
- Anne L Wyllie
- Department of Pediatric Immunology and Infectious D, Wilhelmina Children's Hospital, Utrecht, Netherlands
- Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States
- Paul Turner
- Cambodia Oxford Medical Research Unit, Angkor Hospital for Children, Siem Reap, Cambodia
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Arie van der Ende
- Department of Medical Microbiology and Infection Prevention, Amsterdam UMC, Amsterdam, Netherlands
- The Netherlands Reference Laboratory for Bacterial Meningitis, Amsterdam, Netherlands
- Matthijs C Brouwer
- Department of Neurology, Amsterdam UMC, University of Amsterdam, Meibergdreef, Netherlands
- Stephen D Bentley
- Parasites and Microbes, Wellcome Sanger Institute, Cambridge, United Kingdom
- Diederik van de Beek
- Department of Neurology, Amsterdam UMC, University of Amsterdam, Meibergdreef, Netherlands
- John A Lees
- European Molecular Biology Laboratory–European Bioinformatics Institute, Cambridge, United Kingdom
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
26
Genome-Wide Association Study of Nucleotide Variants Associated with Resistance to Nine Antimicrobials in Mycoplasma bovis. Microorganisms 2022; 10:microorganisms10071366. [PMID: 35889084 PMCID: PMC9320666 DOI: 10.3390/microorganisms10071366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 12/03/2022]
Abstract
Antimicrobial resistance (AMR) studies of Mycoplasma bovis have generally focused on specific loci versus using a genome-wide association study (GWAS) approach. A GWAS approach, using two different models, was applied to 194 Mycoplasma bovis genomes. Both a fixed effects linear model (FEM) and a linear mixed model (LMM) identified associations between nucleotide variants (NVs) and antimicrobial susceptibility testing (AST) phenotypes. The AMR phenotypes represented fluoroquinolones, tetracyclines, phenicols, and macrolides. Both models identified known and novel NVs associated (Bonferroni adjusted p < 0.05) with AMR. Fluoroquinolone resistance was associated with multiple NVs, including previously identified mutations in gyrA and parC. NVs in the 30S ribosomal protein 16S were associated with tetracycline resistance, whereas NVs in 5S rRNA, 23S rRNA, and 50S ribosomal proteins were associated with phenicol and macrolide resistance. For all antimicrobial classes, resistance was associated with NVs in genes coding for ABC transporters and other membrane proteins, tRNA-ligases, peptidases, and transposases, suggesting a NV-based multifactorial model of AMR in M. bovis. This study was the largest collection of North American M. bovis isolates used with a GWAS for the sole purpose of identifying novel and non-antimicrobial-target NVs associated with AMR.
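The significance criterion quoted above (Bonferroni adjusted p < 0.05) can be sketched in a few lines. This is a generic illustration of the correction, not code from the study; the function name `bonferroni` and the example p-values are assumptions.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, is_significant) for each input p-value.

    The Bonferroni adjustment multiplies each raw p-value by the number
    of tests m and caps the result at 1.0.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical raw p-values for three tested nucleotide variants
results = bonferroni([0.001, 0.02, 0.5])
flags = [significant for _, significant in results]
```

With three tests, a raw p-value of 0.02 is adjusted to 0.06 and no longer passes the 0.05 threshold, which shows how strict the correction becomes when many variants are tested.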
27
Iquebal MA, Jagannadham J, Jaiswal S, Prabha R, Rai A, Kumar D. Potential Use of Microbial Community Genomes in Various Dimensions of Agriculture Productivity and Its Management: A Review. Front Microbiol 2022; 13:708335. [PMID: 35655999 PMCID: PMC9152772 DOI: 10.3389/fmicb.2022.708335] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/11/2021] [Accepted: 03/17/2022] [Indexed: 12/12/2022]
Abstract
Agricultural productivity is highly influenced by its associated microbial community. With advancements in omics technology, metagenomics is known to play a vital role in microbial world studies by unlocking the uncultured microbial populations present in the environment. Metagenomics is a diagnostic tool to target unique signature loci of plant and animal pathogens as well as beneficial microorganisms from samples. Here, we reviewed various aspects of metagenomics, from experimental methods to techniques used for sequencing, as well as diversified computational resources, including databases and software tools. The application of metagenomics in agriculture is covered in depth, including pathogen and plant disease identification, disease resistance breeding, plant pest control, weed management, abiotic stress management, post-harvest management, discoveries in agriculture, sources of novel molecules/compounds, biosurfactants and natural products, identification of biosynthetic molecules, use in genetically modified crops, and antibiotic-resistant genes. Metagenome-wide association studies in agriculture, covering crop productivity rates, intercropping analysis, and agronomic fields, are also analyzed. This article is the first comprehensive study of its kind from an agricultural perspective, focusing on a wider range of applications of metagenomics and its association studies.
Affiliation(s)
- Mir Asif Iquebal
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Jaisri Jagannadham
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sarika Jaiswal
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Ratna Prabha
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Anil Rai
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Dinesh Kumar
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- School of Interdisciplinary and Applied Sciences, Central University of Haryana, Mahendergarh, Haryana, India
28
Coll F, Gouliouris T, Bruchmann S, Phelan J, Raven KE, Clark TG, Parkhill J, Peacock SJ. PowerBacGWAS: a computational pipeline to perform power calculations for bacterial genome-wide association studies. Commun Biol 2022; 5:266. [PMID: 35338232 PMCID: PMC8956664 DOI: 10.1038/s42003-022-03194-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/22/2020] [Accepted: 02/25/2022] [Indexed: 12/14/2022]
Abstract
Genome-wide association studies (GWAS) are increasingly being applied to investigate the genetic basis of bacterial traits. However, approaches to perform power calculations for bacterial GWAS are limited. Here we implemented two alternative approaches to conduct power calculations using existing collections of bacterial genomes. First, a sub-sampling approach was undertaken to reduce the allele frequency and effect size of a known and detectable genotype-phenotype relationship by modifying phenotype labels. Second, a phenotype-simulation approach was conducted to simulate phenotypes from existing genetic variants. We implemented both approaches in a computational pipeline (PowerBacGWAS) that supports power calculations for burden testing, pan-genome and variant GWAS, and applied it to collections of Enterococcus faecium, Klebsiella pneumoniae and Mycobacterium tuberculosis. We used this pipeline to determine the sample sizes required to detect causal variants of different minor allele frequencies (MAF), effect sizes and phenotype heritability, and studied the effect of homoplasy and population diversity on the power to detect causal variants. Our pipeline and user documentation are made available and can be applied to other bacterial populations. PowerBacGWAS can be used to determine the sample sizes required to find statistically significant associations, or the associations detectable with a given sample size. We recommend performing power calculations using existing genomes of the bacterial species and population under study.
Affiliation(s)
- Francesc Coll
- Department of Infection Biology, Faculty of Infectious & Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
- Theodore Gouliouris
- Department of Medicine, University of Cambridge, Cambridge, UK
- Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Jody Phelan
- Department of Infection Biology, Faculty of Infectious & Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
- Kathy E Raven
- Department of Medicine, University of Cambridge, Cambridge, UK
- Taane G Clark
- Department of Infection Biology, Faculty of Infectious & Tropical Diseases, London School of Hygiene & Tropical Medicine, London, UK
- Faculty of Epidemiology and Population Health, Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, UK
- Julian Parkhill
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
29
Genome-Wide Association Study Reveals Host Factors Affecting Conjugation in Escherichia coli. Microorganisms 2022; 10:microorganisms10030608. [PMID: 35336183 PMCID: PMC8954029 DOI: 10.3390/microorganisms10030608] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/28/2022] [Revised: 03/08/2022] [Accepted: 03/10/2022] [Indexed: 02/04/2023]
Abstract
The emergence and dissemination of antibiotic resistance threaten the treatment of common bacterial infections. Resistance genes are often encoded on conjugative elements, which can be horizontally transferred to diverse bacteria. In order to delay conjugative transfer of resistance genes, more information is needed on the genetic determinants promoting conjugation. Here, we focus on which bacterial host factors in the donor assist transfer of conjugative plasmids. We introduced the broad-host-range plasmid pKJK10 into a diverse collection of 113 Escherichia coli strains and measured by flow cytometry how effectively each strain transfers its plasmid to a fixed E. coli recipient. Differences in conjugation efficiency of up to 2.7 and 3.8 orders of magnitude were observed after mating for 24 h and 48 h, respectively. These differences were linked to the underlying donor strain genetic variants in genome-wide association studies, thereby identifying candidate genes involved in conjugation. We confirmed the role of fliF, fliK, kefB and ucpA in the donor ability of conjugative elements by validating defects in the conjugation efficiency of the corresponding lab strain single-gene deletion mutants. Based on the known cellular functions of these genes, we suggest that the motility and the energy supply, the intracellular pH or salinity of the donor affect the efficiency of plasmid transfer. Overall, this work advances the search for targets for the development of conjugation inhibitors, which can be administered alongside antibiotics to more effectively treat bacterial infections.
30
Tang D, Li Y, Tan D, Fu J, Tang Y, Lin J, Zhao R, Du H, Zhao Z. KCOSS: an ultra-fast k-mer counter for assembled genome analysis. Bioinformatics 2022; 38:933-940. [PMID: 34849595 DOI: 10.1093/bioinformatics/btab797] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/13/2021] [Revised: 10/13/2021] [Accepted: 11/19/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION The k-mer frequency in whole genome sequences provides researchers with an insightful perspective on genomic complexity, comparative genomics, metagenomics and phylogeny. The current k-mer counting tools are typically slow, and they require large amounts of memory and hard disk space for assembled genome analysis. RESULTS We propose a novel and ultra-fast k-mer counting algorithm, KCOSS, to perform k-mer counting mainly for assembled genomes with a segmented Bloom filter, lock-free queue, lock-free thread pool and cuckoo hash table. We optimize running time and memory consumption by recycling memory blocks, merging multiple consecutive first-occurrence k-mers into a C-read, and writing a set of C-reads to disk asynchronously. KCOSS was comparatively tested with Jellyfish2, CHTKC and KMC3 on seven assembled genomes and three sequencing datasets in running time, memory consumption, and hard disk occupation. The experimental results show that KCOSS counts k-mers with less memory and disk while having a shorter running time on assembled genomes. KCOSS can be used to calculate the k-mer frequency not only for assembled genomes but also for sequencing data. AVAILABILITY AND IMPLEMENTATION The KCOSS software is implemented in C++. It is freely available on GitHub: https://github.com/kcoss-2021/KCOSS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
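For readers unfamiliar with the task itself, the sketch below shows what k-mer counting computes, using a plain in-memory hash table. It is a naive baseline and deliberately not the KCOSS algorithm: there is no Bloom filter, cuckoo hashing, threading or disk streaming, and the names `canonical` and `count_kmers` are assumptions. Counting canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement) is a common convention assumed here, not something the abstract specifies.

```python
from collections import Counter

# Translation table used to complement DNA bases
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_VALID_BASES = set("ACGT")


def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(seq, k):
    """Count canonical k-mers in a sequence, skipping ambiguous bases."""
    counts = Counter()
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Windows containing N or other non-ACGT symbols are ignored
        if set(kmer) <= _VALID_BASES:
            counts[canonical(kmer)] += 1
    return counts


# Hypothetical toy sequence
toy_counts = count_kmers("ACGTACGT", 3)
```

Memory use in this baseline grows with the number of distinct k-mers, which is the cost that dedicated counters are designed to reduce for large genomes.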
Affiliation(s)
- Deyou Tang
- School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China.,Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yucheng Li
- School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China
- Daqiang Tan
- School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China
- Juan Fu
- School of Medicine, South China University of Technology, Guangzhou, Guangdong 510006, China
- Yelei Tang
- School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China
- Jiabin Lin
- School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China
- Rong Zhao
- School of Software Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China
- Hongli Du
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, Guangdong 510006, China
- Zhongming Zhao
- Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.,Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.,MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA
31
Tonkin-Hill G, Ling C, Chaguza C, Salter SJ, Hinfonthong P, Nikolaou E, Tate N, Pastusiak A, Turner C, Chewapreecha C, Frost SDW, Corander J, Croucher NJ, Turner P, Bentley SD. Pneumococcal within-host diversity during colonization, transmission and treatment. Nat Microbiol 2022; 7:1791-1804. [PMID: 36216891 PMCID: PMC9613479 DOI: 10.1038/s41564-022-01238-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 02/14/2022] [Accepted: 07/18/2022] [Indexed: 11/13/2022]
Abstract
Characterizing the genetic diversity of pathogens within the host promises to greatly improve surveillance and reconstruction of transmission chains. For bacteria, it also informs our understanding of inter-strain competition and how this shapes the distribution of resistant and sensitive bacteria. Here we study the genetic diversity of Streptococcus pneumoniae within 468 infants and 145 of their mothers by deep sequencing whole pneumococcal populations from 3,761 longitudinal nasopharyngeal samples. We demonstrate that deep sequencing has unsurpassed sensitivity for detecting multiple colonization, doubling the rate at which highly invasive serotype 1 bacteria were detected in carriage compared with gold-standard methods. The greater resolution identified an elevated rate of transmission from mothers to their children in the first year of the child's life. Comprehensive treatment data demonstrated that infants were at an elevated risk of both the acquisition and persistent colonization of a multidrug-resistant bacterium following antimicrobial treatment. Some alleles were enriched after antimicrobial treatment, suggesting that they aided persistence, but generally purifying selection dominated within-host evolution. Rates of co-colonization imply that in the absence of treatment, susceptible lineages outcompeted resistant lineages within the host. These results demonstrate the many benefits of deep sequencing for the genomic surveillance of bacterial pathogens.
Affiliation(s)
- Gerry Tonkin-Hill
- Parasites and Microbes, Wellcome Sanger Institute, Cambridge, UK
- Department of Biostatistics, University of Oslo, Blindern, Norway
- Clare Ling
- Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Chrispin Chaguza
- Parasites and Microbes, Wellcome Sanger Institute, Cambridge, UK
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, Yale University, New Haven, CT, USA
- Susannah J. Salter
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Pattaraporn Hinfonthong
- Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand
- Elissavet Nikolaou
- Department of Clinical Sciences, Liverpool School of Tropical Medicine, Liverpool, UK
- Infection and Immunity, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia
- Natalie Tate
- Department of Clinical Sciences, Liverpool School of Tropical Medicine, Liverpool, UK
- Claudia Turner
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Cambodia-Oxford Medical Research Unit, Angkor Hospital for Children, Siem Reap, Cambodia
- Claire Chewapreecha
- Parasites and Microbes, Wellcome Sanger Institute, Cambridge, UK
- Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Bangkok, Thailand
- Simon D. W. Frost
- Microsoft Research, Redmond, WA, USA
- London School of Hygiene and Tropical Medicine, London, UK
- Jukka Corander
- Parasites and Microbes, Wellcome Sanger Institute, Cambridge, UK
- Department of Biostatistics, University of Oslo, Blindern, Norway
- Helsinki Institute for Information Technology HIIT, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Nicholas J. Croucher
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, UK
- Paul Turner
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Cambodia-Oxford Medical Research Unit, Angkor Hospital for Children, Siem Reap, Cambodia
- Stephen D. Bentley
- Parasites and Microbes, Wellcome Sanger Institute, Cambridge, UK
32
Zhou B, Ye Q, Chen M, Li F, Xiang X, Shang Y, Wang C, Zhang J, Xue L, Wang J, Wu S, Pang R, Ding Y, Wu Q. Novel species-specific targets for real-time PCR detection of four common pathogenic Staphylococcus spp. Food Control 2022. [DOI: 10.1016/j.foodcont.2021.108478] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 02/09/2023]
33
Wee BA, Alves J, Lindsay DSJ, Klatt AB, Sargison FA, Cameron RL, Pickering A, Gorzynski J, Corander J, Marttinen P, Opitz B, Smith AJ, Fitzgerald JR. Population analysis of Legionella pneumophila reveals a basis for resistance to complement-mediated killing. Nat Commun 2021; 12:7165. [PMID: 34887398 PMCID: PMC8660822 DOI: 10.1038/s41467-021-27478-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/25/2021] [Accepted: 11/19/2021] [Indexed: 11/09/2022] Open
Abstract
Legionella pneumophila is the most common cause of the severe respiratory infection known as Legionnaires' disease. However, the microorganism is typically a symbiont of free-living amoeba, and our understanding of the bacterial factors that determine human pathogenicity is limited. Here we carried out a population genomic study of 902 L. pneumophila isolates from human clinical and environmental samples to examine their genetic diversity, global distribution and the basis for human pathogenicity. We find that the capacity for human disease is representative of the breadth of species diversity although some clones are more commonly associated with clinical infections. We identified a single gene (lag-1) to be most strongly associated with clinical isolates. lag-1, which encodes an O-acetyltransferase for lipopolysaccharide modification, has been distributed horizontally across all major phylogenetic clades of L. pneumophila by frequent recent recombination events. The gene confers resistance to complement-mediated killing in human serum by inhibiting deposition of classical pathway molecules on the bacterial surface. Furthermore, acquisition of lag-1 inhibits complement-dependent phagocytosis by human neutrophils, and promoted survival in a mouse model of pulmonary legionellosis. Thus, our results reveal L. pneumophila genetic traits linked to disease and provide a molecular basis for resistance to complement-mediated killing.
Affiliation(s)
- Bryan A Wee
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Joana Alves
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Diane S J Lindsay
- Bacterial Respiratory Infections Service (Ex Mycobacteria), Scottish Microbiology Reference Laboratory, Glasgow, Scotland, UK
- Ann-Brit Klatt
- Department of Internal Medicine/Infectious Diseases and Pulmonary Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
- Fiona A Sargison
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Ross L Cameron
- NHS National Services Scotland, Health Protection Scotland, Glasgow, Scotland, UK
- Amy Pickering
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Jamie Gorzynski
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK
- Jukka Corander
- Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Department of Biostatistics, University of Oslo, Oslo, Norway
- Pekka Marttinen
- Helsinki Institute for Information Technology, Department of Computer Science, Aalto University, Aalto, Finland
- Bastian Opitz
- Department of Internal Medicine/Infectious Diseases and Pulmonary Medicine, Charité Universitätsmedizin Berlin, Berlin, Germany
- Andrew J Smith
- Bacterial Respiratory Infections Service (Ex Mycobacteria), Scottish Microbiology Reference Laboratory, Glasgow, Scotland, UK
- College of Medical, Veterinary & Life Sciences, Glasgow Dental Hospital & School, University of Glasgow, Glasgow, UK
- J Ross Fitzgerald
- The Roslin Institute, Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, Scotland, UK.
34
Mahoney DBJ, Falardeau J, Hingston P, Chmielowska C, Carroll LM, Wiedmann M, Jang SS, Wang S. Associations between Listeria monocytogenes genomic characteristics and adhesion to polystyrene at 8 °C. Food Microbiol 2021; 102:103915. [PMID: 34809941 DOI: 10.1016/j.fm.2021.103915] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/30/2021] [Revised: 08/12/2021] [Accepted: 09/22/2021] [Indexed: 11/04/2022]
Abstract
Listeria monocytogenes remains a threat to the food system and has led to numerous foodborne outbreaks worldwide. L. monocytogenes can establish itself in food production facilities by adhering to surfaces, resulting in increased resistance to environmental stressors. The aim of this study was to evaluate the adhesion ability of L. monocytogenes at 8 °C and to analyse associations between the observed phenotypes and genetic factors such as internalin A (inlA) genotypes, stress survival islet 1 (SSI-1) genotype, and clonal complex (CC). L. monocytogenes isolates (n = 184) were grown at 8 °C and 100% relative humidity for 15 days. The growth was measured by optical density at 600 nm every 24 h. Adherent cells were stained using crystal violet and quantified spectrophotometrically. Genotyping of inlA and SSI-1, multi-locus sequence typing, and a genome-wide association study (GWAS) were performed to elucidate the phenotype-genotype relationships in L. monocytogenes cold adhesion. Among all inlA genotypes, truncated inlA isolates had the highest mean adhered cells, ABS595nm = 0.30 ± 0.15 (Tukey HSD; P < 0.05), while three-codon deletion inlA isolates had the least mean adhered cells (Tukey HSD; P < 0.05). When SSI-1 was present, more cells adhered; fewer cells adhered when SSI-1 was absent (Welch's t-test; P < 0.05). Adhesion was associated with clonal complexes which have low clinical frequency, while reduced adhesion was associated with clonal complexes which have high frequency. The results of this study support that premature stop codons in the virulence gene inlA are associated with increased cold adhesion and that an invasion-enhancing deletion in inlA is associated with decreased cold adhesion. This study also provides evidence to suggest that there is an evolutionary trade-off between virulence and adhesion in L. monocytogenes. These results provide a greater understanding of L. monocytogenes adhesion which will aid in the development of strategies to reduce L. monocytogenes in the food system.
Affiliation(s)
- Justin Falardeau
- Department of Food, Nutrition, and Health, University of British Columbia, Vancouver, BC, Canada
- Patricia Hingston
- Department of Food, Nutrition, and Health, University of British Columbia, Vancouver, BC, Canada
- Cora Chmielowska
- Department of Bacterial Genetics, University of Warsaw, Warsaw, Poland
- Laura M Carroll
- Department of Food Science, Cornell University, Ithaca, NY, USA
- Martin Wiedmann
- Department of Food Science, Cornell University, Ithaca, NY, USA
- Sung Sik Jang
- British Columbia Centre for Disease Control, Vancouver, BC, Canada
- Siyun Wang
- Department of Food, Nutrition, and Health, University of British Columbia, Vancouver, BC, Canada.
35
Colquhoun RM, Hall MB, Lima L, Roberts LW, Malone KM, Hunt M, Letcher B, Hawkey J, George S, Pankhurst L, Iqbal Z. Pandora: nucleotide-resolution bacterial pan-genomics with reference graphs. Genome Biol 2021; 22:267. [PMID: 34521456 PMCID: PMC8442373 DOI: 10.1186/s13059-021-02473-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 11/26/2020] [Accepted: 08/19/2021] [Indexed: 12/21/2022] Open
Abstract
We present pandora, a novel pan-genome graph structure and algorithms for identifying variants across the full bacterial pan-genome. As much bacterial adaptability hinges on the accessory genome, methods which analyze SNPs in just the core genome have unsatisfactory limitations. Pandora approximates a sequenced genome as a recombinant of references, detects novel variation and pan-genotypes multiple samples. Using a reference graph of 578 Escherichia coli genomes, we compare 20 diverse isolates. Pandora recovers more rare SNPs than single-reference-based tools, is significantly better than picking the closest RefSeq reference, and provides a stable framework for analyzing diverse samples without reference bias.
Affiliation(s)
- Rachel M Colquhoun
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford, UK
- Institute of Evolutionary Biology, Ashworth Laboratories, University of Edinburgh, Edinburgh, UK
- Michael B Hall
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Leandro Lima
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Leah W Roberts
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Kerri M Malone
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Martin Hunt
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Brice Letcher
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
- Jane Hawkey
- Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Victoria, 3004, Australia
- Sophie George
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Louise Pankhurst
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Department of Zoology, University of Oxford, Mansfield Road, Oxford, UK
- Zamin Iqbal
- European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK.
36
VanWallendael A, Alvarez M. Alignment-free methods for polyploid genomes: Quick and reliable genetic distance estimation. Mol Ecol Resour 2021; 22:612-622. [PMID: 34478242 DOI: 10.1111/1755-0998.13499] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/04/2020] [Accepted: 08/20/2021] [Indexed: 01/10/2023]
Abstract
Polyploid genomes pose several inherent challenges to population genetic analyses. While alignment-based methods are fundamentally limited in their applicability to polyploids, alignment-free methods bypass most of these limits. We investigated the use of Mash, a k-mer analysis tool that uses the MinHash method to reduce complexity in large genomic data sets, for basic population genetic analyses of polyploid sequences. We measured the degree to which Mash correctly estimated pairwise genetic distance in simulated haploid and polyploid short-read sequences with various levels of missing data. Mash-based estimates of genetic distance were comparable to alignment-based estimates, and were less impacted by missing data. We also used Mash to analyse publicly available short-read data for three polyploid and one diploid species, then compared Mash results to published results. For both simulated and real data, Mash accurately estimated pairwise genetic differences for polyploids as well as diploids as much as 476 times faster than alignment-based methods, though we found that Mash genetic distance estimates could be biased by per-sample read depth. Mash may be a particularly useful addition to the toolkit of polyploid geneticists for rapid confirmation of alignment-based results and for basic population genetics in reference-free systems or those with only poor-quality sequence data available.
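The MinHash idea behind this kind of alignment-free distance can be sketched in a few lines of Python. The example below is an illustration under stated assumptions, not Mash itself: it hashes k-mers with BLAKE2b as a stand-in hash, ignores reverse complements, and uses the hypothetical helper names `sketch`, `jaccard_estimate`, and `mash_distance`. The distance formula D = -(1/k) ln(2j / (1 + j)) is the one published for Mash, where j is the estimated Jaccard index.

```python
import hashlib
import math


def sketch(seq: str, k: int, s: int) -> list[int]:
    """Bottom-s MinHash sketch: the s smallest distinct k-mer hash values."""
    hashes = {
        int.from_bytes(
            hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest(), "big"
        )
        for i in range(len(seq) - k + 1)
    }
    return sorted(hashes)[:s]


def jaccard_estimate(a: list[int], b: list[int], s: int) -> float:
    """Estimate the Jaccard index from two bottom-s sketches."""
    merged = sorted(set(a) | set(b))[:s]  # bottom-s sketch of the union
    if not merged:
        return 0.0
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


def mash_distance(j: float, k: int) -> float:
    """Convert a Jaccard estimate into a Mash-style distance."""
    if j == 0.0:
        return 1.0  # no shared hashes: report the maximum distance
    return -math.log(2.0 * j / (1.0 + j)) / k


if __name__ == "__main__":
    # Toy sequences for illustration only; real inputs would be reads or assemblies.
    k, s = 5, 50
    a = sketch("ACGTTGCAAGGCTTACGATCGATCGGATCCA", k, s)
    b = sketch("ACGTTGCAAGGCTTACGTTCGATCGGATCCA", k, s)
    print(mash_distance(jaccard_estimate(a, b, s), k))
```

Because only the s smallest hashes are kept per sample, the comparison cost depends on the sketch size rather than on genome size, which is what makes this approach fast on large data sets.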
Affiliation(s)
- Acer VanWallendael
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Mariano Alvarez
- Biology Department, Wesleyan University, Middletown, CT, USA
37
Bellabarba A, Bacci G, Decorosi F, Aun E, Azzarello E, Remm M, Giovannetti L, Viti C, Mengoni A, Pini F. Competitiveness for Nodule Colonization in Sinorhizobium meliloti: Combined In Vitro-Tagged Strain Competition and Genome-Wide Association Analysis. mSystems 2021. [PMID: 34313466 DOI: 10.1101/2020.09.15.298034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/07/2023] Open
Abstract
Associations between leguminous plants and symbiotic nitrogen-fixing rhizobia are a classic example of mutualism between a eukaryotic host and a specific group of prokaryotic microbes. Although this symbiosis is in part species specific, different rhizobial strains may colonize the same nodule. Some rhizobial strains are commonly known as better competitors than others, but detailed analyses that aim to predict rhizobial competitive abilities based on genomes are still scarce. Here, we performed a bacterial genome-wide association (GWAS) analysis to define the genomic determinants related to the competitive capabilities in the model rhizobial species Sinorhizobium meliloti. For this, 13 tester strains were green fluorescent protein (GFP) tagged and assayed versus 3 red fluorescent protein (RFP)-tagged reference competitor strains (Rm1021, AK83, and BL225C) in a Medicago sativa nodule occupancy test. Competition data and strain genomic sequences were employed to build a model for GWAS based on k-mers. Among the k-mers with the highest scores, 51 k-mers mapped on the genomes of four strains showing the highest competition phenotypes (>60% single strain nodule occupancy; GR4, KH35c, KH46, and SM11) versus BL225C. These k-mers were mainly located on the symbiosis-related megaplasmid pSymA, specifically on genes coding for transporters, proteins involved in the biosynthesis of cofactors, and proteins related to metabolism (e.g., fatty acids). The same analysis was performed considering the sum of single and mixed nodules obtained in the competition assays versus BL225C, retrieving k-mers mapped on the genes previously found and on vir genes. Therefore, the competition abilities seem to be linked to multiple genetic determinants and comprise several cellular components. IMPORTANCE Decoding the competitive pattern that occurs in the rhizosphere is challenging in the study of bacterial social interaction strategies. 
To date, the single-gene approach has mainly been used to uncover the bases of nodulation, but there is still a knowledge gap regarding the main features that a priori characterize rhizobial strains able to outcompete indigenous rhizobia. Therefore, tracking down which traits make different rhizobial strains able to win the competition for plant infection over other indigenous rhizobia will improve the strain selection process and, consequently, plant yield in sustainable agricultural production systems. We proved that a k-mer-based GWAS approach can efficiently identify the competition determinants of a panel of strains previously analyzed for their plant tissue occupancy using double fluorescent labeling. The reported strategy will be useful for detailed studies on the genomic aspects of the evolution of bacterial symbiosis and for an extensive evaluation of rhizobial inoculants.
Affiliation(s)
- Agnese Bellabarba
- Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Giovanni Bacci
- Department of Biology, University of Florence, Sesto Fiorentino, Italy
- Francesca Decorosi
- Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Erki Aun
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Elisa Azzarello
- Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Maido Remm
- Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Luciana Giovannetti
- Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Carlo Viti
- Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Alessio Mengoni
- Department of Biology, University of Florence, Sesto Fiorentino, Italy
- Francesco Pini
- Department of Biology, University of Bari Aldo Moro, Bari, Italy
38
Competitiveness for Nodule Colonization in Sinorhizobium meliloti: Combined In Vitro-Tagged Strain Competition and Genome-Wide Association Analysis. mSystems 2021; 6:e0055021. [PMID: 34313466 PMCID: PMC8407117 DOI: 10.1128/msystems.00550-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023] Open
39
Sánchez-Reyes A, Bretón-Deval L, Mangelson H, Salinas-Peralta I, Sanchez-Flores A. Hi-C deconvolution of a textile dye-related microbiome reveals novel taxonomic landscapes and links phenotypic potential to individual genomes. Int Microbiol 2021; 25:99-110. [PMID: 34269948 DOI: 10.1007/s10123-021-00189-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/05/2021] [Revised: 05/20/2021] [Accepted: 06/27/2021] [Indexed: 12/12/2022]
Abstract
Microbial biodiversity is represented by a variety of genomic landscapes adapted to dissimilar environments on Earth. These genomic landscapes contain functional signatures connected with the community phenotypes. Here, we assess the genomic microbial diversity landscape at a high-resolution level of a polluted river-associated microbiome (Morelos, México), cultured in a medium enriched with anthraquinone Deep Blue 35 dye. We explore the resultant textile dye microbiome to infer links between predicted biodegradative functions, and metagenomic and metabolic potential, especially using the information obtained from individual reconstructed genomes. By using Hi-C proximity-ligation deconvolution method, we deconvoluted 97 genome composites (80% potentially novel species). The main taxonomic determinants were Methanobacterium, Clostridium, and Cupriavidus genera constituting 50, 22, and 11% of the total community profile. Also, we observed a rare biosphere of novel taxa without clear taxonomic standing. Removal of 50% chemical oxygen demand with 23% decolorization was observed after 30 days of dye enrichment. Genes related to catalase-peroxidase, polyphenol oxidase, and laccase enzymes were predicted as associated with textile dye biodegradation phenotype under our study conditions, highlighting the potential of metagenome-wide analysis to predict biodegradative determinants. This study prompts high-resolution screening of individual genomes within textile dye river sediment microbiomes or complex communities under environmental pressures.
Affiliation(s)
- Ayixon Sánchez-Reyes
- Cátedras Conacyt-Instituto de Biotecnología, Universidad Nacional Autónoma de México, Avenida Universidad 2001, Chamilpa, 62210, Cuernavaca, Morelos, México.
- Luz Bretón-Deval
- Cátedras Conacyt-Instituto de Biotecnología, Universidad Nacional Autónoma de México, Avenida Universidad 2001, Chamilpa, 62210, Cuernavaca, Morelos, México
- Alejandro Sanchez-Flores
- Unidad Universitaria de Secuenciación Masiva y Bioinformática, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, Mexico
40
Absence of Host-Specific Genes in Canine and Human Staphylococcus pseudintermedius as Inferred from Comparative Genomics. Antibiotics (Basel) 2021; 10:854. [PMID: 34356775 PMCID: PMC8300826 DOI: 10.3390/antibiotics10070854] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Revised: 06/24/2021] [Accepted: 06/30/2021] [Indexed: 11/16/2022] Open
Abstract
Staphylococcus pseudintermedius is an important pathogen in dogs that occasionally causes infections in humans as an opportunistic pathogen of elderly and immunocompromised people. This study compared the genomic relatedness and antimicrobial resistance genes of canine and human S. pseudintermedius isolates and used a genome-wide association study (GWAS) to examine host association. Canine (n = 25) and human (n = 32) methicillin-susceptible S. pseudintermedius (MSSP) isolates showed a high level of genetic diversity with an overrepresentation of clonal complex CC241 in human isolates. This clonal complex was associated with carriage of a plasmid containing a bacteriocin with cytotoxic properties, a CRISPR-cas domain and a pRE25-like mobile element containing five antimicrobial resistance genes. Multi-drug resistance (MDR) was predicted in 13 (41%) of human isolates and 14 (56%) of canine isolates. CC241 represented 54% of predicted MDR isolates from humans and 21% of predicted MDR canine isolates. While it had previously been suggested that certain host-specific genes were present, the current GWAS analysis did not identify any genes that were significantly associated with human or canine isolates. In conclusion, this is the first genomic study showing that MSSP is genetically diverse in both hosts and that multidrug resistance is important in dog and human-associated S. pseudintermedius isolates.
41
Allen JP, Snitkin E, Pincus NB, Hauser AR. Forest and Trees: Exploring Bacterial Virulence with Genome-wide Association Studies and Machine Learning. Trends Microbiol 2021; 29:621-633. [PMID: 33455849 PMCID: PMC8187264 DOI: 10.1016/j.tim.2020.12.002] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 08/28/2020] [Revised: 12/07/2020] [Accepted: 12/08/2020] [Indexed: 12/15/2022]
Abstract
The advent of inexpensive and rapid sequencing technologies has allowed bacterial whole-genome sequences to be generated at an unprecedented pace. This wealth of information has revealed an unanticipated degree of strain-to-strain genetic diversity within many bacterial species. Awareness of this genetic heterogeneity has corresponded with a greater appreciation of intraspecies variation in virulence. A number of comparative genomic strategies have been developed to link these genotypic and pathogenic differences with the aim of discovering novel virulence factors. Here, we review recent advances in comparative genomic approaches to identify bacterial virulence determinants, with a focus on genome-wide association studies and machine learning.
Affiliation(s)
- Jonathan P Allen
- Department of Microbiology and Immunology, Loyola University Chicago Stritch School of Medicine, Maywood, IL 60153, USA
- Evan Snitkin
- Department of Microbiology and Immunology, Department of Internal Medicine/Division of Infectious Diseases, University of Michigan, Ann Arbor, MI 48109, USA
- Nathan B Pincus
- Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Alan R Hauser
- Department of Microbiology-Immunology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA; Department of Medicine/Division of Infectious Diseases, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
42
Hwang W, Yong JH, Min KB, Lee KM, Pascoe B, Sheppard SK, Yoon SS. Genome-wide association study of signature genetic alterations among Pseudomonas aeruginosa cystic fibrosis isolates. PLoS Pathog 2021; 17:e1009681. [PMID: 34161396 PMCID: PMC8274868 DOI: 10.1371/journal.ppat.1009681] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/06/2021] [Revised: 07/12/2021] [Accepted: 05/31/2021] [Indexed: 12/18/2022]
Abstract
Pseudomonas aeruginosa (PA) is an opportunistic pathogen that causes diverse human infections including chronic airway infection in patients with cystic fibrosis (CF). Comparing the genomes of CF and non-CF PA isolates has great potential to identify the genetic basis of pathogenicity. To gain a deeper understanding of PA adaptation in CF airways, we performed a genome-wide association study (GWAS) on 1,001 PA genomes. Genetic variations identified among CF isolates were categorized into (i) alterations in protein-coding regions, either large- or small-scale, and (ii) polymorphic variation in intergenic regions. We introduced each CF-associated genetic alteration into the genome of PAO1, a prototype PA strain, and validated the outcomes experimentally. Loci readily mutated among CF isolates included genes encoding a probable sulfatase, a probable TonB-dependent receptor (PA2332~PA2336), L-cystine transporter (YecS, PA0313), and a probable transcriptional regulator (PA5438). A promoter region of a heme/hemoglobin uptake outer membrane receptor (PhuR, PA4710) was also different between the CF and non-CF isolate groups. Our analysis highlights ways in which the PA genome evolves to survive and persist within the context of chronic CF infection.
Affiliation(s)
- Wontae Hwang
- Department of Microbiology and Immunology, Seoul, Republic of Korea
- Brain Korea 21 PLUS Project for Medical Sciences, Seoul, Republic of Korea
- Ji Hyun Yong
- Department of Microbiology and Immunology, Seoul, Republic of Korea
- Brain Korea 21 PLUS Project for Medical Sciences, Seoul, Republic of Korea
- Kyung Bae Min
- Department of Microbiology and Immunology, Seoul, Republic of Korea
- Brain Korea 21 PLUS Project for Medical Sciences, Seoul, Republic of Korea
- Kang-Mu Lee
- Department of Microbiology and Immunology, Seoul, Republic of Korea
- Brain Korea 21 PLUS Project for Medical Sciences, Seoul, Republic of Korea
- Ben Pascoe
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, United Kingdom
- Samuel K Sheppard
- The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath, United Kingdom
- Sang Sun Yoon
- Department of Microbiology and Immunology, Seoul, Republic of Korea
- Brain Korea 21 PLUS Project for Medical Sciences, Seoul, Republic of Korea
- Institute for Immunology and Immunological Diseases, Seoul, Republic of Korea
- Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
43
Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. J Clin Microbiol 2021; 59:e0126020. [PMID: 33536291 DOI: 10.1128/jcm.01260-20] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Indexed: 01/05/2023]
Abstract
Antimicrobial resistance (AMR) remains one of the most challenging phenomena of modern medicine. Machine learning (ML) is a subfield of artificial intelligence that focuses on the development of algorithms that learn how to accurately predict outcome variables using large sets of predictor variables that are typically not hand selected and are minimally curated. Models are parameterized using a training data set and then applied to a test data set on which predictive performance is evaluated. The application of ML algorithms to the problem of AMR has garnered increasing interest in the past 5 years due to the exponential growth of experimental and clinical data, heavy investment in computational capacity, improvements in algorithm performance, and increasing urgency for innovative approaches to reducing the burden of disease. Here, we review the current state of research at the intersection of ML and AMR with an emphasis on three domains of work. The first is the prediction of AMR using genomic data. The second is the use of ML to gain insight into the cellular functions disrupted by antibiotics, which forms the basis for understanding mechanisms of action and developing novel anti-infectives. The third focuses on the application of ML for antimicrobial stewardship using data extracted from the electronic health record. Although the use of ML for understanding, diagnosing, treating, and preventing AMR is still in its infancy, the continued growth of data and interest ensures it will become an important tool for future translational research programs.
44
Malekian N, Al-Fatlawi A, Berendonk TU, Schroeder M. Mutations in bdcA and valS Correlate with Quinolone Resistance in Wastewater Escherichia coli. Int J Mol Sci 2021; 22:ijms22116063. [PMID: 34199768 PMCID: PMC8200043 DOI: 10.3390/ijms22116063] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/06/2021] [Revised: 05/27/2021] [Accepted: 05/29/2021] [Indexed: 11/16/2022]
Abstract
Single mutations can confer resistance to antibiotics. Identifying such mutations can help to develop and improve drugs. Here, we systematically screen for candidate quinolone resistance-conferring mutations. We sequenced highly diverse wastewater E. coli and performed a genome-wide association study (GWAS) to determine associations between over 200,000 mutations and quinolone resistance phenotypes. We uncovered 13 statistically significant mutations including 1 located at the active site of the biofilm dispersal gene bdcA and 6 silent mutations in the aminoacyl-tRNA synthetase valS. The study also recovered the known mutations in the topoisomerases gyrase (gyrA) and topoisomerase IV (parC). In summary, we demonstrate that GWAS effectively and comprehensively identifies resistance mutations without a priori knowledge of targets and mode of action. The results suggest that mutations in the bdcA and valS genes, which are involved in biofilm dispersal and translation, may lead to novel resistance mechanisms.
Affiliation(s)
- Negin Malekian
- Biotechnology Center (BIOTEC), Dresden University of Technology, Tatzberg 47-49, 01307 Dresden, Germany
- Ali Al-Fatlawi
- Biotechnology Center (BIOTEC), Dresden University of Technology, Tatzberg 47-49, 01307 Dresden, Germany
- Thomas U. Berendonk
- Institute of Hydrobiology, Dresden University of Technology, 01217 Dresden, Germany
- Michael Schroeder
- Biotechnology Center (BIOTEC), Dresden University of Technology, Tatzberg 47-49, 01307 Dresden, Germany
45
Liu X, Ma Y, Wang J. Genetic variation and function: revealing potential factors associated with microbial phenotypes. Biophysics Reports 2021; 7:111-126. [PMID: 37288143 PMCID: PMC10235906 DOI: 10.52601/bpr.2021.200040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2020] [Accepted: 03/09/2021] [Indexed: 06/09/2023]
Abstract
Innovations in sequencing technology have generated voluminous microbial and host genomic data, making it possible to detect these genetic variations and analyze the function influenced by them. Recently, many studies have linked such genetic variations to phenotypes through association or comparative analysis, which have further advanced our understanding of multiple microbial functions. In this review, we summarized the application of association analysis in microbes like Mycobacterium tuberculosis, focusing on screening of microbial genetic variants potentially associated with phenotypes such as drug resistance, pathogenesis and novel drug targets etc.; reviewed the application of additional comparative genomic or transcriptomic methods to identify genetic factors associated with functions in microbes; expanded the scope of our study to focus on host genetic factors associated with certain microbes or microbiome and summarized the recent host genetic variations associated with microbial phenotypes, including susceptibility and load after infection of HIV, presence/absence of different taxa, and quantitative traits of microbiome, and lastly, discussed the challenges that may be encountered and the apparent or potential viable solutions. Gene-function analysis of microbe and microbiome is still in its infancy, and in order to unleash its full potential, it is necessary to understand its history, current status, and the challenges hindering its development.
Affiliation(s)
- Xiaolin Liu
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yue Ma
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Jun Wang
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
46
Kobras CM, Fenton AK, Sheppard SK. Next-generation microbiology: from comparative genomics to gene function. Genome Biol 2021; 22:123. [PMID: 33926534 PMCID: PMC8082670 DOI: 10.1186/s13059-021-02344-9] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 01/05/2021] [Accepted: 04/08/2021] [Indexed: 11/12/2022]
Abstract
Microbiology is at a turning point in its 120-year history. Widespread next-generation sequencing has revealed genetic complexity among bacteria that could hardly have been imagined by pioneers such as Pasteur, Escherich and Koch. This data cascade brings enormous potential to improve our understanding of individual bacterial cells and the genetic basis of phenotype variation. However, this revolution in data science cannot replace established microbiology practices, presenting the challenge of how to integrate these new techniques. Contrasting comparative and functional genomic approaches, we evoke molecular microbiology theory and established practice to present a conceptual framework and practical roadmap for next-generation microbiology.
Affiliation(s)
- Carolin M Kobras
- Department of Molecular Biology & Biotechnology, University of Sheffield, The Florey Institute for Host-Pathogen Interactions, Sheffield, UK
- Andrew K Fenton
- Department of Molecular Biology & Biotechnology, University of Sheffield, The Florey Institute for Host-Pathogen Interactions, Sheffield, UK
- Samuel K Sheppard
- Department of Biology & Biochemistry, University of Bath, Milner Centre for Evolution, Bath, UK
47
Mai TT, Turner P, Corander J. Boosting heritability: estimating the genetic component of phenotypic variation with multiple sample splitting. BMC Bioinformatics 2021; 22:164. [PMID: 33773584 PMCID: PMC8004405 DOI: 10.1186/s12859-021-04079-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/03/2021] [Accepted: 03/15/2021] [Indexed: 11/29/2022]
Abstract
Background: Heritability is a central measure in genetics quantifying how much of the variability observed in a trait is attributable to genetic differences. Existing methods for estimating heritability are most often based on random-effect models, typically for computational reasons. The alternative of using a fixed-effect model has received much more limited attention in the literature. Results: In this paper, we propose a generic strategy for heritability inference, termed "boosting heritability", which combines the advantageous features of different recent methods to produce an estimate of the heritability with a high-dimensional linear model. In particular, boosting heritability uses a multiple sample splitting strategy, which in general leads to a stable and accurate estimate. We use both simulated data and real antibiotic resistance data from a major human pathogen, Streptococcus pneumoniae, to demonstrate the attractive features of our inference strategy. Conclusions: Boosting is shown to offer a reliable and practically useful tool for inference about heritability.
Affiliation(s)
- The Tien Mai
- Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
- Paul Turner
- Cambodia-Oxford Medical Research Unit, Angkor Hospital for Children, Siem Reap, Cambodia; Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Jukka Corander
- Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway; Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
48
Liang Y, Li B, Zhang Q, Zhang S, He X, Jiang L, Jin Y. Interaction analyses based on growth parameters of GWAS between Escherichia coli and Staphylococcus aureus. AMB Express 2021; 11:34. [PMID: 33646434 PMCID: PMC7921238 DOI: 10.1186/s13568-021-01192-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/28/2020] [Accepted: 02/09/2021] [Indexed: 01/02/2023]
Abstract
To accurately explore the interaction mechanism between Escherichia coli and Staphylococcus aureus, we designed an ecological experiment to monoculture and co-culture E. coli and S. aureus. We co-cultured 45 strains of E. coli and S. aureus, and also cultured each species individually, measuring growth over 36 h. We implemented a genome-wide association study (GWAS) based on growth parameters (λ, R, A and s) to identify significant single nucleotide polymorphisms (SNPs) of the bacteria. Three commonly used growth regression equations, Logistic, Gompertz, and Richards, were used to fit the bacterial growth data of each strain. Akaike's information criterion (AIC) was then calculated for each equation to compare the fits. We used the optimal growth equation to estimate the four parameters above for strains in co-culture. By plotting the estimates for each parameter across two strains, we can visualize how growth parameters respond ecologically to environmental stimuli. We verified that different genotypes of bacteria had different growth trajectories, although they were the same species. We reported 85 and 52 significant SNPs that were associated with interaction in E. coli and S. aureus, respectively. Many significant genes might play key roles in interaction, such as yjjW, dnaK, aceE, tatD, ftsA, rclR, ftsK, fepA in E. coli, and scdA, trpD, sdrD, SAOUHSC_01219 in S. aureus. Our study illustrated that there were multiple genes working together to affect bacterial interaction, and laid a solid foundation for the later study of more complex inter-bacterial interaction mechanisms.
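The model-selection step in this abstract (fit several growth equations, keep the one with the lowest AIC) can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes least-squares fits with Gaussian errors, in which case AIC reduces to n·ln(RSS/n) + 2k up to an additive constant. The residual sums of squares, the parameter counts, and the number of time points are hypothetical placeholder values.

```python
import math


def aic_least_squares(rss: float, n_obs: int, n_params: int) -> float:
    """AIC for a least-squares fit with Gaussian errors (additive constant dropped)."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def best_model(fits: dict, n_obs: int) -> str:
    """Return the name of the model with the lowest AIC.

    `fits` maps a model name to (residual sum of squares, number of parameters).
    """
    return min(
        fits,
        key=lambda name: aic_least_squares(fits[name][0], n_obs, fits[name][1]),
    )


# Hypothetical fit summaries for one strain's growth curve (illustrative values only).
example_fits = {
    "logistic": (0.90, 3),
    "gompertz": (0.50, 3),
    "richards": (0.49, 4),  # assumed: one extra shape parameter
}
N_TIMEPOINTS = 36  # assumed: one measurement per hour over 36 h

if __name__ == "__main__":
    for name, (rss, k) in example_fits.items():
        print(name, round(aic_least_squares(rss, N_TIMEPOINTS, k), 2))
    print("selected:", best_model(example_fits, N_TIMEPOINTS))
```

With these placeholder numbers, the extra Richards parameter does not reduce the residual error enough to offset its penalty, so the Gompertz fit is selected.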
49
Weber RE, Fuchs S, Layer F, Sommer A, Bender JK, Thürmer A, Werner G, Strommenger B. Genome-Wide Association Studies for the Detection of Genetic Variants Associated With Daptomycin and Ceftaroline Resistance in Staphylococcus aureus. Front Microbiol 2021; 12:639660. [PMID: 33658988 PMCID: PMC7917082 DOI: 10.3389/fmicb.2021.639660] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/11/2020] [Accepted: 01/22/2021] [Indexed: 12/29/2022]
Abstract
Background: As next-generation sequencing (NGS) technologies have experienced a rapid development over the last decade, the investigation of the bacterial genetic architecture reveals a high potential to dissect causal loci of antibiotic resistance phenotypes. Although genome-wide association studies (GWAS) have been successfully applied for investigating the basis of resistance traits, complex resistance phenotypes have been omitted so far. For S. aureus this especially refers to antibiotics of last resort like daptomycin (DAP) and ceftaroline (CPT). Therefore, we aimed to perform GWAS for the identification of genetic variants associated with DAP and CPT resistance in clinical S. aureus isolates. Materials/methods: To conduct microbial GWAS, we selected cases and controls according to their clonal background, date of isolation, and geographical origin. Association testing was performed with PLINK and SEER analysis. By using in silico analysis, we also searched for rare genetic variants in candidate loci that have previously been described to be involved in the development of corresponding resistance phenotypes. Results: GWAS revealed MprF P314L and L826F to be significantly associated with DAP resistance. These mutations were found to be homogeneously distributed among clonal lineages, suggesting convergent evolution. Additionally, rare and as yet undescribed single nucleotide polymorphisms could be identified within mprF and putative candidate genes. Finally, we could show that each DAP-resistant isolate exhibited at least one amino acid substitution within the open reading frame of mprF. Due to the presence of strong population stratification, no genetic variants could be associated with CPT resistance. However, the investigation of the staphylococcal cassette chromosome mec (SCCmec) revealed various mecA SNPs to be putatively linked with CPT resistance. Additionally, some CPT-resistant isolates revealed no mecA mutations, supporting the hypothesis that further and still unknown resistance determinants are crucial for the development of CPT resistance in S. aureus. Conclusion: We hereby confirmed the potential of GWAS to identify genetic variants that are associated with antibiotic resistance traits in S. aureus. However, precautions need to be taken to prevent the detection of spurious associations. In addition, the implementation of different approaches is still essential to detect multiple forms of variations and mutations that occur with a low frequency.
Affiliation(s)
- Robert E Weber
- Department of Infectious Diseases, Robert Koch-Institute, Wernigerode, Germany; Methodology and Research Infrastructure, Genome Sequencing, Robert Koch-Institute, Berlin, Germany
- Stephan Fuchs
- Methodology and Research Infrastructure, Bioinformatics, Robert Koch-Institute, Berlin, Germany
- Franziska Layer
- Department of Infectious Diseases, Robert Koch-Institute, Wernigerode, Germany; Methodology and Research Infrastructure, Genome Sequencing, Robert Koch-Institute, Berlin, Germany
- Anna Sommer
- Department of Infectious Diseases, Robert Koch-Institute, Wernigerode, Germany; Methodology and Research Infrastructure, Genome Sequencing, Robert Koch-Institute, Berlin, Germany
- Jennifer K Bender
- Department of Infectious Diseases, Robert Koch-Institute, Wernigerode, Germany; Methodology and Research Infrastructure, Genome Sequencing, Robert Koch-Institute, Berlin, Germany
- Andrea Thürmer
- Methodology and Research Infrastructure, Bioinformatics, Robert Koch-Institute, Berlin, Germany
- Guido Werner
- Department of Infectious Diseases, Robert Koch-Institute, Wernigerode, Germany; Methodology and Research Infrastructure, Genome Sequencing, Robert Koch-Institute, Berlin, Germany
- Birgit Strommenger
- Department of Infectious Diseases, Robert Koch-Institute, Wernigerode, Germany; Methodology and Research Infrastructure, Genome Sequencing, Robert Koch-Institute, Berlin, Germany
50
Moller AG, Winston K, Ji S, Wang J, Hargita Davis MN, Solís-Lemus CR, Read TD. Genes Influencing Phage Host Range in Staphylococcus aureus on a Species-Wide Scale. mSphere 2021; 6:e01263-20. [PMID: 33441407 PMCID: PMC7845607 DOI: 10.1128/msphere.01263-20] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/10/2020] [Accepted: 12/16/2020] [Indexed: 12/20/2022]
Abstract
Staphylococcus aureus is a human pathogen that causes serious diseases, ranging from skin infections to septic shock. Bacteriophages (phages) are both natural killers of S. aureus, offering therapeutic possibilities, and important vectors of horizontal gene transfer (HGT) in the species. Here, we used high-throughput approaches to understand the genetic basis of strain-to-strain variation in sensitivity to phages, which defines the host range. We screened 259 diverse S. aureus strains covering more than 40 sequence types for sensitivity to eight phages, which were representatives of the three phage classes that infect the species. The phages were variable in host range, each infecting between 73 and 257 strains. Using genome-wide association approaches, we identified putative loci that affect host range and validated their function using USA300 transposon knockouts. In addition to rediscovering known host range determinants, we found several previously unreported genes affecting bacterial growth during phage infection, including trpA, phoR, isdB, sodM, fmtC, and relA. We used the data from our host range matrix to develop predictive models that achieved between 40% and 95% accuracy. This work illustrates the complexity of the genetic basis for phage susceptibility in S. aureus but also shows that with more data, we may be able to understand much of the variation. With a knowledge of host range determination, we can rationally design phage therapy cocktails that target the broadest host range of S. aureus strains and address basic questions regarding phage-host interactions, such as the impact of phage on S. aureus evolution. IMPORTANCE: Staphylococcus aureus is a widespread, hospital- and community-acquired pathogen, many strains of which are antibiotic resistant. It causes diverse diseases, ranging from local to systemic infection, and affects both the skin and many internal organs, including the heart, lungs, bones, and brain. Its ubiquity, antibiotic resistance, and disease burden make new therapies urgent. One alternative therapy to antibiotics is phage therapy, in which viruses specific to infecting bacteria clear infection. In this work, we identified and validated S. aureus genes that influence phage host range (the number of strains a phage can infect and kill) by testing strains representative of the diversity of the S. aureus species for phage host range and associating the genome sequences of strains with host range. These findings together improved our understanding of how phage therapy works in the bacterium and improve prediction of phage therapy efficacy based on the predicted host range of the infecting strain.
Affiliation(s)
- Abraham G Moller
- Microbiology and Molecular Genetics (MMG) Program, Graduate Division of Biological and Biomedical Sciences (GDBBS), Emory University, Atlanta, Georgia, USA
- Division of Infectious Diseases, Department of Medicine, Emory University, Atlanta, Georgia, USA
- Kyle Winston
- Department of Epidemiology, Rollins School of Public Health (RSPH), Emory University, Atlanta, Georgia, USA
- Shiyu Ji
- Eugene Gangarosa Laboratory Research Fellowship, Emory College Online & Summer Programs, Emory College of Arts and Sciences, Atlanta, Georgia, USA
- Junting Wang
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Michelle N Hargita Davis
- Division of Infectious Diseases, Department of Medicine, Emory University, Atlanta, Georgia, USA
- Claudia R Solís-Lemus
- Wisconsin Institute for Discovery, Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Timothy D Read
- Division of Infectious Diseases, Department of Medicine, Emory University, Atlanta, Georgia, USA