1
Jin C, Jia C, Hu W, Xu H, Shen Y, Yue M. Predicting antimicrobial resistance in E. coli with discriminative position fused deep learning classifier. Comput Struct Biotechnol J 2024; 23:559-565. [PMID: 38274998 PMCID: PMC10809114 DOI: 10.1016/j.csbj.2023.12.041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/26/2023] [Accepted: 12/26/2023] [Indexed: 01/27/2024] Open
Abstract
Escherichia coli (E. coli) has become a particular concern due to the increasing incidence of antimicrobial resistance (AMR) observed worldwide. Using machine learning (ML) to predict E. coli AMR is more efficient than traditional laboratory testing. However, further improvement in the predictive performance of existing models remains challenging. In this study, we collected 1937 high-quality whole genome sequencing (WGS) datasets with antimicrobial resistance phenotypes from public databases and modified the existing workflow by adding an attention mechanism so that it focuses more on core single nucleotide polymorphisms (SNPs) that may contribute substantially to the development of AMR in E. coli. While comparing the model performance before and after adding the attention mechanism, we also performed a cross-comparison among the published models using random forest (RF), support vector machine (SVM), logistic regression (LR), and convolutional neural network (CNN). Our study demonstrates that the discriminative positional colors of Chaos Game Representation (CGR) images can selectively influence and highlight genome regions without prior knowledge, enhancing prediction accuracy. Furthermore, we developed an online tool (https://github.com/tjiaa/E.coli-ML/tree/main) for assisting clinicians in the rapid prediction of the AMR phenotype of E. coli and accelerating clinical decision-making.
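For readers unfamiliar with Chaos Game Representation, the following is a minimal sketch of the standard CGR construction, not the authors' pipeline. The corner assignment (A, C, G, T), the function names `cgr_points` and `cgr_grid`, and the grid resolution parameter `k` are assumptions chosen for illustration; the abstract does not specify how the positional colors or the attention mechanism are implemented.

```python
# Minimal sketch of Chaos Game Representation (CGR) for a DNA sequence.
# Assumption: a common corner convention, A=(0,0), C=(0,1), G=(1,1), T=(1,0).

CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return one (x, y) point per unambiguous base in the sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous bases such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(seq, k=3):
    """Count CGR points in a 2^k x 2^k grid (a coarse, image-like matrix)."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_grid("ACGTTGCAACGT", k=2):
        print(row)
```

Each grid cell collects the positions whose preceding bases share the same short suffix, which is why a CGR image can be fed to an image classifier such as a CNN.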
Affiliation(s)
- Canghong Jin
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
- Chenghao Jia
  - Institute of Preventive Veterinary Sciences and Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou 310058, China
- Wenkang Hu
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
  - College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
- Haidong Xu
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
- Yanyi Shen
  - School of Computer and Computing Science, Hangzhou City University, Hangzhou 310015, China
- Min Yue
  - Institute of Preventive Veterinary Sciences and Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou 310058, China
  - Hainan Institute of Zhejiang University, Sanya 572000, China
  - Zhejiang Provincial Key Laboratory of Preventive Veterinary Medicine, Hangzhou 310058, China
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310003, China
2
Zhang G, Zhang C, Liu J, Zhang Y, Fu W. Occurrence, fate, and risk assessment of antibiotics in conventional and advanced drinking water treatment systems: From source to tap. Journal of Environmental Management 2024; 358:120746. [PMID: 38593734 DOI: 10.1016/j.jenvman.2024.120746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 02/26/2024] [Accepted: 03/19/2024] [Indexed: 04/11/2024]
Abstract
The occurrence and removal of 38 antibiotics from nine classes in two drinking water treatment plants (WTPs) were monitored monthly over one year to evaluate the efficiency of typical treatment processes, track the source of antibiotics in tap water and assess their potential risks to ecosystems and human health. In both source waters, 18 antibiotics were detected at least once, with average total antibiotic concentrations of 538.5 ng/L in WTP1 and 569.3 ng/L in WTP2. The coagulation/flocculation and sedimentation, sand filtration and granular activated carbon processes demonstrated limited removal efficiencies. Chlorination, on the other hand, removed 48.7 ± 11.9% of antibiotics. Interestingly, negative removal was observed along the distribution system, resulting in a significant antibiotic presence in tap water, with average concentrations of 131.5 ng/L in WTP1 and 362.8 ng/L in WTP2. Source tracking analysis indicates that most antibiotics in tap water may originate from the distribution system. The presence of antibiotics in raw water and tap water posed risks to the aquatic ecosystem. Untreated or partially treated raw water could pose a medium risk to infants under six months. Water parameters, for example, temperature, total nitrogen and total organic carbon, can serve as indicators to estimate antibiotic occurrence and associated risks. Furthermore, machine learning models were developed that successfully predicted risk levels using water quality parameters. Our study provides valuable insights into the occurrence, removal and risk of antibiotics in urban WTPs, contributing to the broader understanding of antibiotic pollution in water treatment systems.
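The notion of "negative removal" may be easier to follow with the usual removal-efficiency formula, (C_in - C_out) / C_in × 100. The sketch below assumes that conventional definition; the function name `removal_efficiency` is hypothetical, and the abstract does not state the exact formula the authors used. The 100 to 150 ng/L example uses made-up numbers purely to show the sign convention.

```python
# Minimal sketch, assuming the conventional definition of removal efficiency.

def removal_efficiency(c_in, c_out):
    """Percent removal across a treatment step; negative if concentration rises."""
    if c_in <= 0:
        raise ValueError("influent concentration must be positive")
    return (c_in - c_out) / c_in * 100.0


if __name__ == "__main__":
    # Hypothetical step where the concentration rises from 100 to 150 ng/L:
    print(removal_efficiency(100.0, 150.0))  # negative value = "negative removal"
    # Plain arithmetic on the average source and tap concentrations quoted in
    # the abstract (an illustration, not a figure reported by the authors):
    print(round(removal_efficiency(538.5, 131.5), 1))  # WTP1, source to tap
    print(round(removal_efficiency(569.3, 362.8), 1))  # WTP2, source to tap
```

A net positive source-to-tap value can coexist with negative removal in one segment, here the distribution system, because the formula is applied per step.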
Affiliation(s)
- Guorui Zhang
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, Center for Grassland Microbiome, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
- Chao Zhang
  - Key Laboratory of Ecology of Rare and Endangered Species and Environmental Protection (Guangxi Normal University), Ministry of Education, College of Environment and Resources, Guangxi Normal University, 541004, Guilin, China
- Jie Liu
  - State Key Laboratory of Herbage Improvement and Grassland Agro-ecosystems, Center for Grassland Microbiome, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
- Yixiang Zhang
  - Department of Chemistry and Key Laboratory of Organic Optoelectronics and Molecular Engineering, Ministry of Education, Tsinghua University, 100084, Beijing, China
- Wenjie Fu
  - Key Laboratory of Ecology of Rare and Endangered Species and Environmental Protection (Guangxi Normal University), Ministry of Education, College of Environment and Resources, Guangxi Normal University, 541004, Guilin, China
3
Pikalyova K, Orlov A, Horvath D, Marcou G, Varnek A. Predicting S. aureus antimicrobial resistance with interpretable genomic space maps. Mol Inform 2024; 43:e202300263. [PMID: 38386182 DOI: 10.1002/minf.202300263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2023] [Revised: 01/15/2024] [Accepted: 02/08/2024] [Indexed: 02/23/2024]
Abstract
Increasing antimicrobial resistance (AMR) represents a global healthcare threat. To decrease the spread of AMR and associated mortality, methods for rapid selection of optimal antibiotic treatment are urgently needed. Machine learning (ML) models based on genomic data to predict resistant phenotypes can serve as a fast screening tool prior to phenotypic testing. Nonetheless, many existing ML methods lack interpretability. Therefore, we present a methodology for visualization of sequence space and AMR prediction based on the non-linear dimensionality reduction method - generative topographic mapping (GTM). This approach, applied to AMR data of >5000 S. aureus isolates retrieved from the PATRIC database, yielded GTM models with reasonable accuracy for all drugs (balanced accuracy values ≥0.75). The Generative Topographic Maps (GTMs) represent data in the form of illustrative maps of the genomic space and allow for antibiotic-wise comparison of resistant phenotypes. The maps were also found to be useful for the analysis of genetic determinants responsible for drug resistance. Overall, the GTM-based methodology is a useful tool for both the illustrative exploration of the genomic sequence space and AMR prediction.
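Balanced accuracy, the metric quoted above, is the mean of sensitivity and specificity, which makes it less sensitive to class imbalance than plain accuracy. The sketch below is a generic illustration with a hypothetical function name and toy labels (1 = resistant, 0 = susceptible); it is not the authors' evaluation code.

```python
# Minimal sketch of balanced accuracy for a binary resistant/susceptible task.

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity (recall on 1) and specificity (recall on 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


if __name__ == "__main__":
    # Toy data: a model that calls every isolate resistant.
    y_true = [1, 1, 1, 1, 0]
    y_pred = [1, 1, 1, 1, 1]
    print(balanced_accuracy(y_true, y_pred))  # plain accuracy would be 0.8
```

In the toy case the always-resistant model scores 0.8 on plain accuracy but only 0.5 on balanced accuracy, which is why the latter is a common choice for imbalanced AMR datasets.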
Affiliation(s)
- Karina Pikalyova
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Alexey Orlov
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Dragos Horvath
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Gilles Marcou
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
- Alexandre Varnek
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
4
Asnicar F, Thomas AM, Passerini A, Waldron L, Segata N. Machine learning for microbiologists. Nat Rev Microbiol 2024; 22:191-205. [PMID: 37968359 DOI: 10.1038/s41579-023-00984-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 10/03/2023] [Indexed: 11/17/2023]
Abstract
Machine learning is increasingly important in microbiology where it is used for tasks such as predicting antibiotic resistance and associating human microbiome features with complex host diseases. The applications in microbiology are quickly expanding and the machine learning tools frequently used in basic and clinical research range from classification and regression to clustering and dimensionality reduction. In this Review, we examine the main machine learning concepts, tasks and applications that are relevant for experimental and clinical microbiologists. We provide the minimal toolbox for a microbiologist to be able to understand, interpret and use machine learning in their experimental and translational activities.
Affiliation(s)
- Francesco Asnicar
  - Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
- Andrew Maltez Thomas
  - Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
- Andrea Passerini
  - Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
- Levi Waldron
  - Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
  - Department of Epidemiology and Biostatistics, City University of New York, New York, NY, USA
- Nicola Segata
  - Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
  - Department of Experimental Oncology, European Institute of Oncology IRCCS, Milan, Italy
5
Hu K, Meyer F, Deng ZL, Asgari E, Kuo TH, Münch PC, McHardy AC. Assessing computational predictions of antimicrobial resistance phenotypes from microbial genomes. Brief Bioinform 2024; 25:bbae206. [PMID: 38706320 PMCID: PMC11070729 DOI: 10.1093/bib/bbae206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
The advent of rapid whole-genome sequencing has created new opportunities for computational prediction of antimicrobial resistance (AMR) phenotypes from genomic data. Both rule-based and machine learning (ML) approaches have been explored for this task, but systematic benchmarking is still needed. Here, we evaluated four state-of-the-art ML methods (Kover, PhenotypeSeeker, Seq2Geno2Pheno and Aytan-Aktug), an ML baseline and the rule-based ResFinder by training and testing each of them across 78 species-antibiotic datasets, using a rigorous benchmarking workflow that integrates three evaluation approaches, each paired with three distinct sample splitting methods. Our analysis revealed considerable variation in the performance across techniques and datasets. Whereas ML methods generally excelled for closely related strains, ResFinder excelled for handling divergent genomes. Overall, Kover most frequently ranked top among the ML approaches, followed by PhenotypeSeeker and Seq2Geno2Pheno. AMR phenotypes for antibiotic classes such as macrolides and sulfonamides were predicted with the highest accuracies. The quality of predictions varied substantially across species-antibiotic combinations, particularly for beta-lactams; across species, resistance phenotyping of the beta-lactam compounds aztreonam, amoxicillin/clavulanic acid, cefoxitin, ceftazidime and piperacillin/tazobactam, alongside tetracyclines, demonstrated more variable performance than the other benchmarked antibiotics. By organism, Campylobacter jejuni and Enterococcus faecium phenotypes were more robustly predicted than those of Escherichia coli, Staphylococcus aureus, Salmonella enterica, Neisseria gonorrhoeae, Klebsiella pneumoniae, Pseudomonas aeruginosa, Acinetobacter baumannii, Streptococcus pneumoniae and Mycobacterium tuberculosis. In addition, our study provides software recommendations for each species-antibiotic combination. It furthermore highlights the need for optimization for robust clinical applications, particularly for strains that diverge substantially from those used for training.
Affiliation(s)
- Kaixin Hu
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
- Fernando Meyer
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
- Zhi-Luo Deng
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
- Ehsaneddin Asgari
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Molecular Cell Biomechanics Laboratory, Department of Bioengineering and Mechanical Engineering, University of California, Berkeley, USA
- Tzu-Hao Kuo
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
- Philipp C Münch
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
  - Cluster of Excellence RESIST (EXC 2155), Hannover Medical School, Hannover, Germany
  - German Center for Infection Research (DZIF), partner site Hannover Braunschweig, Braunschweig, Germany
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- Alice C McHardy
  - Computational Biology of Infection Research, Helmholtz Center for Infection Research, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Technische Universität Braunschweig, Braunschweig, Germany
6
Batisti Biffignandi G, Chindelevitch L, Corbella M, Feil EJ, Sassera D, Lees JA. Optimising machine learning prediction of minimum inhibitory concentrations in Klebsiella pneumoniae. Microb Genom 2024; 10:001222. [PMID: 38529944 PMCID: PMC10995625 DOI: 10.1099/mgen.0.001222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 03/07/2024] [Indexed: 03/27/2024] Open
Abstract
Minimum Inhibitory Concentrations (MICs) are the gold standard for quantitatively measuring antibiotic resistance. However, lab-based MIC determination can be time-consuming and suffers from low reproducibility, and interpretation as sensitive or resistant relies on guidelines which change over time. Genome sequencing and machine learning promise to allow in silico MIC prediction as an alternative approach which overcomes some of these difficulties, although interpretation of the MIC is still needed. Nevertheless, precisely how we should handle MIC data when dealing with predictive models remains unclear, since they are measured semi-quantitatively, with varying resolution, and are typically also left- and right-censored within varying ranges. We therefore investigated genome-based prediction of MICs in the pathogen Klebsiella pneumoniae using 4367 genomes with both simulated semi-quantitative traits and real MICs. As we were focused on clinical interpretation, we used interpretable rather than black-box machine learning models, namely, Elastic Net, Random Forests, and linear mixed models. Simulated traits were generated accounting for oligogenic, polygenic, and homoplastic genetic effects with different levels of heritability. Then we assessed how model prediction accuracy was affected when MICs were framed as regression and classification. Our results showed that treating the MICs differently depending on the number of concentration levels of antibiotic available was the most promising learning strategy. Specifically, to optimise both prediction accuracy and inference of the correct causal variants, we recommend considering the MICs as continuous and framing the learning problem as a regression when the number of observed antibiotic concentration levels is large, whereas with a smaller number of concentration levels they should be treated as a categorical variable and the learning problem should be framed as a classification. Our findings also underline how predictive models can be improved when prior biological knowledge is taken into account, due to the varying genetic architecture of each antibiotic resistance trait. Finally, we emphasise that incrementing the population database is pivotal for the future clinical implementation of these models to support routine machine-learning based diagnostics.
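The recommendation above (regression when many concentration levels are observed, classification when few) can be sketched as a small preprocessing step. Everything here is an illustrative assumption rather than the authors' code: the string format of MIC values, the helper names `parse_mic` and `choose_framing`, and in particular the threshold `min_levels_for_regression=6`, since the abstract gives no numeric cut-off.

```python
import math

# Minimal sketch, assuming MICs arrive as strings such as "4", "<=0.5", ">32".

def parse_mic(value):
    """Return (log2 of the MIC, censoring flag: 'left', 'right' or 'none')."""
    text = value.strip()
    censoring = "none"
    # Two-character prefixes are checked first so "<=" is not read as "<".
    for prefix, label in (("<=", "left"), (">=", "right"),
                          ("<", "left"), (">", "right")):
        if text.startswith(prefix):
            censoring = label
            text = text[len(prefix):]
            break
    return math.log2(float(text)), censoring


def choose_framing(mic_values, min_levels_for_regression=6):
    """Pick a learning framing from the number of distinct MIC levels.

    The threshold is a hypothetical placeholder, not a value from the paper.
    """
    levels = {parse_mic(v)[0] for v in mic_values}
    if len(levels) >= min_levels_for_regression:
        return "regression"
    return "classification"


if __name__ == "__main__":
    print(parse_mic("<=0.5"))
    print(choose_framing(["0.5", "1", "2", "4", "8", "16", "32"]))
    print(choose_framing(["<=1", "2", ">2"]))
```

Working on the log2 scale reflects the two-fold dilution series in which MICs are typically measured; the censoring flag is kept so that a downstream model can treat boundary values separately.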
Affiliation(s)
- Gherard Batisti Biffignandi
  - Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
  - MRC Centre for Global Infectious Disease Analysis, Imperial College, London, England, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
- Leonid Chindelevitch
  - MRC Centre for Global Infectious Disease Analysis, Imperial College, London, England, UK
- Marta Corbella
  - Microbiology and Virology Unit, Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
- Edward J. Feil
  - The Milner Centre for Evolution, Department of Life Sciences, University of Bath, Bath, UK
- Davide Sassera
  - Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
  - Fondazione IRCCS Policlinico San Matteo, Pavia, Italy
- John A. Lees
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK
7
Lai HY, Cooper TF. Interaction with a phage gene underlie costs of a β-lactamase. mBio 2024; 15:e0277623. [PMID: 38194254 PMCID: PMC10865808 DOI: 10.1128/mbio.02776-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 12/05/2023] [Indexed: 01/10/2024] Open
Abstract
The fitness cost of an antibiotic resistance gene (ARG) can differ across host strains, creating refuges that allow the maintenance of an ARG in the absence of direct selection for its resistance phenotype. Despite the importance of such ARG-host interactions for predicting ARG dynamics, the basis of ARG fitness costs and their variability between hosts are not well understood. We determined the genetic basis of a host-dependent cost of a β-lactamase, blaTEM-116*, that conferred a significant cost in one Escherichia coli strain but was close to neutral in 11 other Escherichia spp. strains. Selection of a blaTEM-116*-encoding plasmid in the strain in which it initially had a high cost resulted in rapid and parallel compensation for that cost through mutations in a P1-like phage gene, relAP1. When the wild-type relAP1 gene was added to a strain in which it was not present and in which blaTEM-116* was neutral, it caused the ARG to become costly. Thus, relAP1 is both necessary and sufficient to explain blaTEM-116* costs in at least some host backgrounds. To our knowledge, these findings represent the first demonstrated case of the cost of an ARG being influenced by a genetic interaction with a phage gene. The interaction between a phage gene and a plasmid-borne ARG highlights the complexity of selective forces determining the maintenance and spread of ARGs and, by extension, encoding phage and plasmids in natural bacterial communities. IMPORTANCE Antibiotic resistance genes (ARGs) play a major role in the increasing problem of antibiotic resistance in clinically relevant bacteria. Selection of these genes occurs in the presence of antibiotics, but their eventual success also depends on the sometimes substantial costs they impose on host bacteria in antibiotic-free environments. We evolved an ARG that confers resistance to penicillin-type antibiotics in one host in which it did confer a cost and in one host in which it did not. We found that costs were rapidly and consistently reduced through parallel genetic changes in a gene encoded by a phage that was infecting the costly host. The unmutated version of this gene was sufficient to cause the ARG to confer a cost in a host in which it was originally neutral, demonstrating an antagonism between the two genetic elements and underlining the range and complexity of pressures determining ARG dynamics in natural populations.
Affiliation(s)
- Huei-Yi Lai
  - School of Natural Sciences, Massey University, Auckland, New Zealand
- Tim F. Cooper
  - School of Natural Sciences, Massey University, Auckland, New Zealand
8
Liu GY, Yu D, Fan MM, Zhang X, Jin ZY, Tang C, Liu XF. Antimicrobial resistance crisis: could artificial intelligence be the solution? Mil Med Res 2024; 11:7. [PMID: 38254241 PMCID: PMC10804841 DOI: 10.1186/s40779-024-00510-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 01/08/2024] [Indexed: 01/24/2024] Open
Abstract
Antimicrobial resistance is a global public health threat, and the World Health Organization (WHO) has announced a priority list of the most threatening pathogens against which novel antibiotics need to be developed. The discovery and introduction of novel antibiotics are time-consuming and expensive. According to WHO's report of antibacterial agents in clinical development, only 18 novel antibiotics have been approved since 2014. Therefore, novel antibiotics are critically needed. Artificial intelligence (AI) has been rapidly applied to drug development since its recent technical breakthrough and has dramatically improved the efficiency of the discovery of novel antibiotics. Here, we first summarized recently marketed novel antibiotics, and antibiotic candidates in clinical development. In addition, we systematically reviewed the involvement of AI in antibacterial drug development and utilization, including small molecules, antimicrobial peptides, phage therapy, essential oils, as well as resistance mechanism prediction, and antibiotic stewardship.
Affiliation(s)
- Guang-Yu Liu
  - Department of Immunology and Pathogen Biology, School of Basic Medical Sciences, Hangzhou Normal University, Key Laboratory of Aging and Cancer Biology of Zhejiang Province, Key Laboratory of Inflammation and Immunoregulation of Hangzhou, Hangzhou Normal University, Hangzhou, 311121, China
- Dan Yu
  - National Key Discipline of Pediatrics Key Laboratory of Major Diseases in Children Ministry of Education, Laboratory of Dermatology, Beijing Pediatric Research Institute, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, 100045, China
- Mei-Mei Fan
  - Department of Immunology and Pathogen Biology, School of Basic Medical Sciences, Hangzhou Normal University, Key Laboratory of Aging and Cancer Biology of Zhejiang Province, Key Laboratory of Inflammation and Immunoregulation of Hangzhou, Hangzhou Normal University, Hangzhou, 311121, China
- Xu Zhang
  - Robert and Arlene Kogod Center on Aging, Mayo Clinic, Rochester, MN, 55905, USA
  - Department of Biochemistry and Molecular Biology, Mayo Clinic, Rochester, MN, 55905, USA
- Ze-Yu Jin
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Christoph Tang
  - Sir William Dunn School of Pathology, University of Oxford, Oxford, OX1 3RE, UK
- Xiao-Fen Liu
  - Institute of Antibiotics, Huashan Hospital, Fudan University, Key Laboratory of Clinical Pharmacology of Antibiotics, National Health Commission of the People's Republic of China, National Clinical Research Centre for Aging and Medicine, Huashan Hospital, Fudan University, Shanghai, 200040, China
9
Li X, Brejnrod A, Thorsen J, Zachariasen T, Trivedi U, Russel J, Vestergaard GA, Stokholm J, Rasmussen MA, Sørensen SJ. Differential responses of the gut microbiome and resistome to antibiotic exposures in infants and adults. Nat Commun 2023; 14:8526. [PMID: 38135681 PMCID: PMC10746713 DOI: 10.1038/s41467-023-44289-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 12/07/2023] [Indexed: 12/24/2023] Open
Abstract
Despite their crucial importance for human health, there is still relatively limited knowledge on how the gut resistome changes or responds to antibiotic treatment across ages, especially in the latter case. Here, we use fecal metagenomic data from 662 Danish infants and 217 young adults to fill this gap. The gut resistomes are characterized by a bimodal distribution driven by E. coli composition. The typical profile of the gut resistome differs significantly between adults and infants, with the latter distinguished by higher gene and plasmid abundances. However, the predominant antibiotic resistance genes (ARGs) are the same. Antibiotic treatment reduces bacterial diversity and increases ARG and plasmid abundances in both cohorts, especially core ARGs. The effects of antibiotic treatments on the gut microbiome last longer in adults than in infants, and different antibiotics are associated with distinct impacts. Overall, this study broadens our current understanding of gut resistome dynamics and the impact of antibiotic treatment across age groups.
Affiliation(s)
- Xuanji Li
  - Department of Biology, Section of Microbiology, University of Copenhagen, 2100, Copenhagen, Denmark
- Asker Brejnrod
  - Department of Health Technology, Technical University of Denmark, Section of Bioinformatics, 2800 Kgs, Lyngby, Denmark
- Jonathan Thorsen
  - COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- Trine Zachariasen
  - Department of Health Technology, Technical University of Denmark, Section of Bioinformatics, 2800 Kgs, Lyngby, Denmark
- Urvish Trivedi
  - Department of Biology, Section of Microbiology, University of Copenhagen, 2100, Copenhagen, Denmark
- Jakob Russel
  - Department of Biology, Section of Microbiology, University of Copenhagen, 2100, Copenhagen, Denmark
- Gisle Alberg Vestergaard
  - Department of Health Technology, Technical University of Denmark, Section of Bioinformatics, 2800 Kgs, Lyngby, Denmark
- Jakob Stokholm
  - COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
  - Department of Food Science, Section of Microbiology and Fermentation, University of Copenhagen, 1958, Frederiksberg C, Denmark
- Morten Arendt Rasmussen
  - COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
  - Department of Food Science, Section of Microbiology and Fermentation, University of Copenhagen, 1958, Frederiksberg C, Denmark
- Søren Johannes Sørensen
  - Department of Biology, Section of Microbiology, University of Copenhagen, 2100, Copenhagen, Denmark
10
Yurtseven A, Buyanova S, Agrawal AA, Bochkareva OO, Kalinina OV. Machine learning and phylogenetic analysis allow for predicting antibiotic resistance in M. tuberculosis. BMC Microbiol 2023; 23:404. [PMID: 38124060 PMCID: PMC10731705 DOI: 10.1186/s12866-023-03147-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/07/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND Antimicrobial resistance (AMR) poses a significant global health threat, and an accurate prediction of bacterial resistance patterns is critical for effective treatment and control strategies. In recent years, machine learning (ML) approaches have emerged as powerful tools for analyzing large-scale bacterial AMR data. However, ML methods often ignore evolutionary relationships among bacterial strains, which can greatly impact the performance of the ML methods, especially when the goal is to detect resistance-associated features. Genome-wide association study (GWAS) methods like linear mixed models account for the evolutionary relationships in bacteria, but they uncover only highly significant variants which have already been reported in the literature. RESULTS In this work, we introduce a novel phylogeny-related parallelism score (PRPS), which measures whether a certain feature is correlated with the population structure of a set of samples. We demonstrate that PRPS can be used, in combination with SVM- and random forest-based models, to reduce the number of features in the analysis, while simultaneously increasing the models' performance. We applied our pipeline to publicly available AMR data from the PATRIC database for Mycobacterium tuberculosis against six common antibiotics. CONCLUSIONS Using our pipeline, we re-discovered known resistance-associated mutations as well as new candidate mutations which may be related to resistance and have not previously been reported in the literature. We demonstrated that taking into account phylogenetic relationships not only improves the model performance, but also yields top-contributing resistance markers that are more biologically relevant.
Affiliation(s)
- Alper Yurtseven
  - Department of Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Campus E8.1, Saarbrücken, 66123, Saarland, Germany
  - Graduate School of Computer Science, Saarland University, Saarbrücken, 66123, Saarland, Germany
- Sofia Buyanova
  - Institute of Science and Technology Austria (ISTA), Am Campus 1, Klosterneuburg, 3400, Austria
- Amay Ajaykumar Agrawal
  - Department of Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Campus E8.1, Saarbrücken, 66123, Saarland, Germany
  - Graduate School of Computer Science, Saarland University, Saarbrücken, 66123, Saarland, Germany
- Olga O Bochkareva
  - Institute of Science and Technology Austria (ISTA), Am Campus 1, Klosterneuburg, 3400, Austria
  - Centre for Microbiology and Environmental Systems Science, Division of Computational System Biology, University of Vienna, Djerassiplatz 1 A, Wien, 1030, Austria
- Olga V Kalinina
  - Department of Drug Bioinformatics, Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Campus E8.1, Saarbrücken, 66123, Saarland, Germany
  - Graduate School of Computer Science, Saarland University, Saarbrücken, 66123, Saarland, Germany
  - Faculty of Medicine, Saarland University, Homburg, 66421, Saarland, Germany
11
Theuretzbacher U, Blasco B, Duffey M, Piddock LJV. Unrealized targets in the discovery of antibiotics for Gram-negative bacterial infections. Nat Rev Drug Discov 2023; 22:957-975. [PMID: 37833553 DOI: 10.1038/s41573-023-00791-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Accepted: 08/15/2023] [Indexed: 10/15/2023]
Abstract
Advances in areas that include genomics, systems biology, protein structure determination and artificial intelligence provide new opportunities for target-based antibacterial drug discovery. The selection of a 'good' new target for direct-acting antibacterial compounds is the first decision, for which multiple criteria must be explored, integrated and re-evaluated as drug discovery programmes progress. Criteria include essentiality of the target for bacterial survival, its conservation across different strains of the same species, bacterial species and growth conditions (which determines the spectrum of activity of a potential antibiotic) and the level of homology with human genes (which influences the potential for selective inhibition). Additionally, a bacterial target should have the potential to bind to drug-like molecules, and its subcellular location will govern the need for inhibitors to penetrate one or two bacterial membranes, which is a key challenge in targeting Gram-negative bacteria. The risk of the emergence of target-based drug resistance for drugs with single targets also requires consideration. This Review describes promising but as-yet-unrealized targets for antibacterial drugs against Gram-negative bacteria and examples of cognate inhibitors, and highlights lessons learned from past drug discovery programmes.
Affiliation(s)
- Benjamin Blasco
- Global Antibiotic Research and Development Partnership (GARDP), Geneva, Switzerland
- Maëlle Duffey
- Global Antibiotic Research and Development Partnership (GARDP), Geneva, Switzerland
- Laura J V Piddock
- Global Antibiotic Research and Development Partnership (GARDP), Geneva, Switzerland.

12
Hyun JC, Monk JM, Szubin R, Hefner Y, Palsson BO. Global pathogenomic analysis identifies known and candidate genetic antimicrobial resistance determinants in twelve species. Nat Commun 2023; 14:7690. [PMID: 38001096 PMCID: PMC10673929 DOI: 10.1038/s41467-023-43549-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 11/14/2023] [Indexed: 11/26/2023] Open
Abstract
Surveillance programs for managing antimicrobial resistance (AMR) have yielded thousands of genomes suited for data-driven mechanism discovery. We present a workflow integrating pangenomics, gene annotation, and machine learning to identify AMR genes at scale. When applied to 12 species, 27,155 genomes, and 69 drugs, we 1) find AMR gene transfer mostly confined within related species, with 925 genes in multiple species but just eight in multiple phylogenetic classes, 2) demonstrate that discovery-oriented support vector machines outperform contemporary methods at recovering known AMR genes, recovering 263 genes compared to 145 by Pyseer, and 3) identify 142 AMR gene candidates. Validation of two candidates in E. coli BW25113 reveals cases of conditional resistance: ΔcycA confers ciprofloxacin resistance in minimal media with D-serine, and frdD V111D confers ampicillin resistance in the presence of ampC by modifying the overlapping promoter. We expect this approach to be adaptable to other species and phenotypes.
Affiliation(s)
- Jason C Hyun
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Jonathan M Monk
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Richard Szubin
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Ying Hefner
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
- Bernhard O Palsson
- Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA.
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA.
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, USA.
- Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA, USA.
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Building 220, 2800, Kongens, Lyngby, Denmark.

13
Wu J, Ouyang J, Qin H, Zhou J, Roberts R, Siam R, Wang L, Tong W, Liu Z, Shi T. PLM-ARG: antibiotic resistance gene identification using a pretrained protein language model. Bioinformatics 2023; 39:btad690. [PMID: 37995287 PMCID: PMC10676515 DOI: 10.1093/bioinformatics/btad690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 10/23/2023] [Accepted: 11/22/2023] [Indexed: 11/25/2023] Open
Abstract
MOTIVATION Antibiotic resistance presents a formidable global challenge to public health and the environment. While considerable endeavors have been dedicated to identifying antibiotic resistance genes (ARGs) for assessing the threat of antibiotic resistance, recent extensive investigations using metagenomic and metatranscriptomic approaches have unveiled a noteworthy concern. A significant fraction of proteins defies annotation through conventional sequence similarity-based methods, an issue that extends to ARGs, potentially leading to their under-recognition due to dissimilarities at the sequence level. RESULTS Herein, we proposed an Artificial Intelligence-powered ARG identification framework using a pretrained large protein language model, enabling ARG identification and resistance category classification simultaneously. The proposed PLM-ARG was developed based on the most comprehensive ARG and related resistance category information (>28K ARGs and associated 29 resistance categories), yielding Matthews correlation coefficients (MCCs) of 0.983 ± 0.001 by using a 5-fold cross-validation strategy. Furthermore, the PLM-ARG model was verified using an independent validation set and achieved an MCC of 0.838, outperforming other publicly available ARG prediction tools with an improvement range of 51.8%-107.9%. Moreover, the utility of the proposed PLM-ARG model was demonstrated by annotating resistance in the UniProt database and evaluating the impact of ARGs on the Earth's environmental microbiota. AVAILABILITY AND IMPLEMENTATION PLM-ARG is available for academic purposes at https://github.com/Junwu302/PLM-ARG, and a user-friendly webserver (http://www.unimd.org/PLM-ARG) is also provided.
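For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of the binary Matthews correlation coefficient. It is not taken from the PLM-ARG code; the function name and the toy labels are illustrative assumptions, and the multi-class form needed for resistance-category classification is not shown.

```python
import math


def binary_mcc(y_true, y_pred):
    """Binary Matthews correlation coefficient for 0/1 labels.

    Illustrative helper (hypothetical name), not part of PLM-ARG.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the
        # confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels (made up for illustration): 1 = ARG, 0 = non-ARG.
truth = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
score = binary_mcc(truth, predicted)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect agreement), and it uses all four confusion-matrix cells, which is why it is often preferred over accuracy on imbalanced data.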
Affiliation(s)
- Jun Wu
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Jian Ouyang
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Haipeng Qin
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Jiajia Zhou
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- Ruth Roberts
- ApconiX Ltd, Alderley Park, Alderley Edge SK10 4TG, United Kingdom
- University of Birmingham, Birmingham B15 2TT, United Kingdom
- Rania Siam
- Biology Department, School of Sciences and Engineering, The American University in Cairo, New Cairo 11835, Egypt
- Lan Wang
- College of Architecture and Urban Planning, Tongji University, Shanghai 200092, China
- Weida Tong
- National Center for Toxicological Research, Food and Drug Administration, Jefferson, AR 72079, United States
- Zhichao Liu
- Nonclinical Drug Safety, Boehringer Ingelheim Pharmaceuticals, Inc, Ridgefield, CT 06877, United States
- Tieliu Shi
- Center for Bioinformatics and Computational Biology, and The Institute of Biomedical Sciences, School of Life Sciences, East China Normal University, Shanghai 200241, China
- School of Statistics, Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, East China Normal University, Shanghai 200062, China

14
Graña-Miraglia L, Morales-Lizcano N, Wang PW, Hwang DM, Yau YCW, Waters VJ, Guttman DS. Predictive modeling of antibiotic eradication therapy success for new-onset Pseudomonas aeruginosa pulmonary infections in children with cystic fibrosis. PLoS Comput Biol 2023; 19:e1011424. [PMID: 37672526 PMCID: PMC10506723 DOI: 10.1371/journal.pcbi.1011424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 09/18/2023] [Accepted: 08/09/2023] [Indexed: 09/08/2023] Open
Abstract
Chronic Pseudomonas aeruginosa (Pa) lung infections are the leading cause of mortality among cystic fibrosis (CF) patients; therefore, the eradication of new-onset Pa lung infections is an important therapeutic goal that can have long-term health benefits. The use of early antibiotic eradication therapy (AET) has been shown to clear the majority of new-onset Pa infections, and it is hoped that identifying the underlying basis for AET failure will further improve treatment outcomes. Here we generated machine learning models to predict AET outcomes based on pathogen genomic data. We used a nested cross validation design, population structure control, and recursive feature selection to improve model performance and showed that incorporating population structure control was crucial for improving model interpretation and generalizability. Our best model, controlling for population structure and using only 30 recursively selected features, had an area under the curve of 0.87 for a holdout test dataset. The top-ranked features were generally associated with motility, adhesion, and biofilm formation.
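The nested cross-validation design mentioned in this abstract can be sketched as index bookkeeping: an outer loop holds out a test fold for performance estimation, and an inner loop re-splits only the remaining samples for feature selection and tuning. The sketch below is an assumption-laden illustration, not the authors' pipeline; the function names are hypothetical, and shuffling, stratification and population-structure control are deliberately omitted.

```python
def k_fold_indices(indices, k):
    """Yield (train, test) index lists for k folds (no shuffling)."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_cv_splits(n_samples, outer_k=5, inner_k=3):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    inner_splits are built from outer_train only, so the outer test
    fold never influences feature selection or hyper-parameter tuning.
    """
    all_indices = list(range(n_samples))
    for outer_train, outer_test in k_fold_indices(all_indices, outer_k):
        inner_splits = list(k_fold_indices(outer_train, inner_k))
        yield outer_train, outer_test, inner_splits


# Example: 20 isolates, 5 outer folds, 3 inner folds per outer fold.
splits = list(nested_cv_splits(20, outer_k=5, inner_k=3))
```

In practice each inner split would be used to choose features or parameters, and the chosen model would then be refit on `outer_train` and scored once on `outer_test`.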
Affiliation(s)
- Lucía Graña-Miraglia
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Nadia Morales-Lizcano
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Pauline W. Wang
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario, Canada
- David M. Hwang
- Department of Laboratory Medicine and Pathobiology, Toronto, Ontario, Canada
- Laboratory Medicine and Molecular Diagnostics, Sunnybrook Health Sciences Centre, Toronto, Ontario, Canada
- Yvonne C. W. Yau
- Department of Laboratory Medicine and Pathobiology, Toronto, Ontario, Canada
- Department of Paediatric Laboratory Medicine, Division of Microbiology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Valerie J. Waters
- Department of Pediatrics, Division of Infectious Diseases, The Hospital for Sick Children, Toronto, Ontario, Canada
- Translational Medicine, Research Institute, Hospital for Sick Children, Toronto, Ontario, Canada
- David S. Guttman
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Ontario, Canada

15
Nguyen M, Elmore Z, Ihle C, Moen FS, Slater AD, Turner BN, Parrello B, Best AA, Davis JJ. Predicting variable gene content in Escherichia coli using conserved genes. mSystems 2023; 8:e0005823. [PMID: 37314210 PMCID: PMC10469788 DOI: 10.1128/msystems.00058-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2023] [Accepted: 04/25/2023] [Indexed: 06/15/2023] Open
Abstract
Having the ability to predict the protein-encoding gene content of an incomplete genome or metagenome-assembled genome is important for a variety of bioinformatic tasks. In this study, as a proof of concept, we built machine learning classifiers for predicting variable gene content in Escherichia coli genomes using only the nucleotide k-mers from a set of 100 conserved genes as features. Protein families were used to define orthologs, and a single classifier was built for predicting the presence or absence of each protein family occurring in 10%-90% of all E. coli genomes. The resulting set of 3,259 extreme gradient boosting classifiers had a per-genome average macro F1 score of 0.944 [0.943-0.945, 95% CI]. We show that the F1 scores are stable across multi-locus sequence types and that the trend can be recapitulated by sampling a smaller number of core genes or diverse input genomes. Surprisingly, the presence or absence of poorly annotated proteins, including "hypothetical proteins," was accurately predicted (F1 = 0.902 [0.898-0.906, 95% CI]). Models for proteins with horizontal gene transfer-related functions had slightly lower F1 scores but were still accurate (F1s = 0.895, 0.872, 0.824, and 0.841 for transposon, phage, plasmid, and antimicrobial resistance-related functions, respectively). Finally, using a holdout set of 419 diverse E. coli genomes that were isolated from freshwater environmental sources, we observed an average per-genome F1 score of 0.880 [0.876-0.883, 95% CI], demonstrating the extensibility of the models. Overall, this study provides a framework for predicting variable gene content using a limited amount of input sequence data. IMPORTANCE Having the ability to predict the protein-encoding gene content of a genome is important for assessing genome quality, binning genomes from shotgun metagenomic assemblies, and assessing risk due to the presence of antimicrobial resistance and other virulence genes. In this study, we built a set of binary classifiers for predicting the presence or absence of variable genes occurring in 10%-90% of all publicly available E. coli genomes. Overall, the results show that a large portion of the E. coli variable gene content can be predicted with high accuracy, including genes with functions relating to horizontal gene transfer. This study offers a strategy for predicting gene content using limited input sequence data.
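As a rough illustration of what "nucleotide k-mers from conserved genes as features" can look like, the sketch below counts k-mers across a genome's conserved-gene sequences and lays the counts out against a fixed vocabulary. The function names, the choice of k, the handling of ambiguous bases and the toy sequences are all assumptions for illustration; the study's actual k, gene set and encoding are not given in this abstract.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def kmer_counts(sequence, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


def feature_vector(gene_sequences, k, vocabulary):
    """Sum k-mer counts over all genes and order them by a fixed vocabulary."""
    total = Counter()
    for seq in gene_sequences:
        total.update(kmer_counts(seq, k))
    return [total[kmer] for kmer in vocabulary]


# Toy input (made up): two short "conserved gene" fragments from one genome.
genes = ["ACGT", "CGTA"]
vocab = ["AC", "CG", "GT", "TA", "TT"]
vector = feature_vector(genes, k=2, vocabulary=vocab)
```

One such vector per genome would form a row of the feature matrix, with one binary presence/absence label per protein family as the prediction target.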
Affiliation(s)
- Marcus Nguyen
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, USA
- Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- Zachary Elmore
- Biology Department, Hope College, Holland, Michigan, USA
- Clay Ihle
- Biology Department, Hope College, Holland, Michigan, USA
- Adam D. Slater
- Biology Department, Hope College, Holland, Michigan, USA
- Bruce Parrello
- Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA
- Fellowship for Interpretation of Genomes, Burr Ridge, Illinois, USA
- Aaron A. Best
- Biology Department, Hope College, Holland, Michigan, USA
- James J. Davis
- Data Science and Learning Division, Argonne National Laboratory, Lemont, Illinois, USA
- Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, USA

16
Li T, Huang J, Yang S, Chen J, Yao Z, Zhong M, Zhong X, Ye X. Pan-Genome-Wide Association Study of Serotype 19A Pneumococci Identifies Disease-Associated Genes. Microbiol Spectr 2023; 11:e0407322. [PMID: 37358412 PMCID: PMC10433855 DOI: 10.1128/spectrum.04073-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 06/04/2023] [Indexed: 06/27/2023] Open
Abstract
Despite the widespread implementation of pneumococcal vaccines, hypervirulent Streptococcus pneumoniae serotype 19A is endemic worldwide. It is still unclear whether specific genetic elements contribute to complex pathogenicity of serotype 19A isolates. We performed a large-scale pan-genome-wide association study (pan-GWAS) of 1,292 serotype 19A isolates sampled from patients with invasive disease and asymptomatic carriers. To address the underlying disease-associated genotypes, a comprehensive analysis using three methods (Scoary, a linear mixed model, and random forest) was performed to compare disease and carriage isolates to identify genes consistently associated with disease phenotype. By using three pan-GWAS methods, we found consensus on statistically significant associations between genotypes and disease phenotypes (disease or carriage), with a subset of 30 consistently significant disease-associated genes. The results of functional annotation revealed that these disease-associated genes had diverse predicted functions, including those that participated in mobile genetic elements, antibiotic resistance, virulence, and cellular metabolism. Our findings suggest the multifactorial pathogenicity nature of this hypervirulent serotype and provide important evidence for the design of novel protein-based vaccines to prevent and control pneumococcal disease. IMPORTANCE It is important to understand the genetic and pathogenic characteristics of S. pneumoniae serotype 19A, which may provide important information for the prevention and treatment of pneumococcal disease. This global large-sample pan-GWAS study has identified a subset of 30 consistently significant disease-associated genes that are involved in mobile genetic elements, antibiotic resistance, virulence, and cellular metabolism. These findings suggest the multifactorial pathogenicity nature of hypervirulent S. pneumoniae serotype 19A isolates and provide implications for the design of novel protein-based vaccines.
Affiliation(s)
- Ting Li
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Jiayin Huang
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Shimin Yang
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Jianyu Chen
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Zhenjiang Yao
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China
- Minghao Zhong
- Department of Prevention and Health Care, The Sixth People’s Hospital of Dongguan City, Guangdong, China
- Xinguang Zhong
- Department of Prevention and Health Care, The Sixth People’s Hospital of Dongguan City, Guangdong, China
- Xiaohua Ye
- School of Public Health, Guangdong Pharmaceutical University, Guangzhou, China

17
Karlsen ST, Rau MH, Sánchez BJ, Jensen K, Zeidan AA. From genotype to phenotype: computational approaches for inferring microbial traits relevant to the food industry. FEMS Microbiol Rev 2023; 47:fuad030. [PMID: 37286882 PMCID: PMC10337747 DOI: 10.1093/femsre/fuad030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 05/31/2023] [Accepted: 06/06/2023] [Indexed: 06/09/2023] Open
Abstract
When selecting microbial strains for the production of fermented foods, various microbial phenotypes need to be taken into account to achieve target product characteristics, such as biosafety, flavor, texture, and health-promoting effects. Through continuous advances in sequencing technologies, microbial whole-genome sequences of increasing quality can now be obtained both cheaper and faster, which increases the relevance of genome-based characterization of microbial phenotypes. Prediction of microbial phenotypes from genome sequences makes it possible to quickly screen large strain collections in silico to identify candidates with desirable traits. Several microbial phenotypes relevant to the production of fermented foods can be predicted using knowledge-based approaches, leveraging our existing understanding of the genetic and molecular mechanisms underlying those phenotypes. In the absence of this knowledge, data-driven approaches can be applied to estimate genotype-phenotype relationships based on large experimental datasets. Here, we review computational methods that implement knowledge- and data-driven approaches for phenotype prediction, as well as methods that combine elements from both approaches. Furthermore, we provide examples of how these methods have been applied in industrial biotechnology, with special focus on the fermented food industry.
Affiliation(s)
- Signe T Karlsen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Martin H Rau
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Benjamín J Sánchez
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Kristian Jensen
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark
- Ahmad A Zeidan
- Bioinformatics & Modeling, R&D Digital Innovation, Chr. Hansen A/S, Bøge Allé 10-12, 2970 Hørsholm, Denmark

18
Chong YY, Chan PK, Chan VWK, Cheung A, Luk MH, Cheung MH, Fu H, Chiu KY. Application of machine learning in the prevention of periprosthetic joint infection following total knee arthroplasty: a systematic review. ARTHROPLASTY 2023; 5:38. [PMID: 37316877 DOI: 10.1186/s42836-023-00195-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Accepted: 05/11/2023] [Indexed: 06/16/2023] Open
Abstract
BACKGROUND Machine learning is a promising and powerful technology with increasing use in orthopedics. Periprosthetic joint infection following total knee arthroplasty results in increased morbidity and mortality. This systematic review investigated the use of machine learning in preventing periprosthetic joint infection. METHODS A systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines. PubMed was searched in November 2022. All studies that investigated the clinical applications of machine learning in the prevention of periprosthetic joint infection following total knee arthroplasty were included. Non-English studies, studies with no full text available, studies focusing on non-clinical applications of machine learning, reviews and meta-analyses were excluded. For each included study, its characteristics, machine learning applications, algorithms, statistical performances, strengths and limitations were summarized. Limitations of the current machine learning applications and the studies, including their 'black box' nature, overfitting, the requirement of a large dataset, the lack of external validation, and their retrospective nature were identified. RESULTS Eleven studies were included in the final analysis. Machine learning applications in the prevention of periprosthetic joint infection were divided into four categories: prediction, diagnosis, antibiotic application and prognosis. CONCLUSION Machine learning may be a favorable alternative to manual methods in the prevention of periprosthetic joint infection following total knee arthroplasty. It aids in preoperative health optimization, preoperative surgical planning, the early diagnosis of infection, the early application of suitable antibiotics, and the prediction of clinical outcomes. Future research is warranted to resolve the current limitations and bring machine learning into clinical settings.
Affiliation(s)
- Yuk Yee Chong
- Department of Orthopaedics and Traumatology, School of Clinical Medicine, The University of Hong Kong, Hong Kong SAR, China
- Ping Keung Chan
- Department of Orthopaedics and Traumatology, School of Clinical Medicine, The University of Hong Kong, Hong Kong SAR, China.
- Vincent Wai Kwan Chan
- Department of Orthopaedics and Traumatology, Queen Mary Hospital, Hong Kong SAR, China
- Amy Cheung
- Department of Orthopaedics and Traumatology, Queen Mary Hospital, Hong Kong SAR, China
- Michelle Hilda Luk
- Department of Orthopaedics and Traumatology, Queen Mary Hospital, Hong Kong SAR, China
- Man Hong Cheung
- Department of Orthopaedics and Traumatology, School of Clinical Medicine, The University of Hong Kong, Hong Kong SAR, China
- Henry Fu
- Department of Orthopaedics and Traumatology, School of Clinical Medicine, The University of Hong Kong, Hong Kong SAR, China
- Kwong Yuen Chiu
- Department of Orthopaedics and Traumatology, School of Clinical Medicine, The University of Hong Kong, Hong Kong SAR, China

19
Yang MR, Su SF, Wu YW. Using bacterial pan-genome-based feature selection approach to improve the prediction of minimum inhibitory concentration (MIC). Front Genet 2023; 14:1054032. [PMID: 37323667 PMCID: PMC10267731 DOI: 10.3389/fgene.2023.1054032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Accepted: 05/16/2023] [Indexed: 06/17/2023] Open
Abstract
Background: Predicting the resistance profiles of antimicrobial resistance (AMR) pathogens is becoming more and more important in treating infectious diseases. Various attempts have been made to build machine learning models to classify resistant or susceptible pathogens based on either known antimicrobial resistance genes or the entire gene set. However, the phenotypic annotations are translated from minimum inhibitory concentration (MIC), which is the lowest concentration of antibiotic drugs in inhibiting certain pathogenic strains. Since the MIC breakpoints that classify a strain to be resistant or susceptible to specific antibiotic drug may be revised by governing institutes, we refrained from translating these MIC values into the categories "susceptible" or "resistant" but instead attempted to predict the MIC values using machine learning approaches. Results: By applying a machine learning feature selection approach on a Salmonella enterica pan-genome, in which the protein sequences were clustered to identify highly similar gene families, we showed that the selected features (genes) performed better than known AMR genes, and that models built on the selected genes achieved very accurate MIC prediction. Functional analysis revealed that about half of the selected genes were annotated as hypothetical proteins (i.e., with unknown functional roles), and that only a small portion of known AMR genes were among the selected genes, indicating that applying feature selection on the entire gene set has the potential of uncovering novel genes that may be associated with and may contribute to pathogenic antimicrobial resistances. Conclusion: The application of the pan-genome-based machine learning approach was indeed capable of predicting MIC values with very high accuracy. The feature selection process may also identify novel AMR genes for inferring bacterial antimicrobial resistance phenotypes.
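Because MIC values are measured on a two-fold dilution series, MIC prediction is commonly handled on a log2 scale and scored by how often the prediction falls within one dilution step of the measured value. The sketch below illustrates that convention only; it is an assumption about a typical evaluation, not the metric reported in this abstract, and the function name and MIC values are hypothetical.

```python
import math


def within_one_dilution(true_mics, predicted_mics):
    """Fraction of predictions within +/-1 two-fold dilution of the true MIC.

    Both inputs are positive MIC values in the same unit (e.g. mg/L).
    """
    hits = sum(
        1
        for true, pred in zip(true_mics, predicted_mics)
        if abs(math.log2(true) - math.log2(pred)) <= 1.0
    )
    return hits / len(true_mics)


# Toy MIC values (made up), in mg/L.
measured = [0.5, 2, 8, 32]
predicted = [1, 2, 2, 64]
fraction = within_one_dilution(measured, predicted)
```

Working with the MIC value itself, rather than a susceptible/resistant label, keeps a model usable if breakpoints are later revised, which is the motivation stated in the abstract.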
Affiliation(s)
- Ming-Ren Yang
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Shun-Feng Su
- Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan
- Yu-Wei Wu
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- TMU Research Center for Digestive Medicine, Taipei Medical University, Taipei, Taiwan

20
Vorimore F, Jaudou S, Tran ML, Richard H, Fach P, Delannoy S. Combination of whole genome sequencing and supervised machine learning provides unambiguous identification of eae-positive Shiga toxin-producing Escherichia coli. Front Microbiol 2023; 14:1118158. [PMID: 37250024 PMCID: PMC10213463 DOI: 10.3389/fmicb.2023.1118158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 04/21/2023] [Indexed: 05/31/2023] Open
Abstract
Introduction The objective of this study was to develop, using a genome-wide machine learning approach, an unambiguous model to predict the presence of highly pathogenic STEC in E. coli read assemblies derived from complex samples containing potentially multiple E. coli strains. Our approach has taken into account the high genomic plasticity of E. coli and utilized the stratification of STEC and E. coli pathogroup classification based on the serotype and virulence factors to identify specific combinations of biomarkers for improved characterization of eae-positive STEC (also named EHEC, for enterohemorrhagic E. coli), which are associated with bloody diarrhea and hemolytic uremic syndrome (HUS) in humans. Methods The Machine Learning (ML) approach was used in this study on a large curated dataset composed of 1,493 E. coli genome sequences and 1,178 Coding Sequences (CDS). Feature selection was performed using eight classification algorithms, resulting in a reduction of the number of CDS to six. From this reduced dataset, the eight ML models were trained with hyper-parameter tuning and cross-validation steps. Results and discussion It is remarkable that, using only these six genes, EHEC can be clearly identified from E. coli read assemblies obtained from in silico mixtures and complex samples such as milk metagenomes. These various combinations of discriminative biomarkers can be implemented as novel marker genes for the unambiguous characterization of EHEC from mixtures of different E. coli strains as well as from raw milk metagenomes.
Affiliation(s)
- Fabien Vorimore
  - ANSES, Laboratory for Food Safety, Genomics Platform IdentyPath, Maisons-Alfort, France
- Sandra Jaudou
  - ANSES, Laboratory for Food Safety, Genomics Platform IdentyPath, Maisons-Alfort, France
  - ANSES, Laboratory for Food Safety, COLiPATH Unit, Maisons-Alfort, France
- Mai-Lan Tran
  - ANSES, Laboratory for Food Safety, Genomics Platform IdentyPath, Maisons-Alfort, France
  - ANSES, Laboratory for Food Safety, COLiPATH Unit, Maisons-Alfort, France
- Hugues Richard
  - Bioinformatics Unit, Genome Competence Center (MF1), Robert Koch Institute, Berlin, Germany
- Patrick Fach
  - ANSES, Laboratory for Food Safety, Genomics Platform IdentyPath, Maisons-Alfort, France
  - ANSES, Laboratory for Food Safety, COLiPATH Unit, Maisons-Alfort, France
- Sabine Delannoy
  - ANSES, Laboratory for Food Safety, Genomics Platform IdentyPath, Maisons-Alfort, France
  - ANSES, Laboratory for Food Safety, COLiPATH Unit, Maisons-Alfort, France
|
21
|
Pandey D, Singhal N, Kumar M. β-LacFamPred: An online tool for prediction and classification of β-lactamase class, subclass, and family. Front Microbiol 2023; 13:1039687. [PMID: 36713195 PMCID: PMC9878453 DOI: 10.3389/fmicb.2022.1039687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 12/19/2022] [Indexed: 01/13/2023] Open
Abstract
β-Lactams are a broad class of antimicrobial agents with a high safety profile, making them the most widely used class in clinical, agricultural, and veterinary settings. The widespread use of β-lactams has induced the extensive spread of β-lactam-hydrolyzing enzymes known as β-lactamases (BLs). To neutralize the effect of β-lactamases, newer generations of β-lactams have been developed, which ultimately led to the evolution of a highly diverse family of BLs. Based on sequence homology, BLs are categorized into four classes, A-D, in Ambler's classification system. Further, each class is subdivided into families. Class B is first divided into subclasses B1-B3, and then each subclass is divided into families. The class to which a BL belongs gives considerable insight into its hydrolytic profile. Traditional methods of determining the hydrolytic profile of BLs and their classification are time-consuming and resource-intensive. Hence we developed a machine-learning-based in silico method, named β-LacFamPred, for the prediction and annotation of Ambler's class, subclass, and 96 families of BLs. During leave-one-out cross-validation, all but one of the β-LacFamPred model HMMs showed 100% accuracy. Benchmarking against other BL family prediction methods showed β-LacFamPred to be the most accurate. Out of 60 penicillin-binding proteins (PBPs) and 57 glyoxalase II proteins, β-LacFamPred correctly predicted 56 PBPs and none of the glyoxalase II sequences as non-BLs. Proteome-wide annotation of BLs by β-LacFamPred showed far fewer false-positive predictions than the recently developed BL class prediction tool DeepBL. β-LacFamPred is available as a web server at http://proteininformatics.org/mkumar/blacfampred and as a standalone tool in the GitHub repository https://github.com/mkubiophysics/B-LacFamPred.
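The leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the β-LacFamPred pipeline: the tool uses HMMs, whereas the classifier below is a deliberately simple nearest-centroid stand-in on a single numeric feature, and all function names and data are hypothetical.

```python
from collections import defaultdict


def train_centroids(values, labels):
    """Stand-in model (an assumption, not an HMM): mean feature value per class."""
    grouped = defaultdict(list)
    for value, label in zip(values, labels):
        grouped[label].append(value)
    return {label: sum(vals) / len(vals) for label, vals in grouped.items()}


def predict_centroid(centroids, value):
    """Return the class whose centroid lies closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def leave_one_out_accuracy(values, labels, train, predict):
    """Hold out each sample once, train on the rest, and score the held-out sample."""
    if not values or len(values) != len(labels):
        raise ValueError("values and labels must be non-empty and equally long")
    correct = 0
    for i in range(len(values)):
        model = train(values[:i] + values[i + 1:], labels[:i] + labels[i + 1:])
        if predict(model, values[i]) == labels[i]:
            correct += 1
    return correct / len(values)


# Hypothetical toy data: one numeric feature, two classes.
scores = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]
classes = ["A", "A", "A", "B", "B", "B"]
accuracy = leave_one_out_accuracy(scores, classes, train_centroids, predict_centroid)
```

Because every sample is held out exactly once, a class with a single member can never be predicted correctly in this scheme, which is one reason a family-level model might fall short of 100% in such a test.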
|
22
|
Wang CC, Hung YT, Chou CY, Hsuan SL, Chen ZW, Chang PY, Jan TR, Tung CW. Using random forest to predict antimicrobial minimum inhibitory concentrations of nontyphoidal Salmonella in Taiwan. Vet Res 2023; 54:11. [PMID: 36747286 PMCID: PMC9903507 DOI: 10.1186/s13567-023-01141-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/10/2022] [Accepted: 01/13/2023] [Indexed: 02/08/2023] Open
Abstract
Antimicrobial resistance (AMR) is a global health issue, and surveillance of AMR can be useful for understanding AMR trends and planning intervention strategies. Salmonella, widely distributed in food-producing animals, has been considered the first priority for inclusion in the AMR surveillance program by the World Health Organization (WHO). Recent advances in rapid and affordable whole-genome sequencing (WGS) techniques have led to the emergence of WGS as a one-stop test for predicting antimicrobial susceptibility. Since variation in sequencing and minimum inhibitory concentration (MIC) measurement methods could produce different results, this study aimed to develop WGS-based random forest models for predicting MIC values of 24 drugs using data generated by the same laboratories in Taiwan. The WGS data were transformed into a feature vector of 10-mers for machine learning. Based on rigorous validation and independent tests, good performance was obtained, with an average mean absolute error (MAE) of less than 1 for both the validation and the independent test. Feature selection was then applied to identify top-ranked 10-mers that can further improve prediction performance. For surveillance purposes, genome sequence-based machine learning methods could be used to monitor the difference between predicted and experimental MIC, where a large difference might warrant investigation of emerging genomic determinants.
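The two quantities named in this abstract, a 10-mer feature vector and the mean absolute error, can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names, the toy sequence, and the vocabulary are assumptions, the random forest itself is omitted, and the scale on which the MIC values are compared is not stated in the abstract.

```python
from collections import Counter


def kmer_counts(sequence, k=10):
    """Count every overlapping k-mer in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def feature_vector(counts, vocabulary):
    """Order k-mer counts by a fixed vocabulary so all genomes share one layout."""
    return [counts.get(kmer, 0) for kmer in vocabulary]


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and equally long")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical toy inputs, not data from the study.
counts = kmer_counts("acgtacgtacgta")            # 13 bases give 4 overlapping 10-mers
vector = feature_vector(counts, ["ACGTACGTAC", "TTTTTTTTTT"])
mae = mean_absolute_error([1, 2, 3], [1, 3, 5])  # (0 + 1 + 2) / 3
```

A fixed vocabulary matters because a model needs the same feature at the same position for every genome; 10-mers absent from a genome simply receive a count of zero.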
Affiliation(s)
- Chia-Chi Wang
  - Department and Graduate Institute of Veterinary Medicine, School of Veterinary Medicine, National Taiwan University, Taipei, 106, Taiwan
- Yu-Ting Hung
  - Animal Technology Laboratories, Agricultural Technology Research Institute, Hsinchu City, 300, Taiwan
  - Graduate Institute of Veterinary Pathobiology, College of Veterinary Medicine, National Chung Hsing University, Taichung, 402, Taiwan
- Che-Yu Chou
  - Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, 106, Taiwan
- Shih-Ling Hsuan
  - Graduate Institute of Veterinary Pathobiology, College of Veterinary Medicine, National Chung Hsing University, Taichung, 402, Taiwan
- Zeng-Weng Chen
  - Animal Technology Laboratories, Agricultural Technology Research Institute, Hsinchu City, 300, Taiwan
- Pei-Yu Chang
  - Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County, 350, Taiwan
- Tong-Rong Jan
  - Department and Graduate Institute of Veterinary Medicine, School of Veterinary Medicine, National Taiwan University, Taipei, 106, Taiwan
- Chun-Wei Tung
  - Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, 106, Taiwan
  - Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County, 350, Taiwan
|
23
|
Yee R, Simner PJ. Next-Generation Sequencing Approaches to Predicting Antimicrobial Susceptibility Testing Results. Clin Lab Med 2022; 42:557-572. [PMID: 36368782 DOI: 10.1016/j.cll.2022.09.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Rebecca Yee
  - Division of Medical Microbiology, Department of Pathology, Johns Hopkins University School of Medicine, Meyer B1-193, 600 North Wolfe Street, Baltimore, MD 21287-7093, USA
- Patricia J Simner
  - Division of Medical Microbiology, Department of Pathology, Johns Hopkins University School of Medicine, Meyer B1-193, 600 North Wolfe Street, Baltimore, MD 21287-7093, USA
|
24
|
Jesus HNR, Rocha DJPG, Ramos RTJ, Silva A, Brenig B, Góes-Neto A, Costa MM, Soares SC, Azevedo V, Aguiar ERGR, Martínez-Martínez L, Ocampo A, Alibi S, Dorta A, Pacheco LGC, Navas J. Pan-genomic analysis of Corynebacterium amycolatum gives insights into molecular mechanisms underpinning the transition to a pathogenic phenotype. Front Microbiol 2022; 13:1011578. [DOI: 10.3389/fmicb.2022.1011578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 10/17/2022] [Indexed: 11/17/2022] Open
Abstract
Corynebacterium amycolatum is a nonlipophilic coryneform that is increasingly recognized as a relevant human and animal pathogen showing multidrug resistance (MDR) to commonly used antibiotics. However, little is known about the molecular mechanisms involved in the transition from colonization to the MDR invasive phenotype in clinical isolates. In this study, we performed a comprehensive pan-genomic analysis of C. amycolatum, including 26 isolates from different countries. We obtained novel genome sequences for 8 of them, which are multidrug-resistant clinical isolates from Spain and Tunisia. They were analyzed together with 18 other complete or draft C. amycolatum genomes retrieved from GenBank. The species C. amycolatum presented an open pan-genome (α = 0.854905) with 3,280 gene families, of which 1,690 (51.52%) belong to the core genome, 1,121 (34.17%) are accessory genes, and 469 (14.29%) are unique genes. Although some classic corynebacterial virulence factors are absent in C. amycolatum, we did identify genes associated with immune evasion, toxins, and antiphagocytosis among the predicted putative virulence factors. Additionally, we found genomic evidence for extensive acquisition of antimicrobial resistance genes through genomic islands.
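The core, accessory, and unique categories reported above (1,690 + 1,121 + 469 = 3,280 gene families) follow from how many genomes carry each gene family. A minimal Python sketch of that partition is given below; the function name and the toy genomes are hypothetical, and real pan-genome tools first cluster genes into families, a step omitted here.

```python
def partition_pan_genome(genomes):
    """Split gene families into core (in all genomes), unique (in exactly one),
    and accessory (everything in between).

    genomes maps a genome name to the set of gene families it carries.
    """
    n_genomes = len(genomes)
    occurrence = {}
    for families in genomes.values():
        for family in families:
            occurrence[family] = occurrence.get(family, 0) + 1
    core = {f for f, count in occurrence.items() if count == n_genomes}
    # With a single genome every family is core, so nothing is called unique.
    unique = {f for f, count in occurrence.items() if count == 1 and n_genomes > 1}
    accessory = set(occurrence) - core - unique
    return {"core": core, "accessory": accessory, "unique": unique}


# Hypothetical toy input: three genomes, four gene families.
toy = {
    "isolate_A": {"g1", "g2", "g3"},
    "isolate_B": {"g1", "g2"},
    "isolate_C": {"g1", "g4"},
}
parts = partition_pan_genome(toy)
```

In the toy input, `g1` is core, `g2` is accessory, and `g3` and `g4` are unique.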
|
25
|
Wang H, Jia C, Li H, Yin R, Chen J, Li Y, Yue M. Paving the way for precise diagnostics of antimicrobial resistant bacteria. Front Mol Biosci 2022; 9:976705. [PMID: 36032670 PMCID: PMC9413203 DOI: 10.3389/fmolb.2022.976705] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/23/2022] [Accepted: 07/19/2022] [Indexed: 12/26/2022] Open
Abstract
Antimicrobial resistance (AMR) in bacterial pathogens emerges frequently and disseminates rapidly under sustained antimicrobial exposure in human-dominated communities, posing one of the biggest challenges facing humanity. The frequent incidence of common but untreatable infections reveals a public health catastrophe in which antimicrobial-resistant pathogens have outpaced the available countermeasures, a situation amplified during the COVID-19 pandemic. Advances in biotechnology and machine learning now help create more fundamental knowledge of the distinct spatiotemporal dynamics of AMR bacterial adaptation and evolution. Integrated with reliable diagnostic tools and powerful analytic approaches, a collaborative and systematic surveillance platform with high accuracy and predictability should be established and implemented, not only as an effective strategy for controlling AMR but also to protect the longevity of valuable antimicrobials now and in the future.
Affiliation(s)
- Hao Wang
  - Institute of Preventive Veterinary Sciences & Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou, China
- Chenhao Jia
  - Institute of Preventive Veterinary Sciences & Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou, China
  - Hainan Institute of Zhejiang University, Sanya, China
- Hongzhao Li
  - Institute of Preventive Veterinary Sciences & Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou, China
  - Hainan Institute of Zhejiang University, Sanya, China
- Rui Yin
  - Institute of Preventive Veterinary Sciences & Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou, China
- Jiang Chen
  - Department of Microbiology, Zhejiang Provincial Center for Disease Control and Prevention, Hangzhou, China
  - *Correspondence: Jiang Chen; Yan Li; Min Yue
- Yan Li
  - Institute of Preventive Veterinary Sciences & Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou, China
  - Hainan Institute of Zhejiang University, Sanya, China
  - Zhejiang Provincial Key Laboratory of Preventive Veterinary Medicine, Hangzhou, China
  - *Correspondence: Jiang Chen; Yan Li; Min Yue
- Min Yue
  - Institute of Preventive Veterinary Sciences & Department of Veterinary Medicine, Zhejiang University College of Animal Sciences, Hangzhou, China
  - Hainan Institute of Zhejiang University, Sanya, China
  - Zhejiang Provincial Key Laboratory of Preventive Veterinary Medicine, Hangzhou, China
  - State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, The First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou, China
  - *Correspondence: Jiang Chen; Yan Li; Min Yue
|
26
|
Nielsen TK, Browne PD, Hansen LH. Antibiotic resistance genes are differentially mobilized according to resistance mechanism. Gigascience 2022; 11:6652189. [PMID: 35906888 PMCID: PMC9338424 DOI: 10.1093/gigascience/giac072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/13/2022] [Revised: 05/16/2022] [Accepted: 06/24/2022] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND: Screening for antibiotic resistance genes (ARGs) with (meta)genomic sequencing, especially in environmental samples, is associated with false-positive predictions of phenotypic resistance. This stems from the fact that most acquired ARGs must be overexpressed before they confer resistance, which is often caused by decontextualization of putative ARGs by mobile genetic elements (MGEs). The consequent overexpression of ARGs can be caused by strong promoters often present in insertion sequence (IS) elements and integrons, and by the copy number effect of plasmids, which may contribute to high expression of accessory genes. RESULTS: Here, we screen all complete bacterial RefSeq genomes for ARGs. The genetic contexts of detected ARGs are investigated for IS elements, integrons, plasmids, and phylogenetic dispersion. The ARG-MOB scale is proposed, which indicates how mobilized detected ARGs are in bacterial genomes. It is concluded that antibiotic efflux genes are rarely mobilized and that even 80% of β-lactamases have never, or very rarely, been mobilized in the 15,790 studied genomes. However, some ARGs are indeed mobilized and co-occur with IS elements, plasmids, and integrons. CONCLUSIONS: In this study, ARGs in all complete bacterial genomes are classified by their association with MGEs, using the proposed ARG-MOB scale. These results have consequences for the design and interpretation of studies screening for resistance determinants, as mobilized ARGs pose a more concrete risk to human health. An interactive table of all results is provided for future studies targeting highly mobilized ARGs.
Affiliation(s)
- Tue Kjærgaard Nielsen
  - Department of Plant and Environmental Sciences, Section for Environmental Microbiology and Biotechnology, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C 1871, Denmark
- Patrick Denis Browne
  - Department of Plant and Environmental Sciences, Section for Environmental Microbiology and Biotechnology, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C 1871, Denmark
- Lars Hestbjerg Hansen
  - Department of Plant and Environmental Sciences, Section for Environmental Microbiology and Biotechnology, University of Copenhagen, Thorvaldsensvej 40, Frederiksberg C 1871, Denmark
|
27
|
Marciano DC, Wang C, Hsu TK, Bourquard T, Atri B, Nehring RB, Abel NS, Bowling EA, Chen TJ, Lurie PD, Katsonis P, Rosenberg SM, Herman C, Lichtarge O. Evolutionary action of mutations reveals antimicrobial resistance genes in Escherichia coli. Nat Commun 2022; 13:3189. [PMID: 35680894 PMCID: PMC9184624 DOI: 10.1038/s41467-022-30889-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2021] [Accepted: 05/24/2022] [Indexed: 11/08/2022] Open
Abstract
Since antibiotic development lags, we search for potential drug targets through directed evolution experiments. A challenge is that many resistance genes hide in a noisy mutational background as mutator clones emerge in the adaptive population. Here, to overcome this noise, we quantify the impact of mutations through evolutionary action (EA). After sequencing ciprofloxacin- or colistin-resistant strains grown under different mutational regimes, we find that an elevated sum of the evolutionary action of mutations in a gene identifies known resistance drivers. This EA integration approach also suggests new antibiotic resistance genes, which are then shown to provide a fitness advantage in competition experiments. Moreover, EA integration analysis of antibiotic-resistant clinical and environmental isolates of E. coli identifies gene drivers of resistance where a standard approach fails. Together these results inform the genetic basis of de novo colistin resistance and support the robust discovery of phenotype-driving genes via the evolutionary action of genetic perturbations in fitness landscapes.
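The core idea of summing evolutionary action per gene can be sketched as follows. This is a simplified illustration, not the authors' EA integration method: the per-mutation scores, gene names, and function name are hypothetical, and any statistical comparison against the mutational background is left out.

```python
from collections import defaultdict


def rank_genes_by_summed_ea(mutations):
    """Sum per-mutation EA scores within each gene and rank genes by the total.

    mutations is an iterable of (gene, ea_score) pairs; ties break alphabetically.
    """
    totals = defaultdict(float)
    for gene, ea_score in mutations:
        totals[gene] += ea_score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical observations pooled across sequenced resistant strains.
observed = [
    ("geneA", 80.0),
    ("geneA", 65.0),
    ("geneB", 10.0),
    ("geneC", 30.0),
    ("geneB", 5.0),
]
ranking = rank_genes_by_summed_ea(observed)
```

Here `geneA` ranks first because it accumulates two high-impact mutations, whereas genes hit only by low-impact mutations fall to the bottom even when they are mutated equally often.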
Affiliation(s)
- David C Marciano
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Chen Wang
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Teng-Kuei Hsu
  - The Verna and Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Thomas Bourquard
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Benu Atri
  - Structural and Computational Biology & Molecular Biophysics Program, Baylor College of Medicine, Houston, TX, 77030, USA
  - Clara Analytics Inc., 451 El Camino Real #201, Santa Clara, CA, 95050, USA
- Ralf B Nehring
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - The Verna and Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Nicholas S Abel
  - Department of Pharmacology, Baylor College of Medicine, Houston, TX, 77030, USA
- Elizabeth A Bowling
  - The Verna and Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
- Taylor J Chen
  - Integrative Molecular & Biomedical Biosciences Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Pamela D Lurie
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Susan M Rosenberg
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - The Verna and Marrs McLean Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Integrative Molecular & Biomedical Biosciences Program, Baylor College of Medicine, Houston, TX, 77030, USA
- Christophe Herman
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, TX, 77030, USA
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
  - Structural and Computational Biology & Molecular Biophysics Program, Baylor College of Medicine, Houston, TX, 77030, USA
  - Dan L. Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, 77030, USA
  - Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX, 77030, USA
|
28
|
Qian W, Li X, Yang M, Liu C, Kong Y, Li Y, Wang T, Zhang Q. Relationship Between Antibiotic Resistance, Biofilm Formation, and Biofilm-Specific Resistance in Escherichia coli Isolates from Ningbo, China. Infect Drug Resist 2022; 15:2865-2878. [PMID: 35686192 PMCID: PMC9172925 DOI: 10.2147/idr.s363652] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/03/2022] [Accepted: 05/17/2022] [Indexed: 01/09/2023] Open
Affiliation(s)
- Weidong Qian
  - School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an, 710021, People’s Republic of China
- Xinchen Li
  - School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an, 710021, People’s Republic of China
- Min Yang
  - School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an, 710021, People’s Republic of China
- Chanchan Liu
  - Xi’an Medical College, Xi’an, 710309, People’s Republic of China
- Yi Kong
  - Research Center for Tissue Repair and Regeneration Affiliated to the Medical Innovation Research Department, the General Hospital of the People’s Liberation Army, Beijing, 100048, People’s Republic of China
- Yongdong Li
  - Ningbo Municipal Center for Disease Control and Prevention, Ningbo, 315010, People’s Republic of China
- Ting Wang
  - School of Food and Biological Engineering, Shaanxi University of Science and Technology, Xi’an, 710021, People’s Republic of China
  - Correspondence: Ting Wang; Qian Zhang, Tel +10 29-86168583
- Qian Zhang
  - Department of Dermatology, Huazhong University of Science and Technology Union Shenzhen Hospital, Shenzhen, 518004, People’s Republic of China
|
29
|
Machine Learning for Antimicrobial Resistance Prediction: Current Practice, Limitations, and Clinical Perspective. Clin Microbiol Rev 2022; 35:e0017921. [PMID: 35612324 DOI: 10.1128/cmr.00179-21] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Indexed: 12/17/2022] Open
Abstract
Antimicrobial resistance (AMR) is a global health crisis that poses a great threat to modern medicine. Effective prevention strategies are urgently required to slow the emergence and further dissemination of AMR. Given the availability of data sets encompassing hundreds or thousands of pathogen genomes, machine learning (ML) is increasingly being used to predict resistance to different antibiotics in pathogens based on gene content and genome composition. A key objective of this work is to advocate for the incorporation of ML into front-line settings but also highlight the further refinements that are necessary to safely and confidently incorporate these methods. The question of what to predict is not trivial given the existence of different quantitative and qualitative laboratory measures of AMR. ML models typically treat genes as independent predictors, with no consideration of structural and functional linkages; they also may not be accurate when new mutational variants of known AMR genes emerge. Finally, to have the technology trusted by end users in public health settings, ML models need to be transparent and explainable to ensure that the basis for prediction is clear. We strongly advocate that the next set of AMR-ML studies should focus on the refinement of these limitations to be able to bridge the gap to diagnostic implementation.
|
30
|
Youn J, Rai N, Tagkopoulos I. Knowledge integration and decision support for accelerated discovery of antibiotic resistance genes. Nat Commun 2022; 13:2360. [PMID: 35487919 PMCID: PMC9055065 DOI: 10.1038/s41467-022-29993-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2021] [Accepted: 03/04/2022] [Indexed: 11/09/2022] Open
Abstract
We present a machine learning framework to automate knowledge discovery through knowledge graph construction, inconsistency resolution, and iterative link prediction. By incorporating knowledge from 10 publicly available sources, we construct an Escherichia coli antibiotic resistance knowledge graph with 651,758 triples from 23 triple types after resolving 236 sets of inconsistencies. Iteratively applying link prediction to this graph and validating the generated hypotheses in the wet lab reveal 15 E. coli antibiotic resistance genes, 6 of which have never been associated with antibiotic resistance in any microbe. Iterative link prediction leads to improved performance and more findings. The predicted probability of a positive finding correlates highly with experimentally validated findings (R² = 0.94). We also identify 5 homologs in Salmonella enterica that are all validated to confer resistance to antibiotics. This work demonstrates how evidence-driven decisions are a step toward automating knowledge discovery with high confidence and at an accelerated pace, thereby substituting for traditional time-consuming and expensive methods.
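Two of the steps named in this abstract, detecting inconsistent triples and listing untested gene-antibiotic pairs as candidates for link prediction, can be sketched in Python. Everything here is an assumption made for illustration: the predicate strings, gene and drug names, and function names are hypothetical, and the scoring model that would rank the candidate links is not shown.

```python
from collections import defaultdict

# Hypothetical predicate names; the real graph uses 23 triple types.
POSITIVE = "confers resistance to antibiotic"
NEGATIVE = "confers no resistance to antibiotic"


def find_inconsistencies(triples):
    """Return (subject, object) pairs asserted with both contradictory predicates."""
    predicates_by_pair = defaultdict(set)
    for subject, predicate, obj in triples:
        predicates_by_pair[(subject, obj)].add(predicate)
    return sorted(
        pair
        for pair, predicates in predicates_by_pair.items()
        if POSITIVE in predicates and NEGATIVE in predicates
    )


def candidate_links(triples, genes, antibiotics):
    """List gene-antibiotic pairs with no resistance statement either way."""
    known = {(s, o) for s, p, o in triples if p in (POSITIVE, NEGATIVE)}
    return [(g, a) for g in genes for a in antibiotics if (g, a) not in known]


# Hypothetical toy graph.
graph = [
    ("geneA", POSITIVE, "drug1"),
    ("geneA", NEGATIVE, "drug1"),
    ("geneB", POSITIVE, "drug2"),
]
conflicts = find_inconsistencies(graph)
candidates = candidate_links(graph, ["geneA", "geneB"], ["drug1", "drug2"])
```

In an iterative setup of the kind the abstract describes, validated candidates would be added back to the graph as new triples before the next round of prediction.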
Affiliation(s)
- Jason Youn
  - Department of Computer Science, University of California, Davis, CA, 95616, USA
  - Genome Center, University of California, Davis, CA, 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA, 95616, USA
- Navneet Rai
  - Department of Computer Science, University of California, Davis, CA, 95616, USA
  - Genome Center, University of California, Davis, CA, 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA, 95616, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, CA, 95616, USA
  - Genome Center, University of California, Davis, CA, 95616, USA
  - USDA/NSF AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, CA, 95616, USA
|
31
|
Yang MR, Wu YW. Enhancing predictions of antimicrobial resistance of pathogens by expanding the potential resistance gene repertoire using a pan-genome-based feature selection approach. BMC Bioinformatics 2022; 23:131. [PMID: 35428201 PMCID: PMC9011928 DOI: 10.1186/s12859-022-04666-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 04/04/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Predicting which pathogens might exhibit antimicrobial resistance (AMR) based on genomics data is one of the promising ways to swiftly and precisely identify AMR pathogens. Currently, the most widely used genomics approach is to identify known AMR genes from genomic information in order to predict whether a pathogen might be resistant to certain antibiotic drugs. The list of known AMR genes, however, is still far from comprehensive and may result in inaccurate AMR pathogen predictions. We thus felt the need to expand the AMR gene set and proposed a pan-genome-based feature selection method to identify potential gene sets for AMR prediction purposes. Results: By building pan-genome datasets and extracting gene presence/absence patterns from four bacterial species, each with more than 2000 strains, we showed that machine learning models built from pan-genome data can be very promising for predicting AMR pathogens. The gene set selected by the eXtreme Gradient Boosting (XGBoost) feature selection approach further improved prediction outcomes, and an incremental approach that selects subsets of the XGBoost-selected features improved model performance further still. Investigating the selected gene sets revealed that on average about 50% of genes had no known function and very few of them were known AMR genes, indicating the potential of the selected gene sets to expand resistance gene repertoires. Conclusions: We demonstrated that a pan-genome-based feature selection approach is suitable for building machine learning models for predicting AMR pathogens. The extracted gene sets may provide future clues to expand our knowledge of known AMR genes and provide novel hypotheses for inferring bacterial AMR mechanisms. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04666-2.
|
32
|
Peng Z, Maciel-Guerra A, Baker M, Zhang X, Hu Y, Wang W, Rong J, Zhang J, Xue N, Barrow P, Renney D, Stekel D, Williams P, Liu L, Chen J, Li F, Dottorini T. Whole-genome sequencing and gene sharing network analysis powered by machine learning identifies antibiotic resistance sharing between animals, humans and environment in livestock farming. PLoS Comput Biol 2022; 18:e1010018. [PMID: 35333870 PMCID: PMC8986120 DOI: 10.1371/journal.pcbi.1010018] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/19/2021] [Revised: 04/06/2022] [Accepted: 03/14/2022] [Indexed: 01/26/2023] Open
Abstract
Anthropogenic environments, such as those created by intensive farming of livestock, have been proposed to provide ideal selection pressure for the emergence of antimicrobial-resistant Escherichia coli bacteria and antimicrobial resistance genes (ARGs) and their spread to humans. Here, we performed a longitudinal study in a large-scale commercial poultry farm in China, collecting E. coli isolates from both farm and slaughterhouse and targeting animals, carcasses, workers and their households, and the environment. By using whole-genome phylogenetic analysis and network analysis based on single nucleotide polymorphisms (SNPs), we found highly interrelated non-pathogenic and pathogenic E. coli strains with phylogenetic intermixing, and a high prevalence of shared multidrug resistance profiles amongst livestock, humans and the environment. Through an original data processing pipeline that combines omics, machine learning, gene sharing network and mobile genetic elements analysis, we investigated resistance to 26 different antimicrobials and identified 361 genes associated with antimicrobial resistance (AMR) phenotypes; 58 of these were known AMR-associated genes and 35 were associated with multidrug resistance. We uncovered an extensive network of genes, correlated with AMR phenotypes, shared among livestock, humans, and farm and slaughterhouse environments. We also found several human, livestock and environmental isolates sharing closely related mobile genetic elements carrying ARGs across host species and environments. In a scenario where no consensus exists on how antibiotic use in livestock may affect antibiotic resistance in the human population, our findings provide novel insights into the broader epidemiology of antimicrobial resistance in livestock farming. Moreover, our original data analysis method has the potential to uncover AMR transmission pathways when applied to the study of other pathogens active in other anthropogenic environments characterised by complex interconnections between host species.
Livestock have been suggested as an important source of antimicrobial-resistant (AMR) Escherichia coli, capable of infecting humans and carrying resistance to drugs used in human medicine. China has a large intensive livestock farming industry, poultry being the second most important source of meat in the country, and is the largest user of antibiotics for food production in the world. Here we studied antimicrobial resistance gene overlap between E. coli isolates collected from humans, livestock and their shared environments in a large-scale Chinese poultry farm and associated slaughterhouse. By using a computational approach that integrates machine learning, whole-genome sequencing, gene sharing network and mobile genetic elements analysis, we characterized the E. coli community structure, antimicrobial resistance phenotypes and the genetic relatedness of non-pathogenic and pathogenic E. coli strains. We uncovered the network of genes, associated with AMR, shared across host species (animals and workers) and environments (farm and slaughterhouse). Our approach opens up new avenues for the development of fast, affordable and effective computational solutions that provide novel insights into the broader epidemiology of antimicrobial resistance in livestock farming.
Affiliation(s)
- Zixin Peng
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Alexandre Maciel-Guerra
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Michelle Baker
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Xibin Zhang
  - Qingdao Tian run Food Co., Ltd, New Hope, Beijing, People’s Republic of China
- Yue Hu
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Wei Wang
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Jia Rong
  - Qingdao Tian run Food Co., Ltd, New Hope, Beijing, People’s Republic of China
- Jing Zhang
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Ning Xue
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
- Paul Barrow
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
  - School of Veterinary Medicine, University of Surrey, Guildford, Surrey, United Kingdom
- David Renney
  - Nimrod Veterinary Products Limited, Moreton-in-Marsh, United Kingdom
- Dov Stekel
  - School of Biosciences, University of Nottingham, Sutton Bonington, United Kingdom
- Paul Williams
  - Biodiscovery Institute and School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
- Longhai Liu
  - Qingdao Tian run Food Co., Ltd, New Hope, Beijing, People’s Republic of China
- Junshi Chen
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
- Fengqin Li
  - NHC Key Laboratory of Food Safety Risk Assessment, Chinese Academy of Medical Science Research Unit (2019RU014), China National Center for Food Safety Risk Assessment, Beijing, People’s Republic of China
  - * E-mail: (FL); (TD)
- Tania Dottorini
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington, United Kingdom
  - * E-mail: (FL); (TD)
33
Shaban TF, Alkawareek MY. Prediction of qualitative antibiofilm activity of antibiotics using supervised machine learning techniques. Comput Biol Med 2022; 140:105065. [PMID: 34839184 DOI: 10.1016/j.compbiomed.2021.105065] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/07/2021] [Revised: 11/21/2021] [Accepted: 11/21/2021] [Indexed: 11/18/2022]
Abstract
Although biofilm-specific antibiotic susceptibility assays are available, they are time-consuming and resource-intensive, and hence they are not usually performed in clinical settings. Herein, we introduce a machine learning-based predictive modeling approach that uses routinely available and easily accessible data to qualitatively predict in vitro antibiofilm activity of antibiotics with relatively high accuracy. Three optimized models based on logistic regression, decision tree, and random forest algorithms were successfully developed in this study using data manually collected from published literature. In these models, independent variables that serve as significant predictors of antibiofilm activity are minimum inhibitory concentration, bacterial Gram type, biofilm formation method, in addition to antibiotic's mechanism of action, molecular weight, and pKa. The cross-validation method showed that the optimized models exhibit prediction accuracy of 67% ± 6.1% for the logistic regression model, 73% ± 5.8% for the decision tree model, and 74% ± 5% for the random forest model. However, the one-way ANOVA test revealed that the difference in prediction accuracy between the 3 models is not statistically significant, and hence they can be considered to have comparable performance. The presented modeling approach can serve as an alternative to the resource-intensive biofilm assays to rapidly and properly manage biofilm-associated infections, especially in resource-limited clinical settings.
Affiliation(s)
- Taqwa F Shaban
- School of Pharmacy, The University of Jordan, Amman, Jordan
34
VanOeffelen M, Nguyen M, Aytan-Aktug D, Brettin T, Dietrich EM, Kenyon RW, Machi D, Mao C, Olson R, Pusch GD, Shukla M, Stevens R, Vonstein V, Warren AS, Wattam AR, Yoo H, Davis JJ. A genomic data resource for predicting antimicrobial resistance from laboratory-derived antimicrobial susceptibility phenotypes. Brief Bioinform 2021; 22:bbab313. [PMID: 34379107 PMCID: PMC8575023 DOI: 10.1093/bib/bbab313] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/27/2021] [Revised: 06/18/2021] [Accepted: 07/20/2021] [Indexed: 11/14/2022] Open
Abstract
Antimicrobial resistance (AMR) is a major global health threat that affects millions of people each year. Funding agencies worldwide and the global research community have expended considerable capital and effort tracking the evolution and spread of AMR by isolating and sequencing bacterial strains and performing antimicrobial susceptibility testing (AST). For the last several years, we have been capturing these efforts by curating data from the literature and data resources and building a set of assembled bacterial genome sequences that are paired with laboratory-derived AST data. This collection currently contains AST data for over 67 000 genomes encompassing approximately 40 genera and over 100 species. In this paper, we describe the characteristics of this collection, highlighting areas where sampling is comparatively deep or shallow, and showing areas where attention is needed from the research community to improve sampling and tracking efforts. In addition to using the data to track the evolution and spread of AMR, it also serves as a useful starting point for building machine learning models for predicting AMR phenotypes. We demonstrate this by describing two machine learning models that are built from the entire dataset to show where the predictive power is comparatively high or low. This AMR metadata collection is freely available and maintained on the Bacterial and Viral Bioinformatics Center (BV-BRC) FTP site ftp://ftp.bvbrc.org/RELEASE_NOTES/PATRIC_genomes_AMR.txt.
Affiliation(s)
- Marcus Nguyen
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, IL, USA
- Derya Aytan-Aktug
  - National Food Institute, Technical University of Denmark, Kgs. Lyngby, Denmark
- Thomas Brettin
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Computing Environment and Life Sciences, Argonne National Laboratory, Argonne, IL, USA
- Emily M Dietrich
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Computing Environment and Life Sciences, Argonne National Laboratory, Argonne, IL, USA
- Ronald W Kenyon
  - Biocomplexity Institute and Initiative, University of Virginia, Virginia, USA
- Dustin Machi
  - Biocomplexity Institute and Initiative, University of Virginia, Virginia, USA
- Chunhong Mao
  - Biocomplexity Institute and Initiative, University of Virginia, Virginia, USA
- Robert Olson
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, IL, USA
- Gordon D Pusch
  - Fellowship for Interpretation of Genomes, Burr Ridge, IL, USA
- Maulik Shukla
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, IL, USA
- Rick Stevens
  - Computing Environment and Life Sciences, Argonne National Laboratory, Argonne, IL, USA
  - Department of Computer Science, University of Chicago, Chicago, IL, USA
- Andrew S Warren
  - Biocomplexity Institute and Initiative, University of Virginia, Virginia, USA
- Alice R Wattam
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, IL, USA
  - Biocomplexity Institute and Initiative, University of Virginia, Virginia, USA
- Hyunseung Yoo
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, IL, USA
- James J Davis
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
  - Data Science and Learning Division, Argonne National Laboratory, Argonne, IL, USA
  - Northwestern Argonne Institute for Science and Engineering, Evanston, IL, USA
35
He S, Leanse LG, Feng Y. Artificial intelligence and machine learning assisted drug delivery for effective treatment of infectious diseases. Adv Drug Deliv Rev 2021; 178:113922. [PMID: 34461198 DOI: 10.1016/j.addr.2021.113922] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/01/2021] [Revised: 07/14/2021] [Accepted: 08/09/2021] [Indexed: 12/23/2022]
Abstract
In the era of antimicrobial resistance, the prevalence of multidrug-resistant microorganisms that resist conventional antibiotic treatment has steadily increased. Thus, it is now unquestionable that infectious diseases are significant global burdens that urgently require innovative treatment strategies. Emerging studies have demonstrated that artificial intelligence (AI) can transform drug delivery to promote effective treatment of infectious diseases. In this review, we propose to evaluate the significance, essential principles, and popular tools of AI in drug delivery for infectious disease treatment. Specifically, we will focus on the achievements and key findings of current research, as well as the applications of AI on drug delivery throughout the whole antimicrobial treatment process, with an emphasis on drug development, treatment regimen optimization, drug delivery system and administration route design, and drug delivery outcome prediction. To that end, the challenges of AI in drug delivery for infectious disease treatments and their current solutions and future perspective will be presented and discussed.
Affiliation(s)
- Sheng He
  - Boston Children's Hospital, Harvard Medical School, Harvard University, Boston, MA, USA
- Leon G Leanse
  - Massachusetts General Hospital, Harvard Medical School, Harvard University, Boston, MA, USA
- Yanfang Feng
  - Massachusetts General Hospital, Harvard Medical School, Harvard University, Boston, MA, USA
36
Uddin TM, Chakraborty AJ, Khusro A, Zidan BRM, Mitra S, Emran TB, Dhama K, Ripon MKH, Gajdács M, Sahibzada MUK, Hossain MJ, Koirala N. Antibiotic resistance in microbes: History, mechanisms, therapeutic strategies and future prospects. J Infect Public Health 2021; 14:1750-1766. [PMID: 34756812 DOI: 10.1016/j.jiph.2021.10.020] [Citation(s) in RCA: 246] [Impact Index Per Article: 82.0] [Received: 08/23/2021] [Revised: 10/04/2021] [Accepted: 10/14/2021] [Indexed: 12/22/2022] Open
Abstract
Antibiotics have been used to cure bacterial infections for more than 70 years, and these low-molecular-weight bioactive agents have also been used for a variety of other medicinal applications. In the battle against microbes, antibiotics have certainly been a blessing to human civilization by saving millions of lives. Globally, infections caused by multidrug-resistant (MDR) bacteria are on the rise. Antibiotics are being used to combat diversified bacterial infections. Synthetic biology techniques, in combination with molecular, functional genomic, and metagenomic studies of bacteria, plants, and even marine invertebrates, are aimed at unlocking the world's natural products faster than previous methods of antibiotic discovery. There are currently only a few viable remedies, potential preventive techniques, and a limited number of antibiotics, thereby necessitating the discovery of innovative medicinal approaches and antimicrobial therapies. MDR is also facilitated by biofilms, which makes infection control more complex. In this review, we comprehensively spotlight various aspects of antibiotics, viz. an overview of the antibiotic era, the modes of action of antibiotics, the development and mechanisms of antibiotic resistance in bacteria, and future strategies to fight the emerging threat of antimicrobial resistance.
Affiliation(s)
- Tanvir Mahtab Uddin
  - Department of Pharmacy, Faculty of Pharmacy, University of Dhaka, Dhaka 1000, Bangladesh.
- Arka Jyoti Chakraborty
  - Department of Pharmacy, Faculty of Pharmacy, University of Dhaka, Dhaka 1000, Bangladesh.
- Ameer Khusro
  - Research Department of Plant Biology and Biotechnology, Loyola College, Nungambakkam, Chennai, Tamil Nadu, India.
- Bm Redwan Matin Zidan
  - Department of Pharmacy, Faculty of Pharmacy, University of Dhaka, Dhaka 1000, Bangladesh.
- Saikat Mitra
  - Department of Pharmacy, Faculty of Pharmacy, University of Dhaka, Dhaka 1000, Bangladesh.
- Talha Bin Emran
  - Department of Pharmacy, BGC Trust University Bangladesh, Chittagong 4381, Bangladesh.
- Kuldeep Dhama
  - Division of Pathology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, Uttar Pradesh, India.
- Md Kamal Hossain Ripon
  - Department of Pharmacy, Mawlana Bhashani Science and Technology University, Santosh, Tangail 1902, Bangladesh.
- Márió Gajdács
  - Department of Oral Biology and Experimental Dental Research, Faculty of Dentistry, University of Szeged, 6720 Szeged, Hungary.
- Md Jamal Hossain
  - Department of Pharmacy, State University of Bangladesh, 77 Satmasjid Road, Dhanmondi, Dhaka 1205, Bangladesh.
- Niranjan Koirala
  - Department of Natural Products Research, Dr. Koirala Research Institute for Biotechnology and Biodiversity, Kathmandu 44600, Nepal.
37
Ren Y, Chakraborty T, Doijad S, Falgenhauer L, Falgenhauer J, Goesmann A, Hauschild AC, Schwengers O, Heider D. Prediction of antimicrobial resistance based on whole-genome sequencing and machine learning. Bioinformatics 2021; 38:325-334. [PMID: 34613360 PMCID: PMC8722762 DOI: 10.1093/bioinformatics/btab681] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 06/29/2021] [Revised: 08/27/2021] [Accepted: 09/24/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Antimicrobial resistance (AMR) is one of the biggest global problems threatening human and animal health. Rapid and accurate AMR diagnostic methods are thus very urgently needed. However, traditional antimicrobial susceptibility testing (AST) is time-consuming, low throughput and viable only for cultivable bacteria. Machine learning methods may pave the way for automated AMR prediction based on genomic data of the bacteria. However, comparing different machine learning methods for the prediction of AMR based on different encodings and whole-genome sequencing data without previously known knowledge remains to be done.
RESULTS: In this study, we evaluated logistic regression (LR), support vector machine (SVM), random forest (RF) and convolutional neural network (CNN) for the prediction of AMR for the antibiotics ciprofloxacin, cefotaxime, ceftazidime and gentamicin. We could demonstrate that these models can effectively predict AMR with label encoding, one-hot encoding and frequency matrix chaos game representation (FCGR encoding) on whole-genome sequencing data. We trained these models on a large AMR dataset and evaluated them on an independent public dataset. Generally, RFs and CNNs perform better than LR and SVM with AUCs up to 0.96. Furthermore, we were able to identify mutations that are associated with AMR for each antibiotic.
AVAILABILITY AND IMPLEMENTATION: Source code in data preparation and model training are provided at GitHub website (https://github.com/YunxiaoRen/ML-iAMR).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
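The three encodings named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the ML-iAMR repository: the function names, the integer codes, the base order and the CGR corner layout (A, C, G, T assigned to the four corners of the unit square) are assumptions chosen for illustration, and the paper's own conventions may differ.

```python
# Illustrative sketch only; names and conventions below are assumptions.
BASES = "ACGT"

# Assumed integer codes for label encoding; 0 marks any other symbol (e.g. N).
LABELS = {"A": 1, "C": 2, "G": 3, "T": 4}

# Assumed CGR corner layout as (x_bit, y_bit) per base.
CORNER = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def label_encode(seq):
    """Map each base to a single integer."""
    return [LABELS.get(base, 0) for base in seq.upper()]


def one_hot_encode(seq):
    """Map each base to a 4-element indicator vector (all zeros if unknown)."""
    return [[int(base == b) for b in BASES] for base in seq.upper()]


def kmer_cell(kmer):
    """Return the (row, col) grid cell of a k-mer in a 2^k x 2^k FCGR matrix."""
    x = y = 0
    for i, base in enumerate(kmer):
        bx, by = CORNER[base]
        # Later bases set more significant bits, so the last base
        # selects the coarsest quadrant, as in the chaos game.
        x |= bx << i
        y |= by << i
    return y, x


def fcgr(seq, k=3):
    """Count every k-mer of seq into its FCGR cell."""
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNER for base in kmer):
            continue  # skip k-mers containing ambiguous bases
        row, col = kmer_cell(kmer)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in fcgr("ACGTACGTTGCA", k=2):
        print(row)
```

The resulting matrix can be treated as a single-channel image, which is what makes the FCGR encoding a natural input for a CNN, whereas label and one-hot encodings keep the positional order of the sequence.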
Affiliation(s)
- Yunxiao Ren
  - Department of Data Science in Biomedicine, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg 35032, Germany
- Trinad Chakraborty
  - Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen 35392, Germany
  - German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
- Swapnil Doijad
  - Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen 35392, Germany
  - German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
- Linda Falgenhauer
  - German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
  - Institute of Hygiene and Environmental Medicine, Justus Liebig University Giessen, Giessen 35392, Germany
  - Hessisches universitäres Kompetenzzentrum Krankenhaushygiene, Giessen 35392, Germany
- Jane Falgenhauer
  - Institute of Medical Microbiology, Justus Liebig University Giessen, Giessen 35392, Germany
  - German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
- Alexander Goesmann
  - German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
  - Department of Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
- Anne-Christin Hauschild
  - Department of Data Science in Biomedicine, Faculty of Mathematics and Computer Science, Philipps-University of Marburg, Marburg 35032, Germany
- Oliver Schwengers
  - German Center for Infection Research, Partner site Giessen-Marburg-Langen, Giessen 35392, Germany
  - Department of Bioinformatics and Systems Biology, Justus Liebig University Giessen, Giessen 35392, Germany
38
Melo MCR, Maasch JRMA, de la Fuente-Nunez C. Accelerating antibiotic discovery through artificial intelligence. Commun Biol 2021; 4:1050. [PMID: 34504303 PMCID: PMC8429579 DOI: 10.1038/s42003-021-02586-0] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Received: 02/18/2021] [Accepted: 07/16/2021] [Indexed: 02/07/2023] Open
Abstract
By targeting invasive organisms, antibiotics insert themselves into the ancient struggle of the host-pathogen evolutionary arms race. As pathogens evolve tactics for evading antibiotics, therapies decline in efficacy and must be replaced, distinguishing antibiotics from most other forms of drug development. Together with a slow and expensive antibiotic development pipeline, the proliferation of drug-resistant pathogens drives urgent interest in computational methods that promise to expedite candidate discovery. Strides in artificial intelligence (AI) have encouraged its application to multiple dimensions of computer-aided drug design, with increasing application to antibiotic discovery. This review describes AI-facilitated advances in the discovery of both small molecule antibiotics and antimicrobial peptides. Beyond the essential prediction of antimicrobial activity, emphasis is also given to antimicrobial compound representation, determination of drug-likeness traits, antimicrobial resistance, and de novo molecular design. Given the urgency of the antimicrobial resistance crisis, we analyze uptake of open science best practices in AI-driven antibiotic discovery and argue for openness and reproducibility as a means of accelerating preclinical research. Finally, trends in the literature and areas for future inquiry are discussed, as artificially intelligent enhancements to drug discovery at large offer many opportunities for future applications in antibiotic development.
Affiliation(s)
- Marcelo C R Melo
  - Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
  - Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
- Jacqueline R M A Maasch
  - Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
  - Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Cesar de la Fuente-Nunez
  - Machine Biology Group, Departments of Psychiatry and Microbiology, Institute for Biomedical Informatics, Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Departments of Bioengineering and Chemical and Biomolecular Engineering, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
  - Penn Institute for Computational Science, University of Pennsylvania, Philadelphia, PA, USA
39
Bellabarba A, Bacci G, Decorosi F, Aun E, Azzarello E, Remm M, Giovannetti L, Viti C, Mengoni A, Pini F. Competitiveness for Nodule Colonization in Sinorhizobium meliloti: Combined In Vitro-Tagged Strain Competition and Genome-Wide Association Analysis. mSystems 2021. [PMID: 34313466 DOI: 10.1101/2020.09.15.298034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/07/2023] Open
Abstract
Associations between leguminous plants and symbiotic nitrogen-fixing rhizobia are a classic example of mutualism between a eukaryotic host and a specific group of prokaryotic microbes. Although this symbiosis is in part species specific, different rhizobial strains may colonize the same nodule. Some rhizobial strains are commonly known as better competitors than others, but detailed analyses that aim to predict rhizobial competitive abilities based on genomes are still scarce. Here, we performed a bacterial genome-wide association (GWAS) analysis to define the genomic determinants related to the competitive capabilities in the model rhizobial species Sinorhizobium meliloti. For this, 13 tester strains were green fluorescent protein (GFP) tagged and assayed versus 3 red fluorescent protein (RFP)-tagged reference competitor strains (Rm1021, AK83, and BL225C) in a Medicago sativa nodule occupancy test. Competition data and strain genomic sequences were employed to build a model for GWAS based on k-mers. Among the k-mers with the highest scores, 51 k-mers mapped on the genomes of four strains showing the highest competition phenotypes (>60% single strain nodule occupancy; GR4, KH35c, KH46, and SM11) versus BL225C. These k-mers were mainly located on the symbiosis-related megaplasmid pSymA, specifically on genes coding for transporters, proteins involved in the biosynthesis of cofactors, and proteins related to metabolism (e.g., fatty acids). The same analysis was performed considering the sum of single and mixed nodules obtained in the competition assays versus BL225C, retrieving k-mers mapped on the genes previously found and on vir genes. Therefore, the competition abilities seem to be linked to multiple genetic determinants and comprise several cellular components. IMPORTANCE Decoding the competitive pattern that occurs in the rhizosphere is challenging in the study of bacterial social interaction strategies. 
To date, the single-gene approach has mainly been used to uncover the bases of nodulation, but there is still a knowledge gap regarding the main features that a priori characterize rhizobial strains able to outcompete indigenous rhizobia. Therefore, tracking down which traits make different rhizobial strains able to win the competition for plant infection over other indigenous rhizobia will improve the strain selection process and, consequently, plant yield in sustainable agricultural production systems. We proved that a k-mer-based GWAS approach can efficiently identify the competition determinants of a panel of strains previously analyzed for their plant tissue occupancy using double fluorescent labeling. The reported strategy will be useful for detailed studies on the genomic aspects of the evolution of bacterial symbiosis and for an extensive evaluation of rhizobial inoculants.
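The first step of a k-mer-based association analysis like the one described above is a presence/absence table of k-mers across strains. The sketch below shows that step only, under stated assumptions: the helper names and the toy sequences are hypothetical, it does not reproduce the authors' pipeline, and a real analysis would go on to test each k-mer column against the phenotype.

```python
# Illustrative sketch only; helper names and toy data are hypothetical.
def kmer_set(seq, k):
    """Return the set of distinct k-mers occurring in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def presence_matrix(genomes, k):
    """Build a strain-by-k-mer presence/absence table.

    genomes maps a strain name to its sequence. Returns the sorted list of
    all observed k-mers and, per strain, a 0/1 vector in that same order.
    """
    sets = {name: kmer_set(seq, k) for name, seq in genomes.items()}
    all_kmers = sorted(set().union(*sets.values()))
    matrix = {
        name: [int(kmer in kmers) for kmer in all_kmers]
        for name, kmers in sets.items()
    }
    return all_kmers, matrix


if __name__ == "__main__":
    toy = {"strain_1": "ACGT", "strain_2": "CGTA"}  # made-up sequences
    kmers, table = presence_matrix(toy, k=3)
    print(kmers)
    print(table)
```

Each column of the table is one candidate genetic marker; k-mers that are present mainly in highly competitive strains are the ones an association test would rank highest.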
Affiliation(s)
- Agnese Bellabarba
  - Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
  - Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Giovanni Bacci
  - Department of Biology, University of Florence, Sesto Fiorentino, Italy
- Francesca Decorosi
  - Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
  - Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Erki Aun
  - Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Elisa Azzarello
  - Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Maido Remm
  - Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Luciana Giovannetti
  - Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
  - Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Carlo Viti
  - Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
  - Genexpress Laboratory, Department of Agronomy, Food, Environmental and Forestry (DAGRI), University of Florence, Sesto Fiorentino, Italy
- Alessio Mengoni
  - Department of Biology, University of Florence, Sesto Fiorentino, Italy
- Francesco Pini
  - Department of Biology, University of Bari Aldo Moro, Bari, Italy
40
Machine Learning Prediction of Resistance to Subinhibitory Antimicrobial Concentrations from Escherichia coli Genomes. mSystems 2021; 6:e0034621. [PMID: 34427505 PMCID: PMC8407197 DOI: 10.1128/msystems.00346-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
Escherichia coli is an important cause of bacterial infections worldwide, with multidrug-resistant strains incurring substantial costs on human lives. Besides therapeutic concentrations of antimicrobials in health care settings, the presence of subinhibitory antimicrobial residues in the environment and in clinics selects for antimicrobial resistance (AMR), but the underlying genetic repertoire is less well understood. Here, we used machine learning to predict the population doubling time and cell growth yield of 1,407 genetically diverse E. coli strains expanding under exposure to three subinhibitory concentrations of six classes of antimicrobials from single-nucleotide genetic variants, accessory gene variation, and the presence of known AMR genes. We predicted cell growth yields in the held-out test data with an average correlation (Spearman's ρ) of 0.63 (0.36 to 0.81 across concentrations) and cell doubling times with an average correlation of 0.59 (0.32 to 0.92 across concentrations), with moderate increases in sample size unlikely to improve predictions further. This finding points to the remaining missing heritability of growth under antimicrobial exposure being explained by effects that are too rare or weak to be captured unless sample size is dramatically increased, or by effects other than those conferred by the presence of individual single-nucleotide polymorphisms (SNPs) and genes. Predictions based on whole-genome information were generally superior to those based only on known AMR genes and were accurate for AMR resistance at therapeutic concentrations. We pinpointed genes and SNPs determining the predicted growth and thereby recapitulated many known AMR determinants. Finally, we estimated the effect sizes of resistance genes across the entire collection of strains, disclosing the growth effects for known resistance genes in each individual strain. 
Our results underscore the potential of predictive modeling of growth patterns from genomic data under subinhibitory concentrations of antimicrobials, although the remaining missing heritability poses a challenge for achieving the accuracy and precision required for clinical use. IMPORTANCE Predicting bacterial growth from genome sequences is important for a rapid characterization of strains in clinical diagnostics and to disclose candidate novel targets for anti-infective drugs. Previous studies have dissected the relationship between bacterial growth and genotype in mutant libraries for laboratory strains, yet no study so far has examined the predictive power of genome sequence in natural strains. In this study, we used a high-throughput phenotypic assay to measure the growth of a systematic collection of natural Escherichia coli strains and then employed machine learning models to predict bacterial growth from genomic data under nontherapeutic subinhibitory concentrations of antimicrobials that are common in nonclinical settings. We found a moderate to strong correlation between predicted and actual values for the different collected data sets. Moreover, we observed that the known resistance genes are still effective at sublethal concentrations, pointing to clinical implications of these concentrations.
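The abstract reports prediction quality as Spearman's ρ between predicted and measured growth traits. As a minimal sketch of what that statistic measures (not the authors' pipeline; the function names and the toy values are illustrative assumptions), ρ can be computed as the Pearson correlation of the rank-transformed values:

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (sd_x * sd_y)


def spearman_rho(measured, predicted):
    """Spearman's rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(measured), average_ranks(predicted))


# Hypothetical doubling times for four strains (made-up numbers).
measured = [0.2, 0.5, 0.9, 1.4]
predicted = [0.3, 0.4, 1.5, 1.0]
print(spearman_rho(measured, predicted))  # close to 0.8
```

Because only ranks enter the calculation, ρ rewards getting the ordering of strains right rather than the absolute growth values.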
|
41
|
Golden AR, Karlowsky JA, Walkty A, Baxter MR, Denisuik AJ, McCracken M, Mulvey MR, Adam HJ, Bay D, Zhanel GG. Comparison of phenotypic antimicrobial susceptibility testing results and WGS-derived genotypic resistance profiles for a cohort of ESBL-producing Escherichia coli collected from Canadian hospitals: CANWARD 2007-18. J Antimicrob Chemother 2021; 76:2825-2832. [PMID: 34378044 DOI: 10.1093/jac/dkab268] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2021] [Accepted: 07/05/2021] [Indexed: 11/14/2022] Open
Abstract
OBJECTIVES To determine whether the genotypic resistance profile inferred from WGS could accurately predict phenotypic resistance for ESBL-producing Escherichia coli isolated from patient samples in Canadian hospital laboratories. METHODS As part of the ongoing CANWARD study, 671 E. coli were collected and phenotypically confirmed as ESBL producers using CLSI M100 disc testing criteria. Isolates were sequenced using the Illumina MiSeq platform, resulting in 636 high-quality genomes for comparison. Using a rules-based approach, the genotypic resistance profile was compared with the phenotypic resistance interpretation generated using the CLSI broth microdilution method for ceftriaxone, ciprofloxacin, gentamicin and trimethoprim/sulfamethoxazole. RESULTS The most common genes associated with non-susceptibility to ceftriaxone, gentamicin and trimethoprim/sulfamethoxazole were CTX-M-15 (n = 391), aac(3)-IIa + aac(6')-Ib-cr (n = 121) and dfrA17 + sul1 (n = 169), respectively. Ciprofloxacin non-susceptibility was most commonly attributed to alterations in both gyrA (S83L + D87N) and parC (S80I + E84V), with (n = 187) or without (n = 197) aac(6')-Ib-cr. Categorical agreement (susceptible or non-susceptible) between actual and predicted phenotype was 95.6%, 98.9%, 97.6% and 88.8% for ceftriaxone, ciprofloxacin, gentamicin and trimethoprim/sulfamethoxazole, respectively. Only ciprofloxacin results (susceptible or non-susceptible) were predicted with major error (ME) and very major error (VME) rates of <3%: ciprofloxacin (ME, 1.5%; VME, 1.1%); gentamicin (ME, 0.8%-31.7%; VME, 4.8%); ceftriaxone (ME, 81.8%; VME, 3.0%); and trimethoprim/sulfamethoxazole (ME, 0.9%-23.0%; VME, 5.2%-8.5%). CONCLUSIONS Our rules-based approach for predicting a resistance phenotype from WGS performed well for ciprofloxacin, with categorical agreement of 98.9%, an ME rate of 1.5% and a VME rate of 1.1%. 
Although high categorical agreements were also obtained for gentamicin, ceftriaxone and trimethoprim/sulfamethoxazole, ME and/or VME rates were ≥3%.
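The abstract reports categorical agreement (CA), major error (ME) and very major error (VME) rates without defining them. The sketch below assumes the conventional definitions: ME is the share of phenotypically susceptible isolates predicted non-susceptible, and VME is the share of phenotypically non-susceptible isolates predicted susceptible. The function name, the `"S"`/`"NS"` labels and the toy data are illustrative assumptions, not taken from the study.

```python
def agreement_rates(phenotype, predicted):
    """Compare per-isolate calls, each either "S" or "NS".

    Assumed definitions:
      CA  = matching calls / all isolates
      ME  = false non-susceptible calls / phenotypically susceptible isolates
      VME = false susceptible calls / phenotypically non-susceptible isolates
    A rate is None when its denominator is zero.
    """
    pairs = list(zip(phenotype, predicted))
    n_susceptible = sum(1 for truth, _ in pairs if truth == "S")
    n_non_susceptible = len(pairs) - n_susceptible
    matches = sum(1 for truth, call in pairs if truth == call)
    major = sum(1 for truth, call in pairs if truth == "S" and call == "NS")
    very_major = sum(1 for truth, call in pairs if truth == "NS" and call == "S")
    return {
        "CA": matches / len(pairs),
        "ME": major / n_susceptible if n_susceptible else None,
        "VME": very_major / n_non_susceptible if n_non_susceptible else None,
    }


# Toy cohort: 8 non-susceptible and 2 susceptible isolates (made-up data).
phenotype = ["NS"] * 8 + ["S"] * 2
predicted = ["NS"] * 7 + ["S"] + ["S", "NS"]
print(agreement_rates(phenotype, predicted))
# {'CA': 0.8, 'ME': 0.5, 'VME': 0.125}
```

With these definitions the ME denominator is the number of susceptible isolates only, so in a cohort dominated by non-susceptible isolates a few false non-susceptible calls can produce a high ME rate even when CA is high. That is one plausible reading of the ceftriaxone figures above.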
Affiliation(s)
- Alyssa R Golden
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
- James A Karlowsky
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
  - Department of Clinical Microbiology, Shared Health Manitoba, MS673-820 Sherbrook Street, Winnipeg, Manitoba R3A 1R9, Canada
- Andrew Walkty
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
  - Department of Clinical Microbiology, Shared Health Manitoba, MS673-820 Sherbrook Street, Winnipeg, Manitoba R3A 1R9, Canada
- Melanie R Baxter
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
- Andrew J Denisuik
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
- Melissa McCracken
  - National Microbiology Laboratory-Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada
- Michael R Mulvey
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
  - National Microbiology Laboratory-Public Health Agency of Canada, 1015 Arlington Street, Winnipeg, Manitoba R3E 3R2, Canada
- Heather J Adam
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
  - Department of Clinical Microbiology, Shared Health Manitoba, MS673-820 Sherbrook Street, Winnipeg, Manitoba R3A 1R9, Canada
- Denice Bay
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
- George G Zhanel
  - Department of Medical Microbiology and Infectious Diseases, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, 727 McDermot Avenue, Winnipeg, Manitoba R3E 3P5, Canada
|
42
|
Chattaway MA, Gentle A, Nair S, Tingley L, Day M, Mohamed I, Jenkins C, Godbole G. Phylogenomics and antimicrobial resistance of Salmonella Typhi and Paratyphi A, B and C in England, 2016-2019. Microb Genom 2021; 7. [PMID: 34370659 PMCID: PMC8549371 DOI: 10.1099/mgen.0.000633] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/01/2023] Open
Abstract
The emergence of antimicrobial resistance (AMR) to first- and second-line treatment regimens of enteric fever is a global public-health problem, and routine genomic surveillance to inform clinical and public-health management guidance is essential. Here, we present the prospective analysis of genomic data to monitor trends in incidence, AMR and travel, and assess hierarchical clustering (HierCC) methodology of 1742 isolates of typhoidal salmonellae. Trend analysis of Salmonella Typhi and S. Paratyphi A cases per year increased 48 and 17.3%, respectively, between 2016 and 2019 in England, mainly associated with travel to South Asia. S. Paratyphi B cases have remained stable and are mainly associated with travel to the Middle East and South America. There has been an increase in the number of S. Typhi exhibiting a multidrug-resistant (MDR) profile and the emergence of extensively drug resistant (XDR) profiles. HierCC was a robust method to categorize clonal groups into clades and clusters associated with travel and AMR profiles. The majority of cases that had XDR S. Typhi reported recent travel to Pakistan (94 %) and belonged to a subpopulation of the 4.3.1 (H58) clone (HC5_1452). The phenotypic and genotypic AMR results showed high concordance for S. Typhi and S. Paratyphi A, B and C, with 99.99 % concordance and only three (0.01 %) discordant results out of a possible 23 178 isolate/antibiotic combinations. Genomic surveillance of enteric fever has shown the recent emergence and increase of MDR and XDR S. Typhi strains, resulting in a review of clinical guidelines to improve management of imported infections.
Affiliation(s)
- Marie Anne Chattaway
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Amy Gentle
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Satheesh Nair
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Laura Tingley
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Martin Day
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Iman Mohamed
  - Travel Health and IHR, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Claire Jenkins
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
- Gauri Godbole
  - Gastrointestinal Bacteria Reference Unit, National Infection Service, Public Health England, 61 Colindale Avenue, London NW9 5EQ, UK
|
43
|
Genome-Scale Metabolic Models and Machine Learning Reveal Genetic Determinants of Antibiotic Resistance in Escherichia coli and Unravel the Underlying Metabolic Adaptation Mechanisms. mSystems 2021; 6:e0091320. [PMID: 34342537 PMCID: PMC8409726 DOI: 10.1128/msystems.00913-20] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/20/2022] Open
Abstract
Antimicrobial resistance (AMR) is becoming one of the largest threats to public health worldwide, with the opportunistic pathogen Escherichia coli playing a major role in the AMR global health crisis. Unravelling the complex interplay between drug resistance and metabolic rewiring is key to understand the ability of bacteria to adapt to new treatments and to the development of new effective solutions to combat resistant infections. We developed a computational pipeline that combines machine learning with genome-scale metabolic models (GSMs) to elucidate the systemic relationships between genetic determinants of resistance and metabolism beyond annotated drug resistance genes. Our approach was used to identify genetic determinants of 12 AMR profiles for the opportunistic pathogenic bacterium E. coli. Then, to interpret the large number of identified genetic determinants, we applied a constraint-based approach using the GSM to predict the effects of genetic changes on growth, metabolite yields, and reaction fluxes. Our computational platform leads to multiple results. First, our approach corroborates 225 known AMR-conferring genes, 35 of which are known for the specific antibiotic. Second, integration with the GSM predicted 20 top-ranked genetic determinants (including accA, metK, fabD, fabG, murG, lptG, mraY, folP, and glmM) essential for growth, while a further 17 top-ranked genetic determinants linked AMR to auxotrophic behavior. Third, clusters of AMR-conferring genes affecting similar metabolic processes are revealed, which strongly suggested that metabolic adaptations in cell wall, energy, iron and nucleotide metabolism are associated with AMR. The computational solution can be used to study other human and animal pathogens. 
IMPORTANCE Escherichia coli is a major public health concern given its increasing level of antibiotic resistance worldwide and extraordinary capacity to acquire and spread resistance via horizontal gene transfer with surrounding species and via mutations in its existing genome. E. coli also exhibits a large amount of metabolic pathway redundancy, which promotes resistance via metabolic adaptability. In this study, we developed a computational approach that integrates machine learning with metabolic modeling to understand the correlation between AMR and metabolic adaptation mechanisms in this model bacterium. Using our approach, we identified AMR genetic determinants associated with cell wall modifications for increased permeability, virulence factor manipulation of host immunity, reduction of oxidative stress toxicity, and changes to energy metabolism. Unravelling the complex interplay between antibiotic resistance and metabolic rewiring may open new opportunities to understand the ability of E. coli, and potentially of other human and animal pathogens, to adapt to new treatments.
|
44
|
Reimche JL, Chivukula VL, Schmerer MW, Joseph SJ, Pham CD, Schlanger K, St Cyr SB, Weinstock HS, Raphael BH, Kersh EN, Gernert KM. Genomic Analysis of the Predominant Strains and Antimicrobial Resistance Determinants Within 1479 Neisseria gonorrhoeae Isolates From the US Gonococcal Isolate Surveillance Project in 2018. Sex Transm Dis 2021; 48:S78-S87. [PMID: 33993166 PMCID: PMC8284387 DOI: 10.1097/olq.0000000000001471] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/15/2021] [Accepted: 05/05/2021] [Indexed: 12/19/2022]
Abstract
BACKGROUND The prevalence of Neisseria gonorrhoeae (GC) isolates with elevated minimum inhibitory concentrations to various antibiotics continues to rise in the United States and globally. Genomic analysis provides a powerful tool for surveillance of circulating strains, antimicrobial resistance determinants, and understanding of transmission through a population. METHODS Neisseria gonorrhoeae isolates collected from the US Gonococcal Isolate Surveillance Project in 2018 (n = 1479) were sequenced and characterized. Whole-genome sequencing was used to identify sequence types, antimicrobial resistance profiles, and phylogenetic relationships across demographic and geographic populations. RESULTS Genetic characterization identified that (1) 80% of the GC isolates were represented in 33 multilocus sequence types, (2) isolates clustered in 23 major phylogenetic clusters with select phenotypic and demographic prevalence, and (3) common antimicrobial resistance determinants associated with low-level or high-level decreased susceptibility or resistance to relevant antibiotics. CONCLUSIONS Characterization of this 2018 Gonococcal Isolate Surveillance Project genomic data set, which is the largest US whole-genome sequence data set to date, sets the basis for future prospective studies, and establishes a genomic baseline of GC populations for local and national monitoring.
Affiliation(s)
- Jennifer L. Reimche
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
  - Oak Ridge Institute for Science and Education Research Participation and Fellowship Program, Oak Ridge, TN
- Vasanta L. Chivukula
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
  - Oak Ridge Institute for Science and Education Research Participation and Fellowship Program, Oak Ridge, TN
- Matthew W. Schmerer
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Sandeep J. Joseph
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Cau D. Pham
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Karen Schlanger
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Sancta B. St Cyr
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Hillard S. Weinstock
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Brian H. Raphael
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Ellen N. Kersh
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
- Kim M. Gernert
  - Division of STD Prevention, National Center for HIV/AIDS, Viral Hepatitis, STD and TB Prevention, Centers for Disease Control and Prevention, Atlanta, GA
|
45
|
Predictive Antibiotic Susceptibility Testing by Next-Generation Sequencing for Periprosthetic Joint Infections: Potential and Limitations. Biomedicines 2021; 9:biomedicines9080910. [PMID: 34440114 PMCID: PMC8389688 DOI: 10.3390/biomedicines9080910] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/09/2021] [Revised: 07/21/2021] [Accepted: 07/22/2021] [Indexed: 01/18/2023] Open
Abstract
Joint replacement surgeries are one of the most frequent medical interventions globally. Infections of prosthetic joints are a major health challenge and typically require prolonged or even indefinite antibiotic treatment. As multidrug-resistant pathogens continue to rise globally, novel diagnostics are critical to ensure appropriate treatment and help with prosthetic joint infections (PJI) management. To this end, recent studies have shown the potential of molecular methods such as next-generation sequencing to complement established phenotypic, culture-based methods. Together with advanced bioinformatics approaches, next-generation sequencing can provide comprehensive information on pathogen identity as well as antimicrobial susceptibility, potentially enabling rapid diagnosis and targeted therapy of PJIs. In this review, we summarize current developments in next generation sequencing based predictive antibiotic susceptibility testing and discuss potential and limitations for common PJI pathogens.
|
46
|
Competitiveness for Nodule Colonization in Sinorhizobium meliloti: Combined In Vitro-Tagged Strain Competition and Genome-Wide Association Analysis. mSystems 2021; 6:e0055021. [PMID: 34313466 PMCID: PMC8407117 DOI: 10.1128/msystems.00550-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023] Open
Abstract
Associations between leguminous plants and symbiotic nitrogen-fixing rhizobia are a classic example of mutualism between a eukaryotic host and a specific group of prokaryotic microbes. Although this symbiosis is in part species specific, different rhizobial strains may colonize the same nodule. Some rhizobial strains are commonly known as better competitors than others, but detailed analyses that aim to predict rhizobial competitive abilities based on genomes are still scarce. Here, we performed a bacterial genome-wide association (GWAS) analysis to define the genomic determinants related to the competitive capabilities in the model rhizobial species Sinorhizobium meliloti. For this, 13 tester strains were green fluorescent protein (GFP) tagged and assayed versus 3 red fluorescent protein (RFP)-tagged reference competitor strains (Rm1021, AK83, and BL225C) in a Medicago sativa nodule occupancy test. Competition data and strain genomic sequences were employed to build a model for GWAS based on k-mers. Among the k-mers with the highest scores, 51 k-mers mapped on the genomes of four strains showing the highest competition phenotypes (>60% single strain nodule occupancy; GR4, KH35c, KH46, and SM11) versus BL225C. These k-mers were mainly located on the symbiosis-related megaplasmid pSymA, specifically on genes coding for transporters, proteins involved in the biosynthesis of cofactors, and proteins related to metabolism (e.g., fatty acids). The same analysis was performed considering the sum of single and mixed nodules obtained in the competition assays versus BL225C, retrieving k-mers mapped on the genes previously found and on vir genes. Therefore, the competition abilities seem to be linked to multiple genetic determinants and comprise several cellular components. IMPORTANCE Decoding the competitive pattern that occurs in the rhizosphere is challenging in the study of bacterial social interaction strategies. 
To date, the single-gene approach has mainly been used to uncover the bases of nodulation, but there is still a knowledge gap regarding the main features that a priori characterize rhizobial strains able to outcompete indigenous rhizobia. Therefore, tracking down which traits make different rhizobial strains able to win the competition for plant infection over other indigenous rhizobia will improve the strain selection process and, consequently, plant yield in sustainable agricultural production systems. We proved that a k-mer-based GWAS approach can efficiently identify the competition determinants of a panel of strains previously analyzed for their plant tissue occupancy using double fluorescent labeling. The reported strategy will be useful for detailed studies on the genomic aspects of the evolution of bacterial symbiosis and for an extensive evaluation of rhizobial inoculants.
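A k-mer-based GWAS starts from a table recording which k-mers occur in which strain. The sketch below shows only that first step, under simplifying assumptions: the function names, the toy sequences and k = 3 are illustrative, reverse complements are not merged, and no association statistic is computed. It is not the software used in the study.

```python
def kmers(sequence, k):
    """Return the set of all substrings of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def kmer_presence_matrix(genomes, k):
    """Build a presence/absence table of k-mers across strains.

    genomes: dict mapping strain name -> nucleotide string.
    Returns (vocabulary, matrix), where vocabulary is the sorted list of
    k-mers seen in any strain and matrix[strain][i] is 1 if vocabulary[i]
    occurs in that strain, else 0.
    """
    per_strain = {name: kmers(seq, k) for name, seq in genomes.items()}
    vocabulary = sorted(set().union(*per_strain.values()))
    matrix = {
        name: [int(kmer in found) for kmer in vocabulary]
        for name, found in per_strain.items()
    }
    return vocabulary, matrix


# Two made-up "genomes" differing at a single position.
vocabulary, matrix = kmer_presence_matrix({"A": "ACGTAC", "B": "ACGGAC"}, k=3)
print(vocabulary)   # ['ACG', 'CGG', 'CGT', 'GAC', 'GGA', 'GTA', 'TAC']
print(matrix["A"])  # [1, 0, 1, 0, 0, 1, 1]
print(matrix["B"])  # [1, 1, 0, 1, 1, 0, 0]
```

Each column of such a table can then be tested for association with the phenotype (here, nodule occupancy), and the highest-scoring k-mers mapped back to the genomes, which is the step the abstract describes for the 51 k-mers.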
|
47
|
Mahfouz N, Ferreira I, Beisken S, von Haeseler A, Posch AE. Large-scale assessment of antimicrobial resistance marker databases for genetic phenotype prediction: a systematic review. J Antimicrob Chemother 2020; 75:3099-3108. [PMID: 32658975 PMCID: PMC7566382 DOI: 10.1093/jac/dkaa257] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 11/08/2019] [Revised: 05/04/2020] [Accepted: 05/11/2020] [Indexed: 02/07/2023] Open
Abstract
Background Antimicrobial resistance (AMR) is a rising health threat with 10 million annual casualties estimated by 2050. Appropriate treatment of infectious diseases with the right antibiotics reduces the spread of antibiotic resistance. Today, clinical practice relies on molecular and PCR techniques for pathogen identification and culture-based antibiotic susceptibility testing (AST). Recently, WGS has started to transform clinical microbiology, enabling prediction of resistance phenotypes from genotypes and allowing for more informed treatment decisions. WGS-based AST (WGS-AST) depends on the detection of AMR markers in sequenced isolates and therefore requires AMR reference databases. The completeness and quality of these databases are material to increase WGS-AST performance. Methods We present a systematic evaluation of the performance of publicly available AMR marker databases for resistance prediction on clinical isolates. We used the public databases CARD and ResFinder with a final dataset of 2587 isolates across five clinically relevant pathogens from PATRIC and NDARO, public repositories of antibiotic-resistant bacterial isolates. Results CARD and ResFinder WGS-AST performance had an overall balanced accuracy of 0.52 (±0.12) and 0.66 (±0.18), respectively. Major error rates were higher in CARD (42.68%) than ResFinder (25.06%). However, CARD showed almost no very major errors (1.17%) compared with ResFinder (4.42%). Conclusions We show that AMR databases need further expansion, improved marker annotations per antibiotic rather than per antibiotic class and validated multivariate marker panels to achieve clinical utility, e.g. in order to meet performance requirements such as provided by the FDA for clinical microbiology diagnostic testing.
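The headline metric in this abstract is balanced accuracy, the mean of sensitivity and specificity. The following sketch shows the calculation on made-up labels; the function name and the `"R"`/`"S"` coding are assumptions for illustration and do not reproduce the study's evaluation code.

```python
def balanced_accuracy(truth, predicted, positive="R"):
    """Mean of sensitivity and specificity for binary labels.

    truth / predicted: equal-length lists of labels; `positive` marks the
    resistant class. Raises ValueError if either class is absent from truth.
    """
    pairs = list(zip(truth, predicted))
    n_pos = sum(1 for t, _ in pairs if t == positive)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in the true labels")
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    true_neg = sum(1 for t, p in pairs if t != positive and p != positive)
    sensitivity = true_pos / n_pos
    specificity = true_neg / n_neg
    return (sensitivity + specificity) / 2


# 4 resistant and 6 susceptible isolates (made-up data).
truth = ["R"] * 4 + ["S"] * 6
predicted = ["R"] * 3 + ["S"] + ["R"] * 3 + ["S"] * 3
print(balanced_accuracy(truth, predicted))  # 0.625
```

A predictor that outputs the same call for every isolate scores 0.5 on this metric regardless of class imbalance, which gives a reference point for the reported values of 0.52 and 0.66.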
Affiliation(s)
- Norhan Mahfouz
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
- Inês Ferreira
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna 1030, Austria
- Stephan Beisken
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
- Arndt von Haeseler
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna 1030, Austria
  - Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
- Andreas E Posch
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
|
48
|
Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. J Clin Microbiol 2021; 59:e0126020. [PMID: 33536291 DOI: 10.1128/jcm.01260-20] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Indexed: 01/05/2023] Open
Abstract
Antimicrobial resistance (AMR) remains one of the most challenging phenomena of modern medicine. Machine learning (ML) is a subfield of artificial intelligence that focuses on the development of algorithms that learn how to accurately predict outcome variables using large sets of predictor variables that are typically not hand selected and are minimally curated. Models are parameterized using a training data set and then applied to a test data set on which predictive performance is evaluated. The application of ML algorithms to the problem of AMR has garnered increasing interest in the past 5 years due to the exponential growth of experimental and clinical data, heavy investment in computational capacity, improvements in algorithm performance, and increasing urgency for innovative approaches to reducing the burden of disease. Here, we review the current state of research at the intersection of ML and AMR with an emphasis on three domains of work. The first is the prediction of AMR using genomic data. The second is the use of ML to gain insight into the cellular functions disrupted by antibiotics, which forms the basis for understanding mechanisms of action and developing novel anti-infectives. The third focuses on the application of ML for antimicrobial stewardship using data extracted from the electronic health record. Although the use of ML for understanding, diagnosing, treating, and preventing AMR is still in its infancy, the continued growth of data and interest ensures it will become an important tool for future translational research programs.
|
49
|
Abstract
Antimicrobial resistance (AMR) is an important global health threat that impacts millions of people worldwide each year. Developing methods that can detect and predict AMR phenotypes can help to mitigate the spread of AMR by informing clinical decision making and appropriate mitigation strategies. Many bioinformatic methods have been developed for predicting AMR phenotypes from whole-genome sequences and AMR genes, but recent studies have indicated that predictions can be made from incomplete genome sequence data. In order to more systematically understand this, we built random forest-based machine learning classifiers for predicting susceptible and resistant phenotypes for Klebsiella pneumoniae (1,640 strains), Mycobacterium tuberculosis (2,497 strains), and Salmonella enterica (1,981 strains). We started by building models from alignments that were based on a reference chromosome for each species. We then subsampled each chromosomal alignment and built models for the resulting subalignments, finding that very small regions, representing approximately 0.1 to 0.2% of the chromosome, are predictive. In K. pneumoniae, M. tuberculosis, and S. enterica, the subalignments are able to predict multiple AMR phenotypes with at least 70% accuracy, even though most do not encode an AMR-related function. We used these models to identify regions of the chromosome with high and low predictive signals. Finally, subalignments that retain high accuracy across larger phylogenetic distances were examined in greater detail, revealing genes and intergenic regions with potential links to AMR, virulence, transport, and survival under stress conditions. IMPORTANCE Antimicrobial resistance causes thousands of deaths annually worldwide. Understanding the regions of the genome that are involved in antimicrobial resistance is important for developing mitigation strategies and preventing transmission. 
Machine learning models are capable of predicting antimicrobial resistance phenotypes from bacterial genome sequence data by identifying resistance genes, mutations, and other correlated features. They are also capable of implicating regions of the genome that have not been previously characterized as being involved in resistance. In this study, we generated global chromosomal alignments for Klebsiella pneumoniae, Mycobacterium tuberculosis, and Salmonella enterica and systematically searched them for small conserved regions of the genome that enable the prediction of antimicrobial resistance phenotypes. In addition to known antimicrobial resistance genes, this analysis identified genes involved in virulence and transport functions, as well as many genes with no previous implication in antimicrobial resistance.
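The abstract describes subsampling a chromosomal alignment into small subalignments and training one model per piece, but it does not say how the pieces were drawn. The sketch below assumes consecutive, non-overlapping column windows, each covering a fixed fraction of the alignment; the function name, this windowing scheme and the toy sequences are assumptions, and model training is left out.

```python
def subalignments(alignment, fraction):
    """Yield (start, end, window) for consecutive column windows.

    alignment: dict mapping strain name -> aligned sequence; all sequences
    must have the same length. Each window spans about `fraction` of the
    columns (at least one column); the last window may be shorter.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    length = lengths.pop()
    width = max(1, round(length * fraction))
    for start in range(0, length, width):
        end = min(start + width, length)
        window = {name: seq[start:end] for name, seq in alignment.items()}
        yield start, end, window


# Toy alignment of 10 columns split into windows of 20% each.
toy = {"s1": "ACGTACGTAC", "s2": "ACGTTCGTAA"}
for start, end, window in subalignments(toy, 0.2):
    print(start, end, window)
```

Under this scheme, a window of 0.1% of an alignment with 5,000,000 columns would span 5,000 columns. Each window would then be encoded as features and passed to a classifier such as a random forest, with accuracy recorded per window to map regions of high and low predictive signal.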
|
50
|
Im H, Hwang SH, Kim BS, Choi SH. Pathogenic potential assessment of the Shiga toxin-producing Escherichia coli by a source attribution-considered machine learning model. Proc Natl Acad Sci U S A 2021; 118:e2018877118. [PMID: 33986113 PMCID: PMC8157976 DOI: 10.1073/pnas.2018877118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022] Open
Abstract
As alternatives to conventional serotyping and virulence gene combination methods, new methods have been developed to evaluate the pathogenic potential of newly emerging pathogens. Among them, machine learning (ML)-based methods using whole-genome sequencing (WGS) data are getting attention because of the recent advances in ML algorithms and sequencing technologies. Here, we developed various ML models to predict the pathogenicity of Shiga toxin-producing Escherichia coli (STEC) isolates using their WGS data. The input dataset for the ML models was generated using distinct gene repertoires from positive (pathogenic) and negative (nonpathogenic) control groups in which each STEC isolate was designated based on the source attribution, the relative risk potential of the isolation sources. Among the various ML models examined, a model using the support vector machine (SVM) algorithm, the SVM model, discriminated between the two control groups most accurately. The SVM model successfully predicted the pathogenicity of the isolates from the major sources of STEC outbreaks, the isolates with the history of outbreaks, and the isolates that cannot be assessed by conventional methods. Furthermore, the SVM model effectively differentiated the pathogenic potentials of the isolates at a finer resolution. Permutation importance analyses of the input dataset further revealed the genes important for the estimation, proposing the genes potentially essential for the pathogenicity of STEC. Altogether, these results suggest that the SVM model is a more reliable and broadly applicable method to evaluate the pathogenic potential of STEC isolates compared with conventional methods.
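Permutation importance, mentioned in the abstract, scores a feature by how much a fitted model's accuracy drops when that feature's values are shuffled across samples. The sketch below is a generic, stdlib-only illustration: the `predict` callable stands in for a trained model such as the SVM, and the function names, toy gene table and repeat count are assumptions rather than details from the study.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the true label."""
    return sum(predict(row) == label for row, label in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature after shuffling that feature's column.

    rows: list of feature lists (e.g. gene presence/absence per isolate).
    Returns one importance value per feature column.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[col] for row in rows]
            rng.shuffle(column)
            # Rebuild each row with only this column replaced.
            permuted = [row[:col] + [value] + row[col + 1:]
                        for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy stand-in model: calls an isolate pathogenic (1) if gene 0 is present.
def predict(row):
    return row[0]


rows = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1]]
labels = [1, 1, 0, 0, 1, 0]
print(permutation_importance(predict, rows, labels))
```

In this toy case the second gene receives an importance of exactly zero because the stand-in model never reads it, while shuffling the first gene lowers accuracy. Genes ranked highly in this way are candidates, not proven determinants, since the score reflects what the model relies on.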
Affiliation(s)
- Hanhyeok Im
  - National Research Laboratory of Molecular Microbiology and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Center for Food Safety and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
- Seung-Ho Hwang
  - National Research Laboratory of Molecular Microbiology and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Center for Food Safety and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
- Byoung Sik Kim
  - Department of Food Science and Engineering, Ewha Womans University, 03760 Seoul, Republic of Korea
- Sang Ho Choi
  - National Research Laboratory of Molecular Microbiology and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Center for Food Safety and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Center for Food and Bioconvergence, Seoul National University, 08826 Seoul, Republic of Korea
|