51
|
Competitiveness for Nodule Colonization in Sinorhizobium meliloti: Combined In Vitro-Tagged Strain Competition and Genome-Wide Association Analysis. mSystems 2021; 6:e0055021. [PMID: 34313466 PMCID: PMC8407117 DOI: 10.1128/msystems.00550-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023] Open
Abstract
Associations between leguminous plants and symbiotic nitrogen-fixing rhizobia are a classic example of mutualism between a eukaryotic host and a specific group of prokaryotic microbes. Although this symbiosis is in part species specific, different rhizobial strains may colonize the same nodule. Some rhizobial strains are commonly known as better competitors than others, but detailed analyses that aim to predict rhizobial competitive abilities based on genomes are still scarce. Here, we performed a bacterial genome-wide association (GWAS) analysis to define the genomic determinants related to the competitive capabilities in the model rhizobial species Sinorhizobium meliloti. For this, 13 tester strains were green fluorescent protein (GFP) tagged and assayed versus 3 red fluorescent protein (RFP)-tagged reference competitor strains (Rm1021, AK83, and BL225C) in a Medicago sativa nodule occupancy test. Competition data and strain genomic sequences were employed to build a model for GWAS based on k-mers. Among the k-mers with the highest scores, 51 k-mers mapped on the genomes of four strains showing the highest competition phenotypes (>60% single strain nodule occupancy; GR4, KH35c, KH46, and SM11) versus BL225C. These k-mers were mainly located on the symbiosis-related megaplasmid pSymA, specifically on genes coding for transporters, proteins involved in the biosynthesis of cofactors, and proteins related to metabolism (e.g., fatty acids). The same analysis was performed considering the sum of single and mixed nodules obtained in the competition assays versus BL225C, retrieving k-mers mapped on the genes previously found and on vir genes. Therefore, the competition abilities seem to be linked to multiple genetic determinants and comprise several cellular components. IMPORTANCE Decoding the competitive pattern that occurs in the rhizosphere is challenging in the study of bacterial social interaction strategies. 
To date, the single-gene approach has mainly been used to uncover the bases of nodulation, but there is still a knowledge gap regarding the main features that a priori characterize rhizobial strains able to outcompete indigenous rhizobia. Therefore, tracking down which traits make different rhizobial strains able to win the competition for plant infection over other indigenous rhizobia will improve the strain selection process and, consequently, plant yield in sustainable agricultural production systems. We proved that a k-mer-based GWAS approach can efficiently identify the competition determinants of a panel of strains previously analyzed for their plant tissue occupancy using double fluorescent labeling. The reported strategy will be useful for detailed studies on the genomic aspects of the evolution of bacterial symbiosis and for an extensive evaluation of rhizobial inoculants.
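The k-mer association idea described above can be illustrated with a minimal sketch. It is not the authors' GWAS model: the scoring rule (mean phenotype of strains carrying a k-mer minus the mean of strains lacking it), the function names, and the toy sequences and occupancy values are all assumptions made for illustration.

```python
from statistics import mean


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_association(genomes, phenotype, k):
    """Score each k-mer by mean phenotype of carriers minus non-carriers.

    genomes: dict of strain name -> sequence.
    phenotype: dict of strain name -> numeric phenotype.
    k-mers found in every strain (or in none) give no contrast and are skipped.
    """
    presence = {strain: kmers(seq, k) for strain, seq in genomes.items()}
    all_kmers = set().union(*presence.values())
    scores = {}
    for km in all_kmers:
        carriers = [phenotype[s] for s in genomes if km in presence[s]]
        others = [phenotype[s] for s in genomes if km not in presence[s]]
        if carriers and others:
            scores[km] = mean(carriers) - mean(others)
    return scores


# Hypothetical toy data: four short "genomes" and nodule occupancy fractions.
genomes = {
    "s1": "ACGTACGGA",
    "s2": "ACGTACGGT",
    "s3": "ACGTTTGGA",
    "s4": "ACGTTTGGT",
}
occupancy = {"s1": 0.8, "s2": 0.7, "s3": 0.2, "s4": 0.3}

scores = kmer_association(genomes, occupancy, k=4)
top = max(scores, key=scores.get)  # one of the k-mers shared by s1 and s2
```

A real analysis would also need to correct for population structure and multiple testing, which this sketch leaves out.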
|
52
|
Mahfouz N, Ferreira I, Beisken S, von Haeseler A, Posch AE. Large-scale assessment of antimicrobial resistance marker databases for genetic phenotype prediction: a systematic review. J Antimicrob Chemother 2021; 75:3099-3108. [PMID: 32658975 PMCID: PMC7566382 DOI: 10.1093/jac/dkaa257] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 11/08/2019] [Revised: 05/04/2020] [Accepted: 05/11/2020] [Indexed: 02/07/2023] Open
Abstract
Background Antimicrobial resistance (AMR) is a rising health threat with 10 million annual casualties estimated by 2050. Appropriate treatment of infectious diseases with the right antibiotics reduces the spread of antibiotic resistance. Today, clinical practice relies on molecular and PCR techniques for pathogen identification and culture-based antibiotic susceptibility testing (AST). Recently, WGS has started to transform clinical microbiology, enabling prediction of resistance phenotypes from genotypes and allowing for more informed treatment decisions. WGS-based AST (WGS-AST) depends on the detection of AMR markers in sequenced isolates and therefore requires AMR reference databases. The completeness and quality of these databases are material to increase WGS-AST performance. Methods We present a systematic evaluation of the performance of publicly available AMR marker databases for resistance prediction on clinical isolates. We used the public databases CARD and ResFinder with a final dataset of 2587 isolates across five clinically relevant pathogens from PATRIC and NDARO, public repositories of antibiotic-resistant bacterial isolates. Results CARD and ResFinder WGS-AST performance had an overall balanced accuracy of 0.52 (±0.12) and 0.66 (±0.18), respectively. Major error rates were higher in CARD (42.68%) than ResFinder (25.06%). However, CARD showed almost no very major errors (1.17%) compared with ResFinder (4.42%). Conclusions We show that AMR databases need further expansion, improved marker annotations per antibiotic rather than per antibiotic class and validated multivariate marker panels to achieve clinical utility, e.g. in order to meet performance requirements such as provided by the FDA for clinical microbiology diagnostic testing.
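The performance figures quoted above (balanced accuracy, major errors, very major errors) can be computed from paired reference and predicted phenotypes. The sketch below assumes the usual susceptibility-testing convention: a major error is a susceptible isolate called resistant, and a very major error is a resistant isolate called susceptible, each expressed as a fraction of the corresponding reference class. The abstract does not spell out the exact computation, so the function name and toy labels are illustrative only.

```python
def wgs_ast_metrics(truth, predicted):
    """Compare predicted phenotypes with reference phenotypes ('R' or 'S')."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == "R" and p == "R" for t, p in pairs)  # resistant, called resistant
    fn = sum(t == "R" and p == "S" for t, p in pairs)  # resistant, called susceptible
    tn = sum(t == "S" and p == "S" for t, p in pairs)  # susceptible, called susceptible
    fp = sum(t == "S" and p == "R" for t, p in pairs)  # susceptible, called resistant
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "major_error_rate": fp / (tn + fp),
        "very_major_error_rate": fn / (tp + fn),
    }


# Hypothetical toy labels: 4 resistant and 6 susceptible isolates.
truth = ["R", "R", "R", "R", "S", "S", "S", "S", "S", "S"]
predicted = ["R", "R", "R", "S", "S", "S", "S", "R", "R", "S"]
metrics = wgs_ast_metrics(truth, predicted)
```

Balanced accuracy averages sensitivity and specificity, so it is not inflated when one phenotype dominates the dataset.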
Affiliation(s)
- Norhan Mahfouz
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
- Inês Ferreira
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna 1030, Austria
- Stephan Beisken
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
- Arndt von Haeseler
  - Center for Integrative Bioinformatics Vienna, Max Perutz Laboratories, University of Vienna and Medical University of Vienna, Vienna 1030, Austria
  - Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
- Andreas E Posch
  - Ares Genetics GmbH, Karl-Farkas-Gasse 18, Vienna 1030, Austria
|
53
|
Applications of Machine Learning to the Problem of Antimicrobial Resistance: an Emerging Model for Translational Research. J Clin Microbiol 2021; 59:e0126020. [PMID: 33536291 DOI: 10.1128/jcm.01260-20] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Indexed: 01/05/2023] Open
Abstract
Antimicrobial resistance (AMR) remains one of the most challenging phenomena of modern medicine. Machine learning (ML) is a subfield of artificial intelligence that focuses on the development of algorithms that learn how to accurately predict outcome variables using large sets of predictor variables that are typically not hand selected and are minimally curated. Models are parameterized using a training data set and then applied to a test data set on which predictive performance is evaluated. The application of ML algorithms to the problem of AMR has garnered increasing interest in the past 5 years due to the exponential growth of experimental and clinical data, heavy investment in computational capacity, improvements in algorithm performance, and increasing urgency for innovative approaches to reducing the burden of disease. Here, we review the current state of research at the intersection of ML and AMR with an emphasis on three domains of work. The first is the prediction of AMR using genomic data. The second is the use of ML to gain insight into the cellular functions disrupted by antibiotics, which forms the basis for understanding mechanisms of action and developing novel anti-infectives. The third focuses on the application of ML for antimicrobial stewardship using data extracted from the electronic health record. Although the use of ML for understanding, diagnosing, treating, and preventing AMR is still in its infancy, the continued growth of data and interest ensures it will become an important tool for future translational research programs.
|
54
|
Abstract
Antimicrobial resistance (AMR) is an important global health threat that impacts millions of people worldwide each year. Developing methods that can detect and predict AMR phenotypes can help to mitigate the spread of AMR by informing clinical decision making and appropriate mitigation strategies. Many bioinformatic methods have been developed for predicting AMR phenotypes from whole-genome sequences and AMR genes, but recent studies have indicated that predictions can be made from incomplete genome sequence data. In order to more systematically understand this, we built random forest-based machine learning classifiers for predicting susceptible and resistant phenotypes for Klebsiella pneumoniae (1,640 strains), Mycobacterium tuberculosis (2,497 strains), and Salmonella enterica (1,981 strains). We started by building models from alignments that were based on a reference chromosome for each species. We then subsampled each chromosomal alignment and built models for the resulting subalignments, finding that very small regions, representing approximately 0.1 to 0.2% of the chromosome, are predictive. In K. pneumoniae, M. tuberculosis, and S. enterica, the subalignments are able to predict multiple AMR phenotypes with at least 70% accuracy, even though most do not encode an AMR-related function. We used these models to identify regions of the chromosome with high and low predictive signals. Finally, subalignments that retain high accuracy across larger phylogenetic distances were examined in greater detail, revealing genes and intergenic regions with potential links to AMR, virulence, transport, and survival under stress conditions. IMPORTANCE Antimicrobial resistance causes thousands of deaths annually worldwide. Understanding the regions of the genome that are involved in antimicrobial resistance is important for developing mitigation strategies and preventing transmission. 
Machine learning models are capable of predicting antimicrobial resistance phenotypes from bacterial genome sequence data by identifying resistance genes, mutations, and other correlated features. They are also capable of implicating regions of the genome that have not been previously characterized as being involved in resistance. In this study, we generated global chromosomal alignments for Klebsiella pneumoniae, Mycobacterium tuberculosis, and Salmonella enterica and systematically searched them for small conserved regions of the genome that enable the prediction of antimicrobial resistance phenotypes. In addition to known antimicrobial resistance genes, this analysis identified genes involved in virulence and transport functions, as well as many genes with no previous implication in antimicrobial resistance.
|
55
|
Im H, Hwang SH, Kim BS, Choi SH. Pathogenic potential assessment of the Shiga toxin-producing Escherichia coli by a source attribution-considered machine learning model. Proc Natl Acad Sci U S A 2021; 118:e2018877118. [PMID: 33986113 PMCID: PMC8157976 DOI: 10.1073/pnas.2018877118] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/18/2022] Open
Abstract
As alternatives to conventional serotyping and virulence gene combination methods, new methods have been developed to evaluate the pathogenic potential of newly emerging pathogens. Among them, the machine learning (ML)-based method using whole-genome sequencing (WGS) data is getting attention because of recent advances in ML algorithms and sequencing technologies. Here, we developed various ML models to predict the pathogenicity of Shiga toxin-producing Escherichia coli (STEC) isolates using their WGS data. The input dataset for the ML models was generated using distinct gene repertoires from positive (pathogenic) and negative (nonpathogenic) control groups in which each STEC isolate was designated based on the source attribution, the relative risk potential of the isolation sources. Among the various ML models examined, a model using the support vector machine (SVM) algorithm, the SVM model, discriminated between the two control groups most accurately. The SVM model successfully predicted the pathogenicity of the isolates from the major sources of STEC outbreaks, the isolates with a history of outbreaks, and the isolates that cannot be assessed by conventional methods. Furthermore, the SVM model effectively differentiated the pathogenic potentials of the isolates at a finer resolution. Permutation importance analyses of the input dataset further revealed the genes important for the estimation, proposing the genes potentially essential for the pathogenicity of STEC. Altogether, these results suggest that the SVM model is a more reliable and broadly applicable method to evaluate the pathogenic potential of STEC isolates compared with conventional methods.
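Permutation importance, mentioned above, measures how much a model's accuracy drops when one input column is shuffled. The sketch below shows the general technique on gene presence/absence rows. It does not reproduce the study's SVM: the stand-in model is a hypothetical one-gene rule, and the data, function names, and repeat count are assumptions.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model output equals the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical data: each row is (gene_0 present, gene_1 present); label 1 = pathogenic.
rows = [(1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (0, 1)]
labels = [1, 1, 0, 0, 1, 0]


def toy_model(row):
    """Stand-in classifier (not an SVM): predict pathogenic if gene_0 is present."""
    return row[0]


importances = permutation_importance(toy_model, rows, labels)
```

Because the stand-in model ignores gene_1, shuffling that column never changes a prediction and its importance is exactly zero, while shuffling gene_0 lowers accuracy.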
Affiliation(s)
- Hanhyeok Im
  - National Research Laboratory of Molecular Microbiology and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Center for Food Safety and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
- Seung-Ho Hwang
  - National Research Laboratory of Molecular Microbiology and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Center for Food Safety and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
- Byoung Sik Kim
  - Department of Food Science and Engineering, Ewha Womans University, 03760 Seoul, Republic of Korea
- Sang Ho Choi
  - National Research Laboratory of Molecular Microbiology and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Department of Agricultural Biotechnology and Center for Food Safety and Toxicology, Seoul National University, 08826 Seoul, Republic of Korea
  - Center for Food and Bioconvergence, Seoul National University, 08826 Seoul, Republic of Korea
|
56
|
Karlsen ST, Vesth TC, Oregaard G, Poulsen VK, Lund O, Henderson G, Bælum J. Machine learning predicts and provides insights into milk acidification rates of Lactococcus lactis. PLoS One 2021; 16:e0246287. [PMID: 33720959 PMCID: PMC7959382 DOI: 10.1371/journal.pone.0246287] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/21/2020] [Accepted: 01/17/2021] [Indexed: 11/18/2022] Open
Abstract
Lactococcus lactis strains are important components in industrial starter cultures for cheese manufacturing. They have many strain-dependent properties, which affect the final product. Here, we explored the use of machine learning to create systematic, high-throughput screening methods for these properties. Fast acidification of milk is such a strain-dependent property. To predict the maximum hourly acidification rate (Vmax), we trained Random Forest (RF) models on four different genomic representations: Presence/absence of gene families, counts of Pfam domains, the 8 nucleotide long subsequences of their DNA (8-mers), and the 9 nucleotide long subsequences of their DNA (9-mers). Vmax was measured at different temperatures, volumes, and in the presence or absence of yeast extract. These conditions were added as features in each RF model. The four models were trained on 257 strains, and the correlation between the measured Vmax and the predicted Vmax was evaluated with Pearson Correlation Coefficients (PC) on a separate dataset of 85 strains. The models all had high PC scores: 0.83 (gene presence/absence model), 0.84 (Pfam domain model), 0.76 (8-mer model), and 0.85 (9-mer model). The models all based their predictions on relevant genetic features and showed consensus on systems for lactose metabolism, degradation of casein, and pH stress response. Each model also predicted a set of features not found by the other models.
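Two building blocks named above are easy to make concrete: counting the 8-nucleotide subsequences of a DNA string, and scoring predictions against measurements with the Pearson correlation coefficient. The sketch below assumes nothing about the authors' pipeline; the sequence and the Vmax values are invented for illustration, and the Random Forest step is omitted.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping k-length subsequence of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical 10-nucleotide sequence: it contains three overlapping 8-mers.
counts = kmer_counts("ATATATATAT", 8)

# Hypothetical measured and predicted Vmax values for four held-out strains.
measured = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.41]
r = pearson(measured, predicted)
```

In a setup like the one described, each strain's k-mer counts would form one feature row, and the correlation would be computed on strains the model never saw during training.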
Affiliation(s)
- Signe Tang Karlsen
  - Chr. Hansen A/S, Hoersholm, Denmark
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
- Ole Lund
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
|
57
|
Rampone S, Pagliarulo C, Marena C, Orsillo A, Iannaccone M, Trionfo C, Sateriale D, Paolucci M. In silico analysis of the antimicrobial activity of phytochemicals: towards a technological breakthrough. Computer Methods and Programs in Biomedicine 2021; 200:105820. [PMID: 33168272 DOI: 10.1016/j.cmpb.2020.105820] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/16/2020] [Accepted: 10/26/2020] [Indexed: 06/11/2023]
Abstract
BACKGROUND The complications associated with infections from pathogens increasingly resistant to traditional drugs lead to a constant increase in the mortality rate among those affected. In such cases the fundamental purpose of the microbiology laboratory is to determine the sensitivity profile of pathogens to antimicrobial agents. This is intense and complex work, often not facilitated by the test's characteristics. Despite the evolution of Antimicrobial Susceptibility Testing (AST) technologies, the technological breakthrough that could guide and facilitate the search for new antimicrobial agents is still missing. METHODS In this work, we propose the experimental use of in silico instruments, particularly a feedforward Multi-Layer Perceptron (MLP) Artificial Neural Network and Genetic Programming (GP), to verify, but also to predict, the effectiveness of natural and experimental mixtures of polyphenols against several microbial strains. RESULTS We evaluate the results in predicting the antimicrobial sensitivity profile from the mixture data. The trained MLP shows very high correlation coefficients (0.93 and 0.97) and mean absolute errors (110.70 and 56.60) in determining the Minimum Inhibitory Concentration and Minimum Microbicidal Concentration, respectively, while GP not only shows very high correlation coefficients (0.89 and 0.96) and low mean absolute errors (6.99 and 5.60) in the same tasks, but also gives an explicit representation of the acquired knowledge about the polyphenol mixtures. CONCLUSIONS In silico tools can help to predict the antimicrobial efficacy of phytobiotics, providing a useful strategy to innovate and speed up the extant classic microbiological techniques.
Affiliation(s)
- Salvatore Rampone
  - DEMM - Università del Sannio - Via delle Puglie 76, Benevento, Italy
- Chiara Marena
  - 2019-2020 EDA Course Group - Università del Sannio - Via Calandra, Benevento, Italy
- Antonello Orsillo
  - 2019-2020 EDA Course Group - Università del Sannio - Via Calandra, Benevento, Italy
- Carmela Trionfo
  - 2019-2020 EDA Course Group - Università del Sannio - Via Calandra, Benevento, Italy
- Marina Paolucci
  - DST - Università del Sannio - Via dei Mulini, Benevento, Italy
|
58
|
Wang Y, Zhang L, Niu M, Li R, Tu R, Liu X, Hou J, Mao Z, Wang Z, Wang C. Genetic Risk Score Increased Discriminant Efficiency of Predictive Models for Type 2 Diabetes Mellitus Using Machine Learning: Cohort Study. Front Public Health 2021; 9:606711. [PMID: 33681127 PMCID: PMC7925839 DOI: 10.3389/fpubh.2021.606711] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/15/2020] [Accepted: 01/25/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Previous studies have constructed prediction models for type 2 diabetes mellitus (T2DM), but machine learning was rarely used and few focused on genetic prediction. This study aimed to establish an effective T2DM prediction tool and to further explore the potential of genetic risk scores (GRS) via various classifiers among rural adults. Methods: In this prospective study, the GRS for a total of 5,712 participants from the Henan Rural Cohort Study was calculated. Cox proportional hazards (CPH) regression was used to analyze the associations between GRS and T2DM. CPH, artificial neural network (ANN), random forest (RF), and gradient boosting machine (GBM) were each used to establish prediction models. The area under the receiver operating characteristic curve (AUC) and net reclassification index (NRI) were used to assess the discrimination ability of the models. The decision curve was plotted to determine the clinical utility of the prediction models. Results: Compared with the individuals in the lowest quintile of the GRS, the HR (95% CI) was 2.06 (1.40 to 3.03) for those in the highest quintile of GRS (Ptrend < 0.05). Based on conventional predictors, the AUCs of the prediction model were 0.815, 0.816, 0.843, and 0.851 via CPH, ANN, RF, and GBM, respectively. Changes with the integration of GRS for CPH, ANN, RF, and GBM were 0.001, 0.002, 0.018, and 0.033, respectively. The reclassifications were significantly improved for all classifiers when adding GRS (NRI: 41.2% for CPH; 41.0% for ANN; 46.4% for RF; 45.1% for GBM). Decision curve analysis indicated the clinical benefits of the models combined with GRS. Conclusion: The prediction model combined with GRS may provide incremental predictive performance beyond conventional factors for T2DM, which demonstrates the potential clinical use of genetic markers to screen vulnerable populations.
Clinical Trial Registration: The Henan Rural Cohort Study is registered in the Chinese Clinical Trial Register (Registration number: ChiCTR-OOC-15006699). http://www.chictr.org.cn/showproj.aspx?proj=11375.
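The two discrimination measures reported above can be sketched in a few lines. AUC is computed here as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half. The abstract does not state which NRI variant was used, so the continuous (category-free) form is assumed; the risk values and names below are invented for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def continuous_nri(old, new, labels):
    """Continuous NRI: net upward moves in events minus net upward moves in non-events."""
    def net_up(group):
        up = sum(n > o for o, n in group)
        down = sum(n < o for o, n in group)
        return (up - down) / len(group)

    events = [(o, n) for o, n, y in zip(old, new, labels) if y == 1]
    nonevents = [(o, n) for o, n, y in zip(old, new, labels) if y == 0]
    return net_up(events) - net_up(nonevents)


# Hypothetical predicted risks for six participants (1 = developed T2DM).
labels = [1, 1, 1, 0, 0, 0]
base = [0.6, 0.4, 0.7, 0.5, 0.3, 0.2]        # conventional predictors only
with_grs = [0.7, 0.6, 0.65, 0.4, 0.35, 0.1]  # conventional predictors plus GRS

auc_base = auc(base, labels)
auc_grs = auc(with_grs, labels)
nri = continuous_nri(base, with_grs, labels)
```

A small change in AUC can coexist with a large NRI, because NRI counts the direction of every individual risk change rather than only changes in rank order.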
Affiliation(s)
- Yikang Wang
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Liying Zhang
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
  - School of Information Engineering, Zhengzhou University, Zhengzhou, China
- Miaomiao Niu
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Ruiying Li
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Runqi Tu
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Xiaotian Liu
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Jian Hou
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Zhenxing Mao
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
- Zhenfei Wang
  - School of Information Engineering, Zhengzhou University, Zhengzhou, China
- Chongjian Wang
  - Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, China
|
59
|
Puvača N, de Llanos Frutos R. Antimicrobial Resistance in Escherichia coli Strains Isolated from Humans and Pet Animals. Antibiotics (Basel) 2021; 10:69. [PMID: 33450827 PMCID: PMC7828219 DOI: 10.3390/antibiotics10010069] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/17/2020] [Revised: 01/06/2021] [Accepted: 01/12/2021] [Indexed: 12/12/2022] Open
Abstract
Throughout the scientific literature, we can find evidence that antimicrobial resistance has become a major problem on a global scale in recent years. Public healthcare systems all over the world face a great challenge in this respect. Many bacteria can cause infections in humans and animals alike, but the greatest threat nowadays seems to come from members of the Enterobacteriaceae, especially Escherichia coli. The systems that bacteria have developed to fight off antibiotics are the strongest and most diverse in Enterobacteriaceae. Understanding these systems is a great advantage, because it can help us understand the connection between these microorganisms and the occurrence of antibiotic resistance both in humans and in their pets. Furthermore, unfavorable conditions related to the ease of E. coli transmission via the fecal-oral route among humans, environmental sources, and animals only add to the problem. For all the reasons stated above, the epidemiology of E. coli strains and the resistance mechanisms they have developed over time are extremely significant topics, and all scientific findings in this area will be of vital importance in the fight against infections caused by these bacteria.
Affiliation(s)
- Nikola Puvača
  - Faculty of Biomedical and Health Sciences, Jaume I University, Avinguda de Vicent Sos Baynat, s/n, 12071 Castelló de la Plana, Spain
  - Department of Engineering Management in Biotechnology, Faculty of Economics and Engineering Management in Novi Sad, University Business Academy in Novi Sad, Cvećarska 2, 21000 Novi Sad, Serbia
- Rosa de Llanos Frutos
  - Faculty of Biomedical and Health Sciences, Jaume I University, Avinguda de Vicent Sos Baynat, s/n, 12071 Castelló de la Plana, Spain
|
60
|
Abstract
Escherichia coli is a clinically important bacterial species implicated in human- and livestock-associated infections worldwide. The bacterium is known to reside in the guts of humans, livestock, and wild animals. Escherichia coli is a common bacterial species in the gastrointestinal tracts of warm-blooded animals and humans. Pathogenicity and antimicrobial resistance in E. coli may emerge via host switching from animal reservoirs. Despite its potential clinical importance, knowledge of the population structure of commensal E. coli within wild hosts and the epidemiological links between E. coli in nonhuman hosts and E. coli in humans is still scarce. In this study, we analyzed the whole-genome sequencing data of a collection of 119 commensal E. coli strains recovered from the guts of 55 mammal and bird species in Mexico and Venezuela in the 1990s. We observed low concordance between the population structures of E. coli isolates colonizing wild animals and the phylogeny, taxonomy, and ecological and physiological attributes of the host species, with distantly related E. coli strains often colonizing the same or similar host species and distantly related host species often hosting closely related E. coli strains. We found no evidence for recent transmission of E. coli genomes from wild animals to either domesticated animals or humans. However, multiple livestock- and human-related virulence factor genes were present in E. coli of wild animals, including virulence factors characteristic of Shiga toxin-producing E. coli (STEC) and atypical enteropathogenic E. coli (aEPEC), where several isolates from wild hosts harbored the locus of enterocyte effacement (LEE) pathogenicity island. Moreover, E. coli isolates from wild animal hosts often harbored known antibiotic resistance determinants, including those against ciprofloxacin, aminoglycosides, tetracyclines, and beta-lactams, with some determinants present in multiple, distantly related E. coli lineages colonizing very different host animals. We conclude that genome pools of E. coli colonizing the guts of wild animals and humans share virulence and antibiotic resistance genes, underscoring the idea that wild animals could serve as reservoirs for E. coli pathogenicity in human and livestock infections.
IMPORTANCE Escherichia coli is a clinically important bacterial species implicated in human- and livestock-associated infections worldwide. The bacterium is known to reside in the guts of humans, livestock, and wild animals. Although wild animals are recognized as potential reservoirs for pathogenic E. coli strains, the knowledge of the population structure of E. coli in wild hosts is still scarce. In this study, we used fine resolution of whole-genome sequencing to provide novel insights into the evolution of E. coli genomes from a small yet diverse collection of strains recovered within a broad range of wild animal species (including mammals and birds), the coevolution of E. coli strains with their hosts, and the genetics of pathogenicity of E. coli strains in wild hosts in Mexico. Our results provide evidence for the clinical importance of wild animals as reservoirs for pathogenic strains and highlight the need to include nonhuman hosts in the surveillance programs for E. coli infections.
|
61
|
Robust detection of point mutations involved in multidrug-resistant Mycobacterium tuberculosis in the presence of co-occurrent resistance markers. PLoS Comput Biol 2020; 16:e1008518. [PMID: 33347430 PMCID: PMC7785249 DOI: 10.1371/journal.pcbi.1008518] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/11/2020] [Revised: 01/05/2021] [Accepted: 11/11/2020] [Indexed: 11/23/2022] Open
Abstract
Tuberculosis disease is a major global public health concern and the growing prevalence of drug-resistant Mycobacterium tuberculosis is making disease control more difficult. However, the increasing application of whole-genome sequencing as a diagnostic tool is leading to the profiling of drug resistance to inform clinical practice and treatment decision making. Computational approaches for identifying established and novel resistance-conferring mutations in genomic data include genome-wide association study (GWAS) methodologies, tests for convergent evolution and machine learning techniques. These methods may be confounded by extensive co-occurrent resistance, where statistical models for a drug include unrelated mutations known to be causing resistance to other drugs. Here, we introduce a novel ‘cannibalistic’ elimination algorithm (“Hungry, Hungry SNPos”) that attempts to remove these co-occurrent resistant variants. Using an M. tuberculosis genomic dataset for the virulent Beijing strain-type (n = 3,574) with phenotypic resistance data across five drugs (isoniazid, rifampicin, ethambutol, pyrazinamide, and streptomycin), we demonstrate that this new approach is considerably more robust than traditional methods and detects resistance-associated variants too rare to be likely picked up by correlation-based techniques like GWAS.
Tuberculosis is one of the deadliest infectious diseases, being responsible for more than one million deaths per year. The causing bacteria are becoming increasingly drug-resistant, which is hampering disease control. At the same time, an unprecedented amount of bacterial whole-genome sequencing is increasingly informing clinical practice. In order to detect the genetic alterations responsible for developing drug resistance and predict resistance status from genomic data, bio-statistical methods and machine learning models have been employed. However, due to strongly overlapping drug resistance phenotypes and genotypes in multidrug-resistant datasets, the results of these correlation-based approaches frequently also contain mutations related to resistance against other drugs. In the past, this issue has often been ignored or partially resolved by either restricting the input data or in post-analysis screening, with both strategies relying on prior information. Here we present a heuristic algorithm for finding resistance-associated variants and demonstrate that it is considerably more robust towards co-occurrent resistance compared to traditional techniques. The software is available at https://github.com/julibeg/HHS.
62
Higdon SM, Huang BC, Bennett AB, Weimer BC. Identification of Nitrogen Fixation Genes in Lactococcus Isolated from Maize Using Population Genomics and Machine Learning. Microorganisms 2020; 8:microorganisms8122043. [PMID: 33419343 PMCID: PMC7768417 DOI: 10.3390/microorganisms8122043] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 11/10/2020] [Revised: 12/08/2020] [Accepted: 12/17/2020] [Indexed: 02/06/2023] Open
Abstract
Sierra Mixe maize is a landrace variety from Oaxaca, Mexico, that utilizes nitrogen derived from the atmosphere via an undefined nitrogen fixation mechanism. The diazotrophic microbiota associated with the plant’s mucilaginous aerial root exudate composed of complex carbohydrates was previously identified and characterized by our group where we found 23 lactococci capable of biological nitrogen fixation (BNF) without containing any of the proposed essential genes for this trait (nifHDKENB). To determine the genes in Lactococcus associated with this phenotype, we selected 70 lactococci from the dairy industry that are not known to be diazotrophic to conduct a comparative population genomic analysis. This showed that the diazotrophic lactococcal genomes were distinctly different from the dairy isolates. Examining the pangenome followed by genome-wide association study and machine learning identified genes with the functions needed for BNF in the maize isolates that were absent from the dairy isolates. Many of the putative genes received an ‘unknown’ annotation, which led to the domain analysis of the 135 homologs. This revealed genes with molecular functions needed for BNF, including mucilage carbohydrate catabolism, glycan-mediated host adhesion, iron/siderophore utilization, and oxidation/reduction control. This is the first report of this pathway in this organism to underpin BNF. Consequently, we proposed a model needed for BNF in lactococci that plausibly accounts for BNF in the absence of the nif operon in this organism.
Affiliation(s)
- Shawn M. Higdon
  - Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Bihua C. Huang
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
  - 100 K Pathogen Genome Project, University of California, Davis, CA 95616, USA
- Alan B. Bennett
  - Department of Plant Sciences, University of California, Davis, CA 95616, USA
- Bart C. Weimer
  - Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
  - 100 K Pathogen Genome Project, University of California, Davis, CA 95616, USA
63
Wang Y, Li F, Bharathwaj M, Rosas NC, Leier A, Akutsu T, Webb GI, Marquez-Lago TT, Li J, Lithgow T, Song J. DeepBL: a deep learning-based approach for in silico discovery of beta-lactamases. Brief Bioinform 2020; 22:5992357. [PMID: 33212503 DOI: 10.1093/bib/bbaa301] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/16/2020] [Revised: 10/05/2020] [Accepted: 10/09/2020] [Indexed: 01/14/2023] Open
Abstract
Beta-lactamases (BLs) are enzymes localized in the periplasmic space of bacterial pathogens, where they confer resistance to beta-lactam antibiotics. Experimental identification of BLs is costly yet crucial to understand beta-lactam resistance mechanisms. To address this issue, we present DeepBL, a deep learning-based approach that incorporates sequence-derived features to enable high-throughput prediction of BLs. Specifically, DeepBL is implemented based on the Small VGGNet architecture and the TensorFlow deep learning library. Furthermore, the performance of DeepBL models is investigated in relation to the sequence redundancy level and negative sample selection in the benchmark dataset. The models are trained on datasets of varying sequence redundancy thresholds, and the model performance is evaluated by extensive benchmarking tests. Using the optimized DeepBL model, we perform proteome-wide screening for all reviewed bacterial protein sequences available from the UniProt database. These results are freely accessible at the DeepBL webserver at http://deepbl.erc.monash.edu.au/.
Affiliation(s)
- Yanan Wang
  - Biomedicine Discovery Institute and the Department of Biochemistry and Molecular Biology at Monash University, Australia
- Fuyi Li
  - Bioinformatics from Monash University, Australia
- Manasa Bharathwaj
  - Department of Microbiology at the Biomedicine Discovery Institute, Monash University, Australia
- Natalia C Rosas
  - Department of Microbiology at the Biomedicine Discovery Institute, Monash University, Australia
- André Leier
  - Department of Genetics and the Department of Cell, Developmental and Integrative Biology, University of Alabama at Birmingham (UAB) School of Medicine, USA
- Tatiana T Marquez-Lago
  - Department of Genetics and the Department of Cell, Developmental and Integrative Biology, UAB School of Medicine, USA
- Jian Li
  - Monash Biomedicine Discovery Institute and Department of Microbiology, Monash University, Australia
- Trevor Lithgow
  - Department of Microbiology at Monash University, Australia
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
64
Li X, Lin J, Hu Y, Zhou J. PARMAP: A Pan-Genome-Based Computational Framework for Predicting Antimicrobial Resistance. Front Microbiol 2020; 11:578795. [PMID: 33193203 PMCID: PMC7642336 DOI: 10.3389/fmicb.2020.578795] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/01/2020] [Accepted: 09/24/2020] [Indexed: 11/17/2022] Open
Abstract
Antimicrobial resistance (AMR) has emerged as one of the most urgent global threats to public health. Accurate detection of AMR phenotypes is critical for reducing the spread of AMR strains. Here, we developed PARMAP (Prediction of Antimicrobial Resistance by MAPping genetic alterations in pan-genome) to predict AMR phenotypes and to identify AMR-associated genetic alterations based on the pan-genome of bacteria by utilizing machine learning algorithms. When we applied PARMAP to 1,597 Neisseria gonorrhoeae strains, it successfully predicted their AMR phenotypes based on a pan-genome analysis. Furthermore, it identified 328 genetic alterations in 23 known AMR genes and discovered many new AMR-associated genetic alterations in ciprofloxacin-resistant N. gonorrhoeae, and it clearly indicated the genetic heterogeneity of AMR genes in different subtypes of resistant N. gonorrhoeae. Additionally, PARMAP performed well in predicting the AMR phenotypes of Mycobacterium tuberculosis and Escherichia coli, indicating the robustness of the PARMAP framework. In conclusion, PARMAP not only precisely predicts the AMR of a population of strains of a given species but also uses whole-genome sequencing data to prioritize candidate AMR-associated genetic alterations based on their likelihood of contributing to AMR. Thus, we believe that PARMAP will accelerate investigations into AMR mechanisms in other human pathogens.
Affiliation(s)
- Xuefei Li
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Jingxia Lin
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Yongfei Hu
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Jiajian Zhou
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
65
Tunstall T, Portelli S, Phelan J, Clark TG, Ascher DB, Furnham N. Combining structure and genomics to understand antimicrobial resistance. Comput Struct Biotechnol J 2020; 18:3377-3394. [PMID: 33294134 PMCID: PMC7683289 DOI: 10.1016/j.csbj.2020.10.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/18/2020] [Revised: 10/15/2020] [Accepted: 10/17/2020] [Indexed: 02/07/2023] Open
Abstract
Antimicrobials against bacterial, viral and parasitic pathogens have transformed human and animal health. Nevertheless, their widespread use (and misuse) has led to the emergence of antimicrobial resistance (AMR) which poses a potentially catastrophic threat to public health and animal husbandry. There are several routes, both intrinsic and acquired, by which AMR can develop. One major route is through non-synonymous single nucleotide polymorphisms (nsSNPs) in coding regions. Large scale genomic studies using high-throughput sequencing data have provided powerful new ways to rapidly detect and respond to such genetic mutations linked to AMR. However, these studies are limited in their mechanistic insight. Computational tools can rapidly and inexpensively evaluate the effect of mutations on protein function and evolution. Subsequent insights can then inform experimental studies, and direct existing or new computational methods. Here we review a range of sequence and structure-based computational tools, focussing on tools successfully used to investigate mutational effect on drug targets in clinically important pathogens, particularly Mycobacterium tuberculosis. Combining genomic results with the biophysical effects of mutations can help reveal the molecular basis and consequences of resistance development. Furthermore, we summarise how the application of such a mechanistic understanding of drug resistance can be applied to limit the impact of AMR.
Affiliation(s)
- Tanushree Tunstall
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
- Stephanie Portelli
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Australia
- Jody Phelan
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
- Taane G. Clark
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
  - Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
- David B. Ascher
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Australia
  - Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Australia
- Nicholas Furnham
  - Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
66
Amino Acid k-mer Feature Extraction for Quantitative Antimicrobial Resistance (AMR) Prediction by Machine Learning and Model Interpretation for Biological Insights. BIOLOGY 2020; 9:biology9110365. [PMID: 33126516 PMCID: PMC7694136 DOI: 10.3390/biology9110365] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/21/2020] [Revised: 10/17/2020] [Accepted: 10/19/2020] [Indexed: 12/31/2022]
Abstract
Machine learning algorithms can learn mechanisms of antimicrobial resistance from DNA sequence data without any a priori information. Interpreting a trained machine learning algorithm can be exploited for validating the model and obtaining new information about resistance mechanisms. Different feature extraction methods, such as SNP calling and counting nucleotide k-mers, have been proposed for presenting DNA sequences to the model. However, there are trade-offs between interpretability, computational complexity and accuracy for different feature extraction methods. In this study, we have proposed a new feature extraction method, counting amino acid k-mers or oligopeptides, which provides easier model interpretation compared to counting nucleotide k-mers and reaches the same or even better accuracy in comparison with different methods. Additionally, we have trained machine learning algorithms using different feature extraction methods and compared the results in terms of accuracy, model interpretability and computational complexity. We have built a new feature selection pipeline for extraction of important features so that new AMR determinants can be discovered by analyzing these features. This pipeline allows the construction of models that only use a small number of features and can predict resistance accurately.
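As a rough illustration of the feature extraction this abstract describes, the sketch below counts overlapping amino acid k-mers (oligopeptides) in a single protein sequence. The function name `count_aa_kmers`, the choice of k = 3, and the toy sequence are assumptions made for illustration; none of them is taken from the paper.

```python
from collections import Counter


def count_aa_kmers(protein: str, k: int = 3) -> Counter:
    """Count overlapping amino acid k-mers in one protein sequence.

    Hypothetical helper for illustration only; not the paper's pipeline.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    protein = protein.upper()
    # Slide a window of width k; a sequence shorter than k yields no k-mers.
    return Counter(protein[i:i + k] for i in range(len(protein) - k + 1))


# Toy sequence (made up): nine residues give seven overlapping 3-mers.
counts = count_aa_kmers("MKTAYMKTA", k=3)
```

In a full workflow, the per-protein counts would presumably be summed per genome and arranged into a genome-by-k-mer matrix before model training; that step is omitted here.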
67
Jaillard M, Palmieri M, van Belkum A, Mahé P. Interpreting k-mer-based signatures for antibiotic resistance prediction. Gigascience 2020; 9:giaa110. [PMID: 33068113 PMCID: PMC7568433 DOI: 10.1093/gigascience/giaa110] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/15/2020] [Revised: 07/23/2020] [Accepted: 09/16/2020] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND Recent years have witnessed the development of several k-mer-based approaches aiming to predict phenotypic traits of bacteria on the basis of their whole-genome sequences. While often convincing in terms of predictive performance, the underlying models are in general not straightforward to interpret, the interplay between the actual genetic determinant and its translation as k-mers being generally hard to decipher. RESULTS We propose a simple and computationally efficient strategy allowing one to cope with the high correlation inherent to k-mer-based representations in supervised machine learning models, leading to concise and easily interpretable signatures. We demonstrate the benefit of this approach on the task of predicting the antibiotic resistance profile of a Klebsiella pneumoniae strain from its genome, where our method leads to signatures defined as weighted linear combinations of genetic elements that can easily be identified as genuine antibiotic resistance determinants, with state-of-the-art predictive performance. CONCLUSIONS By enhancing the interpretability of genomic k-mer-based antibiotic resistance prediction models, our approach improves their clinical utility and hence will facilitate their adoption in routine diagnostics by clinicians and microbiologists. While antibiotic resistance was the motivating application, the method is generic and can be transposed to any other bacterial trait. An R package implementing our method is available at https://gitlab.com/biomerieux-data-science/clustlasso.
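Part of the correlation mentioned in this abstract comes from k-mers that share exactly the same presence/absence pattern across genomes. The sketch below shows one simple, commonly used preprocessing step, grouping such k-mers so that each group can enter a model as a single feature. It is not the clustlasso procedure itself; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def group_identical_patterns(presence: dict[str, tuple[int, ...]]) -> list[list[str]]:
    """Group k-mers whose presence/absence vectors across genomes are identical.

    `presence` maps each k-mer to a tuple of 0/1 flags, one per genome.
    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for kmer, pattern in presence.items():
        groups[pattern].append(kmer)
    # Groups are returned in the order their pattern was first seen.
    return [sorted(kmers) for kmers in groups.values()]


# Toy data (made up): three k-mers observed across three genomes.
toy = {
    "ACGT": (1, 0, 1),
    "CGTA": (1, 0, 1),  # same pattern as ACGT, so perfectly correlated
    "TTGA": (0, 1, 1),
}
groups = group_identical_patterns(toy)
```

Exact-pattern grouping only removes perfect correlation; handling strongly but imperfectly correlated k-mers needs a clustering step, which is where a dedicated method such as the one the authors describe would come in.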
Affiliation(s)
- Pierre Mahé
  - bioMérieux, Chemin de l'Orme, 69280 Marcy l'Etoile, France
68
McDermott PF, Davis JJ. Predicting antimicrobial susceptibility from the bacterial genome: A new paradigm for one health resistance monitoring. J Vet Pharmacol Ther 2020; 44:223-237. [PMID: 33010049 DOI: 10.1111/jvp.12913] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/13/2020] [Revised: 08/25/2020] [Accepted: 09/09/2020] [Indexed: 12/11/2022]
Abstract
The laboratory identification of antibacterial resistance is a cornerstone of infectious disease medicine. In vitro antimicrobial susceptibility testing has long been based on the growth response of organisms in pure culture to a defined concentration of antimicrobial agents. By comparing individual isolates to wild-type susceptibility patterns, strains with acquired resistance can be identified. Acquired resistance can also be detected genetically. After many decades of research, the inventory of genes underlying antimicrobial resistance is well known for several pathogenic genera including zoonotic enteric organisms such as Salmonella and Campylobacter and continues to grow substantially for others. With the decline in costs for large scale DNA sequencing, it is now practicable to characterize bacteria using whole genome sequencing, including the carriage of resistance genes in individual microorganisms and those present in complex biological samples. With genomics, we can generate comprehensive, detailed information on the bacterium, the mechanisms of antibiotic resistance, clues to its source, and the nature of mobile DNA elements by which resistance spreads. These developments point to a new paradigm for antimicrobial resistance detection and tracking for both clinical and public health purposes.
Affiliation(s)
- Patrick F McDermott
  - Office of Research, Center for Veterinary Medicine, U.S. Food and Drug Administration, Laurel, MD, USA
- James J Davis
  - Division of Data Science and Learning, Argonne National Laboratory, Argonne, IL, USA
  - University of Chicago Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
69
Nguyen M, Olson R, Shukla M, VanOeffelen M, Davis JJ. Predicting antimicrobial resistance using conserved genes. PLoS Comput Biol 2020; 16:e1008319. [PMID: 33075053 PMCID: PMC7595632 DOI: 10.1371/journal.pcbi.1008319] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/11/2020] [Revised: 10/29/2020] [Accepted: 09/07/2020] [Indexed: 11/18/2022] Open
Abstract
A growing number of studies are using machine learning models to accurately predict antimicrobial resistance (AMR) phenotypes from bacterial sequence data. Although these studies are showing promise, the models are typically trained using features derived from comprehensive sets of AMR genes or whole genome sequences and may not be suitable for use when genomes are incomplete. In this study, we explore the possibility of predicting AMR phenotypes using incomplete genome sequence data. Models were built from small sets of randomly-selected core genes after removing the AMR genes. For Klebsiella pneumoniae, Mycobacterium tuberculosis, Salmonella enterica, and Staphylococcus aureus, we report that it is possible to classify susceptible and resistant phenotypes with average F1 scores ranging from 0.80-0.89 with as few as 100 conserved non-AMR genes, with very major error rates ranging from 0.11-0.23 and major error rates ranging from 0.10-0.20. Models built from core genes have predictive power in cases where the primary AMR mechanisms result from SNPs or horizontal gene transfer. By randomly sampling non-overlapping sets of core genes, we show that F1 scores and error rates are stable and have little variance between replicates. Although these small core gene models have lower accuracies and higher error rates than models built from the corresponding assembled genomes, the results suggest that sufficient variation exists in the core non-AMR genes of a species for predicting AMR phenotypes.
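The three metrics reported in this abstract can be computed from a confusion matrix of resistant (R) and susceptible (S) calls. The sketch below assumes the usual susceptibility-testing definitions: a very major error is a resistant isolate predicted susceptible (rate taken over resistant isolates), and a major error is a susceptible isolate predicted resistant (rate taken over susceptible isolates). The F1 score here treats resistant as the positive class, which may differ from the averaging used in the paper. The function name and toy labels are hypothetical.

```python
def amr_metrics(y_true, y_pred):
    """Return F1, very major error rate and major error rate for R/S calls.

    y_true and y_pred are equal-length sequences of "R" or "S".
    Hypothetical helper for illustration only.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == "R" and p == "R" for t, p in pairs)  # resistant, called resistant
    fn = sum(t == "R" and p == "S" for t, p in pairs)  # very major errors
    fp = sum(t == "S" and p == "R" for t, p in pairs)  # major errors
    tn = sum(t == "S" and p == "S" for t, p in pairs)  # susceptible, called susceptible

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "f1": f1,
        "very_major_error_rate": fn / (tp + fn) if tp + fn else 0.0,
        "major_error_rate": fp / (fp + tn) if fp + tn else 0.0,
    }


# Toy labels (made up): 4 resistant and 6 susceptible isolates.
truth = ["R", "R", "R", "R", "S", "S", "S", "S", "S", "S"]
calls = ["R", "R", "R", "S", "S", "S", "S", "S", "R", "R"]
metrics = amr_metrics(truth, calls)
```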
Affiliation(s)
- Marcus Nguyen
  - Division of Data Science and Learning, Argonne National Laboratory, Argonne, Illinois, United States of America
  - Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, United States of America
- Robert Olson
  - Division of Data Science and Learning, Argonne National Laboratory, Argonne, Illinois, United States of America
  - Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, United States of America
- Maulik Shukla
  - Division of Data Science and Learning, Argonne National Laboratory, Argonne, Illinois, United States of America
  - Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, United States of America
- Margo VanOeffelen
  - Fellowship for Interpretation of Genomes, Burr Ridge, Illinois, United States of America
- James J. Davis
  - Division of Data Science and Learning, Argonne National Laboratory, Argonne, Illinois, United States of America
  - Consortium for Advanced Science and Engineering, University of Chicago, Chicago, Illinois, United States of America
  - Fellowship for Interpretation of Genomes, Burr Ridge, Illinois, United States of America
  - Northwestern Argonne Institute for Science and Engineering, Evanston, Illinois, United States of America
70
Pataki BÁ, Matamoros S, van der Putten BCL, Remondini D, Giampieri E, Aytan-Aktug D, Hendriksen RS, Lund O, Csabai I, Schultsz C. Understanding and predicting ciprofloxacin minimum inhibitory concentration in Escherichia coli with machine learning. Sci Rep 2020; 10:15026. [PMID: 32929164 PMCID: PMC7490380 DOI: 10.1038/s41598-020-71693-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/25/2020] [Accepted: 08/18/2020] [Indexed: 11/13/2022] Open
Abstract
It is important that antibiotic prescriptions are based on antimicrobial susceptibility data to ensure effective treatment outcomes. With the increasing availability of next-generation sequencing, bacterial whole genome sequencing (WGS) can provide a more reliable and faster alternative to traditional phenotyping for the detection and surveillance of AMR. This work proposes a machine learning approach that can predict the minimum inhibitory concentration (MIC) for a given antibiotic, here ciprofloxacin, on the basis of both genome-wide mutation profiles and profiles of acquired antimicrobial resistance genes. We analysed 704 Escherichia coli genomes combined with their respective MIC measurements for ciprofloxacin originating from different countries. The four most important predictors found by the model, mutations in gyrA residues Ser83 and Asp87, a mutation in parC residue Ser80 and presence of the qnrS1 gene, have been experimentally validated before. Using only these four predictors in a linear regression model, 65% and 93% of the test samples’ MIC were correctly predicted within a two- and a four-fold dilution range, respectively. The presented work does not treat machine learning only as a black box, but also identifies the genomic features that determine susceptibility. The recent progress in WGS technology in combination with machine learning analysis approaches indicates that in the near future WGS of bacteria might become cheaper and faster than an MIC measurement.
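The "within a two- and a four-fold dilution range" accuracy quoted in this abstract can be sketched as follows. The code assumes that a prediction counts as correct when the predicted and measured MIC differ by at most the stated factor, compared on a log2 scale; the paper may define the range differently. The function name and the toy MIC values are hypothetical.

```python
import math


def within_fold_accuracy(true_mic, pred_mic, fold):
    """Fraction of predictions within `fold`-fold of the measured MIC.

    Compares values on a log2 scale, as MICs are measured in doubling dilutions.
    Hypothetical helper for illustration only.
    """
    if not true_mic or len(true_mic) != len(pred_mic):
        raise ValueError("inputs must be non-empty and of equal length")
    tolerance = math.log2(fold)
    hits = sum(
        # Small epsilon guards against floating-point rounding at the boundary.
        abs(math.log2(p) - math.log2(t)) <= tolerance + 1e-9
        for t, p in zip(true_mic, pred_mic)
    )
    return hits / len(true_mic)


# Toy MIC values in mg/L (made up).
measured = [0.015, 0.25, 4.0, 32.0]
predicted = [0.03, 0.25, 1.0, 4.0]
acc_two_fold = within_fold_accuracy(measured, predicted, fold=2)
acc_four_fold = within_fold_accuracy(measured, predicted, fold=4)
```

Working on the log2 scale makes over- and under-prediction by the same number of dilution steps count equally.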
Affiliation(s)
- Bálint Ármin Pataki
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest, Hungary
- Sébastien Matamoros
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Boas C L van der Putten
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Daniel Remondini
  - Department of Physics and Astronomy (DIFA), University of Bologna, Bologna, Italy
- Enrico Giampieri
  - Department of Experimental, Diagnostic and Specialty Medicine (DIMES), University of Bologna, Bologna, Italy
- Derya Aytan-Aktug
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
- Rene S Hendriksen
  - National Food Institute, Technical University of Denmark, Lyngby, Denmark
- Ole Lund
  - Department of Bioinformatics, Technical University of Denmark, Lyngby, Denmark
- István Csabai
  - Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary
  - Department of Computational Sciences, Wigner Research Centre for Physics of the HAS, Budapest, Hungary
- Constance Schultsz
  - Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
71
Abstract
BACKGROUND During the past decade, breakthroughs in sequencing technology and computational biology have provided the basis for studies of the myriad ways in which microbial communities ("microbiota") in and on the human body influence human health and disease. In almost every medical specialty, there is now a growing interest in accurate and replicable profiling of the microbiota for use in diagnostic and therapeutic applications. CONTENT This review provides an overview of approaches, challenges, and considerations for diagnostic applications borrowing from other areas of molecular diagnostics, including clinical metagenomics. Methodological considerations and evolving approaches for microbiota profiling, from 16S rRNA-based amplicon sequencing to metagenomics and metatranscriptomics, are discussed. To improve replicability, at least the most vulnerable steps in testing workflows will need to be standardized, and continuous efforts will be needed to define QC standards. Challenges such as purity of reagents and consumables, improvement of reference databases, and availability of diagnostic-grade data analysis solutions will require joint efforts across disciplines and with manufacturers. SUMMARY The body of literature supporting important links between the microbiota at different anatomic sites and human health and disease is expanding rapidly, and therapeutic manipulation of the intestinal microbiota is becoming routine. The next decade will likely see implementation of microbiome diagnostics in diagnostic laboratories to fully capitalize on technological and scientific advances and apply them in routine medical practice.
Affiliation(s)
- Robert Schlaberg
  - Department of Pathology, University of Utah, Salt Lake City, UT
  - ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, UT
  - IDbyDNA Inc., San Francisco, CA
72
Lees JA, Mai TT, Galardini M, Wheeler NE, Horsfield ST, Parkhill J, Corander J. Improved Prediction of Bacterial Genotype-Phenotype Associations Using Interpretable Pangenome-Spanning Regressions. mBio 2020; 11:e01344-20. [PMID: 32636251 PMCID: PMC7343994 DOI: 10.1128/mbio.01344-20] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 05/22/2020] [Accepted: 06/05/2020] [Indexed: 12/19/2022] Open
Abstract
Discovery of genetic variants underlying bacterial phenotypes and the prediction of phenotypes such as antibiotic resistance are fundamental tasks in bacterial genomics. Genome-wide association study (GWAS) methods have been applied to study these relations, but the plastic nature of bacterial genomes and the clonal structure of bacterial populations create challenges. We introduce an alignment-free method which finds sets of loci associated with bacterial phenotypes, quantifies the total effect of genetics on the phenotype, and allows accurate phenotype prediction, all within a single computationally scalable joint modeling framework. Genetic variants covering the entire pangenome are compactly represented by extended DNA sequence words known as unitigs, and model fitting is achieved using elastic net penalization, an extension of standard multiple regression. Using an extensive set of state-of-the-art bacterial population genomic data sets, we demonstrate that our approach performs accurate phenotype prediction, comparable to popular machine learning methods, while retaining both interpretability and computational efficiency. Compared to those of previous approaches, which test each genotype-phenotype association separately for each variant and apply a significance threshold, the variants selected by our joint modeling approach overlap substantially. IMPORTANCE Being able to identify the genetic variants responsible for specific bacterial phenotypes has been the goal of bacterial genetics since its inception and is fundamental to our current level of understanding of bacteria. This identification has been based primarily on painstaking experimentation, but the availability of large data sets of whole genomes with associated phenotype metadata promises to revolutionize this approach, not least for important clinical phenotypes that are not amenable to laboratory analysis. These models of phenotype-genotype association can in the future be used for rapid prediction of clinically important phenotypes such as antibiotic resistance and virulence by rapid-turnaround or point-of-care tests. However, despite much effort being put into adapting genome-wide association study (GWAS) approaches to cope with bacterium-specific problems, such as strong population structure and horizontal gene exchange, current approaches are not yet optimal. We describe a method that advances methodology for both association and generation of portable prediction models.
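For reference, elastic net penalization is commonly written as the penalized least-squares problem below. The notation is an assumption following the widely used glmnet convention, not a transcription of the paper's own formulation: $x_i$ is the unitig presence/absence vector of genome $i$, $y_i$ its phenotype, $n$ the number of genomes, and $\lambda \ge 0$ and $\alpha \in [0, 1]$ are tuning parameters.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \beta_0 - x_i^{\top}\beta \right)^2
  + \lambda \left[ \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right]
```

Setting $\alpha = 1$ gives the lasso and $\alpha = 0$ gives ridge regression; intermediate values keep groups of correlated variants in the model while still shrinking many coefficients to exactly zero. For a binary phenotype the squared-error term is typically replaced by the logistic negative log-likelihood.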
Affiliation(s)
- John A Lees
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- T Tien Mai
  - Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
- Marco Galardini
  - Biological Design Center, Boston University, Boston, Massachusetts, USA
- Nicole E Wheeler
  - Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Samuel T Horsfield
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- Julian Parkhill
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
- Jukka Corander
  - Oslo Centre for Biostatistics and Epidemiology, Department of Biostatistics, University of Oslo, Oslo, Norway
  - Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
  - Helsinki Institute of Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
73
PARGT: a software tool for predicting antimicrobial resistance in bacteria. Sci Rep 2020; 10:11033. [PMID: 32620856 PMCID: PMC7335159 DOI: 10.1038/s41598-020-67949-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/17/2020] [Accepted: 06/16/2020] [Indexed: 11/08/2022] Open
Abstract
With the ever-increasing availability of whole-genome sequences, machine-learning approaches can be used as an alternative to traditional alignment-based methods for identifying new antimicrobial-resistance genes. Such approaches are especially helpful when pathogens cannot be cultured in the lab. In previous work, we proposed a game-theory-based feature evaluation algorithm. When using the protein characteristics identified by this algorithm, called ‘features’ in machine learning, our model accurately identified antimicrobial resistance (AMR) genes in Gram-negative bacteria. Here we extend our study to Gram-positive bacteria showing that coupling game-theory-identified features with machine learning achieved classification accuracies between 87% and 90% for genes encoding resistance to the antibiotics bacitracin and vancomycin. Importantly, we present a standalone software tool that implements the game-theory algorithm and machine-learning model used in these studies.
|
74
|
Anani H, Zgheib R, Hasni I, Raoult D, Fournier PE. Interest of bacterial pangenome analyses in clinical microbiology. Microb Pathog 2020; 149:104275. [PMID: 32562810 DOI: 10.1016/j.micpath.2020.104275] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/01/2020] [Revised: 05/22/2020] [Accepted: 05/25/2020] [Indexed: 12/12/2022]
Abstract
Thanks to the progress and decreasing costs in genome sequencing technologies, more than 250,000 bacterial genomes are currently available in public databases, covering most, if not all, of the major human-associated phylogenetic groups of these microorganisms, pathogenic or not. In addition, for many of them, sequences from several strains of a given species are available, making it possible to evaluate their genetic diversity and study their evolution. The significant cost reduction of bacterial whole genome sequencing and the rapid increase in the number of available bacterial genomes have also prompted the development of pangenomic software tools. The study of the bacterial pangenome has many applications in clinical microbiology. It can unveil the pathogenic potential of bacteria and their ability to resist antimicrobials, as well as identify specific sequences and predict antigenic epitopes that allow molecular or serologic assays and vaccines to be designed. Bacterial pangenome analysis constitutes a powerful method for understanding the history of human bacteria and relating these findings to diagnosis in clinical microbiology laboratories in order to optimize patient management.
Affiliation(s)
- Hussein Anani
  - Aix Marseille Univ, Institut de Recherche pour le Développement (IRD), Service de Santé des Armées, AP-HM, UMR Vecteurs Infections Tropicales et Méditerranéennes (VITROME), Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France; Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France
- Rita Zgheib
  - Aix Marseille Univ, Institut de Recherche pour le Développement (IRD), Service de Santé des Armées, AP-HM, UMR Vecteurs Infections Tropicales et Méditerranéennes (VITROME), Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France; Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France
- Issam Hasni
  - Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France; Aix-Marseille Université, Institut de Recherche pour le Développement (IRD), UMR Microbes Evolution Phylogeny and Infections (MEPHI), Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, France
- Didier Raoult
  - Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France; Aix-Marseille Université, Institut de Recherche pour le Développement (IRD), UMR Microbes Evolution Phylogeny and Infections (MEPHI), Institut Hospitalo-Universitaire Méditerranée-Infection, Marseille, France; Special Infectious Agents Unit, King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Pierre-Edouard Fournier
  - Aix Marseille Univ, Institut de Recherche pour le Développement (IRD), Service de Santé des Armées, AP-HM, UMR Vecteurs Infections Tropicales et Méditerranéennes (VITROME), Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France; Institut Hospitalo-Universitaire Méditerranée Infection, Marseille, France
|
75
|
Serafim MSM, Kronenberger T, Oliveira PR, Poso A, Honório KM, Mota BEF, Maltarollo VG. The application of machine learning techniques to innovative antibacterial discovery and development. Expert Opin Drug Discov 2020; 15:1165-1180. [PMID: 32552005 DOI: 10.1080/17460441.2020.1776696] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/13/2022]
Abstract
INTRODUCTION After the initial wave of antibiotic discovery, few novel classes of antibiotics have emerged, with the latest dating back to the 1980's. Furthermore, the pace of antibiotic drug discovery is unable to keep up with the increasing prevalence of antibiotic drug resistance. However, the increasing amount of available data promotes the use of machine learning techniques (MLT) in drug discovery projects (e.g. construction of regression/classification models and ranking/virtual screening of compounds). AREAS COVERED In this review, the authors cover some of the applications of MLT in medicinal chemistry, focusing on the development of new antibiotics, the prediction of resistance and its mechanisms. The aim of this review is to illustrate the main advantages and disadvantages and the major trends from studies over the past 5 years. EXPERT OPINION The application of MLT to antibacterial drug discovery can aid the selection of new and potent lead compounds, with desirable pharmacokinetic and toxic profiles for further optimization. The increasing volume of available data along with the constant improvement in computational power and algorithms has meant that we are experiencing a transition in the way we face modern issues such as drug resistance, where our decisions are data-driven and experiments can be focused by data-suggested hypotheses.
Affiliation(s)
- Mateus Sá Magalhães Serafim
  - Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Thales Kronenberger
  - Department of Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany
- Antti Poso
  - Department of Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany
  - School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
- Káthia Maria Honório
  - Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil
  - Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, Santo André, Brazil
- Bruno Eduardo Fernandes Mota
  - Departamento de Análises Clínicas e Toxicológicas, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Vinícius Gonçalves Maltarollo
  - Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
|
76
|
Application of artificial intelligence to the in silico assessment of antimicrobial resistance and risks to human and animal health presented by priority enteric bacterial pathogens. Can Commun Dis Rep 2020; 46:180-185. [PMID: 32673383 DOI: 10.14745/ccdr.v46i06a05] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022]
Abstract
Each year, approximately one in eight Canadians are affected by foodborne illness, either through outbreaks or sporadic illness, with animals being the major reservoir for the pathogens. Whole genome sequence analyses are now routinely implemented by public and animal health laboratories to define epidemiological disease clusters and to identify potential sources of infection. Similarly, a number of bioinformatics tools can be used to identify virulence and antimicrobial resistance (AMR) determinants in the genomes of pathogenic strains. Many important clinical and phenotypic characteristics of these pathogens can now be predicted using machine learning algorithms applied to whole genome sequence data. In this overview, we compare the ability of support vector machines, gradient-boosted decision trees and artificial neural networks to predict the levels of AMR within Salmonella enterica and extended-spectrum β-lactamase (ESBL) producing Escherichia coli. We show that minimum inhibitory concentrations (MIC) for each of 13 antimicrobials for S. enterica strains can be accurately determined, and that ESBL-producing E. coli strains can be accurately classified as susceptible, intermediate or resistant for each of seven antimicrobials. In addition to AMR and bacterial populations of greatest risk to human health, artificial intelligence algorithms hold promise as tools to predict other clinically and epidemiologically important phenotypes of enteric pathogens.
|
77
|
Macesic N, Bear Don't Walk OJ, Pe'er I, Tatonetti NP, Peleg AY, Uhlemann AC. Predicting Phenotypic Polymyxin Resistance in Klebsiella pneumoniae through Machine Learning Analysis of Genomic Data. mSystems 2020; 5:e00656-19. [PMID: 32457240 PMCID: PMC7253370 DOI: 10.1128/msystems.00656-19] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 10/09/2019] [Accepted: 05/01/2020] [Indexed: 02/06/2023] Open
Abstract
Polymyxins are used as treatments of last resort for Gram-negative bacterial infections. Their increased use has led to concerns about emerging polymyxin resistance (PR). Phenotypic polymyxin susceptibility testing is resource intensive and difficult to perform accurately. The complex polygenic nature of PR and our incomplete understanding of its genetic basis make it difficult to predict PR using detection of resistance determinants. We therefore applied machine learning (ML) to whole-genome sequencing data from >600 Klebsiella pneumoniae clonal group 258 (CG258) genomes to predict phenotypic PR. Using a reference-based representation of genomic data with ML outperformed a rule-based approach that detected variants in known PR genes (area under receiver-operator curve [AUROC], 0.894 versus 0.791, P = 0.006). We noted modest increases in performance by using a bacterial genome-wide association study to filter relevant genomic features and by integrating clinical data in the form of prior polymyxin exposure. Conversely, reference-free representation of genomic data as k-mers was associated with decreased performance (AUROC, 0.692 versus 0.894, P = 0.015). When ML models were interpreted to extract genomic features, six of seven known PR genes were correctly identified by models without prior programming and several genes involved in stress responses and maintenance of the cell membrane were identified as potential novel determinants of PR. These findings are a proof of concept that whole-genome sequencing data can accurately predict PR in K. pneumoniae CG258 and may be applicable to other forms of complex antimicrobial resistance. IMPORTANCE Polymyxins are last-resort antibiotics used to treat highly resistant Gram-negative bacteria. There are increasing reports of polymyxin resistance emerging, raising concerns of a postantibiotic era.
Polymyxin resistance is therefore a significant public health threat, but current phenotypic methods for detection are difficult and time-consuming to perform. There have been increasing efforts to use whole-genome sequencing for detection of antibiotic resistance, but this has been difficult to apply to polymyxin resistance because of its complex polygenic nature. The significance of our research is that we successfully applied machine learning methods to predict polymyxin resistance in Klebsiella pneumoniae clonal group 258, a common health care-associated and multidrug-resistant pathogen. Our findings highlight that machine learning can be successfully applied even in complex forms of antibiotic resistance and represent a significant contribution to the literature that could be used to predict resistance in other bacteria and to other antibiotics.
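The abstract above compares models by AUROC. As a reading aid, the following is a minimal sketch of how an AUROC can be computed from predicted scores, using the pairwise (Mann-Whitney) formulation. It is not the authors' pipeline: the function name `auroc` and the toy labels and scores are assumptions invented for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: sequence of 0/1 values (1 = resistant in this toy setting).
    scores: predicted scores, higher meaning "more likely resistant".
    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(toy_labels, toy_scores))
```

The quadratic loop keeps the definition visible: the AUROC is the probability that a randomly chosen resistant isolate receives a higher score than a randomly chosen susceptible one. For large datasets a rank-based implementation would be preferable.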
Affiliation(s)
- Nenad Macesic
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, New York, USA
  - Department of Infectious Diseases, The Alfred Hospital and Central Clinical School, Monash University, Melbourne, Australia
- Itsik Pe'er
  - Department of Computer Science, Columbia University, New York, New York, USA
- Nicholas P Tatonetti
  - Department of Biomedical Informatics, Columbia University, New York, New York, USA
- Anton Y Peleg
  - Department of Infectious Diseases, The Alfred Hospital and Central Clinical School, Monash University, Melbourne, Australia
  - Infection and Immunity Program, Monash Biomedicine Discovery Institute, Department of Microbiology, Monash University, Clayton, Victoria, Australia
- Anne-Catrin Uhlemann
  - Division of Infectious Diseases, Columbia University Irving Medical Center, New York, New York, USA
  - Microbiome & Pathogen Genomics Core, Columbia University Irving Medical Center, New York, New York, USA
|
78
|
Khaledi A, Weimann A, Schniederjans M, Asgari E, Kuo T, Oliver A, Cabot G, Kola A, Gastmeier P, Hogardt M, Jonas D, Mofrad MRK, Bremges A, McHardy AC, Häussler S. Predicting antimicrobial resistance in Pseudomonas aeruginosa with machine learning-enabled molecular diagnostics. EMBO Mol Med 2020; 12:e10264. [PMID: 32048461 PMCID: PMC7059009 DOI: 10.15252/emmm.201910264] [Citation(s) in RCA: 90] [Impact Index Per Article: 22.5] [Received: 01/02/2019] [Revised: 12/24/2019] [Accepted: 01/09/2020] [Indexed: 12/20/2022] Open
Abstract
Limited therapy options due to antibiotic resistance underscore the need for optimization of current diagnostics. In some bacterial species, antimicrobial resistance can be unambiguously predicted based on their genome sequence. In this study, we sequenced the genomes and transcriptomes of 414 drug-resistant clinical Pseudomonas aeruginosa isolates. By training machine learning classifiers on information about the presence or absence of genes, their sequence variation, and expression profiles, we generated predictive models and identified biomarkers of resistance to four commonly administered antimicrobial drugs. Using these data types alone or in combination resulted in high (0.8-0.9) or very high (> 0.9) sensitivity and predictive values. For all drugs except for ciprofloxacin, gene expression information improved diagnostic performance. Our results pave the way for the development of a molecular resistance profiling tool that reliably predicts antimicrobial susceptibility based on genomic and transcriptomic markers. The implementation of a molecular susceptibility test system in routine microbiology diagnostics holds promise to provide earlier and more detailed information on antibiotic resistance profiles of bacterial pathogens and thus could change how physicians treat bacterial infections.
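The abstract above reports sensitivity and predictive values for its classifiers. A minimal sketch of how these quantities follow from a confusion matrix is given here for orientation. The function name `diagnostic_metrics` and the toy calls are assumptions made for illustration and do not reproduce the study's data or code.

```python
def diagnostic_metrics(truth, predicted):
    """Return sensitivity, specificity, PPV and NPV for binary calls.

    truth, predicted: sequences of 0/1 values (1 = resistant).
    A metric whose denominator is zero is reported as NaN.
    """
    tp = fp = tn = fn = 0
    for t, p in zip(truth, predicted):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # resistant isolates correctly called
        "specificity": ratio(tn, tn + fp),  # susceptible isolates correctly called
        "ppv": ratio(tp, tp + fp),          # predicted resistant that truly are
        "npv": ratio(tn, tn + fn),          # predicted susceptible that truly are
    }


# Toy calls, invented purely for illustration (not from the study).
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 1]
toy_metrics = diagnostic_metrics(toy_truth, toy_pred)
print(toy_metrics)
```

Predictive values depend on the prevalence of resistance in the tested collection, so values obtained on an enriched set of drug-resistant isolates need not transfer to routine diagnostics.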
Affiliation(s)
- Ariane Khaledi
  - Department of Molecular Bacteriology, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - Molecular Bacteriology Group, TWINCORE‐Centre for Experimental and Clinical Infection Research, Hannover, Germany
- Aaron Weimann
  - Molecular Bacteriology Group, TWINCORE‐Centre for Experimental and Clinical Infection Research, Hannover, Germany
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
- Monika Schniederjans
  - Department of Molecular Bacteriology, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - Molecular Bacteriology Group, TWINCORE‐Centre for Experimental and Clinical Infection Research, Hannover, Germany
- Ehsaneddin Asgari
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA, USA
- Tzu‐Hao Kuo
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
- Antonio Oliver
  - Servicio de Microbiología y Unidad de Investigación, Hospital Universitario Son Espases, Instituto de Investigación Sanitaria Illes Balears (IdISPa), Palma de Mallorca, Spain
- Gabriel Cabot
  - Servicio de Microbiología y Unidad de Investigación, Hospital Universitario Son Espases, Instituto de Investigación Sanitaria Illes Balears (IdISPa), Palma de Mallorca, Spain
- Axel Kola
  - Institute of Hygiene and Environmental Medicine, Charité – Universitätsmedizin Berlin, Berlin, Germany
- Petra Gastmeier
  - Institute of Hygiene and Environmental Medicine, Charité – Universitätsmedizin Berlin, Berlin, Germany
- Michael Hogardt
  - Institute of Medical Microbiology and Infection Control, University Hospital Frankfurt, Frankfurt/Main, Germany
- Daniel Jonas
  - Faculty of Medicine, Institute for Infection Prevention and Hospital Epidemiology, Medical Center‐University of Freiburg, Freiburg, Germany
- Mohammad RK Mofrad
  - Molecular Cell Biomechanics Laboratory, Departments of Bioengineering and Mechanical Engineering, University of California, Berkeley, CA, USA
  - Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Lab, Berkeley, CA, USA
- Andreas Bremges
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
- Alice C McHardy
  - Computational Biology of Infection Research, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
- Susanne Häussler
  - Department of Molecular Bacteriology, Helmholtz Centre for Infection Research, Braunschweig, Germany
  - Molecular Bacteriology Group, TWINCORE‐Centre for Experimental and Clinical Infection Research, Hannover, Germany
|
79
|
Abstract
Prompt and effective antimicrobial therapy is crucial for the management of patients with severe bacterial infections but is becoming increasingly difficult to provide due to emerging antibiotic resistance. The traditional methods for antibiotic susceptibility testing (AST) used in most clinical laboratories are reliable but slow with turnaround times of 2 to 3 days, which necessitates the use of empirical therapy with broad-spectrum antibiotics. There is a great need for fast and reliable AST methods that enable starting targeted treatment within a few hours to improve patient outcome and reduce the overuse of broad-spectrum antibiotics. The multiplex fluidic chip for phenotypic AST described in the present study may enable data on antimicrobial resistance within 2 to 4 h, allowing for an early initiation of appropriate antibiotic therapy. Many patients with severe infections receive inappropriate empirical treatment, and rapid detection of bacterial antibiotic susceptibility can improve clinical outcome and reduce mortality. To this end, we have developed a multiplex fluidic chip for rapid phenotypic antibiotic susceptibility testing of bacteria. A total of 21 clinical isolates of Escherichia coli, Klebsiella pneumoniae, and Staphylococcus aureus were acquired from the EUCAST Development Laboratory and tested against amikacin, ceftazidime, and meropenem (Gram-negative bacteria) or gentamicin, ofloxacin, and tetracycline (Gram-positive bacteria). The bacterial samples were mixed with agarose and loaded in an array of growth chambers in the chip where bacterial microcolony growth was monitored over time using automated image analysis. MIC values were automatically obtained by tracking the growth rates of individual microcolonies in different regions of antibiotic gradients. Stable MIC values were obtained within 2 to 4 h, and the results showed categorical agreement with reference MIC values as determined by broth microdilution in 86% of the cases.
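The abstract above describes reading MIC values from microcolony growth rates along an antibiotic gradient and scoring categorical agreement against reference MICs. The sketch below illustrates one way such a readout could work in principle. It is not the published image-analysis method: the inhibition threshold, the breakpoints, the function names and all numbers are hypothetical values chosen for illustration.

```python
def estimate_mic(growth_by_conc, control_rate, inhibition_fraction=0.1):
    """Lowest concentration whose growth rate is at most a fraction of the control.

    growth_by_conc: dict mapping concentration (mg/L) to measured growth rate.
    control_rate: growth rate without antibiotic.
    inhibition_fraction: hypothetical cutoff, an assumption of this sketch.
    Returns None if no tested concentration inhibits growth.
    """
    cutoff = inhibition_fraction * control_rate
    inhibited = [c for c, rate in growth_by_conc.items() if rate <= cutoff]
    return min(inhibited) if inhibited else None


def categorize(mic, s_breakpoint, r_breakpoint):
    """Map an MIC to S, I or R given susceptible and resistant breakpoints."""
    if mic <= s_breakpoint:
        return "S"
    if mic > r_breakpoint:
        return "R"
    return "I"


def categorical_agreement(test_mics, reference_mics, s_breakpoint, r_breakpoint):
    """Fraction of isolates assigned the same category by both methods."""
    pairs = list(zip(test_mics, reference_mics))
    same = sum(
        categorize(t, s_breakpoint, r_breakpoint)
        == categorize(r, s_breakpoint, r_breakpoint)
        for t, r in pairs
    )
    return same / len(pairs)


# Hypothetical growth rates (per hour) at four concentrations (mg/L).
toy_growth = {0.5: 0.80, 1: 0.55, 2: 0.06, 4: 0.01}
toy_mic = estimate_mic(toy_growth, control_rate=0.85)
print(toy_mic)

# Hypothetical chip and reference MICs with invented breakpoints (S <= 2, R > 8).
toy_agreement = categorical_agreement([2, 4, 16, 1], [1, 16, 32, 1], 2, 8)
print(toy_agreement)
```

Categorical agreement is more forgiving than exact MIC agreement: two methods can differ by a dilution step and still place the isolate in the same category.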
|
80
|
Peiffer-Smadja N, Dellière S, Rodriguez C, Birgand G, Lescure FX, Fourati S, Ruppé E. Machine learning in the clinical microbiology laboratory: has the time come for routine practice? Clin Microbiol Infect 2020; 26:1300-1309. [PMID: 32061795 DOI: 10.1016/j.cmi.2020.02.006] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 12/31/2019] [Revised: 02/04/2020] [Accepted: 02/06/2020] [Indexed: 12/20/2022]
Abstract
BACKGROUND Machine learning (ML) allows the analysis of complex and large data sets and has the potential to improve health care. The clinical microbiology laboratory, at the interface of clinical practice and diagnostics, is of special interest for the development of ML systems. AIMS This narrative review aims to explore the current use of ML in clinical microbiology. SOURCES References for this review were identified through searches of MEDLINE/PubMed, EMBASE, Google Scholar, bioRxiv, arXiv, ACM Digital Library and IEEE Xplore Digital Library up to November 2019. CONTENT We found 97 ML systems aiming to assist clinical microbiologists. Overall, 82 ML systems (85%) targeted bacterial infections, 11 (11%) parasitic infections, nine (9%) viral infections and three (3%) fungal infections. Forty ML systems (41%) focused on microorganism detection, identification and quantification, 36 (37%) evaluated antimicrobial susceptibility, and 21 (22%) targeted the diagnosis, disease classification and prediction of clinical outcomes. The ML systems used very diverse data sources: 21 (22%) used genomic data of microorganisms, 19 (20%) microbiota data obtained by metagenomic sequencing, 19 (20%) analysed microscopic images, 17 (18%) spectroscopy data, eight (8%) targeted gene sequencing, six (6%) volatile organic compounds, four (4%) photographs of bacterial colonies, four (4%) transcriptome data, three (3%) protein structure, and three (3%) clinical data. Most systems used data from high-income countries (n = 71, 73%) but a significant number used data from low- and middle-income countries (n = 36, 37%). Performance measures were reported for the 97 ML systems, but no article described their use in clinical practice or reported impact on processes or clinical outcomes. IMPLICATIONS In clinical microbiology, ML has been used with various data sources and diverse practical applications.
The evaluation and implementation processes represent the main gap in existing ML systems, requiring a focus on their interpretability and potential integration into real-world settings.
Affiliation(s)
- N Peiffer-Smadja
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK; Université de Paris, IAME, INSERM, F-75018 Paris, France
- S Dellière
  - Université de Paris, Laboratoire de Parasitologie-Mycologie, Groupe Hospitalier Saint-Louis-Lariboisière-Fernand-Widal, Assistance Publique-Hôpitaux de Paris (AP-HP), Paris, France
- C Rodriguez
  - Department of Prevention, Diagnosis and Treatment of Infections, Henri-Mondor Hospital, APHP, Université Paris-Est Créteil, IMRB, INSERM U955, Créteil, France
- G Birgand
  - National Institute for Health Research Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, Imperial College London, London, UK
- F-X Lescure
  - Université de Paris, IAME, INSERM, F-75018 Paris, France
- S Fourati
  - Department of Prevention, Diagnosis and Treatment of Infections, Henri-Mondor Hospital, APHP, Université Paris-Est Créteil, IMRB, INSERM U955, Créteil, France
- E Ruppé
  - Université de Paris, IAME, INSERM, F-75018 Paris, France
|
81
|
Feretzakis G, Loupelis E, Sakagianni A, Kalles D, Martsoukou M, Lada M, Skarmoutsou N, Christopoulos C, Valakis K, Velentza A, Petropoulou S, Michelidou S, Alexiou K. Using Machine Learning Techniques to Aid Empirical Antibiotic Therapy Decisions in the Intensive Care Unit of a General Hospital in Greece. Antibiotics (Basel) 2020; 9:E50. [PMID: 32023854 PMCID: PMC7167935 DOI: 10.3390/antibiotics9020050] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 01/08/2020] [Revised: 01/26/2020] [Accepted: 01/27/2020] [Indexed: 11/18/2022] Open
Abstract
Hospital-acquired infections, particularly in the critical care setting, have become increasingly common during the last decade, with Gram-negative bacterial infections presenting the highest incidence among them. Multi-drug-resistant (MDR) Gram-negative infections are associated with high morbidity and mortality with significant direct and indirect costs resulting from long hospitalization due to antibiotic failure. Time is critical to identifying bacteria and their resistance to antibiotics due to the critical health status of patients in the intensive care unit (ICU). As common antibiotic resistance tests require more than 24 h after the sample is collected to determine sensitivity in specific antibiotics, we suggest applying machine learning (ML) techniques to assist the clinician in determining whether bacteria are resistant to individual antimicrobials by knowing only a sample's Gram stain, site of infection, and patient demographics. In our single center study, we compared the performance of eight machine learning algorithms to assess antibiotic susceptibility predictions. The demographic characteristics of the patients are considered for this study, as well as data from cultures and susceptibility testing. Applying machine learning algorithms to patient antimicrobial susceptibility data, readily available, solely from the Microbiology Laboratory without any of the patient's clinical data, even in resource-limited hospital settings, can provide informative antibiotic susceptibility predictions to aid clinicians in selecting appropriate empirical antibiotic therapy. These strategies, when used as a decision support tool, have the potential to improve empiric therapy selection and reduce the antimicrobial resistance burden.
Affiliation(s)
- Georgios Feretzakis
  - School of Science and Technology, Hellenic Open University, 26335 Patras, Greece
  - IT Department, Sismanogleio General Hospital, 15126 Marousi, Greece
  - Department of Quality Control, Research and Continuing Education, Sismanogleio General Hospital, 15126 Marousi, Greece
- Evangelos Loupelis
  - IT Department, Sismanogleio General Hospital, 15126 Marousi, Greece
- Aikaterini Sakagianni
  - Intensive Care Unit, Sismanogleio General Hospital, 15126 Marousi, Greece
- Dimitris Kalles
  - School of Science and Technology, Hellenic Open University, 26335 Patras, Greece
- Maria Martsoukou
  - Microbiology Laboratory, Sismanogleio General Hospital, 15126 Marousi, Greece
- Malvina Lada
  - 2nd Internal Medicine Department, Sismanogleio General Hospital, 15126 Marousi, Greece
- Nikoletta Skarmoutsou
  - Microbiology Laboratory, Sismanogleio General Hospital, 15126 Marousi, Greece
- Konstantinos Valakis
  - Intensive Care Unit, Sismanogleio General Hospital, 15126 Marousi, Greece
- Aikaterini Velentza
  - Microbiology Laboratory, Sismanogleio General Hospital, 15126 Marousi, Greece
- Sophia Michelidou
  - Intensive Care Unit, Sismanogleio General Hospital, 15126 Marousi, Greece
|
82
|
Abstract
Machine learning has proven to be a powerful method to predict antimicrobial resistance (AMR) without using prior knowledge for selected bacterial species-antimicrobial combinations. To date, only species-specific machine learning models have been developed, and to the best of our knowledge, the inclusion of information from multiple species has not been attempted. The aim of this study was to determine the feasibility of including information from multiple bacterial species to predict AMR for an individual species, since this may make it easier to train and update resistance predictions for multiple species and may lead to improved predictions. Whole-genome sequence data and susceptibility profiles from 3,528 Mycobacterium tuberculosis, 1,694 Escherichia coli, 658 Salmonella enterica, and 1,236 Staphylococcus aureus isolates were included. We developed machine learning models trained on the features detected by the PointFinder and ResFinder programs to predict binary (susceptible/resistant) AMR profiles. We tested four feature representation methods to determine the most efficient way of introducing features into the models. When training the model only on the Mycobacterium tuberculosis isolates, high prediction performances were obtained for the six AMR profiles included.
By adding information on ciprofloxacin from the additional 3,588 isolates, there was no reduction in performance for the other antimicrobials but an increased performance for ciprofloxacin AMR profile prediction for Mycobacterium tuberculosis and Escherichia coli. In conclusion, the species-independent models can predict multi-AMR profiles for multiple species without losing any robustness. IMPORTANCE Machine learning is a proven method to predict AMR; however, the performance of any machine learning model depends on the quality of the input data. Therefore, we evaluated different methods of representing information about mutations as well as mobilizable genes, so that the information can serve as input for a robust model. We combined data from multiple bacterial species in order to develop species-independent machine learning models that can predict resistance profiles for multiple antimicrobials and species with high performance.
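The abstract above mentions testing several ways of representing detected mutations and resistance genes as model input, without detailing them here. The following sketch shows only the simplest conceivable option, a binary presence/absence matrix over the union of features from isolates of different species. It is an assumption made for illustration, not one of the four representations evaluated by the authors; the function name, isolate identifiers and feature labels are placeholders.

```python
def build_feature_matrix(isolates):
    """Encode detected features as binary presence/absence vectors.

    isolates: dict mapping isolate id to the set of feature names detected
    in that isolate (for example point mutations or acquired genes).
    Returns (features, matrix): a sorted feature list and a dict mapping
    each isolate id to a 0/1 list aligned with that feature list.
    """
    features = sorted(set().union(*isolates.values())) if isolates else []
    matrix = {
        isolate_id: [1 if name in detected else 0 for name in features]
        for isolate_id, detected in isolates.items()
    }
    return features, matrix


# Placeholder isolates and feature labels, invented for illustration.
toy_isolates = {
    "ecoli_01": {"gyrA_S83L", "blaTEM-1B"},
    "mtb_01": {"rpoB_S450L"},
    "ecoli_02": {"gyrA_S83L"},
}
toy_features, toy_matrix = build_feature_matrix(toy_isolates)
print(toy_features)
for isolate_id, row in toy_matrix.items():
    print(isolate_id, row)
```

Sharing one column space across species is what lets a single model see isolates from several organisms at once; features absent from a species simply stay at zero for its isolates.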
|
83
|
Ndagi U, Falaki AA, Abdullahi M, Lawal MM, Soliman ME. Antibiotic resistance: bioinformatics-based understanding as a functional strategy for drug design. RSC Adv 2020; 10:18451-18468. [PMID: 35685616 PMCID: PMC9122625 DOI: 10.1039/d0ra01484b] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 02/15/2020] [Accepted: 05/01/2020] [Indexed: 12/19/2022] Open
Abstract
The use of antibiotics to manage infectious diseases dates back to ancient civilization, but the lack of a clear distinction between the therapeutic and toxic dose has been a major challenge. This supports the notion that antibiotic resistance has existed since time immemorial, principally because of a lack of adequate knowledge of therapeutic doses and continuous exposure of these bacteria to suboptimal plasma concentrations of antibiotics. With the discovery of penicillin by Alexander Fleming in 1928, a milestone in the treatment of bacterial infections was achieved. This forms the foundation for the modern era of antibiotic drugs. Antibiotics such as penicillins, cephalosporins, quinolones, tetracycline, macrolides, sulphonamides, aminoglycosides and glycopeptides are the mainstay in managing severe bacterial infections, but resistant strains of bacteria have emerged and hampered the progress of research in this field. Recently, new approaches to research involving bacterial resistance to antibiotics have appeared; these involve combining the molecular understanding of bacterial systems with the knowledge of bioinformatics. Consequently, many molecules have been developed to curb resistance associated with different bacterial infections. However, because of increased emphasis on the clinical relevance of antibiotics, the synergy between in silico study and in vivo study is well cemented and this facilitates the discovery of potent antibiotics. In this review, we seek to give an overview of earlier reviews and molecular and structural understanding of bacterial resistance to antibiotics, while focusing on the recent bioinformatics approach to antibacterial drug discovery. Understanding the evolution of antibiotic resistance at the molecular level as a functional tool for bioinformatic-based drug design.
Affiliation(s)
- Umar Ndagi
  - Centre for Trans-Sahara Disease, Vaccine and Drug Research, Ibrahim Badamasi Babangida University, Lapai, Nigeria
- Abubakar A. Falaki
  - Department of Microbiology, School of Agriculture and Applied Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Maryam Abdullahi
  - Faculty of Pharmaceutical Sciences, Ahmadu Bello University Zaria, Nigeria
- Monsurat M. Lawal
  - School of Laboratory Medicine and Medical Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
- Mahmoud E. Soliman
  - Molecular Modeling and Drug Design Research Group, School of Health Sciences, University of KwaZulu-Natal, Durban 4001, South Africa
|
84
|
Mancino W, Lugli GA, van Sinderen D, Ventura M, Turroni F. Mobilome and Resistome Reconstruction from Genomes Belonging to Members of the Bifidobacterium Genus. Microorganisms 2019; 7:microorganisms7120638. [PMID: 31810287 PMCID: PMC6956390 DOI: 10.3390/microorganisms7120638] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 10/29/2019] [Revised: 11/27/2019] [Accepted: 11/29/2019] [Indexed: 02/06/2023]
Abstract
Specific members of the genus Bifidobacterium are among the first colonizers of the human/animal gut, where they act as important intestinal commensals associated with host health. As part of the gut microbiota, bifidobacteria may be exposed to antibiotics, used in particular for intrapartum prophylaxis, especially to prevent Streptococcus infections, or in the very early stages of life after birth. In the current study, we reconstructed the in silico resistome of the Bifidobacterium genus, analyzing a database composed of 625 bifidobacterial genomes, including partially assembled strains with fewer than 100 genomic sequences. Furthermore, we screened bifidobacterial genomes for mobile genetic elements, such as transposases and prophage-like elements, in order to investigate the correlation between the bifido-mobilome and the bifido-resistome, also identifying genetic insertion hotspots that appear to be prone to horizontal gene transfer (HGT) events. These insertion hotspots were shown to be widely distributed among the analyzed bifidobacterial genomes, and suggest the acquisition of antibiotic resistance genes through HGT events. These data were further corroborated by growth experiments designed to evaluate bacitracin A resistance in Bifidobacterium spp., a property that was predicted by in silico analyses to be part of the HGT-acquired resistome.
Affiliation(s)
- Walter Mancino
  - Laboratory of Probiogenomics, Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Gabriele Andrea Lugli
  - Laboratory of Probiogenomics, Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
- Douwe van Sinderen
  - School of Microbiology, APC Microbiome Institute, University College Cork, Cork T12 K8AF, Ireland
- Marco Ventura
  - Laboratory of Probiogenomics, Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
  - Microbiome Research Hub, University of Parma, 43124 Parma, Italy
- Francesca Turroni
  - Laboratory of Probiogenomics, Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, 43124 Parma, Italy
  - Microbiome Research Hub, University of Parma, 43124 Parma, Italy
  - Correspondence: Tel.: +39-521-905666; Fax: +39-521-905604
|
85
|
Wardell SJT, Rehman A, Martin LW, Winstanley C, Patrick WM, Lamont IL. A large-scale whole-genome comparison shows that experimental evolution in response to antibiotics predicts changes in naturally evolved clinical Pseudomonas aeruginosa. Antimicrob Agents Chemother 2019; 63:AAC.01619-19. [PMID: 31570397 PMCID: PMC6879238 DOI: 10.1128/aac.01619-19] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 08/11/2019] [Accepted: 09/23/2019] [Indexed: 12/13/2022]
Abstract
Pseudomonas aeruginosa is an opportunistic pathogen that causes a wide range of acute and chronic infections. An increasing number of isolates have mutations that make them antibiotic resistant, making treatment difficult. To identify resistance-associated mutations, we experimentally evolved the antibiotic-sensitive strain P. aeruginosa PAO1 to become resistant to three widely used anti-pseudomonal antibiotics: ciprofloxacin, meropenem and tobramycin. Mutants could tolerate up to 2048-fold higher concentrations of antibiotic than strain PAO1. Genome sequences were determined for thirteen mutants for each antibiotic. Each mutant had between 2 and 8 mutations. For each antibiotic, at least 8 genes were mutated in multiple mutants, demonstrating the genetic complexity of resistance. For all three antibiotics, mutations arose in genes known to be associated with resistance, but also in genes not previously associated with resistance. To determine the clinical relevance of mutations uncovered in this study, we analysed the corresponding genes in 558 isolates of P. aeruginosa from patients with chronic lung disease and in 172 isolates from the general environment. Many genes identified through experimental evolution had predicted function-altering changes in clinical isolates but not in environmental isolates, showing that mutated genes in experimentally evolved bacteria can predict those that undergo mutation during infection. Additionally, large deletions of up to 479 kb arose in experimentally evolved meropenem-resistant mutants, and large deletions were present in 87 of the clinical isolates. These findings significantly advance understanding of antibiotic resistance in P. aeruginosa and demonstrate the validity of experimental evolution in identifying clinically relevant resistance-associated mutations.
Affiliation(s)
- Attika Rehman
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Lois W Martin
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Craig Winstanley
  - Institute of Infection and Global Health, University of Liverpool, Liverpool, United Kingdom
- Wayne M Patrick
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
  - School of Biological Sciences, Victoria University of Wellington, Wellington, New Zealand
- Iain L Lamont
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
|
86
|
Chowdhury A, Khaledian E, Broschat S. Capreomycin resistance prediction in two species of Mycobacterium using a stacked ensemble method. J Appl Microbiol 2019; 127:1656-1664. [DOI: 10.1111/jam.14413] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 04/20/2019] [Revised: 07/05/2019] [Accepted: 07/17/2019] [Indexed: 01/29/2023]
Affiliation(s)
- A.S. Chowdhury
  - School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
- E. Khaledian
  - School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
- S.L. Broschat
  - School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA, USA
  - Paul G. Allen School for Global Animal Health, Washington State University, Pullman, WA, USA
  - Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, WA, USA
|
87
|
Hicks AL, Wheeler N, Sánchez-Busó L, Rakeman JL, Harris SR, Grad YH. Evaluation of parameters affecting performance and reliability of machine learning-based antibiotic susceptibility testing from whole genome sequencing data. PLoS Comput Biol 2019; 15:e1007349. [PMID: 31479500 PMCID: PMC6743791 DOI: 10.1371/journal.pcbi.1007349] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 04/23/2019] [Revised: 09/13/2019] [Accepted: 08/21/2019] [Indexed: 12/20/2022]
Abstract
Prediction of antibiotic resistance phenotypes from whole genome sequencing data by machine learning methods has been proposed as a promising platform for the development of sequence-based diagnostics. However, there has been no systematic evaluation of factors that may influence performance of such models, how they might apply to and vary across clinical populations, and what the implications might be in the clinical setting. Here, we performed a meta-analysis of seven large Neisseria gonorrhoeae datasets, as well as Klebsiella pneumoniae and Acinetobacter baumannii datasets, with whole genome sequence data and antibiotic susceptibility phenotypes using set covering machine classification, random forest classification, and random forest regression models to predict resistance phenotypes from genotype. We demonstrate how model performance varies by drug, dataset, resistance metric, and species, reflecting the complexities of generating clinically relevant conclusions from machine learning-derived models. Our findings underscore the importance of incorporating relevant biological and epidemiological knowledge into model design and assessment and suggest that doing so can inform tailored modeling for individual drugs, pathogens, and clinical populations. We further suggest that continued comprehensive sampling and incorporation of up-to-date whole genome sequence data, resistance phenotypes, and treatment outcome data into model training will be crucial to the clinical utility and sustainability of machine learning-based molecular diagnostics.
Affiliation(s)
- Allison L. Hicks
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Nicole Wheeler
  - Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Leonor Sánchez-Busó
  - Centre for Genomic Pathogen Surveillance, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
  - Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Jennifer L. Rakeman
  - Public Health Laboratory, Division of Disease Control, New York City Department of Health and Mental Hygiene, New York, New York, United States of America
- Simon R. Harris
  - Microbiotica Ltd, Biodata Innovation Centre, Wellcome Genome Campus, Hinxton, Cambridgeshire, United Kingdom
- Yonatan H. Grad
  - Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, United States of America
  - Division of Infectious Diseases, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
|
88
|
Martínez-Agüero S, Mora-Jiménez I, Lérida-García J, Álvarez-Rodríguez J, Soguero-Ruiz C. Machine Learning Techniques to Identify Antimicrobial Resistance in the Intensive Care Unit. Entropy 2019; 21:e21060603. [PMID: 33267317 PMCID: PMC7515087 DOI: 10.3390/e21060603] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/08/2019] [Revised: 06/04/2019] [Accepted: 06/13/2019] [Indexed: 12/21/2022]
Abstract
The presence of bacteria with resistance to specific antibiotics is one of the greatest threats to the global health system. According to the World Health Organization, antimicrobial resistance has already reached alarming levels in many parts of the world, imposing a social and economic burden on the patient, on the health system, and on society in general. Because of the critical health status of patients in the intensive care unit (ICU), time is critical in identifying bacteria and their resistance to antibiotics. Since common antibiotic resistance tests require between 24 and 48 h after the culture is collected, we propose to apply machine learning (ML) techniques to determine whether a bacterium will be resistant to different families of antimicrobials. For this purpose, clinical and demographic features of the patient, as well as data from cultures and antibiograms, are considered. From a population point of view, we also show graphically the relationship between different bacteria and families of antimicrobials by performing correspondence analysis. The results of the ML techniques reveal non-linear relationships that help to identify antimicrobial resistance in the ICU, with performance dependent on the family of antimicrobials. A change in the trend of antimicrobial resistance is also observed.
Affiliation(s)
- Sergio Martínez-Agüero
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid 28943, Spain
- Inmaculada Mora-Jiménez
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid 28943, Spain
- Jon Lérida-García
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid 28943, Spain
- Cristina Soguero-Ruiz
  - Department of Signal Theory and Communications, Telematics and Computing Systems, Rey Juan Carlos University, Madrid 28943, Spain
  - Correspondence: Tel.: +34-91-488-87-41
|
89
|
Affiliation(s)
- Jason A. Papin
  - University of Virginia, Charlottesville, Virginia, United States of America
|
90
|
Tacconelli E, Górska A, De Angelis G, Lammens C, Restuccia G, Schrenzel J, Huson DH, Carević B, Preoţescu L, Carmeli Y, Kazma M, Spanu T, Carrara E, Malhotra-Kumar S, Gladstone BP. Estimating the association between antibiotic exposure and colonization with extended-spectrum β-lactamase-producing Gram-negative bacteria using machine learning methods: a multicentre, prospective cohort study. Clin Microbiol Infect 2019; 26:87-94. [PMID: 31128285 DOI: 10.1016/j.cmi.2019.05.013] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 02/06/2019] [Revised: 04/20/2019] [Accepted: 05/13/2019] [Indexed: 10/26/2022]
Abstract
OBJECTIVES The aim of the study was to measure the impact of antibiotic exposure on the acquisition of colonization with extended-spectrum β-lactamase-producing Gram-negative bacteria (ESBL-GNB) accounting for individual- and group-level confounding using machine-learning methods. METHODS Patients hospitalized between September 2010 and June 2013 at six medical and six surgical wards in Italy, Serbia and Romania were screened for ESBL-GNB at hospital admission, discharge, antibiotic start, and after 3, 7, 15 and 30 days. Primary outcomes were the incidence rate and predictive factors of new ESBL-GNB colonization. A random forest algorithm was used to rank antibiotics according to the risk of selection of ESBL-GNB colonization in patients not colonized before starting antibiotics. RESULTS We screened 10 034 patients, collecting 28 322 rectal swab samples. New ESBL-GNB colonization incidence with and without antibiotic treatment was 22/1000 and 9/1000 exposure-days, respectively. In the adjusted regression analyses, antibiotic exposure (hazard ratio (HR) 2.38; 95% CI 1.29-4.40), age 60-69 years (HR 1.19; 95% CI 1.05-1.34), and spring season (HR 1.25; 95% CI 1.14-1.38) were independently associated with new colonization. Monotherapy ranked higher than combination therapy in promoting ESBL-GNB colonization. Among monotherapies, cephalosporins ranked first, followed by tetracycline (second), macrolide (fourth) and cotrimoxazole (seventh). Overall, the ranking of cephalosporins was lower when used in combination. Among combinations not including cephalosporins, quinolones plus carbapenems ranked highest (eighth). Among sequential therapies, quinolones ranked highest (tenth) when prescribed within 30 days of therapy with cephalosporins. CONCLUSIONS The impact of antibiotics on selecting ESBL-GNB at the intestinal level varies depending on whether they are used in monotherapy or combination and on previous antibiotic exposure. These findings should be explored in future clinical trials on antibiotic stewardship interventions. CLINICAL TRIAL REGISTRATION NCT01208519.
Affiliation(s)
- E Tacconelli
  - Division of Infectious Disease, Department of Internal Medicine I, Tübingen University Hospital, Tübingen, Germany; Division of Infectious Diseases, Department of Diagnostic and Public Health, University of Verona, Italy
- A Górska
  - Algorithms in Bioinformatics, University of Tübingen and International Max Planck Research School, Tübingen, Germany
- G De Angelis
  - Institute of Microbiology, Fondazione Policlinico Universitario A. Gemelli IRCCS - Università Cattolica del Sacro Cuore, Rome, Italy
- C Lammens
  - Laboratory of Medical Microbiology, Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
- G Restuccia
  - Department of Anaesthesiology and Intensive Care Medicine, University of Catania, Catania, Italy
- J Schrenzel
  - Bacteriology Laboratory, Service of Infectious Diseases, University of Geneva Hospitals and Medical Faculty, Geneva, Switzerland
- D H Huson
  - Algorithms in Bioinformatics, University of Tübingen and International Max Planck Research School, Tübingen, Germany
- B Carević
  - Department for Hospital Epidemiology, Clinical Centre of Serbia, Belgrade, Serbia
- L Preoţescu
  - National Institute for Infectious Diseases, University of Medicine 'Carol Davila', Bucharest, Romania
- Y Carmeli
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; National Centre for Infection Control, Israel Ministry of Health, Tel Aviv, Israel
- M Kazma
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; National Centre for Infection Control, Israel Ministry of Health, Tel Aviv, Israel
- T Spanu
  - Institute of Microbiology, Fondazione Policlinico Universitario A. Gemelli IRCCS - Università Cattolica del Sacro Cuore, Rome, Italy
- E Carrara
  - Division of Infectious Diseases, Department of Diagnostic and Public Health, University of Verona, Italy
- S Malhotra-Kumar
  - Laboratory of Medical Microbiology, Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
- B P Gladstone
  - Division of Infectious Disease, Department of Internal Medicine I, Tübingen University Hospital, Tübingen, Germany
|
91
|
Decano AG, Ludden C, Feltwell T, Judge K, Parkhill J, Downing T. Complete Assembly of Escherichia coli Sequence Type 131 Genomes Using Long Reads Demonstrates Antibiotic Resistance Gene Variation within Diverse Plasmid and Chromosomal Contexts. mSphere 2019; 4:e00130-19. [PMID: 31068432 PMCID: PMC6506616 DOI: 10.1128/msphere.00130-19] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 02/25/2019] [Accepted: 04/24/2019] [Indexed: 11/20/2022]
Abstract
The incidence of infections caused by extraintestinal Escherichia coli (ExPEC) is rising globally, which is a major public health concern. ExPEC strains that are resistant to antimicrobials have been associated with excess mortality, prolonged hospital stays, and higher health care costs. E. coli sequence type 131 (ST131) is a major ExPEC clonal group worldwide, with variable plasmid composition, and has an array of genes enabling antimicrobial resistance (AMR). ST131 isolates frequently encode the AMR genes blaCTX-M-14, blaCTX-M-15, and blaCTX-M-27, which are often rearranged, amplified, and translocated by mobile genetic elements (MGEs). Short DNA reads do not fully resolve the architecture of repetitive elements on plasmids to allow MGE structures encoding blaCTX-M genes to be fully determined. Here, we performed long-read sequencing to decipher the genome structures of six E. coli ST131 isolates from six patients. Most long-read assemblies generated entire chromosomes and plasmids as single contigs, in contrast to more fragmented assemblies created with short reads alone. The long-read assemblies highlighted diverse accessory genomes with blaCTX-M-15, blaCTX-M-14, and blaCTX-M-27 genes identified in three, one, and one isolates, respectively. One sample had no blaCTX-M gene. Two samples had chromosomal blaCTX-M-14 and blaCTX-M-15 genes, and the latter was at three distinct locations, likely transposed by the adjacent MGEs: ISEcp1, IS903B, and Tn2. This study showed that AMR genes exist in multiple different chromosomal and plasmid contexts, even between closely related isolates within a clonal group such as E. coli ST131. IMPORTANCE Drug-resistant bacteria are a major cause of illness worldwide, and a specific subtype called Escherichia coli ST131 causes a significant number of these infections. ST131 bacteria become resistant to treatments by modifying their DNA and by transferring genes among one another via large packages of genes called plasmids, like a game of pass-the-parcel. Tackling infections more effectively requires a better understanding of what plasmids are being exchanged and their exact contents. To achieve this, we applied new high-resolution DNA sequencing technology to six ST131 samples from infected patients and compared the output to that of an existing approach. A combination of methods shows that drug resistance genes on plasmids are highly mobile because they can jump into ST131's chromosomes. We found that the plasmids are very elastic and undergo extensive rearrangements even in closely related samples. This application of DNA sequencing technologies illustrates at a new level the highly dynamic nature of ST131 genomes.
Affiliation(s)
- Catherine Ludden
  - Wellcome Trust Sanger Institute, Hinxton, United Kingdom
  - London School of Hygiene & Tropical Medicine, London, United Kingdom
- Kim Judge
  - Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Tim Downing
  - School of Biotechnology, Dublin City University, Dublin, Ireland
|