Reference Citation Analysis

For: Tanui CK, Benefo EO, Karanth S, Pradhan AK. A Machine Learning Model for Food Source Attribution of Listeria monocytogenes. Pathogens 2022;11:691. [PMID: 35745545 PMCID: PMC9230378 DOI: 10.3390/pathogens11060691] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/07/2022] [Revised: 06/06/2022] [Accepted: 06/10/2022] [Indexed: 12/07/2022] Open

Cited by Other Article(s)

Feng S, Karanth S, Almuhaideb E, Parveen S, Pradhan AK. Machine learning to predict the relationship between Vibrio spp. concentrations in seawater and oysters and prevalent environmental conditions. Food Res Int 2024;188:114464. [PMID: 38823834 DOI: 10.1016/j.foodres.2024.114464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 04/26/2024] [Accepted: 05/01/2024] [Indexed: 06/03/2024]

Abstract

Vibrio parahaemolyticus and Vibrio vulnificus are bacteria with a significant public health impact. Identifying factors impacting their presence and concentrations in food sources could enable the identification of significant risk factors and prevent incidences of foodborne illness. In recent years, machine learning has shown promise in modeling microbial presence based on prevalent external and internal variables, such as environmental variables and gene presence/absence, respectively, particularly with the generation and availability of large amounts and diverse sources of data. Such analyses can prove useful in predicting microbial behavior in food systems, particularly under the influence of the constant changes in environmental variables. In this study, we tested the efficacy of six machine learning regression models (random forest, support vector machine, elastic net, neural network, k-nearest neighbors, and extreme gradient boosting) in predicting the relationship between environmental variables and total and pathogenic V. parahaemolyticus and V. vulnificus concentrations in seawater and oysters. In general, environmental variables were found to be reliable predictors of total and pathogenic V. parahaemolyticus and V. vulnificus concentrations in seawater, and pathogenic V. parahaemolyticus in oysters (Acceptable Prediction Zone >70 %) when analyzed using our machine learning models. SHapley Additive exPlanations, which was used to identify variables influencing Vibrio concentrations, identified chlorophyll a content, seawater salinity, seawater temperature, and turbidity as influential variables. It is important to note that different strains were differentially impacted by the same environmental variable, indicating the need for further research to study the causes and potential mechanisms of these variations. In conclusion, environmental variables could be important predictors of Vibrio growth and behavior in seafood. 
Moreover, the models developed in this study could prove invaluable in assessing and managing the risks associated with V. parahaemolyticus and V. vulnificus, particularly in the face of a changing environment.

Garcia-Vozmediano A, Maurella C, Ceballos LA, Crescio E, Meo R, Martelli W, Pitti M, Lombardi D, Meloni D, Pasqualini C, Ru G. Machine learning approach as an early warning system to prevent foodborne Salmonella outbreaks in northwestern Italy. Vet Res 2024;55:72. [PMID: 38840261 PMCID: PMC11154984 DOI: 10.1186/s13567-024-01323-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 04/15/2024] [Indexed: 06/07/2024] Open

Abstract

Salmonellosis, one of the most common foodborne infections in Europe, is monitored by food safety surveillance programmes, resulting in the generation of extensive databases. By leveraging tree-based machine learning (ML) algorithms, we exploited data from food safety audits to predict spatiotemporal patterns of salmonellosis in northwestern Italy. Data on human cases confirmed in 2015-2018 (n = 1969) and food surveillance data collected in 2014-2018 were used to develop ML algorithms. We integrated the monthly municipal human incidence with 27 potential predictors, including the observed prevalence of Salmonella in food. We applied the tree regression, random forest and gradient boosting algorithms considering different scenarios and evaluated their predictivity in terms of the mean absolute percentage error (MAPE) and R2. Using a similar dataset from the year 2019, spatiotemporal predictions and their relative sensitivities and specificities were obtained. Random forest and gradient boosting (R2 = 0.55, MAPE = 7.5%) outperformed the tree regression algorithm (R2 = 0.42, MAPE = 8.8%). Salmonella prevalence in food; spatial features; and monitoring efforts in ready-to-eat milk, fruits and vegetables, and pig meat products contributed the most to the models' predictivity, reducing the variance by 90.5%. Conversely, the number of positive samples obtained for specific food matrices minimally influenced the predictions (2.9%). Spatiotemporal predictions for 2019 showed sensitivity and specificity levels of 46.5% (due to the lack of some infection hotspots) and 78.5%, respectively. This study demonstrates the added value of integrating data from human and veterinary health services to develop predictive models of human salmonellosis occurrence, providing early warnings useful for mitigating foodborne disease impacts on public health.
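For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how MAPE and R2 are conventionally computed. The function names and toy numbers are illustrative assumptions and are not taken from the cited study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the ratio is undefined otherwise).
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values, for illustration only.
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]

print(f"MAPE = {mape(actual, predicted):.2f}%")
print(f"R2   = {r_squared(actual, predicted):.3f}")
```

A lower MAPE and a higher R2 both indicate a closer fit, which is the sense in which the abstract ranks random forest and gradient boosting above tree regression.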

Mather AE, Gilmour MW, Reid SWJ, French NP. Foodborne bacterial pathogens: genome-based approaches for enduring and emerging threats in a complex and changing world. Nat Rev Microbiol 2024:10.1038/s41579-024-01051-z. [PMID: 38789668 DOI: 10.1038/s41579-024-01051-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/16/2024] [Indexed: 05/26/2024]

Taiwo OR, Onyeaka H, Oladipo EK, Oloke JK, Chukwugozie DC. Advancements in Predictive Microbiology: Integrating New Technologies for Efficient Food Safety Models. Int J Microbiol 2024;2024:6612162. [PMID: 38799770 PMCID: PMC11126350 DOI: 10.1155/2024/6612162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 04/01/2024] [Accepted: 04/23/2024] [Indexed: 05/29/2024] Open

Guzinski J, Tang Y, Chattaway MA, Dallman TJ, Petrovska L. Development and validation of a random forest algorithm for source attribution of animal and human Salmonella Typhimurium and monophasic variants of S. Typhimurium isolates in England and Wales utilising whole genome sequencing data. Front Microbiol 2024;14:1254860. [PMID: 38533130 PMCID: PMC10963456 DOI: 10.3389/fmicb.2023.1254860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 12/22/2023] [Indexed: 03/28/2024] Open

Abstract

Source attribution has traditionally involved combining epidemiological data with different pathogen characterisation methods, including 7-gene multi locus sequence typing (MLST) or serotyping, however, these approaches have limited resolution. In contrast, whole genome sequencing data provide an overview of the whole genome that can be used by attribution algorithms. Here, we applied a random forest (RF) algorithm to predict the primary sources of human clinical Salmonella Typhimurium (S. Typhimurium) and monophasic variants (monophasic S. Typhimurium) isolates. To this end, we utilised single nucleotide polymorphism diversity in the core genome MLST alleles obtained from 1,061 laboratory-confirmed human and animal S. Typhimurium and monophasic S. Typhimurium isolates as inputs into a RF model. The algorithm was used for supervised learning to classify 399 animal S. Typhimurium and monophasic S. Typhimurium isolates into one of eight distinct primary source classes comprising common livestock and pet animal species: cattle, pigs, sheep, other mammals (pets: mostly dogs and horses), broilers, layers, turkeys, and game birds (pheasants, quail, and pigeons). When applied to the training set animal isolates, model accuracy was 0.929 and kappa 0.905, whereas for the test set animal isolates, for which the primary source class information was withheld from the model, the accuracy was 0.779 and kappa 0.700. Subsequently, the model was applied to assign 662 human clinical cases to the eight primary source classes. In the dataset, 60/399 (15.0%) of the animal and 141/662 (21.3%) of the human isolates were associated with a known outbreak of S. Typhimurium definitive type (DT) 104. All but two of the 141 DT104 outbreak linked human isolates were correctly attributed by the model to the primary source classes identified as the origin of the DT104 outbreak. 
A model that was run without the clonal DT104 animal isolates produced largely congruent outputs (training set accuracy 0.989 and kappa 0.985; test set accuracy 0.781 and kappa 0.663). Overall, our results show that RF offers considerable promise as a suitable methodology for epidemiological tracking and source attribution for foodborne pathogens.
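The abstract reports accuracy alongside Cohen's kappa, which corrects accuracy for the agreement expected by chance given the class frequencies. The following is a minimal sketch of both measures; the labels are made-up examples using three of the source classes named above, and the code is not the authors' implementation.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Fraction of isolates whose predicted class equals the true class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def cohens_kappa(true_labels, predicted_labels):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (accuracy); p_e is the agreement
    expected by chance from the marginal class frequencies.
    """
    n = len(true_labels)
    p_o = accuracy(true_labels, predicted_labels)
    true_counts = Counter(true_labels)
    pred_counts = Counter(predicted_labels)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels, for illustration only.
true_src = ["cattle", "cattle", "pigs", "pigs", "sheep", "sheep"]
pred_src = ["cattle", "pigs", "pigs", "pigs", "sheep", "cattle"]

print(f"accuracy = {accuracy(true_src, pred_src):.3f}")
print(f"kappa    = {cohens_kappa(true_src, pred_src):.3f}")
```

Kappa is lower than accuracy whenever some agreement would be expected by chance, which matches the pattern in the reported training and test figures.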

Zhang T, Rabhi F, Chen X, Paik HY, MacIntyre CR. A machine learning-based universal outbreak risk prediction tool. Comput Biol Med 2024;169:107876. [PMID: 38176209 DOI: 10.1016/j.compbiomed.2023.107876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 12/12/2023] [Accepted: 12/18/2023] [Indexed: 01/06/2024]

Djordjevic SP, Jarocki VM, Seemann T, Cummins ML, Watt AE, Drigo B, Wyrsch ER, Reid CJ, Donner E, Howden BP. Genomic surveillance for antimicrobial resistance - a One Health perspective. Nat Rev Genet 2024;25:142-157. [PMID: 37749210 DOI: 10.1038/s41576-023-00649-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/02/2023] [Indexed: 09/27/2023]

Affiliation(s)

Steven P Djordjevic: Australian Institute for Microbiology and Infection, University of Technology Sydney, Sydney, New South Wales, Australia; Australian Centre for Genomic Epidemiological Microbiology, University of Technology Sydney, Sydney, New South Wales, Australia
Veronica M Jarocki: Australian Institute for Microbiology and Infection, University of Technology Sydney, Sydney, New South Wales, Australia; Australian Centre for Genomic Epidemiological Microbiology, University of Technology Sydney, Sydney, New South Wales, Australia
Torsten Seemann: Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia; Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
Max L Cummins: Australian Institute for Microbiology and Infection, University of Technology Sydney, Sydney, New South Wales, Australia; Australian Centre for Genomic Epidemiological Microbiology, University of Technology Sydney, Sydney, New South Wales, Australia
Anne E Watt: Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
Barbara Drigo: UniSA STEM, University of South Australia, Adelaide, South Australia, Australia; Future Industries Institute, University of South Australia, Adelaide, South Australia, Australia
Ethan R Wyrsch: Australian Institute for Microbiology and Infection, University of Technology Sydney, Sydney, New South Wales, Australia; Australian Centre for Genomic Epidemiological Microbiology, University of Technology Sydney, Sydney, New South Wales, Australia
Cameron J Reid: Australian Institute for Microbiology and Infection, University of Technology Sydney, Sydney, New South Wales, Australia; Australian Centre for Genomic Epidemiological Microbiology, University of Technology Sydney, Sydney, New South Wales, Australia
Erica Donner: Future Industries Institute, University of South Australia, Adelaide, South Australia, Australia; Cooperative Research Centre for Solving Antimicrobial Resistance in Agribusiness, Food, and Environments (CRC SAAFE), Adelaide, South Australia, Australia
Benjamin P Howden: Centre for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia; Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, University of Melbourne at the Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia

Benefo EO, Karanth S, Pradhan AK. A machine learning approach to identifying Salmonella stress response genes in isolates from poultry processing. Food Res Int 2024;175:113635. [PMID: 38128977 DOI: 10.1016/j.foodres.2023.113635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 10/21/2023] [Accepted: 10/24/2023] [Indexed: 12/23/2023]

Gu W, Cui Z, Stroika S, Carleton HA, Conrad A, Katz LS, Richardson LC, Hunter J, Click ES, Bruce BB. Predicting Food Sources of Listeria monocytogenes Based on Genomic Profiling Using Random Forest Model. Foodborne Pathog Dis 2023;20:579-586. [PMID: 37699246 DOI: 10.1089/fpd.2023.0046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023] Open

D'Onofrio F, Schirone M, Krasteva I, Tittarelli M, Iannetti L, Pomilio F, Torresi M, Paparella A, D'Alterio N, Luciani M. A comprehensive investigation of protein expression profiles in L. monocytogenes exposed to thermal abuse, mild acid, and salt stress conditions. Front Microbiol 2023;14:1271787. [PMID: 37876777 PMCID: PMC10591339 DOI: 10.3389/fmicb.2023.1271787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Accepted: 09/19/2023] [Indexed: 10/26/2023] Open

Castelli P, De Ruvo A, Bucciacchio A, D'Alterio N, Cammà C, Di Pasquale A, Radomski N. Harmonization of supervised machine learning practices for efficient source attribution of Listeria monocytogenes based on genomic data. BMC Genomics 2023;24:560. [PMID: 37736708 PMCID: PMC10515079 DOI: 10.1186/s12864-023-09667-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 09/10/2023] [Indexed: 09/23/2023] Open

Abstract

BACKGROUND

Genomic data-based machine learning tools are promising for real-time surveillance activities performing source attribution of foodborne bacteria such as Listeria monocytogenes. Given the heterogeneity of machine learning practices, our aim was to identify those influencing the source prediction performance of the usual holdout method combined with the repeated k-fold cross-validation method.

METHODS

A large collection of 1 100 L. monocytogenes genomes with known sources was built according to several genomic metrics to ensure authenticity and completeness of genomic profiles. Based on these genomic profiles (i.e. 7-locus alleles, core alleles, accessory genes, core SNPs and pan kmers), we developed a versatile workflow assessing prediction performance of different combinations of training dataset splitting (i.e. 50, 60, 70, 80 and 90%), data preprocessing (i.e. with or without near-zero variance removal), and learning models (i.e. BLR, ERT, RF, SGB, SVM and XGB). The performance metrics included accuracy, Cohen's kappa, F1-score, area under the curves from receiver operating characteristic curve, precision recall curve or precision recall gain curve, and execution time.

RESULTS

The average testing accuracies from accessory genes and pan kmers were significantly higher than accuracies from core alleles or SNPs. While the accuracies from 70 and 80% of training dataset splitting were not significantly different, those from 80% were significantly higher than the other tested proportions. The near-zero variance removal did not produce results for 7-locus alleles, did not significantly impact the accuracy for core alleles, accessory genes and pan kmers, and significantly decreased accuracy for core SNPs. The SVM and XGB models did not differ significantly in accuracy from each other and reached significantly higher accuracies than BLR, SGB, ERT and RF, in that order. However, the SVM model required more computing power than the XGB model, especially for large numbers of descriptors such as core SNPs and pan kmers.

CONCLUSIONS

In addition to recommendations about machine learning practices for L. monocytogenes source attribution based on genomic data, the present study also provides a freely available workflow to solve other balanced or unbalanced multiclass phenotypes from binary and categorical genomic profiles of other microorganisms without source code modifications.

Affiliation(s)

Pierluigi Castelli: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
Andrea De Ruvo: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
Andrea Bucciacchio: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
Nicola D'Alterio: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
Cesare Cammà: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
Adriano Di Pasquale: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy
Nicolas Radomski: Istituto Zooprofilattico Sperimentale dell'Abruzzo e del Molise "Giuseppe Caporale" (IZSAM), National Reference Centre (NRC) for Whole Genome Sequencing of microbial pathogens: data base and bioinformatics analysis (GENPAT), Via Campo Boario, Teramo, TE, 64100, Italy

Artificial Intelligence Models for Zoonotic Pathogens: A Survey. Microorganisms 2022;10:1911. [PMID: 36296187 PMCID: PMC9607465 DOI: 10.3390/microorganisms10101911] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/16/2022] [Revised: 09/19/2022] [Accepted: 09/22/2022] [Indexed: 11/22/2022] Open