1
Li HM, Zheng JX, Midzi N, Mutsaka-Makuvaza MJ, Lv S, Xia S, Qian YJ, Xiao N, Bergquist R, Zhou XN. Schistosomiasis transmission in Zimbabwe: Modelling based on machine learning. Infect Dis Model 2024; 9:1081-1094. [PMID: 38988829 PMCID: PMC11233785 DOI: 10.1016/j.idm.2024.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 06/02/2024] [Accepted: 06/08/2024] [Indexed: 07/12/2024] Open
Abstract
Zimbabwe, located in Southern Africa, faces a significant public health challenge due to schistosomiasis. We investigated this issue with emphasis on risk prediction of schistosomiasis for the entire population. To this end, we reviewed available data on schistosomiasis in Zimbabwe from a literature search covering the 1980-2022 period considering the potential impact of 26 environmental and socioeconomic variables obtained from public sources. We studied the population requiring praziquantel with regard to whether or not mass drug administration (MDA) had been regularly applied. Three machine-learning algorithms were tested for their ability to predict the prevalence of schistosomiasis in Zimbabwe based on the mean absolute error (MAE), the root mean squared error (RMSE) and the coefficient of determination (R2). The findings revealed different roles of the 26 factors with respect to transmission and there were particular variations between Schistosoma haematobium and S. mansoni infections. We found that the top-five correlation factors, such as the past (rather than current) time, unsettled MDA implementation, constrained economy, high rainfall during the warmest season, and high annual precipitation were closely associated with higher S. haematobium prevalence, while lower elevation, high rainfall during the warmest season, steeper slope, past (rather than current) time, and higher minimum temperature in the coldest month were rather related to higher S. mansoni prevalence. The random forest (RF) algorithm was considered as the formal best model construction method, with MAE = 0.108; RMSE = 0.143; and R2 = 0.517 for S. haematobium, and with the corresponding figures for S. mansoni being 0.053; 0.082; and 0.458. Based on this optimal model, the current total schistosomiasis prevalence in Zimbabwe under MDA implementation was 19.8%, with that of S. haematobium at 13.8% and that of S. mansoni at 7.1%, requiring annual MDA based on a population of 3,003,928. 
Without MDA, the current total schistosomiasis prevalence would be 23.2%, with that of S. haematobium at 17.1% and that of S. mansoni at 7.4%, requiring annual MDA based on a population of 3,521,466. The study reveals that MDA alone is insufficient for schistosomiasis elimination, especially that due to S. mansoni. This study predicts a moderate prevalence of schistosomiasis in Zimbabwe, with its elimination requiring comprehensive control measures beyond the currently used strategies, including health education, snail control, population surveillance and environmental management.
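The three evaluation metrics named in this abstract (MAE, RMSE and R2) can be illustrated with a minimal sketch. The function name `regression_metrics` and the toy prevalence values are assumptions made for illustration only; they are not taken from the paper, whose own data and code are not shown here.

```python
import math

def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R2) for paired observed and predicted values.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    # Mean absolute error: average size of the residuals.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalises large residuals more strongly.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # Coefficient of determination: 1 minus residual over total variance.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Toy prevalence values (fractions), invented for this example.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.15, 0.15, 0.35, 0.35]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

Lower MAE and RMSE and a higher R2 indicate a better fit, which is the sense in which the abstract compares the three algorithms.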
Affiliation(s)
- Hong-Mei Li
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
- Jin-Xin Zheng
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
- Nicholas Midzi
- National Institute of Health Research, Ministry of Health and Child Care, Harare, Zimbabwe
- Masceline Jenipher Mutsaka-Makuvaza
- National Institute of Health Research, Ministry of Health and Child Care, Harare, Zimbabwe
- University of Rwanda, College of Medicine and Health Sciences, School of Medicine and Pharmacy, Department of Microbiology and Parasitology, Rwanda
- Shan Lv
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
- Shang Xia
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
- Ying-jun Qian
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
- Ning Xiao
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
- Xiao-Nong Zhou
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, 200025, China
2
Garcia-Vozmediano A, Maurella C, Ceballos LA, Crescio E, Meo R, Martelli W, Pitti M, Lombardi D, Meloni D, Pasqualini C, Ru G. Machine learning approach as an early warning system to prevent foodborne Salmonella outbreaks in northwestern Italy. Vet Res 2024; 55:72. [PMID: 38840261 PMCID: PMC11154984 DOI: 10.1186/s13567-024-01323-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 04/15/2024] [Indexed: 06/07/2024] Open
Abstract
Salmonellosis, one of the most common foodborne infections in Europe, is monitored by food safety surveillance programmes, resulting in the generation of extensive databases. By leveraging tree-based machine learning (ML) algorithms, we exploited data from food safety audits to predict spatiotemporal patterns of salmonellosis in northwestern Italy. Data on human cases confirmed in 2015-2018 (n = 1969) and food surveillance data collected in 2014-2018 were used to develop ML algorithms. We integrated the monthly municipal human incidence with 27 potential predictors, including the observed prevalence of Salmonella in food. We applied the tree regression, random forest and gradient boosting algorithms considering different scenarios and evaluated their predictivity in terms of the mean absolute percentage error (MAPE) and R2. Using a similar dataset from the year 2019, spatiotemporal predictions and their relative sensitivities and specificities were obtained. Random forest and gradient boosting (R2 = 0.55, MAPE = 7.5%) outperformed the tree regression algorithm (R2 = 0.42, MAPE = 8.8%). Salmonella prevalence in food; spatial features; and monitoring efforts in ready-to-eat milk, fruits and vegetables, and pig meat products contributed the most to the models' predictivity, reducing the variance by 90.5%. Conversely, the number of positive samples obtained for specific food matrices minimally influenced the predictions (2.9%). Spatiotemporal predictions for 2019 showed sensitivity and specificity levels of 46.5% (due to the lack of some infection hotspots) and 78.5%, respectively. This study demonstrates the added value of integrating data from human and veterinary health services to develop predictive models of human salmonellosis occurrence, providing early warnings useful for mitigating foodborne disease impacts on public health.
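As a rough illustration of the measures reported in this abstract, the sketch below computes the mean absolute percentage error (MAPE) and the sensitivity and specificity of binary predictions. The function names and the toy inputs are hypothetical and are not drawn from the study.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (MAPE is undefined otherwise).
    """
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def sensitivity_specificity(actual, flagged):
    """Return (sensitivity, specificity) for paired 0/1 labels.

    Assumes at least one actual positive and one actual negative.
    """
    pairs = list(zip(actual, flagged))
    tp = sum(1 for a, f in pairs if a and f)          # true positives
    fn = sum(1 for a, f in pairs if a and not f)      # missed cases
    tn = sum(1 for a, f in pairs if not a and not f)  # true negatives
    fp = sum(1 for a, f in pairs if not a and f)      # false alarms
    return tp / (tp + fn), tn / (tn + fp)

# Invented monthly incidence values and outbreak flags.
error_pct = mape([10, 20, 40], [9, 22, 40])
sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
print(f"MAPE={error_pct:.2f}% sensitivity={sens:.2f} specificity={spec:.2f}")
```

In an early-warning setting, a low sensitivity such as the one reported for 2019 means that some true hotspots are not flagged, while specificity describes how often non-affected units are correctly left unflagged.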
Affiliation(s)
- Aitor Garcia-Vozmediano
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Cristiana Maurella
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Leonardo A Ceballos
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Elisabetta Crescio
- Tecnológico de Monterrey, Av. Eugenio Garza Sada 2501 Sur, Tecnológico, 64849, Monterrey, N.L., México
- Rosa Meo
- Department of Computer Science, University of Turin, Corso Svizzera 185, 10149, Turin, Italy
- Walter Martelli
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Monica Pitti
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Daniela Lombardi
- Piedmont Regional Service for the Epidemiology of Infectious Diseases (SeREMI), Via Venezia 6, 15121, Alessandria, Italy
- Daniela Meloni
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
- Chiara Pasqualini
- Piedmont Regional Service for the Epidemiology of Infectious Diseases (SeREMI), Via Venezia 6, 15121, Alessandria, Italy
- Giuseppe Ru
- Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy
3
Rusic D, Kumric M, Seselja Perisin A, Leskur D, Bukic J, Modun D, Vilovic M, Vrdoljak J, Martinovic D, Grahovac M, Bozic J. Tackling the Antimicrobial Resistance "Pandemic" with Machine Learning Tools: A Summary of Available Evidence. Microorganisms 2024; 12:842. [PMID: 38792673 PMCID: PMC11123121 DOI: 10.3390/microorganisms12050842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Revised: 04/16/2024] [Accepted: 04/19/2024] [Indexed: 05/26/2024] Open
Abstract
Antimicrobial resistance is recognised as one of the top threats healthcare is bound to face in the future. There have been various attempts to preserve the efficacy of existing antimicrobials, develop new and efficient antimicrobials, manage infections with multi-drug resistant strains, and improve patient outcomes, resulting in a growing mass of routinely available data, including electronic health records and microbiological information that can be employed to develop individualised antimicrobial stewardship. Machine learning methods have been developed to predict antimicrobial resistance from whole-genome sequencing data, forecast medication susceptibility, recognise epidemic patterns for surveillance purposes, or propose new antibacterial treatments and accelerate scientific discovery. Unfortunately, there is an evident gap between the number of machine learning applications in science and the effective implementation of these systems. This narrative review highlights some of the outstanding opportunities that machine learning offers when applied in research related to antimicrobial resistance. In the future, machine learning tools may prove to be superbugs' kryptonite. This review aims to provide an overview of available publications to aid researchers that are looking to expand their work with new approaches and to acquaint them with the current application of machine learning techniques in this field.
Affiliation(s)
- Doris Rusic
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
- Marko Kumric
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
- Ana Seselja Perisin
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
- Dario Leskur
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
- Josipa Bukic
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
- Darko Modun
- Department of Pharmacy, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (D.R.); (A.S.P.); (D.L.); (J.B.); (D.M.)
- Marino Vilovic
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
- Josip Vrdoljak
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
- Dinko Martinovic
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Department of Maxillofacial Surgery, University Hospital of Split, Spinciceva 1, 21000 Split, Croatia
- Marko Grahovac
- Department of Pharmacology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
- Josko Bozic
- Department of Pathophysiology, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia; (M.K.); (M.V.); (J.V.); (D.M.)
- Laboratory for Cardiometabolic Research, University of Split School of Medicine, Soltanska 2A, 21000 Split, Croatia
4
Guzinski J, Tang Y, Chattaway MA, Dallman TJ, Petrovska L. Development and validation of a random forest algorithm for source attribution of animal and human Salmonella Typhimurium and monophasic variants of S. Typhimurium isolates in England and Wales utilising whole genome sequencing data. Front Microbiol 2024; 14:1254860. [PMID: 38533130 PMCID: PMC10963456 DOI: 10.3389/fmicb.2023.1254860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 12/22/2023] [Indexed: 03/28/2024] Open
Abstract
Source attribution has traditionally involved combining epidemiological data with different pathogen characterisation methods, including 7-gene multi locus sequence typing (MLST) or serotyping, however, these approaches have limited resolution. In contrast, whole genome sequencing data provide an overview of the whole genome that can be used by attribution algorithms. Here, we applied a random forest (RF) algorithm to predict the primary sources of human clinical Salmonella Typhimurium (S. Typhimurium) and monophasic variants (monophasic S. Typhimurium) isolates. To this end, we utilised single nucleotide polymorphism diversity in the core genome MLST alleles obtained from 1,061 laboratory-confirmed human and animal S. Typhimurium and monophasic S. Typhimurium isolates as inputs into a RF model. The algorithm was used for supervised learning to classify 399 animal S. Typhimurium and monophasic S. Typhimurium isolates into one of eight distinct primary source classes comprising common livestock and pet animal species: cattle, pigs, sheep, other mammals (pets: mostly dogs and horses), broilers, layers, turkeys, and game birds (pheasants, quail, and pigeons). When applied to the training set animal isolates, model accuracy was 0.929 and kappa 0.905, whereas for the test set animal isolates, for which the primary source class information was withheld from the model, the accuracy was 0.779 and kappa 0.700. Subsequently, the model was applied to assign 662 human clinical cases to the eight primary source classes. In the dataset, 60/399 (15.0%) of the animal and 141/662 (21.3%) of the human isolates were associated with a known outbreak of S. Typhimurium definitive type (DT) 104. All but two of the 141 DT104 outbreak linked human isolates were correctly attributed by the model to the primary source classes identified as the origin of the DT104 outbreak. 
A model that was run without the clonal DT104 animal isolates produced largely congruent outputs (training set accuracy 0.989 and kappa 0.985; test set accuracy 0.781 and kappa 0.663). Overall, our results show that RF offers considerable promise as a suitable methodology for epidemiological tracking and source attribution for foodborne pathogens.
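The accuracy and kappa values quoted in this abstract are both derived from a confusion matrix. The sketch below shows one common way to compute them (Cohen's kappa), under the assumption that this is the kappa statistic meant; the helper name `accuracy_and_kappa` and the 2x2 toy matrix are invented for illustration, whereas the study itself used eight source classes.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] counts items of true class i predicted as class j.
    Hypothetical helper; not the authors' code.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of items on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Expected agreement by chance, from the row and column totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Toy two-class confusion matrix, invented for this example.
accuracy, kappa = accuracy_and_kappa([[20, 5], [10, 15]])
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Kappa corrects accuracy for the agreement expected by chance, which is why it is lower than accuracy in both the training and test results reported above.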
Affiliation(s)
- Jaromir Guzinski
- Animal and Plant Health Agency, Bacteriology Department, Addlestone, United Kingdom
- Yue Tang
- Animal and Plant Health Agency, Bacteriology Department, Addlestone, United Kingdom
- Marie Anne Chattaway
- Gastrointestinal Bacteria Reference Unit, UK Health Security Agency, London, United Kingdom
- Timothy J. Dallman
- Gastrointestinal Bacteria Reference Unit, UK Health Security Agency, London, United Kingdom
- Liljana Petrovska
- Animal and Plant Health Agency, Bacteriology Department, Addlestone, United Kingdom
5
Cai G, Xu J, Ding Q, Lin T, Chen H, Wu M, Li W, Chen G, Xu G, Lan Y. Electroencephalography oscillations can predict the cortical response following theta burst stimulation. Brain Res Bull 2024; 208:110902. [PMID: 38367675 DOI: 10.1016/j.brainresbull.2024.110902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 01/28/2024] [Accepted: 02/14/2024] [Indexed: 02/19/2024]
Abstract
BACKGROUND Continuous theta burst stimulation and intermittent theta burst stimulation are clinically popular models of repetitive transcranial magnetic stimulation. However, they are limited by high variability between individuals in cortical excitability changes following stimulation. Although electroencephalography oscillations have been reported to modulate the cortical response to transcranial magnetic stimulation, their association remains unclear. This study aims to explore whether machine learning models based on EEG oscillation features can predict the cortical response to transcranial magnetic stimulation. METHOD Twenty-three young, healthy adults attended two randomly assigned sessions for continuous and intermittent theta burst stimulation. In each session, ten minutes of resting-state electroencephalography were recorded before delivering brain stimulation. Participants were classified as responders or non-responders based on changes in resting motor thresholds. Support vector machines and multi-layer perceptrons were used to establish predictive models of individual responses to transcranial magnetic stimulation. RESULT Among the evaluated algorithms, support vector machines achieved the best performance in discriminating responders from non-responders for intermittent theta burst stimulation (accuracy: 91.30%) and continuous theta burst stimulation (accuracy: 95.66%). The global clustering coefficient and global characteristic path length in the beta band had the greatest impact on model output. CONCLUSION These findings suggest that EEG features can serve as markers of cortical response to transcranial magnetic stimulation. They offer insights into the association between neural oscillations and variability in individuals' responses to transcranial magnetic stimulation, aiding in the optimization of individualized protocols.
Affiliation(s)
- Guiyuan Cai
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China
- Jiayue Xu
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China
- Qian Ding
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China; Department of Rehabilitation Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, 519041 China
- Tuo Lin
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China
- Hongying Chen
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China
- Manfeng Wu
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China
- Wanqi Li
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China
- Gengbin Chen
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China; Postgraduate Research Institute, Guangzhou Sport University, Guangzhou, 510500 China
- Guangqing Xu
- Department of Rehabilitation Medicine, Guangdong Provincial People's Hospital (Guangdong Academy of Medical Sciences), Southern Medical University, Guangzhou, 519041 China
- Yue Lan
- Department of Rehabilitation Medicine, the Second Affiliated Hospital, School of Medicine, South China University of Technology, Guangzhou, 510013 China; Guangzhou Key Laboratory of Aging Frailty and Neurorehabilitation, Guangzhou 510013, China
6
Zou H, Han J, Zhao L, Wang D, Guan Y, Wu T, Hou X, Han H, Li X. The shared NDM-positive strains in the hospital and connecting aquatic environments. Sci Total Environ 2023; 860:160404. [PMID: 36427732 DOI: 10.1016/j.scitotenv.2022.160404] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/19/2022] [Revised: 11/09/2022] [Accepted: 11/18/2022] [Indexed: 06/16/2023]
Abstract
The spread of antibiotic-resistant priority pathogens outside hospital settings is both a significant public health concern and an environmental problem. In recent years, New Delhi Metallo-β-lactamase (NDM)-positive strains have caused nosocomial infections with high mortality and poor prognosis worldwide. Our study investigated the links of NDM-positive strains between the hospital and the connecting river system in Jinan city, Eastern China by using NDM-producing Escherichia coli (NDM-EC) as an indicator via whole genome sequencing. Thirteen NDM-EC isolates were detected from 187 river water and sediment samples, while 9 isolates were identified from patients at the local hospital. All NDM-EC isolates were resistant to imipenem, meropenem, cefotaxime, cefoxitin, ampicillin, tetracycline, fosfomycin, and piperacillin-tazobactam. The blaNDM-5 (n = 20) and blaNDM-9 (n = 2) genes were identified, which were predominantly on IncX3 plasmids (n = 13), followed by IncFII plasmids (n = 5) and IncFIA plasmids (n = 2). Conjugation experiments showed that 21 isolates could transfer NDM-harboring plasmids. The well-conserved blaNDM-5 genetic environment (ISAba125-blaNDM-5/9-bleMBL-trpF-dsbD-IS26) of these plasmids suggested a common genetic origin. Nine sequence types (STs) were detected, including three international high-risk clones ST167 (n = 8), ST410 (n = 1), and ST617 (n = 1). Phylogenetic analysis showed ST167 E. coli from the river was genotypically related to clinical isolates recovered from patients. Furthermore, ST167 isolates showed high genetic similarities with other clinical strains from geographically distinct regions. The genetic concordance between isolates from different sampling sites in the same river (ST218 clone), and different rivers (ST448 clone) raises concerns regarding the rapid dissemination of NDM-EC in the aquatic environment. The emergence and spread of the clinically relevant NDM-positive strains, especially the E. coli ST167 clone, an international high-risk clone associated with multi-resistance and virulence capacity, within and between the hospital and aquatic environments were elucidated, highlighting the need for attention and action.
Affiliation(s)
- Huiyun Zou
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
- Jingyi Han
- Department of Thoracic Surgery, Qilu Hospital of Shandong University, Jinan, Shandong, China
- Ling Zhao
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
- Di Wang
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
- Yanyu Guan
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
- Tianle Wu
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
- Xinjiao Hou
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
- Hui Han
- Department of Infection Control, Qilu Hospital of Shandong University, Jinan, China
- Xuewen Li
- Department of Environment and Health, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, Shandong 250012, China
7
Geurtsen J, de Been M, Weerdenburg E, Zomer A, McNally A, Poolman J. Genomics and pathotypes of the many faces of Escherichia coli. FEMS Microbiol Rev 2022; 46:fuac031. [PMID: 35749579 PMCID: PMC9629502 DOI: 10.1093/femsre/fuac031] [Citation(s) in RCA: 53] [Impact Index Per Article: 26.5] [Received: 11/20/2021] [Accepted: 06/22/2022] [Indexed: 01/09/2023] Open
Abstract
Escherichia coli is the most researched microbial organism in the world. Its varied impact on human health, consisting of commensalism, gastrointestinal disease, or extraintestinal pathologies, has generated a separation of the species into at least eleven pathotypes (also known as pathovars). These are broadly split into two groups, intestinal pathogenic E. coli (InPEC) and extraintestinal pathogenic E. coli (ExPEC). However, components of E. coli's infinite open accessory genome are horizontally transferred with substantial frequency, creating pathogenic hybrid strains that defy a clear pathotype designation. Here, we take a bird's-eye view of the E. coli species, characterizing it from historical, clinical, and genetic perspectives. We examine the wide spectrum of human disease caused by E. coli, the genome content of the bacterium, and its propensity to acquire, exchange, and maintain antibiotic resistance genes and virulence traits. Our portrayal of the species also discusses elements that have shaped its overall population structure and summarizes the current state of vaccine development targeted at the most frequent E. coli pathovars. In our conclusions, we advocate streamlining efforts for clinical reporting of ExPEC, and emphasize the pathogenic potential that exists throughout the entire species.
Affiliation(s)
- Jeroen Geurtsen
- Janssen Vaccines and Prevention B.V., 2333 Leiden, the Netherlands
- Mark de Been
- Janssen Vaccines and Prevention B.V., 2333 Leiden, the Netherlands
- Aldert Zomer
- Department of Infectious Diseases and Immunology, Faculty of Veterinary Medicine, Utrecht University, 3584 Utrecht, the Netherlands
- Alan McNally
- Institute of Microbiology and Infection, College of Medical and Dental Sciences, University of Birmingham, B15 2TT Birmingham, United Kingdom
- Jan Poolman
- Janssen Vaccines and Prevention B.V., 2333 Leiden, the Netherlands
8
Development of a Prediction Method of Cell Density in Autotrophic/Heterotrophic Microorganism Mixtures by Machine Learning Using Absorbance Spectrum Data. BioTech (Basel) 2022; 11:biotech11040046. [PMID: 36278558 PMCID: PMC9624369 DOI: 10.3390/biotech11040046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2022] [Revised: 10/09/2022] [Accepted: 10/11/2022] [Indexed: 11/06/2022]
Abstract
Microflora is actively used to produce value-added materials in industry, and the density of each cell type should be controlled for stable microflora use. In this study, a simple system for evaluating cell density was constructed with artificial intelligence (AI) using the absorbance spectra of microflora. To set up the system, a machine learning-based cell density predictor was built using the spectra of mixtures of Saccharomyces cerevisiae and Chlamydomonas reinhardtii as features. When cell density was predicted by extremely randomized trees, the coefficient of determination (R2) was 0.8495 when the cell densities of S. cerevisiae and C. reinhardtii were shifted and fixed, respectively; when they were fixed and shifted, respectively, the R2 was 0.9232. To explain the prediction system, the extremely randomized trees regressor, a decision tree-based ensemble learning method, was used as the machine learning algorithm, and Shapley additive explanations (SHAP) was used as the explainable AI (XAI) method to interpret the features contributing to the predictions. The SHAP analyses indicated that not only the optical density but also the absorbance of the Soret and Q bands derived from the chloroplasts of C. reinhardtii could contribute to the prediction as features. This simple cell density evaluation system could have an industrial impact.
9
Vilne B, Ķibilds J, Siksna I, Lazda I, Valciņa O, Krūmiņa A. Could Artificial Intelligence/Machine Learning and Inclusion of Diet-Gut Microbiome Interactions Improve Disease Risk Prediction? Case Study: Coronary Artery Disease. Front Microbiol 2022; 13:627892. [PMID: 35479632 PMCID: PMC9036178 DOI: 10.3389/fmicb.2022.627892] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/10/2020] [Accepted: 02/24/2022] [Indexed: 12/14/2022] Open
Abstract
Coronary artery disease (CAD) is the most common cardiovascular disease (CVD) and the leading cause of morbidity and mortality worldwide, posing a huge socio-economic burden on society and health systems. Therefore, timely and precise identification of people at high risk of CAD is urgently required. Most current CAD risk prediction approaches are based on a small number of traditional risk factors (age, sex, diabetes, LDL and HDL cholesterol, smoking, systolic blood pressure) and are incompletely predictive across all patient groups, as CAD is a multi-factorial disease with complex etiology, considered to be driven by both genetic and numerous environmental/lifestyle factors. Diet is one of the modifiable factors for improving lifestyle and disease prevention. However, the current rise in obesity, type 2 diabetes (T2D) and CVD/CAD indicates that the “one-size-fits-all” approach may not be efficient, due to significant variation in inter-individual responses. Recently, the gut microbiome has emerged as a potential and previously under-explored contributor to these variations. Hence, efficient integration of dietary and gut microbiome information alongside genetic variations and clinical data holds great promise to improve CAD risk prediction. Nevertheless, the highly complex nature of meals combined with the huge inter-individual variability of the gut microbiome poses several Big Data analytics challenges in modeling diet-gut microbiota interactions and integrating these within CAD risk prediction approaches for the development of personalized decision support systems (DSS). In this regard, the recent re-emergence of Artificial Intelligence (AI) / Machine Learning (ML) is opening intriguing perspectives, as these approaches are able to capture large and complex matrices of data, incorporating their interactions and identifying both linear and non-linear relationships.
In this Mini-Review, we consider (1) the most used AI/ML approaches and their different use cases for CAD risk prediction; (2) modeling of the content, choice and impact of dietary factors on CAD risk; (3) classification of individuals by their gut microbiome composition into CAD cases vs. controls; and (4) modeling of the diet-gut microbiome interactions and their impact on CAD risk. Finally, we provide an outlook for putting it all together for improved CAD risk predictions.
Affiliation(s)
- Baiba Vilne
  - Bioinformatics Lab, Riga Stradins University, Riga, Latvia
  - COST Action CA18131 - Statistical and Machine Learning Techniques in Human Microbiome Studies, Brussels, Belgium
  - *Correspondence: Baiba Vilne
- Juris Ķibilds
  - Institute of Food Safety, Animal Health and Environment BIOR, Riga, Latvia
- Inese Siksna
  - Institute of Food Safety, Animal Health and Environment BIOR, Riga, Latvia
- Ilva Lazda
  - Institute of Food Safety, Animal Health and Environment BIOR, Riga, Latvia
- Olga Valciņa
  - Institute of Food Safety, Animal Health and Environment BIOR, Riga, Latvia
- Angelika Krūmiņa
  - Institute of Food Safety, Animal Health and Environment BIOR, Riga, Latvia
  - Department of Infectology and Dermatology, Riga Stradins University, Riga, Latvia
|
10
|
Karanth S, Tanui CK, Meng J, Pradhan AK. Exploring the predictive capability of advanced machine learning in identifying severe disease phenotype in Salmonella enterica. Food Res Int 2022; 151:110817. [PMID: 34980422 DOI: 10.1016/j.foodres.2021.110817] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/17/2021] [Revised: 11/12/2021] [Accepted: 11/17/2021] [Indexed: 11/26/2022]
Abstract
The past few years have seen a significant increase in the availability of whole genome sequencing (WGS) information, allowing for its incorporation in predictive modeling for foodborne pathogens to account for inter- and intra-species differences in their virulence. However, this is hindered by the inability of traditional statistical methods to analyze such large amounts of data relative to the number of observations/isolates. In this study, we have explored the applicability of machine learning (ML) models to predict the disease outcome, while identifying features that exert a significant effect on the prediction. This study was conducted on Salmonella enterica, a major foodborne pathogen with considerable inter- and intra-serovar variation. WGS data from isolates obtained from various sources (i.e., human, chicken, and swine) were used as input in four machine learning models (logistic regression with ridge, random forest, support vector machine, and AdaBoost) to classify isolates based on disease severity (extraintestinal vs. gastrointestinal) in the host. The predictive performances of all models were tested with and without Elastic Net regularization to combat dimensionality issues. The Elastic Net-regularized logistic regression model showed the best area under the receiver operating characteristic curve (AUC-ROC; 0.86) and outcome prediction accuracy (0.76). Additionally, genes coding for transcriptional regulation, acidic, oxidative, and anaerobic stress response, and antibiotic resistance were found to be significant predictors of disease severity. These genes, which were significantly associated with each outcome, could possibly be input in amended, gene-expression-specific predictive models to estimate the virulence pattern-specific effect of Salmonella and other foodborne pathogens on human health.
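The two metrics quoted in this abstract, AUC-ROC and accuracy, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the labels and scores are made-up toy values, the 0.5 decision threshold is an assumption, and the function names `auc_roc` and `accuracy` are hypothetical. AUC is computed through its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
def auc_roc(labels, scores):
    """AUC via pairwise comparison of positives and negatives; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct calls after thresholding the scores (threshold is assumed)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data (hypothetical): 1 = extraintestinal, 0 = gastrointestinal.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

print(round(auc_roc(labels, scores), 3))
print(round(accuracy(labels, scores), 3))
```

AUC is independent of any single threshold, whereas accuracy depends on the chosen cut-off, which is one reason the two figures in an abstract like this can differ noticeably.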
Affiliation(s)
- Shraddha Karanth
  - Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA
- Collins K Tanui
  - Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA
- Jianghong Meng
  - Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA; Joint Institute for Food Safety and Applied Nutrition, University of Maryland, College Park, MD 20742, USA
- Abani K Pradhan
  - Department of Nutrition and Food Science, University of Maryland, College Park, MD 20742, USA; Center for Food Safety and Security Systems, University of Maryland, College Park, MD 20742, USA
|
11
|
Du Y, Guo Y. Machine learning techniques and research framework in foodborne disease surveillance system. Food Control 2022. [DOI: 10.1016/j.foodcont.2021.108448] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
|
12
|
Voigt B, Fischer O, Krumnow C, Herta C, Dabrowski PW. NGS read classification using AI. PLoS One 2021; 16:e0261548. [PMID: 34936673 PMCID: PMC8694450 DOI: 10.1371/journal.pone.0261548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2021] [Accepted: 12/03/2021] [Indexed: 11/19/2022] Open
Abstract
Clinical metagenomics is a powerful diagnostic tool, as it offers an open view into all DNA in a patient's sample. This allows the detection of pathogens that would slip through the cracks of classical specific assays. However, due to this unspecific nature of metagenomic sequencing, a huge amount of unspecific data is generated during the sequencing itself and the diagnosis only takes place at the data analysis stage where relevant sequences are filtered out. Typically, this is done by comparison to reference databases. While this approach has been optimized over the past years and works well to detect pathogens that are represented in the used databases, a common challenge in analysing a metagenomic patient sample arises when no pathogen sequences are found: How to determine whether truly no evidence of a pathogen is present in the data or whether the pathogen's genome is simply absent from the database and the sequences in the dataset could thus not be classified? Here, we present a novel approach to this problem of detecting novel pathogens in metagenomic datasets by classifying the (segments of) proteins encoded by the sequences in the datasets. We train a neural network on the sequences of coding sequences, labeled by taxonomic domain, and use this neural network to predict the taxonomic classification of sequences that cannot be classified by comparison to a reference database, thus facilitating the detection of potential novel pathogens.
Affiliation(s)
- Benjamin Voigt
  - Center for Bio-Medical image and Information processing (CBMI), HTW University of Applied Sciences, Berlin, Germany
- Oliver Fischer
  - Center for Bio-Medical image and Information processing (CBMI), HTW University of Applied Sciences, Berlin, Germany
- Christian Krumnow
  - Center for Bio-Medical image and Information processing (CBMI), HTW University of Applied Sciences, Berlin, Germany
- Christian Herta
  - Center for Bio-Medical image and Information processing (CBMI), HTW University of Applied Sciences, Berlin, Germany
- Piotr Wojciech Dabrowski
  - Center for Bio-Medical image and Information processing (CBMI), HTW University of Applied Sciences, Berlin, Germany
|
13
|
Du Y, Wang H, Cui W, Zhu H, Guo Y, Dharejo FA, Zhou Y. Foodborne Disease Risk Prediction Using Multigraph Structural Long Short-term Memory Networks: Algorithm Design and Validation Study. JMIR Med Inform 2021; 9:e29433. [PMID: 34338648 PMCID: PMC8369373 DOI: 10.2196/29433] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/07/2021] [Revised: 05/11/2021] [Accepted: 05/19/2021] [Indexed: 01/17/2023] Open
Abstract
BACKGROUND: Foodborne disease is a common threat to human health worldwide, leading to millions of deaths every year. Thus, the accurate prediction of foodborne disease risk is urgent and of great importance for public health management. OBJECTIVE: We aimed to design a spatial-temporal risk prediction model suitable for predicting foodborne disease risks in various regions, to provide guidance for the prevention and control of foodborne diseases. METHODS: We designed a novel end-to-end framework to predict foodborne disease risk by using a multigraph structural long short-term memory neural network, which can utilize an encoder-decoder to achieve multistep prediction. In particular, to capture multiple spatial correlations, we divided regions by administrative area and constructed adjacent graphs with metrics that included region proximity, historical data similarity, regional function similarity, and exposure food similarity. We also integrated an attention mechanism in both spatial and temporal dimensions, as well as external factors, to refine prediction accuracy. We validated our model with a long-term real-world foodborne disease data set, comprising data from 2015 to 2019 from multiple provinces in China. RESULTS: Our model achieved F1 scores of 0.822, 0.679, 0.709, and 0.720 for single-month forecasts for the provinces of Beijing, Zhejiang, Shanxi and Hebei, respectively, and the highest F1 score was 20% higher than the best results of the other models. The experimental results clearly demonstrated that our approach can outperform other state-of-the-art models by a margin. CONCLUSIONS: The spatial-temporal risk prediction model can take into account the spatial-temporal characteristics of foodborne disease data and accurately determine future disease spatial-temporal risks, thereby providing support for the prevention and risk assessment of foodborne disease.
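The F1 scores reported above are the harmonic mean of precision and recall. The abstract does not say how the risk labels were encoded or averaged, so the following Python sketch only illustrates the metric on an assumed binary labelling (1 = high-risk period, 0 = otherwise); the data and the function name `f1_score` are hypothetical.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy monthly forecasts for one region (hypothetical values).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

print(f1_score(y_true, y_pred))
```

Because F1 ignores true negatives, it is often preferred over accuracy when high-risk periods are rare relative to low-risk ones.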
Affiliation(s)
- Yi Du
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
- Hanxue Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
- Wenjuan Cui
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
- Yunchang Guo
  - China National Center for Food Safety Risk Assessment, Beijing, China
- Fayaz Ali Dharejo
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
- Yuanchun Zhou
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
|
14
|
Saif NA, Cobo-Díaz JF, Elserafy M, El-Shiekh I, Álvarez-Ordóñez A, Mouftah SF, Elhadidy M. A pilot study revealing host-associated genetic signatures for source attribution of sporadic Campylobacter jejuni infection in Egypt. Transbound Emerg Dis 2021; 69:1847-1861. [PMID: 34033263 DOI: 10.1111/tbed.14165] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/09/2021] [Accepted: 05/22/2021] [Indexed: 11/30/2022]
Abstract
Campylobacter jejuni (C. jejuni) is considered among the most common bacterial causes of human gastroenteritis worldwide. The epidemiology and the transmission dynamics of campylobacteriosis in Egypt remain poorly defined due to the limited use of high-resolution typing methods. In this pilot study, we evaluated the discriminatory power of multiple 'gene-by-gene based' typing techniques to characterize C. jejuni obtained from different sources and estimate the relative contribution of different potential sources of C. jejuni infection in Egypt. Whole genome sequencing (WGS) was performed on 90 C. jejuni isolates recovered from clinical samples, retail chicken, and dairy products in Egypt from 2017 to 2018. Comparative genomic analysis was performed using conventional seven-locus multilocus sequence typing (MLST), ribosomal MLST (rMLST), core genome MLST (cgMLST), allelic variation in 15 host-segregating (HS) markers, and comparative genomic fingerprinting (CGF40). The probabilistic source attribution was performed via STRUCTURE software using MLST, CGF40, cgMLST and allelic variation in HS markers. Comparison of the discriminatory power of the aforementioned genotyping methods revealed cgMLST to be the most discriminative method, followed by HS markers. The source attribution analysis showed the role of retail chicken as a source of infection among clinical cases in Egypt when HS and cgMLST were used (64.2% and 52.3% of clinical isolates were assigned to this source, respectively). Interestingly, the cattle reservoir was also identified as a contributor to C. jejuni infection in Egypt; 35.8% and 47.7% of clinical isolates were assigned to this source by HS and cgMLST, respectively. Here, we provided evidence of the importance of using WGS typing methods to facilitate source tracking of C. jejuni.
Our findings suggest the importance of non-poultry sources, together with the previously reported role of retail chicken, in human campylobacteriosis in Egypt, which can provide insights to inform national control measures.
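Gene-by-gene schemes such as the MLST and cgMLST mentioned above describe each isolate by the allele number called at every locus, and isolates are commonly compared by counting the loci at which those numbers differ. The Python sketch below illustrates that idea only: the locus names, allele numbers and isolate labels are invented, skipping loci with missing calls is an assumed convention, and it does not reproduce the STRUCTURE-based attribution used in the study.

```python
from itertools import combinations


def allele_distance(profile_a, profile_b):
    """Number of shared, called loci at which two allele profiles differ.

    Loci absent from either profile or with a missing call (None) are skipped.
    """
    distance = 0
    for locus, allele_a in profile_a.items():
        allele_b = profile_b.get(locus)
        if allele_a is None or allele_b is None:
            continue
        if allele_a != allele_b:
            distance += 1
    return distance


# Hypothetical allele profiles: locus name -> allele number (None = no call).
isolates = {
    "clinical_01": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": 7},
    "chicken_01": {"locus1": 1, "locus2": 4, "locus3": 3, "locus4": None},
    "dairy_01": {"locus1": 2, "locus2": 5, "locus3": 3, "locus4": 7},
}

# Pairwise distance table for every pair of isolates.
distances = {
    (a, b): allele_distance(isolates[a], isolates[b])
    for a, b in combinations(isolates, 2)
}
for pair, d in distances.items():
    print(pair, d)
```

A scheme with more loci can separate isolates that look identical under a seven-locus scheme, which is consistent with cgMLST being reported as the most discriminative method here.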
Affiliation(s)
- Nehal A Saif
  - Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt
- José F Cobo-Díaz
  - Department of Food Hygiene and Technology, Universidad de León, León, Spain; Institute of Food Science and Technology, Universidad de León, León, Spain
- Menattallah Elserafy
  - Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt; Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
- Iman El-Shiekh
  - Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt; Center for Genomics, Helmy Institute for Medical Sciences, Zewail City of Science and Technology, Giza, Egypt
- Avelino Álvarez-Ordóñez
  - Department of Food Hygiene and Technology, Universidad de León, León, Spain; Institute of Food Science and Technology, Universidad de León, León, Spain
- Shaimaa F Mouftah
  - Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt
- Mohamed Elhadidy
  - Biomedical Sciences Program, University of Science and Technology, Zewail City of Science and Technology, Giza, Egypt; Department of Bacteriology, Mycology, and Immunology, Faculty of Veterinary Medicine, Mansoura University, Mansoura, Egypt
|
15
|
miRNA Regulatory Functions in Farm Animal Diseases, and Biomarker Potentials for Effective Therapies. Int J Mol Sci 2021; 22:ijms22063080. [PMID: 33802936 PMCID: PMC8002598 DOI: 10.3390/ijms22063080] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 12/29/2020] [Revised: 03/03/2021] [Accepted: 03/08/2021] [Indexed: 02/06/2023] Open
Abstract
MicroRNAs (miRNAs) are small endogenous RNAs that regulate gene expression post-transcriptionally by targeting either the 3′ untranslated or coding regions of genes. They have been reported to play key roles in a wide range of biological processes. The recent remarkable developments of transcriptomics technologies, especially next-generation sequencing technologies and advanced bioinformatics tools, allow more in-depth exploration of messenger RNAs (mRNAs) and non-coding RNAs (ncRNAs), including miRNAs. These technologies have offered great opportunities for a deeper exploration of miRNA involvement in farm animal diseases, as well as livestock productivity and welfare. In this review, we provide an overview of the current knowledge of miRNA roles in major farm animal diseases with a particular focus on diseases of economic importance. In addition, we discuss the steps and future perspectives of using miRNAs as biomarkers and molecular therapy for livestock disease management as well as the challenges and opportunities for understanding the regulatory mechanisms of miRNAs related to disease pathogenesis.
|
16
|
Wang H, Cui W, Guo Y, Du Y, Zhou Y. Machine Learning Prediction of Foodborne Disease Pathogens: Algorithm Development and Validation Study. JMIR Med Inform 2021; 9:e24924. [PMID: 33496675 PMCID: PMC7872834 DOI: 10.2196/24924] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/11/2020] [Revised: 12/18/2020] [Accepted: 12/28/2020] [Indexed: 01/18/2023] Open
Abstract
Background: Foodborne diseases, as a type of disease with a high global incidence, place a heavy burden on public health and social economy. Foodborne pathogens, as the main factor of foodborne diseases, play an important role in the treatment and prevention of foodborne diseases; however, foodborne diseases caused by different pathogens lack specificity in clinical features, and in clinical practice the pathogen is actually identified in only a small proportion of cases. Objective: We aimed to analyze foodborne disease case data, select appropriate features based on analysis results, and use machine learning methods to classify foodborne disease pathogens to predict foodborne disease pathogens that have not been tested. Methods: We extracted features such as space, time, and exposed food from foodborne disease case data and analyzed the relationship between these features and the foodborne disease pathogens using a variety of machine learning methods to classify foodborne disease pathogens. We compared the results of 4 models to obtain the pathogen prediction model with the highest accuracy. Results: The gradient boost decision tree model obtained the highest accuracy, with accuracy approaching 69% in identifying 4 pathogens including Salmonella, Norovirus, Escherichia coli, and Vibrio parahaemolyticus. By evaluating the importance of features such as time of illness, geographical longitude and latitude, and diarrhea frequency, we found that they play important roles in classifying the foodborne disease pathogens. Conclusions: Data analysis can reflect the distribution of some features of foodborne diseases and the relationship among the features. The classification of pathogens based on the analysis results and machine learning methods can provide beneficial support for clinical auxiliary diagnosis and treatment of foodborne diseases.
Affiliation(s)
- Hanxue Wang
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
- Wenjuan Cui
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
- Yunchang Guo
  - China National Center for Food Safety Risk Assessment, Beijing, China
- Yi Du
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
- Yuanchun Zhou
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; Chinese Academy of Sciences University, Beijing, China
|
17
|
Uelze L, Grützke J, Borowiak M, Hammerl JA, Juraschek K, Deneke C, Tausch SH, Malorny B. Typing methods based on whole genome sequencing data. One Health Outlook 2020; 2:3. [PMID: 33829127 PMCID: PMC7993478 DOI: 10.1186/s42522-020-0010-1] [Citation(s) in RCA: 88] [Impact Index Per Article: 22.0] [Received: 10/04/2019] [Accepted: 01/08/2020] [Indexed: 05/12/2023]
Abstract
Whole genome sequencing (WGS) of foodborne pathogens has become an effective method for investigating the information contained in the genome sequence of bacterial pathogens. In addition, its highly discriminative power enables the comparison of genetic relatedness between bacteria even on a sub-species level. For this reason, WGS is being implemented worldwide and across sectors (human, veterinary, food, and environment) for the investigation of disease outbreaks, source attribution, and improved risk characterization models. In order to extract relevant information from the large quantity of complex data produced by WGS, a host of bioinformatics tools has been developed, allowing users to analyze and interpret sequencing data, starting from simple gene-searches to complex phylogenetic studies. Depending on the research question, the complexity of the dataset and their bioinformatics skill set, users can choose between a great variety of tools for the analysis of WGS data. In this review, we describe the relevant approaches for phylogenomic analyses in outbreak studies and give an overview of selected tools for the characterization of foodborne pathogens based on WGS data. Despite the efforts of recent years, harmonization and standardization of typing tools are still urgently needed to allow for an easy comparison of data between laboratories, moving towards a one health worldwide surveillance system for foodborne pathogens.
Affiliation(s)
- Laura Uelze
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Josephine Grützke
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Maria Borowiak
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Jens Andre Hammerl
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Katharina Juraschek
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Carlus Deneke
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Simon H. Tausch
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
- Burkhard Malorny
  - Department for Biological Safety, German Federal Institute for Risk Assessment, BfR, Max-Dohrn Straße 8-10, 10589 Berlin, Germany
|