1
Wang L, Zhang K, Xu L, Wang J. Understanding underlying physical mechanism reveals early warning indicators and key elements for adaptive infections disease networks. PNAS Nexus 2024; 3:pgae237. [PMID: 39035039 PMCID: PMC11259140 DOI: 10.1093/pnasnexus/pgae237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 06/03/2024] [Indexed: 07/23/2024]
Abstract
The study of infectious diseases holds significant scientific and societal importance, yet current research on the mechanisms of disease emergence and prediction methods still faces challenging issues. This research uses the landscape and flux theoretical framework to reveal the non-equilibrium dynamics of adaptive infectious diseases and uncover their underlying physical mechanism. This allows the quantification of dynamics, characterizing the system with two basins of attraction determined by gradient and rotational flux forces. Quantification of entropy production rates provides insights into the system deviating from equilibrium and associated dissipative costs. The study identifies early warning indicators for the critical transition, emphasizing the advantage of observing time irreversibility from time series over theoretical entropy production and flux. The presence of rotational flux leads to an irreversible pathway between disease states. Through global sensitivity analysis, we identified the key factors influencing infectious diseases. In summary, this research offers valuable insights into infectious disease dynamics and presents a practical approach for predicting the onset of critical transition, addressing existing research gaps.
Affiliation(s)
- Linqi Wang
  - Center of Theoretical Physics, College of Physics, Jilin University, Changchun, Jilin, 130012, China
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, 130022, China
- Kun Zhang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, 130022, China
- Li Xu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin, 130022, China
- Jin Wang
  - Department of Chemistry, Physics and Astronomy, State University of New York at Stony Brook, Stony Brook, NY 11794, USA
2
Haque S, Mengersen K, Barr I, Wang L, Yang W, Vardoulakis S, Bambrick H, Hu W. Towards development of functional climate-driven early warning systems for climate-sensitive infectious diseases: Statistical models and recommendations. Environmental Research 2024; 249:118568. [PMID: 38417659 DOI: 10.1016/j.envres.2024.118568] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 02/22/2024] [Accepted: 02/25/2024] [Indexed: 03/01/2024]
Abstract
Climate, weather and environmental change have significantly influenced patterns of infectious disease transmission, necessitating the development of early warning systems to anticipate potential impacts and respond in a timely and effective way. Statistical modelling plays a pivotal role in understanding the intricate relationships between climatic factors and infectious disease transmission. For example, time series regression modelling and spatial cluster analysis have been employed to identify risk factors and predict spatial and temporal patterns of infectious diseases. Recently advanced spatio-temporal models and machine learning offer an increasingly robust framework for modelling uncertainty, which is essential in climate-driven disease surveillance due to the dynamic and multifaceted nature of the data. Moreover, Artificial Intelligence (AI) techniques, including deep learning and neural networks, excel in capturing intricate patterns and hidden relationships within climate and environmental data sets. Web-based data has emerged as a powerful complement to other datasets encompassing climate variables and disease occurrences. However, given the complexity and non-linearity of climate-disease interactions, advanced techniques are required to integrate and analyse these diverse data to obtain more accurate predictions of impending outbreaks, epidemics or pandemics. This article presents an overview of an approach to creating climate-driven early warning systems with a focus on statistical model suitability and selection, along with recommendations for utilizing spatio-temporal and machine learning techniques. By addressing the limitations and embracing the recommendations for future research, we could enhance preparedness and response strategies, ultimately contributing to the safeguarding of public health in the face of evolving climate challenges.
Affiliation(s)
- Shovanur Haque
  - Ecosystem Change and Population Health Research Group, School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia
- Kerrie Mengersen
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia; Centre for Data Science (CDS), Queensland University of Technology (QUT), Brisbane, Australia
- Ian Barr
  - World Health Organization Collaborating Centre for Reference and Research on Influenza, VIDRL, Doherty Institute, Melbourne, Australia; Department of Microbiology and Immunology, University of Melbourne, Victoria, Australia
- Liping Wang
  - National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, Division of Infectious disease, Chinese Centre for Disease Control and Prevention, China
- Weizhong Yang
  - School of Population Medicine and Public Health, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100730, China
- Sotiris Vardoulakis
  - HEAL Global Research Centre, Health Research Institute, University of Canberra, ACT Canberra, 2601, Australia
- Hilary Bambrick
  - National Centre for Epidemiology and Population Health, The Australian National University, ACT 2601 Canberra, Australia
- Wenbiao Hu
  - Ecosystem Change and Population Health Research Group, School of Public Health and Social Work, Queensland University of Technology, Brisbane, Australia
3
Shih DH, Wu YH, Wu TW, Chang SC, Shih MH. Infodemiology of Influenza-like Illness: Utilizing Google Trends' Big Data for Epidemic Surveillance. J Clin Med 2024; 13:1946. [PMID: 38610711 PMCID: PMC11012909 DOI: 10.3390/jcm13071946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Revised: 03/18/2024] [Accepted: 03/25/2024] [Indexed: 04/14/2024] Open
Abstract
Background: Influenza-like illness (ILI) encompasses symptoms similar to influenza, affecting population health. Surveillance, including Google Trends (GT), offers insights into epidemic patterns. Methods: This study used multiple regression models to analyze the correlation between ILI incidents, GT keyword searches, and climate variables during influenza outbreaks. It compared the predictive capabilities of time-series and deep learning models against ILI emergency incidents. Results: The GT searches for "fever" and "cough" were significantly associated with ILI cases (p < 0.05). Temperature had a more substantial impact on ILI incidence than humidity. Among the tested models, ARIMA provided the best predictive power. Conclusions: GT and climate data can forecast ILI trends, aiding governmental decision making. Temperature is a crucial predictor, and ARIMA models excel in forecasting ILI incidences.
Affiliation(s)
- Dong-Her Shih
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu 64002, Taiwan
- Yi-Huei Wu
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu 64002, Taiwan
- Ting-Wei Wu
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu 64002, Taiwan
- Shu-Chi Chang
  - Department of Information Management, National Yunlin University of Science and Technology, Douliu 64002, Taiwan
- Ming-Hung Shih
  - Department of Electrical and Computer Engineering, Iowa State University, 2520 Osborn Drive, Ames, IA 50011, USA
4
Wang P, Zhang W, Wang H, Shi C, Li Z, Wang D, Luo L, Du Z, Hao Y. Predicting the incidence of infectious diarrhea with symptom surveillance data using a stacking-based ensembled model. BMC Infect Dis 2024; 24:265. [PMID: 38408967 PMCID: PMC10898154 DOI: 10.1186/s12879-024-09138-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 02/14/2024] [Indexed: 02/28/2024] Open
Abstract
BACKGROUND Infectious diarrhea remains a major public health problem worldwide. This study used a stacking ensemble to develop a predictive model for the incidence of infectious diarrhea, aiming to achieve better prediction performance. METHODS Based on the surveillance data of infectious diarrhea cases, relevant symptoms and meteorological factors of Guangzhou from 2016 to 2021, we developed four base prediction models using artificial neural networks (ANN), Long Short-Term Memory networks (LSTM), support vector regression (SVR) and extreme gradient boosting regression trees (XGBoost), which were then ensembled using stacking to obtain the final prediction model. All the models were evaluated with three metrics: mean absolute percentage error (MAPE), root mean square error (RMSE), and mean absolute error (MAE). RESULTS Base models that incorporated symptom surveillance data and weekly number of infectious diarrhea cases were able to achieve lower RMSEs, MAEs, and MAPEs than models that added meteorological data and weekly number of infectious diarrhea cases. The LSTM had the best prediction performance among the four base models, and its RMSE, MAE, and MAPE were 84.85, 57.50 and 15.92%, respectively. The stacking ensembled model outperformed the four base models, with an RMSE, MAE, and MAPE of 75.82, 55.93, and 15.70%, respectively. CONCLUSIONS The incorporation of symptom surveillance data could improve the predictive accuracy of infectious diarrhea prediction models, and symptom surveillance data was more effective than meteorological data in enhancing model performance. Using stacking to combine multiple prediction models was able to alleviate the difficulty in selecting the optimal model, and could obtain a model with better performance than base models.
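The three evaluation metrics named in the abstract above (RMSE, MAE and MAPE) can be sketched in a few lines of Python. This is a minimal illustration of the standard definitions, not the authors' code; the `observed` and `forecast` values are hypothetical weekly counts, not data from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each error is divided by it.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(actual)


# Hypothetical weekly case counts (illustration only).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print(rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

For these made-up values the MAE is 10 cases, the RMSE is about 12.9 cases, and the MAPE is about 6.7%. RMSE penalizes the single larger miss more heavily than MAE does, which is one reason studies often report both.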
Affiliation(s)
- Pengyu Wang
  - Department of Medical Statistics, School of Public Health & Center for Health Information Research & Sun Yat-sen Global Health Institute, Sun Yat-sen University, Guangzhou, China
- Wangjian Zhang
  - Department of Medical Statistics, School of Public Health & Center for Health Information Research & Sun Yat-sen Global Health Institute, Sun Yat-sen University, Guangzhou, China
- Hui Wang
  - Department of Infectious Disease Control and Prevention, Guangzhou Center for Disease Control and Prevention, Guangzhou, China
- Congxing Shi
  - Department of Medical Statistics, School of Public Health & Center for Health Information Research & Sun Yat-sen Global Health Institute, Sun Yat-sen University, Guangzhou, China
- Zhiqiang Li
  - Department of Medical Statistics, School of Public Health & Center for Health Information Research & Sun Yat-sen Global Health Institute, Sun Yat-sen University, Guangzhou, China
- Dahu Wang
  - Department of Infectious Disease Control and Prevention, Guangzhou Center for Disease Control and Prevention, Guangzhou, China
- Lei Luo
  - Department of Infectious Disease Control and Prevention, Guangzhou Center for Disease Control and Prevention, Guangzhou, China
- Zhicheng Du
  - Department of Medical Statistics, School of Public Health & Center for Health Information Research & Sun Yat-sen Global Health Institute, Sun Yat-sen University, Guangzhou, China
  - Guangzhou Joint Research Center for Disease Surveillance and Risk Assessment, Sun Yat-sen University & Guangzhou Center for Disease Control and Prevention, Guangzhou, China
- Yuantao Hao
  - Peking University Center for Public Health and Epidemic Preparedness & Response, Beijing, China
  - Department of Epidemiology & Biostatistics, School of Public Health, Peking University, Beijing, China
  - Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing, China
5
Zhang T, Rabhi F, Chen X, Paik HY, MacIntyre CR. A machine learning-based universal outbreak risk prediction tool. Comput Biol Med 2024; 169:107876. [PMID: 38176209 DOI: 10.1016/j.compbiomed.2023.107876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 12/12/2023] [Accepted: 12/18/2023] [Indexed: 01/06/2024]
Abstract
In order to prevent and control the increasing number of serious epidemics, the ability to predict the risk caused by emerging outbreaks is essential. However, most current risk prediction tools, except EPIRISK, are limited by being designed for targeting only one specific disease and one country. Differences between countries and diseases (e.g., different economic conditions, different modes of transmission, etc.) pose challenges for building models with cross-country and cross-disease prediction capabilities. The limitation of universality affects domestic and international efforts to control and prevent pandemic outbreaks. To address this problem, we used outbreak data from 43 diseases in 206 countries to develop a universal risk prediction system that can be used across countries and diseases. This system used five machine learning models (including Neural Network, XGBoost, Logistic Boost, Random Forest and Kernel SVM) to predict and vote together to make ensemble predictions. It can make predictions with around 80%-90% accuracy from economic, cultural, social, and epidemiological factors. Three different datasets were designed to test the performance of ML models under different realistic situations. This prediction system has strong predictive ability, adaptability, and generality. It can give universal outbreak risk assessments that are not limited by border or disease type, and facilitate rapid response to pandemic outbreaks, government decision-making and international cooperation.
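The "predict and vote together" step described above can be illustrated with a simple hard majority vote. The abstract does not state the exact voting rule, so majority voting is an assumption here, and the model names and labels in `model_votes` are hypothetical placeholders rather than outputs of the authors' system.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most models.

    Ties are resolved in favour of the label encountered first,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]


# Hypothetical per-model risk labels for one outbreak (illustration only).
model_votes = {
    "neural_network": "high",
    "xgboost": "high",
    "logistic_boost": "low",
    "random_forest": "high",
    "kernel_svm": "low",
}

ensemble_label = majority_vote(list(model_votes.values()))
print(ensemble_label)
```

With five voters and two labels a tie cannot occur, which is one practical reason to use an odd number of base models for binary risk classes.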
Affiliation(s)
- Tianyu Zhang
  - FinanceIT Research Group, University of New South Wales, Sydney, NSW, Australia
- Fethi Rabhi
  - FinanceIT Research Group, University of New South Wales, Sydney, NSW, Australia
- Xin Chen
  - Biosecurity Program, The Kirby Institute, University of New South Wales, Sydney, NSW, 2052, Australia
- Hye-Young Paik
  - School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, NSW, 2052, Australia
- Chandini Raina MacIntyre
  - Biosecurity Program, The Kirby Institute, University of New South Wales, Sydney, NSW, 2052, Australia; College of Public Service & Community Solutions, Arizona State University, Tempe, AZ, 85004, United States
6
Huang Y, Huang Z, Yang Q, Jin H, Xu T, Fu Y, Zhu Y, Zhang X, Chen C. Predicting mild cognitive impairment among Chinese older adults: a longitudinal study based on long short-term memory networks and machine learning. Front Aging Neurosci 2023; 15:1283243. [PMID: 37937119 PMCID: PMC10626462 DOI: 10.3389/fnagi.2023.1283243] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 10/10/2023] [Indexed: 11/09/2023] Open
Abstract
Background: Mild cognitive impairment (MCI) is a transitory yet reversible stage of dementia. A systematic, scientific and population-wide early screening system for MCI is lacking. This study aimed to construct prediction models using longitudinal data to identify potential MCI patients and explore its critical features among Chinese older adults. Methods: A total of 2,128 participants were selected from waves 5-8 of the Chinese Longitudinal Healthy Longevity Study. Cognitive function was measured using the Chinese version of the Mini-Mental State Examination. Long short-term memory (LSTM) and three machine learning techniques, including 8 sociodemographic features and 12 health behavior and health status features, were used to predict individual risk of MCI in the next year. Performances of prediction models were evaluated through receiver operating curve and decision curve analysis. The importance of predictors in the prediction models was explored using the SHapley Additive exPlanations (SHAP) model. Results: The area under the curve values of three models were around 0.90, and decision curve analysis indicated that the net benefits of XGBoost and Random Forest were similar when the threshold is lower than 0.8. SHAP models showed that age, education, respiratory disease, gastrointestinal ulcer and self-rated health are the five most important predictors of MCI. Conclusion: This screening method for MCI, combining LSTM and machine learning, successfully predicted the risk of MCI using longitudinal datasets, and enables health care providers to implement early intervention to delay the progression from MCI to dementia, ultimately reducing the incidence and treatment cost of dementia.
Affiliation(s)
- Yucheng Huang
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Zishuo Huang
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - School of Innovation and Entrepreneurship, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Qingren Yang
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - School of Innovation and Entrepreneurship, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Haojie Jin
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Tingke Xu
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Yating Fu
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Yue Zhu
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
- Xiangyang Zhang
  - The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, Zhejiang, China
- Chun Chen
  - School of Public Health and Management, Wenzhou Medical University, Wenzhou, Zhejiang, China
  - Center for Healthy China Research, Wenzhou Medical University, Wenzhou, Zhejiang, China
7
Servadio JL, Convertino M, Fiecas M, Muñoz‐Zanzi C. Weekly Forecasting of Yellow Fever Occurrence and Incidence via Eco-Meteorological Dynamics. GeoHealth 2023; 7:e2023GH000870. [PMID: 37885914 PMCID: PMC10599710 DOI: 10.1029/2023gh000870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 08/31/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Yellow Fever (YF), a mosquito-borne disease, requires ongoing surveillance and prevention due to its persistence and ability to cause major epidemics, including one that began in Brazil in 2016. Forecasting based on factors influencing YF risk can improve efficiency in prevention. This study aimed to produce weekly forecasts of YF occurrence and incidence in Brazil using weekly meteorological and ecohydrological conditions. Occurrence was forecast as the probability of observing any cases, and incidence was forecast to represent morbidity if YF occurs. We fit gamma hurdle models, selecting predictors from several meteorological and ecohydrological factors, based on forecast accuracy defined by receiver operator characteristic curves and mean absolute error. We fit separate models for data before and after the start of the 2016 outbreak, forecasting occurrence and incidence for all municipalities of Brazil weekly. Different predictor sets were found to produce most accurate forecasts in each time period, and forecast accuracy was high for both time periods. Temperature, precipitation, and previous YF burden were most influential predictors among models. Minimum, maximum, mean, and range of weekly temperature, precipitation, and humidity contributed to forecasts, with optimal lag times of 2, 6, and 7 weeks depending on time period. Results from this study show the use of environmental predictors in providing regular forecasts of YF burden and producing nationwide forecasts. Weekly forecasts, which can be produced using the forecast model developed in this study, are beneficial for informing immediate preparedness measures.
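The two-part structure of a gamma hurdle model, as described above, separates the probability that any cases occur from the size of the incidence given that cases occur. A minimal sketch of how the two parts combine into an overall expected incidence follows. It uses the general property of hurdle models with a gamma positive part (mean equal to shape times scale); the function name, parameterization and numbers are assumptions for illustration and are not taken from the study.

```python
def hurdle_expected_incidence(p_occurrence, gamma_shape, gamma_scale):
    """Unconditional expected incidence under a gamma hurdle model.

    p_occurrence: forecast probability of observing any cases.
    gamma_shape, gamma_scale: parameters of the gamma distribution that
        describes incidence given that cases occur (mean = shape * scale).
    """
    if not 0.0 <= p_occurrence <= 1.0:
        raise ValueError("p_occurrence must be a probability in [0, 1]")
    if gamma_shape <= 0.0 or gamma_scale <= 0.0:
        raise ValueError("gamma parameters must be positive")
    conditional_mean = gamma_shape * gamma_scale
    return p_occurrence * conditional_mean


# Hypothetical forecast for one municipality-week (illustration only).
expected = hurdle_expected_incidence(p_occurrence=0.25, gamma_shape=2.0, gamma_scale=3.0)
print(expected)
```

Keeping the two parts separate lets a forecaster report both "how likely is any occurrence" and "how large if it occurs", which matches the occurrence and incidence forecasts named in the abstract.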
Affiliation(s)
- Joseph L. Servadio
  - Department of Biology, Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, PA, USA
  - Division of Environmental Health Sciences, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Mark Fiecas
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Claudia Muñoz‐Zanzi
  - Division of Environmental Health Sciences, School of Public Health, University of Minnesota, Minneapolis, MN, USA
8
Alkhammash EH, Assiri SA, Nemenqani DM, Althaqafi RMM, Hadjouni M, Saeed F, Elshewey AM. Application of Machine Learning to Predict COVID-19 Spread via an Optimized BPSO Model. Biomimetics (Basel) 2023; 8:457. [PMID: 37887588 PMCID: PMC10604133 DOI: 10.3390/biomimetics8060457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Revised: 09/21/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023] Open
Abstract
During the pandemic of the coronavirus disease (COVID-19), statistics showed that the number of affected cases differed from one country to another and also from one city to another. Therefore, in this paper, we provide an enhanced model for predicting COVID-19 samples in different regions of Saudi Arabia (high-altitude and sea-level areas). The model is developed using several stages and was successfully trained and tested using two datasets that were collected from Taif city (high-altitude area) and Jeddah city (sea-level area) in Saudi Arabia. Binary particle swarm optimization (BPSO) is used in this study for making feature selections using three different machine learning models, i.e., the random forest model, gradient boosting model, and naive Bayes model. A number of prediction evaluation metrics including accuracy, training score, testing score, F-measure, recall, precision, and receiver operating characteristic (ROC) curve were calculated to verify the performance of the three machine learning models on these datasets. The experimental results demonstrated that the gradient boosting model gives better results than the random forest and naive Bayes models with an accuracy of 94.6% using the Taif city dataset. For the dataset of Jeddah city, the results demonstrated that the random forest model outperforms the gradient boosting and naive Bayes models with an accuracy of 95.5%. The dataset of Jeddah city achieved better results than the dataset of Taif city in Saudi Arabia using the enhanced model in terms of accuracy.
Affiliation(s)
- Eman H. Alkhammash
  - Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Sara Ahmad Assiri
  - Otolaryngology-Head and Neck Surgery Department, King Faisal Hospital, P.O. Box 11099, Taif 21944, Saudi Arabia
- Dalal M. Nemenqani
  - College of Medicine, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Raad M. M. Althaqafi
  - College of Medicine, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Myriam Hadjouni
  - Department of Computer Sciences, College of Computer and Information Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
- Faisal Saeed
  - DAAI Research Group, Department of Computing and Data Science, School of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
- Ahmed M. Elshewey
  - Faculty of Computers and Information, Computer Science Department, Suez University, Suez 43533, Egypt
9
Sharma S, Gupta YK, Mishra AK. Analysis and Prediction of COVID-19 Multivariate Data Using Deep Ensemble Learning Methods. International Journal of Environmental Research and Public Health 2023; 20:5943. [PMID: 37297547 PMCID: PMC10252939 DOI: 10.3390/ijerph20115943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 05/02/2023] [Accepted: 05/17/2023] [Indexed: 06/12/2023]
Abstract
The global economy has suffered losses as a result of the COVID-19 epidemic. Accurate and effective predictive models are necessary for the governance and readiness of the healthcare system and its resources and, ultimately, for the prevention of the spread of illness. The primary objective of the project is to build a robust, universal method for predicting COVID-19-positive cases. Collaborators will benefit from this while developing and revising their pandemic response plans. For accurate prediction of the spread of COVID-19, the research recommends an adaptive gradient LSTM model (AGLSTM) using multivariate time series data. RNN, LSTM, LASSO regression, Ada-Boost, Light Gradient Boosting and KNN models are also used in the research, which accurately and reliably predict the course of this unpleasant disease. The proposed technique is evaluated under two different experimental conditions. The former uses case studies from India to validate the methodology, while the latter uses data fusion and transfer-learning techniques to reuse data and models to predict the onset of COVID-19. The model extracts important advanced features that influence the COVID-19 cases using a convolutional neural network and predicts the cases using adaptive LSTM after the CNN processes the data. The experimental results show that AGLSTM outperforms the other models, with an accuracy of 99.81%, and requires only a short time for training and prediction.
Affiliation(s)
- Shruti Sharma
  - Department of Computer Science, Banasthali Vidyapith, Tonk 304022, India
  - School of Technology & Management, SVKM’s Narsee Monji Institute of Management Studies (NMIMS), Indore 452005, India
- Yogesh Kumar Gupta
  - Department of Computer Science, Banasthali Vidyapith, Tonk 304022, India
- Abhinava K. Mishra
  - Molecular, Cellular and Developmental Biology Department, University of California Santa Barbara, Santa Barbara, CA 93106, USA
10
Sheikh BUH, Zafar A. Untargeted white-box adversarial attack to break into deep leaning based COVID-19 monitoring face mask detection system. Multimedia Tools and Applications 2023:1-27. [PMID: 37362697 PMCID: PMC10160719 DOI: 10.1007/s11042-023-15405-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 09/17/2022] [Accepted: 04/18/2023] [Indexed: 06/28/2023]
Abstract
The face mask detection system has been a valuable tool to combat COVID-19 by preventing its rapid transmission. This article demonstrated that the present deep learning-based face mask detection systems are vulnerable to adversarial attacks. We proposed a framework for a robust face mask detection system that is resistant to adversarial attacks. We first developed a face mask detection system by fine-tuning the MobileNetv2 model and training it on the custom-built dataset. The model performed exceptionally well, achieving 95.83% accuracy on test data. Then, the model's performance was assessed using adversarial images calculated by the fast gradient sign method (FGSM). The FGSM attack reduced the model's classification accuracy from 95.83% to 14.53%, indicating that the adversarial attack on the proposed model severely damaged its performance. Finally, we illustrated that the proposed robust framework enhanced the model's resistance to adversarial attacks. Although there was a notable drop in the accuracy of the robust model on unseen clean data from 95.83% to 92.79%, the model performed exceptionally well, improving the accuracy from 14.53% to 92% on adversarial data. We expect our research to heighten awareness of adversarial attacks on COVID-19 monitoring systems and inspire others to protect healthcare systems from similar attacks.
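The fast gradient sign method mentioned above perturbs an input by a small step `epsilon` in the direction of the sign of the loss gradient with respect to that input. The sketch below applies it to a toy two-feature logistic classifier in plain Python so it stays self-contained; it is not the MobileNetv2 pipeline from the paper, and the weights, input and `epsilon` are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def cross_entropy(weights, bias, x, label):
    """Binary cross-entropy loss for a single example (label is 0 or 1)."""
    p = predict_proba(weights, bias, x)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(weights, bias, x, label, epsilon):
    """Untargeted FGSM: x_adv = x + epsilon * sign(d loss / d x).

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - label) * weights.
    """
    p = predict_proba(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]


# Hypothetical toy model and input (illustration only).
weights, bias = [2.0, -1.0], 0.0
x, label = [1.0, 0.5], 1

x_adv = fgsm(weights, bias, x, label, epsilon=1.0)
print(predict_proba(weights, bias, x), predict_proba(weights, bias, x_adv))
```

In this toy case the clean input is classified as class 1 (probability above 0.5), while the perturbed input falls below 0.5, mirroring in miniature the accuracy collapse the abstract reports. Real image attacks use a much smaller `epsilon` spread over many pixels.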
Affiliation(s)
- Burhan Ul haque Sheikh
  - Department of computer science, Aligarh Muslim University, Aligarh, Uttar Pradesh 202002 India
- Aasim Zafar
  - Department of computer science, Aligarh Muslim University, Aligarh, Uttar Pradesh 202002 India
11
MacIntyre CR, Chen X, Kunasekaran M, Quigley A, Lim S, Stone H, Paik HY, Yao L, Heslop D, Wei W, Sarmiento I, Gurdasani D. Artificial intelligence in public health: the potential of epidemic early warning systems. J Int Med Res 2023; 51:3000605231159335. [PMID: 36967669 PMCID: PMC10052500 DOI: 10.1177/03000605231159335] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/29/2023] Open
Abstract
The use of artificial intelligence (AI) to generate automated early warnings in epidemic surveillance by harnessing vast open-source data with minimal human intervention has the potential to be both revolutionary and highly sustainable. AI can overcome the challenges faced by weak health systems by detecting epidemic signals much earlier than traditional surveillance. AI-based digital surveillance is an adjunct to, not a replacement of, traditional surveillance and can trigger early investigation, diagnostics and responses at the regional level. This narrative review focuses on the role of AI in epidemic surveillance and summarises several current epidemic intelligence systems including ProMED-mail, HealthMap, Epidemic Intelligence from Open Sources, BlueDot, Metabiota, the Global Biosurveillance Portal, Epitweetr and EPIWATCH. Not all of these systems are AI-based, and some are only accessible to paid users. Most systems have large volumes of unfiltered data; only a few can sort and filter data to provide users with curated intelligence. However, uptake of these systems by public health authorities, who have been slower to embrace AI than their clinical counterparts, is low. The widespread adoption of digital open-source surveillance and AI technology is needed for the prevention of serious epidemics.
Affiliation(s)
- Chandini Raina MacIntyre
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- College of Public Service & Community Solutions, Arizona State University, Tempe, United States
- Xin Chen
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Mohana Kunasekaran
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Ashley Quigley
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Samsung Lim
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- School of Civil and Environmental Engineering, University of New South Wales, Sydney, Australia
- Haley Stone
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Hye-Young Paik
- School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, Australia
- Lina Yao
- School of Computer Science and Engineering, Faculty of Engineering, University of New South Wales, Sydney, Australia
- David Heslop
- School of Population Health, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Wenzhao Wei
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Ines Sarmiento
- Biosecurity Program, The Kirby Institute, Faculty of Medicine, University of New South Wales, Sydney, Australia
- Deepti Gurdasani
- William Harvey Research Institute, Queen Mary University of London, United Kingdom
12
Dai S, Han L. Influenza surveillance with Baidu index and attention-based long short-term memory model. PLoS One 2023; 18:e0280834. [PMID: 36689543 PMCID: PMC9870163 DOI: 10.1371/journal.pone.0280834] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 04/29/2022] [Accepted: 01/10/2023] [Indexed: 01/24/2023]
Abstract
BACKGROUND The prediction and prevention of influenza is a public health issue of great concern, and the timely acquisition of influenza transmission trends has become an important research topic. To achieve quicker and more accurate detection and prediction, data recorded on the Internet, especially by search engines such as Google or Baidu, are widely used in this field. Moreover, with the development of intelligent technology and machine learning algorithms, many updated and advanced trend tracking and forecasting methods are also being applied to this research problem. METHODS In this paper, a new recurrent neural network architecture, an attention-based long short-term memory model, is proposed for influenza surveillance. This is a deep learning model that is trained on Baidu Index series so as to fit the real influenza survey time series. Previous studies on influenza surveillance with the Baidu Index mostly used the traditional autoregressive moving average model or classical machine learning models such as logarithmic linear regression, support vector regression or the multi-layer perceptron to fit influenza-like illness data, and rarely considered deep learning structures. Meanwhile, some newer models that did consider deep learning structures did not take into account the application of Baidu Index data. This study introduces a recurrent neural network with long short-term memory combined with an attention mechanism into the influenza surveillance research model, which not only fits the research problem well in model structure, but also provides research methods based on the Baidu Index. RESULTS The actual survey data and Baidu Index data are used to train and test the proposed attention-based long short-term memory model and the comparison models, so as to iterate the values of the model parameters and to describe and predict the influenza epidemic situation. The experimental results show that our proposed model performs better on the mean absolute error, mean absolute percentage error, index of agreement and other indicators than the comparison models. CONCLUSION Our proposed attention-based long short-term memory model verifies the ability of this structure to better surveil and predict the trend of influenza. In comparison with some of the latest models and methods in this research field, the model we propose also performs well and is more lightweight and robust. Future research can consider fusing multimodal data based on this model and developing more application scenarios.
Affiliation(s)
- Shangfang Dai
- School of Economics and Management, Tsinghua University, Beijing, China
- Litao Han
- School of Mathematics, Renmin University of China, Beijing, China
13
Kim G, Lim H, Kim Y, Kwon O, Choi JH. Intra-person multi-task learning method for chronic-disease prediction. Sci Rep 2023; 13:1069. [PMID: 36658206 PMCID: PMC9851106 DOI: 10.1038/s41598-023-28383-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Accepted: 01/18/2023] [Indexed: 01/20/2023]
Abstract
In the medical field, various clinical information has been accumulated to help clinicians provide personalized medicine and make better diagnoses. As chronic diseases share similar characteristics, it is possible to predict multiple chronic diseases using the accumulated data of each patient. Thus, we propose an intra-person multi-task learning framework that jointly predicts the status of correlated chronic diseases and improves the model performance. Because chronic diseases occur over a long period and are affected by various factors, we considered features related to each chronic disease and the temporal relationship of the time-series data for accurate prediction. The study was carried out in three stages: (1) data preprocessing and feature selection using bidirectional recurrent imputation for time series (BRITS) and the least absolute shrinkage and selection operator (LASSO); (2) a convolutional neural network and long short-term memory (CNN-LSTM) for single-task models; and (3) a novel intra-person multi-task learning CNN-LSTM framework developed to predict multiple chronic diseases simultaneously. Our multi-task learning method between correlated chronic diseases produced a more stable and accurate system than single-task models and other baseline recurrent networks. Furthermore, the proposed model was tested using different time steps to illustrate its flexibility and generalization across multiple time steps.
Affiliation(s)
- Gihyeon Kim
- Department of Computational Medicine, Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, 03760, Korea
- Heeryung Lim
- Division of Mechanical and Biomedical Engineering, Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, 03760, Korea
- Yunsoo Kim
- Department of Nutritional Science and Food Management, Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, 03760, Korea
- Oran Kwon
- Department of Nutritional Science and Food Management, Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, 03760, Korea
- Jang-Hwan Choi
- Division of Mechanical and Biomedical Engineering, Graduate Program in System Health Science and Engineering, Ewha Womans University, Seoul, 03760, Korea
14
Lu W, Ren H. Diseases spectrum in the field of spatiotemporal patterns mining of infectious diseases epidemics: A bibliometric and content analysis. Front Public Health 2023; 10:1089418. [PMID: 36699887 PMCID: PMC9868952 DOI: 10.3389/fpubh.2022.1089418] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/04/2022] [Accepted: 12/21/2022] [Indexed: 01/12/2023]
Abstract
Numerous investigations of the spatiotemporal patterns of infectious disease epidemics, their potential influences, and their driving mechanisms have greatly contributed to effective interventions in the recent years of increasing pandemic situations. However, systematic reviews of the spatiotemporal patterns of communicable diseases are rare. Using bibliometric analysis, combined with content analysis, this study aimed to summarize the number of publications and trends, the spectrum of infectious diseases, major research directions and data-methodological-theoretical characteristics, and academic communities in this field. Based on 851 relevant publications from the Web of Science core database, from January 1991 to September 2021, the study found that the increasing number of publications and the changes in the disease spectrum have been accompanied by serious outbreaks and pandemics over the past 30 years. Owing to the current pandemic of new infectious diseases (e.g., COVID-19) and the ravages of old infectious diseases (e.g., dengue and influenza), illustrated by the disease spectrum, the number of publications in this field would continue to rise. Three logically rigorous research directions (the detection of spatiotemporal patterns, identification of potential influencing factors, and risk prediction and simulation) support the research paradigm framework in this field. The role of human mobility in the transmission of insect-borne infectious diseases (e.g., dengue) and scale effects must be extensively studied in the future. Developed countries, such as the USA and England, have stronger leadership in the field. Therefore, much more effort must be made by developing countries, such as China, to improve their contribution and role in international academic collaborations.
Affiliation(s)
- Weili Lu
- State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, China
- Hongyan Ren
- State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China. *Correspondence: Hongyan Ren
15
Li D, Ren X, Su Y. Predicting COVID-19 using lioness optimization algorithm and graph convolution network. Soft Comput 2023; 27:5437-5501. [PMID: 36686544 PMCID: PMC9838306 DOI: 10.1007/s00500-022-07778-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/21/2022] [Indexed: 01/11/2023]
Abstract
In this paper, a graph convolution network prediction model based on the lioness optimization algorithm (LsOA-GCN) is proposed to predict the cumulative number of confirmed COVID-19 cases in 17 regions of Hubei Province from March 23 to March 29, 2020, according to the transmission characteristics of COVID-19. On the one hand, Spearman correlation analysis with delay days and LsOA are used to capture the dynamic changes of feature information to obtain the temporal features. On the other hand, the graph convolutional network is used to capture the topological structure of the city network, so as to obtain spatial information and finally realize the prediction task. Then, we evaluate this model through performance evaluation indicators and statistical test methods and compare the results of LsOA-GCN with 10 representative prediction methods in the current epidemic prediction study. The experimental results show that the LsOA-GCN prediction model is significantly better than other prediction methods in all indicators and can successfully capture spatio-temporal information from feature data, thereby achieving accurate prediction of epidemic trends in different regions of Hubei Province.
Affiliation(s)
- Dong Li
- College of Economics and Management, Xi’an University of Posts and Telecommunications, Xi’an, 710061 Shaanxi People’s Republic of China
- Xiaofei Ren
- College of Economics and Management, Xi’an University of Posts and Telecommunications, Xi’an, 710061 Shaanxi People’s Republic of China
- Yunze Su
- College of Economics and Management, Xi’an University of Posts and Telecommunications, Xi’an, 710061 Shaanxi People’s Republic of China
16
Wang Y, Gao C, Zhao T, Jiao H, Liao Y, Hu Z, Wang L. A comparative study of three models to analyze the impact of air pollutants on the number of pulmonary tuberculosis cases in Urumqi, Xinjiang. PLoS One 2023; 18:e0277314. [PMID: 36649267 PMCID: PMC9844834 DOI: 10.1371/journal.pone.0277314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Accepted: 10/25/2022] [Indexed: 01/18/2023]
Abstract
In this paper, we separately constructed ARIMA, ARIMAX, and RNN models to determine whether air pollutants (such as PM2.5, PM10, CO, O3, NO2, and SO2) had an impact on the number of pulmonary tuberculosis cases from January 2014 to December 2018 in Urumqi, Xinjiang. In addition, by using a new comprehensive evaluation index, DISO, to compare the performance of the three models, it was demonstrated that the ARIMAX (1,1,2) × (0,1,1)₁₂ + PM2.5 (lag = 12) model was the optimal one, and it was applied to predict the number of pulmonary tuberculosis cases in Urumqi from January 2019 to December 2019. The predicted results were in good agreement with the actual pulmonary tuberculosis cases and showed that pulmonary tuberculosis cases clearly declined, which indicated that the policies of environmental protection and universal health checkups in Urumqi have been very effective in recent years.
Affiliation(s)
- Yingdan Wang
- College of Public Health, Xinjiang Medical University, Urumqi, Xinjiang, China
- Chunjie Gao
- College of Public Health, Xinjiang Medical University, Urumqi, Xinjiang, China
- Tiantian Zhao
- Department of Infection Prevention and Control, Puyang People’s Hospital, Puyang, Henan, China
- Haiyan Jiao
- College of Public Health, Xinjiang Medical University, Urumqi, Xinjiang, China
- Ying Liao
- College of Public Health, Xinjiang Medical University, Urumqi, Xinjiang, China
- Zengyun Hu
- State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, Xinjiang, China
- Lei Wang
- Department of Medical Engineering and Technology, Xinjiang Medical University, Urumqi, Xinjiang, China
17
Next Generation Infectious Diseases Monitoring Gages via Incremental Federated Learning: Current Trends and Future Possibilities. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2023; 2023:1102715. [PMID: 36909972 PMCID: PMC9995206 DOI: 10.1155/2023/1102715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 07/29/2022] [Accepted: 09/27/2022] [Indexed: 03/05/2023]
Abstract
Infectious diseases are always alarming for the survival of human life and are a key concern in the public health domain. Therefore, early diagnosis of these infectious diseases is in high demand in modern-era healthcare systems. Novel general infectious diseases such as coronavirus caused millions of human deaths across the globe in 2020. Therefore, early, robust recognition of general infectious diseases is a desirable requirement of modern intelligent healthcare systems. This systematic study is designed under the Kitchenham guidelines and sets different RQs (research questions) for robust recognition of general infectious diseases. Four electronic databases, IEEE, ACM, Springer, and ScienceDirect, are used for the extraction of research work published from 2018 to 2021. The extracted studies delivered different schemes for the accurate recognition of general infectious diseases through different machine learning techniques, including deep learning and federated learning models. A framework is also introduced to share the process of detecting infectious diseases by using machine learning models. After the filtration process, 21 studies are extracted and mapped to the defined RQs. In the future, early diagnosis of infectious diseases will be possible through wearable health monitoring gages. Moreover, these gages will help to reduce the time and death rate by detecting severe diseases at an early stage.
18
Using Recurrent Neural Networks for Predicting Type-2 Diabetes from Genomic and Tabular Data. Diagnostics (Basel) 2022; 12:diagnostics12123067. [PMID: 36553074 PMCID: PMC9776641 DOI: 10.3390/diagnostics12123067] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/04/2022] [Revised: 12/01/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022]
Abstract
The development of genomic technology for smart diagnosis and therapies for various diseases has lately been the most demanding area for computer-aided diagnostic and treatment research. Exponential breakthroughs in artificial intelligence and machine intelligence technologies could pave the way for identifying challenges afflicting the healthcare industry. Genomics is paving the way for predicting future illnesses, including cancer, Alzheimer's disease, and diabetes. Machine learning advancements have expedited the pace of biomedical informatics research and inspired new branches of computational biology. Furthermore, knowing gene relationships has resulted in developing more accurate models that can effectively detect patterns in vast volumes of data, making classification models important in various domains. Recurrent Neural Network models have a memory that allows them to quickly remember knowledge from previous cycles and process genetic data. The present work focuses on type 2 diabetes prediction using gene sequences derived from genomic DNA fragments through automated feature selection and feature extraction procedures for matching gene patterns with training data. The suggested model was tested using tabular data to predict type 2 diabetes based on several parameters. The performance of neural networks incorporating Recurrent Neural Network (RNN) components, Long Short-Term Memory (LSTM), and Gated Recurrent Units (GRU) was tested in this research. The model's efficiency is assessed using evaluation metrics such as Sensitivity, Specificity, Accuracy, F1-Score, and Matthews Correlation Coefficient (MCC). The suggested technique predicted future illnesses with fair Accuracy. Furthermore, our research showed that the suggested model could be used in real-world scenarios and that input risk variables from an end-user Android application could be kept and evaluated on a secure remote server.
19
Keshavamurthy R, Dixon S, Pazdernik KT, Charles LE. Predicting infectious disease for biopreparedness and response: A systematic review of machine learning and deep learning approaches. One Health 2022; 15:100439. [PMID: 36277100 PMCID: PMC9582566 DOI: 10.1016/j.onehlt.2022.100439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 09/20/2022] [Accepted: 09/30/2022] [Indexed: 11/21/2022]
Abstract
The complex, unpredictable nature of pathogen occurrence has required substantial efforts to accurately predict infectious diseases (IDs). With rising popularity of Machine Learning (ML) and Deep Learning (DL) techniques combined with their unique ability to uncover connections between large amounts of diverse data, we conducted a PRISMA systematic review to investigate advances in ID prediction for human and animal diseases using ML and DL. This review included the type of IDs modeled, ML and DL techniques utilized, geographical distribution, prediction tasks performed, input features utilized, spatial and temporal scales, error metrics used, computational efficiency, uncertainty quantification, and missing data handling methods. Among 237 relevant articles published between January 2001 and May 2021, highly contagious diseases in humans were most often represented, including COVID-19 (37.1%), influenza/influenza-like illnesses (9.3%), dengue (8.9%), and malaria (5.1%). Out of 37 diseases identified, 51.4% were zoonotic, 37.8% were human-only, and 8.1% were animal-only, with only 1.6% economically significant, non-zoonotic livestock diseases. Despite the number of zoonoses, 86.5% of articles modeled humans whereas only a few articles (5.1%) contained more than one host species. Eastern Asia (32.5%), North America (17.7%), and Southern Asia (13.1%) were the most represented locations. Frequent approaches included tree-based ML (38.4%) and feed-forward neural networks (26.6%). Articles predicted temporal incidence (66.7%), disease risk (38.0%), and/or spatial movement (31.2%). Less than 10% of studies addressed uncertainty quantification, computational efficiency, and missing data, which are essential to operational use and deployment. This study highlights trends and gaps in ML and DL for ID prediction, providing guidelines for future works to better support biopreparedness and response. 
To fully utilize ML and DL for improved ID forecasting, models should include the full disease ecology in a One-Health context, important food and agricultural diseases, underrepresented hotspots, and important metrics required for operational deployment.
Affiliation(s)
- Ravikiran Keshavamurthy
- Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Paul G. Allen School for Global Health, Washington State University, Pullman, WA 99164, USA
- Samuel Dixon
- Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Karl T. Pazdernik
- Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA
- Lauren E. Charles
- Pacific Northwest National Laboratory, Richland, WA 99354, USA
- Paul G. Allen School for Global Health, Washington State University, Pullman, WA 99164, USA
20
Lou HR, Wang X, Gao Y, Zeng Q. Comparison of ARIMA model, DNN model and LSTM model in predicting disease burden of occupational pneumoconiosis in Tianjin, China. BMC Public Health 2022; 22:2167. [PMID: 36434563 PMCID: PMC9694549 DOI: 10.1186/s12889-022-14642-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/02/2022] [Accepted: 11/16/2022] [Indexed: 11/27/2022]
Abstract
BACKGROUND This study aims to explore an appropriate model for predicting the disease burden of pneumoconiosis in Tianjin by comparing the prediction effects of the Autoregressive Integrated Moving Average (ARIMA) model, the Deep Neural Network (DNN) model and the multivariate Long Short-Term Memory Neural Network (LSTM) model. METHODS Disability adjusted life year (DALY) was used to evaluate the disease burden of occupational pneumoconiosis. The ARIMA model, DNN model and multivariate LSTM model were used to establish prediction models. Three performance evaluation metrics, Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), were used to compare the prediction effects of the three models. RESULTS From 1990 to 2021, there were 10,694 cases of pneumoconiosis in Tianjin, resulting in a total of 112,725.52 person-years of DALY. During this period, the annual DALY showed a fluctuating trend, but it had a strong correlation with the number of pneumoconiosis patients, the average age of onset, the average age of receiving dust and the gross industrial product, and had a significant nonlinear relationship with them. The comparison of prediction results showed that the performance of the multivariate LSTM model and the DNN model is much better than that of the traditional ARIMA model. Compared with the DNN model, the multivariate LSTM model performed better on the training set, showing lower RMSE (42.30 vs. 380.96), MAE (29.53 vs. 231.20) and MAPE (1.63% vs. 2.93%), but was less stable than the DNN on the test set, showing higher RMSE (1309.14 vs. 656.44), MAE (886.98 vs. 594.47) and MAPE (36.86% vs. 22.43%). CONCLUSION The machine learning techniques of DNN and LSTM are an innovative method to accurately and efficiently predict the burden of pneumoconiosis with the simplest data. They have great application prospects in the monitoring and early warning system of occupational disease burden.
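As a reading aid for the three error metrics named in this abstract, the following is a minimal Python sketch of their standard textbook definitions. It is not the authors' code; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero (the ratio is undefined there).
    """
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical observed and forecast values, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]

print(rmse(observed, forecast))
print(mae(observed, forecast))
print(mape(observed, forecast))
```

Because RMSE squares each residual, a few large misses inflate it more than they inflate MAE, while MAPE scales each error by the observed value and so is comparable across series of different magnitude.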
Affiliation(s)
- He-Ren Lou
- Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China; School of Public Health, Tianjin Medical University, Tianjin, 300070 China
- Xin Wang
- Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China
- Ya Gao
- Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China
- Qiang Zeng
- Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China
21
Distributed lag inspired machine learning for predicting vaccine-induced changes in COVID-19 hospitalization and intensive care unit admission. Sci Rep 2022; 12:18748. [PMID: 36335113 PMCID: PMC9637108 DOI: 10.1038/s41598-022-21969-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Accepted: 10/05/2022] [Indexed: 11/08/2022]
Abstract
Distributed lags play important roles in explaining the short-run dynamic and long-run cumulative effects of features on a response variable. Unlike the usual lag length selection, important lags with significant weights are selected in a distributed lag model (DLM). Inspired by the importance of distributed lags, this research focuses on the construction of distributed lag inspired machine learning (DLIML) for predicting vaccine-induced changes in COVID-19 hospitalization and intensive care unit (ICU) admission rates. The importance of a lagged feature in the DLM is examined by hypothesis testing, and a subset of important features is selected by evaluating an information criterion. Akin to the DLM, we demonstrate the selection of distributed lags in machine learning by evaluating importance scores and objective functions. Finally, we apply the DLIML with supervised learning for forecasting daily changes in COVID-19 hospitalization and ICU admission rates in the United Kingdom (UK) and the United States of America (USA). A sharp decline in hospitalization and ICU admission rates is observed when around 40% of people are vaccinated. For one percent more vaccination, daily changes in hospitalization and ICU admission rates are expected to reduce by 4.05 and 0.74 per million after 14 days in the UK, and 5.98 and 1.04 per million after 20 days in the USA, respectively. Long-run cumulative effects in the DLM demonstrate that the daily changes in hospitalization and ICU admission rates are expected to jitter around the zero line in the long run. Application of the DLIML selects fewer lagged features but provides qualitatively better forecasting outcomes for data-driven healthcare service planning.
22
Wu Y, Sun Y, Lin M. SQEIR: An epidemic virus spread analysis and prediction model. COMPUTERS & ELECTRICAL ENGINEERING : AN INTERNATIONAL JOURNAL 2022; 102:108230. [PMID: 35965689 PMCID: PMC9364756 DOI: 10.1016/j.compeleceng.2022.108230] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/16/2022] [Revised: 06/29/2022] [Accepted: 07/10/2022] [Indexed: 06/15/2023]
Abstract
In 2019, a new strain of coronavirus pneumonia spread quickly worldwide. Viral propagation may be simulated using the Susceptible Infectious Removed (SIR) model. However, the SIR model fails to consider that separation of patients in the COVID-19 incubation stage entails difficulty and that these patients have high transmission potential. The model also ignores the positive effect of quarantine measures on the spread of the epidemic. To address the two flaws in the SIR model, this study proposes a new infectious disease model referred to as the Susceptible Quarantined Exposed Infective Removed (SQEIR) model. The proposed model uses the weighted least squares for the optimal estimation of important parameters in the infectious disease model. Based on these parameters, new differential equations were developed to describe the spread of the epidemic. The experimental results show that this model exhibits an accuracy 6.7% higher than that of traditional infectious disease models.
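For orientation, the baseline SIR model that this abstract starts from can be sketched in a few lines of Python. This is the standard three-compartment system (dS/dt = -βSI/N, dI/dt = βSI/N - γI, dR/dt = γI) integrated with a simple forward Euler step. It is not the SQEIR model proposed in the paper, whose equations the abstract does not give, and the function name and parameter values below are hypothetical.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the classic SIR equations with forward Euler steps.

    beta  : transmission rate per day (illustrative value, not from the paper)
    gamma : removal rate per day (illustrative value, not from the paper)
    Returns one (day, S, I, R) tuple per simulated day, starting at day 0.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population, constant in the basic SIR model
    steps_per_day = round(1 / dt)
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt  # flow S -> I
            new_removals = gamma * i * dt           # flow I -> R
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((day, s, i, r))
    return history


# Hypothetical outbreak: 1,000 people, 10 initially infectious.
trajectory = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=160)
peak_day, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
print(peak_day, round(peak_infected, 1))
```

The sketch has no exposed or quarantined compartment, which are the two omissions the abstract identifies in SIR and that SQEIR is said to address.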
Affiliation(s)
- Yichun Wu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China
- Yaqi Sun
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China
- Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, 421002, China
- Mugang Lin
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, 421002, China
- Hunan Provincial Key Laboratory of Intelligent Information Processing and Application, Hengyang, 421002, China
23
Guo C, Li H. Application of 5G network combined with AI robots in personalized nursing in China: A literature review. Front Public Health 2022; 10:948303. [PMID: 36091551 PMCID: PMC9449115 DOI: 10.3389/fpubh.2022.948303] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/19/2022] [Accepted: 08/08/2022] [Indexed: 01/21/2023]
Abstract
The medical and healthcare industry is currently developing into digitization. Attributed to the rapid development of advanced technologies such as the 5G network, cloud computing, artificial intelligence (AI), and big data, and their wide applications in the medical industry, the medical model is shifting into an intelligent one. By combining the 5G network with cloud healthcare platforms and AI, nursing robots can effectively improve the overall medical efficacy. Meanwhile, patients can enjoy personalized medical services, the supply and the sharing of medical and healthcare services are promoted, and the digital transformation of the healthcare industry is accelerated. In this paper, the application and practice of 5G network technology in the medical industry are introduced, including telecare, 5G first-aid remote medical service, and remote robot applications. Also, by combining application characteristics of AI and development requirements of smart healthcare, the overall planning, intelligence, and personalization of the 5G network in the medical industry, as well as opportunities and challenges of its application in the field of nursing are discussed. This paper provides references to the development and application of 5G network technology in the field of medical service.
Affiliation(s)
- Caixia Guo
- Presidents' Office, China-Japan Union Hospital, Jilin University, Changchun, China
| | - Hong Li
- Department of Emergency Medicine, China-Japan Union Hospital, Jilin University, Changchun, China
- *Correspondence: Hong Li
|
24
|
Early Warning of Infectious Diseases in Hospitals Based on Multi-Self-Regression Deep Neural Network. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:8990907. [PMID: 36032546 PMCID: PMC9410942 DOI: 10.1155/2022/8990907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 07/11/2022] [Indexed: 11/17/2022]
Abstract
Objective. Infectious diseases usually spread rapidly. This study aims to develop a model that can provide fine-grained early warnings of infectious diseases using real hospital data combined with disease transmission characteristics, weather, and other multi-source data. Methods. Based on daily infectious disease reports collected from several large general hospitals in China between 2012 and 2020, seven infectious diseases common in medical institutions were screened and a multi-self-regression deep (MSRD) neural network was constructed. Using a recurrent neural network as its basic structure, the model captures the epidemiological trend of infectious diseases by considering the current influencing conditions while taking into account the historical development characteristics in the time-series data. The fitting and prediction accuracy of the model were evaluated using mean absolute error (MAE) and root mean squared error. Results. The proposed approach performs significantly better than the existing infectious disease dynamics model, susceptible-exposed-infected-removed (SEIR), because it addresses several of that model's limitations: quantitative data such as the latent population are difficult to obtain, long time series tend to be overfitted, and only a single series of case counts is considered, without the epidemiological characteristics of infectious diseases. We also compare the approach with certain machine learning methods. Experimental results demonstrate that the proposed approach achieves an MAE of 0.6928 and 1.3782 for hand, foot, and mouth disease and influenza, respectively. Conclusion. The MSRD-based infectious disease prediction model proposed in this paper can provide daily and instantaneous updates and accurate predictions of epidemic trends.
|
25
|
Barboza MFX, Monteiro KHDC, Rodrigues IR, Santos GL, Monteiro WM, Figueira EAG, Sampaio VDS, Lynn T, Endo PT. Prediction of malaria using deep learning models: A case study on city clusters in the state of Amazonas, Brazil, from 2003 to 2018. Rev Soc Bras Med Trop 2022; 55:e0420. [PMID: 35946631 PMCID: PMC9344950 DOI: 10.1590/0037-8682-0420-2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2021] [Accepted: 04/13/2022] [Indexed: 11/22/2022] Open
Abstract
Background: Malaria is curable. Nonetheless, over 229 million cases of malaria were recorded in 2019, along with 409,000 deaths. Although over 42 million Brazilians are at risk of contracting malaria, 99% of all malaria cases in Brazil are located in or around the Amazon rainforest. Despite declining cases and deaths, malaria remains a major public health issue in Brazil. Accurate spatiotemporal prediction of malaria propagation may enable improved resource allocation to support efforts to eradicate the disease. Methods: In response to calls for novel research on malaria elimination strategies that suit local conditions, in this study, we propose machine learning (ML) and deep learning (DL) models to predict the probability of malaria cases in the state of Amazonas. Using a dataset of approximately 6 million records (January 2003 to December 2018), we applied k-means clustering to group cities based on the similarity of their malaria incidence. We evaluated random forest, long short-term memory (LSTM), and gated recurrent unit (GRU) models and compared their performance. Results: The LSTM architecture achieved better performance in clusters with less variability in the number of cases, whereas the GRU presented better results in clusters with high variability. Although Diebold-Mariano testing suggested that the LSTM and GRU performed comparably, the GRU can be trained significantly faster, which could prove advantageous in practice. Conclusions: All models showed satisfactory accuracy and strong performance in predicting new cases of malaria, and each could serve as a supplemental tool to support regional policies and strategies.
Affiliation(s)
| | | | | | - Guto Leoni Santos
- Universidade Federal de Pernambuco, Centro de Informática, Recife, PE, Brasil
| | - Wuelton Marcelo Monteiro
- Universidade do Estado do Amazonas, Manaus, AM, Brasil
- Fundação de Medicina Tropical Doutor Heitor Vieira Dourado, Manaus, AM, Brasil
| | - Elder Augusto Guimaraes Figueira
- Fundação de Vigilância em Saúde Rosemary Costa Pinto, Manaus, AM, Brasil
- Instituto Oswaldo Cruz, Programa de Pós-graduação Stricto Sensu em Medicina Tropical, Rio de Janeiro, RJ, Brasil
| | - Vanderson de Souza Sampaio
- Fundação de Medicina Tropical Doutor Heitor Vieira Dourado, Manaus, AM, Brasil
- Fundação de Vigilância em Saúde Rosemary Costa Pinto, Manaus, AM, Brasil
- Instituto Todos pela Saúde, São Paulo, SP, Brasil
| | - Theo Lynn
- Dublin City University, Dublin, Ireland
| | - Patricia Takako Endo
- Universidade de Pernambuco, Programa de Pós-Graduação em Engenharia da Computação, Recife, PE, Brasil
|
26
|
Li J, Ma Y, Xu X, Pei J, He Y. A Study on Epidemic Information Screening, Prevention and Control of Public Opinion Based on Health and Medical Big Data: A Case Study of COVID-19. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:9819. [PMID: 36011450 PMCID: PMC9408673 DOI: 10.3390/ijerph19169819] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/21/2022] [Revised: 08/03/2022] [Accepted: 08/08/2022] [Indexed: 06/15/2023]
Abstract
The outbreak of the coronavirus disease 2019 (COVID-19) represents an alert for epidemic prevention and control in public health. Offline anti-epidemic work is the main battlefield of epidemic prevention and control. However, online epidemic information prevention and control cannot be ignored. The aim of this study was to identify reliable information sources and false epidemic information, as well as early warnings of public opinion about epidemic information that may affect social stability and endanger the people's lives and property. Based on the analysis of health and medical big data, epidemic information screening and public opinion prevention and control research were decomposed into two modules. Eight characteristics were extracted from the four levels of coarse granularity, fine granularity, emotional tendency, and publisher behavior, and another regulatory feature was added, to build a false epidemic information identification model. Five early warning indicators of public opinion were selected from the macro level and the micro level to construct the early warning model of public opinion about epidemic information. Finally, an empirical analysis on COVID-19 information was conducted using big data analysis technology.
Affiliation(s)
- Jinhai Li
- College of Information Engineering, Taizhou University, Taizhou 225300, China
| | - Yunlei Ma
- Department of Personnel, Taizhou University, Taizhou 225300, China
| | - Xinglong Xu
- School of Management, Jiangsu University, Zhenjiang 212013, China
| | - Jiaming Pei
- School of Computer Science, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Youshi He
- School of Management, Jiangsu University, Zhenjiang 212013, China
|
27
|
Mayer LM, Strich JR, Kadri SS, Lionakis MS, Evans NG, Prevots DR, Ricotta EE. Machine Learning in Infectious Disease for Risk Factor Identification and Hypothesis Generation: Proof of Concept Using Invasive Candidiasis. Open Forum Infect Dis 2022; 9:ofac401. [DOI: 10.1093/ofid/ofac401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 08/02/2022] [Indexed: 11/13/2022] Open
Abstract
Background
Machine learning (ML) models can handle large datasets without assuming underlying relationships and can be useful for evaluating disease characteristics; yet, they are more commonly used for predicting individual disease risk rather than identifying factors at the population level. We offer a proof of concept applying random forest (RF) algorithms to Candida-positive hospital encounters in an electronic health record database of patients in the U.S.
Methods
Candida-positive encounters were extracted from the Cerner HealthFacts database; invasive infections were laboratory positive sterile site Candida infections. Features included demographics, admission source, care setting, physician specialty, diagnostic and procedure codes, and medications received prior to the first positive Candida culture. We used RF to assess risk factors for three outcomes: any invasive candidiasis (IC) vs non-IC, within-species IC vs non-IC (e.g. invasive C. glabrata vs non-invasive C. glabrata), and between-species IC (e.g. invasive C. glabrata vs all other IC).
Results
14 of 169 (8%) variables were consistently identified as important features in the ML models. When evaluating within-species IC, for example invasive C. glabrata vs non-invasive C. glabrata, we identified known features like central venous catheters, ICU stay, and gastrointestinal operations. In contrast, important variables for invasive C. glabrata vs all other IC included renal disease and medications like diabetes therapeutics, cholesterol medications, and antiarrhythmics.
Conclusions
Known and novel risk factors for IC were identified using ML, demonstrating the hypothesis-generating utility of this approach for infectious disease conditions about which less is known, specifically at the species level or for rarer diseases.
Affiliation(s)
- Lisa M Mayer
- Office of Data Science and Emerging Technologies, Office of Science Management and Operations, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Rockville, MD, USA
| | - Jeffrey R Strich
- Critical Care Medicine Department, NIH Clinical Center, NIH, Bethesda, MD, USA
| | - Sameer S Kadri
- Critical Care Medicine Department, NIH Clinical Center, NIH, Bethesda, MD, USA
| | - Michail S Lionakis
- Fungal Pathogenesis Section, Laboratory of Clinical Immunology & Microbiology (LCIM), NIAID, NIH, Bethesda, MD, USA
| | - Nicholas G Evans
- Department of Philosophy, University of Massachusetts Lowell, 883 Broadway Street, Lowell, MA, USA
| | - D Rebecca Prevots
- Epidemiology and Population Studies Unit, LCIM, NIAID, NIH, Bethesda, MD, USA
| | - Emily E Ricotta
- Epidemiology and Population Studies Unit, LCIM, NIAID, NIH, Bethesda, MD, USA
|
28
|
Yoshida K, Fujimoto T, Muramatsu M, Shimizu H. Prediction of hand, foot, and mouth disease epidemics in Japan using a long short-term memory approach. PLoS One 2022; 17:e0271820. [PMID: 35900968 PMCID: PMC9333334 DOI: 10.1371/journal.pone.0271820] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/14/2022] [Accepted: 07/08/2022] [Indexed: 11/19/2022] Open
Abstract
Hand, foot, and mouth disease (HFMD) is a common febrile illness caused by enteroviruses in the Picornaviridae family. The major symptoms of HFMD are fever and a vesicular rash on the hand, foot, or oral mucosa. Acute meningitis and encephalitis are observed in rare cases. HFMD epidemics occur annually in Japan, usually in the summer season. Relatively large-scale outbreaks have occurred every two years since 2011. In this study, the epidemic patterns of HFMD in Japan are predicted four weeks in advance using a deep learning method. The time-series data were analyzed with a long short-term memory (LSTM) approach, a type of recurrent neural network. The LSTM model was trained on the numbers of weekly HFMD cases in each prefecture. These data are reported in the Infectious Diseases Weekly Report, which compiles the national surveillance data from web sites at the National Institute of Infectious Diseases, Japan, under the Infectious Diseases Control Law. Our trained LSTM model distinguishes between relatively large-scale and small-scale epidemics. The trained model predicted the HFMD epidemics in 2018 and 2019, indicating that the LSTM approach can estimate the future epidemic patterns of HFMD in Japan.
Affiliation(s)
- Kazuhiro Yoshida
- Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
| | - Tsuguto Fujimoto
- Department of Fungal Infection, National Institute of Infectious Diseases, Tokyo, Japan
| | - Masamichi Muramatsu
- Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
| | - Hiroyuki Shimizu
- Department of Virology II, National Institute of Infectious Diseases, Tokyo, Japan
|
29
|
Dengue Risk Forecast with Mosquito Vector: A Multicomponent Fusion Approach Based on Spatiotemporal Analysis. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:2515432. [PMID: 35693260 PMCID: PMC9184161 DOI: 10.1155/2022/2515432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Revised: 05/07/2022] [Accepted: 05/10/2022] [Indexed: 12/04/2022]
Abstract
Dengue as an acute infectious disease threatens global public health and has sparked broad research interest. However, existing studies generally ignore the spatial dependencies involved in dengue forecast, and consideration of temporal periodicity is absent. In this work, we propose a spatiotemporal component fusion model (STCFM) to solve the dengue risk forecast issue. Considering that mosquitoes are an important vector of dengue transmission, we introduce feature factors involving mosquito abundance and spatiotemporal lags to model temporal trends and spatial distributions separately on the basis of statistical properties. Specifically, we conduct multiscale modeling of temporal dependencies to enhance the forecast capability of relevant periods by capturing the historical variation patterns of the data across different segments in the temporal dimension. In the spatial dimension, we quantify the multivariate spatial correlation analysis as additional features to strengthen the spatial feature representation and adopt the ConvLSTM model to learn spatial dependencies adequately. The final forecast results are obtained by stacking strategy fusion in ensemble learning. We conduct experiments on real dengue datasets. The results indicate that STCFM improves prediction accuracy through effective spatiotemporal feature representations and outperforms candidate models with a reasonable component construction strategy.
|
30
|
Larabi-Marie-Sainte S, Alhalawani S, Shaheen S, Almustafa KM, Saba T, Khan FN, Rehman A. Forecasting COVID19 parameters using time-series: KSA, USA, Spain, and Brazil comparative case study. Heliyon 2022; 8:e09578. [PMID: 35694424 PMCID: PMC9162784 DOI: 10.1016/j.heliyon.2022.e09578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 01/15/2022] [Accepted: 05/23/2022] [Indexed: 12/03/2022] Open
Abstract
Many countries are suffering from the COVID19 pandemic. The numbers of confirmed cases, recoveries, and deaths are of concern to countries with a high number of infected patients. Forecasting these parameters is a crucial way to control the spread of the disease and cope with the pandemic. This study aimed to forecast the number of cases and deaths in KSA using time series and well-known statistical forecasting techniques, including Exponential Smoothing and Linear Regression. The study is extended to forecast the number of cases in other major countries with a large number of infections, such as the US, Spain, and Brazil, to validate the proposed models (Drift, SES, Holt, and ETS). The forecast results were validated using four evaluation measures. The results showed that the proposed ETS model is efficient at forecasting the number of cases and the Drift model at forecasting the number of deaths. The comparison study, using the number of cases in KSA, showed that ETS (with RMSE reaching 18.44) outperforms state-of-the-art studies (with RMSE equal to 107.54). The proposed forecasting model can be used as a benchmark to tackle this pandemic in any country.
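As an illustration of two of the baseline models named in this abstract, the following is a minimal Python sketch of a drift forecast and simple exponential smoothing (SES), scored with RMSE. It uses the textbook definitions of these methods, not the authors' code; the function names, the smoothing parameter `alpha`, and the toy series are assumptions made for the example.

```python
import math


def drift_forecast(series, horizon):
    """Drift method: extend the straight line joining the first and last observations."""
    slope = (series[-1] - series[0]) / (len(series) - 1)
    return [series[-1] + h * slope for h in range(1, horizon + 1)]


def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing: a flat forecast at the last smoothed level."""
    level = series[0]  # assumption: initialise the level with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy daily case counts (hypothetical, for illustration only).
history = [10, 12, 14, 16]
future = [18, 21]

drift = drift_forecast(history, horizon=2)
ses = ses_forecast(history, alpha=0.5, horizon=2)
print("drift:", drift, "RMSE:", rmse(future, drift))
print("SES:  ", ses, "RMSE:", rmse(future, ses))
```

On a steadily rising series such as this one, the drift forecast tracks the trend while SES lags behind it, which is one reason several baselines are usually compared on the same hold-out data.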
Affiliation(s)
- Souad Larabi-Marie-Sainte
- Department of Computer Science, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
| | - Sawsan Alhalawani
- Department of Computer Science, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
| | - Sara Shaheen
- Department of Computer Science, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
| | - Khaled Mohamad Almustafa
- Department of Information Sciences, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
| | - Tanzila Saba
- Artificial Intelligence Data Analytics (AIDA) Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh 12435, Saudi Arabia
| | - Fatima Nayer Khan
- Department of Information Sciences, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
| | - Amjad Rehman
- Artificial Intelligence Data Analytics (AIDA) Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh 12435, Saudi Arabia
|
31
|
Yang E, Zhang H, Guo X, Zang Z, Liu Z, Liu Y. A multivariate multi-step LSTM forecasting model for tuberculosis incidence with model explanation in Liaoning Province, China. BMC Infect Dis 2022; 22:490. [PMID: 35606725 PMCID: PMC9128107 DOI: 10.1186/s12879-022-07462-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/28/2021] [Accepted: 05/10/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Tuberculosis (TB) is the respiratory infectious disease with the highest incidence in China. We aim to design a series of forecasting models and find the factors that affect the incidence of TB, thereby improving the accuracy of incidence prediction. RESULTS In this paper, we developed a new interpretable prediction system based on the multivariate multi-step Long Short-Term Memory (LSTM) model and the SHapley Additive exPlanation (SHAP) method. Four accuracy measures are introduced into the system: Root Mean Square Error, Mean Absolute Error, Mean Absolute Percentage Error, and symmetric Mean Absolute Percentage Error. The Autoregressive Integrated Moving Average (ARIMA) model and the seasonal ARIMA model are established. The multi-step ARIMA-LSTM model is proposed for the first time to examine the performance of each model in the short, medium, and long term, respectively. Compared with the ARIMA model, each error of the multivariate 2-step LSTM model is reduced by 12.92%, 15.94%, 15.97%, and 14.81% in the short term. The 3-step ARIMA-LSTM model achieved excellent performance, with each error decreased to 15.19%, 33.14%, 36.79%, and 29.76% in the medium and long term. We also provide, for the first time in the field of incidence prediction, local and global explanations of the multivariate single-step LSTM model. CONCLUSIONS The multivariate 2-step LSTM model is suitable for short-term prediction and obtained performance similar to that of previous studies. The 3-step ARIMA-LSTM model is appropriate for medium-to-long-term prediction and outperforms these models. The SHAP results indicate that the five most crucial features are maximum temperature, average relative humidity, local financial budget, monthly sunshine percentage, and sunshine hours.
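The four accuracy measures named in this abstract can be sketched in a few lines of Python. The definitions below are the common textbook forms; the abstract does not state which sMAPE variant the authors used, so the one shown here (absolute difference divided by the mean of the absolute actual and predicted values) is an assumption, as are the function names and the toy data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (undefined if any actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def smape(actual, predicted):
    """Symmetric MAPE, in percent (assumed variant: |p - a| / ((|a| + |p|) / 2))."""
    terms = (2.0 * abs(p - a) / (abs(a) + abs(p)) for a, p in zip(actual, predicted))
    return 100.0 * sum(terms) / len(actual)


# Toy monthly incidence values (hypothetical, for illustration only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

for name, metric in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape), ("sMAPE", smape)]:
    print(f"{name}: {metric(actual, predicted):.4f}")
```

RMSE and MAE are in the units of the series, whereas MAPE and sMAPE are scale-free percentages, which is why studies that compare models across horizons often report all four.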
Affiliation(s)
- Enbin Yang
- College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
| | - Hao Zhang
- College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
- College of Software, Jilin University, Changchun, 130012 China
| | - Xinsheng Guo
- College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
| | - Zinan Zang
- College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
| | - Zhen Liu
- College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Graduate School of Engineering, Nagasaki Institute of Applied Science, 536 Aba-machi, Nagasaki, 851-0193 Japan
| | - Yuanning Liu
- College of Computer Science and Technology, Jilin University, Changchun, 130012 China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, 130012 China
- College of Software, Jilin University, Changchun, 130012 China
|
32
|
A Deep Learning Approach to Estimate the Incidence of Infectious Disease Cases for Routinely Collected Ambulatory Records: The Example of Varicella-Zoster. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19105959. [PMID: 35627495 PMCID: PMC9141951 DOI: 10.3390/ijerph19105959] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/22/2022] [Revised: 05/03/2022] [Accepted: 05/10/2022] [Indexed: 02/01/2023]
Abstract
The burden of infectious diseases is crucial for both epidemiological surveillance and prompt public health response. A variety of data, including textual sources, can be fruitfully exploited. Dealing with unstructured data necessitates the use of methods for automatic data-driven variable construction and machine learning techniques (MLT) show promising results. In this framework, varicella-zoster virus (VZV) infection was chosen to perform an automatic case identification with MLT. Pedianet, an Italian pediatric primary care database, was used to train a series of models to identify whether a child was diagnosed with VZV infection between 2004 and 2014 in the Veneto region, starting from free text fields. Given the nature of the task, a recurrent neural network (RNN) with bidirectional gated recurrent units (GRUs) was chosen; the same models were then used to predict the children’s status for the following years. A gold standard produced by manual extraction for the same interval was available for comparison. RNN-GRU improved its performance over time, reaching the maximum value of area under the ROC curve (AUC-ROC) of 95.30% at the end of the period. The absolute bias in estimates of VZV infection was below 1.5% in the last five years analyzed. The findings in this study could assist the large-scale use of EHRs for clinical outcome predictive modeling and help establish high-performance systems in other medical domains.
|
33
|
Trends in using IoT with machine learning in smart health assessment. Int J Health Sci (Qassim) 2022. [DOI: 10.53730/ijhs.v6ns3.6404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
Abstract
The Internet of Things (IoT) provides a rich source of information that can be uncovered using machine learning (ML). These hybrid technologies have aided decision-making in several industries, such as education, security, business, and healthcare. ML enhances IoT for optimum prediction and recommendation systems. In the healthcare industry, machines already use IoT and ML to create medical records, diagnose diseases, and monitor patients. Different datasets need different ML algorithms to perform well, and the overall findings may be affected if the predicted results are not consistent. In clinical decision-making, the variability of prediction outcomes is a major consideration. To use IoT data effectively in healthcare, it is critical to have a firm grasp of the various machine learning techniques in use. This article highlights categorization and prediction algorithms that have been employed in the healthcare industry. Its purpose is to provide readers with an in-depth look at current machine learning algorithms and how they apply to IoT medical data.
|
34
|
Buyrukoğlu S, Yılmaz Y, Topalcengiz Z. Correlation value determined to increase Salmonella prediction success of deep neural network for agricultural waters. ENVIRONMENTAL MONITORING AND ASSESSMENT 2022; 194:373. [PMID: 35435507 DOI: 10.1007/s10661-022-10050-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/27/2021] [Accepted: 04/09/2022] [Indexed: 06/14/2023]
Abstract
The use of computer-based tools has become increasingly popular in the field of produce safety. Various algorithms have been applied to predict the population and presence of indicator microorganisms and pathogens in agricultural water sources. The purpose of this study is to improve the Salmonella prediction success of a deep feed-forward neural network (DFNN) in agricultural surface waters with a determined correlation value based on selected features. Datasets were collected from six agricultural ponds in Central Florida. The most successful physicochemical and environmental features were selected by gain ratio for the prediction of the generic Escherichia coli population with machine learning algorithms (decision tree, random forest, support vector machine). The Salmonella prediction success of the DFNN was evaluated with a dataset including the selected environmental and physicochemical features combined with predicted E. coli populations, with and without the correlation value. The performance of the correlation value was evaluated with all possible mathematical dataset combinations (nCr) of the six ponds. DFNN analyses with the correlation value achieved higher accuracy (88.89% to 98.41%) than analyses without it (83.68% to 96.99%) for all dataset combinations. The findings emphasize the success of the determined correlation value for the prediction of Salmonella presence in agricultural surface waters.
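To make the phrase "all possible mathematical dataset combinations (nCr) of six ponds" concrete, here is a small Python sketch that enumerates combinations of six datasets. The pond labels are hypothetical, and reading "all possible combinations" as every non-empty subset (r = 1 to 6) is an assumption; the abstract does not say which subset sizes the authors actually evaluated.

```python
from itertools import combinations
from math import comb

# Hypothetical labels for the six pond datasets.
ponds = ["P1", "P2", "P3", "P4", "P5", "P6"]


def all_combinations(items):
    """Return every non-empty combination of items, grouped from size 1 up to len(items)."""
    return [combo for r in range(1, len(items) + 1) for combo in combinations(items, r)]


pond_sets = all_combinations(ponds)

# The number of combinations of size r is nCr = n! / (r! * (n - r)!).
for r in range(1, len(ponds) + 1):
    print(f"r={r}: {comb(len(ponds), r)} combinations")
print("total:", len(pond_sets))
```

Under this reading, each combination would define one pooled dataset on which a model is trained and evaluated, so the reported accuracy ranges would summarise results across all of them.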
Affiliation(s)
- Selim Buyrukoğlu
- Department of Computer Engineering, Faculty of Engineering, Çankırı Karatekin University, 18100, Çankırı, Turkey.
| | - Yıldıran Yılmaz
- Computer Engineering Department, Faculty of Engineering and Architecture, Recep Tayyip Erdogan University, 53020, Rize, Turkey
| | - Zeynal Topalcengiz
- Department of Food Engineering, Faculty of Engineering and Architecture, Muş Alparslan University, 49250, Muş, Turkey
|
35
|
Xia Z, Qin L, Ning Z, Zhang X. Deep learning time series prediction models in surveillance data of hepatitis incidence in China. PLoS One 2022; 17:e0265660. [PMID: 35417459 PMCID: PMC9007353 DOI: 10.1371/journal.pone.0265660] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/01/2021] [Accepted: 03/06/2022] [Indexed: 12/09/2022] Open
Abstract
Background Precise incidence prediction of Hepatitis infectious disease is critical for early prevention and better government strategic planning. In this paper, we present different prediction models using deep learning methods based on the monthly incidence of Hepatitis recorded by a national public health surveillance system in mainland China. Methods We assessed and compared the performance of three deep learning methods, namely, the Long Short-Term Memory (LSTM) prediction model, the Recurrent Neural Network (RNN) prediction model, and the Back Propagation Neural Network (BPNN) prediction model. The data collected from 2005 to 2018 were used to train and test the prediction models, with the data split via 5-fold cross-validation. The performance was evaluated based on three metrics: mean square error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Results From 2005 to 2018, 20,924,951 cases and 11,892 deaths were recorded in the system. Hepatitis B (HB) accounted for the largest share of incidence and deaths, with a proportion greater than 70 percent, while the percentage of incidence and deaths decreased substantially in 2018 compared with 2005. Based on the measured errors and the visualization of the three neural networks, no single model for predicting incidence cases is completely superior to the others. When predicting the number of incidence cases for HB, the performance ranking of the three models from high to low is LSTM, BPNN, RNN, while it is LSTM, RNN, BPNN for Hepatitis C (HC). For the LSTM model, the MAE, MSE, and MAPE are 3.84×10⁻⁶, 3.08×10⁻¹¹, and 4.981 for HB and 8.84×10⁻⁶, 1.98×10⁻¹², and 5.8519 for HC, respectively.
Conclusions The deep learning time series predictive models demonstrate their value for forecasting Hepatitis incidence and have the potential to assist decision-makers in making efficient decisions for the early detection of disease incidents, which would significantly promote Hepatitis disease control and management.
Affiliation(s)
- Zhaohui Xia
- National Enterprise Information Software Engineering Research Center, School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
| | - Lei Qin
- National Enterprise Information Software Engineering Research Center, School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, China
| | - Zhen Ning
- School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
| | - Xingyu Zhang
- Starzl Transplant Institute, University of Pittsburgh Medical Center, Pittsburgh, PA, United States of America
|
36
|
An edge-driven multi-agent optimization model for infectious disease detection. APPL INTELL 2022; 52:14362-14373. [PMID: 35280108 PMCID: PMC8898659 DOI: 10.1007/s10489-021-03145-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/24/2021] [Indexed: 11/25/2022]
Abstract
This work introduces a new intelligent framework for infectious disease detection that draws on several emerging intelligent paradigms. We propose new deep learning architectures, such as entity embedding networks, long short-term memory, and convolutional neural networks, for accurately learning from heterogeneous medical data to identify disease infection. A multi-agent system is also incorporated to increase the autonomy of the proposed framework, where each agent can easily share its derived learning outputs with the other agents in the system. Furthermore, evolutionary computation algorithms, such as memetic algorithms and bee swarm optimization, control the exploration of the hyper-parameter optimization space of the proposed framework. Intensive experiments were carried out on medical data. The strong results obtained confirm the superiority of our framework over state-of-the-art solutions in both detection rate and runtime performance, with the detection rate reaching 98% on real use cases.
|
37
|
Integrating Models and Fusing Data in a Deep Ensemble Learning Method for Predicting Epidemic Diseases Outbreak. BIG DATA RESEARCH 2022. [PMCID: PMC8577221 DOI: 10.1016/j.bdr.2021.100286] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/17/2023]
Abstract
Due to the continuous and growing spread of the novel coronavirus disease (COVID-19) worldwide, it is urgent, especially in the data science era, to develop accurate data-driven decision-aid methods to predict and detect early the outbreak of this epidemic disease and thus to support healthcare decision makers. In this context, the main goal of this paper is to build an accurate and generic data-driven method that can predict daily COVID-19 positive cases and therefore help stakeholders to make and review their epidemic response plans. This method is based on the integration of three deep learning models, Long Short-Term Memory (LSTM), Deep Neural Networks (DNN) and Convolutional Neural Networks (CNN), and takes advantage of their complementarity. The proposed method is validated on two experimental scenarios: the first validates the method on China and Tunisia case studies, and the second is based on a data fusion and transfer learning process in which China data and models are reused to predict the Tunisia COVID-19 outbreak. Experimental results indicate that, compared with individual learners, the stacked-DNN meta-learner, whose inputs are the results of the LSTM, DNN and CNN learners, achieved the best results in terms of accuracy as well as RMSE, and required the least time for training as well as prediction in the two scenarios. The main outcomes of this paper are i) to adopt deep learning models combined with stacking ensemble learning to accurately forecast COVID-19 positive cases and ii) to merge data and adopt transfer learning for the prediction of confirmed cases by reusing China data, learners and meta-learners to predict the epidemic trend for other countries with fewer data collection facilities, when preventive and control measures are similar.
|
38
|
Phoobane P, Masinde M, Mabhaudhi T. Predicting Infectious Diseases: A Bibliometric Review on Africa. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19031893. [PMID: 35162917 PMCID: PMC8835071 DOI: 10.3390/ijerph19031893] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/05/2022] [Revised: 01/28/2022] [Accepted: 01/30/2022] [Indexed: 12/18/2022]
Abstract
Africa has a long history of novel and re-emerging infectious disease outbreaks. This reality has attracted the attention of researchers interested in the general research theme of predicting infectious diseases. However, a knowledge mapping analysis of literature to reveal the research trends, gaps, and hotspots in predicting Africa’s infectious diseases using bibliometric tools has not been conducted. A bibliometric analysis of 247 published papers on predicting infectious diseases in Africa, published in the Web of Science core collection databases, is presented in this study. The results indicate that the severe outbreaks of infectious diseases in Africa have increased scientific publications during the past decade. The results also reveal that African researchers are highly underrepresented in these publications and that the United States of America (USA) is the most productive and collaborative country. The relevant hotspots in this research field include malaria, models, classification, associations, COVID-19, and cost-effectiveness. Furthermore, weather-based prediction using meteorological factors is an emerging theme, and very few studies have used the fourth industrial revolution (4IR) technologies. Therefore, there is a need to explore 4IR predicting tools such as machine learning and consider integrated approaches that are pivotal to developing robust prediction systems for infectious diseases, especially in Africa. This review paper provides a useful resource for researchers, practitioners, and research funding agencies interested in the research theme—the prediction of infectious diseases in Africa—by capturing the current research hotspots and trends.
Affiliation(s)
- Paulina Phoobane
- Department of Information Technology, Central University of Technology, Free State, Private Bag X200539, Bloemfontein 9300, South Africa
- Muthoni Masinde
- Department of Information Technology, Central University of Technology, Free State, Private Bag X200539, Bloemfontein 9300, South Africa
- Tafadzwanashe Mabhaudhi
- Department of Information Technology, Central University of Technology, Free State, Private Bag X200539, Bloemfontein 9300, South Africa
- Centre for Transformative Agricultural and Food Systems, School of Agricultural, Earth and Environmental Sciences, University of KwaZulu-Natal, Private Bag X01, Pietermaritzburg 3201, South Africa
- International Water Management Institute (IWMI-GH), West Africa Office, PMB CT 112 Cantonments, Accra GA015, Ghana
|
39
|
John Cremin C, Dash S, Huang X. Big Data: Historic Advances and Emerging Trends in Biomedical Research. CURRENT RESEARCH IN BIOTECHNOLOGY 2022. [DOI: 10.1016/j.crbiot.2022.02.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022] Open
|
40
|
Basu S, Sen S. COVID 19 Pandemic, Socio-Economic Behaviour and Infection Characteristics: An Inter-Country Predictive Study Using Deep Learning. COMPUTATIONAL ECONOMICS 2022; 61:645-676. [PMID: 35095204 PMCID: PMC8789377 DOI: 10.1007/s10614-021-10223-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 11/04/2021] [Indexed: 06/14/2023]
Abstract
This work aims to develop a data-driven multi-horizon incidence forecasting model that considers the inter-country variability in static socio-economic factors. The specific objectives of this study are to predict future country-wise COVID-19 incidence, to identify the influence of individual socio-economic factors on the predictions, and to analyze clusters of countries on the basis of influential explanatory variables and thus to search for intra-cluster and inter-cluster characteristics. To that end, this study uses the deep neural network based temporal fusion transformer for the predictions, Pearson correlation to understand the influence of socio-economic variables on incidence, and hierarchical clustering for cluster analysis. The findings indicate that the inter-country infection-related predictions vary widely across space and time, and that different socio-economic variables have different influences on this inter-country variability. It is observed that the greater the population size, the stronger the global connectedness, the larger the social cohesion, the higher the population density and the more meaningful the gender-based discrimination, the higher the future spread will be. On the other hand, the greater the development level, the higher the nutritional status, the greater the access to quality health services, the greater the urban population and the greater the material poverty, the lower the future spread will be. A definite spatial pattern of influence of the explanatory variables emerged from the cluster analysis. To minimize vulnerability to unforeseen biological calamities, modern and sustainable development policies are needed; affluence may not guarantee less infection. However, these policies should vary between economies because of the variation in socio-economic status among countries worldwide.
Affiliation(s)
- Srinka Basu
- Department of Engineering and Technological Studies, University of Kalyani, Kalyani, West Bengal 741235, India
- Sugata Sen
- Department of Economics, Panskura Banamali College (Autonomous), Panskura, Purba Medinipur, West Bengal 721152, India
|
41
|
Clement JC, Ponnusamy V, Sriharipriya K, Nandakumar R. A Survey on Mathematical, Machine Learning and Deep Learning Models for COVID-19 Transmission and Diagnosis. IEEE Rev Biomed Eng 2022; 15:325-340. [PMID: 33769936 PMCID: PMC8905610 DOI: 10.1109/rbme.2021.3069213] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 05/18/2020] [Revised: 01/05/2021] [Accepted: 03/22/2021] [Indexed: 11/10/2022]
Abstract
COVID-19 is a life-threatening disease that has an enormous global impact. As the cause of the disease is a novel coronavirus whose gene information is unknown, drugs and vaccines are yet to be found. In the present situation, disease spread analysis and prediction with the help of mathematical and data-driven models will be of great help in initiating prevention and control actions, namely lockdown and quarantine. Various mathematical and machine learning models have been proposed for analyzing and predicting the spread. Each model has its own limitations and advantages for a particular scenario. This article reviews the state-of-the-art mathematical models for COVID-19, including compartment models, statistical models and machine learning models, to provide more insight so that an appropriate model can be adopted for disease spread analysis. Furthermore, accurate diagnosis of COVID-19 is another essential process for identifying infected persons and controlling further spread. As the spread is fast, there is a need for a quick automated diagnosis mechanism to handle a large population. Deep learning and machine learning based diagnostic mechanisms will be more appropriate for this purpose. Accordingly, a comprehensive review of deep learning models for the diagnosis of the disease is also provided in this article.
Affiliation(s)
- VijayaKumar Ponnusamy
- Department of Electronics and Communication Engineering, SRM Institute of Science and Technology, Kattankulathur 603203, India
- K.C. Sriharipriya
- School of Electronics Engineering, Vellore Institute of Technology, Vellore 632014, India
- R. Nandakumar
- Department of Electronics and Communication Engineering, K.S.R Institute for Engineering and Technology, Kalvi Nagar 637215, India
|
42
|
Song-men S. Intelligent Diagnosis Method for New Diseases Based on Fuzzy SVM Incremental Learning. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:7631271. [PMID: 35069792 PMCID: PMC8776429 DOI: 10.1155/2022/7631271] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2021] [Revised: 11/24/2021] [Accepted: 12/13/2021] [Indexed: 11/27/2022]
Abstract
The diagnosis of new diseases is a challenging problem. In the early stage of the emergence of a new disease, there are few case samples, which may lead to low accuracy of intelligent diagnosis. Because of the advantages of the support vector machine (SVM) in dealing with small-sample problems, it is selected as the intelligent diagnosis method. Updating the standard SVM diagnosis model requires retraining on all samples, which incurs large storage and computation costs and makes it difficult to adapt to a changing reality. To solve this problem, this paper proposes a new disease diagnosis method based on fuzzy SVM (FSVM) incremental learning. According to SVM theory, the support vector set and the boundary sample set related to the SVM diagnosis model are extracted. Only these sample sets are considered in incremental learning, to preserve accuracy and reduce the cost of computation and storage. To reduce the impact of noise points caused by the reduction of training samples, FSVM is used to update the diagnosis model, which improves generalization. The simulation results on the banana dataset show that the proposed method can improve the classification accuracy from 86.4% to 90.4%. Finally, the method is applied to the diagnosis of COVID-19. The diagnostic accuracy reaches 98.2%, whereas the traditional SVM reaches only 84%. As the number of case samples increases, the model is updated. When the training samples increase to 400, the number of samples participating in training is only 77, so the amount of computation for the updated model is small.
Affiliation(s)
- Shi Song-men
- China Pharmaceutical University, Nanjing 211198, China
|
43
|
John CC, Ponnusamy V, Krishnan Chandrasekaran S, R N. A Survey on Mathematical, Machine Learning and Deep Learning Models for COVID-19 Transmission and Diagnosis. IEEE Rev Biomed Eng 2022. [PMID: 33769936 DOI: 10.1109/rbme.2021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
COVID-19 is a life-threatening disease that has an enormous global impact. As the cause of the disease is a novel coronavirus whose gene information is unknown, drugs and vaccines are yet to be found. In the present situation, disease spread analysis and prediction with the help of mathematical and data-driven models will be of great help in initiating prevention and control actions, namely lockdown and quarantine. Various mathematical and machine learning models have been proposed for analyzing and predicting the spread. Each model has its own limitations and advantages for a particular scenario. This article reviews the state-of-the-art mathematical models for COVID-19, including compartment models, statistical models and machine learning models, to provide more insight so that an appropriate model can be adopted for disease spread analysis. Furthermore, accurate diagnosis of COVID-19 is another essential process for identifying infected persons and controlling further spread. As the spread is fast, there is a need for a quick automated diagnosis mechanism to handle a large population. Deep learning and machine learning based diagnostic mechanisms will be more appropriate for this purpose. Accordingly, a comprehensive review of deep learning models for the diagnosis of the disease is also provided in this article.
|
44
|
Application of big data in COVID-19 epidemic. DATA SCIENCE FOR COVID-19 2022. [PMCID: PMC8988924 DOI: 10.1016/b978-0-323-90769-9.00023-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
45
|
AIM and Evolutionary Theory. Artif Intell Med 2022. [DOI: 10.1007/978-3-030-64573-1_41] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
46
|
Zhang T, Rabhi F, Behnaz A, Chen X, Paik HY, Yao L, MacIntyre CR. Use of automated machine learning for an outbreak risk prediction tool. INFORMATICS IN MEDICINE UNLOCKED 2022. [DOI: 10.1016/j.imu.2022.101121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
|
47
|
Elgazzar H, Spurlock K, Bogart T. Evolutionary clustering and community detection algorithms for social media health surveillance. MACHINE LEARNING WITH APPLICATIONS 2021; 6:100084. [PMID: 34939040 PMCID: PMC8470901 DOI: 10.1016/j.mlwa.2021.100084] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/23/2021] [Revised: 06/18/2021] [Accepted: 06/21/2021] [Indexed: 11/28/2022] Open
Abstract
With their prominent rise over the past decade, social networks have become a gold mine for data mining operations seeking to model the real world through these virtual worlds. One of the most important applications that has been proposed is using information generated from social networks as a supplemental health surveillance system to monitor disease epidemics. At the time this research was conducted in 2020, COVID-19 had evolved into a global pandemic, forcing many countries to implement preventative measures to halt its spread. Health surveillance has been a powerful tool for putting further preventative measures in place; however, it is not a perfect system, and slowly collected or misidentified information can prove detrimental to these efforts. This research proposes a new potential surveillance avenue through unsupervised machine learning, using dynamic, evolutionary variants of the clustering algorithms DBSCAN and the Louvain method to allow for community detection in temporal networks. This technique is paired with geographical data collected directly from the social media platform Twitter to create an effective and accurate health surveillance system that grows as time passes. The experimental results show that the proposed system is promising and has the potential to be an advancement on current machine learning health surveillance techniques.
Affiliation(s)
- Heba Elgazzar
- School of Engineering and Computer Science, Morehead State University, Morehead, KY 40351, USA
- Kyle Spurlock
- School of Engineering and Computer Science, Morehead State University, Morehead, KY 40351, USA
- Tanner Bogart
- School of Engineering and Computer Science, Morehead State University, Morehead, KY 40351, USA
|
48
|
Singh G, Soman B. Spatiotemporal epidemiology and forecasting of dengue in the state of Punjab, India: Study protocol. Spat Spatiotemporal Epidemiol 2021; 39:100444. [PMID: 34774263 DOI: 10.1016/j.sste.2021.100444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2021] [Revised: 07/02/2021] [Accepted: 07/21/2021] [Indexed: 11/30/2022]
Abstract
The dengue burden in India is a major public health problem. The present study has been designed to understand the mechanisms by which routine data generate evidence. Secondary analysis of routine datasets will be conducted to understand the spatiotemporal epidemiology of dengue and to forecast it. A data science approach will be adopted to generate a reproducible framework in the R environment. Lab-confirmed dengue cases reported by the state health authorities from 01 January 2015 to 31 December 2019 will be included. Multiple climatic variables from satellite imagery, climatic models, vegetation and built-up indices, and sociodemographic variables will be explored as risk factors. Exploratory data analysis followed by statistical analysis and machine learning will be performed. Data analysis will include geospatial information analysis, time series analysis, and spatiotemporal analysis. The study will add value to the existing disease surveillance mechanisms by developing a framework for incorporating multiple routine data sources available in the country.
Affiliation(s)
- Gurpreet Singh
- Achutha Menon Centre for Health Science Studies, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Trivandrum, India
- Biju Soman
- Achutha Menon Centre for Health Science Studies, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Trivandrum, India
|
49
|
Prediction on transmission trajectory of COVID-19 based on particle swarm algorithm. Pattern Recognit Lett 2021; 152:70-78. [PMID: 34538991 PMCID: PMC8440343 DOI: 10.1016/j.patrec.2021.09.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/31/2021] [Revised: 08/11/2021] [Accepted: 09/08/2021] [Indexed: 12/12/2022]
Abstract
This study aimed to predict the transmission trajectory of coronavirus disease 2019 (COVID-19). The particle swarm optimization (PSO) algorithm was combined with the traditional susceptible-exposed-infected-recovered (SEIR) infectious disease prediction model to propose a SEIR-PSO prediction model for COVID-19. The domestic epidemic data from February 25, 2020 to March 20, 2020 in China were selected as the training set for analysis. The results showed that when the conversion rate, recovery rate, and mortality rate of the SEIR-PSO model were 1/5, 1/15, and 1/13, its prediction of the number of people diagnosed with COVID-19 was the closest to the real data. The SEIR-PSO model had a mean square error (MSE) of 1304.35 and a mean absolute error (MAE) of 1069.18, the best prediction performance compared with the susceptible-infectious-susceptible (SIS) model and the SEIR model. Compared with standard particle swarm optimization (SPSO) and linear weighted particle swarm optimization (LPSO), two classical improved PSO algorithms, the SEIR-PSO model showed higher reliability and diversity. In summary, the SEIR-PSO model showed excellent performance in predicting the time series of COVID-19 epidemic data and reliable application value for the prevention and control of the COVID-19 epidemic.
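For orientation, the compartmental part of such a model can be sketched as a small SEIR simulation with an added death compartment. This is only an illustrative sketch under stated assumptions, not the authors' SEIR-PSO implementation: the function `simulate_seir`, the transmission rate `beta`, the population size and the initial counts are all hypothetical, the three rates from the abstract are interpreted here as per-day rates, and the PSO parameter search itself is not shown.

```python
def simulate_seir(n, e0, i0, beta, sigma, gamma, mu, days, dt=0.1):
    """Forward-Euler SEIR simulation with a death compartment.

    n     : total population (hypothetical value supplied by the caller)
    beta  : transmission rate per day (assumed; not given in the abstract)
    sigma : conversion rate, exposed -> infectious
    gamma : recovery rate
    mu    : mortality rate of infectious individuals
    Returns one (S, E, I, R, D) tuple per day, including day 0.
    """
    s, e, i, r, d = float(n - e0 - i0), float(e0), float(i0), 0.0, 0.0
    history = [(s, e, i, r, d)]
    steps_per_day = int(round(1 / dt))
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infectious = sigma * e * dt       # E -> I
            new_recovered = gamma * i * dt        # I -> R
            new_dead = mu * i * dt                # I -> D
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered - new_dead
            r += new_recovered
            d += new_dead
        history.append((s, e, i, r, d))
    return history


if __name__ == "__main__":
    # Rates 1/5, 1/15 and 1/13 are taken from the abstract; everything else is made up.
    trajectory = simulate_seir(n=1_000_000, e0=100, i0=10, beta=0.5,
                               sigma=1 / 5, gamma=1 / 15, mu=1 / 13, days=30)
    print(trajectory[-1])
```

In a fitting setup, an optimizer such as PSO would repeatedly call a simulator like this with candidate parameter values and score each run against observed case counts, for example with MSE.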
|
50
|
Rathinam F, Khatua S, Siddiqui Z, Malik M, Duggal P, Watson S, Vollenweider X. Using big data for evaluating development outcomes: A systematic map. CAMPBELL SYSTEMATIC REVIEWS 2021; 17:e1149. [PMID: 37051451 PMCID: PMC8354555 DOI: 10.1002/cl2.1149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/06/2023]
Abstract
BACKGROUND Policy makers need access to reliable data to monitor and evaluate the progress of development outcomes and targets such as the Sustainable Development Goals (SDGs). However, significant data and evidence gaps remain. Lack of resources, limited capacity within governments and logistical difficulties in collecting data are some of the reasons for the data gaps. Big data (that is, data that are digitally generated, passively produced and automatically collected) offers great potential for answering some of the data needs. Satellites and sensors, mobile phone call detail records, online transactions and search data, and social media are some examples of big data. Integrating big data with traditional household surveys and administrative data can complement data availability, quality, granularity, accuracy and frequency, and help measure development outcomes temporally and spatially in a number of new ways. The study maps different sources of big data onto development outcomes (based on the SDGs) to identify the current evidence base, use and gaps. The map provides a visual overview of existing and ongoing studies. This study also discusses the risks, biases and ethical challenges in using big data for measuring and evaluating development outcomes. The study is a valuable resource for evaluators, researchers, funders, policymakers and practitioners in their effort to contribute to evidence-informed policy making and to achieve the SDGs. OBJECTIVES Identify and appraise rigorous impact evaluations (IEs), systematic reviews and studies that have innovatively used big data to measure any development outcomes, with special reference to difficult contexts. SEARCH METHODS A number of general and specialised databases and repositories of organisations were searched by an information specialist using keywords related to big data. SELECTION CRITERIA The studies were selected on the basis of whether they used big data sources to measure or evaluate development outcomes.
DATA COLLECTION AND ANALYSIS Data collection was conducted using a data extraction tool, and all extracted data were entered into Excel and then analysed using Stata. The data analysis involved looking at trends and descriptive statistics only. MAIN RESULTS The search yielded over 17,000 records, which we then screened down to 437 studies that became the foundation of our systematic map. We found that, overall, there is a sizable and rapidly growing number of measurement studies using big data but a much smaller number of IEs. We also see that the bulk of the big data sources are machine-generated (mostly satellites), represented in light blue in the map. We find that satellite data were used in over 70% of the measurement studies and in over 80% of the IEs. AUTHORS' CONCLUSIONS This map gives us a sense that there is a lot of work being done to develop appropriate measures using big data which could subsequently be used in IEs. Information on costs, ethics and transparency is lacking in the studies, and more work is needed in this area to understand the efficacies related to the use of big data. There are a number of outcomes which are not being studied using big data, either because of a lack of applicability, as in education, or because of a lack of awareness of the new methods and data sources. The map points to a number of gaps as well as opportunities where future researchers can conduct research.
|