1
Ahkola H, Kotamäki N, Siivola E, Tiira J, Imoscopi S, Riva M, Tezel U, Juntunen J. Uncertainty in Environmental Micropollutant Modeling. Environmental Management 2024; 74:380-398. [PMID: 38816505 PMCID: PMC11227446 DOI: 10.1007/s00267-024-01989-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 05/11/2024] [Indexed: 06/01/2024]
Abstract
Water pollution policies have been enacted across the globe to minimize the environmental risks posed by micropollutants (MPs). To ensure that environmental objectives are realized, regulatory institutions need information on the environmental fate of MPs. Furthermore, there is an urgent need to improve environmental decision-making, which relies heavily on scientific data. The use of mathematical and computational modeling in environmental permit processes for water construction activities has increased. Uncertainty in input data arises at several steps, from sampling and analysis to the physico-chemical characteristics of the MP. Machine learning (ML) methods are an emerging technique in this field. ML techniques might become more crucial for MP modeling as the amount of data constantly increases and new ML approaches and applications are developed. It seems that both modeling strategies, traditional and ML, use quite similar methods to obtain uncertainties. Process-based models cannot consider all known and relevant processes, making the comprehensive estimation of uncertainty challenging. The problems of a comprehensive uncertainty analysis within the ML approach are even greater. For both approaches, generic and common methods seem to be more useful in practice than those derived ab initio. The implementation of the modeling results, including uncertainty and the precautionary principle, should be researched more deeply to achieve a reliable estimation of the effect of an action on the chemical and ecological status of an environment without underestimating or overestimating the risk. The prevailing uncertainties need to be identified, acknowledged and, if possible, reduced. This paper provides an overview of different aspects of uncertainty in MP modeling.
Affiliation(s)
- Heidi Ahkola
- Finnish Environment Institute (Syke), Latokartanonkaari 11, 00790, Helsinki, Finland
- Niina Kotamäki
- Finnish Environment Institute (Syke), Latokartanonkaari 11, 00790, Helsinki, Finland
- Eero Siivola
- Finnish Environment Institute (Syke), Latokartanonkaari 11, 00790, Helsinki, Finland
- Jussi Tiira
- Finnish Environment Institute (Syke), Latokartanonkaari 11, 00790, Helsinki, Finland
- Stefano Imoscopi
- IDSIA, Università della Svizzera italiana (USI), Via Buffi 13, 6900, Lugano, Switzerland
- Matteo Riva
- Independent Researcher. Work Carried Out While Employed at IDSIA, USI, Lugano, Switzerland
- Ulas Tezel
- Institute of Environmental Sciences, Boğaziçi University, Hisar Campus, Bebek, Istanbul, 34342, Turkey
- Janne Juntunen
- Finnish Environment Institute (Syke), Latokartanonkaari 11, 00790, Helsinki, Finland
2
Chowdhury AH, Rad D, Rahman MS. Predicting anxiety, depression, and insomnia among Bangladeshi university students using tree-based machine learning models. Health Sci Rep 2024; 7:e2037. [PMID: 38650723 PMCID: PMC11033350 DOI: 10.1002/hsr2.2037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 02/21/2024] [Accepted: 03/26/2024] [Indexed: 04/25/2024] Open
Abstract
Background and Aims: Mental health problems are a rising public health concern. People of all ages, especially Bangladeshi university students, are affected by this burden. Thus, the objective of the study was to use tree-based machine learning (ML) models to identify major risk factors and predict anxiety, depression, and insomnia in university students. Methods: A social media-based cross-sectional survey was employed for data collection. We used the Generalized Anxiety Disorder (GAD-7), Patient Health Questionnaire (PHQ-9) and Insomnia Severity Index (ISI-7) scales to measure students' anxiety, depression and insomnia problems. The tree-based supervised decision tree (DT), random forest (RF) and robust eXtreme Gradient Boosting (XGBoost) ML algorithms were used to build the prediction models, and their predictive performance was evaluated using confusion matrices and receiver operating characteristic (ROC) curves. Results: Of the 1250 students surveyed, 64.7% were male and 35.3% were female. The students' ages ranged from 18 to 26 years, with an average age of 22.24 years (SD = 1.30). The majority of the students (72.6%) were from rural areas, and 56.6% were addicted to social media. Overall, 83.3% of the students had moderate to severe anxiety, 84.7% had moderate to severe depression and 76.5% had moderate to severe insomnia problems. Students' social media addiction, age, academic performance, smoking status, monthly family income and morningness-eveningness were the main risk factors for anxiety, depression and insomnia. The highest predictive performance was observed for the XGBoost model for anxiety, depression and insomnia. Conclusion: The study findings offer valuable insights for stakeholders, families and policymakers, enabling a deeper understanding of these pressing mental health disorders. This understanding can guide the formulation of improved policy strategies, initiatives for mental health promotion, and the development of effective counseling services within university campuses. Additionally, our proposed model might play a critical role in diagnosing and predicting mental health problems among Bangladeshi university students and in similar settings.
Affiliation(s)
- Dana Rad
- Center of Research Development and Innovation in Psychology, Aurel Vlaicu University of Arad, Arad, Romania
3
Shen L, Sun MH, Ma WT, Hu QW, Zhao CX, Yang ZR, Jiang CH, Shao ZJ, Liu K. Synergistic driving effects of risk factors on human brucellosis in Datong City, China: A dynamic perspective from spatial heterogeneity. The Science of the Total Environment 2023; 894:164948. [PMID: 37336414 DOI: 10.1016/j.scitotenv.2023.164948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 06/14/2023] [Accepted: 06/15/2023] [Indexed: 06/21/2023]
Abstract
Brucellosis is a highly contagious zoonotic and systemic infectious disease caused by Brucella, which seriously affects public health and socioeconomic development worldwide. In China in particular, accumulating eco-environmental changes and agricultural intensification have increased the expansion of human brucellosis (HB) infection. As a traditional animal husbandry area adjacent to Inner Mongolia, Datong City in northwestern China is characterized by a high HB incidence and has shown obvious variations in the risk pattern of HB infection in recent years. In this study, we built Bayesian spatiotemporal models to detect the transfer of high-risk clusters of HB occurrence in Datong from 2005 to 2020. Geographically and Temporally Weighted Regression and GeoDetector were employed to investigate the synergistic driving effects of multiple potential risk factors. The results confirmed an evident dynamic expansion of HB from the east to the west and south of Datong. The distribution of HB showed a negative correlation with urbanization level, economic development, population density, temperature, precipitation, and wind speed, and a positive correlation with the normalized difference vegetation index and grassland/cropland cover areas. In particular, local animal husbandry and related industries had a large influence on the spatiotemporal distribution of HB. This work strengthens the understanding of how HB spatial heterogeneity is driven by environmental factors and can provide helpful insights for decision-makers to formulate and implement disease control strategies and policies for preventing the further spread of HB.
Affiliation(s)
- Li Shen
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
- Ming-Hao Sun
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
- Wen-Tao Ma
- Department of Infectious Disease Control and Prevention, Datong Center for Disease Prevention and Control, Datong, China
- Qing-Wu Hu
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
- Chen-Xi Zhao
- Department of Epidemiology, Ministry of Education Key Lab of Hazard Assessment and Control in Special Operational Environment, School of Public Health, Air Force Medical University, Xi'an, China
- Zu-Rong Yang
- Department of Epidemiology, Ministry of Education Key Lab of Hazard Assessment and Control in Special Operational Environment, School of Public Health, Air Force Medical University, Xi'an, China
- Cheng-Hao Jiang
- School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China
- Zhong-Jun Shao
- Department of Epidemiology, Ministry of Education Key Lab of Hazard Assessment and Control in Special Operational Environment, School of Public Health, Air Force Medical University, Xi'an, China
- Kun Liu
- Department of Epidemiology, Ministry of Education Key Lab of Hazard Assessment and Control in Special Operational Environment, School of Public Health, Air Force Medical University, Xi'an, China
4
Zhao D. Research of Combined ES-BP Model in Predicting Syphilis Incidence 1982-2020 in Mainland China. Iranian Journal of Public Health 2023; 52:2063-2072. [PMID: 37899935 PMCID: PMC10612558 DOI: 10.18502/ijph.v52i10.13844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Accepted: 12/18/2022] [Indexed: 10/31/2023]
Abstract
Background: Syphilis remains a major public health concern in China. We aimed to construct an optimum model to forecast syphilis epidemic trends and provide effective precautionary measures for prevention and control. Methods: Data on the incidence of syphilis between 1982 and 2020 were obtained from the China Health Statistics Yearbook. An exponential smoothing model (ES model) and a BP neural network model were constructed, and on this basis, the ES-BP combination model was created. Prediction performance was assessed by comparing the MAE (mean absolute error), MSE (mean squared error), MAPE (mean absolute percentage error), and RMSE (root mean square error). Results: The optimum ES model was Brown's linear trend model, which had the lowest MAE and MAPE values, and its residual was a white noise sequence (P=0.359). The optimum BP neural network model had three layers, with the number of nodes in the input, hidden, and output layers set to 5, 11, and 1, respectively; the mean values of MAE, MSE, and RMSE by five-fold cross-validation were 1.519, 6.894, and 1.969, respectively. The ES-BP combination model had three layers, with 1, 4, and 1 nodes. The lowest mean values of MAE, MSE, and RMSE obtained by five-fold cross-validation were 1.265, 5.739, and 2.105, respectively. Conclusion: The ES, BP neural network, and ES-BP combination models can all be used to predict syphilis incidence, but the prediction performance of the ES-BP combination model is better than that of a basic ES model and a basic BP neural network model.
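The four error measures named in this abstract can be written out in a few lines. The sketch below is illustrative only: the function name `error_metrics` and the sample values are hypothetical and are not taken from the study. Note also that when such measures are averaged over cross-validation folds, the mean RMSE is generally not the square root of the mean MSE.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined if any actual value is zero (division by zero).
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical observed and forecast values, for illustration only.
observed = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 44.0]
metrics = error_metrics(observed, forecast)
```

MAE and RMSE are in the units of the series, whereas MAPE is scale-free, which is why abstracts that compare models across different series tend to quote it alongside the others.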
Affiliation(s)
- Daren Zhao
- Department of Medical Administration, Sichuan Provincial Orthopedics Hospital, Chengdu, Sichuan, P.R. China
5
Wang LJ, Zhai PQ, Xue LL, Shi CY, Zhang Q, Zhang H. Machine learning-based identification of symptomatic carotid atherosclerotic plaques with dual-energy computed tomography angiography. J Stroke Cerebrovasc Dis 2023; 32:107209. [PMID: 37290153 DOI: 10.1016/j.jstrokecerebrovasdis.2023.107209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 05/30/2023] [Accepted: 06/02/2023] [Indexed: 06/10/2023] Open
Abstract
OBJECTIVE: This study aimed to develop and validate a machine learning model incorporating both dual-energy computed tomography (DECT) angiography quantitative parameters and clinically relevant risk factors for the identification of symptomatic carotid plaques to prevent acute cerebrovascular events. METHODS: The data of 180 patients with carotid atherosclerotic plaques were analysed from January 2017 to December 2021; 110 patients (64.03±9.58 years old, 20 women, 90 men) were allocated to the symptomatic group, and 70 patients (64.70±9.89 years old, 50 women, 20 men) were allocated to the asymptomatic group. Overall, five machine learning models using the XGBoost algorithm, based on different CT and clinical features, were developed in the training cohort. The performances of all five models were assessed in the testing cohort using receiver operating characteristic curves, accuracy, recall rate, and F1 score. RESULTS: The Shapley additive explanation (SHAP) value ranking showed fat fraction (FF) as the highest among all CT and clinical features and normalised iodine density (NID) as the 10th. The model based on the top 10 features from the SHAP measurement showed optimal performance (area under the curve [AUC] .885, accuracy .833, recall rate .933, F1 score .861), compared with the other four models based on conventional CT features (AUC .588, accuracy .593, recall rate .767, F1 score .676), DECT features (AUC .685, accuracy .648, recall rate .667, F1 score .678), conventional CT and DECT features (AUC .819, accuracy .740, recall rate .867, F1 score .788), and all CT and clinical features (AUC .878, accuracy .833, recall rate .867, F1 score .852). CONCLUSION: FF and NID can serve as useful imaging markers of symptomatic carotid plaques. This tree-based machine learning model incorporating both DECT and clinical features could potentially provide a non-invasive method for the identification of symptomatic carotid plaques to guide clinical treatment strategies.
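Accuracy, recall rate, and F1 score, as reported above, all derive from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name `classification_metrics` and the counts are hypothetical and do not come from the study's testing cohort.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall, precision and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0 or (tp + fn) == 0 or (tp + fp) == 0:
        raise ValueError("counts must include actual and predicted positives")
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn)        # share of true positives that were detected
    precision = tp / (tp + fp)     # share of positive calls that were correct
    # F1 is the harmonic mean of precision and recall (0 if both are 0).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}


# Hypothetical counts, for illustration only.
result = classification_metrics(tp=9, fp=3, fn=1, tn=7)
```

Because F1 ignores true negatives, a model can pair a high recall with a more modest accuracy when it over-calls the positive class, which is worth keeping in mind when reading the figures side by side.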
Affiliation(s)
- Ling-Jie Wang
- Department of Radiology, First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province 030001, PR China
- Pei-Qing Zhai
- Department of Radiology, First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province 030001, PR China
- Li-Li Xue
- Department of Radiology, First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province 030001, PR China
- Cai-Yun Shi
- Department of Radiology, First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province 030001, PR China
- Qian Zhang
- Department of Radiology, First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province 030001, PR China
- Hua Zhang
- Department of Radiology, First Hospital of Shanxi Medical University, Taiyuan, Shanxi Province 030001, PR China
6
Xie Y, Shi D, Wang X, Guan Y, Wu W, Wang Y. Prevalence trend and burden of neglected parasitic diseases in China from 1990 to 2019: findings from global burden of disease study. Front Public Health 2023; 11:1077723. [PMID: 37293619 PMCID: PMC10244527 DOI: 10.3389/fpubh.2023.1077723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 05/08/2023] [Indexed: 06/10/2023] Open
Abstract
Objective: This study sought to investigate the parasitic diseases among the neglected tropical diseases defined by the World Health Organization, based on the Global Burden of Disease Study (GBD) database. We analyzed the prevalence and burden of these diseases in China from 1990 to 2019 to provide valuable information for formulating more effective measures for their management and prevention. Methods: Data on the prevalence and burden of neglected parasitic diseases in China from 1990 to 2019 were extracted from the Global Health Data Exchange (GHDx) database, including the absolute number of prevalent cases, age-standardized prevalence rate, disability-adjusted life years (DALYs) and age-standardized DALY rate. Descriptive analysis was used to analyze the changes in prevalence and burden and the sex and age distribution of various parasitic diseases from 1990 to 2019. A time series model [Auto-Regressive Integrated Moving Average (ARIMA)] was used to predict the DALYs of neglected parasitic diseases in China from 2020 to 2030. Results: In 2019, the number of prevalent cases of neglected parasitic diseases in China was 152,518,062, the age-standardized prevalence was 11614.1 (95% uncertainty interval (UI) 8758.5-15244.5), the DALYs were 955,722, and the age-standardized DALY rate was 54.9 (95% UI 26.0-101.8). Among these, the age-standardized prevalence of soil-derived helminthiasis was the highest (9370.2/100,000), followed by food-borne trematodiases (1502.3/100,000) and schistosomiasis (707.1/100,000). The highest age-standardized DALY rate was for food-borne trematodiases (36.0/100,000), followed by cysticercosis (7.9/100,000) and soil-derived helminthiasis (5.6/100,000). Higher prevalence and disease burden were observed in men and in older age groups. From 1990 to 2019, the number of prevalent cases of neglected parasitic diseases in China decreased by 30.4%, resulting in a decline in DALYs of 27.3%. The age-standardized DALY rates of most diseases decreased, especially for soil-derived helminthiasis, schistosomiasis and food-borne trematodiases. The ARIMA prediction model showed that the disease burden of echinococcosis and cysticercosis exhibited an increasing trend, highlighting the need for further prevention and control. Conclusion: Although the prevalence and disease burden of neglected parasitic diseases in China have decreased, many issues remain to be addressed. More efforts should be undertaken to improve the prevention and control strategies for different parasitic diseases. The government should prioritize multisectoral integrated control and surveillance measures for diseases with a high burden. In addition, the older adult population and men warrant more attention.
Affiliation(s)
- Ying Wang
- National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention (Chinese Center for Tropical Diseases Research), NHC Key Laboratory of Parasite and Vector Biology, WHO Collaborating Centre for Tropical Diseases, National Center for International Research on Tropical Diseases, Shanghai, China
7
Gong W, Sun P, Zhai C, Yuan J, Chen Y, Chen Q, Zhao Y. Accessibility of the three-year comprehensive prevention and control of brucellosis in Ningxia: a mathematical modeling study. BMC Infect Dis 2023; 23:292. [PMID: 37147629 PMCID: PMC10161990 DOI: 10.1186/s12879-023-08270-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/25/2022] [Accepted: 04/20/2023] [Indexed: 05/07/2023] Open
Abstract
BACKGROUND: Brucellosis is a chronic zoonotic disease, and Ningxia is one of the high-prevalence regions in China. To mitigate the spread of brucellosis, the government of Ningxia has implemented a comprehensive prevention and control plan (2022-2024). It is meaningful to quantitatively evaluate the accessibility of this strategy. METHODS: Based on the transmission characteristics of brucellosis in Ningxia, we propose a sheep-human-environment dynamical model that incorporates the stage structure of sheep and indirect environmental transmission. We first calculate the basic reproduction number [Formula: see text] and use the model to fit the data on human brucellosis. Then, three widely applied brucellosis control strategies in Ningxia, namely slaughtering of sick sheep, health education for high-risk practitioners, and immunization of adult sheep, are evaluated. RESULTS: The basic reproduction number is calculated as [Formula: see text], indicating that human brucellosis will persist. The model aligns well with the human brucellosis data. The quantitative accessibility evaluation shows that the current brucellosis control strategy may not reach the goal on time. The goal of the "Ningxia Brucellosis Prevention and Control Special Three-Year Action Implementation Plan (2022-2024)" will be achieved in 2024 when increasing the slaughtering rate [Formula: see text] by 30[Formula: see text], increasing health education to reduce [Formula: see text] to 50[Formula: see text], and increasing the immunization rate of adult sheep [Formula: see text] by 40[Formula: see text]. CONCLUSION: The results demonstrate that comprehensive control measures are the most effective for brucellosis control, and that it is necessary to further strengthen the multi-sectoral joint mechanism and adopt integrated measures to prevent and control brucellosis. These results can provide a reliable quantitative basis for further optimizing the prevention and control strategy for brucellosis in Ningxia.
Affiliation(s)
- Wei Gong
- School of Science, Ningxia Medical University, 750001, Yinchuan, China
- Peng Sun
- Science and Technology Center, Ningxia Medical University, 750001, Yinchuan, China
- Changsheng Zhai
- School of Mathematics and Computer Science, Ningxia Normal University, 756000, Guyuan, China
- Jing Yuan
- School of Science, Ningxia Medical University, 750001, Yinchuan, China
- Yaogeng Chen
- School of Science, Ningxia Medical University, 750001, Yinchuan, China
- Qun Chen
- School of Science, Ningxia Medical University, 750001, Yinchuan, China
- Yu Zhao
- School of Public Health and Management, Ningxia Medical University, 750001, Yinchuan, China
- Key Laboratory of Environmental Factors and Chronic Disease Control, 750001, Yinchuan, China
8
Shiratori Y, Hutfless S, Rateb G, Fukuda K. The burden of gastrointestinal diseases in Japan, 1990–2019, and projections for 2035. JGH Open 2023; 7:221-227. [PMID: 36968565 PMCID: PMC10037033 DOI: 10.1002/jgh3.12883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 02/11/2023] [Accepted: 02/16/2023] [Indexed: 03/09/2023]
Abstract
Background and Aim: Disease burden estimation allows clinicians and policymakers to plan for future healthcare needs. Although advances have been made in gastroenterology, disease burden assessment remains important because Japan has an aging population. We aimed to report the gastrointestinal disease burden in Japan since 1990 and project changes through to 2035. Methods: This descriptive study examined the crude and age-standardized rates of prevalence, mortality, and disability-adjusted life years (DALYs) of 22 gastrointestinal diseases between 1990 and 2019. We used data from the Global Burden of Disease study 2019. We calculated the expected burden of gastrointestinal diseases by 2035 using an autoregressive integrated moving average model. Results: Since 1990, cancer has accounted for most gastrointestinal disease-related mortality and DALYs in Japan (77.1% and 71.2% in 1990, 79.2% and 73.7% in 2019, respectively). Although cancer-associated age-standardized mortality rates and DALYs have shown a decreasing trend, the crude rates have increased, suggesting that an aging society has a significant impact on the disease burden in Japan. Therefore, the overall gastrointestinal disease burden is expected to increase by 2035. Noncancerous chronic diseases with a high burden included cirrhosis, biliary disease, ileus, gastroesophageal reflux disorder, hernia, inflammatory bowel disease, enteric infections, and vascular intestinal disorders. In cirrhosis, the DALYs for hepatitis C decreased and the prevalence of non-alcoholic steatohepatitis increased. Conclusion: In the super-aging Japanese society, the burden of gastrointestinal diseases is expected to increase in the coming years. Colorectal, gastric, pancreatic, and liver cancers are the focus of early detection and treatment.
Affiliation(s)
- Yasutoshi Shiratori
- Department of Gastroenterology, St. Luke's International Hospital, Tokyo, Japan
- Department of Gastroenterology, Sherbrooke University Hospital, Quebec, Canada
- Susan Hutfless
- Departments of Epidemiology and Gastroenterology, Johns Hopkins University, Baltimore, Maryland, USA
- George Rateb
- Department of Gastroenterology, Sherbrooke University Hospital, Quebec, Canada
- Katsuyuki Fukuda
- Department of Gastroenterology, St. Luke's International Hospital, Tokyo, Japan
9
Noorunnahar M, Chowdhury AH, Mila FA. A tree based eXtreme Gradient Boosting (XGBoost) machine learning model to forecast the annual rice production in Bangladesh. PLoS One 2023; 18:e0283452. [PMID: 36972270 PMCID: PMC10042373 DOI: 10.1371/journal.pone.0283452] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/26/2022] [Accepted: 03/01/2023] [Indexed: 03/29/2023] Open
Abstract
In this study, we attempt to forecast annual rice production in Bangladesh (1961-2020) using both the Autoregressive Integrated Moving Average (ARIMA) and the eXtreme Gradient Boosting (XGBoost) methods and compare their respective performances. An ARIMA (0, 1, 1) model with drift was chosen on the basis of the lowest corrected Akaike information criterion (AICc) value and was found to be significant. The drift parameter value shows that rice production trends upward. The XGBoost model for time series data was developed by repeatedly adjusting the tuning parameters until the best result was obtained. Four prominent error measures, mean absolute error (MAE), mean percentage error (MPE), root mean square error (RMSE), and mean absolute percentage error (MAPE), were used to assess the predictive performance of each model. We found that the error measures of the XGBoost model in the test set were lower than those of the ARIMA model. In particular, the test-set MAPE of the XGBoost model (5.38%) was lower than that of the ARIMA model (7.23%), indicating that XGBoost performs better than ARIMA at predicting the annual rice production in Bangladesh. Therefore, the study forecasted the annual rice production for the next 10 years using the XGBoost model. According to our predictions, the annual rice production in Bangladesh will vary from 57,850,318 tons in 2021 to 82,256,944 tons in 2030. The forecast indicates that the amount of rice produced annually in Bangladesh will increase in the years to come.
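For readers unfamiliar with the notation, an ARIMA (0, 1, 1) model with drift can be written as below. The symbols are generic textbook notation assumed here, not values from the study: $c$ is the drift, $\theta$ the moving-average coefficient, and $\varepsilon_t$ the white-noise error. The sign in front of $\theta$ follows the "plus" convention; some texts use a minus sign instead.

```latex
% ARIMA(0,1,1) with drift: the first difference is a constant plus an MA(1) term
y_t = c + y_{t-1} + \varepsilon_t + \theta\,\varepsilon_{t-1}

% h-step-ahead point forecast from the last observation y_T (h >= 1),
% with \hat{\varepsilon}_T the last in-sample residual
\hat{y}_{T+h \mid T} = y_T + h\,c + \theta\,\hat{\varepsilon}_T
```

The forecast therefore rises linearly by $c$ per period, which is why a positive drift estimate is read as an upward trend in production.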
Affiliation(s)
- Mst Noorunnahar
- Department of Statistics, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur, Bangladesh
- Farhana Arefeen Mila
- Department of Agribusiness, Bangabandhu Sheikh Mujibur Rahman Agricultural University, Gazipur, Bangladesh
10
Zhao D, Zhang H. The research on TBATS and ELM models for prediction of human brucellosis cases in mainland China: a time series study. BMC Infect Dis 2022; 22:934. [PMID: 36510150 PMCID: PMC9746081 DOI: 10.1186/s12879-022-07919-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Accepted: 12/05/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND: Human brucellosis is a serious public health concern in China. The objective of this study is to develop a suitable model for forecasting human brucellosis cases in mainland China. METHODS: Data on monthly human brucellosis cases from January 2012 to December 2021 in 31 provinces and municipalities in mainland China were obtained from the National Health Commission of the People's Republic of China website. The TBATS and ELM models were constructed. The MAE, MSE, MAPE, and RMSE were calculated to evaluate the prediction performance of the two models. RESULTS: The optimal TBATS model was TBATS (1, {0,0}, -, {<12,4>}) and the lowest AIC value was 1854.703. In the optimal TBATS model, {0,0} represents the ARIMA (0,0) model, {<12,4>} gives the seasonal period and the corresponding number of Fourier terms, respectively, and the Box-Cox transformation parameter ω is 1. The optimal ELM model hidden layer number was 33 and the R-squared value was 0.89. The ELM model provided lower values of MAE, MSE, MAPE, and RMSE for both fitting and forecasting. CONCLUSIONS: The results suggest that the ELM model outperforms the TBATS model in forecasting human brucellosis between January 2012 and December 2021 in mainland China. Forecasts of the ELM model can help provide early warnings and more effective prevention and control measures for human brucellosis in mainland China.
Affiliation(s)
- Daren Zhao
- Department of Medical Administration, Sichuan Provincial Orthopedics Hospital, Chengdu, Sichuan, China
- Huiwu Zhang
- Department of Medical Administration, Sichuan Provincial Orthopedics Hospital, Chengdu, Sichuan, China
11
Cui Y, Wang Q, Shi X, Ye Q, Lei M, Wang B. Development of a web-based calculator to predict three-month mortality among patients with bone metastases from cancer of unknown primary: An internally and externally validated study using machine-learning techniques. Front Oncol 2022; 12:1095059. [PMID: 36568149 PMCID: PMC9768185 DOI: 10.3389/fonc.2022.1095059] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 11/10/2022] [Accepted: 11/25/2022] [Indexed: 12/12/2022] Open
Abstract
Background Individualized therapeutic strategies can be carried out under the guidance of expected lifespan, hence survival prediction is important. Nonetheless, reliable survival estimation in individuals with bone metastases from cancer of unknown primary (CUP) is still scarce. The objective of the study is to construct a model as well as a web-based calculator to predict three-month mortality among bone metastasis patients with CUP using machine learning-based techniques. Methods This study enrolled 1010 patients from a large oncological database, the Surveillance, Epidemiology, and End Results (SEER) database, in the United States between 2010 and 2018. The entire patient population was classified into two cohorts at random: a training cohort (n=600, 60%) and a validation cohort (410, 40%). Patients from the validation cohort were used to validate models after they had been developed using the four machine learning approaches of random forest, gradient boosting machine, decision tree, and eXGBoosting machine on patients from the training cohort. In addition, 101 patients from two large teaching hospital were served as an external validation cohort. To evaluate each model's ability to predict the outcome, prediction measures such as area under the receiver operating characteristic (AUROC) curves, accuracy, and Youden index were generated. The study's risk stratification was done using the best cut-off value. The Streamlit software was used to establish a web-based calculator. Results The three-month mortality was 72.38% (731/1010) in the entire cohort. The multivariate analysis revealed that older age (P=0.031), lung metastasis (P=0.012), and liver metastasis (P=0.008) were risk contributors for three-month mortality, while radiation (P=0.002) and chemotherapy (P<0.001) were protective factors. 
The random forest model showed the highest area under curve (AUC) value (0.796, 95% CI: 0.746-0.847), the second-highest precision (0.876) and accuracy (0.778), and the highest Youden index (1.486), in comparison to the other three machine learning approaches. The AUC value was 0.748 (95% CI: 0.653-0.843) and the accuracy was 0.745, according to the external validation cohort. Based on the random forest model, a web calculator was established: https://starxueshu-codeok-main-8jv2ws.streamlitapp.com/. When compared to patients in the low-risk groups, patients in the high-risk groups had a 1.99 times higher chance of dying within three months in the internal validation cohort and a 2.37 times higher chance in the external validation cohort (Both P<0.001). Conclusions The random forest model has promising performance with favorable discrimination and calibration. This study suggests a web-based calculator based on the random forest model to estimate the three-month mortality among bone metastases from CUP, and it may be a helpful tool to direct clinical decision-making, inform patients about their prognosis, and facilitate therapeutic communication between patients and physicians.
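The abstract above stratifies risk with a "best cut-off value" and reports a Youden index. As a minimal sketch, assuming the standard definition J = sensitivity + specificity - 1 and a simple search over observed scores, the cut-off could be chosen as follows. The function name and the toy data are hypothetical and are not taken from the study. Under this standard definition J cannot exceed 1, so the value of 1.486 reported above presumably follows a different convention that the abstract does not state.

```python
def youden_best_cutoff(y_true, scores):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    Illustrative helper; not the study's code.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_threshold, best_j = None, float("-inf")
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        true_neg = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # keep the first threshold that reaches the maximum
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy data (hypothetical): 1 = died within three months, 0 = survived.
y_true = [0, 0, 0, 1, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
cutoff, j_stat = youden_best_cutoff(y_true, scores)
```

Patients with a predicted risk at or above `cutoff` would then fall into the high-risk group, and the rest into the low-risk group.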
Affiliation(s)
- Yunpeng Cui: Department of Orthopedic Surgery, Peking University First Hospital, Beijing, China
- Qiwei Wang: Department of Orthopedic Surgery, Peking University First Hospital, Beijing, China
- Xuedong Shi (corresponding author): Department of Orthopedic Surgery, Peking University First Hospital, Beijing, China
- Qianwen Ye: Department of Oncology, Hainan Hospital of PLA General Hospital, Sanya, China
- Mingxing Lei (corresponding author): Department of Orthopedic Surgery, Hainan Hospital of PLA General Hospital, Sanya, China; Chinese PLA Medical School, Beijing, China
- Bailin Wang (corresponding author): Department of Thoracic Surgery, Hainan Hospital of PLA General Hospital, Sanya, China
|
12
|
Lou HR, Wang X, Gao Y, Zeng Q. Comparison of ARIMA model, DNN model and LSTM model in predicting disease burden of occupational pneumoconiosis in Tianjin, China. BMC Public Health 2022; 22:2167. [PMID: 36434563 PMCID: PMC9694549 DOI: 10.1186/s12889-022-14642-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/02/2022] [Accepted: 11/16/2022] [Indexed: 11/27/2022] Open
Abstract
BACKGROUND This study aims to explore an appropriate model for predicting the disease burden of pneumoconiosis in Tianjin by comparing the prediction effects of the Autoregressive Integrated Moving Average (ARIMA) model, the Deep Neural Networks (DNN) model and the multivariate Long Short-Term Memory Neural Network (LSTM) model. METHODS Disability adjusted life year (DALY) was used to evaluate the disease burden of occupational pneumoconiosis. The ARIMA model, DNN model and multivariate LSTM model were used to establish prediction models. Three performance evaluation metrics, Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE), were used to compare the prediction effects of the three models. RESULTS From 1990 to 2021, there were 10,694 cases of pneumoconiosis patients in Tianjin, resulting in a total of 112,725.52 person-years of DALY. During this period, the annual DALY showed a fluctuating trend, but it had a strong correlation with the number of pneumoconiosis patients, the average age of onset, the average age of receiving dust and the gross industrial product, and had a significant nonlinear relationship with them. The comparison of prediction results showed that the performance of the multivariate LSTM model and the DNN model was much better than that of the traditional ARIMA model. Compared with the DNN model, the multivariate LSTM model performed better on the training set, showing lower RMSE (42.30 vs. 380.96), MAE (29.53 vs. 231.20) and MAPE (1.63% vs. 2.93%), but was less stable than the DNN model on the test set, showing higher RMSE (1309.14 vs. 656.44), MAE (886.98 vs. 594.47) and MAPE (36.86% vs. 22.43%). CONCLUSION The machine learning techniques of DNN and LSTM are innovative methods to accurately and efficiently predict the burden of pneumoconiosis with the simplest data. They have great application prospects in the monitoring and early warning system of occupational disease burden.
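The three error metrics named in this abstract (RMSE, MAE, MAPE) have standard definitions. The following is a minimal sketch of them in plain Python; the function names and the example numbers are illustrative assumptions, not values from the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; undefined if any actual value is 0."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical annual DALY values and model predictions.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
metrics = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Because RMSE squares each error before averaging, a few large misses on a test set inflate it more than they inflate MAE, which is one reason abstracts of this kind report all three metrics side by side.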
Affiliation(s)
- He-Ren Lou: Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China; School of Public Health, Tianjin Medical University, Tianjin, 300070 China
- Xin Wang: Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China
- Ya Gao: Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China
- Qiang Zeng: Tianjin Center for Disease Control and Prevention, Tianjin, 300011 China
|
13
|
Kim K, Lee MK, Shin HK, Lee H, Kim B, Kang S. Development and application of survey-based artificial intelligence for clinical decision support in managing infectious diseases: A pilot study on a hospital in central Vietnam. Front Public Health 2022; 10:1023098. [PMID: 36438286 PMCID: PMC9683382 DOI: 10.3389/fpubh.2022.1023098] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Accepted: 10/17/2022] [Indexed: 11/11/2022] Open
Abstract
Introduction In this study, we developed a simplified artificial intelligence to support the clinical decision-making of medical personnel in a resource-limited setting. Methods We selected seven infectious disease categories that impose a heavy disease burden in the central Vietnam region: mosquito-borne disease, acute gastroenteritis, respiratory tract infection, pulmonary tuberculosis, sepsis, primary nervous system infection, and viral hepatitis. We developed a set of questionnaires to collect information on the current symptoms and history of patients suspected to have infectious diseases. We used data collected from 1,129 patients to develop and test a diagnostic model. We used XGBoost, LightGBM, and CatBoost algorithms to create artificial intelligence for clinical decision support. We used a 4-fold cross-validation method to validate the artificial intelligence model. After 4-fold cross-validation, we tested artificial intelligence models on a separate test dataset and estimated diagnostic accuracy for each model. Results We recruited 1,129 patients for final analyses. Artificial intelligence developed by the CatBoost algorithm showed the best performance, with 87.61% accuracy and an F1-score of 87.71. The F1-score of the CatBoost model by disease entity ranged from 0.80 to 0.97. Diagnostic accuracy was the lowest for sepsis and the highest for central nervous system infection. Conclusion Simplified artificial intelligence could be helpful in clinical decision support in settings with limited resources.
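The 4-fold cross-validation mentioned in this abstract can be sketched with a small index splitter. This is an illustrative sketch only: the function name, the shuffling seed, and the sample count are assumptions, and the abstract does not say how the authors partitioned their data or how large the separate test dataset was.

```python
import random


def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation.

    Indices are shuffled once, then dealt into k folds whose sizes differ by at most one.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        validation = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, validation


# Hypothetical development set of 10 samples split into 4 folds.
splits = list(k_fold_indices(10, k=4))
fold_sizes = [len(validation) for _, validation in splits]
```

Each sample lands in exactly one validation fold, so every model is scored on data it did not see during fitting. A final held-out test set, as described in the abstract, would be kept outside this loop entirely.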
Affiliation(s)
- Kwanghyun Kim (corresponding author): Department of Preventive Medicine, Yonsei University College of Medicine, Seoul, South Korea; Department of Public Health, Graduate School, Yonsei University, Seoul, South Korea
- Myung-ken Lee: Graduate School of Public Health, Kosin University College of Medicine, Busan, South Korea
- Hyun Kyung Shin: Acryl, Seoul, South Korea; FineHealthcare, Seoul, South Korea
- Sunjoo Kang: Graduate School of Public Health, Yonsei University, Seoul, South Korea
|
14
|
Prediction of global omicron pandemic using ARIMA, MLR, and Prophet models. Sci Rep 2022; 12:18138. [PMID: 36307471 PMCID: PMC9614203 DOI: 10.1038/s41598-022-23154-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/07/2022] [Accepted: 10/25/2022] [Indexed: 12/30/2022] Open
Abstract
Globally, since the outbreak of the Omicron variant in November 2021, the number of confirmed cases of COVID-19 has continued to increase, posing a tremendous challenge to the prevention and control of this infectious disease in many countries. The global daily confirmed cases of COVID-19 between November 1, 2021, and February 17, 2022, were used as a database for modeling, and the ARIMA, MLR, and Prophet models were developed and compared. The prediction performance was evaluated using mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The study showed that ARIMA (7, 1, 0) was the optimum model, and the MAE, MAPE, and RMSE values were lower than those of the MLR and Prophet models in terms of fitting performance and forecasting performance. The ARIMA model had superior prediction performance compared to the MLR and Prophet models. In real-world research, an appropriate prediction model should be selected based on the characteristics of the data and the sample size, which is essential for obtaining more accurate predictions of infectious disease incidence.
|
15
|
Hai Y, Leng G. A more than four-fold sex-specific difference of autism spectrum disorders and the possible contribution of pesticide usage in China 1990-2030. Front Public Health 2022; 10:945172. [PMID: 36187693 PMCID: PMC9525129 DOI: 10.3389/fpubh.2022.945172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2022] [Accepted: 08/24/2022] [Indexed: 01/21/2023] Open
Abstract
Autism spectrum disorders (ASDs) are prevalent in children and adolescents and disproportionately affect males, and the main contributing factors underlying male vulnerability remain largely unknown. Pesticide use is widely reported to be associated with ASD risk. The incidence of pesticide poisoning is markedly higher in rural than in urban areas, the prevalence of ASDs was likewise higher in rural than in urban areas, and the rate of pesticide poisoning was significantly higher in males than in females. Thus, pesticide usage may be an important contributing factor to sex-specific differences in ASD incidence. ASD burden was analyzed by using the data of ASD number, ASD rate (ASD cases per 100,000 persons) and disability-adjusted life years (DALYs) from 1990 to 2019. The changes from 1990 to 2030 were predicted by time series forecasting with an autoregressive integrated moving average (ARIMA) model, selected on the basis of small values of the Akaike information criterion and Bayesian information criterion. Finally, the relationship between ASD rate and pesticide usage risk index (PURI) was analyzed via Pearson's correlation coefficient. ASD number, ASD rate and DALYs are predicted to decrease by 45.5% ± 8.2% (t = 9.100 and p = 0.0119), 56.6% ± 10.2% (t = 9.111 and p = 0.0118), and 44.9% ± 7.0% (t = 20.90 and p = 0.0023) from 1990 to 2030 in China. PURI has a strong relationship with ASD rate (rho = 0.953 to 0.988 and p < 0.0001). Pesticide poisoning incidence in males is up to 2-fold higher than that in females. ASD number and DALYs in males are 4-fold higher than those in females. Furthermore, there is growing evidence that males are more susceptible than females to pesticides, with sex differences in neurotoxicogenetics. Therefore, pesticide poisoning may be a contributing factor to the sex differences in ASD. Further work is needed to confirm this.
Affiliation(s)
- Yang Hai (corresponding author): International Education College, Harbin Medical University, Harbin, China
- Guodong Leng: College of Business Administration, Shenyang Pharmaceutical University, Shenyang, China
|
16
|
Rahman MS, Chowdhury AH. A data-driven eXtreme gradient boosting machine learning model to predict COVID-19 transmission with meteorological drivers. PLoS One 2022; 17:e0273319. [PMID: 36099253 PMCID: PMC9469970 DOI: 10.1371/journal.pone.0273319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2022] [Accepted: 08/06/2022] [Indexed: 11/22/2022] Open
Abstract
The COVID-19 pandemic has become a major global public health concern. Examining the meteorological risk factors and accurately predicting the incidence of the COVID-19 pandemic is an extremely important challenge. Therefore, in this study, we analyzed the relationship between meteorological factors and COVID-19 transmission in SAARC countries. We also compared the predictive accuracy of Autoregressive Integrated Moving Average with exogenous variables (ARIMAX) and eXtreme Gradient Boosting (XGBoost) methods for precise modelling of COVID-19 incidence. We compiled a daily dataset including confirmed COVID-19 case counts, minimum and maximum temperature (°C), relative humidity (%), surface pressure (kPa), precipitation (mm/day) and maximum wind speed (m/s) from the onset of the disease to January 29, 2022, in each country. The data were divided into training and test sets. The training data were used to fit an ARIMAX model for examining significant meteorological risk factors. All significant factors were then used as covariates in ARIMAX and XGBoost models to predict the COVID-19 confirmed cases. We found that maximum temperature had a positive impact on COVID-19 transmission in Afghanistan (β = 11.91, 95% CI: 4.77, 19.05) and India (β = 0.18, 95% CI: 0.01, 0.35). Surface pressure had a positive influence in Pakistan (β = 25.77, 95% CI: 7.85, 43.69) and Sri Lanka (β = 411.63, 95% CI: 49.04, 774.23). We also found that the XGBoost model can help improve prediction of COVID-19 cases in SAARC countries over the ARIMAX model. The study findings will help the scientific communities and policymakers to establish a more accurate early warning system to control the spread of the pandemic.
Affiliation(s)
- Md. Siddikur Rahman (corresponding author): Department of Statistics, Begum Rokeya University, Rangpur, Rangpur, Bangladesh
|
17
|
Dengue Incidence Trends and Its Burden in Major Endemic Regions from 1990 to 2019. Trop Med Infect Dis 2022; 7:tropicalmed7080180. [PMID: 36006272 PMCID: PMC9416661 DOI: 10.3390/tropicalmed7080180] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 07/09/2022] [Revised: 07/31/2022] [Accepted: 08/09/2022] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Dengue has become one of the major vector-borne diseases and an important public health concern. We aimed to estimate the disease burden of dengue in major endemic regions from 1990 to 2019, and explore the impact pattern of the socioeconomic factors on the burden of dengue based on the global burden of diseases, injuries, and risk factors study 2019 (GBD 2019). METHODS Using the analytical strategies and data from the GBD 2019, we described the incidence and disability-adjusted life years (DALYs) of dengue in major endemic regions from 1990 to 2019. Furthermore, we estimated the correlation between dengue burden and socioeconomic factors, and then established an autoregressive integrated moving average (ARIMA) model to predict the epidemic trends of dengue in endemic regions. All estimates were presented as numbers and age-standardized rates (ASR) per 100,000 population, with uncertainty intervals (UIs). The ASRs of dengue incidence were compared geographically and five regions were stratified by a sociodemographic index (SDI). RESULTS A significant rise was observed on a global scale between 1990 and 2019, with the overall age-standardized rate (ASR) increasing from 557.15 (95% UI 243.32-1212.53) per 100,000 in 1990 to 740.4 (95% UI 478.2-1323.1) per 100,000 in 2019. In 2019, the Oceania region had the highest age-standardized incidence rates per 100,000 population (3173.48 (95% UI 762.33-6161.18)), followed by the South Asia region (1740.79 (95% UI 660.93-4287.12)), and then the Southeast Asia region (1153.57 (95% UI 1049.49-1281.59)). In Oceania, South Asia, and Southeast Asia, increasing trends were found in the burden of dengue fever measured by ASRs of DALY, which were consistent with ASRs of dengue incidence at the national level. Most of the countries with the heaviest burden of dengue fever were in low- and medium-SDI regions. 
However, the burden in high-middle and high-SDI countries is relatively low, especially the Solomon Islands and Tonga in Oceania, the Maldives in South Asia and Indonesia in Southeast Asia. The age distribution results of the incidence rate and disease burden of dengue fever of major endemic regions showed that the higher risk and disease burden are mainly concentrated in people under 14 or over 70 years old. The prediction by ARIMA showed that the risk of dengue fever in South and Southeast Asia is on the rise, and further prevention and control is warranted. CONCLUSIONS In view of the rapid population growth and urbanization in many dengue-endemic countries, our research results are of great significance for presenting the future trend in dengue fever. Policy makers are advised to pay specific attention to the negative impact of urbanization on dengue incidence and to allocate more resources to low-SDI areas and to people under 14 or over 70 years old to reduce the burden of dengue fever.
|
18
|
Prediction Model of Postoperative Severe Hypocalcemia in Patients with Secondary Hyperparathyroidism Based on Logistic Regression and XGBoost Algorithm. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:8752826. [PMID: 35924110 PMCID: PMC9343187 DOI: 10.1155/2022/8752826] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/15/2022] [Revised: 06/14/2022] [Accepted: 07/07/2022] [Indexed: 11/18/2022]
Abstract
Objective Predictive models were established based on logistic regression and the XGBoost algorithm to investigate the factors related to postoperative hypocalcemia in patients with secondary hyperparathyroidism (SHPT). Methods A total of 60 SHPT patients who underwent parathyroidectomy (PTX) in our hospital were retrospectively enrolled. All patients were randomly divided into a training set (n = 42) and a test set (n = 18). The clinical data of the patients were analyzed, including gender, age, dialysis time, body mass, and several preoperative biochemical indicators. The multivariate logistic regression and XGBoost algorithm models were used to analyze the independent risk factors for severe postoperative hypocalcemia (SH). The forecasting efficiency of the two prediction models was analyzed. Results Multivariate logistic regression analysis showed that body mass (OR = 1.203, P = 0.032), age (OR = 1.214, P = 0.035), preoperative PTH (OR = 1.026, P = 0.043), preoperative Ca (OR = 1.062, P = 0.025), and preoperative ALP (OR = 1.031, P = 0.027) were positively correlated with postoperative SH. The top three important features of the XGBoost algorithm prediction model were preoperative Ca, preoperative PTH, and preoperative ALP. The area under the curve of the logistic regression and XGBoost algorithm models in the test set was 0.734 (95% CI: 0.595-0.872) and 0.827 (95% CI: 0.722-0.932), respectively. Conclusion The predictive models based on logistic regression and the XGBoost algorithm can predict the occurrence of postoperative SH.
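The area under the ROC curve used to compare the two models above equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that pairwise (Mann-Whitney) computation follows; the function name and the toy scores are assumptions for illustration, not the study's data or code.

```python
def auc_pairwise(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    # A positive scored above a negative counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy example (hypothetical): 1 = severe hypocalcemia, 0 = no severe hypocalcemia.
auc = auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With a test set as small as the 18 patients described above, each positive-negative pair carries a large share of the estimate, which is consistent with the wide confidence intervals the abstract reports.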
|
19
|
Sebt MV, Jafari S, Khavaninzadeh M, Shavandi A. Diagnosis of brucellosis disease using data mining: A case study on patients of a hospital in Tehran. J Microbiol Methods 2022; 199:106530. [PMID: 35777597 DOI: 10.1016/j.mimet.2022.106530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2021] [Revised: 06/21/2022] [Accepted: 06/23/2022] [Indexed: 10/17/2022]
Abstract
BACKGROUND Brucellosis is a common zoonotic infection of humans from livestock. This bacterial infection is acquired from infected animals and their products. The pathogen of this disease is a genus of bacilli called Brucella, and no effective vaccine has been discovered yet for the prevention of human brucellosis. OBJECTIVES The present study is mainly conducted to diagnose brucellosis accurately and timely, using Data Mining techniques. Based on the knowledge discovered with Data Mining and opinions of specialist physicians, this study aims to propose instructions for diagnosing brucellosis. MATERIALS AND METHODS The dataset used in this study contains 340 samples and is extracted from the files of patients at Tehran Imam Khomeini Hospital from the years 2010-2020. Attributes of this dataset have been determined based on domain expert opinions, namely specialist physicians. After initial analysis and data pre-processing, various Data Mining techniques have been employed to diagnose brucellosis, including neural networks, Bayesian networks, and decision trees. RESULTS According to the recorded data, 270 people (approximately 79% of samples) had brucellosis. Some clinical symptoms were more prominent among infected patients, including fever, arthritis, tremor, decreased appetite, and nightly perspiration. Among all employed Data Mining techniques in this study, the decision tree with C5.0 pruning algorithm possessed the highest accuracy in diagnosing patients with brucellosis (approximately 99% accuracy). Based on the obtained final model, the most important factors for diagnosing brucellosis are the Wright test, Coombs Wright test, blood culture test, and living place. DISCUSSION AND CONCLUSION According to the results of this study, brucellosis can be diagnosed with a high accuracy using Data Mining techniques. Furthermore, the most significant factors for diagnosing brucellosis disease can be identified by Data Mining. 
Among all investigated techniques in this study, the decision tree with C5.0 pruning algorithm has the highest accuracy in diagnosing brucellosis. Given the decision tree created by the C5.0 algorithm and the opinions of specialist physicians, some instructions are proposed based on a decision-making framework to classify referents into patient and non-patient groups. These instructions can accelerate the diagnosis, reduce therapeutic costs, and shorten the treatment period.
Affiliation(s)
- Mohammad Vahid Sebt: Department of Industrial Engineering, Faculty of Engineering, Kharazmi University, Tehran, Iran
- Sirous Jafari: Department of Infectious Diseases, Imam Khomeini Hospital Complex, Tehran University of Medical Sciences, Tehran, Iran
- Milad Khavaninzadeh: Department of Industrial Engineering, Faculty of Engineering, Kharazmi University, Tehran, Iran
- Ali Shavandi: Department of Industrial Engineering, Faculty of Engineering, Sharif University of Technology, Tehran, Iran
|
20
|
A Study on Customized Prediction of Daily Illness Risk Using Medical and Meteorological Data. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12126060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
This study selected the most common illnesses in children and older adults and aimed to provide a customized degree of daily risk for each illness based on patient data for specific regions and illnesses. Sample medical data of one million people provided by the National Health Insurance Corporation and information regarding the meteorological environment and atmosphere from the Korea Meteorological Administration and a public data portal, accessed through an application programming interface, were collected. Learning and predictions were carried out with machine learning. Models with high R² were selected and tuned to determine the optimal hyperparameters for predicting the degree of daily risk of an illness. Illnesses with an R² value greater than 0.65 were considered significant. For children, these consisted of acute bronchitis, the common cold, rhinitis and tonsillitis, and middle ear inflammation. For older adults, they consisted of high blood pressure and heart disease, the common cold, esophageal inflammation and gastritis, acute bronchitis, eczema and dermatitis, and chronic bronchitis. This study provides the degree of daily risk for the most common illnesses in each age group. Furthermore, the results of this study are expected to raise awareness of illnesses that occur in certain climates and to help prevent them.
|
21
|
Kamana E, Zhao J, Bai D. Predicting the impact of climate change on the re-emergence of malaria cases in China using LSTMSeq2Seq deep learning model: a modelling and prediction analysis study. BMJ Open 2022; 12:e053922. [PMID: 35361642 PMCID: PMC8971767 DOI: 10.1136/bmjopen-2021-053922] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/03/2022] Open
Abstract
OBJECTIVES Malaria is a vector-borne disease that remains a serious public health problem due to its climatic sensitivity. Accurate prediction of malaria re-emergence is very important in taking corresponding effective measures. This study aims to investigate the impact of climatic factors on the re-emergence of malaria in mainland China. DESIGN A modelling study. SETTING AND PARTICIPANTS Monthly malaria cases for four Plasmodium species (P. falciparum, P. malariae, P. vivax and other Plasmodium) and monthly climate data were collected for 31 provinces; malaria cases from 2004 to 2016 were obtained from the Chinese centre for disease control and prevention and climate parameters from China meteorological data service centre. We conducted analyses at the aggregate level, and there was no involvement of confidential information. PRIMARY AND SECONDARY OUTCOME MEASURES The long short-term memory sequence-to-sequence (LSTMSeq2Seq) deep neural network model was used to predict the re-emergence of malaria cases from 2004 to 2016, based on the influence of climatic factors. We trained and tested the extreme gradient boosting (XGBoost), gated recurrent unit, LSTM, LSTMSeq2Seq models using monthly malaria cases and corresponding meteorological data in 31 provinces of China. Then we compared the predictive performance of models using root mean squared error (RMSE) and mean absolute error evaluation measures. RESULTS The proposed LSTMSeq2Seq model reduced the mean RMSE of the predictions by 19.05% to 33.93%, 18.4% to 33.59%, 17.6% to 26.67% and 13.28% to 21.34%, for P. falciparum, P. vivax, P. malariae, and other plasmodia, respectively, as compared with other candidate models. The LSTMSeq2Seq model achieved an average prediction accuracy of 87.3%. CONCLUSIONS The LSTMSeq2Seq model significantly improved the prediction of malaria re-emergence based on the influence of climatic factors. 
Therefore, the LSTMSeq2Seq model can be effectively applied in the malaria re-emergence prediction.
Affiliation(s)
- Eric Kamana: Complexity Science Institute, School of Automation, Qingdao University, Qingdao, China
- Jijun Zhao: Complexity Science Institute, School of Automation, Qingdao University, Qingdao, China
- Di Bai: Complexity Science Institute, School of Automation, Qingdao University, Qingdao, China
|
22
|
Zhao D, Zhang H, Cao Q, Wang Z, He S, Zhou M, Zhang R. The research of ARIMA, GM(1,1), and LSTM models for prediction of TB cases in China. PLoS One 2022; 17:e0262734. [PMID: 35196309 PMCID: PMC8865644 DOI: 10.1371/journal.pone.0262734] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2021] [Accepted: 01/04/2022] [Indexed: 11/25/2022] Open
Abstract
Background and objective Tuberculosis (TB) is a public health problem in China, which not only endangers the population’s health but also affects economic and social development. Accurate prediction analysis is required to give policymakers early warning and to support effective precautionary measures. In this study, ARIMA, GM(1,1), and LSTM models were constructed and compared. The results showed that the LSTM was the optimal model, achieving satisfactory performance for predicting TB cases in mainland China. Methods The data of tuberculosis cases in mainland China were extracted from the National Health Commission of the People’s Republic of China website. According to the TB data characteristics and the sample requirements, we created the ARIMA, GM(1,1), and LSTM models to predict the prevalence trend of TB. The mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE) were applied to evaluate model fitting and prediction accuracy. Results There were 3,021,995 tuberculosis cases in mainland China from January 2018 to December 2020, and overall TB cases in mainland China showed a downward trend. We established ARIMA, GM(1,1), and LSTM models. The optimal ARIMA model was ARIMA(0,1,0) × (0,1,0)_12. The equation for the GM(1,1) model was X(k+1) = -10057053.55e^(-0.01k) + 10153178.55; the mean square deviation ratio C was 0.49, and the small error probability P was 0.94. The LSTM model consisted of an input layer, a hidden layer, and an output layer, with 60 epochs and a learning rate of 0.01. The MAE, RMSE, and MAPE values of the LSTM model were smaller than those of the GM(1,1) and ARIMA models. Conclusions Our findings showed that the LSTM model was the optimal model, with higher accuracy than the ARIMA and GM(1,1) models. 
Its prediction results can act as a predictive tool for TB prevention measures in mainland China.
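The GM(1,1) equation quoted in this abstract has the usual time-response form of a first-order grey model: the cumulative series is modelled as (x0(1) - b/a)·e^(-a·k) + b/a, where a is the development coefficient and b the grey input. The sketch below fits a and b by ordinary least squares on the textbook GM(1,1) formulation; the function names and the toy series are hypothetical, and nothing here reproduces the authors' data or code.

```python
import math


def fit_gm11(x0):
    """Fit GM(1,1) to a positive series x0 and return (a, b).

    Solves x0(k) + a * z1(k) = b by least squares, where z1 is the mean of
    consecutive values of the cumulative series x1.
    """
    if len(x0) < 4:
        raise ValueError("GM(1,1) needs at least four observations")
    x1, running = [], 0.0
    for value in x0:  # accumulated generating operation (cumulative sum)
        running += value
        x1.append(running)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    m = len(z)
    s_z, s_zz = sum(z), sum(v * v for v in z)
    s_y, s_zy = sum(y), sum(zi * yi for zi, yi in zip(z, y))
    det = m * s_zz - s_z * s_z
    a = (s_z * s_y - m * s_zy) / det
    b = (s_zz * s_y - s_z * s_zy) / det
    return a, b


def gm11_series(x0_first, a, b, steps):
    """Return `steps` fitted or forecast values of the original (non-cumulative) series.

    Assumes a != 0, since the time response divides by a.
    """
    def x1_hat(k):
        return (x0_first - b / a) * math.exp(-a * k) + b / a

    return [x0_first] + [x1_hat(k) - x1_hat(k - 1) for k in range(1, steps)]


# Hypothetical series growing 10% per step; the last value is a one-step forecast.
series = [100.0, 110.0, 121.0, 133.1, 146.41]
a, b = fit_gm11(series)
fitted = gm11_series(series[0], a, b, len(series) + 1)
```

A negative a corresponds to a growing series and a positive a to a declining one, so the exponent -0.01k in the quoted equation is consistent with the downward trend the abstract describes.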
Affiliation(s)
- Daren Zhao: Department of Medical Administration, Sichuan Provincial Orthopedics Hospital, Chengdu, Sichuan, P.R. China
- Huiwu Zhang: Department of Medical Administration, Sichuan Provincial Orthopedics Hospital, Chengdu, Sichuan, P.R. China
- Qing Cao: Department of Medical Administration, Sichuan Academy of Medical Sciences & Sichuan Provincial People’s Hospital, Chengdu, Sichuan, P.R. China
- Zhiyi Wang: Department of Medical Administration, Sichuan Cancer Hospital & Institute, Chengdu, Sichuan, P.R. China
- Sizhang He: Department of Information and Statistics, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
- Minghua Zhou: Department of Medical Administration, Luzhou People’s Hospital, Luzhou, Sichuan, P.R. China
- Ruihua Zhang (corresponding author): School of Management, Chengdu University of Traditional Chinese Medicine, Chengdu, Sichuan, P.R. China
|
23
|
A Comparison of Infectious Disease Forecasting Methods across Locations, Diseases, and Time. Pathogens 2022; 11:pathogens11020185. [PMID: 35215129 PMCID: PMC8875569 DOI: 10.3390/pathogens11020185] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/20/2021] [Revised: 01/23/2022] [Accepted: 01/27/2022] [Indexed: 02/04/2023] Open
Abstract
Accurate infectious disease forecasting can inform efforts to prevent outbreaks and mitigate adverse impacts. This study compares the performance of statistical, machine learning (ML), and deep learning (DL) approaches in forecasting infectious disease incidences across different countries and time intervals. We forecasted three diverse diseases: campylobacteriosis, typhoid, and Q-fever, using a wide variety of features (n = 46) from public datasets, e.g., landscape, climate, and socioeconomic factors. We compared autoregressive statistical models to two tree-based ML models (extreme gradient boosted trees [XGB] and random forest [RF]) and two DL models (multi-layer perceptron and encoder–decoder model). The disease models were trained on data from seven different countries at the region-level between 2009–2017. Forecasting performance of all models was assessed using mean absolute error, root mean square error, and Poisson deviance across Australia, Israel, and the United States for the months of January through August of 2018. The overall model results were compared across diseases as well as various data splits, including country, regions with highest and lowest cases, and the forecasted months out (i.e., nowcasting, short-term, and long-term forecasting). Overall, the XGB models performed the best for all diseases and, in general, tree-based ML models performed the best when looking at data splits. There were a few instances where the statistical or DL models had minutely smaller error metrics for specific subsets of typhoid, which is a disease with very low case counts. Feature importance per disease was measured by using four tree-based ML models (i.e., XGB and RF with and without region name as a feature). The most important feature groups included previous case counts, region name, population counts and density, mortality causes of neonatal to under 5 years of age, sanitation factors, and elevation. 
This study demonstrates the power of ML approaches to incorporate a wide range of factors to forecast various diseases, regardless of location, more accurately than traditional statistical approaches.
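Of the three error measures named in this abstract, Poisson deviance is the one least often written out. The following is a minimal sketch assuming the usual mean Poisson deviance for count data; the abstract does not give the exact formula used in the study, and the function name is hypothetical.

```python
import math

def mean_poisson_deviance(observed, predicted):
    """Mean Poisson deviance: 2 * mean(y * ln(y / mu) - (y - mu)).

    Predictions must be strictly positive. The y * ln(y / mu) term is
    taken as 0 when the observed count y is 0.
    """
    total = 0.0
    for y, mu in zip(observed, predicted):
        if mu <= 0:
            raise ValueError("predicted values must be positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(observed)
```

Unlike MAE or RMSE, this measure scales the penalty with the predicted mean, which matters for low-count series such as the typhoid data mentioned above.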
24
Meng D, Xu J, Zhao J. Analysis and prediction of hand, foot and mouth disease incidence in China using Random Forest and XGBoost. PLoS One 2021; 16:e0261629. [PMID: 34936688 PMCID: PMC8694472 DOI: 10.1371/journal.pone.0261629] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 09/03/2021] [Accepted: 12/06/2021] [Indexed: 12/13/2022]
Abstract
Hand, foot and mouth disease (HFMD) is an increasingly serious public health problem, and it has caused an outbreak in China every year since 2008. Predicting the incidence of HFMD and analyzing its influential factors are of great significance to its prevention. Machine learning has shown advantages in infectious disease models, but there are few machine learning studies on HFMD incidence that cover all the provinces in mainland China. In this study, we used two machine learning algorithms, Random Forest and eXtreme Gradient Boosting (XGBoost), to perform our analysis and prediction. We first used Random Forest to examine the association between HFMD incidence and potential influential factors for 31 provinces in mainland China. Next, we established Random Forest and XGBoost prediction models using meteorological and social factors as the predictors. Finally, we applied our prediction models in four different regions of mainland China and evaluated their performance. Our results show that: 1) Meteorological factors and social factors jointly affect the incidence of HFMD in mainland China. Average temperature and population density are the two most significant influential factors; 2) Population flux affects HFMD incidence with different delays in different regions. From a national perspective, the model using population flux data delayed by one month has better prediction performance; 3) Overall, the prediction capability of the XGBoost model was better than that of the Random Forest model. The XGBoost model is more suitable for predicting the incidence of HFMD in mainland China.
Affiliation(s)
- Delin Meng
  - Complexity Science Institute, Qingdao University, Qingdao, Shandong, China
- Jun Xu
  - State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China
- Jijun Zhao
  - Complexity Science Institute, Qingdao University, Qingdao, Shandong, China
25
Kim D, Kim SB, Jeon S, Kim S, Lee KH, Lee HS, Han SH. No Change of Pneumocystis jirovecii Pneumonia after the COVID-19 Pandemic: Multicenter Time-Series Analyses. J Fungi (Basel) 2021; 7:jof7110990. [PMID: 34829277 PMCID: PMC8624436 DOI: 10.3390/jof7110990] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Revised: 11/08/2021] [Accepted: 11/17/2021] [Indexed: 11/30/2022]
Abstract
Consolidated infection control measures imposed by the government and hospitals during the COVID-19 pandemic resulted in a sharp decline in respiratory viruses. To address whether Pneumocystis jirovecii can be transmitted by the airborne route and acquired from the environment, we assessed changes in P. jirovecii pneumonia (PCP) cases in a hospital setting before and after COVID-19. We retrospectively collected data on PCP-confirmed inpatients aged ≥18 years (N = 2922) in four university-affiliated hospitals between January 2015 and June 2021. The index and intervention dates were defined as the first time of P. jirovecii diagnosis and January 2020, respectively. We predicted PCP cases for the post-COVID-19 period and obtained the difference (residuals) between forecasted and observed cases using the autoregressive integrated moving average (ARIMA) and the Bayesian structural time-series (BSTS) models. Overall, the average numbers of observed PCP cases per month in each year were 36.1 and 47.3 for the pre- and post-COVID-19 periods, respectively. The estimate for residuals in the ARIMA model was not significantly different in the total PCP-confirmed inpatients (7.4%, p = 0.765). The PCP cases forecasted by the BSTS model were not significantly different from the observed cases in the post-COVID-19 period (−0.6%, 95% credible interval −9.6 to 9.1%, p = 0.450). The unprecedented strict non-pharmacological interventions did not affect PCP cases.
Affiliation(s)
- Dayeong Kim
  - Department of Internal Medicine, Division of Infectious Disease, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea
- Sun Bean Kim
  - Department of Internal Medicine, Division of Infectious Diseases, Korea University College of Medicine, 73, Goryeodae-ro, Seongbuk-gu, Seoul 02841, Korea
- Soyoung Jeon
  - Biostatistics Collaboration Unit, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea
- Subin Kim
  - Department of Internal Medicine, Division of Infectious Disease, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea
- Kyoung Hwa Lee
  - Department of Internal Medicine, Division of Infectious Disease, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea
- Hye Sun Lee
  - Biostatistics Collaboration Unit, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea
  - Correspondence: (H.S.L.); (S.H.H.)
- Sang Hoon Han
  - Department of Internal Medicine, Division of Infectious Disease, Yonsei University College of Medicine, 211 Eonju-ro, Gangnam-gu, Seoul 06273, Korea
  - Correspondence: (H.S.L.); (S.H.H.)
26
An CH, Nie SM, Sun YX, Fan SP, Luo BY, Li Z, Liu ZG, Chang WH. Seroprevalence trend of human brucellosis and MLVA genotyping characteristics of Brucella melitensis in Shaanxi Province, China, during 2008-2020. Transbound Emerg Dis 2021; 69:e423-e434. [PMID: 34510783 DOI: 10.1111/tbed.14320] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/06/2021] [Revised: 08/25/2021] [Accepted: 09/08/2021] [Indexed: 11/29/2022]
Abstract
In this study, a total of 179,907 blood samples from populations with suspected Brucella spp. infections were collected between 2008 and 2020 and analyzed by the Rose Bengal plate test (RBPT) and serum agglutination test (SAT). Moreover, conventional biotyping, B. abortus-melitensis-ovis-suis polymerase chain reaction (AMOS-PCR), and multiple-locus variable-number tandem repeat analysis (MLVA) were applied to characterize the isolated strains. A total of 8103 (4.50%) samples were positive in RBPT, while 7705 (4.28%, 95% confidence interval (CI) 4.19-4.37) samples were positive in SAT. There was a significant difference in seroprevalence for human brucellosis over time, in different areas, and in different cities (districts) (χ² = 32.23, 1984.14, and 3749.51, respectively; p < .05). The highest seropositivity (8.22%; 4,965/60,393; 95% CI 8.00-8.44) was observed in Yulin City, which borders Inner Mongolia, Ningxia, and Gansu Province, China, regions that have a high incidence of human brucellosis. Moreover, 174 Brucella strains were obtained, including nine B. melitensis bv. 1, 145 B. melitensis bv. 3, and 20 B. melitensis variants. After random selection, 132 B. melitensis strains were further genotyped using MLVA-16. The 132 strains were sorted into 100 MLVA-16 genotypes (GTs) (GT 1-100), 81 of which were single GTs, each represented by one independent strain. The remaining 19 shared GTs involved 51 strains, and each GT included two to seven isolates from the Shaan northern and Guanzhong areas. These data indicated that although sporadic cases were the dominant epidemic characteristic of human brucellosis in this province, outbreaks, accounting for more than 38.6% (51/132) of strains, were also found in the Shaan northern and Guanzhong areas. Forty-seven shared MLVA-16 GTs were observed between strains (n = 71) from this study and strains (n = 337) from 19 other provinces of China. These data suggest that strains from the northern provinces are a potential source of human brucellosis cases in Shaanxi Province. It is urgent to strengthen the surveillance and control of the trade and transfer of infected sheep among regions.
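The seroprevalence figures above can be checked with a short calculation. The sketch below uses a normal-approximation (Wald) interval; this is an assumption, since the abstract does not say which interval method the authors used, and the function name is hypothetical.

```python
import math

def wald_ci(positives, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval.

    Assumed method for illustration only; the source does not state
    how its intervals were computed.
    """
    p = positives / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width

# Counts taken from the abstract: SAT-positive overall, and Yulin City.
for label, positives, total in [("SAT overall", 7705, 179907),
                                ("Yulin City", 4965, 60393)]:
    p, lower, upper = wald_ci(positives, total)
    print(f"{label}: {100 * p:.2f}% (95% CI {100 * lower:.2f}-{100 * upper:.2f})")
```

Under this assumption the point estimates agree with the reported 4.28% and 8.22%, and the interval bounds fall within about 0.01 percentage points of the reported ones; small differences in the last digit would be expected if the authors used a different interval method or rounding.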
Affiliation(s)
- Cui-Hong An
  - Department of Plague and Brucellosis, Shaanxi Center for Disease Control and Prevention, Xi'an, China; Department of Microbiology and Immunology, School of Medicine, Xi'an Jiaotong University, Xi'an, China
- Shou-Min Nie
  - Department of Plague and Brucellosis, Shaanxi Center for Disease Control and Prevention, Xi'an, China
- Yang-Xin Sun
  - Department of Plague and Brucellosis, Shaanxi Center for Disease Control and Prevention, Xi'an, China
- Suo-Ping Fan
  - Department of Plague and Brucellosis, Shaanxi Center for Disease Control and Prevention, Xi'an, China
- Bo-Yan Luo
  - Department of Plague and Brucellosis, Shaanxi Center for Disease Control and Prevention, Xi'an, China
- Zhenjun Li
  - State Key Laboratory of Infectious Disease Prevention and Control, Chinese Center for Disease Control and Prevention, National Institute for Communicable Diseases Control and Prevention, Beijing, China
- Zhi-Guo Liu
  - State Key Laboratory of Infectious Disease Prevention and Control, Chinese Center for Disease Control and Prevention, National Institute for Communicable Diseases Control and Prevention, Beijing, China
- Wen-Hui Chang
  - Department of Plague and Brucellosis, Shaanxi Center for Disease Control and Prevention, Xi'an, China
27
Lv CX, An SY, Qiao BJ, Wu W. Time series analysis of hemorrhagic fever with renal syndrome in mainland China by using an XGBoost forecasting model. BMC Infect Dis 2021; 21:839. [PMID: 34412581 PMCID: PMC8377883 DOI: 10.1186/s12879-021-06503-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/28/2020] [Accepted: 07/30/2021] [Indexed: 11/15/2022]
Abstract
BACKGROUND: Hemorrhagic fever with renal syndrome (HFRS) is still attracting public attention because of its outbreaks in various cities in China. Predicting future outbreaks or epidemics based on past incidence data can help health departments take targeted measures to prevent disease in advance. In this study, we propose a multistep prediction strategy based on extreme gradient boosting (XGBoost) for HFRS as an extension of the one-step prediction model. Moreover, the fitting and prediction accuracy of the XGBoost model is compared with that of the autoregressive integrated moving average (ARIMA) model using different evaluation indicators. METHODS: We collected HFRS incidence data for mainland China from 2004 to 2018. The data from 2004 to 2017 were used as the training set to establish the seasonal ARIMA model and the XGBoost model, while the 2018 data were used to test prediction performance. In the multistep XGBoost forecasting model, one-hot encoding was used to handle seasonal features. Furthermore, a series of evaluation indices was used to evaluate the accuracy of the multistep XGBoost forecasting model. RESULTS: There were 200,237 HFRS cases in China from 2004 to 2018. A long-term downward trend and bimodal seasonality were identified in the original time series. According to the minimum corrected Akaike information criterion (CAIC) value, the optimal ARIMA(3,1,0) × (1,1,0)_12 model was selected. The ME, RMSE, MAE, MPE, MAPE, and MASE indices of the XGBoost model were higher than those of the ARIMA model in the fitting part, whereas the RMSE of the XGBoost model was lower. The prediction performance evaluation indicators (MAE, MPE, MAPE, RMSE, and MASE) of the one-step and multistep XGBoost prediction models were all notably lower than those of the ARIMA model. CONCLUSIONS: The multistep XGBoost prediction model showed much better prediction accuracy and model stability than the multistep ARIMA prediction model. The XGBoost model performed better in predicting complicated and nonlinear data such as HFRS incidence. Additionally, multistep prediction models are more practical than one-step prediction models in forecasting infectious diseases.
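The combination of one-hot seasonal features and multistep prediction described in this abstract can be sketched as follows. This is a hedged illustration only: it assumes a recursive strategy (each prediction is fed back as a lag for the next step), which the abstract does not confirm, and it uses a trivial stand-in instead of a fitted XGBoost regressor. All function names and the choice of three lags are hypothetical.

```python
def one_hot_month(month):
    """Return a 12-element 0/1 vector with a 1 at the given month (1-12)."""
    return [1 if m == month else 0 for m in range(1, 13)]

def make_features(history, month, n_lags=3):
    """Concatenate the last n_lags observations with the one-hot month."""
    return list(history[-n_lags:]) + one_hot_month(month)

def recursive_forecast(history, start_month, steps, predict_one_step, n_lags=3):
    """Forecast `steps` months ahead, feeding each prediction back as a lag."""
    series = list(history)
    forecasts = []
    month = start_month
    for _ in range(steps):
        features = make_features(series, month, n_lags)
        y_hat = predict_one_step(features)
        forecasts.append(y_hat)
        series.append(y_hat)       # the prediction becomes a lag for the next step
        month = month % 12 + 1     # December (12) wraps around to January (1)
    return forecasts

def mean_of_lags(features, n_lags=3):
    """Stand-in for a fitted regressor (hypothetical): mean of the lag features."""
    return sum(features[:n_lags]) / n_lags
```

In practice `predict_one_step` would wrap a trained model's prediction call. The one-hot vector lets a tree-based model split on individual months without assuming any ordering between them.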
Affiliation(s)
- Cai-Xia Lv
  - Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning China
- Shu-Yi An
  - Liaoning Provincial Center for Disease Control and Prevention, Shenyang, Liaoning China
- Bao-Jun Qiao
  - Liaoning Provincial Center for Disease Control and Prevention, Shenyang, Liaoning China
- Wei Wu
  - Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning China
28
Wang Y, Chen H, Sun T, Li A, Wang S, Zhang J, Li S, Zhang Z, Zhu D, Wang X, Cao F. Risk predicting for acute coronary syndrome based on machine learning model with kinetic plaque features from serial coronary computed tomography angiography. Eur Heart J Cardiovasc Imaging 2021; 23:800-810. [PMID: 34151931 DOI: 10.1093/ehjci/jeab101] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/08/2021] [Accepted: 05/04/2021] [Indexed: 12/21/2022]
Abstract
AIMS: More patients with suspected coronary artery disease are undergoing coronary computed tomography angiography (CCTA) as a gatekeeper. However, the prospective relation of plaque features to acute coronary syndrome (ACS) events has not been previously explored. METHODS AND RESULTS: One hundred and one of 452 patients who had a documented ACS event and had undergone CCTA more than once during the past 12 years were recruited. Another 101 patients without an ACS event were matched as controls. Baseline, follow-up, and changes of anatomical, compositional, and haemodynamic parameters [e.g. luminal stenosis, plaque volume, necrotic core, calcification, and CCTA-derived fractional flow reserve (CT-FFR)] were analysed by independent CCTA measurement core laboratories. Baseline anatomical, compositional, and haemodynamic parameters of lesions showed no significant difference between the two cohorts (P > 0.05). However, between the baseline and follow-up CCTA scans, the culprit lesions exhibited a significant increase in luminal stenosis (10.18 ± 2.26% vs. 3.62 ± 1.41%, P = 0.018), remodelling index (0.15 ± 0.14 vs. 0.09 ± 0.01, P < 0.01), and necrotic core (4.79 ± 1.84% vs. 0.43 ± 1.09%, P = 0.019), and a decrease in CT-FFR (-0.05 ± 0.005 vs. -0.01 ± 0.003, P < 0.01) and calcium ratio (-4.28 ± 2.48% vs. 4.48 ± 1.46%, P = 0.004), compared with non-culprit lesions. The XGBoost model comprising the top five important plaque features showed higher predictive ability (area under the curve 0.918, 95% confidence interval 0.861-0.968). CONCLUSIONS: Dynamic changes in plaque features are highly correlated with subsequent ACS events. A machine learning model integrating these lesion characteristics (e.g. CT-FFR, necrotic core, remodelling index, plaque volume, and calcium) can improve the ability to predict the risk of ACS events.
Affiliation(s)
- Yabin Wang
  - Department of Geriatric Cardiology & National Clinical Research Center for Geriatric Diseases, Second Medical Center of Chinese PLA General Hospital, 28# Fuxing road, Haidian district, Beijing 100853, China
- Haiwei Chen
  - Department of Geriatrics, Forth Medical Center of Chinese PLA General Hospital, Beijing 100853, China
- Ting Sun
  - Department of Geriatric Cardiology & National Clinical Research Center for Geriatric Diseases, Second Medical Center of Chinese PLA General Hospital, 28# Fuxing road, Haidian district, Beijing 100853, China
- Ang Li
  - Department of Geriatric Cardiology & National Clinical Research Center for Geriatric Diseases, Second Medical Center of Chinese PLA General Hospital, 28# Fuxing road, Haidian district, Beijing 100853, China
- Shengshu Wang
  - Institute of Geriatrics, Second Medical Center of Chinese PLA General Hospital, Beijing 100853, China
- Jibin Zhang
  - Department of Geriatric Cardiology & National Clinical Research Center for Geriatric Diseases, Second Medical Center of Chinese PLA General Hospital, 28# Fuxing road, Haidian district, Beijing 100853, China
- Sulei Li
  - Department of Geriatric Cardiology & National Clinical Research Center for Geriatric Diseases, Second Medical Center of Chinese PLA General Hospital, 28# Fuxing road, Haidian district, Beijing 100853, China
- Zheng Zhang
  - Department of Cardiology, PLA Rocket Force Characteristic Medical Center, Beijing 100088, China
- Di Zhu
  - Department of Endocrinology, Air Force Medical Center, Beijing 100142, China
- Xinjiang Wang
  - Department of Radiology, Second Medical Center of Chinese PLA General Hospital, Beijing, 100853, China
- Feng Cao
  - Department of Geriatric Cardiology & National Clinical Research Center for Geriatric Diseases, Second Medical Center of Chinese PLA General Hospital, 28# Fuxing road, Haidian district, Beijing 100853, China