1
Geng X, Ma Y, Cai W, Zha Y, Zhang T, Zhang H, Yang C, Yin F, Shui T. Evaluation of models for multi-step forecasting of hand, foot and mouth disease using multi-input multi-output: A case study of Chengdu, China. PLoS Negl Trop Dis 2023; 17:e0011587. [PMID: 37683009 PMCID: PMC10511093 DOI: 10.1371/journal.pntd.0011587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Revised: 09/20/2023] [Accepted: 08/11/2023] [Indexed: 09/10/2023]
Abstract
BACKGROUND Hand, foot and mouth disease (HFMD) is a public health concern that threatens the health of children. Accurate forecasting of HFMD cases multiple days ahead and early detection of peaks in the number of cases, followed by a timely response, are essential for HFMD prevention and control. However, most studies predict only next-day incidence, which reduces the flexibility of prevention and control. METHODS We collected the daily number of HFMD cases among children aged 0-14 years in Chengdu from 2011 to 2017, as well as meteorological and air pollutant data for the same period. The LSTM, Seq2Seq, Seq2Seq-Luong and Seq2Seq-Shih models were used to perform multi-step prediction of HFMD through multi-input multi-output. We evaluated the models in terms of overall prediction performance and the time delay and intensity of peak detection. RESULTS From 2011 to 2017, HFMD in Chengdu showed seasonal trends that were consistent with temperature, air pressure, rainfall, relative humidity, and PM10. The Seq2Seq-Shih model achieved the best performance, with RMSE, sMAPE and PCC values of 13.943~22.192, 17.880~27.937, and 0.887~0.705 for the 2-day to 15-day predictions, respectively. The Seq2Seq-Shih model was also able to detect peaks in the next 15 days with a smaller time delay. CONCLUSIONS The deep learning Seq2Seq-Shih model achieves the best performance in overall and peak prediction, and is applicable to HFMD multi-step prediction based on environmental factors.
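The three scores reported in this abstract (RMSE, sMAPE and PCC) can be computed separately for each forecast horizon of a multi-output model. The following is a minimal sketch, not the authors' code: the function names and the `(n_samples, n_horizons)` array layout are assumptions, and the sMAPE variant shown (percentage form with the mean of absolute values in the denominator) is one common definition that may differ from the one used in the paper.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant, range 0 to 200)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    diff = np.abs(y_true - y_pred)
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    # Terms where both values are zero contribute zero error.
    ratio = np.divide(diff, denom, out=np.zeros_like(diff), where=denom != 0)
    return float(100.0 * np.mean(ratio))


def pcc(y_true, y_pred):
    """Pearson correlation coefficient."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])


def per_horizon_scores(y_true, y_pred):
    """Score each forecast horizon (column) of a multi-step forecast."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return [
        {
            "horizon": h + 1,
            "rmse": rmse(y_true[:, h], y_pred[:, h]),
            "smape": smape(y_true[:, h], y_pred[:, h]),
            "pcc": pcc(y_true[:, h], y_pred[:, h]),
        }
        for h in range(y_true.shape[1])
    ]
```

Scoring each horizon separately makes it visible how accuracy degrades as the forecast reaches further ahead, which is the pattern the reported ranges describe.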
Affiliation(s)
- Xiaoran Geng: West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Yue Ma: West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Wennian Cai: West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Yuanyi Zha: Kunming Medical University, Kunming, China
- Tao Zhang: West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Huadong Zhang: Chongqing Center for Disease Control and Prevention, Chongqing, China
- Changhong Yang: Sichuan Center for Disease Control and Prevention, Chengdu, China
- Fei Yin: West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
- Tiejun Shui: Yunnan Center for Disease Control and Prevention, Kunming, China
2
Canbek G. BenchMetrics Prob: benchmarking of probabilistic error/loss performance evaluation instruments for binary classification problems. Int J Mach Learn Cyb 2023:1-31. [PMID: 37360884 PMCID: PMC10113998 DOI: 10.1007/s13042-023-01826-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Accepted: 03/21/2023] [Indexed: 06/28/2023]
Abstract
Probabilistic error/loss performance evaluation instruments that are originally used for regression and time series forecasting are also applied in some binary-class or multi-class classifiers, such as artificial neural networks. This study aims to systematically assess probabilistic instruments for binary classification performance evaluation using a proposed two-stage benchmarking method called BenchMetrics Prob. The method employs five criteria and fourteen simulation cases based on hypothetical classifiers on synthetic datasets. The goal is to reveal specific weaknesses of performance instruments and to identify the most robust instrument in binary classification problems. The BenchMetrics Prob method was tested on 31 instrument/instrument variants, and the results have identified four instruments as the most robust in a binary classification context: Sum Squared Error (SSE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE, as the variant of MSE), and Mean Absolute Error (MAE). As SSE has lower interpretability due to its [0, ∞) range, MAE in [0, 1] is the most convenient and robust probabilistic metric for generic purposes. In classification problems where large errors are more important than small errors, RMSE may be a better choice. Additionally, the results showed that instrument variants with summarization functions other than mean (e.g., median and geometric mean), LogLoss, and the error instruments with relative/percentage/symmetric-percentage subtypes for regression, such as Mean Absolute Percentage Error (MAPE), Symmetric MAPE (sMAPE), and Mean Relative Absolute Error (MRAE), were less robust and should be avoided. These findings suggest that researchers should employ robust probabilistic metrics when measuring and reporting performance in binary classification problems.
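The four instruments singled out in this abstract can be computed directly from the gap between each 0/1 label and the predicted probability of the positive class. The sketch below is only an illustration under that assumption; the function name and the example probabilities are hypothetical and are not taken from the BenchMetrics Prob benchmark.

```python
import numpy as np


def probabilistic_errors(labels, probs):
    """SSE, MSE, RMSE and MAE between 0/1 labels and predicted probabilities."""
    labels = np.asarray(labels, dtype=float)
    probs = np.asarray(probs, dtype=float)
    err = labels - probs
    mse = float(np.mean(err ** 2))
    return {
        "SSE": float(np.sum(err ** 2)),   # unbounded, grows with sample size
        "MSE": mse,                       # in [0, 1]
        "RMSE": float(np.sqrt(mse)),      # in [0, 1], emphasizes large errors
        "MAE": float(np.mean(np.abs(err))),  # in [0, 1]
    }


# Hypothetical classifiers scored on the same four labels.
labels = [1, 0, 1, 0]
evenly_wrong = probabilistic_errors(labels, [0.9, 0.1, 0.6, 0.4])
one_big_miss = probabilistic_errors(labels, [0.9, 0.1, 0.9, 0.9])
```

In this made-up example the two classifiers have similar MAE, but the one with a single confident mistake has a clearly higher RMSE, which is the sense in which RMSE suits problems where large errors matter more than small ones.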
3
How heterogeneous is the dengue transmission profile in Brazil? A study in six Brazilian states. PLoS Negl Trop Dis 2022; 16:e0010746. [PMID: 36095004 PMCID: PMC9499305 DOI: 10.1371/journal.pntd.0010746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2022] [Revised: 09/22/2022] [Accepted: 08/17/2022] [Indexed: 11/19/2022]
Abstract
Dengue is a vector-borne disease present in most tropical countries, infecting an average of 50 to 100 million people per year. Socioeconomic, demographic, and environmental factors directly influence the transmission cycle of the dengue virus (DENV). In Brazil, these factors vary between regions, producing different profiles of dengue transmission and challenging the epidemiological surveillance of the disease. In this article, we aimed to classify the profiles of dengue transmission in 1,823 Brazilian municipalities, covering different climates, from 2010 to 2019. Time series data of dengue cases were obtained from six states: Ceará and Maranhão in the semiarid Northeast, Minas Gerais in the countryside, Espírito Santo and Rio de Janeiro on the tropical Atlantic coast, and Paraná in the subtropical region. To describe the time series, we proposed a set of epi-features of the magnitude and duration of the dengue epidemic cycles, totaling 13 indicators. Using these epi-features as inputs, a multivariate cluster algorithm was employed to classify the municipalities according to their dengue transmission profile. Municipalities were classified into four distinct dengue transmission profiles: persistent transmission (7.8%), epidemic (21.3%), episodic/epidemic (43.2%), and episodic transmission (27.6%). Different profiles were associated with the municipality’s population size and climate. Municipalities with higher incidence and larger populations tended to be classified as persistent transmission, suggesting the existence of a critical community size. This association, however, varies depending on the state, indicating the importance of other factors. The proposed classification is useful for developing more specific and precise surveillance protocols for regions with different dengue transmission profiles, as well as more precise public policies for dengue prevention.
Dengue is one of the fastest-growing vector-borne diseases in the world. Currently, vaccines are experimental and are not very effective, so prevention depends on the control of the mosquito Aedes aegypti. Health promotion campaigns aimed at encouraging people to reduce mosquito breeding sites have limited effect. In addition, the heterogeneity of the territories affected by dengue is a major challenge for the epidemiological surveillance of the disease. Brazil has a territory of continental size, and a single standardized surveillance approach is not very effective for monitoring this arbovirus. Classifying types of dengue dynamics based on features of the epidemiological cycle in each location has the potential to increase the precision of surveillance and control strategies. In our study, we were able to classify areas according to different dengue transmission profiles, ranging from episodic to persistent transmission. These results can provide tools to guide actions aimed at achieving the World Health Organization’s goals of eliminating neglected tropical diseases in countries that have the virus.
4
Petropoulos F, Makridakis S, Stylianou N. COVID-19: Forecasting confirmed cases and deaths with a simple time series model. International Journal of Forecasting 2022; 38:439-452. [PMID: 33311822 PMCID: PMC7717777 DOI: 10.1016/j.ijforecast.2020.11.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 05/25/2023]
Abstract
Forecasting the outcome of outbreaks as early and as accurately as possible is crucial for decision-making and policy implementations. A significant challenge faced by forecasters is that not all outbreaks and epidemics turn into pandemics, making the prediction of their severity difficult. At the same time, the decisions made to enforce lockdowns and other mitigating interventions versus their socioeconomic consequences are not only hard to make, but also highly uncertain. The majority of modeling approaches to outbreaks, epidemics, and pandemics take an epidemiological approach that considers biological and disease processes. In this paper, we accept the limitations of forecasting to predict the long-term trajectory of an outbreak, and instead, we propose a statistical, time series approach to modelling and predicting the short-term behavior of COVID-19. Our model assumes a multiplicative trend, aiming to capture the continuation of the two variables we predict (global confirmed cases and deaths) as well as their uncertainty. We present the timeline of producing and evaluating 10-day-ahead forecasts over a period of four months. Our simple model offers competitive forecast accuracy and estimates of uncertainty that are useful and practically relevant.
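One common way to express a multiplicative trend in a simple time series model is Holt's exponential smoothing with a multiplicative trend, where the forecast is the current level multiplied by the growth ratio once per step ahead. The abstract does not state the exact specification, so the sketch below is an assumption rather than the authors' model: the function name, the fixed smoothing parameters `alpha` and `beta`, and the initialization from the first two observations are all illustrative choices.

```python
def holt_multiplicative_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Point forecasts from exponential smoothing with a multiplicative trend.

    Assumes a strictly positive series with at least two observations.
    alpha and beta are fixed here; in practice they would be estimated.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = float(series[0])
    trend = float(series[1]) / float(series[0])  # initial growth ratio
    for y in series[1:]:
        prev_level = level
        # The level blends the new observation with the previous one-step forecast.
        level = alpha * float(y) + (1 - alpha) * prev_level * trend
        # The trend blends the latest observed growth ratio with the previous ratio.
        trend = beta * (level / prev_level) + (1 - beta) * trend
    # h-step-ahead forecast: level multiplied by the growth ratio h times.
    return [level * trend ** h for h in range(1, horizon + 1)]
```

For a series that doubles at every step, such as `[1, 2, 4, 8, 16]`, the smoothed growth ratio stays at 2 and the forecasts continue the doubling. Prediction intervals, which the abstract highlights as part of the model's value, are not covered by this sketch.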
Affiliation(s)
- Spyros Makridakis: Institute for the Future (IFF), University of Nicosia, Nicosia, Cyprus
- Neophytos Stylianou: International Institute for Compassionate Care, Cyprus; School of Management, University of Bath, UK
5
Jamshidi B, Rezaei M, Kakavandi M, Jamshidi Zargaran S. Modeling the Number of Confirmed Cases and Deaths from the COVID-19 Pandemic in the UK and Forecasting from April 15 to May 30, 2020. Disaster Med Public Health Prep 2022; 16:187-193. [PMID: 32878680 PMCID: PMC7588725 DOI: 10.1017/dmp.2020.312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/23/2020] [Revised: 06/04/2020] [Accepted: 08/23/2020] [Indexed: 11/23/2022]
Abstract
OBJECTIVE The UK is one of the epicenters of coronavirus disease (COVID-19) in the world. As of April 14, there have been 93 873 confirmed COVID-19 patients in the UK and 12 107 deaths with confirmed infection. On April 14, it was reported that COVID-19 was the cause of more than half of the deaths in London. METHODS The present paper addresses the modeling and forecasting of the outbreak of COVID-19 in the UK. The modeling uses a 2-part time series model to study the number of confirmed cases and deaths. The forecast period was 46 days, from April 15 to May 30, 2020. All the computations and simulations were conducted in MATLAB R2015b, and the average curves and confidence intervals were calculated based on 100 simulations of the fitted models. RESULTS According to the obtained model, we expect that the cumulative number of confirmed cases will reach 282 000 with an 80% confidence interval (242 000 to 316 500) on May 30, from 93 873 on April 14. In addition, it is expected that, over this period, the number of daily new confirmed cases will fall to the interval 1330 to 6450 with a probability of 0.80, with a point estimate of around 3100. Regarding deaths, our model establishes that the real case fatality rate of the pandemic in the UK approaches 11% (80% confidence interval: 8%-15%). Accordingly, we forecast that total deaths in the UK will rise to 35 000 (28 000-50 000 with a probability of 80%). CONCLUSIONS The drawback of this study is the shortage of observations. Also, a more exact study could take the number of tests into account as an explanatory variable besides time.
Affiliation(s)
- Babak Jamshidi: Social Development and Health Promotion Research Center, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Mansour Rezaei: Social Development and Health Promotion Research Center, Kermanshah University of Medical Sciences, Kermanshah, Iran
- Mohsen Kakavandi: Mechanical Engineering, Poznan University of Technology, Poznan, Poland
6
Rosenkrantz DJ, Vullikanti A, Ravi SS, Stearns RE, Levin S, Poor HV, Marathe MV. Fundamental limitations on efficiently forecasting certain epidemic measures in network models. Proc Natl Acad Sci U S A 2022; 119:e2109228119. [PMID: 35046025 PMCID: PMC8794801 DOI: 10.1073/pnas.2109228119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/19/2021] [Accepted: 11/05/2021] [Indexed: 11/18/2022]
Abstract
The ongoing COVID-19 pandemic underscores the importance of developing reliable forecasts that would allow decision makers to devise appropriate response strategies. Despite much recent research on the topic, epidemic forecasting remains poorly understood. Researchers have attributed the difficulty of forecasting contagion dynamics to a multitude of factors, including complex behavioral responses, uncertainty in data, the stochastic nature of the underlying process, and the high sensitivity of the disease parameters to changes in the environment. We offer a rigorous explanation of the difficulty of short-term forecasting on networked populations using ideas from computational complexity. Specifically, we show that several forecasting problems (e.g., the probability that at least a given number of people will get infected at a given time and the probability that the number of infections will reach a peak at a given time) are computationally intractable. For instance, efficient solvability of such problems would imply that the number of satisfying assignments of an arbitrary Boolean formula in conjunctive normal form can be computed efficiently, violating a widely believed hypothesis in computational complexity. This intractability result holds even under the ideal situation, where all the disease parameters are known and are assumed to be insensitive to changes in the environment. From a computational complexity viewpoint, our results, which show that contagion dynamics become unpredictable for both macroscopic and individual properties, bring out some fundamental difficulties of predicting disease parameters. On the positive side, we develop efficient algorithms or approximation algorithms for restricted versions of forecasting problems.
Affiliation(s)
- Daniel J Rosenkrantz: Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA 22904; Department of Computer Science, University at Albany-State University of New York, Albany, NY 12222
- Anil Vullikanti: Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA 22904; Department of Computer Science, University of Virginia, Charlottesville, VA 22904
- S S Ravi: Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA 22904; Department of Computer Science, University at Albany-State University of New York, Albany, NY 12222
- Richard E Stearns: Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA 22904; Department of Computer Science, University at Albany-State University of New York, Albany, NY 12222
- Simon Levin: Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ 08544; Princeton Environmental Institute, Princeton University, Princeton, NJ 08544
- H Vincent Poor: Department of Electrical and Computer Engineering, Princeton University, Princeton, NJ 08544
- Madhav V Marathe: Biocomplexity Institute and Initiative, University of Virginia, Charlottesville, VA 22904; Department of Computer Science, University of Virginia, Charlottesville, VA 22904
7
Cawley C, Bergey F, Mehl A, Finckh A, Gilsdorf A. Novel Methods in the Surveillance of Influenza-Like Illness in Germany Using Data From a Symptom Assessment App (Ada): Observational Case Study. JMIR Public Health Surveill 2021; 7:e26523. [PMID: 34734836 PMCID: PMC8722671 DOI: 10.2196/26523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2021] [Revised: 08/04/2021] [Accepted: 08/16/2021] [Indexed: 11/13/2022]
Abstract
Background Participatory epidemiology is an emerging field harnessing consumer data entries of symptoms. The free app Ada allows users to enter the symptoms they are experiencing and applies a probabilistic reasoning model to provide a list of possible causes for these symptoms. Objective The objective of our study is to explore the potential contribution of Ada data to syndromic surveillance by comparing symptoms of influenza-like illness (ILI) entered by Ada users in Germany with data from a national population-based reporting system called GrippeWeb. Methods We extracted data for all assessments performed by Ada users in Germany over 3 seasons (2017/18, 2018/19, and 2019/20) and identified those with ILI (report of fever with cough or sore throat). The weekly proportion of assessments in which ILI was reported was calculated (overall and stratified by age group), standardized for the German population, and compared with trends in ILI rates reported by GrippeWeb using time series graphs, scatterplots, and Pearson correlation coefficient. Results In total, 2.1 million Ada assessments (for any symptoms) were included. Within seasons and across age groups, the Ada data broadly replicated trends in estimated weekly ILI rates when compared with GrippeWeb data (Pearson correlation—2017-18: r=0.86, 95% CI 0.76-0.92; P<.001; 2018-19: r=0.90, 95% CI 0.84-0.94; P<.001; 2019-20: r=0.64, 95% CI 0.44-0.78; P<.001). However, there were differences in the exact timing and nature of the epidemic curves between years. Conclusions With careful interpretation, Ada data could contribute to identifying broad ILI trends in countries without existing population-based monitoring systems or to the syndromic surveillance of symptoms not covered by existing systems.
8
Yang W, Zhang D, Peng L, Zhuge C, Hong L. Rational evaluation of various epidemic models based on the COVID-19 data of China. Epidemics 2021; 37:100501. [PMID: 34601321 PMCID: PMC8464399 DOI: 10.1016/j.epidem.2021.100501] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/05/2020] [Revised: 09/11/2021] [Accepted: 09/18/2021] [Indexed: 11/30/2022]
Abstract
In this paper, we carry out a rational evaluation of the forecasting abilities of various epidemic models/methods, including seven empirical functions, four statistical inference methods and five dynamical models, based on the Akaike information criterion, root mean square error and robustness coefficient. With respect to the outbreak data of COVID-19 epidemics in China, we find that before the inflection point, all models fail to make a reliable prediction. The Logistic function consistently underestimates the final epidemic size, while the Gompertz function overestimates it in all cases. Among the statistical inference methods, the sequential Bayesian and time-dependent reproduction number methods are more accurate at the late stage of an epidemic. The transition-like behavior of the exponential growth method, from underestimation to overestimation with respect to the inflection point, might be useful for constructing a more reliable forecast. Compared to the ODE-based SIR, SEIR and SEIR-AHQ models, the SEIR-QD and SEIR-PO models generally perform better in studying the COVID-19 epidemics, a success we believe could be attributed to a proper trade-off between model complexity and fitting accuracy. Our findings are not only crucial for the forecast of COVID-19 epidemics, but may also apply to other infectious diseases.
Affiliation(s)
- Wuyue Yang: Yau Mathematical Sciences Center, Tsinghua University, Beijing, 100084, PR China
- Dongyan Zhang: Beijing Institute for Scientific and Engineering Computing, Faculty of Science, Beijing University of Technology, Beijing 100124, PR China
- Liangrong Peng: College of Mathematics and Data Science, Minjiang University, Fuzhou, 350108, PR China
- Changjing Zhuge: Beijing Institute for Scientific and Engineering Computing, Faculty of Science, Beijing University of Technology, Beijing 100124, PR China
- Liu Hong: School of Mathematics, Sun Yat-sen University, Guangzhou, 510275, PR China
9
Recchia G, Freeman ALJ, Spiegelhalter D. How well did experts and laypeople forecast the size of the COVID-19 pandemic? PLoS One 2021; 16:e0250935. [PMID: 33951092 PMCID: PMC8099086 DOI: 10.1371/journal.pone.0250935] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 02/11/2021] [Accepted: 04/17/2021] [Indexed: 01/12/2023]
Abstract
Throughout the COVID-19 pandemic, social and traditional media have disseminated predictions from experts and nonexperts about its expected magnitude. How accurate were the predictions of 'experts' (individuals holding occupations or roles in subject-relevant fields, such as epidemiologists and statisticians) compared with those of the public? We conducted a survey in April 2020 of 140 UK experts and 2,086 UK laypersons; all were asked to make four quantitative predictions about the impact of COVID-19 by 31 Dec 2020. In addition to soliciting point estimates, we asked participants for lower and higher bounds of a range that they felt had a 75% chance of containing the true answer. Experts exhibited greater accuracy and calibration than laypersons, even when restricting the comparison to a subset of laypersons who scored in the top quartile on a numeracy test. Even so, experts substantially underestimated the ultimate extent of the pandemic, and the mean number of predictions for which the expert intervals contained the actual outcome was only 1.8 (out of 4), suggesting that experts should consider broadening the range of scenarios they consider plausible. Predictions of the public were even more inaccurate and poorly calibrated, suggesting that an important role remains for expert predictions as long as experts acknowledge their uncertainty.
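The calibration check described in this abstract amounts to counting how often a stated range contains the true outcome and comparing that share with the nominal level. With four questions and 75% ranges, a well-calibrated forecaster would capture about 3 outcomes on average, against the reported 1.8 for experts. The sketch below is a minimal illustration; the function name and the numbers are hypothetical and are not data from the study.

```python
def interval_coverage(intervals, outcomes):
    """Fraction of (lower, upper) intervals that contain the matching outcome."""
    if len(intervals) != len(outcomes):
        raise ValueError("intervals and outcomes must have the same length")
    hits = sum(1 for (lower, upper), truth in zip(intervals, outcomes)
               if lower <= truth <= upper)
    return hits / len(outcomes)


# Hypothetical participant: four stated 75% ranges and the four true outcomes.
stated_ranges = [(1, 5), (10, 20), (0, 1), (100, 200)]
true_outcomes = [3, 25, 0.5, 150]
coverage = interval_coverage(stated_ranges, true_outcomes)
nominal = 0.75
# Coverage below the nominal level indicates ranges that were too narrow.
too_narrow = coverage < nominal
```

Averaging this coverage over many participants gives the group-level figure that can be set against the 75% target.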
Affiliation(s)
- Gabriel Recchia: Department of Pure Mathematics and Mathematical Statistics, Winton Centre for Risk and Evidence Communication, University of Cambridge, Cambridge, United Kingdom
- Alexandra L. J. Freeman: Department of Pure Mathematics and Mathematical Statistics, Winton Centre for Risk and Evidence Communication, University of Cambridge, Cambridge, United Kingdom
- David Spiegelhalter: Department of Pure Mathematics and Mathematical Statistics, Winton Centre for Risk and Evidence Communication, University of Cambridge, Cambridge, United Kingdom
10
Li Q, Bedi T, Lehmann CU, Xiao G, Xie Y. Evaluating short-term forecasting of COVID-19 cases among different epidemiological models under a Bayesian framework. Gigascience 2021; 10:giab009. [PMID: 33604654 PMCID: PMC7928884 DOI: 10.1093/gigascience/giab009] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/09/2020] [Revised: 01/11/2021] [Indexed: 11/16/2022]
Abstract
BACKGROUND Daily and weekly forecasting of COVID-19 cases has been one of the challenges posed to governments and the health sector globally. To facilitate informed public health decisions, the concerned parties rely on short-term daily projections generated via predictive modeling. We calibrate stochastic variants of growth models and the standard susceptible-infectious-removed model within a single Bayesian framework to evaluate and compare their short-term forecasts. RESULTS We implement rolling-origin cross-validation to compare the short-term forecasting performance of the stochastic epidemiological models and an autoregressive moving average model across the 20 countries that had the most confirmed COVID-19 cases as of August 22, 2020. CONCLUSION None of the models proved to be a gold standard across all regions, while all outperformed the autoregressive moving average model in terms of forecast accuracy and interpretability.
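Rolling-origin cross-validation, the evaluation scheme named in this abstract, repeatedly fits on all data up to a forecast origin, scores the next few observations, and then moves the origin forward. The sketch below illustrates only the splitting and scoring loop. It swaps in a naive last-value forecaster in place of the paper's stochastic epidemiological and ARMA models, and the function names and parameters (`initial_train`, `horizon`, `step`) are assumptions made for the example.

```python
import numpy as np


def rolling_origin_splits(n_obs, initial_train, horizon, step=1):
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    origin = initial_train
    while origin + horizon <= n_obs:
        yield np.arange(0, origin), np.arange(origin, origin + horizon)
        origin += step


def naive_forecast(train, horizon):
    """Placeholder model: repeat the last observed value."""
    return np.repeat(train[-1], horizon)


def rolling_origin_mae(series, initial_train, horizon, step=1):
    """Mean absolute error averaged over all forecast origins."""
    series = np.asarray(series, dtype=float)
    fold_errors = []
    for train_idx, test_idx in rolling_origin_splits(
        len(series), initial_train, horizon, step
    ):
        pred = naive_forecast(series[train_idx], horizon)
        fold_errors.append(np.mean(np.abs(series[test_idx] - pred)))
    return float(np.mean(fold_errors))
```

Because every test window lies strictly after its training window, the scheme respects time order. Replacing `naive_forecast` with another model while keeping the same splits gives a like-for-like comparison across models.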
Affiliation(s)
- Qiwei Li: Department of Mathematical Sciences, The University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX 75080, USA
- Tejasv Bedi: Department of Mathematical Sciences, The University of Texas at Dallas, 800 W Campbell Rd, Richardson, TX 75080, USA
- Christoph U Lehmann: Department of Pediatrics, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Lyda Hill Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Guanghua Xiao: Lyda Hill Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Yang Xie: Lyda Hill Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Population and Data Sciences, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
11
Kamar A, Maalouf N, Hitti E, El Eid G, Isma'eel H, Elhajj IH. Challenge of forecasting demand of medical resources and supplies during a pandemic: A comparative evaluation of three surge calculators for COVID-19. Epidemiol Infect 2021; 149:e51. [PMID: 33531094 PMCID: PMC7925989 DOI: 10.1017/s095026882100025x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/29/2020] [Revised: 12/28/2020] [Accepted: 01/29/2021] [Indexed: 12/15/2022]
Abstract
Ever since the World Health Organization (WHO) declared the new coronavirus disease 2019 (COVID-19) a pandemic, there has been a public health debate concerning medical resources and supplies including hospital beds, intensive care units (ICU), ventilators and personal protective equipment (PPE). Forecasting COVID-19 dissemination has played a key role in informing healthcare professionals and governments on how to manage overburdened healthcare systems. However, forecasting during the pandemic remained challenging and sometimes highly controversial. Here, we highlight this challenge by performing a comparative evaluation of the estimations obtained from three COVID-19 surge calculators under different social distancing approaches, taking Lebanon as a case study. Despite discrepancies in estimations, the three surge calculators used herein agree that there will be a relative shortage in the capacity of medical resources and a significant surge in PPE demand if the social distancing policy is removed. Our results underscore the importance of implementing containment interventions including social distancing in alleviating the demand for medical care during the COVID-19 pandemic in the absence of any medication or vaccine. The paper also highlights the value of employing several models in surge planning.
Affiliation(s)
- A. Kamar: Vascular Medicine Program, American University of Beirut, Beirut, Lebanon
- N. Maalouf: Maroun Semaan Faculty of Engineering and Architecture, Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
- E. Hitti: Department of Emergency Medicine, American University of Beirut Medical Center, Beirut, Lebanon
- G. El Eid: Department of Emergency Medicine, American University of Beirut Medical Center, Beirut, Lebanon
- H. Isma'eel: Vascular Medicine Program, American University of Beirut, Beirut, Lebanon; Department of Internal Medicine, American University of Beirut Medical Center, Beirut, Lebanon
- I. H. Elhajj: Vascular Medicine Program, American University of Beirut, Beirut, Lebanon; Maroun Semaan Faculty of Engineering and Architecture, Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
12
Mathematical Models for COVID-19 Pandemic: A Comparative Analysis. J Indian Inst Sci 2020; 100:793-807. [PMID: 33144763 PMCID: PMC7596173 DOI: 10.1007/s41745-020-00200-6] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 08/27/2020] [Accepted: 09/14/2020] [Indexed: 12/17/2022]
Abstract
COVID-19 pandemic represents an unprecedented global health crisis in the last 100 years. Its economic, social and health impact continues to grow and is likely to end up as one of the worst global disasters since the 1918 pandemic and the World Wars. Mathematical models have played an important role in the ongoing crisis; they have been used to inform public policies and have been instrumental in many of the social distancing measures that were instituted worldwide. In this article, we review some of the important mathematical models used to support the ongoing planning and response efforts. These models differ in their use, their mathematical form and their scope.
13
Adiga A, Dubhashi D, Lewis B, Marathe M, Venkatramanan S, Vullikanti A. Models for COVID-19 Pandemic: A Comparative Analysis. arXiv 2020:arXiv:2009.10014v1. [PMID: 32995366 PMCID: PMC7523122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
Abstract
COVID-19 pandemic represents an unprecedented global health crisis in the last 100 years. Its economic, social and health impact continues to grow and is likely to end up as one of the worst global disasters since the 1918 pandemic and the World Wars. Mathematical models have played an important role in the ongoing crisis; they have been used to inform public policies and have been instrumental in many of the social distancing measures that were instituted worldwide. In this article we review some of the important mathematical models used to support the ongoing planning and response efforts. These models differ in their use, their mathematical form and their scope.
Affiliation(s)
- Aniruddha Adiga
- Biocomplexity Institute and Initiative, University of Virginia
- Devdatt Dubhashi
- Department of Computer Science and Engineering, Chalmers University
- Bryan Lewis
- Biocomplexity Institute and Initiative, University of Virginia
- Madhav Marathe
- Biocomplexity Institute and Initiative, University of Virginia
- Department of Computer Science, University of Virginia
- Anil Vullikanti
- Biocomplexity Institute and Initiative, University of Virginia
- Department of Computer Science, University of Virginia
|
14
|
Abstract
Background: Mathematical models are extremely useful for forecasting and controlling real phenomena. Time series modeling is one of the most commonly applied statistical approaches for representing time-dependent data. A novel time series model representing the propagation of an epidemic infection in a population is presented. The model addresses the cumulative number of confirmed cases. Methods: Our model is a generalization of statistical exponential growth models and can describe different stages of the outbreak of a communicable disease. Applying this procedure leads to the model CVJR1 (3.2, 1.44, 3, 13) for the course of COVID-19 from January 13 to March 5. All computations and 200 simulations were done in MATLAB 8.6. Results: To compare candidates by fitting the dataset for six pairs of (l̂ and â), we used the criterion of minimum squared residuals. We present the average and the 90% upper and lower bounds of the predictions made by our models for three periods. Applying this procedure led to models with parameters (3.2, 1.44, 3, 13) for the course of COVID-19 from January 13 to March 5. Conclusions: The presented model can cover epidemic behaviors related to social networks. Our model can be adapted to worldwide modeling of a phenomenon spreading in different populations simultaneously.
|
15
|
Schneider PP, van Gool CJAW, Spreeuwenberg P, Hooiveld M, Donker GA, Barnett DJ, Paget J. Using web search queries to monitor influenza-like illness: an exploratory retrospective analysis, Netherlands, 2017/18 influenza season. Euro Surveill 2020; 25:1900221. [PMID: 32489174 PMCID: PMC7268271 DOI: 10.2807/1560-7917.es.2020.25.21.1900221] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022] Open
Abstract
Background: Despite the early development of Google Flu Trends in 2009, standards for digital epidemiology methods have not been established and research from European countries is scarce. Aim: In this article, we study the use of web search queries to monitor influenza-like illness (ILI) rates in the Netherlands in real time. Methods: In this retrospective analysis, we simulated the weekly use of a prediction model for estimating the then-current ILI incidence across the 2017/18 influenza season solely based on Google search query data. We used weekly ILI data as reported to The European Surveillance System (TESSY) each week, and we removed the then-last 4 weeks from our dataset. We then fitted a prediction model based on the then-most-recent search query data from Google Trends to fill the 4-week gap ('Nowcasting'). Lasso regression, in combination with cross-validation, was applied to select predictors and to fit the 52 models, one for each week of the season. Results: The models provided accurate predictions with a mean and maximum absolute error of 1.40 (95% confidence interval: 1.09-1.75) and 6.36 per 10,000 population. The onset, peak and end of the epidemic were predicted with an error of 1, 3 and 2 weeks, respectively. The number of search terms retained as predictors ranged from three to five, with one keyword, 'griep' ('flu'), having the most weight in all models. Discussion: This study demonstrates the feasibility of accurate, real-time ILI incidence predictions in the Netherlands using Google search query data.
Affiliation(s)
- Paul P Schneider
- School of Health and Related Research (ScHARR), University of Sheffield, Sheffield, United Kingdom; Nivel (Netherlands Institute for Health Service Research), Utrecht, Netherlands
- Christel JAW van Gool
- School CAPHRI, Care and Public Health Research Institute, Maastricht University, Maastricht, Netherlands
- Peter Spreeuwenberg
- Nivel (Netherlands Institute for Health Service Research), Utrecht, Netherlands
- Mariëtte Hooiveld
- Nivel (Netherlands Institute for Health Service Research), Utrecht, Netherlands
- Gé A Donker
- Nivel (Netherlands Institute for Health Service Research), Utrecht, Netherlands
- David J Barnett
- Faculty of Health, Medicine and Life Sciences, Maastricht University, Maastricht, Netherlands
- John Paget
- Nivel (Netherlands Institute for Health Service Research), Utrecht, Netherlands
|
16
|
Morgan O. How decision makers can use quantitative approaches to guide outbreak responses. Philos Trans R Soc Lond B Biol Sci 2020; 374:20180365. [PMID: 31104605 PMCID: PMC6558558 DOI: 10.1098/rstb.2018.0365] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/31/2022] Open
Abstract
Decision makers are responsible for directing staffing, logistics, selecting public health interventions, communicating to professionals and the public, planning future response needs, and establishing strategic and tactical priorities along with their funding requirements. Decision makers need to rapidly synthesize data from different experts across multiple disciplines, bridge data gaps and translate epidemiological analysis into an operational set of decisions for disease control. Analytic approaches can be defined for specific response phases: investigation, scale-up and control. These approaches include: improved applications of quantitative methods to generate insightful epidemiological descriptions of outbreaks; robust investigations of causal agents and risk factors; tools to assess response needs; identifying and monitoring optimal interventions or combinations of interventions; and forecasting for response planning. Data science and quantitative approaches can improve decision-making in outbreak response. To realize these benefits, we need to develop a structured approach that will improve the quality and timeliness of data collected during outbreaks, establish analytic teams within the response structure and define a research agenda for data analytics in outbreak response. This article is part of the theme issue ‘Modelling infectious disease outbreaks in humans, animals and plants: epidemic forecasting and control’. This theme issue is linked with the earlier issue ‘Modelling infectious disease outbreaks in humans, animals and plants: approaches and important themes’.
Affiliation(s)
- Oliver Morgan
- Department of Health Emergency Information and Risk Assessment, Health Emergencies Programme, World Health Organization, Geneva, Switzerland
|
17
|
Kowal DR. Integer-valued functional data analysis for measles forecasting. Biometrics 2019; 75:1321-1333. [PMID: 31254384 DOI: 10.1111/biom.13110] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/27/2018] [Accepted: 05/22/2019] [Indexed: 11/29/2022]
Abstract
Measles presents a unique and imminent challenge for epidemiologists and public health officials: the disease is highly contagious, yet vaccination rates are declining precipitously in many localities. Consequently, the risk of a measles outbreak continues to rise. To improve preparedness, we study historical measles data both prevaccine and postvaccine, and design new methodology to forecast measles counts with uncertainty quantification. We propose to model the disease counts as an integer-valued functional time series: measles counts are a function of time-of-year and time-ordered by year. The counts are modeled using a negative-binomial distribution conditional on a real-valued latent process, which accounts for the overdispersion observed in the data. The latent process is decomposed using an unknown basis expansion, which is learned from the data, with dynamic basis coefficients. The resulting framework provides enhanced capability to model complex seasonality, which varies dynamically from year-to-year, and offers improved multimonth-ahead point forecasts and substantially tighter forecast intervals (with correct coverage) compared to existing forecasting models. Importantly, the fully Bayesian approach provides well-calibrated and precise uncertainty quantification for epi-relevant features, such as the future value and time of the peak measles count in a given year. An R package is available online.
Affiliation(s)
- Daniel R Kowal
- Department of Statistics, Rice University, Houston, Texas
|
18
|
Rangarajan P, Mody SK, Marathe M. Forecasting dengue and influenza incidences using a sparse representation of Google trends, electronic health records, and time series data. PLoS Comput Biol 2019; 15:e1007518. [PMID: 31751346 PMCID: PMC6894887 DOI: 10.1371/journal.pcbi.1007518] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/07/2019] [Revised: 12/05/2019] [Accepted: 10/29/2019] [Indexed: 12/20/2022] Open
Abstract
Dengue and influenza-like illness (ILI) are two of the leading causes of viral infection in the world and it is estimated that more than half the world’s population is at risk for developing these infections. It is therefore important to develop accurate methods for forecasting dengue and ILI incidences. Since data from multiple sources (such as dengue and ILI case counts, electronic health records and frequency of multiple internet search terms from Google Trends) can improve forecasts, standard time series analysis methods are inadequate to estimate all the parameter values from the limited amount of data available if we use multiple sources. In this paper, we use a computationally efficient implementation of the known variable selection method that we call the Autoregressive Likelihood Ratio (ARLR) method. This method combines sparse representation of time series data, electronic health records data (for ILI) and Google Trends data to forecast dengue and ILI incidences. This sparse representation method uses an algorithm that maximizes an appropriate likelihood ratio at every step. Using numerical experiments, we demonstrate that our method recovers the underlying sparse model much more accurately than the lasso method. We apply our method to dengue case count data from five countries/states: Brazil, Mexico, Singapore, Taiwan, and Thailand and to ILI case count data from the United States. Numerical experiments show that our method outperforms existing time series forecasting methods in forecasting the dengue and ILI case counts. In particular, our method gives an 18 percent forecast error reduction over a leading method that also uses data from multiple sources. It also performs better than other methods in predicting the peak value of the case count and the peak time.
Dengue and influenza-like illness (ILI) are leading causes of viral infection in the world and hence it is important to develop accurate methods for forecasting their incidence. We use the Autoregressive Likelihood Ratio method, which is a computationally efficient implementation of the variable selection method, in order to obtain a sparse (non-lasso) representation of time series, Google Trends and electronic health records (for ILI) data. This method is used to forecast dengue incidence in five countries/states and ILI incidence in the USA. We show that this method outperforms existing time series methods in forecasting these diseases. The method is general and can also be used to forecast other diseases.
Affiliation(s)
- Prashant Rangarajan
- Departments of Computer Science and Mathematics, Birla Institute of Technology and Science, Pilani, India
- Sandeep K. Mody
- Department of Mathematics, Indian Institute of Science, Bangalore, India
- Madhav Marathe
- Department of Computer Science, Network, Simulation Science and Advanced Computing Division, Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, United States of America
|
19
|
Manliura Datilo P, Ismail Z, Dare J. A Review of Epidemic Forecasting Using Artificial Neural Networks. International Journal of Epidemiologic Research 2019. [DOI: 10.15171/ijer.2019.24] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 12/20/2022] Open
Abstract
Background and aims: Since accurate forecasts help inform decisions for preventive health-care intervention and epidemic control, this goal can only be achieved by making use of appropriate techniques and methodologies. As much as forecast precision is important, methods and model selection procedures are critical to forecast precision. This study aimed at providing an overview of the selection of the right artificial neural network (ANN) methodology for the epidemic forecasts. It is necessary for forecasters to apply the right tools for the epidemic forecasts with high precision. Methods: It involved sampling and survey of epidemic forecasts based on ANN. A comparison of performance using ANN forecast and other methods was reviewed. Hybrids of a neural network with other classical methods or meta-heuristics that improved performance of epidemic forecasts were analysed. Results: Implementing hybrid ANN using data transformation techniques based on improved algorithms, combining forecast models, and using technological platforms enhance the learning and generalization of ANN in forecasting epidemics. Conclusion: The selection of forecasting tool is critical to the precision of epidemic forecast; hence, a working guide for the choice of appropriate tools will help reduce inconsistency and imprecision in forecasting epidemic size in populations. ANN hybrids that combined other algorithms and models, data transformation and technology should be used for an epidemic forecast.
Affiliation(s)
- Philemon Manliura Datilo
- Department of Mathematical Sciences, Universiti Teknologi Malaysia, Johor, Malaysia
- Department of Information Technology, Modibbo Adama University of Technology, Yola School of Management and Information Technology, Adamawa State, Nigeria
- Zuhaimy Ismail
- Department of Mathematical Sciences, Universiti Teknologi Malaysia, Johor, Malaysia
- Jayeola Dare
- Adekunle Ajasin University, Department of Mathematical Sciences, Faculty of Science, Ondo State, Nigeria
|
20
|
Thorve S, Wilson ML, Lewis BL, Swarup S, Vullikanti AKS, Marathe MV. EpiViewer: an epidemiological application for exploring time series data. BMC Bioinformatics 2018; 19:449. [PMID: 30466409 PMCID: PMC6251172 DOI: 10.1186/s12859-018-2439-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/31/2018] [Accepted: 10/15/2018] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND Visualization plays an important role in epidemic time series analysis and forecasting. Viewing time series data plotted on a graph can help researchers identify anomalies and unexpected trends that could be overlooked if the data were reviewed in tabular form; these details can influence a researcher's recommended course of action or choice of simulation models. However, there are challenges in reviewing data sets from multiple data sources - data can be aggregated in different ways (e.g., incidence vs. cumulative), measure different criteria (e.g., infection counts, hospitalizations, and deaths), or represent different geographical scales (e.g., nation, HHS Regions, or states), which can make a direct comparison between time series difficult. In the face of an emerging epidemic, the ability to visualize time series from various sources and organizations and to reconcile these datasets based on different criteria could be key in developing accurate forecasts and identifying effective interventions. Many tools have been developed for visualizing temporal data; however, none yet supports all the functionality needed for easy collaborative visualization and analysis of epidemic data. RESULTS In this paper, we present EpiViewer, a time series exploration dashboard where users can upload epidemiological time series data from a variety of sources and compare, organize, and track how data evolves as an epidemic progresses. EpiViewer provides an easy-to-use web interface for visualizing temporal datasets either as line charts or bar charts. The application provides enhanced features for visual analysis, such as hierarchical categorization, zooming, and filtering, to enable detailed inspection and comparison of multiple time series on a single canvas. Finally, EpiViewer provides several built-in statistical Epi-features to help users interpret the epidemiological curves. 
CONCLUSION EpiViewer is a single page web application that provides a framework for exploring, comparing, and organizing temporal datasets. It offers a variety of features for convenient filtering and analysis of epicurves based on meta-attribute tagging. EpiViewer also provides a platform for sharing data between groups for better comparison and analysis. Our user study demonstrated that EpiViewer is easy to use and fills a particular niche in the toolspace for visualization and exploration of epidemiological data.
Affiliation(s)
- Swapna Thorve
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, USA
- Network Dynamics and Simulation Science Laboratory, Biocomplexity Institute of Virginia Tech, Blacksburg, Virginia, USA
- Mandy L. Wilson
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, USA
- Bryan L. Lewis
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, USA
- Samarth Swarup
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, USA
- Anil Kumar S. Vullikanti
- Department of Computer Science, University of Virginia, Charlottesville, Virginia, USA
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, USA
- Madhav V. Marathe
- Department of Computer Science, University of Virginia, Charlottesville, Virginia, USA
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, USA
|
21
|
Chakraborty P, Lewis B, Eubank S, Brownstein JS, Marathe M, Ramakrishnan N. What to know before forecasting the flu. PLoS Comput Biol 2018; 14:e1005964. [PMID: 30312305 PMCID: PMC6193572 DOI: 10.1371/journal.pcbi.1005964] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open
Affiliation(s)
- Prithwish Chakraborty
- Discovery Analytics Center, Virginia Tech, Blacksburg, Virginia, United States of America
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Bryan Lewis
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, United States of America
- Stephen Eubank
- Network Dynamics and Simulation Science Laboratory, Biocomplexity Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- John S. Brownstein
- Children's Hospital Informatics Program, Boston Children’s Hospital, Massachusetts, United States of America
- Department of Pediatrics, Harvard Medical School, Massachusetts, United States of America
- Madhav Marathe
- Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, United States of America
- Department of Computer Science, University of Virginia, Charlottesville, Virginia, United States of America
- Naren Ramakrishnan
- Discovery Analytics Center, Virginia Tech, Blacksburg, Virginia, United States of America
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
|