1
Singh V, Khan SA, Yadav SK, Akhter Y. Modeling Global Monkeypox Infection Spread Data: A Comparative Study of Time Series Regression and Machine Learning Models. Curr Microbiol 2023; 81:15. [PMID: 38006416 DOI: 10.1007/s00284-023-03531-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2023] [Accepted: 10/19/2023] [Indexed: 11/27/2023]
Abstract
The global impact of COVID-19 has heightened concerns about emerging viral infections, among which monkeypox (MPOX) has become a significant public health threat. To address this, our study employs a comprehensive approach using three statistical techniques: Distribution fitting, ARIMA modeling, and Random Forest machine learning to analyze and predict the spread of MPOX in the top ten countries with high infection rates. We aim to provide a detailed understanding of the disease dynamics and model theoretical distributions using country-specific datasets to accurately assess and forecast the disease's transmission. The data from the considered countries are fitted into ARIMA models to determine the best time series regression model. Additionally, we employ the random forest machine learning approach to predict the future behavior of the disease. Evaluating the Root Mean Square Errors (RMSE) for both models, we find that the random forest outperforms ARIMA in six countries, while ARIMA performs better in the remaining four countries. Based on these findings, robust policy-making should consider the best fitted model for each country to effectively manage and respond to the ongoing public health threat posed by monkeypox. The integration of multiple modeling techniques enhances our understanding of the disease dynamics and aids in devising more informed strategies for containment and control.
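The RMSE-based model comparison described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the case counts, the forecast values, and the helper names `rmse` and `pick_best_model` are all hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def pick_best_model(actual, forecasts):
    """Return (name, rmse) of the forecast with the lowest RMSE.

    `forecasts` maps a model name to its predictions for the held-out period.
    """
    scores = {name: rmse(actual, preds) for name, preds in forecasts.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]

# Hypothetical held-out case counts and forecasts, for illustration only.
observed = [10, 12, 15, 19]
candidate_forecasts = {
    "arima": [11, 13, 14, 17],
    "random_forest": [10, 12, 16, 18],
}
best_name, best_score = pick_best_model(observed, candidate_forecasts)
```

Running the same selection once per country would give the kind of per-country winner the abstract reports.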
Affiliation(s)
- Vishwajeet Singh
  - Directorate of Online Education, Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, 576104, India
- Saif Ali Khan
  - Department of Statistics, Babasaheb Bhimrao Ambedkar University, Vidya Vihar, Raebareli Road, Lucknow, Uttar Pradesh, 226025, India
- Subhash Kumar Yadav
  - Department of Statistics, Babasaheb Bhimrao Ambedkar University, Vidya Vihar, Raebareli Road, Lucknow, Uttar Pradesh, 226025, India
- Yusuf Akhter
  - Department of Biotechnology, Babasaheb Bhimrao Ambedkar University, Vidya Vihar, Raebareli Road, Lucknow, Uttar Pradesh, 226025, India
2
Gantenberg JR, McConeghy KW, Howe CJ, Steingrimsson J, van Aalst R, Chit A, Zullo AR. Predicting Seasonal Influenza Hospitalizations Using an Ensemble Super Learner: A Simulation Study. Am J Epidemiol 2023; 192:1688-1700. [PMID: 37147861 PMCID: PMC10558190 DOI: 10.1093/aje/kwad113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2020] [Revised: 08/17/2022] [Accepted: 04/27/2023] [Indexed: 05/07/2023] Open
Abstract
Accurate forecasts can inform response to outbreaks. Most efforts in influenza forecasting have focused on predicting influenza-like activity, with fewer on influenza-related hospitalizations. We conducted a simulation study to evaluate a super learner's predictions of 3 seasonal measures of influenza hospitalizations in the United States: peak hospitalization rate, peak hospitalization week, and cumulative hospitalization rate. We trained an ensemble machine learning algorithm on 15,000 simulated hospitalization curves and generated weekly predictions. We compared the performance of the ensemble (weighted combination of predictions from multiple prediction algorithms), the best-performing individual prediction algorithm, and a naive prediction (median of a simulated outcome distribution). Ensemble predictions performed similarly to the naive predictions early in the season but consistently improved as the season progressed for all prediction targets. The best-performing prediction algorithm in each week typically had similar predictive accuracy compared with the ensemble, but the specific prediction algorithm selected varied by week. An ensemble super learner improved predictions of influenza-related hospitalizations, relative to a naive prediction. Future work should examine the super learner's performance using additional empirical data on influenza-related predictors (e.g., influenza-like illness). The algorithm should also be tailored to produce prospective probabilistic forecasts of selected prediction targets.
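The two quantities compared in this abstract, an ensemble formed as a weighted combination of learner predictions and a naive prediction taken as the median of a simulated outcome distribution, can be sketched as below. The learner names, weights, and numbers are hypothetical; in a real super learner the weights would be estimated by cross-validation, which this sketch does not attempt.

```python
from statistics import median

def ensemble_prediction(predictions, weights):
    """Weighted combination of learner predictions.

    Weights are assumed non-negative and summing to 1 (a convex combination).
    """
    if set(predictions) != set(weights):
        raise ValueError("predictions and weights must cover the same learners")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * predictions[name] for name in predictions)

def naive_prediction(simulated_outcomes):
    """Naive benchmark: the median of the simulated outcome distribution."""
    return median(simulated_outcomes)

# Hypothetical peak-hospitalization-rate predictions for one week.
learner_predictions = {"glm": 40.0, "random_forest": 50.0, "lasso": 44.0}
learner_weights = {"glm": 0.25, "random_forest": 0.5, "lasso": 0.25}

peak_rate_ensemble = ensemble_prediction(learner_predictions, learner_weights)
peak_rate_naive = naive_prediction([30.0, 42.0, 45.0, 51.0, 60.0])
```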
Affiliation(s)
- Jason R Gantenberg
  - Correspondence to Dr. Jason R. Gantenberg, Department of Health Services, Policy and Practice, Brown University School of Public Health, Providence, RI 02912
3
Mason L, Berrington de Gonzalez A, Garcia-Closas M, Chanock SJ, Hicks B, Almeida JS. Interpretable, non-mechanistic forecasting using empirical dynamic modeling and interactive visualization. PLoS One 2023; 18:e0277149. [PMID: 37011060 PMCID: PMC10069763 DOI: 10.1371/journal.pone.0277149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2022] [Accepted: 03/03/2023] [Indexed: 04/05/2023] Open
Abstract
Forecasting methods are notoriously difficult to interpret, particularly when the relationship between the data and the resulting forecasts is not obvious. Interpretability is an important property of a forecasting method because it allows the user to complement the forecasts with their own knowledge, a process which leads to more applicable results. In general, mechanistic methods are more interpretable than non-mechanistic methods, but they require explicit knowledge of the underlying dynamics. In this paper, we introduce EpiForecast, a tool which performs interpretable, non-mechanistic forecasts using interactive visualization and a simple, data-focused forecasting technique based on empirical dynamic modelling. EpiForecast's primary feature is a four-plot interactive dashboard which displays a variety of information to help the user understand how the forecasts are generated. In addition to point forecasts, the tool produces distributional forecasts using a kernel density estimation method; these are visualized using color gradients to produce a quick, intuitive visual summary of the estimated future. To keep the work FAIR and preserve privacy, we have released the tool as an entirely in-browser web application.
Affiliation(s)
- Lee Mason
  - Queen's University Belfast, Belfast, United Kingdom
- Stephen J Chanock
  - National Institutes of Health, Rockville, Maryland, United States of America
- Jonas S Almeida
  - National Institutes of Health, Rockville, Maryland, United States of America
4
Qureshi M, Khan S, Bantan RAR, Daniyal M, Elgarhy M, Marzo RR, Lin Y. Modeling and Forecasting Monkeypox Cases Using Stochastic Models. J Clin Med 2022; 11:6555. [PMID: 36362783 PMCID: PMC9659136 DOI: 10.3390/jcm11216555] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/28/2022] [Revised: 10/24/2022] [Accepted: 10/27/2022] [Indexed: 08/25/2023] Open
Abstract
BACKGROUND Monkeypox virus is gaining attention due to its severity and spread among people. This study sheds light on the modeling and forecasting of new monkeypox cases. Knowledge about the future situation of the virus using a more accurate time series and stochastic models is required for future actions and plans to cope with the challenge. METHODS We conduct a side-by-side comparison of the machine learning approach with the traditional time series model. The multilayer perceptron model (MLP), a machine learning technique, and the Box-Jenkins methodology, also known as the ARIMA model, are used for classical modeling. Both methods are applied to the Monkeypox cumulative data set and compared using different model selection criteria such as root mean square error, mean square error, mean absolute error, and mean absolute percentage error. RESULTS With a root mean square error of 150.78, the monkeypox series follows the ARIMA (7,1,7) model among the other potential models. Comparatively, we use the multilayer perceptron (MLP) model, which employs the sigmoid activation function and has a different number of hidden neurons in a single hidden layer. The root mean square error of the MLP model, which uses a single input and ten hidden neurons, is 54.40, significantly lower than that of the ARIMA model. The actual confirmed cases versus estimated or fitted plots also demonstrate that the multilayer perceptron model has a better fit for the monkeypox data than the ARIMA model. CONCLUSIONS AND RECOMMENDATION When it comes to predicting monkeypox, the machine learning method outperforms the traditional time series. A better match can be achieved in future studies by applying the extreme learning machine model (ELM), support vector machine (SVM), and some other methods with various activation functions. It is thus concluded that the selected data provide a real picture of the virus. 
If the situation remains the same, governments and other stakeholders should ensure that Standard Operating Procedures (SOPs) are followed by the public, as the trend will continue rising over the upcoming 10 days. However, governments should also undertake serious interventions to cope with the virus. LIMITATION In the ARIMA models selected for forecasting, we did not incorporate the effect of covariates such as net migration of monkeypox virus patients, government interventions, etc.
Affiliation(s)
- Moiz Qureshi
  - Department of Statistics, Shaheed Benazir Bhutto University, Nawabshah 67450, Pakistan
- Shahid Khan
  - Department of Mathematics, National University of Modern Languages, Islamabad 44000, Pakistan
- Rashad A. R. Bantan
  - Department of Marine Geology, Faculty of Marine Science, King AbdulAziz University, Jeddah 21551, Saudi Arabia
- Muhammad Daniyal
  - Department of Statistics, The Islamia University of Bahawalpur, Bahawalpur 63100, Pakistan
- Mohammed Elgarhy
  - The Higher Institute of Commercial Sciences, Al Mahalla Al Kubra 31951, Egypt
- Roy Rillera Marzo
  - Department of Community Medicine, International Medical School, Management and Science University, Shah Alam 40100, Selangor, Malaysia
  - Global Public Health, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Jalan Lagoon Selatan, Subang Jaya 47500, Selangor, Malaysia
- Yulan Lin
  - Department of Epidemiology and Health Statistics, School of Public Health, Fujian Medical University, Fuzhou 350122, China
5
Roster K, Connaughton C, Rodrigues FA. Forecasting new diseases in low-data settings using transfer learning. Chaos, Solitons, and Fractals 2022; 161:112306. [PMID: 35765601 PMCID: PMC9222348 DOI: 10.1016/j.chaos.2022.112306] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/10/2022] [Revised: 05/11/2022] [Accepted: 06/03/2022] [Indexed: 06/15/2023]
Abstract
Recent infectious disease outbreaks, such as the COVID-19 pandemic and the Zika epidemic in Brazil, have demonstrated both the importance and difficulty of accurately forecasting novel infectious diseases. When new diseases first emerge, we have little knowledge of the transmission process, the level and duration of immunity to reinfection, or other parameters required to build realistic epidemiological models. Time series forecasts and machine learning, while less reliant on assumptions about the disease, require large amounts of data that are also not available in early stages of an outbreak. In this study, we examine how knowledge of related diseases can help make predictions of new diseases in data-scarce environments using transfer learning. We implement both an empirical and a synthetic approach. Using data from Brazil, we compare how well different machine learning models transfer knowledge between two different dataset pairs: case counts of (i) dengue and Zika, and (ii) influenza and COVID-19. In the synthetic analysis, we generate data with an SIR model using different transmission and recovery rates, and then compare the effectiveness of different transfer learning methods. We find that transfer learning offers the potential to improve predictions, even beyond a model based on data from the target disease, though the appropriate source disease must be chosen carefully. While imperfect, these models offer an additional input for decision makers for pandemic response.
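The synthetic analysis in this abstract generates data from an SIR model under different transmission and recovery rates. A minimal discrete-time sketch of that idea follows; the daily Euler step, the population size, and the parameter values are assumptions made for illustration and are not taken from the paper.

```python
def simulate_sir(beta, gamma, population, initial_infected, days):
    """Simulate a deterministic SIR epidemic with a one-day Euler step.

    beta  -- transmission rate per day (assumed value, chosen by the caller)
    gamma -- recovery rate per day (assumed value, chosen by the caller)
    Returns the list of new infections per day.
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    recovered = 0.0
    new_cases = []
    for _ in range(days):
        infections = beta * susceptible * infected / population
        infections = min(infections, susceptible)  # cannot infect more than remain
        recoveries = gamma * infected
        susceptible -= infections
        infected += infections - recoveries
        recovered += recoveries
        new_cases.append(infections)
    return new_cases

# Two hypothetical "diseases" that differ only in transmission rate,
# e.g. a source series to pre-train on and a target series to predict.
source_series = simulate_sir(beta=0.5, gamma=0.2, population=1000, initial_infected=1, days=100)
target_series = simulate_sir(beta=0.3, gamma=0.2, population=1000, initial_infected=1, days=100)
```

Series generated this way could serve as the source and target datasets when comparing transfer learning methods, with the gap between the parameter sets controlling how closely related the two diseases are.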
Affiliation(s)
- Kirstin Roster
  - Institute of Mathematics and Computer Science, University of São Paulo, Avenida Trabalhador São Carlense 400, São Carlos 13566-590, São Paulo, Brazil
- Colm Connaughton
  - Mathematics Institute, University of Warwick, Coventry CV4 7AL, United Kingdom
  - London Mathematical Laboratory, 8 Margravine Gardens, W6 8RH London, United Kingdom
- Francisco A Rodrigues
  - Institute of Mathematics and Computer Science, University of São Paulo, Avenida Trabalhador São Carlense 400, São Carlos 13566-590, São Paulo, Brazil
6
Zhang R, Wang Y, Lv Z, Pei S. Evaluating the impact of stay-at-home and quarantine measures on COVID-19 spread. BMC Infect Dis 2022; 22:648. [PMID: 35896977 PMCID: PMC9326419 DOI: 10.1186/s12879-022-07636-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/15/2022] [Accepted: 07/19/2022] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND During the early stage of the COVID-19 pandemic, many countries implemented non-pharmaceutical interventions (NPIs) to control the transmission of SARS-CoV-2, the causative pathogen of COVID-19. Among those NPIs, stay-at-home and quarantine measures were widely adopted and enforced. Understanding the effectiveness of stay-at-home and quarantine measures can inform decision-making and control planning during the ongoing COVID-19 pandemic and for future disease outbreaks. METHODS In this study, we use mathematical models to evaluate the impact of stay-at-home and quarantine measures on COVID-19 spread in four cities that experienced large-scale outbreaks in the spring of 2020: Wuhan, New York, Milan, and London. We develop a susceptible-exposed-infected-removed (SEIR)-type model with components of self-isolation and quarantine and couple this disease transmission model with a data assimilation method. By calibrating the model to case data, we estimate key epidemiological parameters before lockdown in each city. We further examine the impact of stay-at-home and quarantine rates on COVID-19 spread after lockdown using counterfactual model simulations. RESULTS Results indicate that self-isolation of susceptible population is necessary to contain the outbreak. At a given rate, self-isolation of susceptible population induced by stay-at-home orders is more effective than quarantine of SARS-CoV-2 contacts in reducing effective reproductive numbers [Formula: see text]. Variation in self-isolation and quarantine rates can also considerably affect the duration of outbreaks, attack rates and peak timing. We generate counterfactual simulations to estimate effectiveness of stay-at-home and quarantine measures. Without these two measures, the cumulative confirmed cases could be much higher than reported numbers within 40 days after lockdown in Wuhan, New York, Milan, and London. 
CONCLUSIONS Our findings underscore the essential role of stay-at-home orders and quarantine of SARS-CoV-2 contacts during the early phase of the pandemic.
Affiliation(s)
- Renquan Zhang
  - School of Mathematical Sciences, Dalian University of Technology, 116024 Dalian, China
- Yu Wang
  - School of Mathematical Sciences, Dalian University of Technology, 116024 Dalian, China
- Zheng Lv
  - School of Control Science and Engineering, Dalian University of Technology, 116024 Dalian, China
- Sen Pei
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, 10032 New York, USA
7
Does knowing the influenza epidemic threshold has been reached influence the performance of influenza case definitions? PLoS One 2022; 17:e0270740. [PMID: 35776716 PMCID: PMC9249166 DOI: 10.1371/journal.pone.0270740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2021] [Accepted: 06/16/2022] [Indexed: 12/02/2022] Open
Abstract
Background Disease surveillance using adequate case definitions is very important. The objective of the study was to compare the performance of influenza case definitions and influenza symptoms in the first two epidemic weeks with respect to other epidemic weeks. Methods We analysed cases of acute respiratory infection detected by the network of sentinel primary care physicians of Catalonia for 10 seasons. We calculated the diagnostic odds ratio (DOR) and 95% confidence intervals (CI) for the first two epidemic weeks and for other epidemic weeks. Results A total of 4,338 samples were collected in the epidemic weeks, of which 2,446 (56.4%) were positive for influenza. The most predictive case definition for laboratory-confirmed influenza was the WHO case definition for influenza-like illness (ILI) in the first two epidemic weeks (DOR 2.10; 95% CI 1.57–2.81) and in other epidemic weeks (DOR 2.31; 95% CI 1.96–2.72). The most predictive symptom was fever. After knowing that epidemic threshold had been reached, the DOR of the ILI WHO case definition in children aged <5 years and cough and fever in this group increased (190%, 170% and 213%, respectively). Conclusions During influenza epidemics, differences in the performance of the case definition and the discriminative ability of symptoms were found according to whether it was known that the epidemic threshold had been reached or not. This suggests that sentinel physicians are stricter in selecting samples to send to the laboratory from patients who present symptoms more specific to influenza after rather than before an influenza epidemic has been declared.
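The diagnostic odds ratio (DOR) and its 95% confidence interval, the measure used in this abstract, can be computed from a 2x2 table as sketched below. The interval uses the standard log-scale (Wald) approximation, which is an assumption here since the abstract does not state the method; the cell counts are hypothetical.

```python
import math

def diagnostic_odds_ratio(tp, fp, fn, tn, z=1.96):
    """Return (DOR, lower, upper) for a 2x2 diagnostic table.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. The interval is exp(log(DOR) +/- z * SE), where
    SE = sqrt(1/tp + 1/fp + 1/fn + 1/tn).
    """
    if min(tp, fp, fn, tn) <= 0:
        raise ValueError("all four cells must be positive")
    dor = (tp * tn) / (fp * fn)
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(dor) - z * se_log_dor)
    upper = math.exp(math.log(dor) + z * se_log_dor)
    return dor, lower, upper

# Hypothetical counts: case definition met vs. laboratory-confirmed influenza.
dor, ci_lower, ci_upper = diagnostic_odds_ratio(tp=60, fp=40, fn=30, tn=70)
```

A DOR whose interval lies entirely above 1 indicates that meeting the case definition is associated with laboratory confirmation.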
8
Petropoulos F, Makridakis S, Stylianou N. COVID-19: Forecasting confirmed cases and deaths with a simple time series model. International Journal of Forecasting 2022; 38:439-452. [PMID: 33311822 PMCID: PMC7717777 DOI: 10.1016/j.ijforecast.2020.11.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 05/25/2023]
Abstract
Forecasting the outcome of outbreaks as early and as accurately as possible is crucial for decision-making and policy implementations. A significant challenge faced by forecasters is that not all outbreaks and epidemics turn into pandemics, making the prediction of their severity difficult. At the same time, the decisions made to enforce lockdowns and other mitigating interventions versus their socioeconomic consequences are not only hard to make, but also highly uncertain. The majority of modeling approaches to outbreaks, epidemics, and pandemics take an epidemiological approach that considers biological and disease processes. In this paper, we accept the limitations of forecasting to predict the long-term trajectory of an outbreak, and instead, we propose a statistical, time series approach to modelling and predicting the short-term behavior of COVID-19. Our model assumes a multiplicative trend, aiming to capture the continuation of the two variables we predict (global confirmed cases and deaths) as well as their uncertainty. We present the timeline of producing and evaluating 10-day-ahead forecasts over a period of four months. Our simple model offers competitive forecast accuracy and estimates of uncertainty that are useful and practically relevant.
Affiliation(s)
- Spyros Makridakis
  - Institute for the Future (IFF), University of Nicosia, Nicosia, Cyprus
- Neophytos Stylianou
  - International Institute for Compassionate Care, Cyprus
  - School of Management, University of Bath, UK
9
Abstract
Influenza is a common respiratory infection that causes considerable morbidity and mortality worldwide each year. In recent years, along with the improvement in computational resources, there have been a number of important developments in the science of influenza surveillance and forecasting. Influenza surveillance systems have been improved by synthesizing multiple sources of information. Influenza forecasting has developed into an active field, with annual challenges in the United States that have stimulated improved methodologies. Work continues on the optimal approaches to assimilating surveillance data and information on relevant driving factors to improve estimates of the current situation (nowcasting) and to forecast future dynamics.
Affiliation(s)
- Sheikh Taslim Ali
  - World Health Organization Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China
- Benjamin J Cowling
  - World Health Organization Collaborating Centre for Infectious Disease Epidemiology and Control, School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong Special Administrative Region, China
10
Sundar S, Schwab P, Tan JZH, Romero-Brufau S, Celi LA, Wangmo D, Penna ND. Forecasting the COVID-19 Pandemic: Lessons learned and future directions. medRxiv: The Preprint Server for Health Sciences 2021:2021.11.06.21266007. [PMID: 34806093 PMCID: PMC8603143 DOI: 10.1101/2021.11.06.21266007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
The Coronavirus Disease 2019 (COVID-19) has demonstrated that accurate forecasts of infection and mortality rates are essential for informing healthcare resource allocation, designing countermeasures, implementing public health policies, and increasing public awareness. However, there exist a multitude of modeling methodologies, and their relative performances in accurately forecasting pandemic dynamics are not currently comprehensively understood. In this paper, we introduce the non-mechanistic MIT-LCP forecasting model, and assess and compare its performance to various mechanistic and non-mechanistic models that have been proposed for forecasting COVID-19 dynamics. We performed a comprehensive experimental evaluation which covered the time period of November 2020 to April 2021, in order to determine the relative performances of MIT-LCP and seven other forecasting models from the United States' Centers for Disease Control and Prevention (CDC) Forecast Hub. Our results show that there exist forecasting scenarios well-suited to both mechanistic and non-mechanistic models, with mechanistic models being particularly performant for forecasts that are further in the future when recent data may not be as informative, and non-mechanistic models being more effective with shorter prediction horizons when recent representative data are available. Improving our understanding of which forecasting approaches are more reliable, and in which forecasting scenarios, can assist effective pandemic preparation and management.
11
Inter-provincial disparity of COVID-19 transmission and control in Nepal. Sci Rep 2021; 11:13363. [PMID: 34172764 PMCID: PMC8233407 DOI: 10.1038/s41598-021-92253-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/08/2021] [Accepted: 05/24/2021] [Indexed: 12/24/2022] Open
Abstract
Despite the global efforts to mitigate the ongoing COVID-19 pandemic, the disease transmission and the effective controls still remain uncertain as the outcome of the epidemic varies from place to place. In this regard, the province-wise data from Nepal provides a unique opportunity to study the effective control strategies. This is because (a) some provinces of Nepal share an open-border with India, resulting in a significantly high inflow of COVID-19 cases from India; (b) despite the inflow of a considerable number of cases, the local spread was quite controlled until mid-June of 2020, presumably due to control policies implemented; and (c) the relaxation of policies caused a rapid surge of the COVID-19 cases, providing a multi-phasic trend of disease dynamics. In this study, we used this unique data set to explore the inter-provincial disparities of the important indicators, such as epidemic trend, epidemic growth rate, and reproduction numbers. Furthermore, we extended our analysis to identify prevention and control policies that are effective in altering these indicators. Our analysis identified a noticeable inter-province variation in the epidemic trend (3 per day to 104 per day linear increase during third surge period), the median daily growth rate (1 to 4% per day exponential growth), the basic reproduction number (0.71 to 1.21), and the effective reproduction number (maximum values ranging from 1.20 to 2.86). Importantly, results from our modeling show that the type and number of control strategies that are effective in altering the indicators vary among provinces, underscoring the need for province-focused strategies along with the national-level strategy in order to ensure the control of a local spread.
12
Abstract
Influenza forecasting in the United States (US) is complex and challenging due to spatial and temporal variability, nested geographic scales of interest, and heterogeneous surveillance participation. Here we present Dante, a multiscale influenza forecasting model that learns rather than prescribes spatial, temporal, and surveillance data structure and generates coherent forecasts across state, regional, and national scales. We retrospectively compare Dante's short-term and seasonal forecasts for previous flu seasons to the Dynamic Bayesian Model (DBM), a leading competitor. Dante outperformed DBM for nearly all spatial units, flu seasons, geographic scales, and forecasting targets. Dante's sharper and more accurate forecasts also suggest greater public health utility. Dante placed 1st in the Centers for Disease Control and Prevention's prospective 2018/19 FluSight challenge in both the national and regional competition and the state competition. The methodology underpinning Dante can be used in other seasonal disease forecasting contexts having nested geographic scales of interest.
Affiliation(s)
- Dave Osthus
  - Los Alamos National Laboratory, Statistical Sciences Group, Los Alamos, NM, USA
- Kelly R Moran
  - Los Alamos National Laboratory, Statistical Sciences Group, Los Alamos, NM, USA
  - Department of Statistical Science, Duke University, Durham, NC, USA
13
A stacked ensemble method for forecasting influenza-like illness visit volumes at emergency departments. PLoS One 2021; 16:e0241725. [PMID: 33750974 PMCID: PMC7984626 DOI: 10.1371/journal.pone.0241725] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/14/2020] [Accepted: 02/27/2021] [Indexed: 11/19/2022] Open
Abstract
Accurate and reliable short-term forecasts of influenza-like illness (ILI) visit volumes at emergency departments can improve staffing and resource allocation decisions within hospitals. In this paper, we developed a stacked ensemble model that averages the predictions from various competing methodologies in the current frontier for ILI-related forecasts. We also constructed a back-of-the-envelope prediction interval for the stacked ensemble, which provides a conservative characterization of the uncertainty in the stacked ensemble predictions. We assessed the accuracy and reliability of our model with 1 to 4 weeks ahead forecast targets using real-time hospital-level data on weekly ILI visit volumes during the 2012-2018 flu seasons in the Alberta Children's Hospital, located in Calgary, Alberta, Canada. Our results suggest the forecasting performance of the stacked ensemble meets or exceeds the performance of the individual models over all forecast targets.
14
Pei S, Teng X, Lewis P, Shaman J. Optimizing respiratory virus surveillance networks using uncertainty propagation. Nat Commun 2021; 12:222. [PMID: 33431854 PMCID: PMC7801666 DOI: 10.1038/s41467-020-20399-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/20/2019] [Accepted: 12/01/2020] [Indexed: 02/07/2023] Open
Abstract
Infectious disease prevention, control and forecasting rely on sentinel observations; however, many locations lack the capacity for routine surveillance. Here we show that, by using data from multiple sites collectively, accurate estimation and forecasting of respiratory diseases for locations without surveillance is feasible. We develop a framework to optimize surveillance sites that suppresses uncertainty propagation in a networked disease transmission model. Using influenza outbreaks from 35 US states, the optimized system generates better near-term predictions than alternate systems designed using population and human mobility. We also find that monitoring regional population centers serves as a reasonable proxy for the optimized network and could direct surveillance for diseases with limited records. The proxy method is validated using model simulations for 3,108 US counties and historical data for two other respiratory pathogens - human metapneumovirus and seasonal coronavirus - from 35 US states and can be used to guide systemic allocation of surveillance efforts.
Affiliation(s)
- Sen Pei
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
- Xian Teng
  - School of Computing and Information, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Paul Lewis
  - Integrated Biosurveillance Section, Armed Forces Health Surveillance Branch, Silver Spring, MD, 20904, USA
- Jeffrey Shaman
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
15
Gibson GC, Moran KR, Reich NG, Osthus D. Improving probabilistic infectious disease forecasting through coherence. PLoS Comput Biol 2021; 17:e1007623. [PMID: 33406068 PMCID: PMC7837472 DOI: 10.1371/journal.pcbi.1007623] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/23/2019] [Revised: 01/26/2021] [Accepted: 09/14/2020] [Indexed: 11/19/2022] Open
Abstract
With an estimated $10.4 billion in medical costs and 31.4 million outpatient visits each year, influenza poses a serious burden of disease in the United States. To provide insights and advance warning into the spread of influenza, the U.S. Centers for Disease Control and Prevention (CDC) runs a challenge for forecasting weighted influenza-like illness (wILI) at the national and regional level. Many models produce independent forecasts for each geographical unit, ignoring the constraint that the national wILI is a weighted sum of regional wILI, where the weights correspond to the population size of the region. We propose a novel algorithm that transforms a set of independent forecast distributions to obey this constraint, which we refer to as probabilistically coherent. Enforcing probabilistic coherence led to an increase in forecast skill for 79% of the models we tested over multiple flu seasons, highlighting the importance of respecting the forecasting system’s geographical hierarchy. Seasonal influenza causes a significant public health burden nationwide. Accurate influenza forecasting may help public health officials allocate resources and plan responses to emerging outbreaks. The U.S. Centers for Disease Control and Prevention (CDC) reports influenza data at multiple geographical units, including regionally and nationally, where the national data are by construction a weighted sum of the regional data. In an effort to improve influenza forecast accuracy across all models submitted to the CDC’s annual flu forecasting challenge, we examined the effect of imposing this geographical constraint on the set of independent forecasts, made publicly available by the CDC. We developed a novel method to transform forecast densities to obey the geographical constraint that respects the correlation structure between geographical units. This method showed consistent improvement across 79% of models and that held when stratified by targets and test seasons. 
Our method can be applied to other forecasting systems both within and outside an infectious disease context that have a geographical hierarchy.
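The coherence constraint described in this abstract (national wILI equals a population-weighted sum of regional wILI) can be illustrated on point forecasts. The sketch below is not the authors' algorithm, which transforms full forecast distributions. It is a simple least-squares projection, and the function name `make_coherent` and the example numbers are hypothetical.

```python
import numpy as np

def make_coherent(regional, national, weights):
    """Adjust point forecasts so that national == weights @ regional.

    Assumption: a plain orthogonal (least-squares) projection onto the
    constraint, used here only to illustrate what "coherent" means.
    """
    regional = np.asarray(regional, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # population shares sum to 1
    # Stack forecasts as y = [national, regional...]; coherence means a @ y == 0.
    a = np.concatenate(([1.0], -weights))
    y = np.concatenate(([float(national)], regional))
    # Orthogonal projection of y onto the hyperplane a @ y == 0.
    y_coherent = y - a * (a @ y) / (a @ a)
    return y_coherent[1:], y_coherent[0]

# Hypothetical example: two equally sized regions and an inconsistent national forecast.
regional_adj, national_adj = make_coherent([2.0, 3.0], 3.0, [1.0, 1.0])
```

After the adjustment, the national value equals the weighted mean of the adjusted regional values, and every forecast has moved as little as possible in the least-squares sense.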
Affiliation(s)
- Graham Casey Gibson
  - Statistical Sciences Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, Massachusetts, United States of America
- Kelly R. Moran
  - Statistical Sciences Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
  - Department of Statistical Science, Duke University, Durham, North Carolina, United States of America
- Nicholas G. Reich
  - Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, Massachusetts, United States of America
- Dave Osthus
  - Statistical Sciences Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
|
16
|
Bomfim R, Pei S, Shaman J, Yamana T, Makse HA, Andrade JS, Lima Neto AS, Furtado V. Predicting dengue outbreaks at neighbourhood level using human mobility in urban areas. J R Soc Interface 2020; 17:20200691. [PMID: 33109025 DOI: 10.1098/rsif.2020.0691] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/19/2022] Open
Abstract
Dengue is a vector-borne disease transmitted by the Aedes genus mosquito. It causes financial burdens on public health systems and considerable morbidity and mortality. Tropical regions in the Americas and Asia are the areas most affected by the virus. Fortaleza is a city with approximately 2.6 million inhabitants in northeastern Brazil that, during the recent decades, has been suffering from endemic dengue transmission, interspersed with larger epidemics. The objective of this paper is to study the impact of human mobility in urban areas on the spread of the dengue virus, and to test whether human mobility data can be used to improve predictions of dengue virus transmission at the neighbourhood level. We present two distinct forecasting systems for dengue transmission in Fortaleza: the first using artificial neural network methods and the second developed using a mechanistic model of disease transmission. We then present enhanced versions of the two forecasting systems that incorporate bus transportation data cataloguing movement among 119 neighbourhoods in Fortaleza. Each forecasting system was used to perform retrospective forecasts for historical dengue outbreaks from 2007 to 2015. Results show that both artificial neural networks and mechanistic models can accurately forecast dengue cases, and that the inclusion of human mobility data substantially improves the performance of both forecasting systems. While the mechanistic models perform better in capturing seasons with large-scale outbreaks, the neural networks more accurately forecast outbreak peak timing, peak intensity and annual dengue time series. These results have two practical implications: they support the creation of public policies from the use of the models created here to combat the disease and help to understand the impact of urban mobility on the epidemic in large cities.
Affiliation(s)
- Rafael Bomfim
  - Programa de Pós Graduação em Informática Aplicada Universidade de Fortaleza, Fortaleza, Brazil
- Sen Pei
  - Department of Environmental Health Sciences, Columbia University, New York, NY 10032, USA
- Jeffrey Shaman
  - Department of Environmental Health Sciences, Columbia University, New York, NY 10032, USA
- Teresa Yamana
  - Department of Environmental Health Sciences, Columbia University, New York, NY 10032, USA
- Hernán A Makse
  - Levich Institute and Physics Department, City College of New York, New York, NY 10031, USA
- José S Andrade
  - Departamento de Física, Universidade Federal do Ceará, Campus do Pici, 60451-970 Fortaleza, Ceará, Brazil
- Antonio S Lima Neto
  - Secretaria Municipal de Saúde de Fortaleza (SMS-Fortaleza), Fortaleza, Ceará, Brazil
  - Centro de Ciências da Saúde, Universidade de Fortaleza (UNIFOR), Fortaleza, Ceará, Brazil
- Vasco Furtado
  - Programa de Pós Graduação em Informática Aplicada Universidade de Fortaleza, Fortaleza, Brazil
|
17
|
Pei S, Shaman J. Aggregating forecasts of multiple respiratory pathogens supports more accurate forecasting of influenza-like illness. PLoS Comput Biol 2020; 16:e1008301. [PMID: 33090997 PMCID: PMC7608986 DOI: 10.1371/journal.pcbi.1008301] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/05/2019] [Revised: 11/03/2020] [Accepted: 09/02/2020] [Indexed: 12/12/2022] Open
Abstract
Influenza-like illness (ILI) is a commonly measured syndromic signal representative of a range of acute respiratory infections. Reliable forecasts of ILI can support better preparation for patient surges in healthcare systems. Although ILI is an amalgamation of multiple pathogens with variable seasonal phasing and attack rates, most existing process-based forecasting systems treat ILI as a single infectious agent. Here, using ILI records and virologic surveillance data, we show that ILI signal can be disaggregated into distinct viral components. We generate separate predictions for six contributing pathogens (influenza A/H1, A/H3, B, respiratory syncytial virus, and human parainfluenza virus types 1-2 and 3), and develop a method to forecast ILI by aggregating these predictions. The relative contribution of each pathogen to the total ILI signal is estimated using a Markov Chain Monte Carlo (MCMC) method upon forecast aggregation. We find highly variable overall contributions from influenza type A viruses across seasons, but relatively stable contributions for the other pathogens. Using historical data from 1997 to 2014 at US national and regional levels, the proposed forecasting system generates improved predictions of both seasonal and near-term targets relative to a baseline method that simulates ILI as a single pathogen. The hierarchical forecasting system can generate predictions for each viral component, as well as infer and predict their contributions to ILI, which may additionally help physicians determine the etiological causes of ILI in clinical settings.
Affiliation(s)
- Sen Pei
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, United States of America
- Jeffrey Shaman
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, United States of America
|
18
|
Cheng HY, Wu YC, Lin MH, Liu YL, Tsai YY, Wu JH, Pan KH, Ke CJ, Chen CM, Liu DP, Lin IF, Chuang JH. Applying Machine Learning Models with An Ensemble Approach for Accurate Real-Time Influenza Forecasting in Taiwan: Development and Validation Study. J Med Internet Res 2020; 22:e15394. [PMID: 32755888 PMCID: PMC7439145 DOI: 10.2196/15394] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/08/2019] [Revised: 12/21/2019] [Accepted: 06/13/2020] [Indexed: 12/14/2022] Open
Abstract
Background: Changeful seasonal influenza activity in subtropical areas such as Taiwan causes problems in epidemic preparedness. The Taiwan Centers for Disease Control has maintained real-time national influenza surveillance systems since 2004. In addition to timely monitoring, epidemic forecasting using the national influenza surveillance data can provide pivotal information for public health response. Objective: We aimed to develop predictive models using machine learning to provide real-time influenza-like illness forecasts. Methods: Using surveillance data of influenza-like illness visits from emergency departments (from the Real-Time Outbreak and Disease Surveillance System), outpatient departments (from the National Health Insurance database), and the records of patients with severe influenza with complications (from the National Notifiable Disease Surveillance System), we developed 4 machine learning models (autoregressive integrated moving average, random forest, support vector regression, and extreme gradient boosting) to produce weekly influenza-like illness predictions for a given week and 3 subsequent weeks. We established a framework of the machine learning models and used an ensemble approach called stacking to integrate these predictions. We trained the models using historical data from 2008-2014. We evaluated their predictive ability during 2015-2017 for each of the 4-week time periods using Pearson correlation, mean absolute percentage error (MAPE), and hit rate of trend prediction. A dashboard website was built to visualize the forecasts, and the results of real-world implementation of this forecasting framework in 2018 were evaluated using the same metrics.
Results: All models, especially the ensemble model, could accurately predict the timing and magnitudes of the seasonal peaks in the then-current week (nowcast) (ρ=0.802-0.965; MAPE: 5.2%-9.2%; hit rate: 0.577-0.756), 1-week (ρ=0.803-0.918; MAPE: 8.3%-11.8%; hit rate: 0.643-0.747), 2-week (ρ=0.783-0.867; MAPE: 10.1%-15.3%; hit rate: 0.669-0.734), and 3-week forecasts (ρ=0.676-0.801; MAPE: 12.0%-18.9%; hit rate: 0.643-0.786). In real-world implementation in 2018, the forecasting performance was still accurate in nowcasts (ρ=0.875-0.969; MAPE: 5.3%-8.0%; hit rate: 0.582-0.782) and remained satisfactory in 3-week forecasts (ρ=0.721-0.908; MAPE: 7.6%-13.5%; hit rate: 0.596-0.904). Conclusions: This machine learning and ensemble approach can make accurate, real-time influenza-like illness forecasts for a 4-week period, and thus facilitate decision making.
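The three evaluation metrics named in this abstract (Pearson correlation, MAPE, and hit rate of trend prediction) can be sketched as follows. The abstract does not define the hit rate precisely, so `trend_hit_rate` below rests on an assumption: a forecast counts as a hit when it moves in the same direction, relative to the previous observed value, as the next observation does. All function names are hypothetical.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def _sign(v):
    return (v > 0) - (v < 0)

def trend_hit_rate(actual, predicted):
    """Share of steps where the predicted direction of change matches the observed one.

    Assumed definition: both directions are measured against the previous
    observed value actual[t - 1].
    """
    hits = sum(
        _sign(actual[t] - actual[t - 1]) == _sign(predicted[t] - actual[t - 1])
        for t in range(1, len(actual))
    )
    return hits / (len(actual) - 1)
```

For example, `mape([100, 200], [110, 180])` averages two 10% errors, and `trend_hit_rate([1, 2, 1], [1, 3, 3])` scores one correct direction out of two.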
Affiliation(s)
- Min-Hau Lin
  - Taiwan Centers for Disease Control, Taipei, Taiwan
- Yu-Lun Liu
  - Taiwan Centers for Disease Control, Taipei, Taiwan
- Jo-Hua Wu
  - Value Lab, Acer Inc., Taipei, Taiwan
- Chih-Jung Ke
  - Taiwan Centers for Disease Control, Taipei, Taiwan
- Ding-Ping Liu
  - Taiwan Centers for Disease Control, Taipei, Taiwan
  - National Taipei University of Nursing and Health Sciences, Taipei, Taiwan
- I-Feng Lin
  - Institute of Public Health, National Yang-Ming University, Taipei, Taiwan
- Jen-Hsiang Chuang
  - Taiwan Centers for Disease Control, Taipei, Taiwan
  - Institute of Public Health, National Yang-Ming University, Taipei, Taiwan
|
19
|
Hsiang S, Allen D, Annan-Phan S, Bell K, Bolliger I, Chong T, Druckenmiller H, Huang LY, Hultgren A, Krasovich E, Lau P, Lee J, Rolf E, Tseng J, Wu T. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 2020; 584:262-267. [PMID: 32512578 DOI: 10.1038/s41586-020-2404-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/22/2020] [Accepted: 05/26/2020] [Indexed: 05/28/2023]
Abstract
Governments around the world are responding to the coronavirus disease 2019 (COVID-19) pandemic [1], caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), with unprecedented policies designed to slow the growth rate of infections. Many policies, such as closing schools and restricting populations to their homes, impose large and visible costs on society; however, their benefits cannot be directly observed and are currently understood only through process-based simulations [2-4]. Here we compile data on 1,700 local, regional and national non-pharmaceutical interventions that were deployed in the ongoing pandemic across localities in China, South Korea, Italy, Iran, France and the United States. We then apply reduced-form econometric methods, commonly used to measure the effect of policies on economic growth [5,6], to empirically evaluate the effect that these anti-contagion policies have had on the growth rate of infections. In the absence of policy actions, we estimate that early infections of COVID-19 exhibit exponential growth rates of approximately 38% per day. We find that anti-contagion policies have significantly and substantially slowed this growth. Some policies have different effects on different populations, but we obtain consistent evidence that the policy packages that were deployed to reduce the rate of transmission achieved large, beneficial and measurable health outcomes. We estimate that across these 6 countries, interventions prevented or delayed on the order of 61 million confirmed cases, corresponding to averting approximately 495 million total infections. These findings may help to inform decisions regarding whether or when these policies should be deployed, intensified or lifted, and they can support policy-making in the more than 180 other countries in which COVID-19 has been reported [7].
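To give a sense of scale for the reported growth rate of approximately 38% per day, the short sketch below converts a daily growth rate into a doubling time. The abstract does not say whether the rate is a continuous (log-difference) rate or a day-over-day percentage increase, so both readings are shown as assumptions. The function names are hypothetical.

```python
import math

def doubling_time_continuous(rate_per_day):
    """Doubling time if cases grow as exp(rate * t), i.e. a log-difference rate."""
    return math.log(2) / rate_per_day

def doubling_time_discrete(pct_per_day):
    """Doubling time if cases are multiplied by (1 + pct) each day."""
    return math.log(2) / math.log(1 + pct_per_day)

# Both readings of "38% per day" give a doubling time of roughly two days.
t_cont = doubling_time_continuous(0.38)   # about 1.8 days
t_disc = doubling_time_discrete(0.38)     # about 2.2 days
```

Under either reading, unmitigated case counts would double roughly every two days.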
Affiliation(s)
- Solomon Hsiang
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - National Bureau of Economic Research, Cambridge, MA, USA
  - Centre for Economic Policy Research, London, UK
- Daniel Allen
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
- Sébastien Annan-Phan
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Agricultural & Resource Economics, UC Berkeley, Berkeley, CA, USA
- Kendon Bell
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Manaaki Whenua - Landcare Research, Auckland, New Zealand
- Ian Bolliger
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Energy & Resources Group, UC Berkeley, Berkeley, CA, USA
- Trinetta Chong
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
- Hannah Druckenmiller
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Agricultural & Resource Economics, UC Berkeley, Berkeley, CA, USA
- Luna Yue Huang
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Agricultural & Resource Economics, UC Berkeley, Berkeley, CA, USA
- Andrew Hultgren
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Agricultural & Resource Economics, UC Berkeley, Berkeley, CA, USA
- Emma Krasovich
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
- Peiley Lau
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Agricultural & Resource Economics, UC Berkeley, Berkeley, CA, USA
- Jaecheol Lee
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Agricultural & Resource Economics, UC Berkeley, Berkeley, CA, USA
- Esther Rolf
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
  - Electrical Engineering & Computer Science Department, UC Berkeley, Berkeley, CA, USA
- Jeanette Tseng
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
- Tiffany Wu
  - Global Policy Laboratory, Goldman School of Public Policy, UC Berkeley, Berkeley, CA, USA
|
20
|
Hsiang S, Allen D, Annan-Phan S, Bell K, Bolliger I, Chong T, Druckenmiller H, Huang LY, Hultgren A, Krasovich E, Lau P, Lee J, Rolf E, Tseng J, Wu T. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 2020. [PMID: 32512578 DOI: 10.1101/2020.03.22.20040642] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 05/17/2023]
|
21
|
Cravo Oliveira Hashiguchi T, Ait Ouakrim D, Padget M, Cassini A, Cecchini M. Resistance proportions for eight priority antibiotic-bacterium combinations in OECD, EU/EEA and G20 countries 2000 to 2030: a modelling study. Euro Surveill 2019; 24(20):1800445. [PMID: 31115312 PMCID: PMC6530255 DOI: 10.2807/1560-7917.es.2019.24.20.1800445] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/20/2022]
Abstract
Background: Antimicrobial resistance is widely considered an urgent global health issue due to associated mortality and disability, societal and healthcare costs. Aim: To estimate the past, current and projected future proportion of infections resistant to treatment for eight priority antibiotic-bacterium combinations from 2000 to 2030 for 52 countries. Methods: We collated data from a variety of sources including ResistanceMap and World Bank. Feature selection algorithms and multiple imputation were used to produce a complete historical dataset. Forecasts were derived from an ensemble of three models: exponential smoothing, linear regression and random forest. The latter two were informed by projections of antibiotic consumption, out-of-pocket medical spending, populations aged 64 years and older and under 15 years and real gross domestic product. We incorporated three types of uncertainty, producing 150 estimates for each country-antibiotic-bacterium-year. Results: Average resistance proportions across antibiotic-bacterium combinations could grow moderately from 17% to 18% within the Organisation for Economic Co-operation and Development (OECD; growth in 64% of uncertainty sets), from 18% to 19% in the European Union/European Economic Area (EU/EEA; growth in 87% of uncertainty sets) and from 29% to 31% in Group of Twenty (G20) countries (growth in 62% of uncertainty sets) between 2015 and 2030. There is broad heterogeneity in levels and rates of change across countries and antibiotic-bacterium combinations from 2000 to 2030. Conclusion: If current trends continue, resistance proportions are projected to marginally increase in the coming years. The estimates indicate there is significant heterogeneity in resistance proportions across countries and antibiotic-bacterium combinations.
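The idea of an ensemble forecast built from several simple models can be sketched as below. This is an illustration under stated assumptions, not the study's pipeline: the abstract does not say how the three member forecasts were combined, so an unweighted mean is assumed, the random forest member is omitted to keep the example dependency-free, and the smoothing parameter `alpha`, the function names, and the example series are all hypothetical.

```python
def ses_forecast(series, alpha=0.5):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def linear_trend_forecast(series, steps_ahead=1):
    """Extrapolate an ordinary least-squares line fitted against the time index."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = sum(series) / n
    slope = (
        sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
        / sum((x - x_mean) ** 2 for x in range(n))
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)

def ensemble_mean(forecasts):
    """Assumed combination rule: unweighted average of member forecasts."""
    return sum(forecasts) / len(forecasts)

# Hypothetical resistance proportions (%) for four consecutive years.
history = [10.0, 12.0, 14.0, 16.0]
members = [ses_forecast(history), linear_trend_forecast(history)]
combined = ensemble_mean(members)
```

On a steadily rising series, the smoothing member lags behind the trend while the linear member extrapolates it, so the averaged forecast falls between the two.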
Affiliation(s)
- Driss Ait Ouakrim
  - Organisation for Economic Co-operation and Development (OECD), Paris, France
- Michael Padget
  - Organisation for Economic Co-operation and Development (OECD), Paris, France
- Alessandro Cassini
  - European Centre for Disease Prevention and Control (ECDC), Stockholm, Sweden
- Michele Cecchini
  - Organisation for Economic Co-operation and Development (OECD), Paris, France
|
22
|
Hsiang S, Allen D, Annan-Phan S, Bell K, Bolliger I, Chong T, Druckenmiller H, Huang LY, Hultgren A, Krasovich E, Lau P, Lee J, Rolf E, Tseng J, Wu T. The effect of large-scale anti-contagion policies on the COVID-19 pandemic. Nature 2020; 584:262-267. [PMID: 32512578 DOI: 10.1038/s41586-020-2404-8] [Citation(s) in RCA: 697] [Impact Index Per Article: 174.3] [Received: 03/22/2020] [Accepted: 05/26/2020] [Indexed: 11/09/2022]
|
23
|
Pei S. Influencer identification in dynamical complex systems. Journal of Complex Networks 2020; 8:cnz029. [PMID: 32774857 PMCID: PMC7391989 DOI: 10.1093/comnet/cnz029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/06/2019] [Accepted: 07/13/2019] [Indexed: 06/11/2023]
Abstract
The integrity and functionality of many real-world complex systems hinge on a small set of pivotal nodes, or influencers. In different contexts, these influencers are defined as either structurally important nodes that maintain the connectivity of networks, or dynamically crucial units that can disproportionately impact certain dynamical processes. In practice, identification of the optimal set of influencers in a given system has profound implications in a variety of disciplines. In this review, we survey recent advances in the study of influencer identification developed from different perspectives, and present state-of-the-art solutions designed for different objectives. In particular, we first discuss the problem of finding the minimal number of nodes whose removal would break down the network (i.e. the optimal percolation or network dismantling problem), and then survey methods to locate the essential nodes that are capable of shaping global dynamics with either continuous (e.g. independent cascading models) or discontinuous phase transitions (e.g. threshold models). We conclude the review with a summary and an outlook.
Affiliation(s)
- Sen Pei
  - Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY, USA
|
24
|
Darwish A, Rahhal Y, Jafar A. A comparative study on predicting influenza outbreaks using different feature spaces: application of influenza-like illness data from Early Warning Alert and Response System in Syria. BMC Res Notes 2020; 13:33. [PMID: 31948473 PMCID: PMC6964210 DOI: 10.1186/s13104-020-4889-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/17/2019] [Accepted: 01/03/2020] [Indexed: 11/10/2022] Open
Abstract
Objective: Accurate forecasting of outbreaks of influenza-like illness (ILI) could help public health officials recommend public health actions earlier. We investigated the performance of three different feature spaces in different models to forecast the weekly ILI rate in Syria using EWARS data from the World Health Organization (WHO). The time series feature space was used first, and we applied seven models to it: Naïve, Average, Seasonal naïve, drift, dynamic harmonic regression (Dhr), seasonal and trend decomposition using loess (STL) and TBATS. The second feature space, which resembles some state-of-the-art approaches, we named the 53-weeks-before_52-first-order-difference feature space. The third one, which we proposed, we named the n-years-before_m-weeks-around (YnWm) feature space. Machine learning (ML) and deep learning (DL) models were applied to the second and third feature spaces (generalized linear model (GLM), support vector regression (SVR), gradient boosting (GB), random forest (RF) and long short term memory (LSTM)). Results: The LSTM model of four layers with the 1-year-before_4-weeks-around feature space gave more accurate results than the other models and reached the lowest MAPE of 3.52% and the lowest RMSE of 0.01662. We hope that this modelling methodology can be applied in other countries and therefore help prevent and control influenza worldwide.
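The two error measures quoted in this abstract, MAPE and RMSE, can be sketched as follows. This is a minimal illustration using the standard textbook definitions, not the authors' code; the function names `rmse` and `mape` are hypothetical, and the abstract does not say how zero-valued observations were handled, so the sketch simply rejects them.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: observations equal to zero are rejected, because the
    percentage error is undefined for them.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

Because MAPE is scale-free and RMSE is in the units of the ILI rate, the two figures in the abstract are not directly comparable to each other, which may be why both are reported.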
Affiliation(s)
- Ali Darwish
- Department of Informatics, Higher Institute for Applied Sciences and Technology, Damascus, Syria
- Yasser Rahhal
- Department of Informatics, Higher Institute for Applied Sciences and Technology, Damascus, Syria
- Assef Jafar
- Department of Informatics, Higher Institute for Applied Sciences and Technology, Damascus, Syria
|
25
|
Rangarajan P, Mody SK, Marathe M. Forecasting dengue and influenza incidences using a sparse representation of Google trends, electronic health records, and time series data. PLoS Comput Biol 2019; 15:e1007518. [PMID: 31751346 PMCID: PMC6894887 DOI: 10.1371/journal.pcbi.1007518] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 01/07/2019] [Revised: 12/05/2019] [Accepted: 10/29/2019] [Indexed: 12/20/2022] Open
Abstract
Dengue and influenza-like illness (ILI) are two of the leading causes of viral infection in the world and it is estimated that more than half the world’s population is at risk for developing these infections. It is therefore important to develop accurate methods for forecasting dengue and ILI incidences. Since data from multiple sources (such as dengue and ILI case counts, electronic health records and frequency of multiple internet search terms from Google Trends) can improve forecasts, standard time series analysis methods are inadequate to estimate all the parameter values from the limited amount of data available if we use multiple sources. In this paper, we use a computationally efficient implementation of the known variable selection method that we call the Autoregressive Likelihood Ratio (ARLR) method. This method combines sparse representation of time series data, electronic health records data (for ILI) and Google Trends data to forecast dengue and ILI incidences. This sparse representation method uses an algorithm that maximizes an appropriate likelihood ratio at every step. Using numerical experiments, we demonstrate that our method recovers the underlying sparse model much more accurately than the lasso method. We apply our method to dengue case count data from five countries/states: Brazil, Mexico, Singapore, Taiwan, and Thailand and to ILI case count data from the United States. Numerical experiments show that our method outperforms existing time series forecasting methods in forecasting the dengue and ILI case counts. In particular, our method gives an 18 percent forecast error reduction over a leading method that also uses data from multiple sources. It also performs better than other methods in predicting the peak value of the case count and the peak time.
Dengue and influenza-like illness (ILI) are leading causes of viral infection in the world and hence it is important to develop accurate methods for forecasting their incidence. We use the Autoregressive Likelihood Ratio method, which is a computationally efficient implementation of the variable selection method, in order to obtain a sparse (non-lasso) representation of time series, Google Trends and electronic health records (for ILI) data. This method is used to forecast dengue incidence in five countries/states and ILI incidence in the USA. We show that this method outperforms existing time series methods in forecasting these diseases. The method is general and can also be used to forecast other diseases.
Affiliation(s)
- Prashant Rangarajan
- Departments of Computer Science and Mathematics, Birla Institute of Technology and Science, Pilani, India
- Sandeep K. Mody
- Department of Mathematics, Indian Institute of Science, Bangalore, India
- Madhav Marathe
- Department of Computer Science, Network, Simulation Science and Advanced Computing Division, Biocomplexity Institute, University of Virginia, Charlottesville, Virginia, United States of America
|
26
|
Estimating influenza incidence using search query deceptiveness and generalized ridge regression. PLoS Comput Biol 2019; 15:e1007165. [PMID: 31574086 PMCID: PMC6771994 DOI: 10.1371/journal.pcbi.1007165] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/23/2018] [Accepted: 05/31/2019] [Indexed: 11/22/2022] Open
Abstract
Seasonal influenza is a sometimes surprisingly impactful disease, causing thousands of deaths per year along with much additional morbidity. Timely knowledge of the outbreak state is valuable for managing an effective response. The current state of the art is to gather this knowledge using in-person patient contact. While accurate, this is time-consuming and expensive. This has motivated inquiry into new approaches using internet activity traces, based on the theory that lay observations of health status lead to informative features in internet data. These approaches risk being deceived by activity traces having a coincidental, rather than informative, relationship to disease incidence; to our knowledge, this risk has not yet been quantitatively explored. We evaluated both simulated and real activity traces of varying deceptiveness for influenza incidence estimation using linear regression. We found that deceptiveness knowledge does reduce error in such estimates, that it may help automatically-selected features perform as well as or better than features that require human curation, and that a semantic distance measure derived from the Wikipedia article category tree serves as a useful proxy for deceptiveness. This suggests that disease incidence estimation models should incorporate not only data about how internet features map to incidence but also additional data to estimate feature deceptiveness. By doing so, we may gain one more step along the path to accurate, reliable disease incidence estimation using internet data. This capability would improve public health by decreasing the cost and increasing the timeliness of such estimates.
While often considered a minor infection, seasonal flu kills many thousands of people every year and sickens millions more. The more accurate and up-to-date public health officials’ view of what the seasonal outbreak is, the more effectively the outbreak can be addressed. Currently, this knowledge is based on collating information on patients who enter the health care system. This approach is accurate, but it’s also expensive and slow. Researchers hope that new approaches based on examining what people do and share on the internet may work more cheaply and quickly. Some internet activity, however, has a history of correspondence with disease activity, but this relationship is coincidental rather than informative. For example, some prior work has found a correspondence between zombie-related social media messages and the flu season, so one could plausibly build accurate flu estimates using such messages that are then fooled by the appearance of a new zombie movie. We tested flu estimation models that incorporate information about this risk of deception, finding that knowledge of deceptiveness does indeed produce more accurate estimates; we also identified a method to estimate deceptiveness. Our results suggest that estimation models used in practice should use information about both how inputs map to disease activity and how likely each input is to be deceptive. This may get us one step closer to accurate, reliable disease estimates based on internet data, which would improve public health by making those estimates faster and cheaper.
|
27
|
On the multibin logarithmic score used in the FluSight competitions. Proc Natl Acad Sci U S A 2019; 116:20809-20810. [PMID: 31558612 DOI: 10.1073/pnas.1912147116] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Open
|
28
|
Kandula S, Shaman J. Reappraising the utility of Google Flu Trends. PLoS Comput Biol 2019; 15:e1007258. [PMID: 31374088 PMCID: PMC6693776 DOI: 10.1371/journal.pcbi.1007258] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 10/29/2018] [Revised: 08/14/2019] [Accepted: 07/09/2019] [Indexed: 11/18/2022] Open
Abstract
Estimation of influenza-like illness (ILI) using search trends activity was intended to supplement traditional surveillance systems, and was a motivation behind the development of Google Flu Trends (GFT). However, several studies have previously reported large errors in GFT estimates of ILI in the US. Following recent release of time-stamped surveillance data, which better reflects real-time operational scenarios, we reanalyzed GFT errors. Five influenza seasons were analyzed using three data sources (GFT: an archive of weekly ILI estimates from Google Flu Trends; ILIf: fully observed ILI rates from ILINet; and ILIp: ILI rates available in real time based on partial reporting), and mean square errors (MSE) of GFT and ILIp as estimates of ILIf were computed. To correct GFT errors, a random forest regression model was built with ILI and GFT rates from the previous three weeks as predictors. An overall reduction in error of 44% was observed and the errors of the corrected GFT are lower than those of ILIp. An 80% reduction in error during 2012/13, when GFT had large errors, shows that extreme failures of GFT could have been avoided. Using autoregressive integrated moving average (ARIMA) models, one- to four-week ahead forecasts were generated with two separate data streams: ILIp alone, and with both ILIp and corrected GFT. At all forecast targets and seasons, and for all but two regions, inclusion of GFT lowered MSE. Results from two alternative error measures, mean absolute error and mean absolute proportional error, were largely consistent with results from MSE. Taken together these findings provide an error profile of GFT in the US, establish strong evidence for the adoption of search trends based 'nowcasts' in influenza forecast systems, and encourage reevaluation of the utility of this data source in diverse domains.
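The predictor layout described in this abstract (ILI and GFT rates from the previous three weeks) can be sketched as a lagged design matrix. This is an illustrative sketch only: the function name `lagged_rows` is hypothetical, the choice of the current week's ILI value as the regression target is an assumption (the abstract does not spell out the target), and the regression model itself is omitted.

```python
def lagged_rows(ili, gft, n_lags=3):
    """Build (features, target) pairs from two aligned weekly series.

    For week t the features are the n_lags preceding ILI values followed
    by the n_lags preceding GFT values, oldest first.
    Assumption: the target is the ILI value at week t.
    """
    if len(ili) != len(gft):
        raise ValueError("series must be aligned and of equal length")
    rows = []
    for t in range(n_lags, len(ili)):
        features = list(ili[t - n_lags:t]) + list(gft[t - n_lags:t])
        rows.append((features, ili[t]))
    return rows
```

Rows built this way could be passed to any regression model; the first `n_lags` weeks of each series produce no row because their history is incomplete.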
Affiliation(s)
- Sasikiran Kandula
- Department of Environmental Health Sciences, Columbia University, New York, New York, United States of America
- Jeffrey Shaman
- Department of Environmental Health Sciences, Columbia University, New York, New York, United States of America
|
29
|
Reich NG, Brooks LC, Fox SJ, Kandula S, McGowan CJ, Moore E, Osthus D, Ray EL, Tushar A, Yamana TK, Biggerstaff M, Johansson MA, Rosenfeld R, Shaman J. A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States. Proc Natl Acad Sci U S A 2019; 116:3146-3154. [PMID: 30647115 PMCID: PMC6386665 DOI: 10.1073/pnas.1812594116] [Citation(s) in RCA: 130] [Impact Index Per Article: 26.0] [Indexed: 01/25/2023] Open
Abstract
Influenza infects an estimated 9-35 million individuals each year in the United States and is a contributing cause for between 12,000 and 56,000 deaths annually. Seasonal outbreaks of influenza are common in temperate regions of the world, with highest incidence typically occurring in colder and drier months of the year. Real-time forecasts of influenza transmission can inform public health response to outbreaks. We present the results of a multiinstitution collaborative effort to standardize the collection and evaluation of forecasting models for influenza in the United States for the 2010/2011 through 2016/2017 influenza seasons. For these seven seasons, we assembled weekly real-time forecasts of seven targets of public health interest from 22 different models. We compared forecast accuracy of each model relative to a historical baseline seasonal average. Across all regions of the United States, over half of the models showed consistently better performance than the historical baseline when forecasting incidence of influenza-like illness 1 wk, 2 wk, and 3 wk ahead of available data and when forecasting the timing and magnitude of the seasonal peak. In some regions, delays in data reporting were strongly and negatively associated with forecast accuracy. More timely reporting and an improved overall accessibility to novel and traditional data sources are needed to improve forecasting accuracy and its integration with real-time public health decision making.
Affiliation(s)
- Nicholas G Reich
- Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, MA 01003
- Logan C Brooks
- Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213
- Spencer J Fox
- Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712
- Sasikiran Kandula
- Department of Environmental Health Sciences, Columbia University, New York, NY 10032
- Craig J McGowan
- Influenza Division, Centers for Disease Control and Prevention, Atlanta, GA 30333
- Evan Moore
- Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, MA 01003
- Dave Osthus
- Statistical Sciences Group, Los Alamos National Laboratory, Los Alamos, NM 87545
- Evan L Ray
- Department of Mathematics and Statistics, Mount Holyoke College, South Hadley, MA 01075
- Abhinav Tushar
- Department of Biostatistics and Epidemiology, University of Massachusetts-Amherst, Amherst, MA 01003
- Teresa K Yamana
- Department of Environmental Health Sciences, Columbia University, New York, NY 10032
- Matthew Biggerstaff
- Influenza Division, Centers for Disease Control and Prevention, Atlanta, GA 30333
- Michael A Johansson
- Division of Vector-Borne Diseases, Centers for Disease Control and Prevention, San Juan, PR 00920
- Roni Rosenfeld
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, PA 15213
- Jeffrey Shaman
- Department of Environmental Health Sciences, Columbia University, New York, NY 10032
|
30
|
Pei S, Cane MA, Shaman J. Predictability in process-based ensemble forecast of influenza. PLoS Comput Biol 2019; 15:e1006783. [PMID: 30817754 PMCID: PMC6394909 DOI: 10.1371/journal.pcbi.1006783] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/28/2018] [Accepted: 01/12/2019] [Indexed: 11/18/2022] Open
Abstract
Process-based models have been used to simulate and forecast a number of nonlinear dynamical systems, including influenza and other infectious diseases. In this work, we evaluate the effects of model initial condition error and stochastic fluctuation on forecast accuracy in a compartmental model of influenza transmission. These two types of errors are found to have qualitatively similar growth patterns during model integration, indicating that dynamic error growth, regardless of source, is a dominant component of forecast inaccuracy. We therefore examine the nonlinear growth of model initial error and compute the fastest growing directions using singular vector analysis. Using this information, we generate perturbations in an ensemble forecast system of influenza to obtain more optimal ensemble spread. In retrospective forecasts of historical outbreaks for 95 US cities from 2003 to 2014, this approach improves short-term forecast of incidence over the next one to four weeks.
Affiliation(s)
- Sen Pei
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, United States of America
- Mark A. Cane
- Lamont-Doherty Earth Observatory, Columbia University, New York, NY, United States of America
- Jeffrey Shaman
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, United States of America
|
31
|
Osthus D, Daughton AR, Priedhorsky R. Even a good influenza forecasting model can benefit from internet-based nowcasts, but those benefits are limited. PLoS Comput Biol 2019; 15:e1006599. [PMID: 30707689 PMCID: PMC6373968 DOI: 10.1371/journal.pcbi.1006599] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/21/2018] [Revised: 02/13/2019] [Accepted: 10/30/2018] [Indexed: 11/19/2022] Open
Abstract
The ability to produce timely and accurate flu forecasts in the United States can significantly impact public health. Augmenting forecasts with internet data has shown promise for improving forecast accuracy and timeliness in controlled settings, but results in practice are less convincing, as models augmented with internet data have not consistently outperformed models without internet data. In this paper, we perform a controlled experiment, taking into account data backfill, to improve clarity on the benefits and limitations of augmenting an already good flu forecasting model with internet-based nowcasts. Our results show that a good flu forecasting model can benefit from the augmentation of internet-based nowcasts in practice for all considered public health-relevant forecasting targets. The degree of forecast improvement due to nowcasting, however, is uneven across forecasting targets, with short-term forecasting targets seeing the largest improvements and seasonal targets such as the peak timing and intensity seeing relatively marginal improvements. The uneven forecasting improvements across targets hold even when "perfect" nowcasts are used. These findings suggest that further improvements to flu forecasting, particularly seasonal targets, will need to derive from other, non-nowcasting approaches.
Affiliation(s)
- Dave Osthus
- Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- Ashlynn R. Daughton
- Los Alamos National Laboratory, Los Alamos, New Mexico, USA
- University of Colorado Boulder, Boulder, Colorado, USA
|
32
|
Moss R, Zarebski AE, Carlson SJ, McCaw JM. Accounting for Healthcare-Seeking Behaviours and Testing Practices in Real-Time Influenza Forecasts. Trop Med Infect Dis 2019; 4:E12. [PMID: 30641917 PMCID: PMC6473244 DOI: 10.3390/tropicalmed4010012] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/20/2018] [Revised: 01/08/2019] [Accepted: 01/08/2019] [Indexed: 11/29/2022] Open
Abstract
For diseases such as influenza, where the majority of infected persons experience mild (if any) symptoms, surveillance systems are sensitive to changes in healthcare-seeking and clinical decision-making behaviours. This presents a challenge when trying to interpret surveillance data in near-real-time (e.g., to provide public health decision-support). Australia experienced a particularly large and severe influenza season in 2017, perhaps in part due to: (a) mild cases being more likely to seek healthcare; and (b) clinicians being more likely to collect specimens for reverse transcription polymerase chain reaction (RT-PCR) influenza tests. In this study, we used weekly Flutracking surveillance data to estimate the probability that a person with influenza-like illness (ILI) would seek healthcare and have a specimen collected. We then used this estimated probability to calibrate near-real-time seasonal influenza forecasts at each week of the 2017 season, to see whether predictive skill could be improved. While the number of self-reported influenza tests in the weekly surveys is typically very low, we were able to detect a substantial change in healthcare-seeking behaviour and clinician testing behaviour prior to the high epidemic peak. Adjusting for these changes in behaviour in the forecasting framework improved predictive skill. Our analysis demonstrates a unique value of community-level surveillance systems, such as Flutracking, when interpreting traditional surveillance data. These methods are also applicable beyond the Australian context, as similar community-level surveillance systems operate in other countries.
Affiliation(s)
- Robert Moss
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville 3052, Australia
- James M McCaw
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Parkville 3052, Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville 3052, Australia
- Murdoch Children's Research Institute, The Royal Children's Hospital, Parkville 3052, Australia
- Victorian Infectious Diseases Reference Laboratory Epidemiology Unit, Peter Doherty Institute for Infection and Immunity, The Royal Melbourne Hospital and The University of Melbourne, Melbourne 3000, Australia
|
33
|
Pei S, Morone F, Liljeros F, Makse H, Shaman JL. Inference and control of the nosocomial transmission of methicillin-resistant Staphylococcus aureus. eLife 2018; 7:e40977. [PMID: 30560786 PMCID: PMC6298769 DOI: 10.7554/elife.40977] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 08/10/2018] [Accepted: 11/16/2018] [Indexed: 12/19/2022] Open
Abstract
Methicillin-resistant Staphylococcus aureus (MRSA) is a continued threat to human health in both community and healthcare settings. In hospitals, control efforts would benefit from accurate estimation of asymptomatic colonization and infection importation rates from the community. However, developing such estimates remains challenging due to limited observation of colonization and complicated transmission dynamics within hospitals and the community. Here, we develop an inference framework that can estimate these key quantities by combining statistical filtering techniques, an agent-based model, and real-world patient-to-patient contact networks, and use this framework to infer nosocomial transmission and infection importation over an outbreak spanning 6 years in 66 Swedish hospitals. In particular, we identify a small number of patients with disproportionately high risk of colonization. In retrospective control experiments, interventions targeted to these individuals yield a substantial improvement over heuristic strategies informed by number of contacts, length of stay and contact tracing.
Affiliation(s)
- Sen Pei
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, United States
- Flaviano Morone
- Levich Institute and Physics Department, City College of New York, New York, United States
- Hernán Makse
- Levich Institute and Physics Department, City College of New York, New York, United States
- Jeffrey L Shaman
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, United States
|