1
Fiandrino S, Bizzotto A, Guzzetta G, Merler S, Baldo F, Valdano E, Urdiales AM, Bella A, Celino F, Zino L, Rizzo A, Li Y, Perra N, Gioannini C, Milano P, Paolotti D, Quaggiotto M, Rossi L, Vismara I, Vespignani A, Gozzi N. Collaborative forecasting of influenza-like illness in Italy: The Influcast experience. Epidemics 2025; 50:100819. [PMID: 39965358 DOI: 10.1016/j.epidem.2025.100819] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Revised: 12/20/2024] [Accepted: 02/05/2025] [Indexed: 02/20/2025]
Abstract
Collaborative hubs that integrate multiple teams to generate ensemble projections and forecasts for shared targets are now regarded as state-of-the-art in epidemic predictive modeling. In this paper, we introduce Influcast, Italy's first epidemic forecasting hub for influenza-like illness. During the 2023/2024 winter season, Influcast provided 20 rounds of forecasts, involving five teams and eight models to predict influenza-like illness incidence up to four weeks in advance at the national and regional administrative level. The individual forecasts were synthesized into an ensemble and benchmarked against a baseline model. Across all models, the ensemble most frequently ranks among the top performers at the national level considering different metrics and forecasting rounds. Additionally, the ensemble outperforms the baseline and most individual models across all regions. Despite a decline in absolute performance over longer horizons, the ensemble model outperformed the baseline at all considered horizons. These findings show the importance of multimodel forecasting hubs in producing reliable short-term influenza-like illness forecasts that can inform public health preparedness and mitigation strategies.
Affiliation(s)
- Stefania Fiandrino
  - ISI Foundation, Turin, Italy; Department of Computer, Control, and Management Engineering Antonio Ruberti, Sapienza University of Rome, Rome, Italy
- Andrea Bizzotto
  - Center for Health Emergencies, Bruno Kessler Foundation, Trento, Italy; Department of Mathematics, University of Trento, Trento, Italy
- Giorgio Guzzetta
  - Center for Health Emergencies, Bruno Kessler Foundation, Trento, Italy
- Stefano Merler
  - Center for Health Emergencies, Bruno Kessler Foundation, Trento, Italy
- Federico Baldo
  - University of Bologna - Department of Computer Science and Engineering, Italy; Institut Pierre Louis d'Epidémiologie et de Santé Publique, INSERM & Sorbonne Université, site Hôpital St. Antoine, 27 rue Chaligny, 75012, Paris, France
- Eugenio Valdano
  - Institut Pierre Louis d'Epidémiologie et de Santé Publique, INSERM & Sorbonne Université, site Hôpital St. Antoine, 27 rue Chaligny, 75012, Paris, France
- Francesco Celino
  - Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Lorenzo Zino
  - Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Alessandro Rizzo
  - Department of Electronics and Telecommunications, Politecnico di Torino, Turin, Italy
- Yuhan Li
  - School of Mathematical Sciences, Queen Mary University of London, UK
- Nicola Perra
  - School of Mathematical Sciences, Queen Mary University of London, UK; The Alan Turing Institute, London, UK
- Marco Quaggiotto
  - ISI Foundation, Turin, Italy; Department of Design, Politecnico di Milano, Italy
- Alessandro Vespignani
  - ISI Foundation, Turin, Italy; Laboratory for the Modeling of Biological and Socio-technical Systems, Northeastern University, Boston, MA, USA
2
Shen X, Rumack A, Wilder B, Tibshirani RJ. Nowcasting reported covid-19 hospitalizations using de-identified, aggregated medical insurance claims data. PLoS Comput Biol 2025; 21:e1012717. [PMID: 39965031 PMCID: PMC11841917 DOI: 10.1371/journal.pcbi.1012717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 02/20/2025] [Accepted: 12/12/2024] [Indexed: 02/20/2025]
Abstract
We propose, implement, and evaluate a method for nowcasting the daily number of new COVID-19 hospitalizations, at the level of individual US states, based on de-identified, aggregated medical insurance claims data. Our analysis proceeds under a hypothetical scenario in which, during the Delta wave, states only report data on the first day of each month, and on this day, report COVID-19 hospitalization counts for each day in the previous month. In this hypothetical scenario (just as in reality), medical insurance claims data continues to be available daily. At the beginning of each month, we train a regression model, using all data available thus far, to predict hospitalization counts from medical insurance claims. We then use this model to nowcast the (unseen) values of COVID-19 hospitalization counts from medical insurance claims, at each day in the following month. Our analysis uses properly-versioned data, which would have been available in real-time at the time predictions are produced (instead of using data that would have only been available in hindsight). In spite of the difficulties inherent to real-time estimation (e.g., latency and backfill) and the complex dynamics behind COVID-19 hospitalizations themselves, we find altogether that medical insurance claims can be an accurate predictor of hospitalization reports, with mean absolute errors typically around 0.4 hospitalizations per 100,000 people, i.e., proportion of variance explained around 75%. Perhaps more importantly, we find that nowcasts made using medical insurance claims are able to qualitatively capture the dynamics (upswings and downswings) of hospitalization waves, which are key features that inform public health decision-making.
Affiliation(s)
- Xueda Shen
  - Department of Biostatistics, University of California, Berkeley, California, United States of America
- Aaron Rumack
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Bryan Wilder
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Ryan J Tibshirani
  - Department of Statistics, University of California, Berkeley, California, United States of America
3
Bleichrodt AM, Okano JT, Fung ICH, Chowell G, Blower S. The Future of HIV: Challenges in meeting the 2030 Ending the HIV Epidemic in the U.S. (EHE) reduction goal. medRxiv: the preprint server for health sciences 2025:2025.01.06.25320033. [PMID: 39830275 PMCID: PMC11741459 DOI: 10.1101/2025.01.06.25320033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
Objectives: To predict the burden of HIV in the United States (US) nationally and by region, transmission type, and race/ethnicity through 2030. Methods: Using publicly available data from the CDC NCHHSTP AtlasPlus dashboard, we generated 11-year prospective forecasts of incident HIV diagnoses nationally and by region (South, non-South), race/ethnicity (White, Hispanic/Latino, Black/African American), and transmission type (Injection-Drug Use, Male-to-Male Sexual Contact (MMSC), and Heterosexual Contact (HSC)). We employed weighted (W) and unweighted (UW) n-sub-epidemic ensemble models, calibrated using 12 years of historical data (2008-2019), and forecasted trends for 2020-2030. We compared results to identify persistent, concerning trends across models. Results: We projected substantial decreases in incident HIV diagnoses nationally (W: 27.9%, UW: 21.9%), and in the South (W: 18.0%, UW: 9.2%) and non-South (W: 21.2%, UW: 19.5%) from 2019 to 2030. However, concerning non-decreasing trends were observed nationally in key sub-populations during this period: Hispanic/Latino persons (W: 1.4%, UW: 2.6%), Hispanic/Latino MMSC (W: 9.0%, UW: 9.9%), people who inject drugs (PWID) (W: 25.6%, UW: 9.2%), and White PWID (W: 3.5%, UW: 44.9%). The rising trends among Hispanic/Latino MMSC and overall PWID were consistent across the South and non-South regions. Conclusions: Although the forecasted national-level decrease in the number of incident HIV diagnoses is encouraging, the US is unlikely to achieve the Ending the HIV Epidemic in the U.S. goal of a 90% reduction in HIV incidence by 2030. Additionally, the observed increases among specific subpopulations highlight the importance of a targeted and equitable approach to effectively combat HIV in the US.
Affiliation(s)
- Amanda M Bleichrodt
  - Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA
- Justin T Okano
  - Center for Biomedical Modeling, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Isaac CH Fung
  - Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA
- Gerardo Chowell
  - Department of Population Health Sciences, School of Public Health, Georgia State University, Atlanta, GA, USA
- Sally Blower
  - Center for Biomedical Modeling, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
4
Ray EL, Wang Y, Wolfinger RD, Reich NG. Flusion: Integrating multiple data sources for accurate influenza predictions. Epidemics 2024; 50:100810. [PMID: 39818098 DOI: 10.1016/j.epidem.2024.100810] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2024] [Revised: 09/26/2024] [Accepted: 12/06/2024] [Indexed: 01/18/2025]
Abstract
Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admissions reported in the CDC's National Healthcare Safety Network (NHSN) surveillance system. Reporting of influenza hospital admissions through NHSN began within the last few years, and as such only a limited amount of historical data are available for this target signal. To produce forecasts in the presence of limited data for the target surveillance system, we augmented these data with two signals that have a longer historical record: 1) ILI+, which estimates the proportion of outpatient doctor visits where the patient has influenza; and 2) rates of laboratory-confirmed influenza hospitalizations at a selected set of healthcare facilities. Our model, Flusion, is an ensemble model that combines two machine learning models using gradient boosting for quantile regression based on different feature sets with a Bayesian autoregressive model. The gradient boosting models were trained on all three data signals, while the autoregressive model was trained on only data for the target surveillance signal, NHSN admissions; all three models were trained jointly on data for multiple locations. In each week of the influenza season, these models produced quantiles of a predictive distribution of influenza hospital admissions in each state for the current week and the following three weeks; the ensemble prediction was computed by averaging these quantile predictions. Flusion emerged as the top-performing model in the CDC's influenza prediction challenge for the 2023/24 season. 
In this article we investigate the factors contributing to Flusion's success, and we find that its strong performance was primarily driven by the use of a gradient boosting model that was trained jointly on data from multiple surveillance signals and multiple locations. These results indicate the value of sharing information across multiple locations and surveillance signals, especially when doing so adds to the pool of available training data.
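The quantile-averaging step mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not Flusion's code: the function name `average_quantiles`, the quantile levels, and the three toy component forecasts are all hypothetical. Only the idea of averaging component quantile predictions comes from the abstract.

```python
from statistics import mean


def average_quantiles(component_forecasts):
    """Equal-weight quantile average (hypothetical helper).

    component_forecasts: list of dicts mapping quantile level -> predicted
    value, one dict per component model, all with the same quantile levels.
    """
    levels = sorted(component_forecasts[0])
    for forecast in component_forecasts:
        if sorted(forecast) != levels:
            raise ValueError("all components must report the same quantile levels")
    # For each quantile level, average the predicted values across models.
    return {q: mean(f[q] for f in component_forecasts) for q in levels}


# Toy forecasts from three hypothetical component models (illustrative numbers only).
components = [
    {0.25: 80.0, 0.5: 100.0, 0.75: 130.0},
    {0.25: 90.0, 0.5: 110.0, 0.75: 140.0},
    {0.25: 100.0, 0.5: 120.0, 0.75: 150.0},
]
ensemble = average_quantiles(components)
```

Averaging level by level keeps the ensemble quantiles ordered whenever every component's quantiles are ordered.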
Affiliation(s)
- Evan L Ray
  - Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, United States
- Yijin Wang
  - Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, United States
- Nicholas G Reich
  - Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, United States
5
Marshall M, Parker F, Gardner LM. When are predictions useful? A new method for evaluating epidemic forecasts. BMC Global and Public Health 2024; 2:67. [PMID: 39681892 DOI: 10.1186/s44263-024-00098-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Accepted: 09/19/2024] [Indexed: 12/18/2024]
Abstract
Background: COVID-19 will not be the last pandemic of the twenty-first century. To better prepare for the next one, it is essential that we make honest appraisals of the utility of different responses to COVID. In this paper, we focus specifically on epidemiologic forecasting. Characterizing forecast efficacy over the history of the pandemic is challenging, especially given its significant spatial, temporal, and contextual variability. In this light, we introduce the Weighted Contextual Interval Score (WCIS), a new method for retrospective interval forecast evaluation. Methods: The central tenet of the WCIS is a direct incorporation of contextual utility into the evaluation. This necessitates a specific characterization of forecast efficacy depending on the use case for predictions, accomplished via defining a utility threshold parameter. This idea is generalized to probabilistic interval-form forecasts, which are the preferred prediction format for epidemiological modeling, as an extension of the existing Weighted Interval Score (WIS). Results: We apply the WCIS to two forecasting scenarios: facility-level hospitalizations for a single state, and state-level hospitalizations for the whole of the United States. We observe that an appropriately parameterized application of the WCIS captures both the relative quality and the overall frequency of useful forecasts. Since the WCIS represents the utility of predictions using contextual normalization, it is easily comparable across highly variable pandemic scenarios while remaining intuitively representative of the in-situ quality of individual forecasts. Conclusions: The WCIS provides a pragmatic utility-based characterization of probabilistic predictions. This method is expressly intended to enable practitioners and policymakers who may not have expertise in forecasting but are nevertheless essential partners in epidemic response to use and provide insightful analysis of predictions. We note that the WCIS is intended specifically for retrospective forecast evaluation and should not be used as a minimized penalty in a competitive context as it lacks statistical propriety. Code and data used for our analysis are available at https://github.com/maximilian-marshall/wcis.
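For orientation, the existing Weighted Interval Score that this abstract builds on can be sketched as below. This assumes the commonly used definition of the WIS (weight 1/2 on the absolute error of the median, weight alpha/2 on each central interval's score, normalised by K + 1/2 for K intervals). The function names and the toy numbers are hypothetical, and the sketch does not implement the WCIS or its utility threshold, which the abstract does not specify.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval [lower, upper]."""
    width = upper - lower
    below = (2.0 / alpha) * max(lower - y, 0.0)  # penalty when y is below the interval
    above = (2.0 / alpha) * max(y - upper, 0.0)  # penalty when y is above the interval
    return width + below + above


def weighted_interval_score(median, intervals, y):
    """WIS for a predictive median plus central intervals.

    intervals: dict mapping alpha -> (lower, upper), e.g. {0.5: (8.0, 12.0)}
    for a 50% central interval. Lower scores are better.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Toy forecast (illustrative numbers only): median 10 with a 50% interval [8, 12].
score_inside = weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, y=10.0)
score_outside = weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, y=14.0)
```

An observation outside the interval is penalised in proportion to how far it misses, so `score_outside` exceeds `score_inside`.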
Affiliation(s)
- Maximilian Marshall
  - Department of Civil and Systems Engineering, Johns Hopkins University, Baltimore, MD, USA
- Felix Parker
  - Department of Civil and Systems Engineering, Johns Hopkins University, Baltimore, MD, USA
- Lauren M Gardner
  - Department of Civil and Systems Engineering, Johns Hopkins University, Baltimore, MD, USA
6
Fox SJ, Kim M, Meyers LA, Reich NG, Ray EL. Optimizing Disease Outbreak Forecast Ensembles. Emerg Infect Dis 2024; 30:1967-1969. [PMID: 39174027 PMCID: PMC11347000 DOI: 10.3201/eid3009.240026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2024]
Abstract
On the basis of historical influenza and COVID-19 forecasts, we found that more than 3 forecast models are needed to ensure robust ensemble accuracy. Additional models can improve ensemble performance, but with diminishing accuracy returns. This understanding will assist with the design of current and future collaborative infectious disease forecasting efforts.
7
Shandross L, Howerton E, Contamin L, Hochheiser H, Krystalli A, Reich NG, Ray EL. hubEnsembles: Ensembling Methods in R. medRxiv: the preprint server for health sciences 2024:2024.06.24.24309416. [PMID: 38978658 PMCID: PMC11230315 DOI: 10.1101/2024.06.24.24309416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Combining predictions from multiple models into an ensemble is a widely used practice across many fields with demonstrated performance benefits. The R package hubEnsembles provides a flexible framework for ensembling various types of predictions, including point estimates and probabilistic predictions. A range of common methods for generating ensembles are supported, including weighted averages, quantile averages, and linear pools. The hubEnsembles package fits within a broader framework of open-source software and data tools called the "hubverse", which facilitates the development and management of collaborative modelling exercises.
8
Bosse NI, Abbott S, Bracher J, van Leeuwen E, Cori A, Funk S. Human judgement forecasting of COVID-19 in the UK. Wellcome Open Res 2024; 8:416. [PMID: 38618198 PMCID: PMC11009611 DOI: 10.12688/wellcomeopenres.19380.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/12/2024] [Indexed: 04/16/2024]
Abstract
Background: In the past, two studies found ensembles of human judgement forecasts of COVID-19 to show predictive performance comparable to ensembles of computational models, at least when predicting case incidences. We present a follow-up to a study conducted in Germany and Poland and investigate a novel joint approach to combine human judgement and epidemiological modelling. Methods: From May 24th to August 16th 2021, we elicited weekly one- to four-week-ahead forecasts of cases and deaths from COVID-19 in the UK from a crowd of human forecasters. A median ensemble of all forecasts was submitted to the European Forecast Hub. Participants could use two distinct interfaces: in one, forecasters submitted a predictive distribution directly; in the other, forecasters instead submitted a forecast of the effective reproduction number R_t. This was then used to forecast cases and deaths using simulation methods from the EpiNow2 R package. Forecasts were scored using the weighted interval score on the original forecasts, as well as after applying the natural logarithm to both forecasts and observations. Results: The ensemble of human forecasters overall performed comparably to the official European Forecast Hub ensemble on both cases and deaths, although results were sensitive to changes in details of the evaluation. R_t forecasts performed comparably to direct forecasts on cases, but worse on deaths. Self-identified "experts" tended to be better calibrated than "non-experts" for cases, but not for deaths. Conclusions: Human judgement forecasts and computational models can produce forecasts of similar quality for infectious diseases such as COVID-19. The results of forecast evaluations can change depending on what metrics are chosen, and judgement on what does or doesn't constitute a "good" forecast is dependent on the forecast consumer. Combinations of human and computational forecasts hold potential but present real-world challenges that need to be solved.
Affiliation(s)
- Nikos I. Bosse
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
  - NIHR Health Protection Research Unit in Modelling & Health Economics, London, UK
- Sam Abbott
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
- Johannes Bracher
  - Computational Statistics Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Edwin van Leeuwen
  - NIHR Health Protection Research Unit in Modelling & Health Economics, London, UK
  - Modelling Economics Unit, UK Health Security Agency, London, UK
- Anne Cori
  - MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, England, UK
- Sebastian Funk
  - Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, WC1E 7HT, UK
  - NIHR Health Protection Research Unit in Modelling & Health Economics, London, UK
9
Howerton E, Contamin L, Mullany LC, Qin M, Reich NG, Bents S, Borchering RK, Jung SM, Loo SL, Smith CP, Levander J, Kerr J, Espino J, van Panhuis WG, Hochheiser H, Galanti M, Yamana T, Pei S, Shaman J, Rainwater-Lovett K, Kinsey M, Tallaksen K, Wilson S, Shin L, Lemaitre JC, Kaminsky J, Hulse JD, Lee EC, McKee CD, Hill A, Karlen D, Chinazzi M, Davis JT, Mu K, Xiong X, Pastore Y Piontti A, Vespignani A, Rosenstrom ET, Ivy JS, Mayorga ME, Swann JL, España G, Cavany S, Moore S, Perkins A, Hladish T, Pillai A, Ben Toh K, Longini I, Chen S, Paul R, Janies D, Thill JC, Bouchnita A, Bi K, Lachmann M, Fox SJ, Meyers LA, Srivastava A, Porebski P, Venkatramanan S, Adiga A, Lewis B, Klahn B, Outten J, Hurt B, Chen J, Mortveit H, Wilson A, Marathe M, Hoops S, Bhattacharya P, Machi D, Cadwell BL, Healy JM, Slayton RB, Johansson MA, Biggerstaff M, Truelove S, Runge MC, Shea K, Viboud C, Lessler J. Evaluation of the US COVID-19 Scenario Modeling Hub for informing pandemic response under uncertainty. Nat Commun 2023; 14:7260. [PMID: 37985664 PMCID: PMC10661184 DOI: 10.1038/s41467-023-42680-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 07/07/2023] [Accepted: 10/17/2023] [Indexed: 11/22/2023]
Abstract
Our ability to forecast epidemics far into the future is constrained by the many complexities of disease systems. Realistic longer-term projections may, however, be possible under well-defined scenarios that specify the future state of critical epidemic drivers. Since December 2020, the U.S. COVID-19 Scenario Modeling Hub (SMH) has convened multiple modeling teams to make months ahead projections of SARS-CoV-2 burden, totaling nearly 1.8 million national and state-level projections. Here, we find SMH performance varied widely as a function of both scenario validity and model calibration. We show scenarios remained close to reality for 22 weeks on average before the arrival of unanticipated SARS-CoV-2 variants invalidated key assumptions. An ensemble of participating models that preserved variation between models (using the linear opinion pool method) was consistently more reliable than any single model in periods of valid scenario assumptions, while projection interval coverage was near target levels. SMH projections were used to guide pandemic response, illustrating the value of collaborative hubs for longer-term scenario projections.
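The remark that a linear opinion pool preserves variation between models can be illustrated with a small sketch. Everything here is an assumption made for the example: the two normal "model" distributions, the equal weights, and the helper names are hypothetical and are not taken from the SMH. The linear pool is a weighted mixture of the component CDFs, while the quantile average, shown for contrast, averages the component quantiles.

```python
from statistics import NormalDist


def linear_pool_cdf(x, components, weights):
    """CDF of the linear opinion pool: a weighted mixture of component CDFs."""
    return sum(w * c.cdf(x) for c, w in zip(components, weights))


def linear_pool_quantile(p, components, weights, lo=-100.0, hi=100.0, tol=1e-9):
    """Invert the mixture CDF by bisection (assumes the quantile lies in [lo, hi])."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if linear_pool_cdf(mid, components, weights) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def quantile_average(p, components, weights):
    """Weighted average of the component quantiles at probability level p."""
    return sum(w * c.inv_cdf(p) for c, w in zip(components, weights))


# Two hypothetical models that disagree about the location of the outcome.
models = [NormalDist(0.0, 1.0), NormalDist(4.0, 1.0)]
weights = [0.5, 0.5]

# Width of the central 95% interval under each combination method.
pool_width = (linear_pool_quantile(0.975, models, weights)
              - linear_pool_quantile(0.025, models, weights))
avg_width = (quantile_average(0.975, models, weights)
             - quantile_average(0.025, models, weights))
```

In this toy setup the pooled interval is wider than the quantile-averaged one, because the mixture keeps the disagreement between the two models as extra spread.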
Affiliation(s)
- Emily Howerton
  - The Pennsylvania State University, University Park, PA, USA
- Luke C Mullany
  - Johns Hopkins University Applied Physics Lab, Laurel, MD, USA
- Samantha Bents
  - National Institutes of Health Fogarty International Center, Bethesda, MD, USA
- Rebecca K Borchering
  - The Pennsylvania State University, University Park, PA, USA
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Sung-Mok Jung
  - University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Sara L Loo
  - Johns Hopkins University, Baltimore, MD, USA
- J Espino
  - University of Pittsburgh, Pittsburgh, PA, USA
- Sen Pei
  - Columbia University, New York, NY, USA
- Matt Kinsey
  - Johns Hopkins University Applied Physics Lab, Laurel, MD, USA
- Kate Tallaksen
  - Johns Hopkins University Applied Physics Lab, Laurel, MD, USA
- Shelby Wilson
  - Johns Hopkins University Applied Physics Lab, Laurel, MD, USA
- Lauren Shin
  - Johns Hopkins University Applied Physics Lab, Laurel, MD, USA
- Alison Hill
  - Johns Hopkins University, Baltimore, MD, USA
- Dean Karlen
  - University of Victoria, Victoria, BC, Canada
- Kunpeng Mu
  - Northeastern University, Boston, MA, USA
- Julie S Ivy
  - North Carolina State University, Raleigh, NC, USA
- Sean Cavany
  - University of Notre Dame, Notre Dame, IN, USA
- Sean Moore
  - University of Notre Dame, Notre Dame, IN, USA
- Shi Chen
  - University of North Carolina at Charlotte, Charlotte, NC, USA
- Rajib Paul
  - University of North Carolina at Charlotte, Charlotte, NC, USA
- Daniel Janies
  - University of North Carolina at Charlotte, Charlotte, NC, USA
- Kaiming Bi
  - University of Texas at Austin, Austin, TX, USA
- Bryan Lewis
  - University of Virginia, Charlottesville, VA, USA
- Brian Klahn
  - University of Virginia, Charlottesville, VA, USA
- Stefan Hoops
  - University of Virginia, Charlottesville, VA, USA
- Dustin Machi
  - University of Virginia, Charlottesville, VA, USA
- Betsy L Cadwell
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Jessica M Healy
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Michael C Runge
  - U.S. Geological Survey Eastern Ecological Science Center, Laurel, MD, USA
- Katriona Shea
  - The Pennsylvania State University, University Park, PA, USA
- Cécile Viboud
  - National Institutes of Health Fogarty International Center, Bethesda, MD, USA
- Justin Lessler
  - University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Johns Hopkins University, Baltimore, MD, USA
10
Wattanachit N, Ray EL, McAndrew TC, Reich NG. Comparison of combination methods to create calibrated ensemble forecasts for seasonal influenza in the U.S. Stat Med 2023; 42:4696-4712. [PMID: 37648218 PMCID: PMC10710272 DOI: 10.1002/sim.9884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 07/10/2023] [Accepted: 08/10/2023] [Indexed: 09/01/2023]
Abstract
The characteristics of influenza seasons vary substantially from year to year, posing challenges for public health preparation and response. Influenza forecasting is used to inform seasonal outbreak response, which can in turn potentially reduce the impact of an epidemic. The United States Centers for Disease Control and Prevention, in collaboration with external researchers, has run an annual prospective influenza forecasting exercise, known as the FluSight challenge. Uniting theoretical results from the forecasting literature with domain-specific forecasts from influenza outbreaks, we applied parametric forecast combination methods that simultaneously optimize model weights and calibrate the ensemble via a beta transformation and made adjustments to the methods to reduce their complexity. We used the beta-transformed linear pool, the finite beta mixture model, and their equal weight adaptations to produce ensemble forecasts retrospectively for the 2016/2017, 2017/2018, and 2018/2019 influenza seasons in the U.S. We compared their performance to methods that were used in the FluSight challenge to produce the FluSight Network ensemble, namely the equally weighted linear pool and the linear pool. Ensemble forecasts produced from methods with a beta transformation were shown to outperform those from the equally weighted linear pool and the linear pool for all week-ahead targets across the test seasons based on average log scores. We observed improvements in overall accuracy despite the beta-transformed linear pool or beta mixture methods' modest under-prediction across all targets and seasons. Combination techniques that explicitly adjust for known calibration issues in linear pooling should be considered to improve probabilistic scores in outbreak settings.
Affiliation(s)
- Nutcha Wattanachit
  - School of Public Health and Health Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA
- Evan L Ray
  - School of Public Health and Health Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA
- Nicholas G Reich
  - School of Public Health and Health Sciences, University of Massachusetts Amherst, Amherst, Massachusetts, USA
11
Panaggio MJ, Wilson SN, Ratcliff JD, Mullany LC, Freeman JD, Rainwater-Lovett K. On the Mark: Modeling and Forecasting for Public Health Impact. Health Secur 2023; 21:S79-S88. [PMID: 37756211 DOI: 10.1089/hs.2023.0033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/29/2023]
Affiliation(s)
- Mark J Panaggio
  - Mark J. Panaggio, PhD, is an Applied Mathematician/Data Scientist, Johns Hopkins University Applied Physics
- Shelby N Wilson
  - Shelby N. Wilson, PhD, is an Applied Mathematician/Data Scientist, Johns Hopkins University Applied Physics
- Jeremy D Ratcliff
  - Jeremy D. Ratcliff, PhD, is a Senior Scientist, Asymmetric Operations Sector, Johns Hopkins University Applied Physics
- Luke C Mullany
  - Luke C. Mullany, PhD, MS, MHS, is a Senior Researcher, Research and Exploratory Development Department, Johns Hopkins University Applied Physics
- Jeffrey D Freeman
  - Jeffrey D. Freeman, PhD, MPH, is Director and Special Assistant to the President, National Center for Disaster Medicine and Public Health, Uniformed Services University of the Health Sciences, Bethesda, MD
- Kaitlin Rainwater-Lovett
  - Kaitlin Rainwater-Lovett, PhD, MPH, is Assistant Program Manager, Johns Hopkins University Applied Physics Laboratory, Laurel, MD
12
Bilinski AM, Salomon JA, Hatfield LA. Adaptive metrics for an evolving pandemic: A dynamic approach to area-level COVID-19 risk designations. Proc Natl Acad Sci U S A 2023; 120:e2302528120. [PMID: 37527346 PMCID: PMC10410764 DOI: 10.1073/pnas.2302528120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 04/27/2023] [Indexed: 08/03/2023]
Abstract
Throughout the COVID-19 pandemic, policymakers have proposed risk metrics, such as the CDC Community Levels, to guide local and state decision-making. However, risk metrics have not reliably predicted key outcomes and have often lacked transparency in terms of prioritization of false-positive versus false-negative signals. They have also struggled to maintain relevance over time due to slow and infrequent updates addressing new variants and shifts in vaccine- and infection-induced immunity. We make two contributions to address these weaknesses. We first present a framework to evaluate predictive accuracy based on policy targets related to severe disease and mortality, allowing for explicit preferences toward false-negative versus false-positive signals. This approach allows policymakers to optimize metrics for specific preferences and interventions. Second, we propose a method to update risk thresholds in real time. We show that this adaptive approach to designating areas as "high risk" improves performance over static metrics in predicting 3-wk-ahead mortality and intensive care usage at both state and county levels. We also demonstrate that with our approach, using only new hospital admissions to predict 3-wk-ahead mortality and intensive care usage has performed consistently as well as metrics that also include cases and inpatient bed usage. Our results highlight that a key challenge for COVID-19 risk prediction is the changing relationship between indicators and outcomes of policy interest. Adaptive metrics therefore have a unique advantage in a rapidly evolving pandemic context.
Affiliation(s)
- Alyssa M. Bilinski
- Departments of Health Services, Policy and Practice & Biostatistics, Brown University, Providence, RI 02912
- Laura A. Hatfield
- Department of Health Care Policy, Harvard Medical School, Boston, MA 02115
13
Howerton E, Contamin L, Mullany LC, Qin M, Reich NG, Bents S, Borchering RK, Jung SM, Loo SL, Smith CP, Levander J, Kerr J, Espino J, van Panhuis WG, Hochheiser H, Galanti M, Yamana T, Pei S, Shaman J, Rainwater-Lovett K, Kinsey M, Tallaksen K, Wilson S, Shin L, Lemaitre JC, Kaminsky J, Hulse JD, Lee EC, McKee C, Hill A, Karlen D, Chinazzi M, Davis JT, Mu K, Xiong X, Piontti APY, Vespignani A, Rosenstrom ET, Ivy JS, Mayorga ME, Swann JL, España G, Cavany S, Moore S, Perkins A, Hladish T, Pillai A, Toh KB, Longini I, Chen S, Paul R, Janies D, Thill JC, Bouchnita A, Bi K, Lachmann M, Fox S, Meyers LA, Srivastava A, Porebski P, Venkatramanan S, Adiga A, Lewis B, Klahn B, Outten J, Hurt B, Chen J, Mortveit H, Wilson A, Marathe M, Hoops S, Bhattacharya P, Machi D, Cadwell BL, Healy JM, Slayton RB, Johansson MA, Biggerstaff M, Truelove S, Runge MC, Shea K, Viboud C, Lessler J. Informing pandemic response in the face of uncertainty. An evaluation of the U.S. COVID-19 Scenario Modeling Hub. medRxiv 2023:2023.06.28.23291998. [PMID: 37461674 PMCID: PMC10350156 DOI: 10.1101/2023.06.28.23291998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2023]
Abstract
Our ability to forecast epidemics more than a few weeks into the future is constrained by the complexity of disease systems, our limited ability to measure the current state of an epidemic, and uncertainties in how human action will affect transmission. Realistic longer-term projections (spanning more than a few weeks) may, however, be possible under defined scenarios that specify the future state of critical epidemic drivers, with the additional benefit that such scenarios can be used to anticipate the comparative effect of control measures. Since December 2020, the U.S. COVID-19 Scenario Modeling Hub (SMH) has convened multiple modeling teams to make 6-month ahead projections of the number of SARS-CoV-2 cases, hospitalizations and deaths. The SMH released nearly 1.8 million national and state-level projections between February 2021 and November 2022. SMH performance varied widely as a function of both scenario validity and model calibration. Scenario assumptions were periodically invalidated by the arrival of unanticipated SARS-CoV-2 variants, but SMH still provided projections on average 22 weeks before changes in assumptions (such as virus transmissibility) invalidated scenarios and their corresponding projections. During these periods, before emergence of a novel variant, a linear opinion pool ensemble of contributed models was consistently more reliable than any single model, and projection interval coverage was near target levels for the most plausible scenarios (e.g., 79% coverage for 95% projection interval). SMH projections were used operationally to guide planning and policy at different stages of the pandemic, illustrating the value of the hub approach for long-term scenario projections.
Affiliation(s)
- Samantha Bents
- National Institutes of Health Fogarty International Center (NIH)
- Sara L Loo
- Johns Hopkins University Infectious Disease Dynamics (JHU-IDD)
- Claire P Smith
- Johns Hopkins University Infectious Disease Dynamics (JHU-IDD)
- Shi Chen
- University of North Carolina at Charlotte (UNCC)
14
Sherratt K, Gruson H, Grah R, Johnson H, Niehus R, Prasse B, Sandmann F, Deuschel J, Wolffram D, Abbott S, Ullrich A, Gibson G, Ray EL, Reich NG, Sheldon D, Wang Y, Wattanachit N, Wang L, Trnka J, Obozinski G, Sun T, Thanou D, Pottier L, Krymova E, Meinke JH, Barbarossa MV, Leithauser N, Mohring J, Schneider J, Wlazlo J, Fuhrmann J, Lange B, Rodiah I, Baccam P, Gurung H, Stage S, Suchoski B, Budzinski J, Walraven R, Villanueva I, Tucek V, Smid M, Zajicek M, Perez Alvarez C, Reina B, Bosse NI, Meakin SR, Castro L, Fairchild G, Michaud I, Osthus D, Alaimo Di Loro P, Maruotti A, Eclerova V, Kraus A, Kraus D, Pribylova L, Dimitris B, Li ML, Saksham S, Dehning J, Mohr S, Priesemann V, Redlarski G, Bejar B, Ardenghi G, Parolini N, Ziarelli G, Bock W, Heyder S, Hotz T, Singh DE, Guzman-Merino M, Aznarte JL, Morina D, Alonso S, Alvarez E, Lopez D, Prats C, Burgard JP, Rodloff A, Zimmermann T, Kuhlmann A, Zibert J, Pennoni F, Divino F, Catala M, Lovison G, Giudici P, Tarantino B, Bartolucci F, Jona Lasinio G, Mingione M, Farcomeni A, Srivastava A, Montero-Manso P, Adiga A, Hurt B, Lewis B, Marathe M, Porebski P, Venkatramanan S, Bartczuk RP, Dreger F, Gambin A, Gogolewski K, Gruziel-Slomka M, Krupa B, Moszyński A, Niedzielewski K, Nowosielski J, Radwan M, Rakowski F, Semeniuk M, Szczurek E, Zielinski J, Kisielewski J, Pabjan B, Holger K, Kheifetz Y, Scholz M, Przemyslaw B, Bodych M, Filinski M, Idzikowski R, Krueger T, Ozanski T, Bracher J, Funk S. Predictive performance of multi-model ensemble forecasts of COVID-19 across European nations. eLife 2023; 12:e81916. [PMID: 37083521 PMCID: PMC10238088 DOI: 10.7554/elife.81916] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 07/18/2022] [Accepted: 02/20/2023] [Indexed: 04/22/2023] Open
Abstract
Background Short-term forecasts of infectious disease burden can contribute to situational awareness and aid capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise the predictive performance of such forecasts if multiple models are combined into an ensemble. Here, we report on the performance of ensembles in predicting COVID-19 cases and deaths across Europe between 08 March 2021 and 07 March 2022. Methods We used open-source tools to develop a public European COVID-19 Forecast Hub. We invited groups globally to contribute weekly forecasts for COVID-19 cases and deaths reported by a standardised source for 32 countries over the next 1-4 weeks. Teams submitted forecasts from March 2021 using standardised quantiles of the predictive distribution. Each week we created an ensemble forecast, where each predictive quantile was calculated as the equally-weighted average (initially the mean and then from 26th July the median) of all individual models' predictive quantiles. We measured the performance of each model using the relative Weighted Interval Score (WIS), comparing models' forecast accuracy relative to all other models. We retrospectively explored alternative methods for ensemble forecasts, including weighted averages based on models' past predictive performance. Results Over 52 weeks, we collected forecasts from 48 unique models. We evaluated 29 models' forecast scores in comparison to the ensemble model. We found a weekly ensemble had a consistently strong performance across countries over time. Across all horizons and locations, the ensemble performed better on relative WIS than 83% of participating models' forecasts of incident cases (with a total N=886 predictions from 23 unique models), and 91% of participating models' forecasts of deaths (N=763 predictions from 20 models). 
Across a 1-4 week time horizon, ensemble performance declined with longer forecast periods when forecasting cases, but remained stable over 4 weeks for incident death forecasts. In every forecast across 32 countries, the ensemble outperformed most contributing models when forecasting either cases or deaths, frequently outperforming all of its individual component models. Among several choices of ensemble methods we found that the most influential and best choice was to use a median average of models instead of using the mean, regardless of methods of weighting component forecast models. Conclusions Our results support the use of combining forecasts from individual models into an ensemble in order to improve predictive performance across epidemiological targets and populations during infectious disease epidemics. Our findings further suggest that median ensemble methods yield better predictive performance than ones based on means. Our findings also highlight that forecast consumers should place more weight on incident death forecasts than incident case forecasts at forecast horizons greater than 2 weeks. Funding AA, BH, BL, LWa, MMa, PP, SV funded by National Institutes of Health (NIH) Grant 1R01GM109718, NSF BIG DATA Grant IIS-1633028, NSF Grant No.: OAC-1916805, NSF Expeditions in Computing Grant CCF-1918656, CCF-1917819, NSF RAPID CNS-2028004, NSF RAPID OAC-2027541, US Centers for Disease Control and Prevention 75D30119C05935, a grant from Google, University of Virginia Strategic Investment Fund award number SIF160, Defense Threat Reduction Agency (DTRA) under Contract No. HDTRA1-19-D-0007, and respectively Virginia Dept of Health Grant VDH-21-501-0141, VDH-21-501-0143, VDH-21-501-0147, VDH-21-501-0145, VDH-21-501-0146, VDH-21-501-0142, VDH-21-501-0148. AF, AMa, GL funded by SMIGE - Modelli statistici inferenziali per governare l'epidemia, FISR 2020-Covid-19 I Fase, FISR2020IP-00156, Codice Progetto: PRJ-0695.
AM, BK, FD, FR, JK, JN, JZ, KN, MG, MR, MS, RB funded by Ministry of Science and Higher Education of Poland with grant 28/WFSN/2021 to the University of Warsaw. BRe, CPe, JLAz funded by Ministerio de Sanidad/ISCIII. BT, PG funded by PERISCOPE European H2020 project, contract number 101016233. CP, DL, EA, MC, SA funded by European Commission - Directorate-General for Communications Networks, Content and Technology through the contract LC-01485746, and Ministerio de Ciencia, Innovacion y Universidades and FEDER, with the project PGC2018-095456-B-I00. DE., MGu funded by Spanish Ministry of Health / REACT-UE (FEDER). DO, GF, IMi, LC funded by Laboratory Directed Research and Development program of Los Alamos National Laboratory (LANL) under project number 20200700ER. DS, ELR, GG, NGR, NW, YW funded by National Institutes of General Medical Sciences (R35GM119582; the content is solely the responsibility of the authors and does not necessarily represent the official views of NIGMS or the National Institutes of Health). FB, FP funded by InPresa, Lombardy Region, Italy. HG, KS funded by European Centre for Disease Prevention and Control. IV funded by Agencia de Qualitat i Avaluacio Sanitaries de Catalunya (AQuAS) through contract 2021-021OE. JDe, SMo, VP funded by Netzwerk Universitatsmedizin (NUM) project egePan (01KX2021). JPB, SH, TH funded by Federal Ministry of Education and Research (BMBF; grant 05M18SIA). KH, MSc, YKh funded by Project SaxoCOV, funded by the German Free State of Saxony. Presentation of data, model results and simulations also funded by the NFDI4Health Task Force COVID-19 (https://www.nfdi4health.de/task-force-covid-19-2) within the framework of a DFG-project (LO-342/17-1). 
LP, VE funded by Mathematical and Statistical modelling project (MUNI/A/1615/2020), Online platform for real-time monitoring, analysis and management of epidemic situations (MUNI/11/02202001/2020); VE also supported by RECETOX research infrastructure (Ministry of Education, Youth and Sports of the Czech Republic: LM2018121), the CETOCOEN EXCELLENCE (CZ.02.1.01/0.0/0.0/17-043/0009632), RECETOX RI project (CZ.02.1.01/0.0/0.0/16-013/0001761). NIB funded by Health Protection Research Unit (grant code NIHR200908). SAb, SF funded by Wellcome Trust (210758/Z/18/Z).
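To make the ensemble rule in the abstract above concrete (each predictive quantile taken as the equally weighted mean, later the median, of the individual models' quantiles), here is a minimal Python sketch. The function name `quantile_ensemble`, the model names, and all numbers are hypothetical illustrations, not the Hub's code or data.

```python
from statistics import mean, median

def quantile_ensemble(forecasts, combine=median):
    """Combine per-model quantile forecasts level by level.

    forecasts: dict mapping model name -> dict mapping quantile level -> value.
    Assumes every model reports the same set of quantile levels.
    """
    levels = sorted(next(iter(forecasts.values())))
    return {q: combine([f[q] for f in forecasts.values()]) for q in levels}

# Hypothetical forecasts from three models; model_c is an outlier.
forecasts = {
    "model_a": {0.25: 90.0, 0.5: 100.0, 0.75: 120.0},
    "model_b": {0.25: 80.0, 0.5: 110.0, 0.75: 150.0},
    "model_c": {0.25: 95.0, 0.5: 400.0, 0.75: 900.0},
}

median_ens = quantile_ensemble(forecasts, combine=median)
mean_ens = quantile_ensemble(forecasts, combine=mean)
print(median_ens)
print(mean_ens)
```

In this toy case the outlying model pulls the mean ensemble's central quantile well above the other two models, while the median ensemble stays close to them. That is one plausible reading of why a median can be more robust, though the abstract itself reports only the empirical result.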
15
Bilinski AM, Salomon JA, Hatfield LA. Adaptive metrics for an evolving pandemic: A dynamic approach to area-level COVID-19 risk designations. medRxiv 2023:2023.02.15.23285969. [PMID: 36824769 PMCID: PMC9949193 DOI: 10.1101/2023.02.15.23285969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Throughout the COVID-19 pandemic, policymakers have proposed risk metrics, such as the CDC Community Levels, to guide local and state decision-making. However, risk metrics have not reliably predicted key outcomes and often lack transparency in terms of prioritization of false positive versus false negative signals. They have also struggled to maintain relevance over time due to slow and infrequent updates addressing new variants and shifts in vaccine- and infection-induced immunity. We make two contributions to address these weaknesses of risk metrics. We first present a framework to evaluate predictive accuracy based on policy targets related to severe disease and mortality, allowing for explicit preferences toward false negative versus false positive signals. This approach allows policymakers to optimize metrics for specific preferences and interventions. Second, we propose a novel method to update risk thresholds in real-time. We show that this adaptive approach to designating areas as "high risk" improves performance over static metrics in predicting 3-week-ahead mortality and intensive care usage at both state and county levels. We also demonstrate that with our approach, using only new hospital admissions to predict 3-week-ahead mortality and intensive care usage has performed consistently as well as metrics that also include cases and inpatient bed usage. Our results highlight that a key challenge for COVID-19 risk prediction is the changing relationship between indicators and outcomes of policy interest. Adaptive metrics therefore have a unique advantage in a rapidly evolving pandemic context. Significance Statement In the rapidly-evolving COVID-19 pandemic, public health risk metrics often become less relevant over time. Risk metrics are designed to predict future severe disease and mortality based on currently-available surveillance data, such as cases and hospitalizations. 
However, the relationship between cases, hospitalizations, and mortality has varied considerably over the course of the pandemic, in the context of new variants and shifts in vaccine- and infection-induced immunity. We propose an adaptive approach that regularly updates metrics based on the relationship between surveillance inputs and future outcomes of policy interest. Our method captures changing pandemic dynamics, requires only hospitalization input data, and outperforms static risk metrics in predicting high-risk states and counties.
Affiliation(s)
- Alyssa M. Bilinski
- Departments of Health Services, Policy and Practice & Biostatistics, Brown University, 121 S. Main St., Providence, RI 02912 USA
- Joshua A. Salomon
- Department of Health Policy, Stanford University, Stanford, CA 94305 USA
- Laura A. Hatfield
- Department of Health Care Policy, Harvard Medical School, 180 Longwood Ave., Boston, MA 02115 USA
16
Stolerman LM, Clemente L, Poirier C, Parag KV, Majumder A, Masyn S, Resch B, Santillana M. Using digital traces to build prospective and real-time county-level early warning systems to anticipate COVID-19 outbreaks in the United States. Science Advances 2023; 9:eabq0199. [PMID: 36652520 PMCID: PMC9848273 DOI: 10.1126/sciadv.abq0199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
Coronavirus disease 2019 (COVID-19) continues to affect the world, and the design of strategies to curb disease outbreaks requires close monitoring of their trajectories. We present machine learning methods that leverage internet-based digital traces to anticipate sharp increases in COVID-19 activity in U.S. counties. In a complementary direction to the efforts led by the Centers for Disease Control and Prevention (CDC), our models are designed to detect the time when an uptrend in COVID-19 activity will occur. Motivated by the need for finer spatial resolution epidemiological insights, we build upon previous efforts conceived at the state level. Our methods, tested in an out-of-sample manner as events were unfolding in 97 counties representative of multiple population sizes across the United States, frequently anticipated increases in COVID-19 activity 1 to 6 weeks before local outbreaks, defined as the effective reproduction number Rt being larger than 1 for a period of 2 weeks.
Affiliation(s)
- Lucas M. Stolerman
- Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Department of Mathematics, Oklahoma State University, Stillwater, OK, USA
- Leonardo Clemente
- Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA
- Machine Intelligence Group for the Betterment of Health and the Environment, Network Science Institute, Northeastern University, Boston, MA, USA
- Canelle Poirier
- Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Kris V. Parag
- NIHR Health Protection Research Unit, Behavioural Science and Evaluation, University of Bristol, Bristol, UK
- Serge Masyn
- Global Public Health, Janssen R&D, Beerse, Belgium
- Bernd Resch
- Department of Geoinformatics - Z-GIS, University of Salzburg, Salzburg, Austria
- Center for Geographic Analysis, Harvard University, Cambridge, MA, USA
- Mauricio Santillana
- Department of Pediatrics, Harvard Medical School, Boston, MA, USA
- Machine Intelligence Group for the Betterment of Health and the Environment, Network Science Institute, Northeastern University, Boston, MA, USA
- Harvard University, T.H. Chan School of Public Health, Boston, MA, USA
17
Howerton E, Runge MC, Bogich TL, Borchering RK, Inamine H, Lessler J, Mullany LC, Probert WJM, Smith CP, Truelove S, Viboud C, Shea K. Context-dependent representation of within- and between-model uncertainty: aggregating probabilistic predictions in infectious disease epidemiology. J R Soc Interface 2023; 20:20220659. [PMID: 36695018 PMCID: PMC9874266 DOI: 10.1098/rsif.2022.0659] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 09/07/2022] [Accepted: 01/03/2023] [Indexed: 01/26/2023] Open
Abstract
Probabilistic predictions support public health planning and decision making, especially in infectious disease emergencies. Aggregating outputs from multiple models yields more robust predictions of outcomes and associated uncertainty. While the selection of an aggregation method can be guided by retrospective performance evaluations, this is not always possible. For example, if predictions are conditional on assumptions about how the future will unfold (e.g. possible interventions), these assumptions may never materialize, precluding any direct comparison between predictions and observations. Here, we summarize literature on aggregating probabilistic predictions, illustrate various methods for infectious disease predictions via simulation, and present a strategy for choosing an aggregation method when empirical validation cannot be used. We focus on the linear opinion pool (LOP) and Vincent average, common methods that make different assumptions about between-prediction uncertainty. We contend that assumptions of the aggregation method should align with a hypothesis about how uncertainty is expressed within and between predictions from different sources. The LOP assumes that between-prediction uncertainty is meaningful and should be retained, while the Vincent average assumes that between-prediction uncertainty is akin to sampling error and should not be preserved. We provide an R package for implementation. Given the rising importance of multi-model infectious disease hubs, our work provides useful guidance on aggregation and a deeper understanding of the benefits and risks of different approaches.
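The contrast drawn above between the linear opinion pool (which averages cumulative distribution functions) and the Vincent average (which averages quantile functions) can be illustrated with a short Python sketch. This is not the authors' R package. The two normal predictions, the equal weights, and the helper names `vincent_quantile`, `lop_cdf`, and `lop_quantile` are assumptions made for illustration.

```python
from statistics import NormalDist

def vincent_quantile(dists, p):
    """Vincent average: average the quantile functions at probability p."""
    return sum(d.inv_cdf(p) for d in dists) / len(dists)

def lop_cdf(dists, x):
    """Linear opinion pool: average the CDFs at value x (equal weights)."""
    return sum(d.cdf(x) for d in dists) / len(dists)

def lop_quantile(dists, p, lo=-1e6, hi=1e6, tol=1e-9):
    """Invert the pooled CDF numerically by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lop_cdf(dists, mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Two hypothetical predictions that disagree on location but not on spread.
dists = [NormalDist(mu=0, sigma=1), NormalDist(mu=10, sigma=1)]

vincent_width = vincent_quantile(dists, 0.975) - vincent_quantile(dists, 0.025)
lop_width = lop_quantile(dists, 0.975) - lop_quantile(dists, 0.025)
print(f"Vincent 95% interval width: {vincent_width:.2f}")
print(f"LOP 95% interval width:     {lop_width:.2f}")
```

With these inputs the Vincent average keeps each prediction's own spread and centres it between the two, whereas the pooled distribution spans both predictions. This matches the abstract's description of the LOP retaining between-prediction uncertainty and the Vincent average discarding it.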
Affiliation(s)
- Emily Howerton
- Department of Biology and Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA, USA
- Michael C. Runge
- Eastern Ecological Science Center at the Patuxent Research Refuge, U.S. Geological Survey, Laurel, MD, USA
- Tiffany L. Bogich
- Department of Biology and Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA, USA
- Rebecca K. Borchering
- Department of Biology and Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA, USA
- Hidetoshi Inamine
- Department of Biology and Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA, USA
- Justin Lessler
- Department of Epidemiology and Carolina Population Center, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Luke C. Mullany
- Applied Physics Laboratory, Johns Hopkins University, Baltimore, MD, USA
- William J. M. Probert
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Claire P. Smith
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Shaun Truelove
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Department of International Health, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD, USA
- Cécile Viboud
- Fogarty International Center, National Institutes of Health, Bethesda, MD, USA
- Katriona Shea
- Department of Biology and Center for Infectious Disease Dynamics, The Pennsylvania State University, University Park, PA, USA
18
Bracher J, Wolffram D, Deuschel J, Görgen K, Ketterer JL, Ullrich A, Abbott S, Barbarossa MV, Bertsimas D, Bhatia S, Bodych M, Bosse NI, Burgard JP, Castro L, Fairchild G, Fiedler J, Fuhrmann J, Funk S, Gambin A, Gogolewski K, Heyder S, Hotz T, Kheifetz Y, Kirsten H, Krueger T, Krymova E, Leithäuser N, Li ML, Meinke JH, Miasojedow B, Michaud IJ, Mohring J, Nouvellet P, Nowosielski JM, Ozanski T, Radwan M, Rakowski F, Scholz M, Soni S, Srivastava A, Gneiting T, Schienle M. National and subnational short-term forecasting of COVID-19 in Germany and Poland during early 2021. Communications Medicine 2022; 2:136. [PMID: 36352249 PMCID: PMC9622804 DOI: 10.1038/s43856-022-00191-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 09/22/2022] [Indexed: 11/07/2022] Open
Abstract
BACKGROUND During the COVID-19 pandemic there has been a strong interest in forecasts of the short-term development of epidemiological indicators to inform decision makers. In this study we evaluate probabilistic real-time predictions of confirmed cases and deaths from COVID-19 in Germany and Poland for the period from January through April 2021. METHODS We evaluate probabilistic real-time predictions of confirmed cases and deaths from COVID-19 in Germany and Poland. These were issued by 15 different forecasting models, run by independent research teams. Moreover, we study the performance of combined ensemble forecasts. Evaluation of probabilistic forecasts is based on proper scoring rules, along with interval coverage proportions to assess calibration. The presented work is part of a pre-registered evaluation study. RESULTS We find that many, though not all, models outperform a simple baseline model up to four weeks ahead for the considered targets. Ensemble methods show very good relative performance. The addressed time period is characterized by rather stable non-pharmaceutical interventions in both countries, making short-term predictions more straightforward than in previous periods. However, major trend changes in reported cases, like the rebound in cases due to the rise of the B.1.1.7 (Alpha) variant in March 2021, prove challenging to predict. CONCLUSIONS Multi-model approaches can help to improve the performance of epidemiological forecasts. However, while death numbers can be predicted with some success based on current case and hospitalization data, predictability of case numbers remains low beyond quite short time horizons. Additional data sources including sequencing and mobility data, which were not extensively used in the present study, may help to improve performance.
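The interval coverage proportion mentioned in the Methods above is simple to compute: it is the share of observations that fall inside their prediction intervals, which for a well-calibrated forecaster should be close to the nominal level. A minimal Python sketch follows. The function name `interval_coverage` and the toy numbers are hypothetical and not taken from the study.

```python
def interval_coverage(lowers, uppers, observations):
    """Share of observations falling inside their prediction intervals."""
    hits = [lo <= y <= hi for lo, hi, y in zip(lowers, uppers, observations)]
    return sum(hits) / len(hits)

# Hypothetical interval bounds and observed values for four forecast dates.
lowers = [10, 20, 30, 40]
uppers = [20, 30, 40, 50]
observed = [15, 35, 33, 50]

coverage = interval_coverage(lowers, uppers, observed)
print(f"Empirical coverage: {coverage:.2f}")
```

Here three of the four observations fall inside their intervals (bounds are counted as inside), so the empirical coverage is 0.75. Whether that indicates good calibration depends on the nominal level the intervals were issued at.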
Affiliation(s)
- Johannes Bracher
- Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Computational Statistics Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Daniel Wolffram
- Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Computational Statistics Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- HIDSS4Health - Helmholtz Information and Data Science School for Health, Karlsruhe/Heidelberg, Germany
- Jannik Deuschel
- Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Konstantin Görgen
- Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Jakob L Ketterer
- Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Sam Abbott
- London School of Hygiene and Tropical Medicine, London, UK
- Dimitris Bertsimas
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA, USA
- Sangeeta Bhatia
- MRC Centre for Global Infectious Disease Analysis, Abdul Latif Jameel Institute for Disease and Emergency Analytics (J-IDEA), Imperial College London, London, UK
- Marcin Bodych
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Nikos I Bosse
- London School of Hygiene and Tropical Medicine, London, UK
- Jan Pablo Burgard
- Economic and Social Statistics Department, University of Trier, Trier, Germany
- Lauren Castro
- Information Systems and Modeling, Los Alamos National Laboratory, Los Alamos, NM, USA
- Geoffrey Fairchild
- Information Systems and Modeling, Los Alamos National Laboratory, Los Alamos, NM, USA
- Jochen Fiedler
- Fraunhofer Institute for Industrial Mathematics (ITWM), Kaiserslautern, Germany
- Jan Fuhrmann
- Institute for Applied Mathematics, University of Heidelberg, Heidelberg, Germany
- Sebastian Funk
- London School of Hygiene and Tropical Medicine, London, UK
- Anna Gambin
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Krzysztof Gogolewski
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Stefan Heyder
- Institute of Mathematics, Technische Universität Ilmenau, Ilmenau, Germany
- Thomas Hotz
- Institute of Mathematics, Technische Universität Ilmenau, Ilmenau, Germany
- Yuri Kheifetz
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- Holger Kirsten
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- Tyll Krueger
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Ekaterina Krymova
- Swiss Data Science Center, ETH Zürich and EPF Lausanne, Zürich, Switzerland
- Neele Leithäuser
- Fraunhofer Institute for Industrial Mathematics (ITWM), Kaiserslautern, Germany
- Michael L Li
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jan H Meinke
- Jülich Supercomputing Centre, Forschungszentrum Jülich, Jülich, Germany
- Błażej Miasojedow
- Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
- Isaac J Michaud
- Statistical Sciences Group, Los Alamos National Laboratory, Los Alamos, NM, USA
- Jan Mohring
- Fraunhofer Institute for Industrial Mathematics (ITWM), Kaiserslautern, Germany
- Jedrzej M Nowosielski
- Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
- Tomasz Ozanski
- Wroclaw University of Science and Technology, Wroclaw, Poland
- Maciej Radwan
- Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
- Franciszek Rakowski
- Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, Warsaw, Poland
- Markus Scholz
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Leipzig, Germany
- Saksham Soni
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA, USA
- Ajitesh Srivastava
- Ming Hsieh Department of Computer and Electrical Engineering, University of Southern California, Los Angeles, CA, USA
- Tilmann Gneiting
- Computational Statistics Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Institute for Stochastics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Melanie Schienle
- Chair of Statistical Methods and Econometrics, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
- Computational Statistics Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
19
Cramer EY, Huang Y, Wang Y, Ray EL, Cornell M, Bracher J, Brennen A, Rivadeneira AJC, Gerding A, House K, Jayawardena D, Kanji AH, Khandelwal A, Le K, Mody V, Mody V, Niemi J, Stark A, Shah A, Wattanchit N, Zorn MW, Reich NG. The United States COVID-19 Forecast Hub dataset. Sci Data 2022. [PMID: 35915104 DOI: 10.1101/2021.11.04.21265886v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 04/20/2023] Open
Abstract
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
Affiliation(s)
- Estee Y Cramer
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Yuxin Huang
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Yijin Wang
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Evan L Ray
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Matthew Cornell
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Johannes Bracher
- Chair of Econometrics and Statistics, Karlsruhe Institute of Technology, Karlsruhe, 76185, Germany
- Computational Statistics Group, Heidelberg Institute for Theoretical Studies, Heidelberg, 69118, Germany
- Aaron Gerding
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Katie House
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Dasuni Jayawardena
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Abdul Hannan Kanji
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Ayush Khandelwal
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Khoa Le
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Vidhi Mody
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Vrushti Mody
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Jarad Niemi
- Department of Statistics, Iowa State University, Ames, IA, 50011, USA
- Ariane Stark
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Apurv Shah
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Nutcha Wattanchit
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Martha W Zorn
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA
- Nicholas G Reich
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA, 01003, USA.
|
20
|
Cramer EY, Huang Y, Wang Y, Ray EL, Cornell M, Bracher J, Brennen A, Rivadeneira AJC, Gerding A, House K, Jayawardena D, Kanji AH, Khandelwal A, Le K, Mody V, Mody V, Niemi J, Stark A, Shah A, Wattanchit N, Zorn MW, Reich NG. The United States COVID-19 Forecast Hub dataset. Sci Data 2022; 9:462. [PMID: 35915104 PMCID: PMC9342845 DOI: 10.1038/s41597-022-01517-w] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 01/17/2022] [Accepted: 06/29/2022] [Indexed: 02/02/2023]
Abstract
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
Affiliation(s)
- Estee Y. Cramer
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Yuxin Huang
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Yijin Wang
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Evan L. Ray
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Matthew Cornell
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Johannes Bracher
- Chair of Econometrics and Statistics, Karlsruhe Institute of Technology, Karlsruhe, 76185 Germany
- Computational Statistics Group, Heidelberg Institute for Theoretical Studies, Heidelberg, 69118 Germany
- Alvaro J. Castro Rivadeneira
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Aaron Gerding
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Katie House
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Dasuni Jayawardena
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Abdul Hannan Kanji
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Ayush Khandelwal
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Khoa Le
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Vidhi Mody
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Vrushti Mody
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Jarad Niemi
- Department of Statistics, Iowa State University, Ames, IA 50011 USA
- Ariane Stark
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Apurv Shah
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Nutcha Wattanchit
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Martha W. Zorn
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
- Nicholas G. Reich
- Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003 USA
|