1
|
Alifa M, Castruccio S, Bolster D, Bravo MA, Crippa P. Uncertainty Reduction and Environmental Justice in Air Pollution Epidemiology: The Importance of Minority Representation. Geohealth 2023; 7:e2023GH000854. [PMID: 37780098 PMCID: PMC10538591 DOI: 10.1029/2023gh000854] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 08/29/2023] [Accepted: 08/31/2023] [Indexed: 10/03/2023]
Abstract
Ambient air pollution is an increasing threat to society, with rising numbers of adverse outcomes and exposure inequalities worldwide. Reducing uncertainty in health outcomes models and exposure disparity studies is therefore essential to develop policies effective in protecting the most affected places and populations. This study uses the concept of information entropy to study tradeoffs in mortality uncertainty reduction from increasing input data of air pollution versus health outcomes. We study a case scenario for short-term mortality from particulate matter (PM2.5) in North Carolina for 2001-2016, employing a case-crossover design with inputs from an individual-level mortality data set and high-resolution gridded data sets of PM2.5 and weather covariates. We find a significant association between mortality and PM2.5, and the information tradeoffs indicate that a 10% increase in mortality information reduces model uncertainty three times more than increased resolution of the air pollution model from 12 to 1 km. We also find that Non-Hispanic Black (NHB) residents tend to live in relatively more polluted census tracts, and that the mean PM2.5 for NHB cases in the mortality model is significantly higher than that of Non-Hispanic White cases. The distinct distribution of PM2.5 for NHB cases results in a relatively higher information value, and therefore faster uncertainty reduction, for new NHB cases introduced into the mortality model. This newfound influence of exposure disparities in the rate of uncertainty reduction highlights the importance of minority representation in environmental research as a quantitative advantage to produce more confident estimates of the true effects of environmental pollution.
Affiliation(s)
- Mariana Alifa
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA
- Diogo Bolster
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
- Mercedes A. Bravo
  - Global Health Institute, Duke University, Durham, NC, USA
  - Children's Environmental Health Initiative, University of Notre Dame, South Bend, IN, USA
- Paola Crippa
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
|
2
|
Wikle CK, Datta A, Hari BV, Boone EL, Sahoo I, Kavila I, Castruccio S, Simmons SJ, Burr WS, Chang W. An illustration of model agnostic explainability methods applied to environmental data. Environmetrics 2023; 34:e2772. [PMID: 37200542 PMCID: PMC10187774 DOI: 10.1002/env.2772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 09/20/2022] [Indexed: 05/20/2023]
Abstract
Historically, two primary criticisms statisticians have of machine learning and deep neural models are their lack of uncertainty quantification and the inability to do inference (i.e., to explain what inputs are important). Explainable AI has developed in the last few years as a sub-discipline of computer science and machine learning to mitigate these concerns (as well as concerns of fairness and transparency in deep modeling). In this article, our focus is on explaining which inputs are important in models for predicting environmental data. In particular, we focus on three general methods for explainability that are model agnostic and thus applicable across a breadth of models without internal explainability: "feature shuffling", "interpretable local surrogates", and "occlusion analysis". We describe particular implementations of each of these and illustrate their use with a variety of models, all applied to the problem of long-lead forecasting monthly soil moisture in the North American corn belt given sea surface temperature anomalies in the Pacific Ocean.
Affiliation(s)
- Abhirup Datta
  - Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland, USA
- Edward L. Boone
  - Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, Virginia, USA
- Indranil Sahoo
  - Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, Virginia, USA
- Indulekha Kavila
  - School of Pure and Applied Physics, Mahatma Gandhi University, Athirampuzha, Kerala, India
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana, USA
- Susan J. Simmons
  - Institute for Advanced Analytics, North Carolina State University, Raleigh, North Carolina, USA
- Wesley S. Burr
  - Department of Mathematics, Trent University, Peterborough, Ontario, Canada
- Won Chang
  - Department of Mathematical Sciences, University of Cincinnati, Cincinnati, Ohio, USA
|
3
|
Alifa M, Castruccio S, Bolster D, Bravo M, Crippa P. Information entropy tradeoffs for efficient uncertainty reduction in estimates of air pollution mortality. Environ Res 2022; 212:113587. [PMID: 35654155 DOI: 10.1016/j.envres.2022.113587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Revised: 05/18/2022] [Accepted: 05/28/2022] [Indexed: 06/15/2023]
Abstract
Implementing effective policy to protect human health from the adverse effects of air pollution, such as premature mortality, requires reducing the uncertainty in health outcomes models. Here we present a novel method to reduce mortality uncertainty by increasing the amount of input data of air pollution and health outcomes, and then quantifying tradeoffs associated with the different data gained. We first present a study of long-term mortality from fine particulate matter (PM2.5) based on simulated data, followed by a real-world application of short-term PM2.5-related mortality in an urban area. We employ information yield curves to identify which variables more effectively reduce mortality uncertainty when increasing information. Our methodology can be used to explore how specific pollution scenarios will impact mortality and thus improve decision-making. The proposed framework is general and can be applied to any real case-scenario where knowledge in pollution, demographics, or health outcomes can be augmented through data acquisition or model improvements to generate more robust risk assessments.
Affiliation(s)
- Mariana Alifa
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA
- Diogo Bolster
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
- Mercedes Bravo
  - Global Health Institute, Duke University, Durham, NC, USA
  - Children's Environmental Health Initiative, University of Notre Dame, South Bend, IN, USA
- Paola Crippa
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
|
4
|
Huang H, Castruccio S, Genton MG. Forecasting high‐frequency spatio‐temporal wind power with dimensionally reduced echo state networks. J R Stat Soc Ser C Appl Stat 2022. [DOI: 10.1111/rssc.12540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/27/2022]
Affiliation(s)
- Huang Huang
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, Indiana, USA
- Marc G. Genton
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
5
|
Zhang J, Crippa P, Genton MG, Castruccio S. Assessing the reliability of wind power operations under a changing climate with a non-Gaussian bias correction. Ann Appl Stat 2021. [DOI: 10.1214/21-aoas1460] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
Affiliation(s)
- Jiachen Zhang
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame
- Paola Crippa
  - Department of Civil and Environmental Engineering and Geosciences, University of Notre Dame
- Marc G. Genton
  - Statistics Program, King Abdullah University of Science and Technology
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame
|
6
|
Hu W, Fuglstad G, Castruccio S. A Stochastic Locally Diffusive Model with Neural Network‐Based Deformations for Global Sea Surface Temperature. Stat (Int Stat Inst) 2021. [DOI: 10.1002/sta4.431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Wenjing Hu
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Indiana, USA
- Geir‐Arne Fuglstad
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim, Norway
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Indiana, USA
|
7
|
Aquino B, Castruccio S, Gupta V, Howard S. Spatial modeling of mid-infrared spectral data with thermal compensation using integrated nested Laplace approximation. Appl Opt 2021; 60:8609-8615. [PMID: 34612963 DOI: 10.1364/ao.435918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2021] [Accepted: 08/25/2021] [Indexed: 06/13/2023]
Abstract
The problem of analyzing substances using low-cost sensors with a low signal-to-noise ratio (SNR) remains challenging. Using accurate models for the spectral data is paramount for the success of any classification task. We demonstrate that the thermal compensation of sample heating and spatial variability analysis yield lower modeling errors than non-spatial modeling. Then, we obtain the inference of the spectral data probability density functions using the integrated nested Laplace approximation (INLA) on a Bayesian hierarchical model. To achieve this goal, we use the fast and user-friendly R-INLA package in R for the computation. This approach allows affordable and real-time substance identification with fewer SNR sensor measurements, thereby potentially increasing throughput and lowering costs.
|
8
|
Edwards M, Castruccio S, Hammerling D. Marginally parameterized spatio-temporal models and stepwise maximum likelihood estimation. Comput Stat Data Anal 2020. [DOI: 10.1016/j.csda.2020.107018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
|
9
|
Affiliation(s)
- Amanda Lenzi
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN
- Håvard Rue
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Marc G. Genton
  - Statistics Program, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
10
|
Giani P, Castruccio S, Anav A, Howard D, Hu W, Crippa P. Short-term and long-term health impacts of air pollution reductions from COVID-19 lockdowns in China and Europe: a modelling study. Lancet Planet Health 2020; 4:e474-e482. [PMID: 32976757 PMCID: PMC7508534 DOI: 10.1016/s2542-5196(20)30224-2] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Received: 04/22/2020] [Revised: 08/25/2020] [Accepted: 08/26/2020] [Indexed: 05/19/2023]
Abstract
BACKGROUND: Exposure to poor air quality leads to increased premature mortality from cardiovascular and respiratory diseases. Among the far-reaching implications of the ongoing COVID-19 pandemic, a substantial improvement in air quality was observed worldwide after the lockdowns imposed by many countries. We aimed to assess the implications of different lockdown measures on air pollution levels in Europe and China, as well as the short-term and long-term health impact. METHODS: For this modelling study, observations of fine particulate matter (PM2·5) concentrations from more than 2500 stations in Europe and China during 2016-20 were integrated with chemical transport model simulations to reconstruct PM2·5 fields at high spatiotemporal resolution. The health benefits, expressed as short-term and long-term avoided mortality from PM2·5 exposure associated with the interventions imposed to control the COVID-19 pandemic, were quantified on the basis of the latest epidemiological studies. To explore the long-term variability in air quality and associated premature mortality, we built different scenarios of economic recovery (immediate or gradual resumption of activities, a second outbreak in autumn, and permanent lockdown for the whole of 2020). FINDINGS: The lockdown interventions led to a reduction in population-weighted PM2·5 of 14·5 μg m⁻³ across China (-29·7%) and 2·2 μg m⁻³ across Europe (-17·1%), with unprecedented reductions of 40 μg m⁻³ in bimonthly mean PM2·5 in the areas most affected by COVID-19 in China. In the short term, an estimated 24 200 (95% CI 22 380-26 010) premature deaths were averted throughout China between Feb 1 and March 31, and an estimated 2190 (1960-2420) deaths were averted in Europe between Feb 21 and May 17. We also estimated a positive number of long-term avoided premature fatalities due to reduced PM2·5 concentrations, ranging from 76 400 (95% CI 62 600-86 900) to 287 000 (233 700-328 300) for China, and from 13 600 (11 900-15 300) to 29 500 (25 800-33 300) for Europe, depending on the future scenarios of economic recovery adopted. INTERPRETATION: These results indicate that lockdown interventions led to substantial reductions in PM2·5 concentrations in China and Europe. We estimated that tens of thousands of premature deaths from air pollution were avoided, although with significant differences observed in Europe and China. Our findings suggest that considerable improvements in air quality are achievable in both China and Europe when stringent emission control policies are adopted. FUNDING: None.
Affiliation(s)
- Paolo Giani
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA
- Alessandro Anav
  - Climate Modeling Laboratory, Italian National Agency for New Technologies, Energy and Sustainable Economic Development (ENEA), Centro Ricerche Casaccia, Rome, Italy
- Don Howard
  - Department of Philosophy, University of Notre Dame, Notre Dame, IN, USA
- Wenjing Hu
  - Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN, USA
- Paola Crippa
  - Department of Civil and Environmental Engineering and Earth Sciences, University of Notre Dame, Notre Dame, IN, USA
|
11
|
|
12
|
|
13
|
Jeong J, Yan Y, Castruccio S, Genton MG. A Stochastic Generator of Global Monthly Wind Energy with Tukey g-and-h Autoregressive Processes. Stat Sin 2019. [DOI: 10.5705/ss.202017.0474] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/06/2022]
|
14
|
|
15
|
|
16
|
Castruccio S, Ombao H, Genton MG. A scalable multi-resolution spatio-temporal model for brain activation and connectivity in fMRI data. Biometrics 2018; 74:823-833. [DOI: 10.1111/biom.12844] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/01/2018] [Revised: 10/01/2018] [Accepted: 11/01/2017] [Indexed: 11/27/2022]
Affiliation(s)
- Stefano Castruccio
  - Department of Applied and Computational Mathematics and Statistics; University of Notre Dame; 153 Hurley Hall, Notre Dame Indiana 46556 U.S.A
- Hernando Ombao
  - Statistics Program; King Abdullah University of Science and Technology (KAUST); Thuwal 23955-6900 Saudi Arabia
- Marc G. Genton
  - Statistics Program; King Abdullah University of Science and Technology (KAUST); Thuwal 23955-6900 Saudi Arabia
|
17
|
|
18
|
Affiliation(s)
- Stefano Castruccio
  - School of Mathematics & Statistics, Newcastle University, Newcastle Upon Tyne, NE1 7RU United Kingdom
- Marc G. Genton
  - CEMSE Division, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Saudi Arabia
|
19
|
Castruccio S, Guinness J. An evolutionary spectrum approach to incorporate large-scale geographical descriptors on global processes. J R Stat Soc Ser C Appl Stat 2016. [DOI: 10.1111/rssc.12167] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 11/30/2022]
|
20
|
Affiliation(s)
- Marc G. Genton
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Stefano Castruccio
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Paola Crippa
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Subhajit Dutta
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Raphaël Huser
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Ying Sun
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Sabrina Vettori
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
|
21
|
Affiliation(s)
- Stefano Castruccio
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
- Marc G. Genton
  - CEMSE Division; King Abdullah University of Science and Technology; Thuwal 23955-6900 Saudi Arabia
|
22
|
|