1
Libardi ADLC, Masselot P, Schneider R, Nightingale E, Milojevic A, Vanoli J, Mistry MN, Gasparrini A. High resolution mapping of nitrogen dioxide and particulate matter in Great Britain (2003-2021) with multi-stage data reconstruction and ensemble machine learning methods. ATMOSPHERIC POLLUTION RESEARCH 2024; 15:102284. [PMID: 39175565 PMCID: PMC7616380 DOI: 10.1016/j.apr.2024.102284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2024]
Abstract
In this contribution, we applied a multi-stage machine learning (ML) framework to map daily values of nitrogen dioxide (NO2) and particulate matter (PM10 and PM2.5) at a 1 km2 resolution over Great Britain for the period 2003-2021. The process combined ground monitoring observations, satellite-derived products, climate reanalyses and chemical transport model datasets, and traffic and land-use data. Each feature was harmonized to 1 km resolution and extracted at monitoring sites. Models used single and ensemble-based algorithms featuring random forests (RF), extreme gradient boosting (XGB), light gradient boosting machine (LGBM), as well as lasso and ridge regression. The various stages focused on augmenting PM2.5 using co-occurring PM10 values, gap-filling aerosol optical depth and columnar NO2 data obtained from satellite instruments, and finally the training of an ensemble model and the prediction of daily values across the whole geographical domain (2003-2021). Results show a good ensemble model performance, calculated through a ten-fold monitor-based cross-validation procedure, with an average R2 of 0.690 (range 0.611-0.792) for NO2, 0.704 (0.609-0.786) for PM10, and 0.802 (0.746-0.888) for PM2.5. Reconstructed pollution levels decreased markedly within the study period, with a stronger reduction in the latter eight years. The pollutants exhibited different spatial patterns: while NO2 rose in close proximity to high-traffic areas, PM varied at a larger scale. The resulting 1 km2 spatially resolved daily datasets allow for linkage with health data across Great Britain over nearly two decades, thus contributing to extensive, extended, and detailed research on the long- and short-term health effects of air pollution.
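The monitor-based cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "monitor-based" means all observations from one monitoring site are held out together, so a model is never tested on a site it was trained on. The function names, the site labels, and the toy data are hypothetical and are not taken from the paper.

```python
import random


def monitor_based_folds(monitor_ids, k=10, seed=0):
    """Assign every observation to one of k folds so that all observations
    from the same monitor share a fold (hypothetical helper)."""
    monitors = sorted(set(monitor_ids))
    rng = random.Random(seed)
    rng.shuffle(monitors)
    # Deal the shuffled monitors into folds round-robin.
    fold_of_monitor = {m: i % k for i, m in enumerate(monitors)}
    return [fold_of_monitor[m] for m in monitor_ids]


def train_test_indices(folds, held_out):
    """Split observation indices into training and held-out test sets."""
    train = [i for i, f in enumerate(folds) if f != held_out]
    test = [i for i, f in enumerate(folds) if f == held_out]
    return train, test


# Toy data (assumption): 25 monitors with 4 daily observations each.
monitor_ids = [f"site_{m:02d}" for m in range(25) for _ in range(4)]
folds = monitor_based_folds(monitor_ids, k=10, seed=42)

for held_out in range(10):
    train_idx, test_idx = train_test_indices(folds, held_out)
    # A model would be fitted on train_idx and scored on test_idx here.
    print(held_out, len(train_idx), len(test_idx))
```

Splitting by site rather than by row gives a performance estimate that reflects prediction at unmonitored locations, which is how gridded exposure maps of this kind are typically used.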
Affiliation(s)
- Arturo de la Cruz Libardi
- Environment & Health Modelling (EHM) Lab, Department of Public Health Environments and Society, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, WC1H 9SH, London, United Kingdom
- Pierre Masselot
- Environment & Health Modelling (EHM) Lab, Department of Public Health Environments and Society, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, WC1H 9SH, London, United Kingdom
- Rochelle Schneider
- Φ-lab (Phi-lab), European Space Agency (ESA), Frascati, Italy
- Forecast Department, European Centre for Medium-Range Weather Forecasts (ECMWF), Reading, United Kingdom
- Emily Nightingale
- Department of Infectious Disease Epidemiology and Dynamics, London School of Hygiene & Tropical Medicine, Keppel Street, WC1E 7HT, London, United Kingdom
- Ai Milojevic
- Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, WC1H 9SH, London, United Kingdom
- Centre on Climate Change & Planetary Health, London School of Hygiene & Tropical Medicine, Keppel Street, WC1E 7HT, London, United Kingdom
- Jacopo Vanoli
- Environment & Health Modelling (EHM) Lab, Department of Public Health Environments and Society, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, WC1H 9SH, London, United Kingdom
- School of Tropical Medicine and Global Health, Nagasaki University, Nagasaki, Japan
- Malcolm N. Mistry
- Environment & Health Modelling (EHM) Lab, Department of Public Health Environments and Society, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, WC1H 9SH, London, United Kingdom
- Department of Economics, Ca’ Foscari University of Venice, Italy
- Antonio Gasparrini
- Environment & Health Modelling (EHM) Lab, Department of Public Health Environments and Society, London School of Hygiene & Tropical Medicine, 15-17 Tavistock Place, WC1H 9SH, London, United Kingdom
2
Sun W, Lu K, Li R. Global estimates of ambient NO2 concentrations and long-term health effects during 2000-2019. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2024; 359:124562. [PMID: 39019310 DOI: 10.1016/j.envpol.2024.124562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 07/05/2024] [Accepted: 07/14/2024] [Indexed: 07/19/2024]
Abstract
High concentrations of ambient NO2 cause serious air pollution and could also pose great threats to human health. However, the long-term trends (20-year) and potential health effects of ambient NO2 exposure globally still show high uncertainties. In this work, the field measurements, satellite dataset, GEOS-Chem output, and multiple geographical covariates were incorporated into the multi-stage model to investigate the global evolutions of ambient NO2 during 2000-2019. The results indicated that the cross-validation (CV) R2 values of ambient NO2 based on the multi-stage model displayed satisfactory performance (R2 = 0.78), which was superior to the individual model. Besides, the out-of-bag R2 was 0.75, which suggested the multi-stage model showed better transferability. At the spatial scale, the NO2 concentrations followed the order of China (16.9 ± 9.0 μg/m3) > India (15.5 ± 5.6 μg/m3) > United States (10.7 ± 5.6 μg/m3) > Europe (7.7 ± 4.5 μg/m3), which was consistent with the anthropogenic NOx emission. At the temporal scale, the ambient NO2 levels in China experienced persistent increases (0.29 μg/m3/year) during 2000-2013, whereas they showed slight decreases (-0.23 μg/m3/year) during 2013-2019. The ambient NO2 levels in the United States experienced continuous decreases during 2000-2019 (-0.20 μg/m3/year), while both India and Europe remained relatively stable. Long-term NO2 exposure inevitably increased premature mortalities. The global premature all-cause mortalities associated with the excessive NO2 exposure increased from 288,169 (95% CI: 43,650, 527,971) to 461,301 (95% CI: 69,973, 843,996) in the past 20 years. This study would provide sufficient policy support for future ambient NO2 mitigation.
Affiliation(s)
- Wenwen Sun
- Department of Research, Shanghai University of Medicine & Health Sciences Affiliated Zhoupu Hospital, Shanghai, 201318, PR China; Department of Atmospheric and Oceanic Sciences, Fudan University, Shanghai, 200032, PR China
- Kuangyi Lu
- College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, 201318, PR China
- Rui Li
- Key Laboratory of Geographic Information Science of the Ministry of Education, School of Geographic Sciences, East China Normal University, Shanghai, 200241, PR China; Institute of Eco-Chongming (IEC), 20 Cuiniao Road, Chenjia Town, Chongming District, Shanghai, 202162, PR China.
3
Clark LP, Zilber D, Schmitt C, Fargo DC, Reif DM, Motsinger-Reif AA, Messier KP. A review of geospatial exposure models and approaches for health data integration. JOURNAL OF EXPOSURE SCIENCE & ENVIRONMENTAL EPIDEMIOLOGY 2024:10.1038/s41370-024-00712-8. [PMID: 39251872 DOI: 10.1038/s41370-024-00712-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Revised: 08/01/2024] [Accepted: 08/05/2024] [Indexed: 09/11/2024]
Abstract
BACKGROUND Geospatial methods are common in environmental exposure assessments and increasingly integrated with health data to generate comprehensive models of environmental impacts on public health. OBJECTIVE Our objective is to review geospatial exposure models and approaches for health data integration in environmental health applications. METHODS We conduct a literature review and synthesis. RESULTS First, we discuss key concepts and terminology for geospatial exposure data and models. Second, we provide an overview of workflows in geospatial exposure model development and health data integration. Third, we review modeling approaches, including proximity-based, statistical, and mechanistic approaches, across diverse exposure types, such as air quality, water quality, climate, and socioeconomic factors. For each model type, we provide descriptions, general equations, and example applications for environmental exposure assessment. Fourth, we discuss the approaches used to integrate geospatial exposure data and health data, such as methods to link data sources with disparate spatial and temporal scales. Fifth, we describe the landscape of open-source tools supporting these workflows.
Affiliation(s)
- Lara P Clark
- National Institute of Environmental Health Sciences, Office of the Scientific Director, Office of Data Science, Durham, NC, USA
- Daniel Zilber
- National Institute of Environmental Health Sciences, Division of Translational Toxicology, Predictive Toxicology Branch, Durham, NC, USA
- Charles Schmitt
- National Institute of Environmental Health Sciences, Office of the Scientific Director, Office of Data Science, Durham, NC, USA
- David C Fargo
- National Institute of Environmental Health Sciences, Office of the Director, Office of Environmental Science Cyberinfrastructure, Durham, NC, USA
- David M Reif
- National Institute of Environmental Health Sciences, Division of Translational Toxicology, Predictive Toxicology Branch, Durham, NC, USA
- Alison A Motsinger-Reif
- National Institute of Environmental Health Sciences, Division of Intramural Research, Biostatistics and Computational Biology Branch, Durham, NC, USA
- Kyle P Messier
- National Institute of Environmental Health Sciences, Division of Translational Toxicology, Predictive Toxicology Branch, Durham, NC, USA.
- National Institute of Environmental Health Sciences, Division of Intramural Research, Biostatistics and Computational Biology Branch, Durham, NC, USA.
4
Venkatraman Jagatha J, Schneider C, Sauter T. Parsimonious Random-Forest-Based Land-Use Regression Model Using Particulate Matter Sensors in Berlin, Germany. SENSORS (BASEL, SWITZERLAND) 2024; 24:4193. [PMID: 39000970 PMCID: PMC11244214 DOI: 10.3390/s24134193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Revised: 06/07/2024] [Accepted: 06/21/2024] [Indexed: 07/16/2024]
Abstract
Machine learning (ML) methods are widely used in particulate matter prediction modelling, especially through use of air quality sensor data. Despite their advantages, these methods' black-box nature obscures the understanding of how a prediction has been made. Major issues with these types of models include the data quality and computational intensity. In this study, we employed feature selection methods using recursive feature elimination and global sensitivity analysis for a random-forest (RF)-based land-use regression model developed for the city of Berlin, Germany. Land-use-based predictors, including local climate zones, leaf area index, daily traffic volume, population density, building types, building heights, and street types were used to create a baseline RF model. Five additional models, three using recursive feature elimination method and two using a Sobol-based global sensitivity analysis (GSA), were implemented, and their performance was compared against that of the baseline RF model. The predictors that had a large effect on the prediction as determined using both methods are discussed. Through feature elimination, the number of predictors was reduced from 220 in the baseline model to eight in the parsimonious models without sacrificing model performance. The model metrics were compared, which showed that the parsimonious_GSA-based model performs better than does the baseline model and reduces the mean absolute error (MAE) from 8.69 µg/m3 to 3.6 µg/m3 and the root mean squared error (RMSE) from 9.86 µg/m3 to 4.23 µg/m3 when applying the trained model to reference station data. The better performance of the GSA_parsimonious model is made possible by the curtailment of the uncertainties propagated through the model via the reduction of multicollinear and redundant predictors. The parsimonious model validated against reference stations was able to predict the PM2.5 concentrations with an MAE of less than 5 µg/m3 for 10 out of 12 locations.
The GSA_parsimonious performed best in all model metrics and improved the R2 from 3% in the baseline model to 17%. However, the predictions exhibited a degree of uncertainty, making it unreliable for regional scale modelling. The GSA_parsimonious model can nevertheless be adapted to local scales to highlight the land-use parameters that are indicative of PM2.5 concentrations in Berlin. Overall, population density, leaf area index, and traffic volume are the major predictors of PM2.5, while building type and local climate zones are the less significant predictors. Feature selection based on sensitivity analysis has a large impact on the model performance. Optimising models through sensitivity analysis can enhance the interpretability of the model dynamics and potentially reduce computational costs and time when modelling is performed for larger areas.
Affiliation(s)
- Christoph Schneider
- Geography Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Tobias Sauter
- Geography Department, Humboldt-Universität zu Berlin, Unter den Linden 6, 10099 Berlin, Germany
5
Parker JD, Mirel LB, Lee P, Mintz R, Tungate A, Vaidyanathan A. Evaluating data quality for blended data using a data quality framework. STATISTICAL JOURNAL OF THE IAOS 2024; 40:125-136. [PMID: 38800620 PMCID: PMC11117461 DOI: 10.3233/sji-230125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
In 2020 the U.S. Federal Committee on Statistical Methodology (FCSM) released "A Framework for Data Quality", organized by 11 dimensions of data quality grouped among three domains of quality (utility, objectivity, integrity). This paper addresses the use of the FCSM Framework for data quality assessments of blended data. The FCSM Framework applies to all types of data; however, best practices for implementation have not been documented. We applied the FCSM Framework for three health-research related case studies. For each case study, assessments of data quality dimensions were performed to identify threats to quality, possible mitigations of those threats, and trade-offs among them. From these assessments the authors concluded: 1) data quality assessments are more complex in practice than anticipated and expert guidance and documentation are important; 2) each dimension may not be equally important for different data uses; 3) data quality assessments can be subjective and having a quantitative tool could help explain the results; however, quantitative assessments may be closely tied to the intended use of the dataset; 4) there are common trade-offs and mitigations for some threats to quality among dimensions. This paper is one of the first to apply the FCSM Framework to specific use-cases and illustrates a process for similar data uses.
Affiliation(s)
- Jennifer D. Parker
- National Center for Health Statistics, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services
- Lisa B. Mirel
- National Center for Science and Engineering Statistics, National Science Foundation
- Phillip Lee
- Administration for Children and Families, U.S. Department of Health and Human Services
- Andrew Tungate
- Centers for Medicare and Medicaid Services, U.S. Department of Health and Human Services
- Ambarish Vaidyanathan
- National Center for Environmental Health, Centers for Disease Control and Prevention, U.S. Department of Health and Human Services
6
Gholami H, Mohammadifar A, Behrooz RD, Kaskaoutis DG, Li Y, Song Y. Intrinsic and extrinsic techniques for quantification uncertainty of an interpretable GRU deep learning model used to predict atmospheric total suspended particulates (TSP) in Zabol, Iran during the dusty period of 120-days wind. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2024; 342:123082. [PMID: 38061429 DOI: 10.1016/j.envpol.2023.123082] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/30/2023] [Revised: 11/11/2023] [Accepted: 11/30/2023] [Indexed: 12/17/2023]
Abstract
Total suspended particulates (TSP), as a key pollutant, are a serious threat to air quality, climate, ecosystems and human health. Therefore, measurements, prediction and forecasting of TSP concentrations are necessary to mitigate their negative effects. This study applies the gated recurrent unit (GRU) deep learning model to predict TSP concentrations in Zabol, Iran, during the dust period of the 120-day wind (3 June - 4 October 2014). Three uncertainty quantification (UQ) techniques consisting of the blackbox metamodel, heteroscedastic regression and infinitesimal jackknife were applied to quantify the uncertainty associated with the GRU model. Permutation feature importance measure (PFIM), based on the game theory, was employed for the interpretability of the predictive model's outputs. A total of 80 TSP samples were collected and were randomly divided as training (70%) and validation (30%) datasets, while eight variables were used in the TSP prediction model. Our findings showed that GRU performed very well for TSP prediction (with r and Nash Sutcliffe coefficient (NSC) values above 0.99 for both datasets, and RMSE of 57 μg m-3 and 73 μg m-3 for training and validation datasets, respectively). Among the three UQ techniques, the infinitesimal jackknife was the most accurate one, while all the observed and predicted TSP values fell within the confidence limits estimated by the model. PFIM plots showed that wind speed and air humidity were the most and least important variables, respectively, impacting the predictive model's outputs. This is the first attempt at using an interpretable DL model for TSP prediction modelling, recommending that future research should involve aspects of uncertainty and interpretability of the predictive models.
Overall, UQ and interpretability techniques have a key role in reducing the impact of uncertainties during optimization and decision making, resulting in better understanding of sophisticated mechanisms related to the predictive model.
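For readers unfamiliar with permutation feature importance, the following is a minimal sketch of the general technique, not the authors' implementation. It scores each feature by how much the RMSE rises when that feature's column is shuffled. The toy predictor, the feature meanings ("wind speed", "humidity"), and all function names are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in RMSE when one feature column is shuffled.

    predict: callable mapping a list of feature rows to predictions.
    X: list of rows, each a list of feature values.
    """
    rng = random.Random(seed)
    baseline = rmse(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(rmse(y, predict(X_perm)) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Toy data (assumption): the target depends on feature 0 ("wind speed")
# and not at all on feature 1 ("humidity").
data_rng = random.Random(1)
X = [[data_rng.uniform(0, 20), data_rng.uniform(10, 90)] for _ in range(200)]
y = [50.0 * row[0] for row in X]


def toy_predict(rows):
    """Stand-in for a trained model; uses only feature 0."""
    return [50.0 * r[0] for r in rows]


scores = permutation_importance(toy_predict, X, y)
print(scores)
```

A feature the model ignores receives a score of zero, while a feature the model relies on receives a positive score, which is the ranking logic behind importance plots of this kind.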
Affiliation(s)
- Hamid Gholami
- Department of Natural Resources Engineering, University of Hormozgan, Bandar-Abbas, Hormozgan, Iran.
- Aliakbar Mohammadifar
- Department of Natural Resources Engineering, University of Hormozgan, Bandar-Abbas, Hormozgan, Iran
- Reza Dahmardeh Behrooz
- Department of Environmental Science, Faculty of Natural Resources, University of Zabol, P.O. Box 98615-538, Zabol, Iran
- Dimitris G Kaskaoutis
- Department of Chemical Engineering, University of Western Macedonia, Kozani, 50100, Greece
- Yue Li
- State Key Laboratory of Loess and Quaternary Geology, Institute of Earth Environment, Chinese Academy of Sciences, Xi'an, 710061, China; Laoshan Laboratory, Qingdao, 266061, China
- Yougui Song
- State Key Laboratory of Loess and Quaternary Geology, Institute of Earth Environment, Chinese Academy of Sciences, Xi'an, 710061, China; Laoshan Laboratory, Qingdao, 266061, China.
7
Lyu T, Tang Y, Cao H, Gao Y, Zhou X, Zhang W, Zhang R, Jiang Y. Estimating the geographical patterns and health risks associated with PM2.5-bound heavy metals to guide PM2.5 control targets in China based on machine-learning algorithms. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2023; 337:122558. [PMID: 37714401 DOI: 10.1016/j.envpol.2023.122558] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/18/2023] [Revised: 09/02/2023] [Accepted: 09/12/2023] [Indexed: 09/17/2023]
Abstract
PM2.5 is the main component of haze, and PM2.5-bound heavy metals (PBHMs) can induce various toxic effects via inhalation. However, comprehensive macroanalyses on large scales are still lacking. In this study, we compiled a substantial dataset consisting of the concentrations of eight PBHMs, including As, Cd, Cr, Cu, Mn, Ni, Pb and Zn, across different cities in China. To improve prediction accuracy, we enhanced the traditional land-use regression (LUR) model by incorporating emission source-related variables and employing the best-fitted machine-learning algorithm, which was applied to predict PBHM concentrations, analyze geographical patterns and assess the health risks associated with metals under different PM2.5 control targets. Our model exhibited excellent performance in predicting the concentrations of PBHMs, with predicted values closely matching measured values. Noncarcinogenic risks exist in 99.4% of the estimated regions, and the carcinogenic risks in all studied regions of the country are within an acceptable range (1 × 10^-5 to 1 × 10^-6). In densely populated areas such as Henan, Shandong, and Sichuan, it is imperative to control the concentration of PBHMs to reduce the number of patients with cancer. Controlling PM2.5 effectively decreases both carcinogenic and noncarcinogenic health risks associated with PBHMs, but the risks still exceed the acceptable level, suggesting that other important emission sources should be given attention.
Affiliation(s)
- Tong Lyu
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Yilin Tang
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Hongbin Cao
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China.
- Yue Gao
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Xu Zhou
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Wei Zhang
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Ruidi Zhang
- Beijing Area Major Laboratory of Protection and Utilization of Traditional Chinese Medicine, Beijing Normal University, Beijing, 100875, China; Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
- Yanxue Jiang
- College of Environment and Ecology, Chongqing University, Chongqing, 400045, China
8
Wongnakae P, Chitchum P, Sripramong R, Phosri A. Application of satellite remote sensing data and random forest approach to estimate ground-level PM2.5 concentration in Northern region of Thailand. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:88905-88917. [PMID: 37442931 DOI: 10.1007/s11356-023-28698-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Accepted: 07/05/2023] [Indexed: 07/15/2023]
Abstract
Numerous epidemiological studies have shown that particulate matter with aerodynamic diameter up to 2.5 μm (PM2.5) is associated with many health consequences, where PM2.5 concentration obtained from the monitoring station was normally applied as the exposure level, so that the concentration of PM2.5 in unmonitored areas has not been captured. The satellite-derived aerosol optical depth (AOD) product is then used to spatially predict ground truth of PM2.5 concentration that covers the locations with no air quality monitoring station, but this method has seldom been developed in Thailand. This study aimed at estimating ground-level PM2.5 concentration at 3 km × 3 km spatial resolution over Northern region of Thailand in 2021 using the random forest model integrating the Moderate Resolution Imaging Spectroradiometer (MODIS) AOD products from Terra and Aqua satellites, meteorological factors, and land use data. A random forest model containing 100 decision trees was trained, and a 10-fold cross-validation approach was implemented to validate the model performance. The good consistency between actual (observed) and predicted concentrations of PM2.5 in Northern region of Thailand was observed, where a coefficient of determination (R2) and root mean square error (RMSE) of the model fitting were 0.803 and 14.30 μg/m3, respectively, and those of 10-fold cross-validation approach were 0.796 and 14.64 μg/m3, respectively. The three most important predictors for estimating the ground-level concentrations of PM2.5 in this study were normalized difference vegetation index (NDVI), relative humidity, and number of fire hotspots, respectively.
Findings from this study revealed that integrating the MODIS AOD, meteorological variables, and land use data into the random forest model precisely and accurately estimated ground-level PM2.5 concentration over Northern region of Thailand that can be further used to investigate the effects of PM2.5 exposure on health consequences, even in unmonitored locations, in epidemiological studies.
Affiliation(s)
- Pimchanok Wongnakae
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand
- Pakkapong Chitchum
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand
- Rungduen Sripramong
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand
- Arthit Phosri
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand.
- Center of Excellence on Environmental Health and Toxicology (EHT), OPS, Ministry of Higher Education, Research, Science and Innovation, Bangkok, Thailand.
9
Tao H, Jawad AH, Shather AH, Al-Khafaji Z, Rashid TA, Ali M, Al-Ansari N, Marhoon HA, Shahid S, Yaseen ZM. Machine learning algorithms for high-resolution prediction of spatiotemporal distribution of air pollution from meteorological and soil parameters. ENVIRONMENT INTERNATIONAL 2023; 175:107931. [PMID: 37119651 DOI: 10.1016/j.envint.2023.107931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 03/18/2023] [Accepted: 04/11/2023] [Indexed: 05/22/2023]
Abstract
This study uses machine learning (ML) models for a high-resolution prediction (0.1°×0.1°) of air fine particulate matter (PM2.5) concentration, the most harmful to human health, from meteorological and soil data. Iraq was considered the study area to implement the method. Different lags and the changing patterns of four European Reanalysis (ERA5) meteorological variables, rainfall, mean temperature, wind speed and relative humidity, and one soil parameter, the soil moisture, were used to select the suitable set of predictors using a non-greedy algorithm known as simulated annealing (SA). The selected predictors were used to simulate the temporal and spatial variability of air PM2.5 concentration over Iraq during the early summer (May-July), the most polluted months, using three advanced ML models, extremely randomized trees (ERT), stochastic gradient descent backpropagation (SGD-BP) and long short-term memory (LSTM) integrated with Bayesian optimizer. The spatial distribution of the annual average PM2.5 revealed the population of the whole of Iraq is exposed to a pollution level above the standard limit. The changes in temperature and soil moisture and the mean wind speed and humidity of the month before the early summer can predict the temporal and spatial variability of PM2.5 over Iraq during May-July. Results revealed the higher performance of LSTM with normalized root-mean-square error and Kling-Gupta efficiency of 13.4% and 0.89, compared to 16.02% and 0.81 for SGD-BP and 17.9% and 0.74 for ERT. The LSTM could also reconstruct the observed spatial distribution of PM2.5 with MapCurve and Cramer's V values of 0.95 and 0.91, compared to 0.9 and 0.86 for SGD-BP and 0.83 and 0.76 for ERT. The study provided a methodology for forecasting spatial variability of PM2.5 concentration at high resolution during the peak pollution months from freely available data, which can be replicated in other regions for generating high-resolution PM2.5 forecasting maps.
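The two skill scores quoted in the abstract can be computed as in the sketch below. It uses the standard Kling-Gupta efficiency, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where r is the linear correlation, alpha the ratio of simulated to observed standard deviation, and beta the ratio of simulated to observed mean. Normalizing the RMSE by the observed mean is an assumption, since the abstract does not state which normalization the authors used. The sample values are made up.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def nrmse_percent(obs, sim):
    """RMSE as a percentage of the observed mean (normalization is an assumption)."""
    rmse = math.sqrt(mean([(o - s) ** 2 for o, s in zip(obs, sim)]))
    return 100.0 * rmse / mean(obs)


def kling_gupta_efficiency(obs, sim):
    """KGE = 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2); 1 is a perfect score."""
    mo, ms = mean(obs), mean(sim)
    so, ss = std(obs), std(sim)
    r = mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)]) / (so * ss)
    alpha = ss / so  # variability ratio
    beta = ms / mo   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up concentrations, for illustration only.
observed = [30.0, 45.0, 60.0, 52.0, 41.0]
simulated = [1.2 * v for v in observed]  # perfectly correlated but 20% too high

print(nrmse_percent(observed, simulated))
print(kling_gupta_efficiency(observed, simulated))
```

Unlike a correlation coefficient, the KGE penalizes bias and mismatched variability, so a simulation that is perfectly correlated with the observations but consistently too high still scores below 1.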
Affiliation(s)
- Hai Tao
- School of Computer and Information, Qiannan Normal University for Nationalities, Duyun, Guizhou 558000, China; State Key Laboratory of Public Big Data, Guizhou University, Guizhou, Guiyang 550025, China; Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- Ali H Jawad
- Faculty of Applied Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia.
- A H Shather
- Dep of Computer Technology Engineering, Engineering Technical College, University of Alkitab, Iraq.
- Zainab Al-Khafaji
- Department of Building and Construction Technologies Engineering, AL-Mustaqbal University College, Hillah 51001, Iraq.
- Tarik A Rashid
- Computer Science and Engineering Department, University of Kurdistan Hewler, Erbil, KR, Iraq.
- Mumtaz Ali
- UniSQ College, University of Southern Queensland, QLD 4350, Australia.
- Nadhir Al-Ansari
- Dept. of Civil, Environmental and Natural Resources Engineering, Lulea Univ. of Technology, Lulea T3334, Sweden.
- Haydar Abdulameer Marhoon
- Information and Communication Technology Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, Iraq; College of Computer Sciences and Information Technology, University of Kerbala, Karbala, Iraq.
- Shamsuddin Shahid
- Department of Hydraulics and Hydrology, School of Civil Engineering, Faculty of Engineering, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor, Malaysia.
- Zaher Mundher Yaseen
- Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia; Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia.
| |
Collapse
|
10
|
Zhai S, Zhang Y, Huang J, Li X, Wang W, Zhang T, Yin F, Ma Y. Exploring the detailed spatiotemporal characteristics of PM 2.5: Generating a full-coverage and hourly PM 2.5 dataset in the Sichuan Basin, China. CHEMOSPHERE 2023; 310:136786. [PMID: 36257387 DOI: 10.1016/j.chemosphere.2022.136786] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/04/2022] [Indexed: 06/16/2023]
Abstract
Fine particulate matter (PM2.5) has received worldwide attention due to its threat to public health. In the Sichuan Basin (SCB), PM2.5 is causing heavy health burdens due to its high concentrations and population density. Compared with other heavily polluted areas, less effort has been made to generate a full-coverage PM2.5 dataset of the SCB, in which the detailed PM2.5 spatiotemporal characteristics remain unclear. Considering commonly existing spatiotemporal autocorrelations, the top-of-atmosphere reflectance (TOAR) with a high coverage rate and other auxiliary data were employed to build commonly used random forest (RF) models to generate accurate hourly PM2.5 concentration predictions with a 0.05° × 0.05° spatial resolution in the SCB in 2016. Specifically, with historical concentrations predicted from a spatial RF (S-RF) and observed at stations, an alternative spatiotemporal RF (AST-RF) and spatiotemporal RF (ST-RF) were built in grids with stations (type 1). The predictions from the AST-RF in grids without stations (type 2) and observations in type 1 formed the PM2.5 dataset. The LOOCV R2, RMSE and MAE were 0.94/0.94, 8.71/8.62 μg∕m3 and 5.58/5.57 μg∕m3 in the AST-RF/ST-RF, respectively. Using the produced dataset, spatiotemporal analysis was conducted for a detailed understanding of the spatiotemporal characteristics of PM2.5 in the SCB. The PM2.5 concentrations gradually increased from the edge to the center of the SCB in spatial distribution. Two high-concentration areas centered on Chengdu and Zigong were observed throughout the year, while another high-concentration area centered on Dazhou was only observed in winter. The diurnal variation had double peaks and double valleys in the SCB. The concentrations were high at night and low in daytime, which suggests that characterizing the relationship between PM2.5 and adverse health outcomes by daily means might be inaccurate with most human activities conducted in daytime.
Affiliation(s)
- Siwei Zhai
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Yi Zhang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Jingfei Huang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Xuelin Li
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Wei Wang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Tao Zhang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Fei Yin
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Yue Ma
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
|
11
|
Random Forest Estimation and Trend Analysis of PM2.5 Concentration over the Huaihai Economic Zone, China (2000–2020). SUSTAINABILITY 2022. [DOI: 10.3390/su14148520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]
Abstract
Consisting of ten cities in four Chinese provinces, the Huaihai Economic Zone has suffered serious air pollution over the last two decades, particularly from fine particulate matter (PM2.5). In this study, we used multi-source data, namely MAIAC AOD (at a 1 km spatial resolution), meteorological, topographic, date, and location (latitude and longitude) data, to construct a random forest regression model to estimate the daily PM2.5 concentration over the Huaihai Economic Zone from 2000 to 2020. It was found that the variable expressing time (date) had the greatest feature importance when estimating PM2.5. By averaging the modeled daily PM2.5 concentration, we produced a yearly PM2.5 concentration dataset, at a 1 km resolution, for the study area from 2000 to 2020. On comparing modeled daily PM2.5 with observational data, the coefficient of determination (R2) of the modeling was 0.85, the root mean square error (RMSE) was 14.63 μg/m3, and the mean absolute error (MAE) was 10.03 μg/m3. The quality assessment of the synthesized yearly PM2.5 concentration dataset shows that R2 = 0.77, RMSE = 6.92 μg/m3, and MAE = 5.42 μg/m3. Despite different trends from 2000–2010 and from 2010–2020, the trend of PM2.5 concentration over the Huaihai Economic Zone during the 21 years was, overall, decreasing. The area with a significantly decreasing trend was small and mainly concentrated in the lake areas of the Zone. It is concluded that PM2.5 can be well estimated from the MAIAC AOD dataset when spatiotemporal variability is incorporated using random forest, and that the resultant PM2.5 concentration data provide a basis for environmental monitoring over large geographic areas.
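The daily-to-yearly aggregation and the error metrics mentioned in this abstract can be sketched as follows. This is an illustrative sketch only. The input layout (pairs of year and daily value) and the function names are assumptions, and it treats a single grid cell rather than the full 1 km raster.

```python
import math
from collections import defaultdict


def yearly_means(daily):
    """Average daily values per year; `daily` is an iterable of (year, value)."""
    sums = defaultdict(lambda: [0.0, 0])
    for year, value in daily:
        sums[year][0] += value
        sums[year][1] += 1
    return {year: total / count for year, (total, count) in sums.items()}


def rmse(obs, pred):
    """Root mean square error between paired observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between paired observations and predictions."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)
```

Averaging tends to cancel day-to-day errors, which is consistent with the yearly dataset showing a lower RMSE and MAE than the daily estimates.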
|
12
|
Prediction of Air Pollutant Concentrations via RANDOM Forest Regressor Coupled with Uncertainty Analysis—A Case Study in Ningxia. ATMOSPHERE 2022. [DOI: 10.3390/atmos13060960] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Abstract
Air pollution did not receive much attention until recent years, when people started to understand its severe impacts on human health. Using air pollution and meteorological monitoring data from 1 January 2016 to 31 December 2017 in Ningxia, we analyzed the impact of ground surface temperature, air temperature, relative humidity and the power of wind on air pollutant concentrations. We also analyzed the relationships between air pollutant concentrations and meteorological variables using a decision tree regressor (DTR), a feedforward artificial neural network with back-propagation algorithm (FFANN-BP) and a random forest regressor (RFR), based on air-monitoring station data. For all pollutants, the RFR increases the R2 of FFANN-BP and DTR by up to 0.53 and 0.42, respectively, reduces the root mean square error (RMSE) by up to 68.7 and 41.2, and reduces the mean absolute error (MAE) by up to 25.2 and 17. The empirical results show that the proposed RFR displays the best forecasting performance and could provide local authorities with reliable and precise predictions of air pollutant concentrations. The RFR effectively establishes the relationships between the influential factors and air pollutant concentrations, suppresses overfitting, and improves prediction accuracy. In addition, the limitation of machine learning for single-site prediction is overcome.
|
13
|
Iskandaryan D, Ramos F, Trilles S. Bidirectional convolutional LSTM for the prediction of nitrogen dioxide in the city of Madrid. PLoS One 2022; 17:e0269295. [PMID: 35648766 PMCID: PMC9159618 DOI: 10.1371/journal.pone.0269295] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2022] [Accepted: 05/18/2022] [Indexed: 12/03/2022] Open
Abstract
Nitrogen dioxide is one of the pollutants with the most significant health effects. Advanced information on its concentration in the air can help to monitor and control further consequences more effectively, while also making it easier to apply preventive and mitigating measures. Machine learning technologies with available methods and capabilities, combined with the geospatial dimension, can perform predictive analyses with higher accuracy and, as a result, can serve as a supportive tool for productive management. One of the most advanced machine learning algorithms, Bidirectional convolutional LSTM, is being used in ongoing work to predict the concentration of nitrogen dioxide. The model has been validated to perform more accurate spatiotemporal analysis based on the integration of temporal and geospatial factors. The analysis was carried out according to two scenarios developed on the basis of selected features using data from the city of Madrid for the periods January-June 2019 and January-June 2020. Evaluation of the model's performance was conducted using the Root Mean Square Error and the Mean Absolute Error which emphasises the superiority of the proposed model over the reference models. In addition, the significance of a feature selection technique providing improved accuracy was underlined. In terms of execution time, due to the complexity of the Bidirectional convolutional LSTM architecture, convergence and generalisation of the data took longer, resulting in the superiority of the reference models.
Affiliation(s)
- Ditsuhi Iskandaryan
  - Institute of New Imaging Technologies (INIT), Universitat Jaume I, Castelló de la Plana, Castellón, Spain
- Francisco Ramos
  - Institute of New Imaging Technologies (INIT), Universitat Jaume I, Castelló de la Plana, Castellón, Spain
- Sergio Trilles
  - Institute of New Imaging Technologies (INIT), Universitat Jaume I, Castelló de la Plana, Castellón, Spain
|
14
|
A New Coupling Method for PM2.5 Concentration Estimation by the Satellite-Based Semiempirical Model and Numerical Model. REMOTE SENSING 2022. [DOI: 10.3390/rs14102360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/27/2023]
Abstract
Aerosol optical and chemical properties play a major role in the retrieval of PM2.5 concentrations based on aerosol optical depth (AOD) data from satellites in the conventional semiempirical model (SEM). However, limited observation information hinders the high-resolution estimation of PM2.5. Therefore, a new method for evaluating near-surface PM2.5 at high spatial resolution is developed by coupling the SEM and the chemical transport model (CTM)-based numerical model (the CSEN model). The numerical model can provide information on aerosol properties with high spatial resolution at a large scale based on emissions and meteorology, though it can still be biased in simulating absolute PM2.5 concentrations. Therefore, two crucial aerosol characteristic parameters in the SEM, the integrated humidity effect coefficient (γ′) and the comprehensive reference value of aerosol properties (K), have been redefined using the WRF-Chem numerical model. Improved model performance was observed for these results compared with the original SEM results. The monthly averaged correlation coefficients (R) for CSEN were 0.92, 0.82, 0.84, and 0.83 in January, April, July, and October, respectively, whereas those of the SEM were 0.80, 0.77, 0.72, and 0.72, respectively. All the statistical metrics of the model validation showed significant improvements in all seasons. The reduced biases of PM2.5 estimated by CSEN indicate the effect of hygroscopic growth, and of aerosol properties affected by meteorology, on the relationship between AOD and estimated PM2.5 concentrations, especially in winter and summer. The better performance of the CSEN model provides insight for air quality monitoring at different scales, which supplies important information for air pollution control policies and health impact analysis.
|
15
|
Dimakopoulou K, Samoli E, Analitis A, Schwartz J, Beevers S, Kitwiroon N, Beddows A, Barratt B, Rodopoulou S, Zafeiratou S, Gulliver J, Katsouyanni K. Development and Evaluation of Spatio-Temporal Air Pollution Exposure Models and Their Combinations in the Greater London Area, UK. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19095401. [PMID: 35564796 PMCID: PMC9103954 DOI: 10.3390/ijerph19095401] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/31/2022] [Revised: 04/26/2022] [Accepted: 04/27/2022] [Indexed: 11/21/2022]
Abstract
Land use regression (LUR) and dispersion/chemical transport models (D/CTMs) are frequently applied to predict exposure to air pollution concentrations at a fine scale for use in epidemiological studies. Moreover, satellite aerosol optical depth data have been a key predictor, especially for particulate matter pollution and when studying large populations. Within the STEAM project we present a hybrid spatio-temporal modeling framework by (a) incorporating predictions from dispersion modeling of nitrogen dioxide (NO2), ozone (O3) and particulate matter with an aerodynamic diameter equal to or less than 10 μm (PM10) and less than 2.5 μm (PM2.5) into a spatio-temporal LUR model; and (b) combining the predictions from LUR and dispersion modeling and additionally, only for PM2.5, from an ensemble machine learning approach, using a generalized additive model (GAM). We used air pollution measurements from 2009 to 2013 from 62 fixed monitoring sites for O3, 115 for particles and up to 130 for NO2, obtained from the dense network in the Greater London Area, UK. We assessed all models following a 10-fold cross validation (10-fold CV) procedure. The hybrid models performed better than separate LUR models. Incorporating the dispersion estimates in the LUR models as a predictor improved the LUR model fit: CV-R2 increased to 0.76 from 0.71 for NO2, to 0.79 from 0.57 for PM10, to 0.81 from 0.66 for PM2.5 and to 0.75 from 0.62 for O3. The CV-R2 obtained from the hybrid GAM framework was also higher than that of separate LUR models (CV-R2 = 0.80 for NO2, 0.76 for PM10, 0.79 for PM2.5 and 0.75 for O3). Our study supports the combined use of different air pollution exposure assessment methods in a single modeling framework to improve the accuracy of spatio-temporal predictions for subsequent use in epidemiological studies.
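A cross-validated R2 of the kind reported here can be sketched as below. This is a hedged illustration, not the STEAM implementation. It assumes folds are formed by holding out whole monitoring sites (the abstract does not say how folds were built), and it substitutes a one-predictor least-squares line for the LUR and GAM models. All names are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def site_kfold_cv_r2(records, k=10, seed=0):
    """CV-R2 from pooled out-of-fold predictions.

    `records` is a list of (site_id, predictor, observed) tuples.
    Sites, not individual rows, are assigned to folds, so no site
    contributes to both training and testing in the same fold.
    """
    sites = sorted({rec[0] for rec in records})
    random.Random(seed).shuffle(sites)
    folds = [set(sites[i::k]) for i in range(k)]
    obs, pred = [], []
    for held_out in folds:
        train = [rec for rec in records if rec[0] not in held_out]
        test = [rec for rec in records if rec[0] in held_out]
        slope, intercept = fit_line([t[1] for t in train], [t[2] for t in train])
        obs.extend(t[2] for t in test)
        pred.extend(intercept + slope * t[1] for t in test)
    return r_squared(obs, pred)
```

Holding out entire sites tests how well a model transfers to unmonitored locations, which is the use case for exposure assessment. Splitting rows at random would let the same site appear in both training and test data and usually gives a more optimistic score.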
Affiliation(s)
- Konstantina Dimakopoulou
  - Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
- Evangelia Samoli
  - Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
- Antonis Analitis
  - Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
- Joel Schwartz
  - Department of Environmental Health, Harvard TH Chan School of Public Health, Boston, MA 02115, USA
  - Department of Epidemiology, Harvard TH Chan School of Public Health, Boston, MA 02115, USA
- Sean Beevers
  - MRC Centre for Environment and Health, Imperial College London, London SE1 9NH, UK
- Nutthida Kitwiroon
  - MRC Centre for Environment and Health, Imperial College London, London SE1 9NH, UK
- Andrew Beddows
  - MRC Centre for Environment and Health, Imperial College London, London SE1 9NH, UK
- Benjamin Barratt
  - MRC Centre for Environment and Health, Imperial College London, London SE1 9NH, UK
  - National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Health Impact of Environmental Hazards, Imperial College London, London SW7 2AZ, UK
- Sophia Rodopoulou
  - Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
- Sofia Zafeiratou
  - Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
- John Gulliver
  - Centre for Environmental Health and Sustainability, School of Geography, Geology and the Environment, University of Leicester, University Road, Leicester LE1 7RH, UK
- Klea Katsouyanni
  - Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 115 27 Athens, Greece
  - MRC Centre for Environment and Health, Imperial College London, London SE1 9NH, UK
|
16
|
Palanichamy N, Haw SC, S S, Murugan R, Govindasamy K. Machine learning methods to predict particulate matter PM 2.5. F1000Res 2022; 11:406. [PMID: 36531254 PMCID: PMC9723408 DOI: 10.12688/f1000research.73166.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/09/2022] [Indexed: 01/13/2023] Open
Abstract
Introduction: Pollution of air in urban cities across the world has been steadily increasing in recent years. An increasing trend in particulate matter, PM 2.5, is a threat because it can lead to uncontrollable consequences like worsening of asthma and cardiovascular disease. The metric used to measure air quality is the air pollutant index (API). In Malaysia, machine learning (ML) techniques for PM 2.5 have received less attention as the concentration is on predicting other air pollutants. To fill the research gap, this study focuses on correctly predicting PM 2.5 concentrations in the smart cities of Malaysia by comparing supervised ML techniques, which helps to mitigate its adverse effects.
Methods: In this paper, ML models for forecasting PM 2.5 concentrations were investigated on Malaysian air quality data sets from 2017 to 2018. The dataset was preprocessed by data cleaning and a normalization process. Next, it was reduced into an informative dataset with location and time factors in the feature extraction process. The dataset was fed into three supervised ML classifiers, which include random forest (RF), artificial neural network (ANN) and long short-term memory (LSTM). Finally, their output was evaluated using the confusion matrix and compared to identify the best model for the accurate prediction of PM 2.5.
Results: Overall, the experimental result shows an accuracy of 97.7% was obtained by the RF model in comparison with the accuracy of ANN (61.14%) and LSTM (61.77%) in predicting PM 2.5.
Discussion: RF performed well when compared with ANN and LSTM for the given data with minimum features. RF was able to reach good accuracy as the model learns from the random samples by using decision tree with the maximum vote on the predictions.
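The confusion-matrix evaluation described in the Methods can be sketched as follows. This is illustrative only. The class labels are hypothetical, since the abstract does not list the PM 2.5 categories used, and the helper names are assumptions.

```python
def confusion_matrix(actual, predicted, labels):
    """Count matrix with rows = actual class and columns = predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical air-quality categories, for illustration only.
labels = ["good", "moderate", "unhealthy"]
actual = ["good", "good", "moderate", "unhealthy"]
predicted = ["good", "moderate", "moderate", "unhealthy"]
matrix = confusion_matrix(actual, predicted, labels)
print(matrix, accuracy(matrix))
```

Accuracy alone can hide poor performance on rare categories, so per-class rows of the matrix are worth inspecting alongside the headline figure.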
Affiliation(s)
- Naveen Palanichamy
  - Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia
- Su-Cheng Haw
  - Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia
- Subramanian S
  - Department of Electrical Engineering, Annamalai University, India, Chidambaram, Tamil Nadu, 608002, India
- Rishanti Murugan
  - Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia
- Kuhaneswaran Govindasamy
  - Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia
|
17
|
Estimation and Analysis of PM 2.5 Concentrations with NPP-VIIRS Nighttime Light Images: A Case Study in the Chang-Zhu-Tan Urban Agglomeration of China. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19074306. [PMID: 35409987 PMCID: PMC8998965 DOI: 10.3390/ijerph19074306] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/08/2022] [Revised: 03/30/2022] [Accepted: 03/31/2022] [Indexed: 02/04/2023]
Abstract
Rapid economic and social development has caused serious atmospheric environmental problems. The temporal and spatial distribution characteristics of PM2.5 concentrations have become an important research topic for sustainable social development monitoring. Based on NPP-VIIRS nighttime light images, meteorological data, and SRTM DEM data, this article builds a PM2.5 concentration estimation model for the Chang-Zhu-Tan urban agglomeration. First, the partial least squares method is used to calculate the nighttime light radiance, meteorological elements (temperature, relative humidity, and wind speed), and topographic elements (elevation, slope, and topographic undulation) for correlation analysis. Second, we construct seasonal and annual PM2.5 concentration estimation models, including multiple linear regression, random forest, support vector regression, Gaussian process regression, etc., with different factor sets. Finally, the accuracy of the PM2.5 concentration estimation model results for the Chang-Zhu-Tan urban agglomeration is analyzed, and the spatial distribution of the PM2.5 concentration is inverted. The results show that the correlation of PM2.5 concentration with meteorological elements is the strongest, and that with topographic elements is the weakest. In terms of seasonal estimation, the spring estimation results of multiple linear regression and machine learning estimation models are the worst, the winter estimation results of multiple linear regression estimation models are the best, and the annual estimation results of machine learning estimation models are the best. At the same time, the study found that there is a significant difference in the temporal and spatial distribution of PM2.5 concentrations. The methods in this article overcome, to a certain extent, the high cost and spatial resolution limitations of traditional large-scale PM2.5 concentration monitoring, and can provide a reference for the study of PM2.5 concentration estimation and prediction based on satellite remote sensing technology.
|
18
|
Gardner-Frolick R, Boyd D, Giang A. Selecting Data Analytic and Modeling Methods to Support Air Pollution and Environmental Justice Investigations: A Critical Review and Guidance Framework. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:2843-2860. [PMID: 35133145 DOI: 10.1021/acs.est.1c01739] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Given the serious adverse health effects associated with many pollutants, and the inequitable distribution of these effects between socioeconomic groups, air pollution is often a focus of environmental justice (EJ) research. However, EJ analyses that aim to illuminate whether and how air pollution hazards are inequitably distributed may present a unique set of requirements for estimating pollutant concentrations compared to other air quality applications. Here, we perform a scoping review of the range of data analytic and modeling methods applied in past studies of air pollution and environmental injustice and develop a guidance framework for selecting between them given the purpose of analysis, users, and resources available. We include proxy, monitor-based, statistical, and process-based methods. Upon critically synthesizing the literature, we identify four main dimensions to inform method selection: accuracy, interpretability, spatiotemporal features of the method, and usability of the method. We illustrate the guidance framework with case studies from the literature. Future research in this area includes an exploration of increasing data availability, advanced statistical methods, and the importance of science-based policy.
Affiliation(s)
- Rivkah Gardner-Frolick
  - Department of Mechanical Engineering, University of British Columbia, Vancouver V6T 1Z4, Canada
- David Boyd
  - Institute for Resources, Environment and Sustainability, University of British Columbia, Vancouver V6T 1Z4, Canada
- Amanda Giang
  - Department of Mechanical Engineering, University of British Columbia, Vancouver V6T 1Z4, Canada
  - Institute for Resources, Environment and Sustainability, University of British Columbia, Vancouver V6T 1Z4, Canada
|
19
|
Retrieval of Fine-Grained PM2.5 Spatiotemporal Resolution Based on Multiple Machine Learning Models. REMOTE SENSING 2022. [DOI: 10.3390/rs14030599] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Abstract
Due to the country’s rapid economic growth, the problem of air pollution in China is becoming increasingly serious. In order to achieve a win-win situation for the environment and urban development, the government has issued many policies to strengthen environmental protection. PM2.5 is the primary particulate matter in air pollution, so an accurate estimation of PM2.5 distribution is of great significance. Although previous studies have attempted to retrieve PM2.5 using geostatistical or aerosol remote sensing retrieval methods, the current rough resolution and accuracy remain as limitations of such methods. This paper proposes a fine-grained spatiotemporal PM2.5 retrieval method that comprehensively considers various datasets, such as Landsat 8 satellite images, ground monitoring station data, and socio-economic data, to explore the applicability of different machine learning algorithms in PM2.5 retrieval. Six typical algorithms were used to train the multi-dimensional elements in a series of experiments. The characteristics of retrieval accuracy in different scenarios were clarified mainly according to the validation index, R2. The random forest algorithm was shown to have the best numerical and PM2.5-based air-quality-category accuracy, with a cross-validated R2 of 0.86 and a category retrieval accuracy of 0.83, while both maintained excellent retrieval accuracy and achieved a high spatiotemporal resolution. Based on this retrieval model, we evaluated the PM2.5 distribution characteristics and hourly variation in the sample area, as well as the functions of different input variables in the model. The PM2.5 retrieval method proposed in this paper provides a new model for fine-grained PM2.5 concentration estimation to determine the distribution laws of air pollutants and thereby specify more effective measures to realize the high-quality development of the city.
|
20
|
Bekkar A, Hssina B, Douzi S, Douzi K. Air-pollution prediction in smart city, deep learning approach. JOURNAL OF BIG DATA 2021; 8:161. [PMID: 34956819 PMCID: PMC8693596 DOI: 10.1186/s40537-021-00548-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/21/2021] [Accepted: 12/10/2021] [Indexed: 06/14/2023]
Abstract
Over the past few decades, due to human activities, industrialization, and urbanization, air pollution has become a life-threatening factor in many countries around the world. Among air pollutants, particulate matter with a diameter of less than 2.5 μm (PM2.5) is a serious health problem. It causes various illnesses such as respiratory tract and cardiovascular diseases. Hence, it is necessary to accurately predict PM2.5 concentrations in order to protect citizens from the dangerous impact of air pollution beforehand. The variation of PM2.5 depends on a variety of factors, such as meteorology and the concentration of other pollutants in urban areas. In this paper, we implemented a deep learning solution to predict the hourly forecast of PM2.5 concentration in Beijing, China, based on CNN-LSTM, with a spatial-temporal feature obtained by combining historical data of pollutants, meteorological data, and PM2.5 concentration in the adjacent stations. We examined the difference in performance among deep learning algorithms such as LSTM, Bi-LSTM, GRU, Bi-GRU, CNN, and a hybrid CNN-LSTM model. Experimental results indicate that our method "hybrid CNN-LSTM multivariate" enables more accurate predictions than all the listed traditional models and performs better in predictive performance.
Affiliation(s)
- Badr Hssina
  - FSTM, University Hassan II, Casablanca, Morocco
|
21
|
Analyzing the Contribution of Human Mobility to Changes in Air Pollutants: Insights from the COVID-19 Lockdown in Wuhan. ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION 2021. [DOI: 10.3390/ijgi10120836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
During the COVID-19 lockdown in Wuhan, transportation, industrial production and other human activities declined significantly, as did the NO2 concentration. In order to assess the relative contributions of different factors to reductions in air pollutants, we implemented sensitivity experiments with Random Forest (RF) models, comparing the contributions of meteorological conditions, human mobility, and emissions from industry and households between different periods. In addition, we conducted scenario analyses to suggest an appropriate limit for control of human mobility. Different mechanisms for air pollutants were shown in the pre-pandemic, pre-lockdown, lockdown, and post-pandemic periods. Wind speed and the Within-city Migration index, representing intra-city mobility intensity, were excluded from stepwise multiple linear models in the pre-lockdown and lockdown periods. The results of sensitivity experiments show that, in the COVID-19 lockdown period, 73.3% of the reduction can be attributed to decreased human mobility. In the post-pandemic period, meteorological conditions control about 42.2% of the decrease, and emissions from industry and households control 40.0%, while human mobility only contributes 17.8%. The results of the scenario analysis suggest that priority in restriction should be given to human mobility within the city rather than to other kinds of human mobility. The reduction in the NO2 concentration tends to be smaller when human mobility within the city decreases by more than 70%. A limit of less than 40% on the control of human mobility can achieve a better effect, especially in cities with severe traffic pollution.
|
22
|
Prediction of Air Pollutant Concentration Based on One-Dimensional Multi-Scale CNN-LSTM Considering Spatial-Temporal Characteristics: A Case Study of Xi’an, China. ATMOSPHERE 2021. [DOI: 10.3390/atmos12121626] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Air pollution has become a serious problem threatening human health. Effective prediction models can help reduce the adverse effects of air pollutants. Accurate predictions of air pollutant concentration can provide a scientific basis for air pollution prevention and control. However, the previous air pollution-related prediction models mainly processed air quality prediction, or the prediction of a single or two air pollutants. Meanwhile, the temporal and spatial characteristics and multiple factors of pollutants were not fully considered. Herein, we establish a deep learning model for an atmospheric pollutant memory network (LSTM) by both applying the one-dimensional multi-scale convolution kernel (ODMSCNN) and a long-short-term memory network (LSTM) on the basis of temporal and spatial characteristics. The temporal and spatial characteristics combine the respective advantages of CNN and LSTM networks. First, ODMSCNN is utilized to extract the temporal and spatial characteristics of air pollutant-related data to form a feature vector, and then the feature vector is input into the LSTM network to predict the concentration of air pollutants. The data set comes from the daily concentration data and hourly concentration data of six atmospheric pollutants (PM2.5, PM10, NO2, CO, O3, SO2) and 17 types of meteorological data in Xi’an. Daily concentration data prediction, hourly concentration data prediction, group data prediction and multi-factor prediction were used to verify the effectiveness of the model. In general, the air pollutant concentration prediction model based on ODMSCNN-LSTM shows a better prediction effect compared with multi-layer perceptron (MLP), CNN, and LSTM models.
|
23
|
Li L, Blomberg AJ, Lawrence J, Réquia WJ, Wei Y, Liu M, Peralta AA, Koutrakis P. A spatiotemporal ensemble model to predict gross beta particulate radioactivity across the contiguous United States. ENVIRONMENT INTERNATIONAL 2021; 156:106643. [PMID: 34020300 PMCID: PMC9384849 DOI: 10.1016/j.envint.2021.106643] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2021] [Revised: 04/23/2021] [Accepted: 05/10/2021] [Indexed: 06/12/2023]
Abstract
Particulate radioactivity, a characteristic of particulate matter, is primarily determined by the abundance of radionuclides that are bound to airborne particulates. Exposure to high levels of particulate radioactivity has been associated with negative health outcomes. However, there are currently no spatially and temporally resolved particulate radioactivity data for exposure assessment purposes. We estimated the monthly distributions of gross beta particulate radioactivity across the contiguous United States from 2001 to 2017 with a spatial resolution of 32 km, via a multi-stage ensemble-based model. Particulate radioactivity was measured at 129 RadNet monitors across the contiguous U.S. In stage one, we built 264 base learning models using six methods, then selected nine base models that provide different predictions. In stage two, we used a non-negative geographically and temporally weighted regression method to aggregate the selected base learner predictions based on their local performance. The results of block cross-validation analysis suggested that the non-negative geographically and temporally weighted regression ensemble learning model outperformed all base learning models, with the smallest root mean square error (0.094 mBq/m3). Our model provided an accurate estimation of particulate radioactivity and thus can be used in future health studies.
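The idea of aggregating base learners with non-negative weights that reflect local performance can be illustrated with a much simpler scheme than the one in the abstract. The sketch below swaps the geographically and temporally weighted regression for plain inverse-squared-error weighting, so it is a simplified stand-in rather than the authors' method. It assumes every local RMSE is strictly positive, and all names and numbers are hypothetical.

```python
def inverse_error_weights(local_rmse):
    """Turn each base learner's local RMSE into a non-negative weight; weights sum to 1."""
    inverse = [1.0 / (rmse ** 2) for rmse in local_rmse]
    total = sum(inverse)
    return [value / total for value in inverse]


def aggregate(predictions, weights):
    """Weighted average of the base learner predictions at one location and time."""
    return sum(p * w for p, w in zip(predictions, weights))


# Hypothetical local errors and predictions (mBq/m3) for three base learners near one monitor.
local_rmse = [0.10, 0.20, 0.20]
base_predictions = [0.30, 0.36, 0.42]
weights = inverse_error_weights(local_rmse)
ensemble_prediction = aggregate(base_predictions, weights)
```

Because the weights are non-negative and sum to one, the ensemble prediction always lies within the range of the base predictions, and learners that perform better locally contribute more.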
Affiliation(s)
- Longxiang Li
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA
- Annelise J Blomberg
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA; Division of Occupational and Environmental Medicine, Lund University, Lund, Sweden
- Joy Lawrence
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA
- Weeberb J Réquia
  - School of Public Policy and Government, Fundação Getúlio Vargas Brasília, Distrito Federal, Brazil
- Yaguang Wei
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA
- Man Liu
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA
- Adjani A Peralta
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA
- Petros Koutrakis
  - Department of Environmental Health, Harvard T. H. Chan School of Public Health, Boston, MA 02114, USA
|
24
|
PM2.5 Concentration Prediction Based on Spatiotemporal Feature Selection Using XGBoost-MSCNN-GA-LSTM. SUSTAINABILITY 2021. [DOI: 10.3390/su132112071] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
With the rapid development of China’s industrialization, air pollution is becoming increasingly serious. Predicting air quality is essential for identifying further preventive measures to avoid negative impacts. Existing methods for predicting atmospheric pollutant concentration ignore feature redundancy and spatio-temporal characteristics, so their accuracy is not high and their portability is weak. Therefore, extreme gradient boosting (XGBoost) is first applied to select features for PM2.5; a one-dimensional multi-scale convolutional network (MSCNN) is then used to extract local temporal and spatial feature relations from air quality data, and linear splicing and fusion is carried out to obtain the spatio-temporal relationship of the multiple features. XGBoost and MSCNN are then combined with the strengths of LSTM in handling time series. A genetic algorithm (GA) is applied to optimize the parameter set of the long short-term memory (LSTM) network. The spatio-temporal relationship of the multiple features is input into the LSTM network, which outputs the long-term feature dependence used to predict PM2.5 concentration. In this way, an XGBoost-MSCGL model for PM2.5 concentration prediction based on spatio-temporal feature selection is established. The data set comprises hourly concentration data for six atmospheric pollutants and meteorological data in the Fen-Wei Plain in 2020. To verify the effectiveness of the model, the XGBoost-MSCGL model is compared with benchmark models such as multilayer perceptron (MLP), CNN, LSTM, XGBoost, and CNN-LSTM, before and after XGBoost feature selection. According to the forecast results for 12 cities, compared with the single models, the root mean square error (RMSE) decreased by about 39.07%, the average MAE decreased by about 42.18%, the average MAE decreased by about 49.33%, and R2 increased by 23.7%. Compared with the models after feature selection, the RMSE decreased by an average of about 15%, the MAPE by 16%, and the MAE by 21%, while R2 increased by 2.6%. The experimental results show that the XGBoost-MSCGL prediction model offers a more comprehensive understanding, operates at deeper levels, achieves higher prediction accuracy, and generalizes better in the prediction of PM2.5 concentration.
|
25
|
Umar IK, Nourani V, Gökçekuş H. A novel multi-model data-driven ensemble approach for the prediction of particulate matter concentration. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2021; 28:49663-49677. [PMID: 33939094 DOI: 10.1007/s11356-021-14133-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2020] [Accepted: 04/22/2021] [Indexed: 06/12/2023]
Abstract
Accuracy in the prediction of the particulate matter (PM2.5 and PM10) concentration in the atmosphere is essential for both its monitoring and control. In this study, a novel neuro fuzzy ensemble (NF-E) model was proposed for prediction of hourly PM2.5 and PM10 concentration. The NF-E involves careful selection of relevant input parameters for base modelling and uses an adaptive neuro fuzzy inference system (ANFIS) model as a nonlinear kernel for obtaining the ensemble output. The four base models are ANFIS, artificial neural network (ANN), support vector regression (SVR) and multilinear regression (MLR). The dominant input parameters for developing the base models were selected using two nonlinear approaches (mutual information and single-input single-output ANN-based sensitivity analysis) and a conventional Pearson correlation coefficient. The NF-E model was found to predict both PM2.5 and PM10 with higher generalization ability and the lowest error. The NF-E model outperformed all the single base models and other linear ensemble techniques with a Nash-Sutcliffe efficiency (NSE) of 0.9594 and 0.9865, mean absolute error (MAE) of 1.63 μg/m3 and 1.66 μg/m3 and BIAS of 0.0760 and 0.0340 in the testing stage for PM2.5 and PM10, respectively. The NF-E could improve the efficiency of other models by 4-22% for PM2.5 and 3-20% for PM10 depending on the model.
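For readers unfamiliar with the evaluation metrics named in this abstract, the sketch below shows how they are commonly computed. The NSE and MAE follow their standard definitions. The bias is taken here as the mean of predicted minus observed values, which is an assumption, since the abstract does not state the formula used. The sample values are hypothetical.

```python
def nash_sutcliffe(observed, predicted):
    """NSE = 1 - sum((o - p)^2) / sum((o - mean(o))^2); 1.0 is a perfect fit."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def mean_absolute_error(observed, predicted):
    """Average magnitude of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mean_bias(observed, predicted):
    """Average signed error (predicted minus observed); assumed definition."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical hourly PM2.5 concentrations in ug/m3.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
nse = nash_sutcliffe(observed, predicted)
mae = mean_absolute_error(observed, predicted)
bias = mean_bias(observed, predicted)
```

An NSE of 0 means the model is no better than predicting the observed mean, which is why values close to 1, such as those reported above, indicate a close fit.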
Affiliation(s)
- Ibrahim Khalil Umar
  - Faculty of Civil and Environmental Engineering, Near East University, Near East Boulevard, Via Mersin, 99138, Nicosia, North Cyprus, Turkey
- Vahid Nourani
  - Center of Excellence in Hydroinformatics, Faculty of Civil Engineering, University of Tabriz, Tabriz, Iran
  - Faculty of Civil and Environmental Engineering, Near East University, Near East Boulevard, Via Mersin, 99138, Nicosia, North Cyprus, Turkey
- Hüseyin Gökçekuş
  - Faculty of Civil and Environmental Engineering, Near East University, Near East Boulevard, Via Mersin, 99138, Nicosia, North Cyprus, Turkey
|
26
|
Evaluation of Machine Learning Models for Estimating PM2.5 Concentrations across Malaysia. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11167326] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/19/2022]
Abstract
Southeast Asia (SEA) is a hotspot region for atmospheric pollution and haze conditions, due to extensive forest, agricultural and peat fires. This study aims to estimate the PM2.5 concentrations across Malaysia using machine-learning (ML) models such as Random Forest (RF) and Support Vector Regression (SVR), based on satellite AOD (aerosol optical depth) observations, ground-measured air pollutants (NO2, SO2, CO, O3) and meteorological parameters (air temperature, relative humidity, wind speed and direction). The estimated PM2.5 concentrations for a two-year period (2018–2019) are evaluated against measurements performed at 65 air-quality monitoring stations located at urban, industrial, suburban and rural sites. PM2.5 concentrations varied widely between the stations, with higher values (mean of 24.2 ± 21.6 µg m−3) at urban/industrial stations and lower values (mean of 21.3 ± 18.4 µg m−3) at suburban/rural sites. Furthermore, pronounced seasonal variability in PM2.5 is recorded across Malaysia, with the highest concentrations during the dry season (June–September). Seven models were developed for PM2.5 predictions, i.e., separately for urban/industrial and suburban/rural sites, for the four dominant seasons (dry, wet and two inter-monsoon), and an overall model, which displayed accuracies in the order of R2 = 0.46–0.76. The validation analysis reveals that the RF model (R2 = 0.53–0.76) exhibits slightly better performance than SVR, except for the overall model. This is the first study conducted in Malaysia for PM2.5 estimations at a national scale combining satellite aerosol retrievals with ground-based pollutants, meteorological factors and ML techniques. The satisfactory prediction of PM2.5 concentrations across Malaysia allows continuous monitoring of pollution levels in remote areas that lack measurement networks.
|
27
|
Madokoro H, Kiguchi O, Nagayoshi T, Chiba T, Inoue M, Chiyonobu S, Nix S, Woo H, Sato K. Development of Drone-Mounted Multiple Sensing System with Advanced Mobility for In Situ Atmospheric Measurement: A Case Study Focusing on PM2.5 Local Distribution. SENSORS 2021; 21:s21144881. [PMID: 34300619 PMCID: PMC8309946 DOI: 10.3390/s21144881] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/18/2021] [Revised: 07/08/2021] [Accepted: 07/14/2021] [Indexed: 11/16/2022]
Abstract
This study was conducted using a drone with advanced mobility to develop a unified sensor and communication system as a new platform for in situ atmospheric measurements. As a major cause of air pollution, particulate matter (PM) has been attracting attention globally. We developed a small, lightweight, simple, and cost-effective multi-sensor system for multiple measurements of atmospheric phenomena and related environmental information. For in situ local area measurements, we used a long-range wireless communication module with real-time monitoring and visualizing software applications. Moreover, we developed four prototype brackets with optimal assignment of sensors, devices, and a camera for mounting on a drone as a unified system platform. Results of calibration experiments, when compared to data from two upper-grade PM2.5 sensors, demonstrated that our sensor system followed the overall tendencies and changes. We obtained original datasets after conducting flight measurement experiments at three sites with differing surrounding environments. The experimentally obtained prediction results matched regional PM2.5 trends obtained using long short-term memory (LSTM) networks trained using the respective datasets.
Affiliation(s)
- Hirokazu Madokoro
  - Faculty of Software and Information Science, Iwate Prefectural University, Takizawa 020-0693, Japan
  - Faculty of Systems Science and Technology, Akita Prefectural University, Yurihonjo 015-0055, Japan
  - Correspondence: Tel.: +81-019-694-2500
- Osamu Kiguchi
  - Faculty of Bioresource Sciences, Akita Prefectural University, Akita 010-0195, Japan
- Takeshi Nagayoshi
  - Faculty of Bioresource Sciences, Akita Prefectural University, Akita 010-0195, Japan
- Takashi Chiba
  - College of Agriculture, Food and Environment Sciences, Rakuno Gakuen University, Ebetsu 069-0851, Japan
- Makoto Inoue
  - Faculty of Bioresource Sciences, Akita Prefectural University, Akita 010-0195, Japan
- Shun Chiyonobu
  - Graduate School of International Resource Sciences, Akita University, Akita 010-8502, Japan
- Stephanie Nix
  - Faculty of Systems Science and Technology, Akita Prefectural University, Yurihonjo 015-0055, Japan
- Hanwool Woo
  - Institute of Engineering Innovation, Graduate School of Engineering, The University of Tokyo, Tokyo 113-8656, Japan
- Kazuhito Sato
  - Faculty of Systems Science and Technology, Akita Prefectural University, Yurihonjo 015-0055, Japan
|
28
|
Malings C, Knowland KE, Keller CA, Cohn SE. Sub-City Scale Hourly Air Quality Forecasting by Combining Models, Satellite Observations, and Ground Measurements. EARTH AND SPACE SCIENCE (HOBOKEN, N.J.) 2021; 8:e2021EA001743. [PMID: 34435082 PMCID: PMC8365697 DOI: 10.1029/2021ea001743] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/10/2021] [Revised: 05/03/2021] [Accepted: 05/27/2021] [Indexed: 05/19/2023]
Abstract
While multiple information sources exist concerning surface-level air pollution, no individual source simultaneously provides large-scale spatial coverage, fine spatial and temporal resolution, and high accuracy. It is, therefore, necessary to integrate multiple data sources, using the strengths of each source to compensate for the weaknesses of others. In this study, we propose a method incorporating outputs of NASA's GEOS Composition Forecasting model system with satellite information from the TROPOMI instrument and ground measurement data on surface concentrations. Although we use ground monitoring data from the Environmental Protection Agency network in the continental United States, the model and satellite data sources used have the potential to allow for global application. This method is demonstrated using surface measurements of nitrogen dioxide as a test case in regions surrounding five major US cities. The proposed method is assessed through cross-validation against withheld ground monitoring sites. In these assessments, the proposed method demonstrates major improvements over two baseline approaches which use ground-based measurements only. Results also indicate the potential for near-term updating of forecasts based on recent ground measurements.
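Cross-validation against withheld monitoring sites means that every observation from a held-out site is excluded from training, so the score reflects performance at unmonitored locations. The sketch below shows one simple way to build such folds. The fold assignment rule, function name and site labels are hypothetical, and the abstract does not describe how the authors partitioned their sites.

```python
def leave_site_out_folds(site_ids, n_folds):
    """Assign whole sites to folds so that no site appears in both train and test."""
    unique_sites = sorted(set(site_ids))
    folds = []
    for k in range(n_folds):
        # Every n_folds-th site, starting at position k, is withheld in fold k.
        held_out = set(unique_sites[k::n_folds])
        test_idx = [i for i, site in enumerate(site_ids) if site in held_out]
        train_idx = [i for i, site in enumerate(site_ids) if site not in held_out]
        folds.append((train_idx, test_idx))
    return folds


# Hypothetical observations: two measurements from each of four monitoring sites.
site_ids = ["A", "A", "B", "B", "C", "C", "D", "D"]
folds = leave_site_out_folds(site_ids, n_folds=2)
```

Splitting by site rather than by individual observation avoids the optimistic scores that arise when measurements from the same monitor appear on both sides of the split.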
Affiliation(s)
- C. Malings
  - Goddard Space Flight Center, NASA Postdoctoral Program Fellow, Greenbelt, MD, USA
  - Goddard Space Flight Center, Global Modeling and Assimilation Office, Greenbelt, MD, USA
  - Universities Space Research Association, Columbia, MD, USA
- K. E. Knowland
  - Goddard Space Flight Center, Global Modeling and Assimilation Office, Greenbelt, MD, USA
  - Universities Space Research Association, Columbia, MD, USA
- C. A. Keller
  - Goddard Space Flight Center, Global Modeling and Assimilation Office, Greenbelt, MD, USA
  - Universities Space Research Association, Columbia, MD, USA
- S. E. Cohn
  - Goddard Space Flight Center, Global Modeling and Assimilation Office, Greenbelt, MD, USA
|
29
|
Ashworth M, Analitis A, Whitney D, Samoli E, Zafeiratou S, Atkinson R, Dimakopoulou K, Beavers S, Schwartz J, Katsouyanni K. Spatio-temporal associations of air pollutant concentrations, GP respiratory consultations and respiratory inhaler prescriptions: a 5-year study of primary care in the borough of Lambeth, South London. Environ Health 2021; 20:54. [PMID: 33962646 PMCID: PMC8105918 DOI: 10.1186/s12940-021-00730-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/12/2020] [Accepted: 04/14/2021] [Indexed: 06/03/2023]
Abstract
BACKGROUND Although the associations of outdoor air pollution exposure with mortality and hospital admissions are well established, few previous studies have reported on primary care clinical and prescribing data. We assessed the associations of short and long-term pollutant exposures with General Practitioner respiratory consultations and inhaler prescriptions. METHODS Daily primary care data, for 2009-2013, were obtained from Lambeth DataNet (LDN), an anonymised dataset containing coded data from all patients (1.2 million) registered at general practices in Lambeth, an inner-city south London borough. Counts of respiratory consultations and inhaler prescriptions by day and Lower Super Output Area (LSOA) of residence were constructed. We developed models for predicting daily PM2.5, PM10, NO2 and O3 per LSOA. We used spatio-temporal mixed effects zero inflated negative binomial models to investigate the simultaneous short- and long-term effects of exposure to pollutants on the number of events. RESULTS The mean concentrations of NO2, PM10, PM2.5 and O3 over the study period were 50.7, 21.2, 15.6, and 49.9 μg/m3 respectively, with all pollutants except NO2 having much larger temporal rather than spatial variability. Following short-term exposure increases to PM10, NO2 and PM2.5 the number of consultations and inhaler prescriptions were found to increase, especially for PM10 exposure in children which was associated with increases in daily respiratory consultations of 3.4% and inhaler prescriptions of 0.8%, per PM10 interquartile range (IQR) increase. Associations further increased after adjustment for weekly average exposures, rising to 6.1 and 1.2%, respectively, for weekly average PM10 exposure. In contrast, a short-term increase in O3 exposure was associated with decreased number of respiratory consultations. No association was found between long-term exposures to PM10, PM2.5 and NO2 and number of respiratory consultations. 
Long-term exposure to NO2 was associated with an increase (8%) in preventer inhaler prescriptions only. CONCLUSIONS We found increases in the daily number of GP respiratory consultations and inhaler prescriptions following short-term increases in exposure to NO2, PM10 and PM2.5. These associations are more pronounced in children and persist for at least a week. The association with long term exposure to NO2 and preventer inhaler prescriptions indicates likely increased chronic respiratory morbidity.
Affiliation(s)
- Mark Ashworth
  - School of Population Health and Environmental Sciences, King’s College London, Guy’s Campus, Addison House, London, SE1 1UL UK
- Antonis Analitis
  - Medical School, National and Kapodistrian University of Athens, Athens, Greece
- David Whitney
  - School of Population Health and Environmental Sciences, King’s College London, Guy’s Campus, Addison House, London, SE1 1UL UK
- Evangelia Samoli
  - Medical School, National and Kapodistrian University of Athens, Athens, Greece
- Sofia Zafeiratou
  - Medical School, National and Kapodistrian University of Athens, Athens, Greece
- Richard Atkinson
  - Population Health Research Institute, St George’s, University of London, Cranmer Terrace, London, SW17 0RE UK
- Sean Beavers
  - School of Population Health and Environmental Sciences, King’s College London, Guy’s Campus, Addison House, London, SE1 1UL UK
  - Environmental Research Group, MRC Centre for Environment and Health, Imperial College, London, UK
- Joel Schwartz
  - Departments of Environmental Health and Epidemiology, Harvard TH Chan School of Public Health, 665 Huntington Avenue, Building 1, Room 1301, Boston, MA 02115 USA
- Klea Katsouyanni
  - Medical School, National and Kapodistrian University of Athens, Athens, Greece
  - Environmental Research Group, MRC Centre for Environment and Health, Imperial College, London, UK
|
30
|
McIsaac MA, Sanders E, Kuester T, Aronson KJ, Kyba CCM. The impact of image resolution on power, bias, and confounding: A simulation study of ambient light at night exposure. Environ Epidemiol 2021; 5:e145. [PMID: 33870017 PMCID: PMC8043729 DOI: 10.1097/ee9.0000000000000145] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/04/2020] [Accepted: 02/22/2021] [Indexed: 11/26/2022]
Abstract
Studies of the impact of environmental pollutants on health outcomes can be compromised by mismeasured exposures or unmeasured confounding with other environmental exposures. Both problems can be exacerbated by measuring exposure from data sources with low spatial resolution. Artificial light at night, for example, is often estimated from low-resolution satellite images, which may result in substantial measurement error and increased correlation with air or noise pollution. METHODS Light at night exposure was considered in simulated epidemiologic studies in Vancouver, British Columbia. First, we assessed statistical power and bias for hypothetical studies that replaced true light exposure with estimates from sources with low resolution. Next, health status was simulated based on pollutants other than light exposure, and we assessed the frequency with which studies might incorrectly attribute negative health impacts to light exposure as a result of unmeasured confounding by the other environmental exposures. RESULTS When light was simulated to be the causal agent, studies relying on low-resolution data suffered from lower statistical power and biased estimates. Additionally, correlations between light and other pollutants increased as the spatial resolution of the light exposure map decreased, so studies estimating light exposure from images with lower spatial resolution were more prone to confounding. CONCLUSIONS Studies estimating exposure to pollutants from data with lower spatial resolution are prone to increased bias, increased confounding, and reduced power. Studies examining effects of light at night should avoid using exposure estimates based on low-resolution maps, and should consider potential confounding with other environmental pollutants.
Affiliation(s)
- Michael A. McIsaac
  - School of Mathematical and Computational Sciences, University of Prince Edward Island, Charlottetown, PEI, Canada
  - Department of Public Health Sciences, Queen’s University, Kingston, ON, Canada
- Eric Sanders
  - Department of Statistics, University of British Columbia, Vancouver, BC, Canada
- Theres Kuester
  - GFZ German Research Centre for Geosciences, Potsdam, Germany
- Kristan J. Aronson
  - Department of Public Health Sciences, Queen’s University, Kingston, ON, Canada
  - Division of Cancer Care and Epidemiology, Cancer Research Institute, Queen’s University, Kingston, ON, Canada
|
31
|
A Novel Recursive Model Based on a Convolutional Long Short-Term Memory Neural Network for Air Pollution Prediction. REMOTE SENSING 2021. [DOI: 10.3390/rs13071284] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022]
Abstract
Deep learning provides a promising approach for air pollution prediction. Existing deep learning-based prediction models generally consider either the temporal correlations of air quality monitoring stations or the nonlinear relationship between the PM2.5 (particulate matter with an aerodynamic diameter of less than 2.5 μm) concentrations and explanatory variables. Spatial correlation has not been effectively incorporated into these models, so they perform poorly in PM2.5 prediction tasks. Additionally, determining how to extend predictions to longer horizons is still challenging. In this paper, to allow for spatiotemporal correlations, a spatiotemporal convolutional recursive long short-term memory (CR-LSTM) neural network model is proposed for predicting the PM2.5 concentrations in long-term prediction tasks by combining a convolutional long short-term memory (ConvLSTM) neural network and a recursive strategy. Herein, the ConvLSTM network was used to capture the complex spatiotemporal correlations and to predict the future PM2.5 concentrations; the recursive strategy was used for expanding the long-term prediction tasks. The CR-LSTM model was used to predict PM2.5 concentrations over the next 24 h for 12 air quality monitoring stations in Beijing by configuring both the appropriate time lag derived from the temporal correlations and the spatial neighborhood, including the hourly historical PM2.5 concentrations, the daily mean meteorological data, and the annual nighttime light and normalized difference vegetation index (NDVI). The results showed that the proposed CR-LSTM model achieved better performance (coefficient of determination (R2) = 0.74; root mean square error (RMSE) = 18.96 μg/m3) than other common models, such as multiple linear regression (MLR), support vector regression (SVR), the conventional LSTM model, the LSTM extended (LSTME) model, and the temporal sliding LSTM extended (TS-LSTME) model. 
The proposed CR-LSTM model, implementing a combination of geographical rules, recursive strategy, and deep learning, shows improved performance in longer-term prediction tasks.
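The recursive strategy mentioned in this abstract extends a one-step-ahead predictor to longer horizons by feeding each prediction back in as an input for the next step. The sketch below illustrates only that loop. The one-step model is a hypothetical placeholder (the mean of the lagged values) standing in for the trained ConvLSTM network, and the function names, lag and data are assumptions.

```python
def recursive_forecast(one_step_model, history, lag, horizon):
    """Predict `horizon` steps ahead by reusing each prediction as the newest input."""
    window = list(history[-lag:])
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        # Drop the oldest value and append the prediction to form the next input window.
        window = window[1:] + [next_value]
    return forecasts


def mean_of_window(window):
    """Placeholder one-step model: the mean of the lagged values."""
    return sum(window) / len(window)


# Hypothetical hourly PM2.5 history in ug/m3.
history = [40.0, 44.0, 48.0]
forecasts = recursive_forecast(mean_of_window, history, lag=2, horizon=3)
```

Because later steps depend on earlier predictions, errors can accumulate over the horizon, which is one reason longer-term prediction is harder than single-step prediction.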
|
32
|
Abstract
Air pollution and its consequences are negatively affecting the world population and the environment, which makes air quality monitoring and forecasting techniques essential tools for combating this problem. To predict air quality with maximum accuracy, it is crucial to consider the types of datasets used, in addition to the models implemented and the quantity of data. This study selected a set of research works in the field of air quality prediction and explores the datasets used in them. The most significant findings are: (1) meteorological datasets were used in 94.6% of the papers, far more often than any other dataset type, and were complemented by others such as temporal and spatial data; (2) combinations of various datasets have been used since 2009; and (3) open data have been used since 2012, with 32.3% of the studies using open data and 63.4% of the studies not providing their data.
|
33
|
Schneider R, Vicedo-Cabrera AM, Sera F, Masselot P, Stafoggia M, de Hoogh K, Kloog I, Reis S, Vieno M, Gasparrini A. A Satellite-Based Spatio-Temporal Machine Learning Model to Reconstruct Daily PM2.5 Concentrations across Great Britain. REMOTE SENSING 2021; 12:3803. [PMID: 33408882 PMCID: PMC7116547 DOI: 10.3390/rs12223803] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/30/2022]
Abstract
Epidemiological studies on the health effects of air pollution usually rely on measurements from fixed ground monitors, which provide limited spatio-temporal coverage. Data from satellites, reanalysis, and chemical transport models offer additional information used to reconstruct pollution concentrations at high spatio-temporal resolutions. This study aims to develop a multi-stage satellite-based machine learning model to estimate daily fine particulate matter (PM2.5) levels across Great Britain between 2008–2018. This high-resolution model consists of random forest (RF) algorithms applied in four stages. Stage-1 augments monitor-PM2.5 series using co-located PM10 measures. Stage-2 imputes missing satellite aerosol optical depth observations using atmospheric reanalysis models. Stage-3 integrates the output from previous stages with spatial and spatio-temporal variables to build a prediction model for PM2.5. Stage-4 applies Stage-3 models to estimate daily PM2.5 concentrations over a 1 km grid. The RF architecture performed well in all stages, with results from Stage-3 showing an average cross-validated R2 of 0.767 and minimal bias. The model performed better over the temporal scale when compared to the spatial component, but both presented good accuracy with an R2 of 0.795 and 0.658, respectively. These findings indicate that direct satellite observations must be integrated with other satellite-based products and geospatial variables to derive reliable estimates of air pollution exposure. The high spatio-temporal resolution and the relatively high precision allow these estimates (approximately 950 million points) to be used in epidemiological analyses to assess health risks associated with both short- and long-term exposure to PM2.5.
Affiliation(s)
- Rochelle Schneider
  - Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
  - The Centre on Climate Change and Planetary Health, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
  - European Centre for Medium-Range Weather Forecast (ECMWF), Shinfield Rd, Reading RG2 9AX, UK
- Ana M. Vicedo-Cabrera
  - Institute of Social and Preventive Medicine, University of Bern, 3012 Bern, Switzerland
  - Oeschger Center for Climate Change Research, University of Bern, 3012 Bern, Switzerland
- Francesco Sera
  - Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
- Pierre Masselot
  - Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
- Massimo Stafoggia
  - Department of Epidemiology, Lazio Regional Health Service, 00147 Rome, Italy
- Kees de Hoogh
  - Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051 Basel, Switzerland
  - University of Basel, Petersplatz 1, 4051 Basel, Switzerland
- Itai Kloog
  - Department of Geography and Environmental Development, Ben-Gurion University of the Negev, Beer Sheva P.O.B. 653, Israel
- Stefan Reis
  - UK Centre for Ecology & Hydrology, Bush Estate, Penicuik, Edinburgh, Midlothian EH26 0QB, UK
  - Medical School, University of Exeter, Knowledge Spa, Truro TR1 3HD, UK
- Massimo Vieno
  - UK Centre for Ecology & Hydrology, Bush Estate, Penicuik, Edinburgh, Midlothian EH26 0QB, UK
- Antonio Gasparrini
  - Department of Public Health, Environments and Society, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
  - The Centre on Climate Change and Planetary Health, London School of Hygiene & Tropical Medicine, London WC1H 9SH, UK
  - Centre for Statistical Methodology, London School of Hygiene & Tropical Medicine, London WC1E 7HT, UK
|
34
|
A Satellite-Based High-Resolution (1-km) Ambient PM2.5 Database for India over Two Decades (2000–2019): Applications for Air Quality Management. REMOTE SENSING 2020. [DOI: 10.3390/rs12233872] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/16/2022]
Abstract
Fine particulate matter (PM2.5) is a major criteria pollutant affecting the environment, health and climate. In India, where ground-based measurements of PM2.5 are scarce, it is important to have a long-term database at a high spatial resolution for an efficient air quality management plan. Here we develop and present a high-resolution (1-km) ambient PM2.5 database spanning two decades (2000–2019) for India. We convert aerosol optical depth from the Moderate Resolution Imaging Spectroradiometer (MODIS), retrieved by the Multiangle Implementation of Atmospheric Correction (MAIAC) algorithm, to surface PM2.5 using a dynamic scaling factor from Modern-Era Retrospective analysis for Research and Applications Version 2 (MERRA-2) data. The satellite-derived daily (24-h average) and annual PM2.5 show an R2 of 0.8 and 0.97 and a root mean square error of 25.7 and 7.2 μg/m3, respectively, against surface measurements from the Central Pollution Control Board India network. Population-weighted 20-year averaged PM2.5 over India is 57.3 μg/m3 (5–95 percentile ranges: 16.8–86.9), with a larger increase observed in the present decade (2010–2019) than in the previous decade (2000–2009). Poor air quality across the urban–rural transect suggests that this is a regional scale problem, a fact that is often neglected. The database is freely disseminated through a web portal ‘satellite-based application for air quality monitoring and management at a national scale’ (SAANS) for air quality management, epidemiological research and mass awareness.
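Two calculations in this abstract lend themselves to a short worked example: converting satellite AOD to surface PM2.5 with a scaling factor, and averaging PM2.5 with population weights. The sketch below assumes the scaling factor is the ratio of modelled surface PM2.5 to modelled AOD for the same grid cell and day, which is a common formulation but is not spelled out in the abstract. All values and names are hypothetical.

```python
def surface_pm25(satellite_aod, model_pm25, model_aod):
    """Scale satellite AOD by the modelled PM2.5-to-AOD ratio (assumed scaling factor)."""
    scaling_factor = model_pm25 / model_aod
    return satellite_aod * scaling_factor


def population_weighted_mean(values, population):
    """Average `values` across grid cells, weighting each cell by its population."""
    total_population = sum(population)
    return sum(v * p for v, p in zip(values, population)) / total_population


# Hypothetical values for three grid cells on one day.
satellite_aod = [0.5, 0.8, 0.25]
model_pm25 = [40.0, 60.0, 20.0]  # ug/m3, from a reanalysis product
model_aod = [0.4, 0.75, 0.25]
population = [1000, 3000, 1000]

pm25 = [surface_pm25(a, m, d) for a, m, d in zip(satellite_aod, model_pm25, model_aod)]
exposure = population_weighted_mean(pm25, population)
```

Weighting by population gives more influence to densely populated cells, so the result describes the concentration experienced by a typical resident rather than the average over land area.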
|