1
Wang Z, Wu X, Wu Y. A spatiotemporal XGBoost model for PM2.5 concentration prediction and its application in Shanghai. Heliyon 2023; 9:e22569. [PMID: 38058450 PMCID: PMC10696222 DOI: 10.1016/j.heliyon.2023.e22569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2022] [Revised: 11/13/2023] [Accepted: 11/15/2023] [Indexed: 12/08/2023]
Abstract
This paper constructs an analytical and forecasting framework to predict PM2.5 concentration levels for the 16 municipal districts of Shanghai. Through XGBoost parameter tuning, empirical mode decomposition, and model fusion, the accuracy and stability of XGBoost predictions are improved so that prediction deviation at extreme points can be avoided. The main findings are as follows: 1) Compared with the original model, the goodness of fit of the modified XGBoost model on the test set increased by 17%, and the root mean square error decreased by 28%; 2) The variation of PM2.5 concentration in Shanghai has a significant seasonal (cyclical) effect, with a fluctuation period of 3 months (one quarter). In winter, the frequency of extreme value points is significantly higher than in other seasons; 3) In terms of spatial distribution, the PM2.5 concentration in the central city of Shanghai is higher than that in the rural areas, and it gradually decreases from the city center to the surrounding areas. The innovations and contributions of this paper are as follows: 1) The EEMD algorithm, verified by SSA, was used to decompose the original model without reconstructing all subsequences, and variable weight assignment was used to obtain the best weighting among the parts of the hybrid model; 2) The city was divided according to administrative districts to avoid duplicate analysis when using the advised Kriging interpolation; 3) The IDW method was applied to verify the Kriging interpolation and increase accuracy; 4) Latitude and longitude were converted into the arc length on the corresponding spherical surface; 5) A hierarchical analysis method was used to rank the importance of the PM2.5 monitoring stations, which could improve accuracy and achieve dimension reduction.
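Two of the steps listed in this abstract, converting latitude and longitude to an arc length on a sphere and inverse distance weighting (IDW), can be illustrated with a minimal sketch. It assumes the haversine formula for the arc length and a power of 2 for the IDW weights; the abstract specifies neither, and the function names, Earth radius constant, and station values below are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the paper


def arc_length_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Great-circle arc length between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * radius * math.asin(math.sqrt(min(1.0, a)))


def idw_estimate(target, stations, power=2.0):
    """Inverse-distance-weighted estimate at target=(lat, lon).

    stations is a list of (lat, lon, value) tuples; power=2 is an assumed default.
    """
    num = 0.0
    den = 0.0
    for lat, lon, value in stations:
        d = arc_length_km(target[0], target[1], lat, lon)
        if d < 1e-9:
            # The target coincides with a station: return its observation directly.
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical stations (lat, lon, PM2.5 in ug/m3), made up for illustration.
stations = [(0.0, -1.0, 10.0), (0.0, 1.0, 30.0)]
midpoint_value = idw_estimate((0.0, 0.0), stations)
```

Because both hypothetical stations are equally far from the target, the estimate is the plain average of their values; nearer stations would otherwise dominate.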
Affiliation(s)
- Zidong Wang
  - School of Economics and Management, Shanghai Maritime University, Shanghai 201306, China
- Xianhua Wu
  - School of Economics and Management, Shanghai Maritime University, Shanghai 201306, China
- You Wu
  - School of Economics and Management, Shanghai Maritime University, Shanghai 201306, China
2
Choi H, Park S, Kang Y, Im J, Song S. Retrieval of hourly PM2.5 using top-of-atmosphere reflectance from geostationary ocean color imagers I and II. Environmental Pollution (Barking, Essex: 1987) 2023; 323:121169. [PMID: 36773685 DOI: 10.1016/j.envpol.2023.121169] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/28/2022] [Revised: 01/11/2023] [Accepted: 01/28/2023] [Indexed: 06/18/2023]
Abstract
To produce real-time ground-level information on particulate matter with a diameter equal to or less than 2.5 μm (PM2.5), many studies have explored the applicability of satellite data, particularly aerosol optical depth (AOD). However, many of the techniques used are computationally demanding; to overcome these challenges, machine learning (ML)-based research has been on the rise. Here, we used ML techniques to directly estimate ground-level PM2.5 concentrations over South Korea using top-of-atmosphere (TOA) reflectance from the Geostationary Ocean Color Imager I (GOCI-I) and its next generation GOCI-II with improved spatial, spectral, and temporal resolutions. Three ML techniques were used to estimate ground-level PM2.5 concentrations: random forest, light gradient boosting machine (LGBM), and artificial neural network. Three schemes were examined based on the input feature composition of the GOCI spectral bands: scheme 1 using all GOCI-I bands, scheme 2 using only GOCI-II bands that overlap with GOCI-I bands, and scheme 3 using all GOCI-II bands. The results showed that LGBM performed better than the other ML models. GOCI-II-based schemes 2 and 3 (coefficient of determination (R²) = 0.85 and 0.85 and root-mean-square error (RMSE) = 7.69 and 7.82 μg/m³, respectively) performed slightly better than GOCI-I-based scheme 1 (R² = 0.83 and RMSE = 8.49 μg/m³). In particular, TOA reflectance at a new channel (380 nm) of GOCI-II was identified as the most contributing variable, given its high sensitivity to aerosols. The long-term estimation of PM2.5 concentrations using the proposed models was examined for ground stations located in two major cities. GOCI-II-based models produced a more detailed spatial distribution of PM2.5 concentrations owing to their higher spatial resolution (i.e., 250 m). The use of TOA reflectance data, instead of AOD and other aerosol products commonly used in previous studies, reduced the missing rate of the estimated ground-level PM2.5 concentrations by up to 50%. Our results indicate that the proposed approach using TOA reflectance data from geostationary satellite sensors has great potential for estimating ground-level PM2.5 concentrations for operational purposes.
Affiliation(s)
- Hyunyoung Choi
  - Department of Urban Environment Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Seonyoung Park
  - Department of Applied Artificial Intelligence, Seoul National University of Science and Technology, Seoul, 01811, Republic of Korea
- Yoojin Kang
  - Department of Urban Environment Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
- Jungho Im
  - Department of Urban Environment Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea; Research & Management Center for Particulate Matters at the Southeast Region of Korea, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, South Korea
- Sanghyeon Song
  - Department of Urban Environment Engineering, Ulsan National Institute of Science and Technology, Ulsan, 44919, Republic of Korea
3
Yu Z, Ma J, Qu Y, Pan L, Wan S. PM2.5 extended-range forecast based on MJO and S2S using LightGBM. The Science of the Total Environment 2023; 880:163358. [PMID: 37030354 DOI: 10.1016/j.scitotenv.2023.163358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 03/11/2023] [Accepted: 04/03/2023] [Indexed: 04/15/2023]
Abstract
We developed an extended-range fine particulate matter (PM2.5) prediction model in Shanghai using the light gradient-boosting machine (LightGBM) algorithm based on PM2.5 historical data, meteorological observational data, Subseasonal-to-Seasonal Prediction Project (S2S) forecasts and Madden-Julian Oscillation (MJO) monitoring data. The analysis and prediction results demonstrated that the MJO improved the predictive skill of the extended-range PM2.5 forecast. The MJO indexes, namely, real-time multivariate MJO series 1 (RMM1) and real-time multivariate MJO series 2 (RMM2), ranked first and seventh, respectively, in terms of predictive contribution among all meteorological predictors. When the MJO was not introduced, the correlation coefficients for the forecasts at lead times of 11-40 days ranged from 0.27 to 0.55, and the root mean square errors (RMSEs) ranged from 23.4 to 31.8 μg/m³. After the MJO was introduced, the correlation coefficients for the 11-40 day forecast ranged from 0.31 to 0.56, among which those for the 16-40 day forecast substantially improved, and the RMSEs ranged from 23.2 to 28.7 μg/m³. When comparing prediction scores such as percent correct (PC), critical success index (CSI), and equitable threat score (ETS), the forecast model was more accurate when the MJO was introduced. A novel aspect of this study is its investigation of the effects of the MJO mechanism on the meteorological conditions of air pollution in eastern China through advanced regression analysis. The MJO indexes RMM1 and RMM2 considerably impacted the geopotential height field of 28°-40° at 300-250 hPa 45 days in advance. When RMM1 increased and RMM2 decreased 45 days in advance, the 500 hPa geopotential height field weakened accordingly, and the bottom of the 500 hPa trough moved southward; thus cold air was more easily transported southward and the upstream air pollutants were transported to eastern China. With a weak ground pressure field and dry air at low altitudes, the westerly wind component increased, which made it easier for a weather configuration favorable to the accumulation and transport of air pollution to form, thus resulting in an increase in PM2.5 concentration in the region. These findings can guide forecasters regarding the utility of MJO and S2S for subseasonal air pollution outlooks.
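The three categorical scores named in this abstract (PC, CSI, ETS) are computed from a 2x2 contingency table of forecast versus observed events. The sketch below uses the standard textbook definitions; how the study defined an "event" (for example, a PM2.5 threshold) is not stated in the abstract, so the boolean inputs and the function name are assumptions for illustration only.

```python
def contingency_scores(observed, forecast):
    """Return PC, CSI and ETS for paired boolean sequences.

    observed[i] is True when the event occurred, forecast[i] when it was predicted.
    """
    pairs = list(zip(observed, forecast))
    hits = sum(1 for o, f in pairs if o and f)
    misses = sum(1 for o, f in pairs if o and not f)
    false_alarms = sum(1 for o, f in pairs if not o and f)
    correct_negatives = sum(1 for o, f in pairs if not o and not f)
    n = hits + misses + false_alarms + correct_negatives

    # Percent correct: share of all forecasts that matched the observation.
    pc = (hits + correct_negatives) / n

    # Critical success index: hits relative to all cases where the event
    # was either forecast or observed.
    event_cases = hits + misses + false_alarms
    csi = hits / event_cases if event_cases else float("nan")

    # Equitable threat score: CSI corrected for hits expected by chance.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    ets_denominator = event_cases - hits_random
    ets = (hits - hits_random) / ets_denominator if ets_denominator else float("nan")

    return {"PC": pc, "CSI": csi, "ETS": ets}


# Made-up example: 10 days, 3 observed events, 3 forecast events, 2 hits.
observed = [True, True, True, False, False, False, False, False, False, False]
forecast = [True, True, False, True, False, False, False, False, False, False]
scores = contingency_scores(observed, forecast)
```

PC rewards correct negatives, so it can look high when events are rare; CSI and ETS ignore correct negatives, which is why such scores are usually reported together.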
Affiliation(s)
- Zhongqi Yu
  - Shanghai Typhoon Institute, Shanghai Meteorological Service, Shanghai 200030, China; Shanghai Key Laboratory of Meteorology and Health, Shanghai Meteorological Service, Shanghai 200030, China
- Jinghui Ma
  - Shanghai Typhoon Institute, Shanghai Meteorological Service, Shanghai 200030, China; Department of Atmospheric and Oceanic Sciences & Institute of Atmospheric Sciences, Fudan University, Shanghai 200438, China; Shanghai Key Laboratory of Meteorology and Health, Shanghai Meteorological Service, Shanghai 200030, China
- Yuanhao Qu
  - Shanghai Typhoon Institute, Shanghai Meteorological Service, Shanghai 200030, China; Shanghai Key Laboratory of Meteorology and Health, Shanghai Meteorological Service, Shanghai 200030, China
- Liang Pan
  - Shanghai Typhoon Institute, Shanghai Meteorological Service, Shanghai 200030, China
- Shiquan Wan
  - Yangzhou Meteorological Office, Yangzhou, China
4
Zhai S, Zhang Y, Huang J, Li X, Wang W, Zhang T, Yin F, Ma Y. Exploring the detailed spatiotemporal characteristics of PM2.5: Generating a full-coverage and hourly PM2.5 dataset in the Sichuan Basin, China. Chemosphere 2023; 310:136786. [PMID: 36257387 DOI: 10.1016/j.chemosphere.2022.136786] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/04/2022] [Indexed: 06/16/2023]
Abstract
Fine particulate matter (PM2.5) has received worldwide attention due to its threat to public health. In the Sichuan Basin (SCB), PM2.5 is causing heavy health burdens due to its high concentrations and population density. Compared with other heavily polluted areas, less effort has been made to generate a full-coverage PM2.5 dataset of the SCB, in which the detailed PM2.5 spatiotemporal characteristics remain unclear. Considering commonly existing spatiotemporal autocorrelations, the top-of-atmosphere reflectance (TOAR) with a high coverage rate and other auxiliary data were employed to build commonly used random forest (RF) models to generate accurate hourly PM2.5 concentration predictions with a 0.05° × 0.05° spatial resolution in the SCB in 2016. Specifically, with historical concentrations predicted from a spatial RF (S-RF) and observed at stations, an alternative spatiotemporal RF (AST-RF) and spatiotemporal RF (ST-RF) were built in grids with stations (type 1). The predictions from the AST-RF in grids without stations (type 2) and observations in type 1 formed the PM2.5 dataset. The LOOCV R², RMSE and MAE were 0.94/0.94, 8.71/8.62 μg/m³ and 5.58/5.57 μg/m³ in the AST-RF/ST-RF, respectively. Using the produced dataset, spatiotemporal analysis was conducted for a detailed understanding of the spatiotemporal characteristics of PM2.5 in the SCB. The PM2.5 concentrations gradually increased from the edge to the center of the SCB in spatial distribution. Two high-concentration areas centered on Chengdu and Zigong were observed throughout the year, while another high-concentration area centered on Dazhou was only observed in winter. The diurnal variation had double peaks and double valleys in the SCB. The concentrations were high at night and low in daytime, which suggests that characterizing the relationship between PM2.5 and adverse health outcomes by daily means might be inaccurate with most human activities conducted in daytime.
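The validation metrics quoted in this abstract can be illustrated with a small sketch, assuming that LOOCV denotes leave-one-out cross-validation and that R² is computed as 1 - SS_res/SS_tot on the held-out predictions (the abstract defines neither). A one-variable least-squares line stands in for the random forest purely to keep the example self-contained; the function names and data are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def loocv_metrics(xs, ys):
    """Hold out each sample once, refit on the rest, and score the held-out predictions."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)

    n = len(ys)
    errors = [p - y for p, y in zip(predictions, ys)]
    ss_res = sum(e * e for e in errors)
    mean_y = sum(ys) / n
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Made-up, exactly linear data: every held-out point is predicted without error.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
metrics = loocv_metrics(xs, ys)
```

In spatial applications the held-out unit is often a whole station rather than a single record, which is a stricter test than the per-sample split shown here.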
Affiliation(s)
- Siwei Zhai
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Yi Zhang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Jingfei Huang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Xuelin Li
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Wei Wang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Tao Zhang
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Fei Yin
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
- Yue Ma
  - Institute of Systems Epidemiology, West China School of Public Health and West China Fourth Hospital, Sichuan University, China
5
Shi Y, Lau AKH, Ng E, Ho HC, Bilal M. A Multiscale Land Use Regression Approach for Estimating Intraurban Spatial Variability of PM2.5 Concentration by Integrating Multisource Datasets. International Journal of Environmental Research and Public Health 2021; 19:321. [PMID: 35010580 PMCID: PMC8751171 DOI: 10.3390/ijerph19010321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/18/2021] [Revised: 12/24/2021] [Accepted: 12/28/2021] [Indexed: 06/14/2023]
Abstract
Poor air quality has been a major urban environmental issue in large high-density cities all over the world, and particularly in Asia, where the multiscale complexity of pollution dispersal creates high spatial variability in exposure levels. Investigating such multiscale complexity and fine-scale spatial variability is challenging. In this study, we aim to tackle the challenge by focusing on PM2.5 (particulate matter with an aerodynamic diameter less than 2.5 µm), which is one of the most concerning air pollutants. We use the widely adopted land use regression (LUR) modeling technique as the fundamental method to integrate air quality data, satellite data, meteorological data, and spatial data from multiple sources. Unlike most LUR and Aerosol Optical Depth (AOD)-PM2.5 studies, the modeling process was conducted independently at city and neighborhood scales. Correspondingly, predictor variables at the two scales were treated separately. At the city scale, the model developed in the present study obtains better prediction performance in the AOD-PM2.5 relationship when compared with previous studies (R̄² from 0.72 to 0.80). At the neighborhood scale, point-based building morphological indices and road network centrality metrics were found to be fit-for-purpose indicators of PM2.5 spatial estimation. The resultant PM2.5 map was produced by combining the models from the two scales, which offers a geospatial estimation of small-scale intraurban variability.
Affiliation(s)
- Yuan Shi
  - Institute of Future Cities (IOFC), The Chinese University of Hong Kong, Hong Kong, China
- Alexis Kai-Hon Lau
  - Division of Environment and Sustainability, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
  - Institute for the Environment, The Hong Kong University of Science & Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Edward Ng
  - Institute of Future Cities (IOFC), The Chinese University of Hong Kong, Hong Kong, China
  - School of Architecture, The Chinese University of Hong Kong, Hong Kong, China
  - Institute of Environment, Energy and Sustainability (IEES), The Chinese University of Hong Kong, Hong Kong, China
- Hung-Chak Ho
  - Department of Urban Planning and Design, The University of Hong Kong, Hong Kong, China
- Muhammad Bilal
  - Lab of Environmental Remote Sensing (LERS), School of Marine Sciences, Nanjing University of Information Science and Technology, Nanjing 210044, China
6
A Novel Recursive Model Based on a Convolutional Long Short-Term Memory Neural Network for Air Pollution Prediction. Remote Sensing 2021. [DOI: 10.3390/rs13071284] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022]
Abstract
Deep learning provides a promising approach for air pollution prediction. The existing deep learning-based prediction models generally consider either the temporal correlations of air quality monitoring stations or the nonlinear relationship between the PM2.5 (particulate matter with an aerodynamic diameter of less than 2.5 μm) concentrations and explanatory variables. Spatial correlation has not been effectively incorporated into prediction models, which leads to poor performance in PM2.5 prediction tasks. Additionally, determining how to extend models to longer-term prediction tasks is still challenging. In this paper, to allow for spatiotemporal correlations, a spatiotemporal convolutional recursive long short-term memory (CR-LSTM) neural network model is proposed for predicting the PM2.5 concentrations in long-term prediction tasks by combining a convolutional long short-term memory (ConvLSTM) neural network and a recursive strategy. Herein, the ConvLSTM network was used to capture the complex spatiotemporal correlations and to predict the future PM2.5 concentrations; the recursive strategy was used for expanding the long-term prediction tasks. The CR-LSTM model was used to predict the PM2.5 concentrations over the next 24 h for 12 air quality monitoring stations in Beijing by configuring both the appropriate time lag derived from the temporal correlations and the spatial neighborhood, including the hourly historical PM2.5 concentrations, the daily mean meteorological data, and the annual nighttime light and normalized difference vegetation index (NDVI). The results showed that the proposed CR-LSTM model achieved better performance (coefficient of determination (R²) = 0.74; root mean square error (RMSE) = 18.96 μg/m³) than other common models, such as multiple linear regression (MLR), support vector regression (SVR), the conventional LSTM model, the LSTM extended (LSTME) model, and the temporal sliding LSTM extended (TS-LSTME) model. The proposed CR-LSTM model, implementing a combination of geographical rules, recursive strategy, and deep learning, shows improved performance in longer-term prediction tasks.
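The recursive strategy mentioned in this abstract is, in general terms, a way to turn a one-step-ahead model into a multi-step forecaster by feeding each prediction back in as an input. The sketch below shows that generic idea only; it is not the CR-LSTM architecture, and the toy one-step model, function names, and values are hypothetical stand-ins.

```python
def recursive_forecast(one_step_model, history, horizon, lag):
    """Forecast `horizon` steps ahead with a model that predicts one step.

    one_step_model takes a list of the last `lag` values and returns the next value.
    """
    window = list(history[-lag:])
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        forecasts.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return forecasts


def linear_trend_step(window):
    """Toy one-step model (an assumption): continue the most recent change."""
    return window[-1] + (window[-1] - window[-2])


# Made-up hourly PM2.5 readings; forecast the next 3 hours from the last 2.
history = [10.0, 12.0, 14.0]
forecast = recursive_forecast(linear_trend_step, history, horizon=3, lag=2)
```

Since later steps depend on earlier predictions rather than observations, errors can accumulate with the forecast horizon, which is one reason long-horizon accuracy is typically reported separately.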