1
|
Rodríguez Núñez M, Tavera Busso I, Carreras HA. Quantifying the contribution of environmental variables to cyclists' exposure to PM2.5 using machine learning techniques. Heliyon 2024; 10:e24724. [PMID: 38298733 PMCID: PMC10828810 DOI: 10.1016/j.heliyon.2024.e24724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 12/17/2023] [Accepted: 01/12/2024] [Indexed: 02/02/2024]
Abstract
Cyclists are particularly vulnerable to travel-related exposure to air pollution. Understanding the factors that increase exposure is crucial for promoting healthier urban environments. Machine learning models have successfully predicted air pollutant concentrations, but they tend to be less interpretable than classical statistical ones, such as linear models. This study aimed to develop a predictive model to assess cyclists' exposure to fine particulate matter (PM2.5) in urban environments. The model was generated using geo-temporally referenced data and machine learning techniques. We explored several models and found that the gradient boosting machine learning model best fitted the PM2.5 predictions, with a minimum root mean square error value of 5.62 μg m-3. The variables with greatest influence on cyclist exposure were the temporal ones (month, day of the week, and time of the day), followed by meteorological variables, such as temperature, relative humidity, wind speed, wind direction, and atmospheric pressure. Additionally, we considered relevant attributes, which are partially linked to spatial characteristics. These attributes encompass street typology, vegetation density, and the flow of vehicles on a particular street, which quantifies the number of vehicles passing a given point per minute. Mean PM2.5 concentration was lower in bicycle paths away from vehicular traffic than in bike lanes along streets. These outcomes underscore the need to thoughtfully design public transportation routes, including bus routes, concerning the network of bicycle pathways. Such strategic planning attempts to improve the air quality in urban landscapes.
Affiliation(s)
- Martín Rodríguez Núñez
- Instituto Multidisciplinario de Biología Vegetal (IMBIV), Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina
- Departamento de Química, Físicas y Naturales, Universidad Nacional de Córdoba, Córdoba, Argentina
| | - Iván Tavera Busso
- Instituto Multidisciplinario de Biología Vegetal (IMBIV), Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina
- Departamento de Química, Físicas y Naturales, Universidad Nacional de Córdoba, Córdoba, Argentina
| | - Hebe Alejandra Carreras
- Instituto Multidisciplinario de Biología Vegetal (IMBIV), Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina
- Departamento de Química, Físicas y Naturales, Universidad Nacional de Córdoba, Córdoba, Argentina
| |
|
2
|
Ma X, Zou B, Deng J, Gao J, Longley I, Xiao S, Guo B, Wu Y, Xu T, Xu X, Yang X, Wang X, Tan Z, Wang Y, Morawska L, Salmond J. A comprehensive review of the development of land use regression approaches for modeling spatiotemporal variations of ambient air pollution: A perspective from 2011 to 2023. ENVIRONMENT INTERNATIONAL 2024; 183:108430. [PMID: 38219544 DOI: 10.1016/j.envint.2024.108430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Revised: 11/26/2023] [Accepted: 01/04/2024] [Indexed: 01/16/2024]
Abstract
Land use regression (LUR) models are widely used in epidemiological and environmental studies to estimate humans' exposure to air pollution within urban areas. However, the early models, developed using linear regressions and data from fixed monitoring stations and passive sampling, were primarily designed to model traditional and criteria air pollutants and had limitations in capturing high-resolution spatiotemporal variations of air pollution. Over the past decade, there has been a notable development of multi-source observations from low-cost monitors, mobile monitoring, and satellites, in conjunction with the integration of advanced statistical methods and spatially and temporally dynamic predictors, which have facilitated significant expansion and advancement of LUR approaches. This paper reviews and synthesizes the recent advances in LUR approaches from the perspectives of the changes in air quality data acquisition, novel predictor variables, advances in model-developing approaches, improvements in validation methods, model transferability, and modeling software as reported in 155 LUR studies published between 2011 and 2023. We demonstrate that these developments have enabled LUR models to be developed for larger study areas and encompass a wider range of criteria and unregulated air pollutants. LUR models in the conventional spatial structure have been complemented by more complex spatiotemporal structures. Compared with linear models, advanced statistical methods yield better predictions when handling data with complex relationships and interactions. Finally, this study explores new developments, identifies potential pathways for further breakthroughs in LUR methodologies, and proposes future research directions. In this context, LUR approaches have the potential to make a significant contribution to future efforts to model the patterns of long- and short-term exposure of urban populations to air pollution.
Affiliation(s)
- Xuying Ma
- College of Geomatics, Xi'an University of Science and Technology, Xi'an 710054, China; College of Safety Science and Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; International Laboratory for Air Quality and Health, Queensland University of Technology, Brisbane, Queensland 4000, Australia.
| | - Bin Zou
- School of Geosciences and Info-Physics, Central South University, Changsha, Hunan 410083, China.
| | - Jun Deng
- College of Safety Science and Engineering, Xi'an University of Science and Technology, Xi'an 710054, China; Shaanxi Key Laboratory of Prevention and Control of Coal Fire, Xi'an University of Science and Technology, Xi'an 710054, China
| | - Jay Gao
- School of Environment, Faculty of Science, University of Auckland, Auckland 1010, New Zealand
| | - Ian Longley
- National Institute of Water and Atmospheric Research, Auckland 1010, New Zealand
| | - Shun Xiao
- School of Geography and Tourism, Shaanxi Normal University, Xi'an 710119, China
| | - Bin Guo
- College of Geomatics, Xi'an University of Science and Technology, Xi'an 710054, China
| | - Yarui Wu
- College of Geomatics, Xi'an University of Science and Technology, Xi'an 710054, China
| | - Tingting Xu
- School of Software Engineering, Chongqing University of Post and Telecommunications, Chongqing 400065, China
| | - Xin Xu
- Xi'an Institute for Innovative Earth Environment Research, Xi'an 710061, China
| | - Xiaosha Yang
- Shandong Nova Fitness Co., Ltd., Baoji, Shaanxi 722404, China
| | - Xiaoqi Wang
- College of Geomatics, Xi'an University of Science and Technology, Xi'an 710054, China
| | - Zelei Tan
- College of Geomatics, Xi'an University of Science and Technology, Xi'an 710054, China
| | - Yifan Wang
- College of Geomatics, Xi'an University of Science and Technology, Xi'an 710054, China
| | - Lidia Morawska
- International Laboratory for Air Quality and Health, Queensland University of Technology, Brisbane, Queensland 4000, Australia.
| | - Jennifer Salmond
- School of Environment, Faculty of Science, University of Auckland, Auckland 1010, New Zealand
| |
|
3
|
Wongnakae P, Chitchum P, Sripramong R, Phosri A. Application of satellite remote sensing data and random forest approach to estimate ground-level PM2.5 concentration in Northern region of Thailand. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:88905-88917. [PMID: 37442931 DOI: 10.1007/s11356-023-28698-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2023] [Accepted: 07/05/2023] [Indexed: 07/15/2023]
Abstract
Numerous epidemiological studies have shown that particulate matter with aerodynamic diameter up to 2.5 μm (PM2.5) is associated with many health consequences. In these studies, the PM2.5 concentration obtained from a monitoring station was normally applied as the exposure level, so the concentration of PM2.5 in unmonitored areas has not been captured. The satellite-derived aerosol optical depth (AOD) product is therefore used to spatially predict ground-level PM2.5 concentration in locations with no air quality monitoring station, but this method has seldom been developed in Thailand. This study aimed at estimating ground-level PM2.5 concentration at 3 km × 3 km spatial resolution over the Northern region of Thailand in 2021 using a random forest model integrating the Moderate Resolution Imaging Spectroradiometer (MODIS) AOD products from the Terra and Aqua satellites, meteorological factors, and land use data. A random forest model containing 100 decision trees was trained, and a 10-fold cross-validation approach was implemented to validate the model performance. Good consistency between actual (observed) and predicted concentrations of PM2.5 in the Northern region of Thailand was observed: the coefficient of determination (R2) and root mean square error (RMSE) of the model fitting were 0.803 and 14.30 μg/m3, respectively, and those of the 10-fold cross-validation approach were 0.796 and 14.64 μg/m3, respectively. The three most important predictors for estimating the ground-level concentrations of PM2.5 in this study were normalized difference vegetation index (NDVI), relative humidity, and number of fire hotspots, respectively. Findings from this study revealed that integrating the MODIS AOD, meteorological variables, and land use data into the random forest model precisely and accurately estimated ground-level PM2.5 concentration over the Northern region of Thailand. These estimates can be further used in epidemiological studies to investigate the effects of PM2.5 exposure on health consequences, even in unmonitored locations.
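The 10-fold cross-validation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a one-variable least-squares line stands in for the random forest so that the example needs only the standard library, the data are synthetic, and the names `k_fold_indices`, `fit_line`, and `cross_validate` are hypothetical.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validate(xs, ys, k=10):
    """Predict every sample from a model trained on the other k-1 folds,
    then return the cross-validated R2 and RMSE."""
    preds = [0.0] * len(xs)
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in held_out:
            preds[i] = slope * xs[i] + intercept
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(ys))


# Synthetic data (an assumption for illustration): PM2.5 rises with AOD plus noise.
rng = random.Random(42)
aod = [rng.uniform(0.1, 1.5) for _ in range(200)]
pm25 = [40.0 * a + 5.0 + rng.gauss(0.0, 4.0) for a in aod]

cv_r2, cv_rmse = cross_validate(aod, pm25, k=10)
print(f"CV R2 = {cv_r2:.3f}, CV RMSE = {cv_rmse:.2f}")
```

Because every prediction comes from a model that never saw that sample, the cross-validated R2 and RMSE are normally slightly worse than the fitting values, which matches the pattern reported above (0.796 versus 0.803, and 14.64 versus 14.30 μg/m3).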
Affiliation(s)
- Pimchanok Wongnakae
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand
| | - Pakkapong Chitchum
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand
| | - Rungduen Sripramong
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand
| | - Arthit Phosri
- Department of Environmental Health Sciences, Faculty of Public Health, Mahidol University, 4th Floor, 2nd Building, Rajvithi Road, Bangkok, 10400, Thailand.
- Center of Excellence on Environmental Health and Toxicology (EHT), OPS, Ministry of Higher Education, Research, Science and Innovation, Bangkok, Thailand.
| |
|
4
|
Islam ARMT, Al Awadh M, Mallick J, Pal SC, Chakraborty R, Fattah MA, Ghose B, Kakoli MKA, Islam MA, Naqvi HR, Bilal M, Elbeltagi A. Estimating ground-level PM2.5 using subset regression model and machine learning algorithms in Asian megacity, Dhaka, Bangladesh. AIR QUALITY, ATMOSPHERE, & HEALTH 2023; 16:1117-1139. [PMID: 37303964 PMCID: PMC9961308 DOI: 10.1007/s11869-023-01329-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/19/2022] [Accepted: 02/16/2023] [Indexed: 06/13/2023]
Abstract
Fine particulate matter (PM2.5) has become a prominent pollutant due to rapid economic development, urbanization, industrialization, and transport activities, and it has serious adverse effects on human health and the environment. Many studies have employed traditional statistical models and remote-sensing technologies to estimate PM2.5 concentrations. However, statistical models have shown inconsistency in PM2.5 concentration predictions, while machine learning algorithms have excellent predictive capacity, but little research has been done on the complementary advantages of diverse approaches. The present study proposed the best subset regression model and machine learning approaches, including random tree, additive regression, reduced error pruning tree, and random subspace, to estimate the ground-level PM2.5 concentrations over Dhaka. This study used advanced machine learning algorithms to measure the effects of meteorological factors and air pollutants (NOX, SO2, CO, and O3) on the dynamics of PM2.5 in Dhaka from 2012 to 2020. Results showed that the best subset regression model performed well in forecasting PM2.5 concentrations for all sites based on the integration of precipitation, relative humidity, temperature, wind speed, SO2, NOX, and O3. Precipitation, relative humidity, and temperature have negative correlations with PM2.5. The concentration levels of pollutants are much higher at the beginning and end of the year. Random subspace is the optimal model for estimating PM2.5 because it has the lowest statistical error metrics of the models compared. This study suggests ensemble learning models to estimate PM2.5 concentrations. This study will help quantify ground-level PM2.5 concentration exposure and recommend regional government actions to prevent and regulate PM2.5 air pollution. Supplementary Information The online version contains supplementary material available at 10.1007/s11869-023-01329-w.
Affiliation(s)
| | - Mohammed Al Awadh
- Department of Industrial Engineering, College of Engineering, King Khalid University, Abha, 61421 Saudi Arabia
| | - Javed Mallick
- Department of Civil Engineering, King Khalid University, Abha, Saudi Arabia
| | - Subodh Chandra Pal
- Department of Geography, The University of Burdwan, Bardhaman, West Bengal 713104 India
| | - Rabin Chakraborty
- Department of Geography, The University of Burdwan, Bardhaman, West Bengal 713104 India
| | - Md. Abdul Fattah
- Department of Urban and Regional Planning, Khulna University of Engineering and Technology, Khulna, Bangladesh
| | - Bonosri Ghose
- Department of Disaster Management, Begum Rokeya University, Rangpur, Rangpur, 5400 Bangladesh
| | | | - Md. Aminul Islam
- Department of Disaster Management, Begum Rokeya University, Rangpur, Rangpur, 5400 Bangladesh
| | - Hasan Raja Naqvi
- Department of Geography, Faculty of Natural Sciences, Jamia Millia Islamia (A Central University), New Delhi, 110025 India
| | - Muhammad Bilal
- School of Surveying and Land Information Engineering, Henan Polytechnic University, Jiaozuo, 45003 China
| | - Ahmed Elbeltagi
- Agricultural Engineering Dept., Faculty of Agriculture, Mansoura University, Mansoura, 35516 Egypt
| |
|
5
|
Sun X, Zhou Z, Wang Y. Water resource carrying capacity and obstacle factors in the Yellow River basin based on the RBF neural network model. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:22743-22759. [PMID: 36306066 PMCID: PMC9613451 DOI: 10.1007/s11356-022-23712-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Accepted: 10/14/2022] [Indexed: 06/16/2023]
Abstract
The Yellow River basin (YRB) plays an important role in China's economic and social growth. Based on different dimensions, we adopted the radial basis function (RBF) neural network model and the obstacle degree model to examine the water resource carrying capacity (WRCC) of the YRB. From 2005 to 2020, the WRCC of the entire YRB, as well as the upstream and midstream regions, improved, but the WRCC of the downstream region remained poor, revealing spatial differences. The overall improvement in the WRCC of the Yellow River's nine provinces is good, but the WRCC of Inner Mongolia and Henan is poor, suggesting regional differences. From the standpoint of obstacle factors, the development and usage rate of surface water resources are the main challenges. In 2020, the obstacle degree of the YRB reached 87.4871%. The irrigated area rate in Gansu was the primary obstacle factor, and the obstacle degree reached 73.0238%. Qinghai's industrial aspects mostly hindered the improvement of its WRCC, with an obstacle degree of 31.36%. The results provide a theoretical reference for the high-quality development of the YRB.
Affiliation(s)
- Xinrui Sun
- School of Statistics, Dongbei University of Finance and Economics, Dalian, 116023, China
| | - Zixuan Zhou
- School of Statistics, Dongbei University of Finance and Economics, Dalian, 116023, China
| | - Yong Wang
- School of Statistics, Dongbei University of Finance and Economics, Dalian, 116023, China.
| |
|
6
|
Analysis of PM2.5 Variations Based on Observed, Satellite-Derived, and Population-Weighted Concentrations. REMOTE SENSING 2022. [DOI: 10.3390/rs14143381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Fine particulate matter (PM2.5), which can cause adverse human health effects, has been proven as the first air pollutant in China. In situ observations with ground-level monitoring and satellite-based concentrations have been used to analyze the variations in PM2.5. However, variation analyses based on these two kinds of measurement have mainly focused on the concentration itself and ignored the effects on the population. Therefore, this study not only investigated these two kinds of measurements, but also performed weighted population analyses to study the variations in PM2.5. Firstly, daily models of timely structure adaptive modeling (TSAM) were constructed to simulate satellite-derived PM2.5 levels from January 2013 to December 2016. Secondly, population-weighted concentrations were calculated based on TSAM-derived PM2.5 surfaces. Finally, observed, TSAM-derived, and population-weighted concentrations were used to analyze the variations in PM2.5. The results showed the different importance of various input parameters; AOD had the highest rank. Additionally, TSAM models demonstrated good performance, fitting R ranging from 0.86 to 0.91, and validating R from 0.82 to 0.89. According to the air quality standard in China, TSAM-derived PM2.5 showed that the increase in area lower than Level II was 29.03% and the increase in population was only 14.81%. This indicates that the air quality exhibited an overall improvement in spatial perspective, but some areas with high population density showed a relatively low improvement due to uneven distributions in China. The population-weighted PM2.5 concentration could better represent the health threats of air pollutants compared with in situ observations.
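The abstract does not give the formula used for the population-weighted concentration. A common definition, assumed here, is the sum of concentration times population over grid cells divided by the total population. The sketch below uses made-up grid values, and the function name `population_weighted_mean` is hypothetical.

```python
def population_weighted_mean(concentrations, populations):
    """Return sum(p_i * c_i) / sum(p_i) over grid cells (assumed definition)."""
    if len(concentrations) != len(populations):
        raise ValueError("inputs must have the same length")
    total_pop = sum(populations)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    weighted = sum(c * p for c, p in zip(concentrations, populations))
    return weighted / total_pop


# Hypothetical grid cells: PM2.5 in µg/m3 and resident counts per cell.
pm25 = [35.0, 80.0, 20.0]
pop = [1000, 8000, 1000]

area_mean = sum(pm25) / len(pm25)               # unweighted mean: 45.0
pop_mean = population_weighted_mean(pm25, pop)  # 695000 / 10000 = 69.5
print(area_mean, pop_mean)
```

In this toy case most residents live in the most polluted cell, so the population-weighted value (69.5) is well above the simple spatial mean (45.0). This is the kind of gap between area-based and population-based improvement that the abstract reports.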
|
7
|
Guo B, Wang L, Pan R, Zhu X. A grouped spatial-temporal model for PM2.5 data and its applications on outlier detection. COMMUN STAT-SIMUL C 2022. [DOI: 10.1080/03610918.2022.2081707] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Baishan Guo
- Meta Platforms, Inc, Menlo Park, California, USA
| | - Lei Wang
- Heinz College of Information Systems and Public Policy, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
| | - Rui Pan
- School of Statistics and Mathematics, Central University of Finance and Economics, Beijing, China
| | - Xuening Zhu
- School of Data Science, Fudan University, Shanghai, China
| |
|
8
|
High Spatiotemporal Resolution PM2.5 Concentration Estimation with Machine Learning Algorithm: A Case Study for Wildfire in California. REMOTE SENSING 2022. [DOI: 10.3390/rs14071635] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
As an aggregate of suspended particulate matter in the air, atmospheric aerosols can affect the regional climate. With the help of satellite remote sensing technology to retrieve AOD (aerosol optical depth) on a global or regional scale, accurate estimation of PM2.5 concentration has become an important task to quantify the spatiotemporal distribution of AOD and PM2.5. However, due to the limitations of satellite platforms, sensors, and inversion algorithms, the spatiotemporal resolution of current major AOD products is still relatively low. Meanwhile, for the impact of cloud, the AOD products often have a serious data gap problem, which also objectively limits the spatiotemporal coverage of predicted PM2.5 concentration. Therefore, how to effectively improve the spatiotemporal resolution and coverage of PM2.5 concentration under the requisite accuracy is still a grand challenge. In this study, the fused high spatial-temporal resolution AOD data in our previous study were used to estimate the ground PM2.5 concentration through machine learning algorithms, the deep belief network (DBN). The PM2.5 data had spatiotemporal autocorrelation in geostatistics and followed the Gaussian kernel distribution. Hence, the autocorrelation model modified by Gaussian kernel function integrated with DBN algorithm, named Geoi-DBN, was used to estimate PM2.5 concentration. The cross-validation results showed that the Geoi-DBN (R2 = 0.86, RMSE = 6.84 µg m−3) performed better than the original DBN (R2 = 0.67, RMSE = 10.46 µg m−3). The final high quality PM2.5 concentration data can be applied for urban air quality monitoring and related PM2.5 exposure risk assessment such as wildfire.
|
9
|
High-Resolution PM2.5 Estimation Based on the Distributed Perception Deep Neural Network Model. SUSTAINABILITY 2021. [DOI: 10.3390/su132413985] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
The accurate measurement of the PM2.5 individual exposure level is a key issue in the study of health effects. However, the lack of historical data and the minute coverage of ground monitoring points are obstacles to the study of such issues. Combining the aerosol optical depth provided by NASA with ground monitoring data and meteorological data is an effective method for estimating the near-ground concentration of PM2.5. With the deepening of related research, the models used have developed from univariate and multivariate linear models to nonlinear models such as support vector machine, random forest model, and deep learning neural network model. Among them, the deep neural network model has better performance. However, in the existing research, the variables used are input into the same neural network together; that is, the complex relationship caused by the lag effect of features and the correlation and partial correlation between features have not been considered. Such a neural network framework cannot be applied well to the complex situation of atmospheric systems, and the estimation accuracy of the model needs to be improved. This is the first problem that we need to overcome. Secondly, in the processing of missing data values, the existing studies mostly use single interpolation methods such as linear fitting and Kriging interpolation. However, because the time and place of missing data are complex and changeable, a single method has difficulty dealing with large strips and blocks of missing data. Moreover, the linear fitting method tends to smooth out the special data in bad weather. This is the second problem that we need to overcome. Therefore, we construct a distributed perception deep neural network model (DP-DNN) and a spatiotemporal multiview interpolation module to overcome problems 1 and 2.
In empirical research, based on the Beijing–Tianjin–Hebei–Shandong region in 2018, we introduce 50 features such as meteorology, NDVI, and spatial-temporal features to analyze the relationship between AOD and PM2.5, and test the performance of DP-DNN and the spatiotemporal multiview interpolation module. The results show that after applying the spatiotemporal multiview interpolation module, the average proportion of missing data decreases from 52.1% to 4.84%, and the relative error of the results is 27.5%. Compared with the single interpolation method, this module has obvious advantages in accuracy and level of completion. The mean absolute error, relative error, mean square error, and root mean square error of DP-DNN in time prediction are 17.7 μg/m3, 46.8%, 766.2 μg2/m6, and 26.9 μg/m3, respectively, and in space prediction, they are 16.6 μg/m3, 41.8%, 691.5 μg2/m6, and 26.6 μg/m3. DP-DNN has higher accuracy and generalization ability. At the same time, the estimation method in this paper can estimate the PM2.5 of the selected longitude and latitude, which can effectively solve the problem of insufficient coverage of China’s meteorological environmental quality monitoring stations.
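The four error measures quoted in this abstract can be computed as in the sketch below. The abstract does not define "relative error"; the code assumes it is the mean absolute error divided by the mean observed value, which may differ from the authors' definition. The function name `error_metrics` and the sample values are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return MAE, relative error, MSE, and RMSE for paired samples.

    Relative error is taken as MAE / mean(observed); this is an assumption,
    since the source abstract does not state its definition.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return {
        "mae": mae,                                   # same unit as the data
        "relative_error": mae / (sum(observed) / n),  # dimensionless
        "mse": mse,                                   # unit squared
        "rmse": math.sqrt(mse),                       # same unit as the data
    }


# Hypothetical PM2.5 values in µg/m3.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

metrics = error_metrics(observed, predicted)
print(metrics)  # mae 2.5, relative_error 0.1, mse 6.5, rmse about 2.55
```

The unit comments explain why the mean square error is reported in μg2/m6 while the mean absolute error and root mean square error are in μg/m3.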
|
10
|
Guo B, Zhang D, Pei L, Su Y, Wang X, Bian Y, Zhang D, Yao W, Zhou Z, Guo L. Estimating PM2.5 concentrations via random forest method using satellite, auxiliary, and ground-level station dataset at multiple temporal scales across China in 2017. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 778:146288. [PMID: 33714834 DOI: 10.1016/j.scitotenv.2021.146288] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/10/2020] [Revised: 02/15/2021] [Accepted: 03/01/2021] [Indexed: 06/12/2023]
Abstract
Fine particulate matter with aerodynamic diameters less than 2.5 μm (PM2.5) poses adverse impacts on public health and the environment. It is still a great challenge to estimate high-resolution PM2.5 concentrations at moderate scales. The current study calibrated PM2.5 concentrations at a 1 km resolution using ground-level monitoring data, Aerosol Optical Depth (AOD), meteorological data, and auxiliary data via a Random Forest (RF) model across China in 2017. Three 10-fold cross-validation (CV) methods (sample-based, time-based, and spatial-based), combined with the coefficient of determination (R2), Root-Mean-Square Error (RMSE), and Mean Predictive Error (MPE), were used for validation at different temporal scales: daily, monthly, heating season, and non-heating season. Finally, the distribution map of PM2.5 concentrations was produced based on the RF model. The RF model performed well, with a relatively high sample-based cross-validation R2 of 0.74, a low RMSE of 16.29 μg m-3, and a small MPE of -0.282 μg m-3. Meanwhile, the performance of the RF model in inferring the PM2.5 concentrations was good at urban scales except for Chengyu (CY). North China, the CY urban agglomeration, and the northwest of China exhibited relatively high PM2.5 pollution, especially in the heating season. The RF model in the present study was more robust than most statistical regression models for calibrating PM2.5 concentrations. The outcomes can supply an up-to-date scientific dataset for epidemiological and air pollutant exposure risk studies across China.
Affiliation(s)
- Bin Guo
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China.
| | - Dingming Zhang
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| | - Lin Pei
- School of Public Health, Xi'an Jiaotong University, Xi'an, China.
| | - Yi Su
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| | - Xiaoxia Wang
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| | - Yi Bian
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| | - Donghai Zhang
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| | - Wanqiang Yao
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China.
| | - Zixiang Zhou
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| | - Liyu Guo
- College of Geomatics, Xi'an University of Science and Technology, Xi'an, China
| |
|
11
|
Etchie TO, Etchie AT, Jauro A, Pinker RT, Swaminathan N. Season, not lockdown, improved air quality during COVID-19 State of Emergency in Nigeria. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 768:145187. [PMID: 33736334 PMCID: PMC7825968 DOI: 10.1016/j.scitotenv.2021.145187] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/11/2020] [Revised: 01/09/2021] [Accepted: 01/10/2021] [Indexed: 05/24/2023]
Abstract
Globally, ambient air pollution claims ~9 million lives yearly, prompting researchers to investigate changes in air quality. Of special interest is the impact of COVID-19 lockdown. Many studies reported substantial improvements in air quality during lockdowns compared with pre-lockdown or baseline values. Since the lockdown period coincided with the onset of the rainy season in some tropical countries such as Nigeria, it is unclear if such improvements can be fully attributed to the lockdown. We investigate whether significant changes in air quality in Nigeria occurred primarily due to statewide COVID-19 lockdown. We applied a neural network approach to derive monthly average ground-level fine aerosol optical depth (AODf) across Nigeria from 2001 to 2020, using the Multi-angle Implementation of Atmospheric Correction (MAIAC) AODs from Terra and Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) satellites, AERONET aerosol optical properties, and meteorological and spatial parameters. During the year 2020, we found a 21% or 26% decline in average AODf level across Nigeria during lockdown (April) as compared to pre-lockdown (March), or during the easing phase-1 (May) as compared to lockdown, respectively. Throughout the 20-year period, AODf levels were highest in January and lowest in May or June, but not April. Comparison of AODf levels between 2020 and 2019 shows a small decline (1%) in pollution level in April of 2020 compared with 2019. Using a linear time-lag model to compare changes in AODf levels for similar months from 2002 to 2020, we found no significant difference (Levene's test and ANCOVA; α = 0.05) in the pollution levels by year, which indicates that the lockdown did not significantly improve air quality in Nigeria. 
Impact analysis using multiple linear regression revealed that favorable meteorological conditions due to seasonal change in temperature, relative humidity, planetary boundary layer height, wind speed and rainfall improved air quality during the lockdown.
Affiliation(s)
| | | | - Aliyu Jauro
- National Environmental Standards and Regulations Enforcement Agency (NESREA), Garki-Abuja, Nigeria.
| | - Rachel T Pinker
- Department of Atmospheric and Oceanic Science, University of Maryland, College Park, USA.
| | | |
|
12
|
Zhang H, Shang Z, Song Y, He Z, Li L. A novel combined model based on echo state network - a case study of PM10 and PM2.5 prediction in China. ENVIRONMENTAL TECHNOLOGY 2020; 41:1937-1949. [PMID: 30472931 DOI: 10.1080/09593330.2018.1551941] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/10/2018] [Accepted: 11/16/2018] [Indexed: 06/09/2023]
Abstract
Particulate matter such as PM10 and PM2.5 may contain heavy metal oxides and harmful substances that threaten human health and environmental quality. In this paper, we propose a new combined neural network algorithm based on the Elman network, the echo state network (ESN), and the cascaded BP neural network (CBP) to predict PM10 and PM2.5. To further improve prediction performance, we use the simulated annealing algorithm (SA) to optimize the parameters of the combination method and form the optimal combined model, and we use particle swarm optimization (PSO) to optimize the parameters of the ESN. Data on atmospheric chemical species, including SO2, NO, NO2, O3 and CO, from Baiyin, Gansu Province, China, are used to test and verify the proposed combined method. The experimental results show that the prediction performance of the combined model is superior to that of the other three neural network models.
Affiliation(s)
- Hairui Zhang
- School of Information Science and Engineering, Lanzhou University, Lanzhou, People's Republic of China
- Zhihao Shang
- School of Information Science and Engineering, Lanzhou University, Lanzhou, People's Republic of China
- Department of Mathematics and Computer Science, Free University of Berlin, Berlin, Germany
- Yanru Song
- School of Information Science and Engineering, Lanzhou University, Lanzhou, People's Republic of China
- Zhaoshuang He
- School of Information Science and Engineering, Lanzhou University, Lanzhou, People's Republic of China
- Lian Li
- School of Information Science and Engineering, Lanzhou University, Lanzhou, People's Republic of China
13
Park Y, Kwon B, Heo J, Hu X, Liu Y, Moon T. Estimating PM2.5 concentration of the conterminous United States via interpretable convolutional neural networks. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2020; 256:113395. [PMID: 31708281 DOI: 10.1016/j.envpol.2019.113395] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 07/03/2019] [Revised: 09/30/2019] [Accepted: 10/12/2019] [Indexed: 06/10/2023]
Abstract
We apply a convolutional neural network (CNN) model to estimate daily 24-h averaged ground-level PM2.5 over the conterminous United States in 2011 by incorporating aerosol optical depth (AOD) data, meteorological fields, and land-use data. Unlike some recent supervised learning-based approaches, which use only the predictors from the location at which the PM2.5 value is estimated, we naturally aggregate predictors from nearby locations so that the spatial correlation among the predictors can be exploited. We carefully evaluate the performance of our method via overall, temporally-separated, and spatially-separated cross-validations (CV) and show that our CNN achieves competitive estimation accuracy compared to recently developed baselines. Furthermore, we develop a novel predictor importance metric for our CNN based on a recent neural network interpretation method, Layerwise Relevance Propagation (LRP), and identify several informative predictors for PM2.5 estimation.
Affiliation(s)
- Yongbee Park
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, South Korea
- Juyeon Heo
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, South Korea
- Xuefei Hu
- Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA, 30322, USA
- Yang Liu
- Department of Environmental Health, Rollins School of Public Health, Emory University, Atlanta, GA, 30322, USA
- Taesup Moon
- Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, 16419, South Korea; Department of Artificial Intelligence, Sungkyunkwan University, Suwon, 16419, South Korea.
14
Bi J, Stowell J, Seto EYW, English PB, Al-Hamdan MZ, Kinney PL, Freedman FR, Liu Y. Contribution of low-cost sensor measurements to the prediction of PM2.5 levels: A case study in Imperial County, California, USA. ENVIRONMENTAL RESEARCH 2020; 180:108810. [PMID: 31630004 PMCID: PMC6899193 DOI: 10.1016/j.envres.2019.108810] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 06/22/2019] [Revised: 08/13/2019] [Accepted: 10/07/2019] [Indexed: 05/22/2023]
Abstract
Regulatory monitoring networks are often too sparse to support community-scale PM2.5 exposure assessment while emerging low-cost sensors have the potential to fill in the gaps. To date, limited studies, if any, have been conducted to utilize low-cost sensor measurements to improve PM2.5 prediction with high spatiotemporal resolutions based on statistical models. Imperial County in California is an exemplary region with sparse Air Quality System (AQS) monitors and a community-operated low-cost network entitled Identifying Violations Affecting Neighborhoods (IVAN). This study aims to evaluate the contribution of IVAN measurements to the quality of PM2.5 prediction. We adopted the Random Forest algorithm to estimate daily PM2.5 concentrations at a 1-km spatial resolution using three different PM2.5 datasets (AQS-only, IVAN-only, and AQS/IVAN combined). The results show that the integration of low-cost sensor measurements is an effective way to significantly improve the quality of PM2.5 prediction with an increase of cross-validation (CV) R2 by ~0.2. The IVAN measurements also contributed to the increased importance of emission source-related covariates and more reasonable spatial patterns of PM2.5. The remaining uncertainty in the calibrated IVAN measurements could still cause apparent outliers in the prediction model, highlighting the need for more effective calibration or integration methods to relieve its negative impact.
Affiliation(s)
- Jianzhao Bi
- Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, 30322, United States
- Jennifer Stowell
- Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, 30322, United States
- Edmund Y W Seto
- Department of Environmental & Occupational Health Sciences, University of Washington, Seattle, WA, 98195, United States
- Paul B English
- California Department of Public Health, Richmond, CA, 94804, United States
- Mohammad Z Al-Hamdan
- Universities Space Research Association, NASA Marshall Space Flight Center, Huntsville, AL, 35808, United States
- Patrick L Kinney
- Department of Environmental Health, Boston University, School of Public Health, Boston, MA, 02118, United States
- Frank R Freedman
- Department of Meteorology and Climate Science, San Jose State University, San Jose, CA, 95192, United States.
- Yang Liu
- Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, 30322, United States.
15
Wang W, Zhao S, Jiao L, Taylor M, Zhang B, Xu G, Hou H. Estimation of PM2.5 Concentrations in China Using a Spatial Back Propagation Neural Network. Sci Rep 2019; 9:13788. [PMID: 31551510 PMCID: PMC6760143 DOI: 10.1038/s41598-019-50177-1] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 04/17/2018] [Accepted: 09/06/2019] [Indexed: 01/28/2023]
Abstract
Methods for estimating the spatial distribution of PM2.5 concentrations have been developed but have not yet been able to effectively include spatial correlation. We report on the development of a spatial back-propagation neural network (S-BPNN) model designed specifically to make such correlations implicit by incorporating a spatial lag variable (SLV) as a virtual input variable. The S-BPNN fits the nonlinear relationship between ground-based air quality monitoring station measurements of PM2.5, satellite observations of aerosol optical depth, meteorological synoptic conditions data and emissions data that include auxiliary geographical parameters such as land use, normalized difference vegetation index, elevation, and population density. We trained and validated the S-BPNN for both yearly and seasonal mean PM2.5 concentrations. In addition, principal components analysis was employed to reduce the dimensionality of the data and a grid of neural network models was run to optimize the model design. The S-BPNN was cross-validated against an analogous but SLV-free BPNN model using the coefficient of determination (R2) and root mean squared error (RMSE) as statistical measures of goodness of fit. The inclusion of the SLV led to demonstrably superior performance of the S-BPNN over the BPNN with R2 values increasing from 0.80 to 0.89 and with the RMSE decreasing from 8.1 to 5.8 μg/m3. The yearly mean PM2.5 concentration in China during the study period was found to be 41.8 μg/m3 and the model estimated spatial distribution was found to exceed Level 2 of the China Ambient Air Quality Standards (CAAQS) enacted in 2012 (>35 μg/m3) in more than 70% of the Chinese territory. The inclusion of spatial correlation upgrades the performance of conventional BPNN models and provides a more accurate estimation of PM2.5 concentrations for air quality monitoring.
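The idea of a spatial lag variable can be illustrated with a minimal sketch. The abstract does not state how the SLV is computed, so the inverse-distance weighting over the k nearest stations used here, the function name `spatial_lag`, and the sample coordinates are assumptions for illustration only, not the authors' formulation.

```python
import math


def spatial_lag(stations, k=3):
    """Return one spatial-lag value per station.

    stations: list of (x, y, value) tuples (hypothetical layout).
    The lag is an inverse-distance-weighted mean of the values at the
    k nearest *other* stations (an assumed definition, see lead-in).
    """
    lags = []
    for i, (xi, yi, _) in enumerate(stations):
        neighbours = []
        for j, (xj, yj, vj) in enumerate(stations):
            if i == j:
                continue  # a station never contributes to its own lag
            distance = math.hypot(xi - xj, yi - yj)
            neighbours.append((distance, vj))
        neighbours.sort(key=lambda pair: pair[0])
        nearest = neighbours[:k]
        # Guard against co-located stations to avoid division by zero.
        weights = [1.0 / max(d, 1e-9) for d, _ in nearest]
        weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearest))
        lags.append(weighted_sum / sum(weights))
    return lags


# Hypothetical stations: (x, y, PM2.5 in μg/m3)
stations = [(0, 0, 10.0), (1, 0, 20.0), (0, 1, 30.0), (10, 10, 100.0)]
lags = spatial_lag(stations, k=2)
```

Under these assumptions, each lag value would be appended to that station's predictor vector before the network is trained, so that neighbouring concentrations enter the model as an ordinary input.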
Affiliation(s)
- Weilin Wang
- School of Resource and Environmental Sciences, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China; Key Laboratory of Geographic Information System, Ministry of Education, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China
- Suli Zhao
- School of Resource and Environmental Sciences, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China
- Limin Jiao
- School of Resource and Environmental Sciences, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China; Key Laboratory of Geographic Information System, Ministry of Education, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China.
- Michael Taylor
- Department of Meteorology, University of Reading, Reading, RG6 6BB, UK
- Boen Zhang
- School of Resource and Environmental Sciences, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China; Key Laboratory of Geographic Information System, Ministry of Education, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China
- Gang Xu
- School of Resource and Environmental Sciences, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China; Key Laboratory of Geographic Information System, Ministry of Education, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China
- Haobo Hou
- School of Resource and Environmental Sciences, Wuhan University, 129 Luoyu Road, Wuhan, 430079, China
16
Chen J, de Hoogh K, Gulliver J, Hoffmann B, Hertel O, Ketzel M, Bauwelinck M, van Donkelaar A, Hvidtfeldt UA, Katsouyanni K, Janssen NAH, Martin RV, Samoli E, Schwartz PE, Stafoggia M, Bellander T, Strak M, Wolf K, Vienneau D, Vermeulen R, Brunekreef B, Hoek G. A comparison of linear regression, regularization, and machine learning algorithms to develop Europe-wide spatial models of fine particles and nitrogen dioxide. ENVIRONMENT INTERNATIONAL 2019; 130:104934. [PMID: 31229871 DOI: 10.1016/j.envint.2019.104934] [Citation(s) in RCA: 85] [Impact Index Per Article: 17.0] [Received: 02/08/2019] [Revised: 05/21/2019] [Accepted: 06/13/2019] [Indexed: 05/12/2023]
Abstract
Empirical spatial air pollution models have been applied extensively to assess exposure in epidemiological studies with increasingly sophisticated and complex statistical algorithms beyond ordinary linear regression. However, different algorithms have rarely been compared in terms of their predictive ability. This study compared 16 algorithms to predict annual average fine particle (PM2.5) and nitrogen dioxide (NO2) concentrations across Europe. The evaluated algorithms included linear stepwise regression, regularization techniques and machine learning methods. Air pollution models were developed based on the 2010 routine monitoring data from the AIRBASE dataset maintained by the European Environmental Agency (543 sites for PM2.5 and 2399 sites for NO2), using satellite observations, dispersion model estimates and land use variables as predictors. We compared the models by performing five-fold cross-validation (CV) and by external validation (EV) using annual average concentrations measured at 416 (PM2.5) and 1396 sites (NO2) from the ESCAPE study. We further assessed the correlations between predictions by each pair of algorithms at the ESCAPE sites. For PM2.5, the models performed similarly across algorithms with a mean CV R2 of 0.59 and a mean EV R2 of 0.53. Generalized boosted machine, random forest and bagging performed best (CV R2~0.63; EV R2 0.58-0.61), while backward stepwise linear regression, support vector regression and artificial neural network performed less well (CV R2 0.48-0.57; EV R2 0.39-0.46). Most of the PM2.5 model predictions at ESCAPE sites were highly correlated (R2 > 0.85, with the exception of predictions from the artificial neural network). For NO2, the models performed even more similarly across different algorithms, with CV R2s ranging from 0.57 to 0.62, and EV R2s ranging from 0.49 to 0.51. The predicted concentrations from all algorithms at ESCAPE sites were highly correlated (R2 > 0.9). 
For both pollutants, biases were low for all models except the artificial neural network. Dispersion model estimates and satellite observations were two of the most important predictors for PM2.5 models whilst dispersion model estimates and traffic variables were most important for NO2 models in all algorithms that allow assessment of the importance of variables. Different statistical algorithms performed similarly when modelling spatial variation in annual average air pollution concentrations using a large number of training sites.
Affiliation(s)
- Jie Chen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Postbus 80125, 3508 TC, Utrecht, the Netherlands.
- Kees de Hoogh
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051 Basel, Switzerland; University of Basel, Petersplatz 1, Postfach 4001 Basel, Switzerland.
- John Gulliver
- Centre for Environmental Health and Sustainability, School of Geography, Geology and the Environment, University of Leicester, University Road, Leicester LE1 7RH, UK.
- Barbara Hoffmann
- Institute for Occupational, Social and Environmental Medicine, Centre for Health and Society, Medical Faculty, Heinrich Heine University Düsseldorf, Universitätsstraße 1, 40225 Düsseldorf, Germany.
- Ole Hertel
- Department of Environmental Science, Aarhus University, P.O. Box 358, Frederiksborgvej 399, 4000 Roskilde, Denmark.
- Matthias Ketzel
- Department of Environmental Science, Aarhus University, P.O. Box 358, Frederiksborgvej 399, 4000 Roskilde, Denmark; Global Centre for Clean Air Research (GCARE), Department of Civil and Environmental Engineering, University of Surrey, Guildford GU2 7XH, UK.
- Mariska Bauwelinck
- Interface Demography, Department of Sociology, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium.
- Aaron van Donkelaar
- Department of Physics and Atmospheric Science, Dalhousie University, B3H 4R2 Halifax, Nova Scotia, Canada.
- Ulla A Hvidtfeldt
- Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark.
- Klea Katsouyanni
- Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 75 Mikras Asias Str, 115 27 Athens, Greece; Department Population Health Sciences and Department of Analytical, Environmental and Forensic Sciences, School of Population Health & Environmental Sciences, King's College Strand, London WC2R 2LS, UK.
- Nicole A H Janssen
- National Institute for Public Health and the Environment (RIVM), PO Box 1, 3720 BA, Bilthoven, the Netherlands.
- Randall V Martin
- Department of Physics and Atmospheric Science, Dalhousie University, B3H 4R2 Halifax, Nova Scotia, Canada; Atomic and Molecular Physics Division, Harvard-Smithsonian Center for Astrophysics, 60 Garden St, Cambridge, MA 02138, USA.
- Evangelia Samoli
- Department of Hygiene, Epidemiology and Medical Statistics, Medical School, National and Kapodistrian University of Athens, 75 Mikras Asias Str, 115 27 Athens, Greece.
- Per E Schwartz
- Division of Environmental Medicine, Norwegian Institute of Public Health, PO Box 4404 Nydalen, N-0403 Oslo, Norway.
- Massimo Stafoggia
- Department of Epidemiology, Lazio Region Health Service/ASL Roma 1, Via Cristoforo Colombo, 112, 00147, Rome, Italy; Institute of Environmental Medicine, Karolinska Institutet, SE-171 77 Stockholm, Sweden.
- Tom Bellander
- Institute of Environmental Medicine, Karolinska Institutet, SE-171 77 Stockholm, Sweden.
- Maciek Strak
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Postbus 80125, 3508 TC, Utrecht, the Netherlands.
- Kathrin Wolf
- Helmholtz Zentrum München, German Research Center for Environmental Health (GmbH), Institute of Epidemiology, Ingolstädter Landstr. 1, D-85764 Neuherberg, Germany.
- Danielle Vienneau
- Swiss Tropical and Public Health Institute, Socinstrasse 57, 4051 Basel, Switzerland; University of Basel, Petersplatz 1, Postfach 4001 Basel, Switzerland.
- Roel Vermeulen
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Postbus 80125, 3508 TC, Utrecht, the Netherlands; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, the Netherlands.
- Bert Brunekreef
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Postbus 80125, 3508 TC, Utrecht, the Netherlands; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, the Netherlands.
- Gerard Hoek
- Institute for Risk Assessment Sciences (IRAS), Utrecht University, Postbus 80125, 3508 TC, Utrecht, the Netherlands.
17
Li X, Zhang X. Predicting ground-level PM2.5 concentrations in the Beijing-Tianjin-Hebei region: A hybrid remote sensing and machine learning approach. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2019; 249:735-749. [PMID: 30933771 DOI: 10.1016/j.envpol.2019.03.068] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 11/27/2018] [Revised: 02/13/2019] [Accepted: 03/17/2019] [Indexed: 06/09/2023]
Abstract
An accurate estimation of PM2.5 (fine particulate matter with diameters ≤ 2.5 μm) concentration is critical for health risk assessment and for developing air pollution control strategies. In this study, a hybrid remote sensing and machine learning approach, named the RSRF model, is proposed to estimate daily ground-level PM2.5 concentrations. It integrates Random Forest (RF), a machine learning (ML) model, with aerosol optical depth (AOD), a remote sensing (RS) product. The proposed RSRF model provides an opportunity for an adequate characterization of real-time spatiotemporal PM2.5 distributions at uninhabited places and over complex surfaces. It also offers advantages in handling complicated non-linear relationships among a large number of meteorological, environmental and air pollutant factors, as well as ever-increasing environmental data sets. The applicability of the proposed RSRF model is tested in the Beijing-Tianjin-Hebei region (BTH region) during 2015-2017. Deep Blue (DB) AOD from the Aqua-retrieved Collection 6.1 (C_61) aerosol products of the Moderate Resolution Imaging Spectroradiometer (MODIS) is validated with the Aerosol Robotic Network. The validation results indicate that C_61 DB AOD has a high correlation with ground-based AOD in the BTH region. The proposed RSRF model performed well in characterizing spatiotemporal variations of annual and seasonal PM2.5 concentrations. It is useful not only for quantifying the relationships between PM2.5 and relevant factors such as DB AOD, meteorological and air pollutant variables, but also for providing decision support for regional air pollution control during haze periods.
Affiliation(s)
- Xintong Li
- School of Environmental Science and Engineering, Shandong University, Qingdao, Shandong, 266237, China
- Xiaodong Zhang
- School of Environmental Science and Engineering, Shandong University, Qingdao, Shandong, 266237, China.
18
Xue T, Zheng Y, Tong D, Zheng B, Li X, Zhu T, Zhang Q. Spatiotemporal continuous estimates of PM2.5 concentrations in China, 2000-2016: A machine learning method with inputs from satellites, chemical transport model, and ground observations. ENVIRONMENT INTERNATIONAL 2019; 123:345-357. [PMID: 30562706 DOI: 10.1016/j.envint.2018.11.075] [Citation(s) in RCA: 102] [Impact Index Per Article: 20.4] [Received: 07/26/2018] [Revised: 11/09/2018] [Accepted: 11/29/2018] [Indexed: 05/22/2023]
Abstract
Ambient exposure to fine particulate matter (PM2.5) is known to harm public health in China. Satellite remote sensing measurements of aerosol optical depth (AOD) were statistically associated with in-situ observations after 2013 to predict PM2.5 concentrations nationwide, but the lack of surface monitoring data before 2013 has created difficulties for historical PM2.5 exposure estimates. Hindcast approaches using statistical models or chemical transport models (CTMs) were developed to overcome this limitation, but those approaches still suffer from incomplete daily coverage due to missing AOD data or from limited accuracy due to uncertainties of CTMs. Here we developed a new machine learning (ML) model with high-dimensional expansion (HD-expansion) of numerous predictors (including AOD and other satellite covariates, meteorological variables and CTM simulations). Through comprehensive characterization of the nonlinear effects of, and interactions among, different predictors, the HD-expansion parameterized the association between PM2.5 and AOD as a nonlinear function of space and time covariates (e.g., planetary boundary layer height and relative humidity). In this way, the PM2.5-AOD association can vary spatiotemporally. We trained the model with data from 2013 to 2016 and evaluated its performance using annually-iterated cross-validation, which iteratively held out the in-situ observations for a whole calendar year (as testing data) to examine the predictions from a model trained on the rest of the observations. Our estimates were found to be in good agreement with in-situ observations, with correlation coefficients (R2) of 0.61, 0.68, and 0.75 for daily, monthly and annual averages, respectively. To interpolate the missing predictions due to incomplete AOD data, we incorporated a generalized additive model into the ML model.
The two-stage estimates of PM2.5 sacrificed the prediction accuracy on a daily timescale (R2 = 0.55), but achieved complete spatiotemporal coverage and improved the accuracy of monthly (R2 = 0.71) and annual (R2 = 0.77) averages. The model was then used to predict daily PM2.5 concentrations during 2000-2016 across China and estimate long-term trends in PM2.5 for the period. We found that population-weighted concentrations of PM2.5 significantly increased, by 2.10 (95% confidence interval (CI): 1.74, 2.46) μg/m3/year during 2000-2007, and rapidly decreased by 4.51 (3.12, 5.90) μg/m3/year during 2013-2016. In this study, we produced AOD-based estimates of historical PM2.5 with complete spatiotemporal coverage, which were evidenced as accurate, particularly in middle and long term. The products could support large-scale epidemiological studies and risk assessments of ambient PM2.5 in China and can be accessed via the website (http://www.meicmodel.org/dataset-phd.html).
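The annually-iterated cross-validation described in this abstract holds out one whole calendar year at a time. The sketch below shows only that splitting scheme; the record layout, the function names, and the training-mean baseline standing in for the real model are hypothetical and are not taken from the paper.

```python
import math


def leave_one_year_out(records):
    """Yield (held_out_year, train, test) for each calendar year present.

    records: list of dicts with at least a 'year' key (assumed layout).
    """
    years = sorted({r["year"] for r in records})
    for year in years:
        train = [r for r in records if r["year"] != year]
        test = [r for r in records if r["year"] == year]
        yield year, train, test


def rmse_of_mean_baseline(train, test, target="pm25"):
    """RMSE on the held-out year of a placeholder model that always
    predicts the training-set mean (a stand-in, not the paper's model)."""
    mean = sum(r[target] for r in train) / len(train)
    squared_errors = [(r[target] - mean) ** 2 for r in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up observations for illustration only.
records = [
    {"year": 2013, "pm25": 60.0},
    {"year": 2013, "pm25": 64.0},
    {"year": 2014, "pm25": 55.0},
    {"year": 2015, "pm25": 50.0},
    {"year": 2016, "pm25": 45.0},
]
scores = {
    year: rmse_of_mean_baseline(train, test)
    for year, train, test in leave_one_year_out(records)
}
```

Holding out full years rather than random rows tests whether a model can predict a year with no in-situ data at all, which is the situation a hindcast to pre-2013 years faces.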
Affiliation(s)
- Tao Xue
- BIC-ESAT and SKL-ESPC, College of Environmental Science and Engineering, Peking University, Beijing 100871, China; Department of Earth System Science, Tsinghua University, Beijing 100084, China
- Yixuan Zheng
- Department of Earth System Science, Tsinghua University, Beijing 100084, China
- Dan Tong
- Department of Earth System Science, Tsinghua University, Beijing 100084, China
- Bo Zheng
- State Key Joint Laboratory of Environment Simulation and Pollution Control, School of Environment, Tsinghua University, Beijing 100084, China
- Xin Li
- Department of Earth System Science, Tsinghua University, Beijing 100084, China
- Tong Zhu
- BIC-ESAT and SKL-ESPC, College of Environmental Science and Engineering, Peking University, Beijing 100871, China
- Qiang Zhang
- Department of Earth System Science, Tsinghua University, Beijing 100084, China.
19
Bi J, Belle JH, Wang Y, Lyapustin AI, Wildani A, Liu Y. Impacts of snow and cloud covers on satellite-derived PM2.5 levels. REMOTE SENSING OF ENVIRONMENT 2019; 221:665-674. [PMID: 31359889 PMCID: PMC6662717 DOI: 10.1016/j.rse.2018.12.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 05/04/2023]
Abstract
Satellite aerosol optical depth (AOD) has been widely employed to evaluate ground fine particle (PM2.5) levels, but snow and cloud covers often lead to a large proportion of non-random missing AOD values. As a result, fully covered and unbiased PM2.5 estimates are hard to generate. Among the current approaches to the data gap issue, few have considered the cloud-AOD relationship and none have considered the snow-AOD relationship. This study examined the impacts of snow and cloud covers on AOD and PM2.5 and made full-coverage PM2.5 predictions by considering these impacts. To estimate missing AOD values, daily gap-filling models with snow/cloud fractions and meteorological covariates were developed using the random forest algorithm. By using these models in New York State, a daily AOD data set with a 1-km resolution and complete coverage was generated. The "out-of-bag" R2 of the gap-filling models averaged 0.93 with an interquartile range from 0.90 to 0.95. Subsequently, a random forest-based PM2.5 prediction model with the gap-filled AOD and covariates was built to produce fully covered PM2.5 estimates. A ten-fold cross-validation of the prediction model showed good performance with an R2 of 0.82. In the gap-filling models, the snow fraction was of higher significance in the snow season than in the rest of the year. The prediction models fitted with and without the snow fraction also showed discernible changes in PM2.5 patterns, further confirming the significance of this parameter. Compared with methods that do not consider snow and cloud covers, our PM2.5 prediction surfaces showed more spatial detail and reflected small-scale terrain-driven PM2.5 patterns. The proposed methods can be generalized to areas with extensive snow/cloud covers and large proportions of missing satellite AOD data for predicting PM2.5 levels with high resolution and complete coverage.
Affiliation(s)
- Jianzhao Bi
- Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, USA
- Jessica H. Belle
- Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, USA
- Yujie Wang
- Goddard Earth Sciences and Technology Center, University of Maryland Baltimore County, Baltimore, MD, USA
- NASA Goddard Space Flight Center, Greenbelt, MD, USA
- Alexei I. Lyapustin
- Goddard Earth Sciences and Technology Center, University of Maryland Baltimore County, Baltimore, MD, USA
- NASA Goddard Space Flight Center, Greenbelt, MD, USA
- Avani Wildani
- Department of Computer Science, Emory University, Atlanta, GA, USA
- Yang Liu
- Department of Environmental Health, Emory University, Rollins School of Public Health, Atlanta, GA, USA
20
Xie J, Wang X, Liu Y, Bai Y. Autoencoder-based deep belief regression network for air particulate matter concentration forecasting. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2018. [DOI: 10.3233/jifs-169527] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/15/2022]
Affiliation(s)
- Jingjing Xie
- School of Mechanical Engineering, Dongguan University of Technology, Dongguan, China
- Xiaoxue Wang
- Nan’an District Environmental Monitoring Station of Chongqing, Chongqing, China
- Yu Liu
- Institute of High Energy Physics, Chinese Academy of Sciences, Dongguan, China
- Yun Bai
- School of Mechanical Engineering, Dongguan University of Technology, Dongguan, China
21
A New MODIS C6 Dark Target and Deep Blue Merged Aerosol Product on a 3 km Spatial Grid. REMOTE SENSING 2018. [DOI: 10.3390/rs10030463] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 12/11/2022]
22
Scale- and Region-Dependence in Landscape-PM2.5 Correlation: Implications for Urban Planning. REMOTE SENSING 2017. [DOI: 10.3390/rs9090918] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Under rapid urbanization, many cities in China suffer from serious fine particulate matter (PM2.5) pollution. As emission sources or adsorption sinks, land use and the corresponding landscape pattern unavoidably affect the concentration. However, the correlation varies across regions and scales, leaving a significant gap for urban planning. This study clarifies the correlation with the aid of in situ and satellite-based spatial datasets over six urban agglomerations in China. Two coverage indices and four landscape indices are adopted to represent land use and landscape pattern. Specifically, the coverage indices are the area ratios of forest (F_PLAND) and built-up areas (C_PLAND). The landscape indices are the perimeter-area fractal dimension index (PAFRAC), the interspersion and juxtaposition index (IJI), the aggregation index (AI), and Shannon’s diversity index (SHDI). Then, the correlations between PM2.5 concentration and the selected indices are evaluated to support potential urban planning. Results show that the correlations are weak with the in situ PM2.5 concentration but significant with the regional value. This means that land use coverage and landscape pattern affect PM2.5 at a relatively large scale. Furthermore, regional PM2.5 concentration correlates negatively with F_PLAND and positively with C_PLAND (significant at p < 0.05), indicating that forest helps to improve air quality, while built-up areas worsen the pollution. Finally, landscape heterogeneity is positively correlated with the regional PM2.5 concentration in most regions, except for the highly urbanized agglomerations (i.e., the Jing-Jin-Ji and Chengdu-Chongqing urban agglomerations). This suggests that centralized urbanization would help control PM2.5 pollution by reducing emission sources in most regions. Based on the results, urban planning measures for controlling PM2.5 pollution are proposed for each urban agglomeration.
|
23
|
Hu X, Belle JH, Meng X, Wildani A, Waller LA, Strickland MJ, Liu Y. Estimating PM 2.5 Concentrations in the Conterminous United States Using the Random Forest Approach. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2017; 51:6936-6944. [PMID: 28534414 DOI: 10.1021/acs.est.7b01210] [Citation(s) in RCA: 207] [Impact Index Per Article: 29.6] [Indexed: 05/06/2023]
Abstract
To estimate PM2.5 concentrations, many parametric regression models have been developed, while nonparametric machine learning algorithms are used less often and national-scale models are rare. In this paper, we develop a random forest model incorporating aerosol optical depth (AOD) data, meteorological fields, and land use variables to estimate daily 24 h averaged ground-level PM2.5 concentrations over the conterminous United States in 2011. Random forests are an ensemble learning method that provides predictions with high accuracy and interpretability. Our results achieve an overall cross-validation (CV) R² value of 0.80. Mean prediction error (MPE) and root mean squared prediction error (RMSPE) for daily predictions are 1.78 and 2.83 μg/m³, respectively, indicating good agreement between CV predictions and observations. The prediction accuracy of our model is similar to that reported in previous studies using neural networks or regression models on both national and regional scales. In addition, the incorporation of convolutional layers for land use terms and nearby PM2.5 measurements increases CV R² by ∼0.02 and ∼0.06, respectively, indicating their significant contributions to prediction accuracy. Two different variable importance measures both indicate that the convolutional layer for nearby PM2.5 measurements and AOD values are among the most important predictor variables for the training process.
Affiliation(s)
- Matthew J Strickland
- School of Community Health Sciences, University of Nevada Reno, Reno, Nevada 89557, United States
|
24
|
Land Use Regression Modeling of PM2.5 Concentrations at Optimized Spatial Scales. ATMOSPHERE 2016. [DOI: 10.3390/atmos8010001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 11/16/2022]
|
25
|
Ding W, Zhang J, Leung Y. Prediction of air pollutant concentration based on sparse response back-propagation training feedforward neural networks. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2016; 23:19481-19494. [PMID: 27384165 DOI: 10.1007/s11356-016-7149-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 02/23/2016] [Accepted: 06/23/2016] [Indexed: 06/06/2023]
Abstract
In this paper, we predict air pollutant concentrations using a feedforward artificial neural network, inspired by the mechanism of the human brain, as a useful alternative to traditional statistical modeling techniques. The neural network is trained with sparse response back-propagation, in which only a small number of neurons respond to a given stimulus simultaneously; this provides a high convergence rate for the trained network, in addition to low energy consumption and greater generalization. Our method is evaluated on Hong Kong air monitoring station data and corresponding meteorological variables, with five air quality parameters gathered at four monitoring stations in Hong Kong over 4 years (2012-2015). Our results show that our training method offers advantages in prediction precision, effectiveness, and generalization over traditional linear regression algorithms and over a feedforward artificial neural network trained using traditional back-propagation.
Affiliation(s)
- Weifu Ding
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, China
- School of Mathematics and Information, BeiFang University for Minority, Yinchuan, 750021, China
- Jiangshe Zhang
- School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, China
- Yee Leung
- Institute of Future Cities, Chinese University of Hong Kong, Shatin, Hong Kong
|
26
|
Effect of Land Use and Cover Change on Air Quality in Urban Sprawl. SUSTAINABILITY 2016. [DOI: 10.3390/su8070677] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 11/16/2022]
|