Kim J, Wang X, Kang C, Yu J, Li P. Forecasting air pollutant concentration using a novel spatiotemporal deep learning model based on clustering, feature selection and empirical wavelet transform. Sci Total Environ 2021;801:149654. [PMID: 34416605 DOI: 10.1016/j.scitotenv.2021.149654]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/25/2021] [Revised: 07/30/2021] [Accepted: 08/10/2021] [Indexed: 06/13/2023]
Abstract
Accurate forecasting of air pollutant concentration is an essential part of early warning systems. However, it remains a challenge because information on emission sources is limited and the dynamic processes involved are highly uncertain. To improve the accuracy of air pollutant concentration forecasts, this study proposes a novel hybrid model that combines clustering, feature selection, real-time decomposition by empirical wavelet transform, and a deep learning neural network. First, all air pollutant time series are decomposed by empirical wavelet transform based on real-time decomposition, and subsets of output data are constructed by combining the corresponding decomposed components. Second, each subset of output data is grouped into several clusters by a clustering algorithm, and appropriate inputs are then selected by a feature selection method. Third, a deep learning-based predictor, which uses a three-dimensional convolutional neural network and a bidirectional long short-term memory neural network, is applied to predict the decomposition components of each cluster. Last, the air pollutant concentration forecast for each monitoring station is obtained by reconstructing the predicted values of all the decomposition components. PM2.5 concentration data from Beijing, China, are used to validate and test the model. Results show that the proposed model outperforms the other models used in this study. The mean absolute percentage error of the proposed model for 1, 6, and 10 h ahead PM2.5 concentration prediction is 4.03%, 6.87%, and 8.98%, respectively. These outcomes demonstrate that the proposed hybrid model is a powerful tool for providing highly accurate forecasts of air pollutant concentration.
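The abstract reports accuracy as mean absolute percentage error (MAPE). As a reading aid, the following is a minimal sketch of the standard MAPE definition, not the authors' evaluation code. The function name `mape` and the sample values are hypothetical and chosen only for illustration; they are not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent (standard definition).

    Assumption: `actual` and `predicted` are equal-length sequences of
    numbers, and no observed value is zero (MAPE is undefined there).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    # Average the relative absolute errors, then express as a percentage.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical observed and forecast concentrations (illustrative only).
    observed = [100.0, 200.0]
    forecast = [90.0, 220.0]
    print(f"MAPE: {mape(observed, forecast):.2f}%")
```

Because each error is divided by the observed value, MAPE weights errors at low concentrations more heavily than the same absolute error at high concentrations, which is worth keeping in mind when comparing the reported percentages across forecast horizons.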