1
Natarajan SK, Shanmurthy P, Arockiam D, Balusamy B, Selvarajan S. Optimized machine learning model for air quality index prediction in major cities in India. Sci Rep 2024; 14:6795. [PMID: 38514669 PMCID: PMC10958024 DOI: 10.1038/s41598-024-54807-1] [Received: 01/03/2024] [Accepted: 02/16/2024] [Indexed: 03/23/2024] [Open Access]
Abstract
Industrial advancement, heavy use of fossil fuels, vehicle pollution, and other calamities drastically increase the Air Quality Index (AQI) of major cities. Analysis of AQI in major cities is essential so that the government can take proper preventive and proactive measures to reduce air pollution. This research incorporates artificial intelligence in AQI prediction based on air pollution data. An optimized machine learning model is proposed that combines Grey Wolf Optimization (GWO) with the Decision Tree (DT) algorithm for accurate prediction of AQI in major cities of India. Air quality data available in the Kaggle repository is used for experimentation, and major cities like Delhi, Hyderabad, Kolkata, Bangalore, Visakhapatnam, and Chennai are considered for analysis. The performance of the proposed model is experimentally verified through metrics like R-Square, RMSE, MSE, MAE, and accuracy. Existing machine learning models, like k-Nearest Neighbor, Random Forest regressor, and Support Vector regressor, are compared with the proposed model. The proposed model attains better prediction performance than traditional machine learning algorithms, with maximum accuracy of 88.98% for New Delhi, 91.49% for Bangalore, 94.48% for Kolkata, 97.66% for Hyderabad, 95.22% for Chennai, and 97.68% for Visakhapatnam.
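For readers unfamiliar with the error metrics named in this abstract, the following is a minimal sketch of their standard definitions (MSE, RMSE, MAE, and R-Square). The function name `regression_metrics` and the sample values are hypothetical and chosen only for illustration; this is not the authors' code, and the abstract does not define how its "accuracy" figure is computed, so that metric is left out.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R-squared for paired observations.

    Hypothetical helper for illustration; uses the textbook definitions.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n       # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error
    mae = sum(abs(e) for e in errors) / n      # mean absolute error

    # R-squared: 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Made-up AQI values, used only to exercise the function.
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]
metrics = regression_metrics(observed, predicted)
```

RMSE and MAE are in the same units as the AQI itself, whereas R-Square is unitless, which is why studies such as this one usually report them together.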
Affiliation(s)
- Suresh Kumar Natarajan
  - School of Computer Science and Engineering, Jain (Deemed-to-be University), Bengaluru, Karnataka, India
- Prakash Shanmurthy
  - School of Computer Science and Engineering and Information Science, Presidency University, Bengaluru, Karnataka, India
- Shitharth Selvarajan
  - School of Built Environment, Engineering and Computing, Leeds Beckett University, Leeds, LS1 3HE, UK
2
El Mghouchi Y, Udristioiu MT, Yildizhan H. Multivariable Air-Quality Prediction and Modelling via Hybrid Machine Learning: A Case Study for Craiova, Romania. Sensors (Basel, Switzerland) 2024; 24:1532. [PMID: 38475068 DOI: 10.3390/s24051532] [Received: 01/26/2024] [Revised: 02/22/2024] [Accepted: 02/26/2024] [Indexed: 03/14/2024]
Abstract
Inadequate air quality has adverse impacts on human well-being and contributes to the progression of climate change, leading to fluctuations in temperature. Therefore, gaining a localized comprehension of the interplay between climate variations and air pollution holds great significance in alleviating the health repercussions of air pollution. This study uses a holistic approach to air quality prediction and multivariate modelling. It investigates the associations between meteorological factors (temperature, relative humidity, and air pressure) and three particulate matter concentrations (PM10, PM2.5, and PM1), as well as the correlation between PM concentrations and noise levels, volatile organic compounds, and carbon dioxide emissions. Five hybrid machine learning models were employed to predict PM concentrations and then the Air Quality Index (AQI). Twelve PM sensors evenly distributed in Craiova City, Romania, provided the dataset for five months (22 September 2021 to 17 February 2022). The sensors transmitted data each minute. The prediction accuracy of the models was evaluated, and the results revealed that, in general, the coefficient of determination (R²) values exceeded 0.96 (interval of confidence is 0.95) and, in most instances, approached 0.99. Relative humidity emerged as the least influential variable on PM concentrations, while the most accurate predictions were achieved by combining pressure with temperature. PM10 (less than 10 µm in diameter) concentrations exhibited a notable correlation with PM2.5 (less than 2.5 µm in diameter) concentrations and a moderate correlation with PM1 (less than 1 µm in diameter). Nevertheless, other findings indicated that PM concentrations were not strongly related to noise, CO2, and VOC, and these last variables should be combined with another meteorological variable to enhance the prediction accuracy. Ultimately, this study established novel relationships for predicting PM concentrations and AQI based on the most effective combinations of predictor variables identified.
Affiliation(s)
- Youness El Mghouchi
  - Department of Energetics, ENSAM, Moulay Ismail University, Meknes 50050, Morocco
- Mihaela Tinca Udristioiu
  - Department of Physics, Faculty of Science, University of Craiova, 13 A.I. Cuza Street, 200585 Craiova, Romania
- Hasan Yildizhan
  - Engineering Faculty, Energy Systems Engineering, Adana Alparslan Türkeş Science and Technology University, Adana 46278, Turkey
3
Effrosynidis D, Spiliotis E, Sylaios G, Arampatzis A. Time series and regression methods for univariate environmental forecasting: An empirical evaluation. The Science of the Total Environment 2023; 875:162580. [PMID: 36906023 DOI: 10.1016/j.scitotenv.2023.162580] [Received: 11/20/2022] [Revised: 02/17/2023] [Accepted: 02/27/2023] [Indexed: 06/18/2023]
Abstract
One of the most common and valuable applications of science to the environment is to forecast the future, as it affects human lives in many aspects. However, it is not yet clear which methods (conventional time series or regression) deliver the highest performance in univariate time series forecasting. This study attempts to answer that question with a large-scale comparative evaluation that includes 68 environmental variables over three frequencies (hourly, daily, monthly), forecasted one to twelve steps into the future, and evaluated over six statistical time series methods and fourteen regression methods. Results suggest that the strongest representatives of the time series methods (ARIMA, Theta) exhibit high accuracies, but certain regression methods (Huber, Extra Trees, Random Forest, Light Gradient Boosting Machines, Gradient Boosting Machines, Ridge, Bayesian Ridge) deliver even more promising results for all forecasting horizons. Finally, the method should be chosen to suit the specific use case, as certain methods are more appropriate for particular frequencies and some offer an advantageous trade-off between computational time and performance.
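Applying a regression method to a univariate series requires first recasting the series as a supervised learning problem. The abstract does not say how the study did this; the sketch below shows one common approach, assumed here for illustration: past values in a sliding window become the features, and the value a fixed number of steps ahead becomes the target. The function name `make_supervised` and its parameters are hypothetical.

```python
def make_supervised(series, n_lags, horizon):
    """Turn a univariate series into (lag window, target) training pairs.

    Each feature row holds `n_lags` consecutive past values; the target is
    the value `horizon` steps after the last value in that window.
    Hypothetical helper; one common framing, not the study's own code.
    """
    if n_lags < 1 or horizon < 1:
        raise ValueError("n_lags and horizon must be positive integers")

    features, targets = [], []
    # `end` is the index just past the last lag value in the window.
    for end in range(n_lags, len(series) - horizon + 1):
        features.append(list(series[end - n_lags:end]))
        targets.append(series[end + horizon - 1])
    return features, targets


# Toy series: three lags, predicting two steps ahead.
X, y = make_supervised([1, 2, 3, 4, 5, 6], n_lags=3, horizon=2)
```

Any regressor can then be fitted on the resulting pairs. Building a separate set of pairs per horizon (one to twelve steps, for example) is one way to produce multi-step forecasts; recursive one-step forecasting is another.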
Affiliation(s)
- Dimitrios Effrosynidis
  - Database & Information Retrieval Research Unit, Department of Electrical & Computer Engineering, Democritus University of Thrace, Xanthi 67100, Greece
- Evangelos Spiliotis
  - Forecasting and Strategy Unit, School of Electrical and Computer Engineering, National Technical University of Athens, Athens, Greece
- Georgios Sylaios
  - Lab of Ecological Engineering & Technology, Department of Environmental Engineering, Democritus University of Thrace, Xanthi 67100, Greece
- Avi Arampatzis
  - Database & Information Retrieval Research Unit, Department of Electrical & Computer Engineering, Democritus University of Thrace, Xanthi 67100, Greece
4
Rakholia R, Le Q, Quoc Ho B, Vu K, Simon Carbajo R. Multi-output machine learning model for regional air pollution forecasting in Ho Chi Minh City, Vietnam. Environment International 2023; 173:107848. [PMID: 36842381 DOI: 10.1016/j.envint.2023.107848] [Received: 12/05/2022] [Revised: 01/31/2023] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
Air pollution concentrations in Ho Chi Minh City (HCMC) have been found to surpass the WHO standard, which has become a very serious problem affecting human health and the ecosystem. Various machine learning algorithms have recently been widely used in air quality forecasting studies to predict possible impacts. Training and constructing separate machine learning models to forecast different air pollutants, such as NO2, SO2, O3, and CO, is a time-consuming process that necessitates additional effort for deployment, maintenance, and monitoring. In this paper, an effort has been made to develop a multi-step multi-output multivariate model (a global model) for air quality forecasting, taking into account various parameters such as meteorological conditions, air quality data from urban traffic, residential, and industrial areas, urban space information, and a time component for the prediction of hourly (1 h to 24 h) NO2, SO2, O3, and CO concentrations. The global forecasting model can anticipate multiple air pollutant concentrations concurrently, based on past concentrations and covariate characteristics. The datasets on air pollution time series were gathered from six HealthyAir air quality monitoring sites in HCMC between February 2021 and August 2022. Darksky weather provided the hourly meteorological data for the same period. This is the first model built using real-time air quality data for NO2, SO2, CO, and O3 forecasting in HCMC. To assess the effectiveness of the proposed model, it was evaluated using real data from HealthyAir stations and quantified using Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and correlation indices. The results show that the global air quality forecasting model outperforms earlier models built for air quality forecasting of each specific pollutant in HCMC.
Affiliation(s)
- Rajnish Rakholia
  - Ireland's National Centre for Applied Artificial Intelligence (CeADAR), University College Dublin, NexusUCD, Belfield Office Park, Dublin, Ireland
- Quan Le
  - Ireland's National Centre for Applied Artificial Intelligence (CeADAR), University College Dublin, NexusUCD, Belfield Office Park, Dublin, Ireland
- Bang Quoc Ho
  - Institute for Environment and Resources (IER), Ho Chi Minh City 700000, Vietnam; Department of Science and Technology, Vietnam National University, Ho Chi Minh City 700000, Vietnam
- Khue Vu
  - Institute for Environment and Resources (IER), Ho Chi Minh City 700000, Vietnam
- Ricardo Simon Carbajo
  - Ireland's National Centre for Applied Artificial Intelligence (CeADAR), University College Dublin, NexusUCD, Belfield Office Park, Dublin, Ireland
5
Feng H, Zhang X. A novel encoder-decoder model based on Autoformer for air quality index prediction. PLoS One 2023; 18:e0284293. [PMID: 37053153 PMCID: PMC10101400 DOI: 10.1371/journal.pone.0284293] [Received: 01/13/2023] [Accepted: 03/28/2023] [Indexed: 04/14/2023] [Open Access]
Abstract
Rapid economic development has led to increasingly serious air quality problems. Accurate air quality prediction can provide technical support for air pollution prevention and treatment. In this paper, we propose a novel encoder-decoder model named Enhanced Autoformer (EnAutoformer) to improve air quality index (AQI) prediction. In this model, (a) the enhanced cross-correlation (ECC) is proposed for extracting the temporal dependencies in AQI time series; (b) by combining the ECC with the cross-stage feature fusion mechanism of CSPDenseNet, the core module CSP_ECC is proposed for improving the computational efficiency of the EnAutoformer; and (c) the time series decomposition and dilated causal convolution added in the decoder module are exploited to extract finer-grained features from the original AQI data and improve the performance of the proposed model for long-term prediction. Real-world air quality datasets collected from Lanzhou are used to validate the performance of our prediction model. The experimental results show that our EnAutoformer model can greatly improve the prediction accuracy compared to the baselines and can be used as a promising alternative for complex air quality prediction.
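The time series decomposition mentioned in point (c) is typically done, in Autoformer-style models, by taking a moving average as the trend and treating what is left as the seasonal remainder. The following is a minimal sketch of that idea on a plain list, assuming an odd window size and padding by repeating the first and last values; the function name `decompose` is hypothetical, and this is not the EnAutoformer implementation.

```python
def decompose(series, kernel_size):
    """Split a series into a moving-average trend and the remainder.

    Assumes an odd `kernel_size` so the window is centred, and pads both
    ends by repeating the edge values so the trend keeps the input length.
    Hypothetical helper for illustration only.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    if len(series) == 0:
        return [], []

    half = kernel_size // 2
    padded = [series[0]] * half + list(series) + [series[-1]] * half

    # Centred moving average over the padded series gives the trend.
    trend = [
        sum(padded[i:i + kernel_size]) / kernel_size
        for i in range(len(series))
    ]
    # Whatever the trend does not explain is the seasonal remainder.
    remainder = [x - t for x, t in zip(series, trend)]
    return trend, remainder


trend, remainder = decompose([1.0, 2.0, 3.0, 4.0, 5.0], kernel_size=3)
```

By construction, adding the trend and the remainder back together recovers the original series, so the split loses no information.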
Affiliation(s)
- Huifang Feng
  - College of Mathematics and Statistics, Northwest Normal University, Lanzhou, China
- Xianghong Zhang
  - College of Mathematics and Statistics, Northwest Normal University, Lanzhou, China
6
Dual-channel spatial–temporal difference graph neural network for PM2.5 forecasting. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-08036-0] [Indexed: 11/26/2022]