1
Pal Shuvo S, MdAshikuzzaman J, Pal Shibazee S, Paul G, Banerjee P, Mashfiq Fahmid K, Rahman A. Hybrid noise reduction-based data-driven modeling of relative humidity in Khulna, Bangladesh. Heliyon 2024; 10:e36290. [PMID: 39253257 PMCID: PMC11381818 DOI: 10.1016/j.heliyon.2024.e36290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2024] [Revised: 08/12/2024] [Accepted: 08/13/2024] [Indexed: 09/11/2024]
Abstract
In this study, a hybrid Machine Learning (ML) approach is proposed for Relative Humidity (RH) prediction that combines Empirical Mode Decomposition (EMD) with a Support Vector Machine (SVM), with the aim of improving prediction accuracy over the traditional SVM-only technique. The main objective of this hybrid technique is to deal with the extremely nonlinear and noisy humidity pattern in Khulna, Bangladesh, which is experiencing rapid urbanization and environmental change. To develop the model, data on temperature, relative humidity, rainfall, and wind speed were collected from the Bangladesh Meteorological Department (BMD), and the data were divided into three phases: 70% of the historical dataset for training, 15% for validation, and the remaining 15% for testing. The Particle Swarm Optimization (PSO) algorithm is used to determine the best SVM hyperparameters. Performance is analyzed using the Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Coefficient of Determination (R²). Results show that the increase in R² values resulting from the EMD-based approach is significant: 21.05% in H1 (Traditional model), 19.48% in H2 (Traditional model), 76.92% in H3 (Traditional model), 55.93% in H4 (Traditional model), and 64.29% in H5 (Traditional model) and H6 (Traditional model). The analytical results show that the proposed EMD-based technique efficiently filters and processes noisy, highly nonlinear humidity data during prediction in the Khulna region. It is recommended that this technique be applied to other geological areas.
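The 70/15/15 partition and the five error measures named in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the abstract does not say whether the split is chronological (assumed here, as is common for time series), and the function names `chronological_split` and `regression_metrics` are hypothetical, not taken from the paper.

```python
import math


def chronological_split(series, train_frac=0.70, val_frac=0.15):
    """Split a time-ordered sequence into train/validation/test parts.

    Assumption: no shuffling, so the test part is the most recent data.
    """
    n = len(series)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]  # whatever remains (about 15%)
    return train, val, test


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE, MAPE (%) and R2 for paired sequences.

    MAPE divides by the observed values, so they must be non-zero.
    """
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }
```

A model would be fitted on the first part, tuned on the second, and scored once on the third with `regression_metrics`.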
Affiliation(s)
- Shuvendu Pal Shuvo
  - Department of Civil Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh
- Joarder MdAshikuzzaman
  - Department of Civil Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh
- Shirshendu Pal Shibazee
  - Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka, Bangladesh
- Goutam Paul
  - Department of Pharmacy, Bangabandhu Sheikh Mujibur Rahman Science and Technology University, Gopalganj, Bangladesh
- Pritam Banerjee
  - Department of Civil Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh
- Kazi Mashfiq Fahmid
  - Department of Civil Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh
- Ashiqur Rahman
  - Department of Civil Engineering, Khulna University of Engineering & Technology, Khulna, Bangladesh
2
Kamalov F, Sulieman H, Moussa S, Reyes JA, Safaraliev M. Nested ensemble selection: An effective hybrid feature selection method. Heliyon 2023; 9:e19686. [PMID: 37809839 PMCID: PMC10558945 DOI: 10.1016/j.heliyon.2023.e19686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 08/29/2023] [Accepted: 08/30/2023] [Indexed: 10/10/2023]
Abstract
It has been shown that while feature selection algorithms are able to distinguish between relevant and irrelevant features, they fail to differentiate between relevant features and redundant or correlated ones. To address this issue, we propose a highly effective approach, called Nested Ensemble Selection (NES), that is based on a combination of filter and wrapper methods. The proposed feature selection algorithm differs from the existing filter-wrapper hybrid methods in its simplicity and efficiency as well as precision. The new algorithm is able to separate the relevant variables from the irrelevant as well as the redundant and correlated features. Furthermore, we provide a robust heuristic for identifying the optimal number of selected features, which remains one of the greatest challenges in feature selection. Numerical experiments on synthetic and real-life data demonstrate the effectiveness of the proposed method. The NES algorithm achieves perfect precision on the synthetic data and near-optimal accuracy on the real-life data. The proposed method is compared against several popular algorithms including mRMR, Boruta, genetic, recursive feature elimination, Lasso, and Elastic Net. The results show that NES significantly outperforms the benchmark algorithms, especially on multi-class datasets.
Affiliation(s)
- Firuz Kamalov
  - Department of Electrical Engineering, Canadian University Dubai, Dubai, United Arab Emirates
- Hana Sulieman
  - Department of Mathematics and Statistics, American University of Sharjah, Sharjah, United Arab Emirates
- Sherif Moussa
  - Department of Electrical Engineering, Canadian University Dubai, Dubai, United Arab Emirates
- Jorge Avante Reyes
  - Department of Electrical Engineering, Canadian University Dubai, Dubai, United Arab Emirates
- Murodbek Safaraliev
  - Department of Automated Electrical Systems, Ural Federal University, Yekaterinburg, Russian Federation
3
Wang S, Ren Y, Xia B. Estimation of urban AQI based on interpretable machine learning. Environmental Science and Pollution Research International 2023; 30:96562-96574. [PMID: 37580474 DOI: 10.1007/s11356-023-29336-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Accepted: 08/10/2023] [Indexed: 08/16/2023]
Abstract
Air pollution is an increasingly serious problem. Accurate and efficient prediction of air quality can effectively prevent air pollution and improve the quality of human life. The air quality index (AQI) is a dimensionless tool to describe air quality quantitatively. In this study, machine learning (ML) methods were used to estimate the AQI, with Shijiazhuang, China, as the research object and pollutants and meteorological factors as model inputs. Specifically, eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and Random Forest (RF) models were used. The experimental results show that the XGBoost model captures the AQI variation trend well, and the R² of the XGBoost model is 0.929, which is 0.3% and 2.3% higher than the R² of the RF model and the LightGBM model, respectively. In addition, through the SHAP-based model interpretation method, the study reveals that the key factors of AQI variation, namely PM2.5 and PM10, play positive roles in the variation of AQI, and that AQI is less sensitive to meteorological factors. Finally, Beijing, Shanghai, Xi'an, and Guangzhou were selected to test the model's validity, and the model performance remained good. Our study shows that applying the ML approach to air quality prediction is beneficial for efficiently assessing cities' future air quality.
Affiliation(s)
- Siyuan Wang
  - School of Mathematics and Computer Science, Yan'an University, Yan'an, 716000, China
- Ying Ren
  - School of Mathematics and Computer Science, Yan'an University, Yan'an, 716000, China
- Bisheng Xia
  - School of Mathematics and Computer Science, Yan'an University, Yan'an, 716000, China
4
Merabet K, Heddam S. Improving the accuracy of air relative humidity prediction using hybrid machine learning based on empirical mode decomposition: a comparative study. Environmental Science and Pollution Research International 2023; 30:60868-60889. [PMID: 37041358 DOI: 10.1007/s11356-023-26779-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 03/29/2023] [Indexed: 05/10/2023]
Abstract
This paper proposes a hybrid air relative humidity prediction based on preprocessing signal decomposition. A new modelling strategy was introduced based on the use of the empirical mode decomposition, variational mode decomposition, and the empirical wavelet transform, combined with standalone machine learning models to increase their numerical performances. First, standalone models, i.e., extreme learning machine, multilayer perceptron neural network, and random forest regression, were used for predicting daily air relative humidity using various daily meteorological variables, i.e., maximal and minimal air temperatures, precipitation, solar radiation, and wind speed, measured at two meteorological stations located in Algeria. Second, meteorological variables are decomposed into several intrinsic mode functions and presented as new input variables to the hybrid models. The comparison between the models was achieved based on numerical and graphical indices, and the obtained results demonstrate the superiority of the proposed hybrid models compared to the standalone models. Further analysis revealed that using standalone models, the best performances are obtained using the multilayer perceptron neural network with Pearson correlation coefficient, Nash-Sutcliffe efficiency, root-mean-square error, and mean absolute error of approximately 0.939, 0.882, 7.44, and 5.62 at Constantine station, and 0.943, 0.887, 7.72, and 5.93 at Sétif station, respectively. The hybrid models based on the empirical wavelet transform decomposition exhibited high performances with Pearson correlation coefficient, Nash-Sutcliffe efficiency, root-mean-square error, and mean absolute error of approximately 0.950, 0.902, 6.79, and 5.24 at Constantine station, and 0.955, 0.912, 6.82, and 5.29 at Sétif station. Finally, we show that the new hybrid approaches delivered high predictive accuracies of air relative humidity, and it was concluded that the contribution of the signal decomposition was demonstrated and justified.
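The second step described in this abstract, decomposing each meteorological input and presenting the components as new input variables, can be illustrated with a small data-flow sketch. A centered moving-average trend/residual split is used here purely as a stand-in: it is not EMD, VMD, or the empirical wavelet transform, and the function names and the window length are hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def decompose(series, window=5):
    """Stand-in decomposition: slow trend plus fast residual.

    By construction, trend[i] + detail[i] equals series[i].
    """
    trend = moving_average(series, window)
    detail = [x - t for x, t in zip(series, trend)]
    return trend, detail


def expand_inputs(variables, window=5):
    """Replace each input variable by its components as new columns."""
    expanded = {}
    for name, series in variables.items():
        trend, detail = decompose(series, window)
        expanded[name + "_trend"] = trend
        expanded[name + "_detail"] = detail
    return expanded
```

With a real decomposition method each variable would yield several components rather than two, but the expanded table would be passed to the regression model in the same way.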
Affiliation(s)
- Khaled Merabet
  - Laboratory of Optimizing Agricultural Production in Subhumid Zones (LOPAZS), Faculty of Science, Agronomy Department, University 20 Août 1955-Skikda, Route El Hadaik, BP 26, Skikda, Algeria
- Salim Heddam
  - Laboratory of Research in Biodiversity Interaction Ecosystem and Biotechnology (LRIBEB), Faculty of Science, Agronomy Department, University 20 Août 1955-Skikda, Route El Hadaik, BP 26, Skikda, Algeria
5
Nguyen DK, Nguyen TP, Ngamkhanong C, Keawsawasvong S, Lai VQ. Bearing capacity of ring footings in anisotropic clays: FELA and ANN. Neural Comput Appl 2023. [DOI: 10.1007/s00521-023-08278-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2023]
6
Ensemble feature selection for multi‐label text classification: An intelligent order statistics approach. Int J Intell Syst 2022. [DOI: 10.1002/int.23044] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/10/2023]
7
Tao H, Salih S, Oudah AY, Abba SI, Ameen AMS, Awadh SM, Alawi OA, Mostafa RR, Surendran UP, Yaseen ZM. Development of new computational machine learning models for longitudinal dispersion coefficient determination: case study of natural streams, United States. Environmental Science and Pollution Research International 2022; 29:35841-35861. [PMID: 35061183 DOI: 10.1007/s11356-022-18554-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/25/2021] [Accepted: 01/04/2022] [Indexed: 06/14/2023]
Abstract
The longitudinal dispersion coefficient (Kx) of natural streams is an essential indicator of pollutant transport, and its determination is very important. Kx is influenced by several parameters, including river hydraulic geometry, sediment properties, and other morphological characteristics, and thus its calculation is a highly complex engineering problem. In this research, three relatively explored machine learning (ML) models, including Random Forest (RF), Gradient Boosting Decision Tree (GTB), and XGboost-Grid, were proposed for the Kx determination. The modeling scheme for building the prediction matrix was adopted from the well-established literature. Several input combinations were tested for better predictability performance for the Kx. The modeling performance was tested based on the data division for training and testing (70-30% and 80-20%). Based on the attained modeling results, XGboost-Grid reported the best prediction results over the training and testing phases compared to the RF and GTB models. The development of the newly established machine learning model revealed an excellent computer-aided technology for the Kx simulation.
Affiliation(s)
- Hai Tao
  - School of Electronics and Information Engineering, Ankang University, Ankang, China
  - School of Computer Sciences, Baoji University of Arts and Sciences, Shaanxi, China
  - Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Universiti Teknologi MARA, Shah Alam, Selangor, Malaysia
- Sinan Salih
  - Computer Science Department, Dijlah University College, Al-Dora, Baghdad, Iraq
  - Artificial Intelligence Research Unit (AIRU), Dijlah University College, Al-Dora, Baghdad, Iraq
- Atheer Y Oudah
  - Department of Computer Sciences, College of Education for Pure Science, University of Thi-Qar, Thi-Qar, Iraq
  - Scientific Research Center, Al-Ayen University, Thi-Qar, 64001, Iraq
- S I Abba
  - Interdisciplinary Research Center for Membrane and Water Security, King Fahd University of Petroleum and Minerals, Dhahran, 31261, Saudi Arabia
  - Faculty of Engineering, Department of Civil Engineering, Baze University, Abuja, Nigeria
- Omer A Alawi
  - Department of Thermofluids, School of Mechanical Engineering, Universiti Teknologi Malaysia, 81310 UTM, Skudai, Johor Bahru, Malaysia
- Reham R Mostafa
  - Information Systems Department, Faculty of Computers and Information Sciences, Mansoura University, Mansoura, 35516, Egypt
- Udayar Pillai Surendran
  - Land and Water Management Research Group, Centre for Water Resources Development and Management (CWRDM), Kozhikode, Kerala, India
- Zaher Mundher Yaseen
  - Department of Urban Planning, Engineering Networks and Systems, Institute of Architecture and Construction, South Ural State University, 76, Lenin Prospect, 454080, Chelyabinsk, Russia
  - New era and development in civil engineering research group, Scientific Research Center, Al-Ayen University, Thi-Qar, Nasiriyah, 64001, Iraq
  - College of Creative Design, Asia University, Taichung City, Taiwan
8
Bhagat SK, Tiyasha T, Kumar A, Malik T, Jawad AH, Khedher KM, Deo RC, Yaseen ZM. Integrative artificial intelligence models for Australian coastal sediment lead prediction: An investigation of in-situ measurements and meteorological parameters effects. Journal of Environmental Management 2022; 309:114711. [PMID: 35182982 DOI: 10.1016/j.jenvman.2022.114711] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/13/2021] [Revised: 01/17/2022] [Accepted: 02/09/2022] [Indexed: 06/14/2023]
Abstract
Heavy metals (HMs) such as Lead (Pb) have played a vital role in increasing the sediments of the Australian bay's ecosystem. Several meteorological parameters, i.e., minimum, maximum and average temperature (Tmin, Tmax and Tavg, °C) and rainfall (Rn, mm), and their interactions with the other batch HMs are hypothesized to have a high impact on decision-making strategies to minimize the impacts of Pb. Three feature selection (FS) algorithms, namely the Boruta method, genetic algorithm (GA) and extreme gradient boosting (XGBoost), were investigated to select the highly important predictors for Pb concentration in the coastal bay sediments of Australia. These FS algorithms were statistically evaluated using a principal component analysis (PCA) Biplot along with the correlation metrics describing the statistical characteristics that exist in the input and output parameter space of the models. To ensure a high accuracy attained by the applied predictive artificial intelligence (AI) models, i.e., XGBoost, support vector machine (SVM) and random forest (RF), an auto-hyper-parameter tuning process using a Grid-search approach was also implemented. Cu, Ni, Ce, and Fe were selected by all three applied FS algorithms, whereas the Tavg and Rn inputs remained the essential parameters identified by GA and Boruta. The order of the FS outcome was XGBoost > GA > Boruta based on the applied statistical examination and the PCA Biplot results, and the order of applied AI predictive models was XGBoost-SVM > GA-SVM > Boruta-SVM, where the SVM model remained at the top performance among the other statistical metrics. Based on the Taylor diagram for model evaluation, the RF model was reflected only marginally different, so overall, the proposed integrative AI model provided evidence of a robust and reliable predictive technique for coastal sediment Pb prediction.
Affiliation(s)
- Suraj Kumar Bhagat
  - Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
- Tiyasha Tiyasha
  - Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Viet Nam
- Adarsh Kumar
  - Institute of Natural Sciences and Mathematics, Ural Federal University, Ekaterinburg, 620002, Russia
- Tabarak Malik
  - Department of Biochemistry, College of Medicine & Health Sciences, School of Medicine, University of Gondar, Ethiopia
- Ali H Jawad
  - Faculty of Applied Sciences, Universiti Teknologi MARA, 40450, Shah Alam, Selangor, Malaysia
- Khaled Mohamed Khedher
  - Department of Civil Engineering, College of Engineering, King Khalid University, Abha 61421, Saudi Arabia
  - Department of Civil Engineering, High Institute of Technological Studies, Mrezgua University Campus, Nabeul, 8000, Tunisia
- Ravinesh C Deo
  - School of Mathematics, Physics and Computing, University of Southern Queensland, Springfield, QLD, 4300, Australia
- Zaher Mundher Yaseen
  - Adjunct Research Fellow, USQ's Advanced Data Analytics Research Group, School of Mathematics Physics and Computing, University of Southern Queensland, QLD 4350, Australia
  - Department of Urban Planning, Engineering Networks and Systems, Institute of Architecture and Construction, South Ural State University, 76, Lenin Prospect, 454080 Chelyabinsk, Russia
  - College of Creative Design, Asia University, Taichung City, Taiwan
  - New Era and Development in Civil Engineering Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, 64001, Iraq
  - Institute for Big Data Analytics and Artificial Intelligence (IBDAAI), Kompleks Al-Khawarizmi, Universiti Teknologi MARA, Shah Alam, 40450 Selangor, Malaysia
9
Shad M, Sharma YD, Singh A. Forecasting of monthly relative humidity in Delhi, India, using SARIMA and ANN models. Modeling Earth Systems and Environment 2022; 8:4843-4851. [PMID: 35434264 PMCID: PMC8998166 DOI: 10.1007/s40808-022-01385-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/30/2021] [Accepted: 03/20/2022] [Indexed: 06/14/2023]
Abstract
Relative humidity plays an important role in climate change and global warming, making it a research area of greater concern in recent decades. The present study attempted to implement seasonal autoregressive integrated moving average (SARIMA) and artificial neural network (ANN) with multilayer perceptron (MLP) models to forecast the monthly relative humidity in Delhi, India during 2017-2025. The average monthly relative humidity data for the period 2000-2016 have been used to carry out the objectives of the proposed study. The forecast trend in relative humidity declines from 2017 to 2025. The accuracy of the models has been measured using root mean squared error (RMSE) and mean absolute error (MAE). The results showed that the SARIMA model provides the forecasted relative humidity with an RMSE of 6.04 and an MAE of 4.56. On the other hand, the MLP model reported the forecasted relative humidity with an RMSE of 4.65 and an MAE of 3.42. This study concluded that the ANN model was more reliable for predicting relative humidity than the SARIMA model. Supplementary information: The online version contains supplementary material available at 10.1007/s40808-022-01385-8.
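The gap between the two models' reported errors can be expressed as a relative reduction. The snippet below is a small arithmetic sketch using only the four figures quoted in this abstract; the helper name `relative_reduction` is hypothetical, and the percentages it prints are derived here rather than reported by the authors.

```python
def relative_reduction(baseline, candidate):
    """Percentage by which the candidate error is below the baseline error."""
    return 100.0 * (baseline - candidate) / baseline


# (SARIMA error, MLP error) as quoted in the abstract above
reported = {"RMSE": (6.04, 4.65), "MAE": (4.56, 3.42)}

for name, (sarima, mlp) in reported.items():
    reduction = relative_reduction(sarima, mlp)
    print(f"{name}: MLP error is {reduction:.1f}% lower than SARIMA")
```

On these figures the MLP errors come out roughly 23% (RMSE) and 25% (MAE) below the SARIMA errors.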
Affiliation(s)
- Mohammad Shad
  - Department of Mathematics and Scientific Computing, National Institute of Technology, Hamirpur, Himachal Pradesh 177005 India
- Y. D. Sharma
  - Department of Mathematics and Scientific Computing, National Institute of Technology, Hamirpur, Himachal Pradesh 177005 India
- Abhishek Singh
  - Department of Mathematics and Scientific Computing, National Institute of Technology, Hamirpur, Himachal Pradesh 177005 India