Boudibi S, Fadlaoui H, Hiouani F, Bouzidi N, Aissaoui A, Khomri ZE. Groundwater salinity modeling and mapping using machine learning approaches: a case study in Sidi Okba region, Algeria.
ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2024:10.1007/s11356-024-34440-1. [PMID: 39042194 DOI: 10.1007/s11356-024-34440-1]
[Received: 04/29/2024] [Accepted: 07/16/2024] [Indexed: 07/24/2024]
Abstract
The complexity of the groundwater salinization process and the lack of data on its controlling factors are the main challenges for accurate prediction and mapping of aquifer salinity. To address these challenges, machine learning (ML) methodologies are employed to model and map groundwater salinity (GWS) in the Mio-Pliocene aquifer in the Sidi Okba region, Algeria, based on a limited dataset of electrical conductivity (EC) measurements and readily available digital elevation model (DEM) derivatives. The dataset was randomly split into training (70%) and testing (30%) sets, and three wrapper selection methods, namely recursive feature elimination (RFE), forward feature selection (FFS), and backward feature selection (BFS), were applied to the training data. The resulting variable combinations were used as inputs for five ML models: random forest (RF), hybrid neuro-fuzzy inference system (HyFIS), K-nearest neighbors (KNN), cubist regression model (CRM), and support vector machine (SVM). The best-performing model was identified and applied to predict and map GWS across the entire study area. The results highlight that the combination of input variables produced by the selection methods is a critical factor, often overlooked by many researchers, that substantially impacts the models' accuracy. Among the alternatives, the RF model emerged as the most effective for predicting and mapping GWS in the study area, achieving high performance in both the training (RMSE = 1.016, R = 0.854, and MAE = 0.759) and testing (RMSE = 1.069, R = 0.831, and MAE = 0.921) phases. The generated digital map highlighted the alarming situation regarding excessive GWS levels in the study area, particularly in zones at low elevation and far from the Foum Elgherza dam and Elbiraz wadi. Overall, this study represents a significant advancement over previous approaches, offering enhanced predictive performance for GWS with the minimum number of input variables.
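The evaluation protocol described in the abstract (a random 70/30 split, then RMSE, R, and MAE on each subset) can be illustrated with a minimal sketch. This is not the authors' code. The function names are hypothetical, the numeric values are made up purely for illustration, and R is assumed here to be the Pearson correlation coefficient between observed and predicted values, which the abstract does not state explicitly.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, predicted):
    """Root mean squared error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean absolute error."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of R)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Illustrative values only, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

train, test = train_test_split(list(range(10)))
print(len(train), len(test))
print(rmse(observed, predicted), pearson_r(observed, predicted), mae(observed, predicted))
```

In a workflow like the one described, the same three metrics would be computed separately on the training and testing subsets for each candidate model and each input-variable combination, so that the combinations can be compared on equal terms.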