1
Das P, Zhang Z, Ghosh S, Hang R. A hybrid ensemble learning merging approach for enhancing the super drought computation over Lake Victoria Basin. Sci Rep 2024; 14:13870. [PMID: 38879570 PMCID: PMC11180181 DOI: 10.1038/s41598-024-61520-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 05/07/2024] [Indexed: 06/19/2024] Open
Abstract
This study introduces a novel Hybrid Ensemble Machine-Learning (HEML) algorithm to merge long-term satellite-based reanalysis precipitation products (SRPPs), enabling the estimation of super drought events in the Lake Victoria Basin (LVB) from 1984 to 2019. Three widely used machine learning (ML) models, Random Forest (RF), Gradient Boosting Machine (GBM), and k-Nearest Neighbors (KNN), were considered for the HEML approach. Three SRPPs, CHIRPS (Climate Hazards Group InfraRed Precipitation with Station), ERA5-Land, and PERSIANN-CDR (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks-Climate Data Record), were merged to develop new precipitation estimates with the HEML model. Additionally, classification and regression models were employed as base learners in developing this algorithm. The newly developed HEML datasets were compared with other ML and SRPP products for super-drought monitoring. The Standardized Precipitation Evapotranspiration Index (SPEI) was used to estimate super drought characteristics, including drought frequency (DF), drought duration (DD), and drought intensity (DI), from the ML and SRPP products in the LVB, and the results were compared with rain gauge (RG) observations. The results revealed that the HEML algorithm agreed most closely with the observations (correlation coefficient, CC = 0.93), outperforming both the single ML merging methods and the SRPPs. Furthermore, the HEML merged product captures the spatiotemporal patterns of super drought characteristics during both the training (1984-2009) and testing (2010-2019) periods. This research offers crucial insights for near-real-time drought monitoring, water resource management, and informed policy decisions.
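The abstract above names drought frequency, duration, and intensity but does not give their definitions or the SPEI threshold. The following is a minimal sketch of one common run-based convention, assuming a drought event is a run of consecutive time steps with SPEI below a threshold (the default of -1.0 is an assumption, not a value from the paper). The function names and the returned keys are hypothetical.

```python
from typing import Dict, List, Tuple


def drought_events(spei: List[float], threshold: float = -1.0) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end inclusive, of runs with SPEI < threshold."""
    events = []
    start = None
    for i, value in enumerate(spei):
        if value < threshold:
            if start is None:
                start = i  # a new dry run begins here
        elif start is not None:
            events.append((start, i - 1))  # the run ended at the previous step
            start = None
    if start is not None:
        events.append((start, len(spei) - 1))  # the series ended during a run
    return events


def drought_summary(spei: List[float], threshold: float = -1.0) -> Dict[str, float]:
    """Summarize events: count, mean duration (steps), mean intensity (mean |SPEI| per event)."""
    events = drought_events(spei, threshold)
    if not events:
        return {"event_count": 0, "mean_duration": 0.0, "mean_intensity": 0.0}
    durations = [end - start + 1 for start, end in events]
    intensities = [
        sum(abs(v) for v in spei[start:end + 1]) / (end - start + 1)
        for start, end in events
    ]
    return {
        "event_count": len(events),
        "mean_duration": sum(durations) / len(events),
        "mean_intensity": sum(intensities) / len(events),
    }
```

Other conventions exist, for example frequency as the share of drought months in the record, so the threshold and the definitions should be matched to the study being reproduced.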
Affiliation(s)
- Priyanko Das
- Institute of African Studies, School of Geography and Ocean Sciences, Nanjing University, Nanjing, China
- Zhenke Zhang
- Institute of African Studies, School of Geography and Ocean Sciences, Nanjing University, Nanjing, China.
- Suravi Ghosh
- Institute of Atmospheric Physics, University of Chinese Academy of Sciences, Beijing, China
- Ren Hang
- Institute of Population Studies, Nanjing University of Post and Telecommunication, Nanjing, China
2
Liu W, Lin S, Li X, Li W, Deng H, Fang H, Li W. Analysis of dissolved oxygen influencing factors and concentration prediction using input variable selection technique: A hybrid machine learning approach. Journal of Environmental Management 2024; 357:120777. [PMID: 38581893 DOI: 10.1016/j.jenvman.2024.120777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 02/29/2024] [Accepted: 03/26/2024] [Indexed: 04/08/2024]
Abstract
Accurate quantification of dissolved oxygen (DO) is critically important for the protection and management of aquatic ecosystems. Both mechanistic and data-driven models have been applied successfully to simulate DO content in aquatic ecosystems. However, mechanistic models present challenges due to their complex and difficult-to-solve conditions, making them less portable. Additionally, data-driven model predictions are hindered by the challenge of numerous input variables, which affect both the running speed and the prediction performance of the model. To address these challenges, water quality data and meteorological data of the Tanjiang River were obtained. The maximum information coefficient (MIC) input variable selection technique was employed to identify the primary environmental factors influencing DO changes. Furthermore, coupled with support vector regression (SVR), two models (SVR and MIC-SVR) were employed to estimate the DO concentration of the Tanjiang River, and the optimal model was established. The results indicated a shift in the primary pollution factor from ammonia nitrogen to total phosphorus after recent treatment in the Tanjiang River. In comparison with the SVR model, the root mean square error (RMSE) of the MIC-SVR model was reduced by 4.46%, and the Nash-Sutcliffe efficiency coefficient (NSE) was improved by 45.85%. In addition, a study of kernel function selection revealed that considering as many kernel functions as possible is necessary for improving the performance of the SVR model. In conclusion, the proposed MIC-SVR model serves as an effective tool to analyze the relationship between DO and environmental factors, identify the primary causes of low DO, and accurately predict the DO concentration in the Tanjiang River (especially in its middle and lower reaches), thus providing a reference for governmental decision-making on water environmental protection and water resource management.
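The two evaluation metrics reported in the abstract above, RMSE and the Nash-Sutcliffe efficiency, have standard definitions that can be sketched in a few lines. This is a minimal illustration with made-up DO values, not the authors' code; the function names are hypothetical.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def nse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = sum(observed) / len(observed)
    error_sum = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_sum / variance_sum


# Illustrative values only (mg/L), not data from the study.
observed_do = [6.0, 7.0, 8.0, 9.0]
predicted_do = [6.5, 7.0, 7.5, 9.0]
print(rmse(observed_do, predicted_do), nse(observed_do, predicted_do))
```

An NSE of 1 indicates a perfect fit, while values at or below 0 mean the model predicts no better than the observed mean.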
Affiliation(s)
- Wei Liu
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China
- Shu Lin
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Xiaobao Li
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Wenjing Li
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Hong Deng
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China
- Huaiyang Fang
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Weijie Li
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China; The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China.
3
Pan Y, Yuan Q, Ma J, Wang L. Improved Daily Spatial Precipitation Estimation by Merging Multi-Source Precipitation Data Based on the Geographically Weighted Regression Method: A Case Study of Taihu Lake Basin, China. International Journal of Environmental Research and Public Health 2022; 19:13866. [PMID: 36360744 PMCID: PMC9655682 DOI: 10.3390/ijerph192113866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 06/16/2023]
Abstract
Accurately estimating the spatial and temporal distribution of precipitation is crucial for hydrological modeling. However, precipitation products based on a single source each have their own advantages and disadvantages. How to effectively combine the advantages of different precipitation datasets has become an important international topic in developing high-quality precipitation products in recent years. This paper uses precipitation data from Multi-Source Weighted-Ensemble Precipitation (MSWEP) and in situ rainfall observations in the Taihu Lake Basin, together with longitude, latitude, elevation, slope, aspect, surface roughness, distance to the coastline, and land use and land cover data, and adopts a two-step method to achieve precipitation fusion: (1) downscaling the MSWEP source precipitation field using bilinear interpolation and (2) fusing the data using the geographically weighted regression (GWR) method with tri-cube function weighting. Considering geographical and human-activity factors, the spatial and temporal distribution of precipitation errors in MSWEP is detected, and the fusion of MSWEP and gauge-observed precipitation is realized. The results show that the method significantly improves the spatial resolution and accuracy of precipitation data in the Taihu Lake Basin.
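The tri-cube weighting mentioned in step (2) can be illustrated with a small sketch. It shows the standard tri-cube kernel and a single-predictor weighted least-squares fit at one target location, which is the basic building block of GWR. The bandwidth, the single predictor, and all function names are assumptions made for illustration; the paper uses several covariates and its exact configuration is not given in the abstract.

```python
from typing import Sequence, Tuple


def tricube_weight(distance: float, bandwidth: float) -> float:
    """Tri-cube kernel: (1 - (d/b)^3)^3 inside the bandwidth, 0 at or beyond it."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 3) ** 3


def local_linear_fit(
    xs: Sequence[float],
    ys: Sequence[float],
    distances: Sequence[float],
    bandwidth: float,
) -> Tuple[float, float]:
    """Weighted least-squares fit y = intercept + slope * x at one location.

    distances[i] is the distance from sample i to the target location;
    nearer samples receive larger tri-cube weights.
    """
    weights = [tricube_weight(d, bandwidth) for d in distances]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no samples fall inside the bandwidth")
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    if sxx == 0.0:
        raise ValueError("predictor has no weighted variance")
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope
```

In a full GWR, this fit is repeated at every target grid cell, so the coefficients vary smoothly across space.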
4
Improving multiple model ensemble predictions of daily precipitation and temperature through machine learning techniques. Sci Rep 2022; 12:4678. [PMID: 35304552 PMCID: PMC8933560 DOI: 10.1038/s41598-022-08786-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/12/2021] [Accepted: 02/25/2022] [Indexed: 12/05/2022] Open
Abstract
Multi-Model Ensembles (MMEs) are used to improve the performance of GCM simulations. This study evaluates the performance of MMEs of precipitation, maximum temperature, and minimum temperature over a tropical river basin in India, developed with techniques including the arithmetic mean, Multiple Linear Regression (MLR), Support Vector Machine (SVM), Extra Tree Regressor (ETR), Random Forest (RF), and long short-term memory (LSTM). The 21 General Circulation Models (GCMs) from the National Aeronautics and Space Administration (NASA) Earth Exchange Global Daily Downscaled Projections (NEX-GDDP) dataset and 13 GCMs of the Coupled Model Intercomparison Project, Phase 6 (CMIP6) are used for this purpose. The results reveal that an LSTM model for ensembling performs significantly better than the other models in the case of precipitation, with a coefficient of determination (R²) value of 0.9. In the case of temperature, all the machine learning (ML) methods showed equally good performance, with RF and LSTM performing consistently well in all cases, with R² values ranging from 0.82 to 0.93. Hence, based on this study, the RF and LSTM methods are recommended for the creation of MMEs in the basin. In general, all ML approaches performed better than the mean ensemble approach.
5
Sharannya TM, Venkatesh K, Mudbhatkal A, Dineshkumar M, Mahesha A. Effects of land use and climate change on water scarcity in rivers of the Western Ghats of India. Environmental Monitoring and Assessment 2021; 193:820. [PMID: 34792670 DOI: 10.1007/s10661-021-09598-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2021] [Accepted: 11/05/2021] [Indexed: 06/13/2023]
Abstract
This paper assesses the long-term combined effects of land use (LU) and climate change on the river hydrology and water scarcity of two rivers of the Western Ghats of India. The historical LU changes were studied for four decades (1988-2016) using the maximum likelihood algorithm, and the long-term LU (2016-2075) was estimated using the Dyna-CLUE prediction model. Five General Circulation Models (GCMs) were utilized to assess the effects of climate change (CC), and the Soil and Water Assessment Tool (SWAT) model was used for hydrological modeling of the two river catchments. To characterize granular effects of LU and CC on regional hydrology, a scenario approach was adopted, and three climate-based scenarios depicting the near future (2006-2040), mid future (2041-2070), and far future (2071-2100) were established. The present rate of LU change indicated a reduction in forest cover by 20% and an increase in urbanized areas by 9.5% between 1988 and 2016. It was estimated that forest cover in the catchments may be expected to halve compared to the present-day LU (55% in 2016 to 23% in 2075), along with large-scale conversion to agricultural lands (13.5% in 2016 to 49.5% in 2075). As a result of changes to LU and forecasted climate, it was found that rivers in the Western Ghats of India might face scarcity of fresh water in the next two decades, until the year 2040. However, because of large-scale LU conversion toward the year 2050, streamflow in the rivers might increase by as much as 70.94% at certain times of the year. Although an increase in streamflow is perceived as favorable, the streamflow changes during summer and winter may be expected to affect the cropping calendar and crop yield. The changes to streamflow were also linked to a 4.2% increase in ecologically sensitive wetlands of the Aghanashini river catchment.
Affiliation(s)
- T M Sharannya
- Department of Water Resources & Ocean Engineering, National Institute of Technology Karnataka, Surathkal, Mangaluru, 575 025, India.
- K Venkatesh
- Department of Environment and Sustainability, University of South Dakota, Vermillion, South Dakota, 57069, USA
- Amogh Mudbhatkal
- School of Geography and Lincoln Centre for Water and Planetary Health, University of Lincoln, Brayford Pool, Lincoln, Lincolnshire, LN6 7TS, UK
- M Dineshkumar
- Department of Water Resources & Ocean Engineering, National Institute of Technology Karnataka, Surathkal, Mangaluru, 575 025, India
- Amai Mahesha
- Department of Water Resources & Ocean Engineering, National Institute of Technology Karnataka, Surathkal, Mangaluru, 575 025, India
6
Xu C, Chen X, Zhang L. Predicting river dissolved oxygen time series based on stand-alone models and hybrid wavelet-based models. Journal of Environmental Management 2021; 295:113085. [PMID: 34147993 DOI: 10.1016/j.jenvman.2021.113085] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 01/20/2021] [Revised: 05/17/2021] [Accepted: 06/13/2021] [Indexed: 06/12/2023]
Abstract
Accurate prediction of dissolved oxygen time series is important for improving the water environment and aiding water resource management. In this study, four stand-alone models, multiple linear regression (MLR), support vector machine (SVM), artificial neural network (ANN), and random forest (RF), and four hybrid models based on the wavelet transform (WT), WT-MLR, WT-SVM, WT-ANN, and WT-RF, were used to predict daily dissolved oxygen (DO) at 1-5-day lead times in the Dongjiang River Basin, China. To make the prediction robust, the maximal information coefficient (MIC) was used to capture comprehensive information between DO and the explanatory variables. A 5-fold cross-validation grid search was used to optimize the parameters of the machine learning tools. Two WT frameworks were used to construct the hybrid models: a direct framework (i.e., only the explanatory variables were decomposed) and a multicomponent framework (i.e., both the explanatory variables and the target variable were decomposed). The results show that MIC extracts four optimal explanatory variables: previous DO, water temperature, air temperature, and air pressure. Four evaluation parameters, the correlation coefficient (R), Nash-Sutcliffe efficiency (NSE), mean absolute error (MAE), and root mean square error (RMSE), indicate that the prediction accuracy decreases as the lead time increases from 1 to 5 days. Among the stand-alone models, the MLR model outperforms the other three, with higher NSE values of 0.616-0.921 and lower RMSE values of 0.503-1.111. Among the hybrid models, the WT-ANN and WT-MLR models exhibit higher performance, and the multicomponent framework performs better than the direct framework in all hybrid models. In general, the multicomponent framework of WT can improve the prediction accuracy of stand-alone models to a certain degree, while the direct framework shows no obvious advantage.
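The 5-fold cross-validation grid search mentioned in the abstract above can be sketched generically. This is a minimal illustration, assuming contiguous unshuffled folds and a user-supplied scoring callable where lower is better; the names `k_fold_indices`, `grid_search`, and `evaluate` are hypothetical, and the abstract does not state how the authors built their folds.

```python
from typing import Callable, List, Sequence, Tuple

Fold = Tuple[List[int], List[int]]


def k_fold_indices(n_samples: int, k: int = 5) -> List[Fold]:
    """Split range(n_samples) into k contiguous folds of (train_idx, test_idx)."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


def grid_search(
    candidates: Sequence[float],
    evaluate: Callable[[float, List[int], List[int]], float],
    n_samples: int,
    k: int = 5,
) -> Tuple[float, float]:
    """Return the candidate with the lowest mean score across the k folds.

    evaluate(param, train_idx, test_idx) is assumed to fit a model on the
    training indices and return an error (e.g. RMSE) on the test indices.
    """
    best_param, best_score = None, float("inf")
    for param in candidates:
        scores = [evaluate(param, train, test) for train, test in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score < best_score:
            best_param, best_score = param, mean_score
    return best_param, best_score
```

For time series such as daily DO, contiguous folds avoid mixing adjacent days between training and test sets, although a forward-chaining split may be preferable when temporal leakage is a concern.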
Affiliation(s)
- Chuang Xu
- Center for Water Resources and Environment Research, School of Civil Engineering, Sun Yat-sen University, Guangzhou, China
- Xiaohong Chen
- Center for Water Resources and Environment Research, School of Civil Engineering, Sun Yat-sen University, Guangzhou, China.
- Lilan Zhang
- Center for Water Resources and Environment Research, School of Civil Engineering, Sun Yat-sen University, Guangzhou, China
7
Validation of CHIRPS Precipitation Estimates over Taiwan at Multiple Timescales. Remote Sensing 2021. [DOI: 10.3390/rs13020254] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/16/2022]
Abstract
The Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS), which incorporates satellite imagery and in situ station information, is a new high-resolution long-term precipitation dataset available since 1981. This study aims to understand the performance of the latest version of CHIRPS in depicting multiple-timescale precipitation variation over Taiwan. The analysis focuses on examining whether CHIRPS is better than another satellite precipitation product, the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (GPM) final run (hereafter IMERG), which is known to effectively capture the precipitation variation over Taiwan. We evaluated the annual cycle, seasonal cycle, interannual variation, and daily variation during 2001–2019. Our results show that IMERG is slightly better than CHIRPS for most of the features examined; however, CHIRPS performs better than IMERG in representing the (1) magnitude of the annual cycle of monthly precipitation climatology, (2) spatial distribution of the seasonal mean precipitation for all four seasons, (3) quantitative precipitation estimation of the interannual variation of area-averaged winter precipitation in Taiwan, and (4) occurrence frequency of the non-rainy grids in winter. Notably, although CHIRPS is not better than IMERG for many of the examined features, CHIRPS can depict the temporal variation in precipitation over Taiwan on annual, seasonal, and interannual timescales with 95% significance. This highlights the potential use of CHIRPS in studying the multiple-timescale variation in precipitation over Taiwan during the years 1981–2000, for which there are no data available in the IMERG database.