1
Zhou Y, He B, Cao X, Xiao Y, Feng Q, Yang F, Xiao F, Geng X, Du Y. Remotely sensed estimates of long-term biochemical oxygen demand over Hong Kong marine waters using machine learning enhanced by imbalanced label optimisation. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 943:173748. [PMID: 38857793 DOI: 10.1016/j.scitotenv.2024.173748] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 04/30/2024] [Accepted: 06/02/2024] [Indexed: 06/12/2024]
Abstract
In many coastal cities around the world, continuing water degradation threatens the living environment of humans and aquatic organisms. To assess and control water pollution, this study estimated the Biochemical Oxygen Demand (BOD) concentration of Hong Kong's marine waters using remote sensing and an improved machine learning (ML) method. The scheme was derived from four ML algorithms (RBF, SVR, RF, XGB) and calibrated using a large amount (N > 1000) of in-situ BOD5 data. Based on labeled datasets with different preprocessing, i.e., the original BOD5, the log10(BOD5), and label distribution smoothing (LDS), three types of models were trained and evaluated. The results highlight the superior potential of the LDS-based model to improve BOD5 estimates by dealing with the imbalanced training dataset. Additionally, XGB and RF outperformed RBF and SVR when the model was developed using log10(BOD5) or LDS(BOD5). Over two decades, the BOD5 concentration of Hong Kong marine waters in the autumn (Sep. to Nov.) shows a downward trend, with significant decreases in Deep Bay, Western Buffer, Victoria Harbour, Eastern Buffer, Junk Bay, Port Shelter, and the Tolo Harbour and Channel. Principal component analysis revealed that nutrient levels were the predominant factor in Victoria Harbour and the interior of Deep Bay, while chlorophyll-related and physical parameters were dominant in Southern, Mirs Bay, Northwestern, and the outlet of Deep Bay. LDS provides a new perspective for improving ML-based water quality estimation by alleviating the imbalance in the labeled dataset. Overall, the remotely sensed BOD5 can offer insight into the spatial-temporal distribution of organic matter in Hong Kong coastal waters and valuable guidance for pollution control.
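The abstract does not spell out how label distribution smoothing was implemented. As a rough illustration of the general idea, the sketch below bins the labels, smooths the bin counts with a Gaussian kernel, and weights each sample by the inverse of the smoothed density, so that rare high BOD5 values count for more during training. The bin width, kernel size, sigma, the function name `lds_weights`, and the toy labels are all assumptions made for this example, not details taken from the study.

```python
import math


def lds_weights(labels, bin_width=1.0, kernel_half_width=2, sigma=1.0):
    """Return one training weight per label using smoothed inverse frequency.

    Hypothetical helper: bin_width, kernel_half_width and sigma are
    illustrative defaults, not values reported in the abstract.
    """
    lo = min(labels)
    # Assign each label to a histogram bin.
    bins = [int((y - lo) // bin_width) for y in labels]
    counts = [0.0] * (max(bins) + 1)
    for b in bins:
        counts[b] += 1.0

    # Symmetric Gaussian kernel, normalised to sum to 1.
    offsets = range(-kernel_half_width, kernel_half_width + 1)
    kernel = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    total = sum(kernel)
    kernel = [k / total for k in kernel]

    # Convolve the empirical counts with the kernel (zero padding at the edges).
    smoothed = []
    for i in range(len(counts)):
        s = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < len(counts):
                s += w * counts[j]
        smoothed.append(s)

    # Inverse smoothed density, rescaled so the weights average to 1.
    raw = [1.0 / smoothed[b] for b in bins]
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


# Skewed toy labels: many low BOD5 values, few high ones (made-up numbers).
toy_labels = [0.5] * 8 + [1.5] * 4 + [6.5]
weights = lds_weights(toy_labels)
```

In this toy case the single high label receives the largest weight and the most common low label the smallest. Such weights would typically be passed to a learner as per-sample weights; whether the study did so is not stated in the abstract.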
Affiliation(s)
- Yadong Zhou
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- Boayin He
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China.
- Xiaoyu Cao
- School of Geography and Ocean Science, Nanjing University, Nanjing 210023, China
- Yu Xiao
- Key Laboratory of Wetland Ecology and Environment, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Qi Feng
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- Fan Yang
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Fei Xiao
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- Xueer Geng
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yun Du
- Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
2
Szota C, Danger A, Poelsma PJ, Hatt BE, James RB, Rickard A, Burns MJ, Cherqui F, Grey V, Coleman RA, Fletcher TD. Developing simple indicators of nitrogen and phosphorus removal in constructed stormwater wetlands. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 928:172192. [PMID: 38604363 DOI: 10.1016/j.scitotenv.2024.172192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 03/11/2024] [Accepted: 04/01/2024] [Indexed: 04/13/2024]
Abstract
Quantifying pollutant removal by stormwater wetlands requires intensive sampling, which is cost-prohibitive for authorities responsible for a large number of wetlands. Wetland managers require simple indicators that provide a practical means of estimating performance and prioritising maintenance works across their asset base. We therefore aimed to develop vegetation cover, and metrics derived from water level monitoring, as simple indicators of likely nutrient pollutant removal from stormwater wetlands. Over a two-year period, we measured vegetation cover and water levels at 17 wetlands and used both to predict nitrogen (N) and phosphorus (P) removal. Vegetation cover explained 48 % of the variation in total nitrogen (TN) removal, with a linear relationship suggesting an approximate 9 % loss in TN removal per 10 % decrease in vegetation cover. Vegetation cover is therefore a useful indicator of TN removal. Further development of remotely-sensed data on vegetation configuration, species and condition will likely improve the accuracy of TN removal estimates. Total phosphorus (TP) removal was not predicted by vegetation cover, but was weakly related to the median water level, which explained 25 % of the variation in TP removal. Despite weak prediction of TP removal, metrics derived from water level sensors identified faults such as excessive inflow and inefficient outflow, which in combination explained 50 % of the variation in the median water level. Monitoring water levels therefore has the potential to detect faults prior to loss of vegetation cover and therefore TN removal, as well as inform the corrective action required.
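The reported linear relationship can be turned into a quick back-of-the-envelope estimate. The sketch below assumes the slope means roughly 0.9 percentage points of TN removal per percentage point of vegetation cover, which is one reading of the abstract. The intercept is not given, so only changes are computed. The function name and example values are hypothetical.

```python
# Slope read from the abstract: about 9 % TN removal lost per 10 % cover lost.
TN_LOSS_PER_COVER_POINT = 9.0 / 10.0


def estimated_tn_removal_change(cover_before, cover_after):
    """Estimated change in TN removal (percentage points) for a change in
    vegetation cover (percent). Hypothetical helper for illustration only."""
    return TN_LOSS_PER_COVER_POINT * (cover_after - cover_before)


# A wetland dropping from 80 % to 50 % cover (made-up values).
change = estimated_tn_removal_change(80.0, 50.0)
```

For this made-up wetland the estimate is a loss of about 27 percentage points of TN removal. Because vegetation cover explained only 48 % of the variation, such a figure is a screening indicator rather than a prediction for any single wetland.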
Affiliation(s)
- Christopher Szota
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia.
- Peter J Poelsma
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia
- Belinda E Hatt
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia; Melbourne Water Corporation, Docklands, Victoria, Australia
- Robert B James
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia
- Alison Rickard
- Melbourne Water Corporation, Docklands, Victoria, Australia
- Matthew J Burns
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia
- Frédéric Cherqui
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia; Univ Lyon, INSA-LYON, Université Claude Bernard Lyon 1, DEEP, F-69621, F-69622, Villeurbanne, France
- Vaughn Grey
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia; Melbourne Water Corporation, Docklands, Victoria, Australia
- Rhys A Coleman
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia; Melbourne Water Corporation, Docklands, Victoria, Australia
- Tim D Fletcher
- School of Agriculture, Food and Ecosystem Sciences, The University of Melbourne, Burnley, Victoria, Australia
3
Hu Y, Liu C, Wollheim WM, Jiao T, Ma M. A hybrid deep learning approach to predict hourly riverine nitrate concentrations using routine monitored data. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 360:121097. [PMID: 38733844 DOI: 10.1016/j.jenvman.2024.121097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Revised: 04/26/2024] [Accepted: 05/04/2024] [Indexed: 05/13/2024]
Abstract
With high-frequency data on nitrate (NO3-N) concentrations in waters becoming increasingly important for understanding watershed system behavior and for ecosystem management, the accurate and economical acquisition of high-frequency NO3-N concentration data has become a key issue. This study attempted to use coupled deep learning neural networks and routine monitored data to predict hourly NO3-N concentrations in a river. The hourly NO3-N concentration at the outlet of the Oyster River watershed in New Hampshire, USA, was predicted through neural networks with a hybrid model architecture coupling Convolutional Neural Networks and the Long Short-Term Memory model (CNN-LSTM). The routine monitored data (the river depth, water temperature, air temperature, precipitation, specific conductivity, pH and dissolved oxygen concentrations) for model training were collected from a nested high-frequency monitoring network, while the high-frequency NO3-N concentration data obtained at the outlet were not included as inputs. The whole dataset was split into training, validation, and testing sets at a ratio of 5:3:2, respectively. The hybrid CNN-LSTM model with different input lengths (1d, 3d, 7d, 15d, 30d) displayed comparable or even better performance than other studies at lower frequencies, with mean Nash-Sutcliffe Efficiency values of 0.60-0.83. Models with shorter input lengths demonstrated both higher modeling accuracy and stability. The water level, water temperature and pH values at monitoring sites were the main controlling factors for forecasting performance. This study provides new insight into using deep learning networks with a coupled architecture and routine monitored data for high-frequency riverine NO3-N concentration forecasting, and suggests strategies for variable and input length selection during preprocessing of input data.
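Two quantities from this abstract are easy to make concrete: the 5:3:2 data split and the Nash-Sutcliffe Efficiency (NSE) used to score the forecasts. The sketch below assumes the split is taken in chronological order, since the abstract gives only the ratio. The function names and toy series are made up for illustration.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 minus the ratio of squared model error
    to the squared spread of the observations about their mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when observations are constant")
    return 1.0 - sse / spread


def split_5_3_2(series):
    """Split a sequence, in order, into 50 % training, 30 % validation and
    20 % testing. Assumption: a chronological split."""
    n = len(series)
    i = n * 5 // 10
    j = n * 8 // 10
    return series[:i], series[i:j], series[j:]


# Toy hourly index and toy observations (not data from the study).
hourly = list(range(100))
train_part, val_part, test_part = split_5_3_2(hourly)

obs = [1.0, 2.0, 3.0, 4.0]
nse_perfect = nash_sutcliffe(obs, obs)
nse_mean = nash_sutcliffe(obs, [2.5, 2.5, 2.5, 2.5])
```

An NSE of 1 means a perfect match, and an NSE of 0 means the model is no better than predicting the observed mean, which puts the reported range of 0.60-0.83 in context.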
Affiliation(s)
- Yue Hu
- State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, 610059, China
- Chuankun Liu
- Sichuan Academy of Environmental Policy and Planning, Department of Ecology and Environment of Sichuan Province, Chengdu, 610059, China.
- Wilfred M Wollheim
- Department of Natural Resources and Environment, University of New Hampshire, Durham, NH, 03824, USA
- Tong Jiao
- State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, 610059, China
- Meng Ma
- China Institute of Water Resources and Hydropower Research, Beijing, 100048, China
4
Nam SH, Kwon S, Kim YD. Development of a basin-scale total nitrogen prediction model by integrating clustering and regression methods. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 920:170765. [PMID: 38340839 DOI: 10.1016/j.scitotenv.2024.170765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 01/15/2024] [Accepted: 02/04/2024] [Indexed: 02/12/2024]
Abstract
Nutrient runoff into rivers caused by human activity has led to global eutrophication issues. The Nakdong River in South Korea is currently facing significant challenges related to eutrophication and harmful algal blooms, underscoring the critical importance of managing total nitrogen (T-N) levels. However, traditional methods of indoor analysis, which depend on sampling, are labor-intensive and face limitations in collecting high-frequency data. Despite advancements in sensors allowing for the measurement of various parameters, sensors still cannot directly measure T-N, necessitating surrogate regression methods. Therefore, we conducted T-N predictions using a water quality dataset collected from 2018 to 2022 at 157 observatories within the Nakdong River basin. To account for the water quality characteristics of each location, we employed a clustering technique to divide the basin and compared a Gaussian mixture model with K-means clustering. Moreover, the optimal regressor for each cluster was selected by comparing multiple linear regression (MLR), random forest, and XGBoost. The results showed that forming four clusters via K-means clustering was the most suitable approach and MLR was reasonably accurate for all clusters. Subsequently, recursive feature elimination cross-validation was used to identify suitable parameters for T-N prediction, thus leading to the construction of high-accuracy T-N prediction models. Clustering was useful not only for improving the regressors but also for spatially analyzing the water quality characteristics of the Nakdong River. The MLR model can reveal causal relationships and thus is useful for decision-making. The results of this study revealed that the combination of a simple linear regression model and clustering method can be applied to a wide watershed.
The clustering-based regression model showed potential for accurately predicting T-N at the basin level and is expected to contribute to nationwide water quality management through future applications in various fields.
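The cluster-then-regress idea described in this abstract can be sketched in a few lines. The example below is heavily simplified and rests on assumptions: it clusters sites on a single made-up feature with a plain 1-D k-means, then fits a one-predictor least-squares line within each cluster. The study itself used four K-means clusters, multivariate water quality data, and multiple linear regression. All function names and numbers here are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means with evenly spaced initial centroids (deterministic)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members) / len(members)
    return labels, centroids


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_per_cluster(site_feature, xs, ys, k):
    """Cluster records on site_feature, then fit one line per cluster."""
    labels, _ = kmeans_1d(site_feature, k)
    models = {}
    for c in set(labels):
        cx = [x for x, l in zip(xs, labels) if l == c]
        cy = [y for y, l in zip(ys, labels) if l == c]
        models[c] = fit_line(cx, cy)
    return labels, models


# Toy records: two groups of sites with different surrogate-to-T-N relations.
site_feature = [1.0, 1.1, 0.9, 1.2, 9.0, 9.2, 8.8, 9.1]
surrogate = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
total_n = [2.5, 3.0, 3.5, 4.0, 3.0, 5.0, 7.0, 9.0]
labels, models = fit_per_cluster(site_feature, surrogate, total_n, k=2)
```

In the toy data a single pooled line would blur two different slopes, whereas the per-cluster fits recover each relation separately. This is the motivation the abstract gives for dividing the basin before regression.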
Affiliation(s)
- Su Han Nam
- Department of Civil and Environmental Engineering, Myongji University, Yongin, South Korea
- Siyoon Kwon
- Center for Water and the Environment, Department of Civil, Architectural and Environmental Engineering, The University of Texas at Austin, Austin, TX, USA
- Young Do Kim
- Department of Civil and Environmental Engineering, Myongji University, Yongin, South Korea.
5
Wang H, Zhang L, Wu R, Cen Y. Spatio-temporal fusion of meteorological factors for multi-site PM2.5 prediction: A deep learning and time-variant graph approach. ENVIRONMENTAL RESEARCH 2023; 239:117286. [PMID: 37797668 DOI: 10.1016/j.envres.2023.117286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 09/29/2023] [Accepted: 09/30/2023] [Indexed: 10/07/2023]
Abstract
In the field of environmental science, traditional methods for predicting PM2.5 concentrations primarily focus on singular temporal or spatial dimensions. This approach presents certain limitations when it comes to deeply mining the joint influence of multiple monitoring sites and their inherent connections with meteorological factors. To address this issue, we introduce an innovative deep-learning-based multi-graph model using Beijing as the study case. This model consists of two key modules: firstly, the 'Meteorological Factor Spatio-Temporal Feature Extraction Module'. This module deeply integrates spatio-temporal features of hourly meteorological data by employing Graph Convolutional Networks (GCN) and Long Short-Term Memory (LSTM) for spatial and temporal encoding respectively. Subsequently, through an attention mechanism, it retrieves a feature tensor associated with air pollutants. Secondly, these features are amalgamated with PM2.5 concentration values, allowing the 'PM2.5 Concentration Prediction Module' to predict with enhanced accuracy the joint influence across multiple monitoring sites. Our model exhibits significant advantages over traditional methods in processing the joint impact of multiple sites and their associated meteorological factors. By providing new perspectives and tools for the in-depth understanding of urban air pollutant distribution and optimization of air quality management, this model propels us towards a more comprehensive approach in tackling air pollution issues.
Affiliation(s)
- Hongqing Wang
- Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China; University of Chinese Academy of Sciences, Beijing, 100049, China.
- Lifu Zhang
- Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China.
- Rong Wu
- Department of Mathematical Sciences, Tsinghua University, Beijing, 100084, China.
- Yi Cen
- Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, 100094, China.
6
Carter JB, Huffaker R, Singh A, Bean E. HUM: A review of hydrochemical analysis using ultraviolet-visible absorption spectroscopy and machine learning. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 901:165826. [PMID: 37524192 DOI: 10.1016/j.scitotenv.2023.165826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Revised: 07/23/2023] [Accepted: 07/25/2023] [Indexed: 08/02/2023]
Abstract
There is a need to develop improved methods for water quality analysis. Traditionally, water quality analysis is performed in a laboratory on discrete samples or in the field with simple sensors, but these methods have inherent limitations. Ultraviolet-visible absorption spectroscopy (UVAS) is a commonly used laboratory technique for water quality analysis and is being applied more broadly in combination with machine learning (ML) to allow for the detection of multiple analytes without sample pretreatments. This methodology (referred to here as Hydrochemical analysis using Ultraviolet-visible absorption spectroscopy and Machine learning; 'HUM') can be applied in the laboratory or in situ while requiring less time, labor, and materials compared to traditional laboratory analysis. HUM has been used for the quantification of a variety of chemicals in a variety of settings, but information is lacking related to instrumental setup, sample requirements, and data analysis procedures. For instance, there is a need to investigate the influence of spectral parameters (e.g., sensitivity, signal-to-noise ratio, and spectral resolution) on measurement error. There is also a lack of research aimed at developing ML algorithms specifically for HUM. Finally, there are emerging concepts such as sensor fusion and model-sensor fusion which have been applied to similar fields but are not common in studies involving HUM. This review suggests the need for further studies to better understand the factors that influence HUM measurement accuracy along with the need for hardware and software developments so that the methodology can ultimately become more robust and standardized. This, in turn, could increase its adoption in both academic and non-academic settings. 
Once the HUM methodology has matured, it could help to reduce the environmental impacts of society by improving our understanding and management of environmental systems through high-frequency data collection and automated control of water quality in environmentally relevant systems.
Affiliation(s)
- J Barrett Carter
- Department of Agricultural and Biological Engineering, University of Florida, 1741 Museum Road, Gainesville, FL 32611-0570, United States of America.
- Ray Huffaker
- Department of Agricultural and Biological Engineering, University of Florida, 1741 Museum Road, Gainesville, FL 32611-0570, United States of America
- Aditya Singh
- Department of Agricultural and Biological Engineering, University of Florida, 1741 Museum Road, Gainesville, FL 32611-0570, United States of America
- Eban Bean
- Department of Agricultural and Biological Engineering, University of Florida, 1741 Museum Road, Gainesville, FL 32611-0570, United States of America
7
de Toledo MB, Baulch HM. Variability of sedimentary phosphorus composition across Canadian lakes. ENVIRONMENTAL RESEARCH 2023; 236:116654. [PMID: 37487921 DOI: 10.1016/j.envres.2023.116654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 06/26/2023] [Accepted: 07/11/2023] [Indexed: 07/26/2023]
Abstract
Phosphorus (P) in lake sediments is stored within diverse forms, often associated with metals, minerals, and organic matter. Sediment P can be remobilized to the water column, but the environmental conditions influencing the P retention-release balance depend upon the sediment chemistry and forms of P present. Sequential fractionation approaches can be used to help understand forms of P present in the sediments, and their vulnerability to release. We assessed P composition in surficial sediments (as an assemblage of six P-fractions) and its relationship with watershed, and lake-specific explanatory variables from 236 lakes across Canada. Sediment P composition varied widely across the 12 sampled Canadian ecozones. The dominant P-fractions were the residual-P and the labile organic P, while the loosely bound P corresponded to the smallest proportion of sediment TP. Notable contrasts in sediment P composition were apparent across select regions - with the most significant differences between sediment P in lakes from the mid-West Canada region (Prairies and Boreal Plains ecozones) and both Eastern coastal (Atlantic Maritime and Atlantic Highlands) and Western coastal (Pacific Maritime) ecozones. The ecozone attributes most critical to sediment P speciation across Canadian lakes were related to soil types in the watershed (e.g., podzols, chernozems, and Luvisols) and the chemical composition of lake water and sediments, such as dissolved Ca in lake water, bulk sedimentary Ca, Al, and Fe, dissolved SO4 in lake water, lake pH, and salinity. Understanding predictors of the forms of P stored in surficial sediments helps advance our knowledge of in-lake P retention and remobilization processes across the millions of unstudied lakes and can help our understanding of controls on internal P loading.
Affiliation(s)
- Mauro B de Toledo
- School of Environment and Sustainability, University of Saskatchewan, Saskatoon, SK, Canada; Global Institute for Water Security, University of Saskatchewan, 11 Innovation Blvd, Saskatoon, SK, S7N 3H5, Canada.
- Helen M Baulch
- School of Environment and Sustainability, University of Saskatchewan, Saskatoon, SK, Canada; Global Institute for Water Security, University of Saskatchewan, 11 Innovation Blvd, Saskatoon, SK, S7N 3H5, Canada.
8
Guo C, Wan D, Li Y, Zhu Q, Luo Y, Luo W, Cui Y. Quantitative prediction of the hydraulic performance of free water surface constructed wetlands by integrating numerical simulation and machine learning. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2023; 337:117745. [PMID: 36965370 DOI: 10.1016/j.jenvman.2023.117745] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Revised: 02/24/2023] [Accepted: 03/13/2023] [Indexed: 06/18/2023]
Abstract
Quantitative prediction of the design parameter-influenced hydraulic performance is significant for optimizing free water surface constructed wetlands (FWS CWs) to reduce point and non-point source pollution and improve land utilization. However, owing to limitations of the test conditions and data scale, a quantitative prediction model of the hydraulic performance under multiple design parameters has not yet been established. In this study, we integrated field test data, a mechanism model, statistical regression, and machine learning (ML) to construct such quantitative prediction models. A FWS CW numerical model was established by integrating 13 groups of tracer data from field tests. Subsequently, training, test and extension datasets comprising 125 (5^3), 25 (L25(5^6)) and 16 (L16(4^4)) data points, respectively, were generated via numerical simulation of multi-level value combinations of three quantitative design parameters, namely, water depth, hydraulic loading rate (HLR), and aspect ratio. The short circuit index (φ10), Morrill dispersion index (MDI), hydraulic efficiency (λ) and moment index (MI) were used as representative hydraulic performance indicators. The training set, with its large sample, was analyzed to determine the variation rules of the different hydraulic indicators. Based on the control variable method, φ10, λ, and MI grew exponentially with increasing aspect ratio whereas MDI showed a decreasing trend; with increasing water depth, φ10, λ, and MI showed polynomial decreases whereas MDI increased; with increasing HLR, φ10, λ, and MI slowly increased linearly whereas MDI showed the opposite trend. Finally, we constructed models based on multivariate nonlinear regression (MNLR) and ML (random forest (RF), multilayer perceptron (MLP), and support vector regression). The coefficients of determination (R2) of the MNLR and ML models fitting the training and test sets were all greater than 0.9; however, the generalization abilities of the different models on the extension set differed. The most robust MLP, MNLR without interaction term, and RF models were recommended as the preferred models for hydraulic performance prediction. The extreme importance of aspect ratio in hydraulic performance was revealed. Thus, gaps in the current understanding of multivariate quantitative prediction of the hydraulic performance of FWS CWs are addressed while providing an avenue for researching FWS CWs in different regions according to local conditions.
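The abstract names the hydraulic indicators but does not define them. The sketch below assumes definitions that are commonly used for tracer tests: φ10 = t10/tn, MDI = t90/t10, and λ = tp/tn. Here t10 and t90 are the times at which 10 % and 90 % of the recovered tracer has passed the outlet, tp is the time of peak concentration, and tn is the nominal residence time. The moment index is omitted, and the function names and tracer curve are made up.

```python
def time_at_fraction(times, cum_frac, target):
    """Linearly interpolate the time at which the cumulative tracer
    fraction first reaches `target`."""
    for i in range(1, len(times)):
        if cum_frac[i] >= target:
            f0, f1 = cum_frac[i - 1], cum_frac[i]
            if f1 == f0:
                return times[i]
            return times[i - 1] + (target - f0) / (f1 - f0) * (times[i] - times[i - 1])
    return times[-1]


def hydraulic_indicators(times, conc, nominal_residence_time):
    """Return (phi10, MDI, lambda) under the assumed definitions above."""
    # Cumulative tracer mass at the outlet by the trapezoidal rule.
    cum = [0.0]
    for i in range(1, len(times)):
        cum.append(cum[-1] + 0.5 * (conc[i] + conc[i - 1]) * (times[i] - times[i - 1]))
    cum_frac = [c / cum[-1] for c in cum]

    t10 = time_at_fraction(times, cum_frac, 0.10)
    t90 = time_at_fraction(times, cum_frac, 0.90)
    t_peak = times[conc.index(max(conc))]

    phi10 = t10 / nominal_residence_time      # short circuit index
    mdi = t90 / t10                           # Morrill dispersion index
    lam = t_peak / nominal_residence_time     # hydraulic efficiency
    return phi10, mdi, lam


# Triangular toy tracer curve (hypothetical, arbitrary units).
toy_times = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_conc = [0.0, 1.0, 2.0, 1.0, 0.0]
phi10, mdi, lam = hydraulic_indicators(toy_times, toy_conc, nominal_residence_time=2.5)
```

Under these definitions, values of φ10 and λ closer to 1 and an MDI closer to 1 indicate flow nearer to ideal plug flow. This matches the abstract's pairing of rising φ10 and λ with falling MDI.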
Affiliation(s)
- Changqiang Guo
- Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China; Key Laboratory of Basin Water Resources and Eco-Environmental Science in Hubei Province, Changjiang River Scientific Research Institute of Changjiang Water Resources Commission, Wuhan, 430010, China
- Di Wan
- Key Laboratory of Basin Water Resources and Eco-Environmental Science in Hubei Province, Changjiang River Scientific Research Institute of Changjiang Water Resources Commission, Wuhan, 430010, China; State Key Laboratory of Water Resource and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China
- Yalong Li
- Key Laboratory of Basin Water Resources and Eco-Environmental Science in Hubei Province, Changjiang River Scientific Research Institute of Changjiang Water Resources Commission, Wuhan, 430010, China
- Qing Zhu
- Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China
- Yufeng Luo
- State Key Laboratory of Water Resource and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China
- Wenbing Luo
- Key Laboratory of Basin Water Resources and Eco-Environmental Science in Hubei Province, Changjiang River Scientific Research Institute of Changjiang Water Resources Commission, Wuhan, 430010, China
- Yuanlai Cui
- State Key Laboratory of Water Resource and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China.
9
Arhab M, Huang J. Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest. SENSORS (BASEL, SWITZERLAND) 2023; 23:6057. [PMID: 37447905 DOI: 10.3390/s23136057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 06/18/2023] [Accepted: 06/26/2023] [Indexed: 07/15/2023]
Abstract
Despite advancements in sensor technology, monitoring nutrients in situ and in real-time is still challenging and expensive. Soft sensors, based on data-driven models, offer an alternative to direct nutrient measurements. However, the high demand for data required for their development poses logistical issues with data handling. To address this, the study aimed to determine the optimal subset of predictors and the sampling frequency for developing nutrient soft sensors using random forest. The study used water quality data at 15-min intervals from 2 automatic stations on the Main River, Germany, and included dissolved oxygen, temperature, conductivity, pH, streamflow, and cyclical time features as predictors. The optimal subset of predictors was identified using forward subset selection, and the models fitted with the optimal predictors produced R2 values above 0.95 for nitrate, orthophosphate, and ammonium for both stations. The study then trained the models on 40 sampling frequencies, ranging from monthly to 15-min intervals. The results showed that as the sampling frequency increased, the model's performance, measured by RMSE, improved. The optimal balance between sampling frequency and model performance was identified using a knee-point determination algorithm. The optimal sampling frequency for nitrate was 3.6 and 2.8 h for the 2 stations, respectively. For orthophosphate, it was 2.4 and 1.8 h. For ammonium, it was 2.2 h for 1 station. The study highlights the utility of surrogate models for monitoring nutrient levels and demonstrates that nutrient soft sensors can function with fewer predictors at lower frequencies without significantly decreasing performance.
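Two steps mentioned in this abstract lend themselves to a short sketch: cyclical time features and knee-point detection. The abstract does not name the knee-point algorithm or the exact time encoding. The code below therefore assumes a sine/cosine encoding of the hour of day, and it picks the knee as the point farthest from the chord joining the first and last points after both axes are rescaled to [0, 1]. The function names and RMSE values are invented for illustration.

```python
import math


def cyclical_hour(hour):
    """Encode hour of day as a point on the unit circle so that 23:00 and
    01:00 end up close together."""
    angle = 2.0 * math.pi * hour / 24.0
    return math.sin(angle), math.cos(angle)


def knee_index(xs, ys):
    """Index of the point farthest from the straight line joining the first
    and last points, after min-max rescaling of both axes. This chord-distance
    rule is a stand-in for the unspecified knee-point algorithm."""
    def rescale(vals):
        lo, hi = min(vals), max(vals)
        return [(v - lo) / (hi - lo) for v in vals]

    nx, ny = rescale(xs), rescale(ys)
    dx, dy = nx[-1] - nx[0], ny[-1] - ny[0]
    norm = math.hypot(dx, dy)
    distances = [abs(dy * (x - nx[0]) - dx * (y - ny[0])) / norm
                 for x, y in zip(nx, ny)]
    return max(range(len(distances)), key=distances.__getitem__)


# Made-up RMSE values that flatten as sampling becomes more frequent.
samples_per_day = [1, 2, 4, 8, 16, 32, 64, 96]
rmse = [1.00, 0.70, 0.45, 0.30, 0.27, 0.26, 0.255, 0.25]
knee = knee_index(samples_per_day, rmse)
```

With these invented numbers the knee falls at 8 samples per day, i.e. a 3 h interval. Beyond that point, further increases in sampling frequency buy only small reductions in RMSE, which is the trade-off the abstract describes.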
Affiliation(s)
- Muhammad Arhab
- Chair of Hydrology and River Basin Management, Technical University of Munich, Arcisstrasse 21, 80333 Munich, Germany
- Jingshui Huang
- Chair of Hydrology and River Basin Management, Technical University of Munich, Arcisstrasse 21, 80333 Munich, Germany
10
Cheng Q, Chunhong Z, Qianglin L. Development and application of random forest regression soft sensor model for treating domestic wastewater in a sequencing batch reactor. Sci Rep 2023; 13:9149. [PMID: 37277429 DOI: 10.1038/s41598-023-36333-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/22/2023] [Accepted: 06/01/2023] [Indexed: 06/07/2023] Open
Abstract
Small-scale distributed water treatment equipment such as the sequencing batch reactor (SBR) is widely used in rural domestic sewage treatment because of its rapid installation and construction, low operating cost and strong adaptability. However, due to the non-linearity and hysteresis of the SBR process, it is difficult to construct a simulation model of wastewater treatment. In this study, a methodology was developed using artificial intelligence and an automatic control system that can save energy and correspondingly reduce carbon emissions. The methodology leverages a random forest model to determine a suitable soft sensor for the prediction of COD trends. This study uses pH and temperature sensors as the basis for the COD soft sensor. In the proposed method, data were pre-processed into 12 input variables and the top 7 variables were selected as the variables of the optimized model. Each cycle was ended by the artificial intelligence and automatic control system rather than by fixed-time control, which served as the uncontrolled scenario. Across 12 test cases, COD removal was about 91.075% while, on average, 24.25% of time or energy was saved. The proposed soft sensor selection methodology can be applied in rural domestic sewage treatment with the advantages of time and energy saving. Time saving increases treatment capacity, and energy saving represents low carbon technology. The proposed methodology provides a framework for investigating ways to reduce costs associated with data collection by replacing costly and unreliable sensors with affordable and reliable alternatives. By adopting this approach, energy conservation can be maintained while meeting emission standards.
Affiliation(s)
- Qiu Cheng
  - Department of Material and Environmental Engineering, Chengdu Technological University, Chengdu, China
- Zhan Chunhong
  - Huicai Environmental Technology Co., Ltd., De Yuan Zhen, Pidu District, Chengdu, Sichuan, China
- Li Qianglin
  - Department of Material and Environmental Engineering, Chengdu Technological University, Chengdu, China
|
11
|
Pacheco VL, Bragagnolo L, Dalla Rosa F, Thomé A. Optimization of biocementation responses by artificial neural network and random forest in comparison to response surface methodology. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:61863-61887. [PMID: 36934187 DOI: 10.1007/s11356-023-26362-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Accepted: 03/05/2023] [Indexed: 05/10/2023]
Abstract
In this article, the optimization of the specific urease activity (SUA) and the calcium carbonate (CaCO3) using microbially induced calcite precipitation (MICP) was compared to optimization using three algorithms based on machine learning: random forest regressor, artificial neural networks (ANNs), and multivariate linear regression. This study applied the techniques to two existing response surface method (RSM) experiments involving the MICP technique. Random forest-based models and artificial neural network-based models underwent hyperparameter optimization via cross-validation and grid search to select the best-optimized model. In this study, the random forest-based algorithm achieved the best performance, with r2 of 0.9381 and 0.9463 compared with the original r2 of 0.9021 and 0.8530, respectively. This study is aimed at exploring the capability of machine learning-based models on small datasets for optimizing experimental variables in the MICP technique, and the meaningfulness of the models given their specificities in the small experimental datasets applied to experimental designs. The use of these techniques can create opportunities to scale and reduce costs in future experiments in the field.
Affiliation(s)
- Vinicius Luiz Pacheco
  - Graduate Program in Civil and Environmental Engineering, University of Passo Fundo (UPF), Campus I, Km 171, BR 285, Passo Fundo, Rio Grande Do Sul, CEP: 99001-970, Brazil
- Lucimara Bragagnolo
  - Graduate Program in Civil and Environmental Engineering, University of Passo Fundo (UPF), Campus I, Km 171, BR 285, Passo Fundo, Rio Grande Do Sul, CEP: 99001-970, Brazil
- Francisco Dalla Rosa
  - Graduate Program in Civil and Environmental Engineering, University of Passo Fundo (UPF), Campus I, Km 171, BR 285, Passo Fundo, Rio Grande Do Sul, CEP: 99001-970, Brazil
- Antonio Thomé
  - Graduate Program in Civil and Environmental Engineering, University of Passo Fundo (UPF), Campus I, Km 171, BR 285, Passo Fundo, Rio Grande Do Sul, CEP: 99001-970, Brazil
|
12
|
Rao W, Qian X, Fan Y, Liu T. A soft sensor for simulating algal cell density based on dynamic response to environmental changes in a eutrophic shallow lake. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 868:161543. [PMID: 36640876 DOI: 10.1016/j.scitotenv.2023.161543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 01/07/2023] [Accepted: 01/07/2023] [Indexed: 06/17/2023]
Abstract
There is a great need for timely monitoring and rapid water quality assessment to control the algal blooms that often occur in eutrophic lakes. While algal cell density (ACD) is a critical indicator of algal growth, field monitoring is laborious and time-consuming, and rapid assessment of algal blooms based on ACD is often not possible. To address the limitations of conventional ACD detection, we proposed a soft sensor approach that uses surrogate indicators to simulate ACD in machine learning models. We conducted a case study using monitoring data from Chaohu Lake collected between 2016 and 2019. By comparing various machine learning algorithms, we found that ensemble learning models, especially extreme gradient boosting (XGBoost), outperformed traditional machine learning algorithms. Also, considering the influence of input variable selection on model performance, we combined the results of different filter methods in the multi-stage variable selection process. Finally, we selected seven key variables from the 43 initial candidate variables: dissolved oxygen (DO), chlorophyll-a (Chl-a), Secchi disk depth (SD), pH, permanganate index (CODMn), week of the year (WOY), and wind velocity (WV). Their inclusion substantially improved data accessibility and supported the development of a rapid simulation model. The final model was capable of reliable spatiotemporal generalization, with an overall R2 value of 0.761. On the theoretical side, our study makes a new attempt to simulate ACD values in a eutrophic lake. For practical purposes, the soft sensor can facilitate the rapid assessment of bloom conditions, which helps the local administration with emergency prevention and control.
Affiliation(s)
- Wenxin Rao
  - State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Xin Qian
  - State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China; Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing 210044, China
- Yifan Fan
  - State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing 210023, China
- Tong Liu
  - Faculty of Environmental Earth Science, Hokkaido University, Sapporo 060-0810, Japan
|
13
|
Bieroza M, Acharya S, Benisch J, ter Borg RN, Hallberg L, Negri C, Pruitt A, Pucher M, Saavedra F, Staniszewska K, van’t Veen SGM, Vincent A, Winter C, Basu NB, Jarvie HP, Kirchner JW. Advances in Catchment Science, Hydrochemistry, and Aquatic Ecology Enabled by High-Frequency Water Quality Measurements. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:4701-4719. [PMID: 36912874 PMCID: PMC10061935 DOI: 10.1021/acs.est.2c07798] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/21/2022] [Revised: 03/03/2023] [Accepted: 03/03/2023] [Indexed: 06/18/2023]
Abstract
High-frequency water quality measurements in streams and rivers have expanded in scope and sophistication during the last two decades. Existing technology allows in situ automated measurements of water quality constituents, including both solutes and particulates, at unprecedented frequencies from seconds to subdaily sampling intervals. This detailed chemical information can be combined with measurements of hydrological and biogeochemical processes, bringing new insights into the sources, transport pathways, and transformation processes of solutes and particulates in complex catchments and along the aquatic continuum. Here, we summarize established and emerging high-frequency water quality technologies, outline key high-frequency hydrochemical data sets, and review scientific advances in key focus areas enabled by the rapid development of high-frequency water quality measurements in streams and rivers. Finally, we discuss future directions and challenges for using high-frequency water quality measurements to bridge scientific and management gaps by promoting a holistic understanding of freshwater systems and catchment status, health, and function.
Affiliation(s)
- Magdalena Bieroza
  - Department of Soil and Environment, SLU, Box 7014, Uppsala 750 07 Sweden
- Suman Acharya
  - Department of Environment and Genetics, School of Agriculture, Biomedicine and Environment, La Trobe University, Albury/Wodonga Campus, Victoria 3690, Australia
- Jakob Benisch
  - Institute for Urban Water Management, TU Dresden, Bergstrasse 66, Dresden 01068, Germany
- Lukas Hallberg
  - Department of Soil and Environment, SLU, Box 7014, Uppsala 750 07 Sweden
- Camilla Negri
  - Environment Research Centre, Teagasc, Johnstown Castle, Wexford Y35 Y521, Ireland
  - The James Hutton Institute, Craigiebuckler, Aberdeen AB15 8QH, United Kingdom
  - School of Archaeology, Geography and Environmental Science, University of Reading, Whiteknights, Reading RG6 6AB, United Kingdom
- Abagael Pruitt
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Matthias Pucher
  - Institute of Hydrobiology and Aquatic Ecosystem Management, Vienna University of Natural Resources and Life Sciences, Gregor Mendel Straße 33, Vienna 1180, Austria
- Felipe Saavedra
  - Department for Catchment Hydrology, Helmholtz Centre for Environmental Research - UFZ, Theodor-Lieser-Straße 4, Halle (Saale) 06120, Germany
- Kasia Staniszewska
  - Department of Earth and Atmospheric Sciences, University of Alberta, Edmonton, Alberta T6G 2E3, Canada
- Sofie G. M. van’t Veen
  - Department of Ecoscience, Aarhus University, Aarhus 8000, Denmark
  - Envidan A/S, Silkeborg 8600, Denmark
- Anna Vincent
  - Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, United States
- Carolin Winter
  - Environmental Hydrological Systems, University of Freiburg, Friedrichstraße 39, Freiburg 79098, Germany
  - Department of Hydrogeology, Helmholtz Centre for Environmental Research - UFZ, Permoserstr. 15, Leipzig 04318, Germany
- Nandita B. Basu
  - Department of Civil and Environmental Engineering and Department of Earth and Environmental Sciences, and Water Institute, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Helen P. Jarvie
  - Water Institute and Department of Geography and Environmental Management, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- James W. Kirchner
  - Department of Environmental System Sciences, ETH Zurich, Zurich CH-8092, Switzerland
  - Swiss Federal Research Institute WSL, Birmensdorf CH-8903, Switzerland
|
14
|
Onwuka IS, Scinto LJ, Fugate DC. High-Resolution Estimation of Suspended Solids and Particulate Phosphorus Using Acoustic Devices in a Hydrologically Managed Canal in South Florida, USA. SENSORS (BASEL, SWITZERLAND) 2023; 23:2281. [PMID: 36850879 PMCID: PMC9960507 DOI: 10.3390/s23042281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 02/13/2023] [Accepted: 02/15/2023] [Indexed: 06/18/2023]
Abstract
Conventional methods of measuring total suspended sediments (TSS) and total particulate phosphorus (TPP) are typically low-resolution and miss critical processes that impact their exports in aquatic environments. To create high-resolution TSS and TPP estimates, echo intensity (EI), a byproduct of velocity measurements from acoustic devices, was utilized. An acoustic Doppler velocimeter (ADV) and an acoustic Doppler current profiler (ADCP) were deployed in three locations in the L-29 Canal in South Florida, USA, to obtain estimates near the canal bed and in the water column, respectively. Corrections for transmission losses from the ADCP proved unnecessary due to the low vertical variability in the measured EI. EI calibrations were performed using artificially created TSS obtained from bed sediments (ADV) and gravimetrically measured TSS from water samples that matched the depths and times of the ADCP deployments. The measured TSS values were then analyzed for total phosphorus and converted to TPP estimates. The results showed that high TSS and TPP were caused by the rapid discharge releases typical of managed canals. This work demonstrates that high-resolution estimates are imperative for assessing the effects of such swift hydrologic changes on the potential export of sediments and nutrients to delicate ecosystems downstream.
Affiliation(s)
- Ikechukwu S. Onwuka
  - Department of Earth and Environment, Florida International University, Miami, FL 33199, USA
  - Institute of Environment, Florida International University, Miami, FL 33199, USA
- Leonard J. Scinto
  - Department of Earth and Environment, Florida International University, Miami, FL 33199, USA
  - Institute of Environment, Florida International University, Miami, FL 33199, USA
- David C. Fugate
  - Department of Marine and Earth Sciences, Florida Gulf Coast University, 10501 FGCU Blvd. South, Fort Myers, FL 33965, USA
|
15
|
Paepae T, Bokoro PN, Kyamakya K. Data Augmentation for a Virtual-Sensor-Based Nitrogen and Phosphorus Monitoring. SENSORS (BASEL, SWITZERLAND) 2023; 23:1061. [PMID: 36772100 PMCID: PMC9920320 DOI: 10.3390/s23031061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 01/06/2023] [Accepted: 01/16/2023] [Indexed: 06/18/2023]
Abstract
To better control eutrophication, reliable and accurate information on phosphorus and nitrogen loading is desired. However, the high-frequency monitoring of these variables is economically impractical. This necessitates using virtual sensing to predict them by utilizing easily measurable variables as inputs. While the predictive performance of these data-driven, virtual-sensor models depends on the use of adequate training samples (in quality and quantity), the procurement and operational cost of nitrogen and phosphorus sensors make it impractical to acquire sufficient samples. For this reason, the variational autoencoder, which is one of the most prominent methods in generative models, was utilized in the present work for generating synthetic data. The generation capacity of the model was verified using water-quality data from two tributaries of the River Thames in the United Kingdom. Compared to the current state of the art, our novel data augmentation (including proper experimental settings or hyperparameter optimization) improved the root mean squared errors by 23-63%, with the most significant improvements observed when up to three predictors were used. In comparing the predictive algorithms' performances (in terms of the predictive accuracy and computational cost), k-nearest neighbors and extremely randomized trees were the best-performing algorithms on average.
Affiliation(s)
- Thulane Paepae
  - Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
- Pitshou N. Bokoro
  - Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
- Kyandoghere Kyamakya
  - Institute for Smart Systems Technologies, Transportation Informatics, Alpen-Adria Universität Klagenfurt, 9020 Klagenfurt, Austria
  - Faculté Polytechnique, Université de Kinshasa, P.O. Box 127, Kinshasa XI, Democratic Republic of the Congo
|
16
|
The Use of Random Forest Regression for Estimating Leaf Nitrogen Content of Oil Palm Based on Sentinel 1-A Imagery. INFORMATION 2022. [DOI: 10.3390/info14010010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2022]
Abstract
For obtaining a spatial map of the distribution of nitrogen nutrients from oil palm plantations, a quite complex Leaf Sampling Unit (LSU) is required. In addition, sample analysis in the laboratory is time consuming and quite expensive, especially for large plantation areas. Monitoring the nutrition of oil palm plants can be achieved using remote-sensing technology. The main obstacles of using passive sensors in multispectral imagery are cloud cover and shadow noise. This research used C-SAR Sentinel equipped with active sensors that can overcome cloud barriers. A model to estimate leaf nitrogen nutrient status was constructed using random forest regression (RFR) based on multiple polarization (VV-VH) and local incidence angle (LIA) data on Sentinel-1A imagery. A sample of 1116 LSU data from different islands (i.e., Sumatra, Java, and Kalimantan) was used to develop the proposed estimation model. The performance evaluation of the model obtained the averaged MAPE, correctness, and MSE of 9.68%, 90.32% and 11.03%, respectively. Spatial maps of the distribution of nitrogen values in certain oil palm areas can be produced and visualized on the web so that they can be accessed easily and quickly for various purposes of oil palm management such as fertilization planning, recommendations, and monitoring.
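The reported MAPE (9.68%) and "correctness" (90.32%) sum to 100%, which is consistent with correctness being defined as 100 minus MAPE. The abstract does not state this definition, so the sketch below treats it as an assumption; the leaf nitrogen values are hypothetical and only illustrate the arithmetic.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def correctness(observed, predicted):
    """Assumed definition (not confirmed by the abstract): 100 - MAPE."""
    return 100.0 - mape(observed, predicted)


# Hypothetical leaf nitrogen contents, for illustration only.
observed = [2.50, 2.40, 2.60, 2.80]
predicted = [2.25, 2.64, 2.60, 2.52]

print(f"MAPE: {mape(observed, predicted):.2f}%")
print(f"Correctness: {correctness(observed, predicted):.2f}%")
```

Because MAPE divides by the observed value, it is undefined when any observation is zero, which is not a concern for leaf nitrogen content but matters for other targets.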
|
17
|
A Computational Framework for Design and Optimization of Risk-Based Soil and Groundwater Remediation Strategies. Processes (Basel) 2022. [DOI: 10.3390/pr10122572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
Soil and groundwater systems have natural attenuation potential to degrade or detoxify contaminants due to biogeochemical processes. However, such potential is rarely incorporated into active remediation strategies, leading to over-remediation at many remediation sites. Here, we propose a framework for designing and searching optimal remediation strategies that fully consider the combined effects of active remediation strategies and natural attenuation potentials. The framework integrates machine-learning and process-based models for expediting the optimization process with its applicability demonstrated at a field site contaminated with arsenic (As). The process-based model was employed in the framework to simulate the evolution of As concentrations by integrating geochemical and biogeochemical processes in soil and groundwater systems under various scenarios of remedial activities. The simulation results of As concentration evolution, remedial activities, and associated remediation costs were used to train a machine learning model, random forest regression, with a goal to establish a relationship between the remediation inputs, outcomes, and associated cost. The relationship was then used to search for optimal (low cost) remedial strategies that meet remediation constraints. The strategy was successfully applied at the field site, and the framework provides an effective way to search for optimal remediation strategies at other remediation sites.
|
18
|
Paepae T, Bokoro PN, Kyamakya K. A Virtual Sensing Concept for Nitrogen and Phosphorus Monitoring Using Machine Learning Techniques. SENSORS (BASEL, SWITZERLAND) 2022; 22:7338. [PMID: 36236438 PMCID: PMC9572788 DOI: 10.3390/s22197338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2022] [Revised: 09/20/2022] [Accepted: 09/24/2022] [Indexed: 06/16/2023]
Abstract
Harmful cyanobacterial bloom (HCB) is problematic for drinking water treatment, and some of its strains can produce toxins that significantly affect human health. To better control eutrophication and HCB, catchment managers need to continuously keep track of nitrogen (N) and phosphorus (P) in the water bodies. However, the high-frequency monitoring of these water quality indicators is not economical. In these cases, machine learning techniques may serve as viable alternatives since they can learn directly from the available surrogate data. In the present work, a random forest, extremely randomized trees (ET), extreme gradient boosting, k-nearest neighbors, a light gradient boosting machine, and bagging regressor-based virtual sensors were used to predict N and P in two catchments with contrasting land uses. The effect of data scaling and missing value imputation were also assessed, while the Shapley additive explanations were used to rank feature importance. A specification book, sensitivity analysis, and best practices for developing virtual sensors are discussed. Results show that ET, MinMax scaler, and a multivariate imputer were the best predictive model, scaler, and imputer, respectively. The highest predictive performance, reported in terms of R2, was 97% in the rural catchment and 82% in an urban catchment.
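The min-max scaling named in this abstract can be illustrated with a minimal sketch. It rescales each column to the range [0, 1] using x' = (x - min) / (max - min). This is a plain-Python illustration of the technique, not the authors' pipeline or a library implementation; the two surrogate columns (pH and turbidity) and their values are hypothetical, and mapping constant columns to 0.0 is a choice made here.

```python
def min_max_scale(rows):
    """Rescale each column of a list of rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        # A constant column has zero span; map it to 0.0 to avoid division by zero.
        [(value - low) / span if span else 0.0
         for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


# Hypothetical surrogate measurements: [pH, turbidity].
samples = [[7.0, 10.0], [8.0, 30.0], [7.5, 20.0]]
scaled = min_max_scale(samples)
print(scaled)
```

In practice the minimum and maximum would be taken from the training split only and then reused for the test split, so that no information leaks from the evaluation data.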
Affiliation(s)
- Thulane Paepae
  - Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
- Pitshou N. Bokoro
  - Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
- Kyandoghere Kyamakya
  - Institute for Smart Systems Technologies, Transportation Informatics, Alpen-Adria Universität Klagenfurt, 9020 Klagenfurt, Austria
|
19
|
Virro H, Kmoch A, Vainu M, Uuemaa E. Random forest-based modeling of stream nutrients at national level in a data-scarce region. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 840:156613. [PMID: 35700783 DOI: 10.1016/j.scitotenv.2022.156613] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/23/2022] [Revised: 05/12/2022] [Accepted: 06/07/2022] [Indexed: 06/15/2023]
Abstract
Nutrient runoff from agricultural production is one of the main causes of water quality deterioration in river systems and coastal waters. Water quality modeling can be used for gaining insight into water quality issues in order to implement effective mitigation efforts. Process-based nutrient models are very complex, requiring many input parameters and computationally expensive calibration. Recently, ML approaches have been shown to achieve an accuracy comparable to that of process-based models and even to outperform them when describing nonlinear relationships. We used observations from 242 Estonian catchments, amounting to 469 yearly TN and 470 TP measurements covering the period 2016-2020, to train random forest (RF) models for predicting annual N and P concentrations. We used a total of 82 predictor variables, including land cover, soil, climate and topography parameters, and applied a feature selection strategy to reduce the number of dependent features in the models. The SHAP method was used for deriving the most relevant predictors. The performance of our models is comparable to previous process-based models used in the Baltic region, with the TN and TP models having R2 scores of 0.83 and 0.52, respectively. However, as the input data used in our models are easier to obtain, the models offer superior applicability in areas where data availability is insufficient for process-based approaches. Therefore, the models provide a robust estimate of nutrient losses at the national level and capture the spatial variability of nutrient runoff, which in turn supports decision-making for regional water management plans.
Affiliation(s)
- Holger Virro
  - Department of Geography, Institute of Ecology and Earth Sciences, University of Tartu, Vanemuise 46, Tartu 51003, Estonia
- Alexander Kmoch
  - Department of Geography, Institute of Ecology and Earth Sciences, University of Tartu, Vanemuise 46, Tartu 51003, Estonia
- Marko Vainu
  - Institute of Ecology, Tallinn University, Uus-Sadama 5, Tallinn 10120, Estonia
- Evelyn Uuemaa
  - Department of Geography, Institute of Ecology and Earth Sciences, University of Tartu, Vanemuise 46, Tartu 51003, Estonia
|
20
|
Behrouz MS, Yazdi MN, Sample DJ. Using Random Forest, a machine learning approach to predict nitrogen, phosphorus, and sediment event mean concentrations in urban runoff. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2022; 317:115412. [PMID: 35649331 DOI: 10.1016/j.jenvman.2022.115412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 05/22/2022] [Accepted: 05/24/2022] [Indexed: 06/15/2023]
Abstract
Estimating pollutant loads from developed watersheds is vitally important to reduce nonpoint source pollution from urban areas, as a key tool in meeting water quality goals is the implementation of Stormwater Control Measures (SCMs). SCMs are selected and sized based on influent pollutant loads. A common method used to estimate pollutant loads in urban runoff is the Event Mean Concentration (EMC) method. In this study, we develop and apply data-driven models using Random Forest (RF), a machine learning approach, to predict Total Nitrogen (TN), Total Phosphorus (TP), Total Suspended Solids (TSS), and Ortho-Phosphorus (Ortho-P) EMCs in urban runoff. The parameters considered in this study were climatological characteristics (i.e., Antecedent Dry Period or ADP, Precipitation Depth or P, Duration or D, and Intensity or I) and catchment characteristics, including land use-related parameters (Imperviousness or Imp, Saturated Hydraulic Conductivity or Ksat, and Available Water Capacity or AWC) and site-specific parameters (Slope or S, and Catchment Size or A). Stormwater quality data for this study were obtained from the National Stormwater Quality Database (NSQD), which is the largest repository of stormwater quality data in the U.S. Results demonstrate that land use-related characteristics (i.e., Imp, Ksat, and AWC) were the most effective variables for predicting all EMCs. For TP, TSS, and Ortho-P, site-specific characteristics (S and A) had a greater effect than climatological characteristics (i.e., ADP, P, D, and I). However, for TN, climatological characteristics had a greater effect than site-specific characteristics (S and A). In addition, for TN, TP, and TSS, precipitation characteristics (P, D, and I) were found to be more effective parameters for estimating EMCs than ADP. This study highlights the most influential parameters affecting EMCs, which can be used by stakeholders and SCM designers to improve estimates of nutrient and sediment EMCs. The selection and design of the highest performing SCMs is essential in achieving effective treatment of stormwater, attaining water quality goals, and protecting downstream waterbodies.
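The abstract refers to the Event Mean Concentration without defining it. In its standard form, the EMC is the total pollutant mass discharged during a storm event divided by the total runoff volume, i.e. a flow-weighted mean concentration. The sketch below computes it for samples taken at equal time intervals; the function name, sampling interval, and values are hypothetical and are not taken from the study.

```python
def event_mean_concentration(concentrations, flows, dt_seconds):
    """Flow-weighted mean concentration over one runoff event.

    concentrations: pollutant concentration at each sample time (e.g. mg/L)
    flows: runoff flow rate at each sample time (e.g. m^3/s)
    dt_seconds: constant time interval represented by each sample
    The result has the same unit as the concentrations.
    """
    if len(concentrations) != len(flows):
        raise ValueError("concentrations and flows must have equal length")
    volumes = [q * dt_seconds for q in flows]
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total runoff volume must be positive")
    mass = sum(c * v for c, v in zip(concentrations, volumes))
    return mass / total_volume


# Hypothetical storm event sampled every 10 minutes.
tn_mg_per_l = [4.0, 2.0, 1.0]
flow_m3_per_s = [0.5, 1.0, 0.5]
emc = event_mean_concentration(tn_mg_per_l, flow_m3_per_s, dt_seconds=600)
print(f"EMC: {emc:.2f} mg/L")
```

The flow weighting is the reason the EMC differs from a simple average of the sampled concentrations: samples taken at peak flow represent more runoff volume and therefore count for more.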
Affiliation(s)
- Mina Shahed Behrouz
  - Department of Biological System Engineering, Virginia Polytechnic Institute and State University, Seitz Hall, 155 Ag-Quad Ln, Blacksburg, VA, 24060, United States; Hampton Roads Agricultural Research and Extension Center, Virginia Polytechnic and State University, 1444 Diamond Springs Rd, Virginia Beach, VA, 23455, United States
- Mohammad Nayeb Yazdi
  - Department of Biological System Engineering, Virginia Polytechnic Institute and State University, Seitz Hall, 155 Ag-Quad Ln, Blacksburg, VA, 24060, United States; Hampton Roads Agricultural Research and Extension Center, Virginia Polytechnic and State University, 1444 Diamond Springs Rd, Virginia Beach, VA, 23455, United States
- David J Sample
  - Department of Biological System Engineering, Virginia Polytechnic Institute and State University, Seitz Hall, 155 Ag-Quad Ln, Blacksburg, VA, 24060, United States
|
21
|
Xu X, Yu J, Wang F. Analysis of ecosystem service drivers based on interpretive machine learning: a case study of Zhejiang Province, China. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:64060-64076. [PMID: 35469384 DOI: 10.1007/s11356-022-20311-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Accepted: 04/13/2022] [Indexed: 06/14/2023]
Abstract
A systematic understanding of the driving mechanisms of ecosystem services (ESs) and the relationships among them is critical for successful ecosystem management. However, the impact of driving factors on the relationships between ESs and the formation of ecosystem service bundles (ESBs) remains unclear. To address this gap, we developed a modeling process that used random forest (RF) to model the ESs and ESBs of Zhejiang Province, China, in regression and classification mode, respectively, and the Shapley Additive Explanations (SHAP) method to interpret the underlying driving forces. We first mapped the spatial distribution of seven ESs in Zhejiang Province at a 1 × 1 km spatial resolution and then used the K-means clustering algorithm to obtain four ESBs. Combining the RF models with SHAP analysis, the results showed that each ES had key driving factors, and the relationships of synergy and trade-off between ESs were determined by the driving direction and intensity of the key factors. The driving factors affect the relationships of ESs and consequently affect the formation of ESBs. Thus, managing the dominant drivers is key to improving the supply capacity of ESs.
Affiliation(s)
- Xiaohang Xu
  - College of Environmental & Resource Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Jie Yu
  - Zhejiang Environmental Monitoring Center, Hangzhou, 310012, Zhejiang, China
- Feier Wang
  - College of Environmental & Resource Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
  - Zhejiang Ecological Civilization Academy, Anji, 313300, Zhejiang, China
|
22
|
Chen S, Zhang Z, Lin J, Huang J. Machine learning-based estimation of riverine nutrient concentrations and associated uncertainties caused by sampling frequencies. PLoS One 2022; 17:e0271458. [PMID: 35830456 PMCID: PMC9278742 DOI: 10.1371/journal.pone.0271458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Accepted: 06/30/2022] [Indexed: 11/23/2022]
Abstract
Accurate and sufficient water quality data is essential for watershed management and sustainability. Machine learning models have shown great potential for estimating water quality with the development of online sensors. However, accurate estimation is challenging because of uncertainties related to the models used and the data input. In this study, random forest (RF), support vector machine (SVM), and back-propagation neural network (BPNN) models are developed with three sampling frequency datasets (i.e., 4-hourly, daily, and weekly) and five conventional indicators (i.e., water temperature (WT), hydrogen ion concentration (pH), electrical conductivity (EC), dissolved oxygen (DO), and turbidity (TUR)) as surrogates to individually estimate riverine total phosphorus (TP), total nitrogen (TN), and ammonia nitrogen (NH4+-N) in a small-scale coastal watershed. The results show that the RF model outperforms the SVM and BPNN models in terms of estimative performance, explaining much of the variation in TP (79 ± 1.3%), TN (84 ± 0.9%), and NH4+-N (75 ± 1.3%) when using the 4-hourly sampling frequency dataset. A higher sampling frequency helped the RF obtain significantly better performance for the three nutrient estimates (4-hourly > daily > weekly) in terms of R2 and NSE values. WT, EC, and TUR were the three key input indicators for nutrient estimations in RF. Our study highlights the importance of high-frequency data as input to machine learning model development. The RF model is shown to be viable for riverine nutrient estimation in small-scale watersheds that are important for local water security.
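The abstract reports both R2 and NSE (Nash-Sutcliffe efficiency) without giving formulas. A minimal sketch of the two metrics follows, assuming R2 is the squared Pearson correlation (the abstract does not say which definition was used) and taking NSE in its standard form, 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). The values are hypothetical and chosen to show how the two metrics can disagree.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum


def r_squared(observed, simulated):
    """Squared Pearson correlation (one common meaning of R2; assumed here)."""
    mean_obs = sum(observed) / len(observed)
    mean_sim = sum(simulated) / len(simulated)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(observed, simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_sim = sum((s - mean_sim) ** 2 for s in simulated)
    return cov * cov / (var_obs * var_sim)


# Hypothetical total phosphorus series; the simulation has a constant +0.5 bias.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.5, 3.5, 4.5]

print(f"R2:  {r_squared(observed, simulated):.3f}")
print(f"NSE: {nse(observed, simulated):.3f}")
```

With a constant bias the squared correlation stays at its maximum while NSE drops, which is one reason to report both metrics when evaluating a surrogate-based estimator.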
Affiliation(s)
- Shengyue Chen
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, Xiamen University, Xiamen, China
- Zhenyu Zhang
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, Xiamen University, Xiamen, China
- Juanjuan Lin
  - Xiamen Environmental Publicity and Education Center, Xiamen, China
- Jinliang Huang
  - Fujian Key Laboratory of Coastal Pollution Prevention and Control, Xiamen University, Xiamen, China
|
23
|
Guo C, Cui Y. Machine learning exhibited excellent advantages in the performance simulation and prediction of free water surface constructed wetlands. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2022; 309:114694. [PMID: 35182978 DOI: 10.1016/j.jenvman.2022.114694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2021] [Revised: 01/19/2022] [Accepted: 02/06/2022] [Indexed: 06/14/2023]
Abstract
Optimizing the design and operation parameters of free water surface constructed wetlands (FWS CWs) in runoff regulation and wastewater treatment is necessary to improve the comprehensive performance. In this study, nine machine learning (ML) algorithms were successfully developed to optimize the parameter combinations for FWS CWs. The scale effect of surface area on wetland performance was determined based on consistently smaller predictions (-6.2% to -28.9%) of the nine well-established ML algorithms. The models most suitable for FWS CW performance simulation and prediction were random forest and extra trees algorithms because of their high R² values (0.818 in both) with the training set and low mean absolute relative errors (4.7% and 3.8%, respectively) with the test set. Results from feature analysis of the six tree-based algorithms emphasized the importance of water depth and layout of inlet and outlet, and revealed the negligible effect of the aspect ratio. Feature importance and partial dependence analysis enhanced the interpretability of the tree-based algorithms. The proposed ML algorithms enabled the implementation of an extended scenario at a low cost in real time. Therefore, ML algorithms are suitable for expressing the complex and uncertain effects of the design and operation parameters on the performance of FWS CWs. Acquiring datasets consisting of more extensive, uniform, and unbiased parameter combinations is crucial for developing more robust and practical ML algorithms for the optimal design of FWS CWs.
Affiliation(s)
- Changqiang Guo
  - Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing, 210008, China; Key Laboratory of Basin Water Resources and Eco-Environmental Science in Hubei Province, Changjiang River Scientific Research Institute of Changjiang Water Resources Commission, Wuhan, 430010, China
- Yuanlai Cui
  - State Key Laboratory of Water Resource and Hydropower Engineering Science, Wuhan University, Wuhan, 430072, China
|
24
|
Wang Y, Chen L, Jiang Y, Yang X, Dai J, Dai X, Dong M, Yan Y. Salt sacrificial template strategy and in-situ growth of lamellar La(OH)3 on a novel PVDF foam for the simultaneous removal of phosphates and oil pollution without VOCs emission. Sep Purif Technol 2022. [DOI: 10.1016/j.seppur.2022.120681] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023]
|
25
|
Ranking of Basin-Scale Factors Affecting Metal Concentrations in River Sediment. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12062805] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/04/2023]
Abstract
River sediments often contain potentially harmful pollutants such as metals. Much research has been conducted to identify factors involved in sediment concentrations of metals. While most metal pollution studies focus on smaller scales, it has been shown that basin-scale parameters are powerful predictors of river water quality. The present study focused on basin-scale factors of metal concentrations in river sediments. The study was performed on the contiguous USA using Random Forest (R.F.) to analyze the importance of different factors of the metal pollution potential of river sediments and evaluate the possibility of assessing this potential from basin characteristics. Results indicated that the most important factors belonged to the groups Geology, Dams, and Land cover. Rock characteristics (contents of K2O, CaO, and SiO2) and reservoir drainage area were strong factors. Vegetation indices were more important than land cover types. The response of different metals to basin-scale factors varied greatly. The R.F. models performed well with prediction errors of 16.5% to 28.1%, showing that basin-scale parameters hold sufficient information for predicting potential metal concentrations. The results contribute to research and policymaking dependent on understanding large-scale factors of metal pollution.
|
26
|
Paepae T, Bokoro PN, Kyamakya K. From Fully Physical to Virtual Sensing for Water Quality Assessment: A Comprehensive Review of the Relevant State-of-the-Art. SENSORS (BASEL, SWITZERLAND) 2021; 21:6971. [PMID: 34770278 PMCID: PMC8587795 DOI: 10.3390/s21216971] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/30/2021] [Revised: 10/17/2021] [Accepted: 10/17/2021] [Indexed: 12/17/2022]
Abstract
Rapid urbanization, industrial development, and climate change have caused water pollution and the deterioration of surface water and groundwater quality at an alarming rate, making quick, accurate, and inexpensive detection imperative. Despite the latest developments in sensor technologies, real-time determination of certain parameters remains difficult or uneconomical. In such cases, data-derived virtual sensors can be an effective alternative. In this paper, the feasibility of virtual sensing for water quality assessment is reviewed. The review focuses on an overview of key water quality parameters for a particular use case and the development of the corresponding cost estimates for their monitoring. By interrogating relevant literature published between 2001 and 2021, the review further evaluates the current state-of-the-art in terms of the modeling approaches used, the parameters studied, and whether the inputs were pre-processed. The review identified artificial neural networks, random forest, and multiple linear regression as the dominant machine learning techniques used for developing inferential models. The survey also highlights the need for a comprehensive virtual sensing system in an internet of things environment. Thus, the review formulates the specification book for an advanced water quality assessment process (one that involves a virtual sensing module) that can enable near real-time monitoring of water quality.
Affiliation(s)
- Thulane Paepae
  - Department of Mathematics and Applied Mathematics, University of Johannesburg, Doornfontein 2028, South Africa
- Pitshou N. Bokoro
  - Department of Electrical and Electronic Engineering Technology, University of Johannesburg, Doornfontein 2028, South Africa
- Kyandoghere Kyamakya
  - Institute for Smart Systems Technologies, Transportation Informatics Group, Alpen-Adria Universität Klagenfurt, 9020 Klagenfurt, Austria
|
27
|
Xu L, Wu R, Zhu X, Wang X, Geng X, Xiong Y, Chen T, Wen Y, Ai S. Intelligent analysis of maleic hydrazide using a simple electrochemical sensor coupled with machine learning. ANALYTICAL METHODS : ADVANCING METHODS AND APPLICATIONS 2021; 13:4662-4673. [PMID: 34546231 DOI: 10.1039/d1ay01261d] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/13/2023]
Abstract
A simple electrochemical sensing platform based on a low-cost disposable laser-induced porous graphene (LIPG) flexible electrode for the intelligent analysis of maleic hydrazide (MH) in potatoes and peanuts coupled with machine learning (ML) was successfully designed. The LIPG electrode was patterned by a simple one-step laser-induced procedure on commercial polyimide film using a computer-controlled direct laser writing micromachining system and displayed excellent flexibility, 3D porous structure, large specific surface area, and preferable conductivity. A data partitioning technique was proposed for the optimal MH concentration ranges by selecting the size of datasets, including the size of the training set and the size of the test set combined with the performance metrics of ML models. Different algorithms such as artificial neural networks (ANN), random forest (RF), and least squares support vector machine (LS-SVM) were selected to build the ML models. Three ML models were evaluated, and the LS-SVM model displayed unique superiority. Both the recoveries and RSD of practical application were further measured to assess the feasibility of the selected LS-SVM model. This will have important theoretical and practical significance for the intelligent analysis of harmful residuals in agro-product safety using an electrochemical sensing platform.
Affiliation(s)
- Lulu Xu
  - College of Software, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
  - Institute of Functional Materials and Agricultural Applied Chemistry, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
  - College of Engineering, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Ruimei Wu
  - College of Engineering, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Xiaoyu Zhu
  - Institute of Functional Materials and Agricultural Applied Chemistry, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Xiaoqiang Wang
  - Institute of Functional Materials and Agricultural Applied Chemistry, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Xiang Geng
  - College of Food Science and Engineering, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Yao Xiong
  - College of Software, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
  - Institute of Functional Materials and Agricultural Applied Chemistry, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Tao Chen
  - Institute of Functional Materials and Agricultural Applied Chemistry, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Yangping Wen
  - Institute of Functional Materials and Agricultural Applied Chemistry, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
- Shirong Ai
  - College of Software, Jiangxi Agricultural University, Nanchang 330045, People's Republic of China
|
28
|
Xu C, Chen X, Zhang L. Predicting river dissolved oxygen time series based on stand-alone models and hybrid wavelet-based models. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2021; 295:113085. [PMID: 34147993 DOI: 10.1016/j.jenvman.2021.113085] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 01/20/2021] [Revised: 05/17/2021] [Accepted: 06/13/2021] [Indexed: 06/12/2023]
Abstract
Accurate prediction of dissolved oxygen time series is important for improving the water environment and aiding water resource management. In this study, four stand-alone models, namely multiple linear regression (MLR), support vector machine (SVM), artificial neural network (ANN), and random forest (RF), and four hybrid models based on wavelet transform (WT), namely WT-MLR, WT-SVM, WT-ANN, and WT-RF, were used to predict the daily dissolved oxygen (DO) at 1-5-day lead times in the Dongjiang River Basin, China. To make the prediction robust, the maximal information coefficient (MIC) was used to capture comprehensive information between DO and explanatory variables. A 5-fold cross-validation grid search was used to optimize the parameters of the machine learning tools. Two WT frameworks were used to construct the hybrid models: a direct framework (i.e., only the explanatory variables were decomposed) and a multicomponent framework (i.e., both explanatory variables and target variables were decomposed). The results show that MIC extracts four optimal explanatory variables: previous DO, water temperature, air temperature, and air pressure. Four evaluation metrics, namely correlation coefficient (R), Nash-Sutcliffe efficiency (NSE), mean absolute error (MAE), and root mean square error (RMSE), indicate that the prediction accuracy decreases as the lead time increases from 1 to 5 days. Among the stand-alone models, the MLR model outperforms the other three, with higher NSE values of 0.616-0.921 and lower RMSE values of 0.503-1.111. Among the hybrid models, the WT-ANN and WT-MLR models exhibit higher performance, and the multicomponent framework performs better than the direct framework in all hybrid models. In general, the multicomponent framework of WT can improve the prediction accuracy of stand-alone models to a certain degree, while the direct framework shows no obvious advantage.
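The four evaluation metrics named in this abstract (R, NSE, MAE, RMSE) follow standard definitions, and a minimal sketch of how they can be computed is given here for orientation. This is not the authors' code: the function name `evaluate_predictions` and the toy observed/predicted values are assumptions chosen purely for illustration.

```python
import math


def evaluate_predictions(observed, predicted):
    """Return R, NSE, MAE and RMSE for paired observed/predicted values.

    Hypothetical helper for illustration; uses the textbook definitions.
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal length")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean absolute error and root mean square error
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Sums of squared deviations from the respective means
    ss_obs = sum((o - mean_obs) ** 2 for o in observed)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("R and NSE are undefined for constant series")

    # Nash-Sutcliffe efficiency: 1 minus residual variance over observed variance
    nse = 1.0 - sum(e * e for e in errors) / ss_obs

    # Pearson correlation coefficient
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    r = cov / math.sqrt(ss_obs * ss_pred)

    return {"R": r, "NSE": nse, "MAE": mae, "RMSE": rmse}


# Toy values (assumed, not from the study): a prediction with a constant bias of +1
scores = evaluate_predictions([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(scores)
```

In the toy case the prediction tracks the observations perfectly in shape, so R is 1, while the constant bias pushes NSE below zero. This shows why correlation alone can overstate model skill and why studies often report NSE and RMSE alongside R.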
Affiliation(s)
- Chuang Xu
  - Center for Water Resources and Environment Research, School of Civil Engineering, Sun Yat-sen University, Guangzhou, China
- Xiaohong Chen
  - Center for Water Resources and Environment Research, School of Civil Engineering, Sun Yat-sen University, Guangzhou, China
- Lilan Zhang
  - Center for Water Resources and Environment Research, School of Civil Engineering, Sun Yat-sen University, Guangzhou, China
|
29
|
Machine Learning-Based Prediction of Chlorophyll-a Variations in Receiving Reservoir of World’s Largest Water Transfer Project—A Case Study in the Miyun Reservoir, North China. WATER 2021. [DOI: 10.3390/w13172406] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
Although water transfer projects can alleviate the water crisis, they may pose risks to water quality safety in receiving areas. The Miyun Reservoir in northern China, one of the receiving reservoirs of the world’s largest water transfer project (South-to-North Water Transfer Project, SNWTP), was selected as a case study. Considering its potential eutrophication trend, two machine learning models, i.e., the support vector machine (SVM) model and the random forest (RF) model, were built to investigate the trophic state by predicting the variations of chlorophyll-a (Chl-a) concentrations, a typical indicator of eutrophication, in the reservoir after the implementation of SNWTP. The results showed that, compared with the SVM model, the RF model had higher prediction accuracy and more robust prediction ability with abnormal data, and was thus more suitable for predicting Chl-a concentration variations in the receiving reservoir. Additionally, short-term water transfer would not cause significant variations of Chl-a concentrations. After the project implementation, the impact of transferred water on the water quality of the receiving reservoir would gradually increase. After a 10-year implementation, transferred water would cause a significant decline in the receiving reservoir’s water quality, and Chl-a concentrations would increase, especially from July to August. This indicates a potential risk of trophic state change in the Miyun Reservoir that requires further attention from managers. This study can provide prediction techniques and advice on water quality security management associated with eutrophication risks resulting from water transfer projects.
|