1. Bolick MM, Post CJ, Naser MZ, Mikhailova EA. Comparison of machine learning algorithms to predict dissolved oxygen in an urban stream. Environmental Science and Pollution Research International 2023. [PMID: 37266780 DOI: 10.1007/s11356-023-27481-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/06/2022] [Accepted: 05/03/2023] [Indexed: 06/03/2023]
Abstract
Water quality monitoring for urban watersheds is critical to identify the negative impacts of urbanization. This study sought to identify a successful predictive machine learning model with minimal parameters from easy-to-deploy, low-cost sensors to create a monitoring system for the urban stream network, Hunnicutt Creek, in Clemson, SC, USA. A multiple linear regression model was compared to the machine learning algorithms k-nearest neighbor, decision tree, random forest, and gradient boosting. These algorithms were evaluated to understand which best predicted dissolved oxygen (DO) from water temperature, conductivity, turbidity, and water level change at four locations along the urban stream. The random forest algorithm had the highest performance in predicting DO for all four sites, with Nash-Sutcliffe model efficiency coefficient (NSE) scores > 0.9 at three sites and > 0.598 at the fourth site. The random forest model was further examined using explainable artificial intelligence (XAI), which showed that temperature influenced the DO predictions for three of the four sites, but that there were different water quality interactions depending on site location. Calculating the land cover type in each site's sub-watershed revealed that different amounts of impervious surface and vegetation influenced water quality and the resulting DO predictions. Overall, machine learning combined with land cover data helps decision-makers better understand the nuances of urban watersheds and the relationships between urban land cover and water quality.
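The abstract above scores models with the Nash-Sutcliffe model efficiency coefficient (NSE). As a minimal sketch of how that score is commonly computed (one minus the ratio of the squared prediction error to the variance of the observations around their mean), the following may help; the function name and the sample values are illustrative assumptions and are not taken from the study.

```python
def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot.

    1.0 means a perfect fit; 0.0 means the model is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observations have zero variance; NSE is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical dissolved oxygen readings in mg/L (not data from the paper).
observed_do = [7.9, 8.4, 6.8, 7.2, 9.1]
predicted_do = [8.0, 8.1, 7.0, 7.4, 8.8]
print(f"NSE = {nash_sutcliffe(observed_do, predicted_do):.3f}")
```

Under this definition, a score above 0.9 indicates that the residual error is less than a tenth of the variance in the observed series.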
Affiliation(s)
- Madeleine M Bolick: Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, 29634, USA
- Christopher J Post: Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, 29634, USA
- Mohannad-Zeyad Naser: Department of Civil and Environmental Engineering & Earth Sciences, Clemson University, Clemson, SC, 29634, USA
- Elena A Mikhailova: Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, 29634, USA
2. Reliability assessment of water quality index based on guidelines of national sanitation foundation in natural streams: integration of remote sensing and data-driven models. Artif Intell Rev 2021. [DOI: 10.1007/s10462-021-10007-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 12/07/2022]
3. A Review of the Artificial Neural Network Models for Water Quality Prediction. Applied Sciences-Basel 2020. [DOI: 10.3390/app10175776] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Indexed: 11/17/2022]
Abstract
Water quality prediction plays an important role in environmental monitoring, ecosystem sustainability, and aquaculture. Traditional prediction methods cannot capture the nonlinearity and non-stationarity of water quality well. In recent years, the rapid development of artificial neural networks (ANNs) has made them a hotspot in water quality prediction. We conducted an extensive investigation and analysis of ANN-based water quality prediction from three aspects, namely feedforward, recurrent, and hybrid architectures. Based on 151 papers published from 2008 to 2019, 23 types of water quality variables were highlighted. The variables were primarily collected by sensors, followed by specialist experimental equipment, such as a UV-visible photometer, as there is no mature sensor for measurement at present. Five different output strategies, namely Univariate-Input-Itself-Output, Univariate-Input-Other-Output, Multivariate-Input-Other(multi), Multivariate-Input-Itself-Other-Output, and Multivariate-Input-Itself-Other (multi)-Output, are summarized. From the results of the review, it can be concluded that the ANN models are capable of dealing with different modeling problems in rivers, lakes, reservoirs, wastewater treatment plants (WWTPs), groundwater, ponds, and streams. The results of many of the review articles are useful to researchers in prediction and similar fields. Several new architectures presented in the study, such as recurrent and hybrid structures, are able to improve the modeling quality of future development.
4. Mitrović T, Antanasijević D, Lazović S, Perić-Grujić A, Ristić M. Virtual water quality monitoring at inactive monitoring sites using Monte Carlo optimized artificial neural networks: A case study of Danube River (Serbia). The Science of the Total Environment 2019; 654:1000-1009. [PMID: 30453255 DOI: 10.1016/j.scitotenv.2018.11.189] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/15/2018] [Revised: 11/06/2018] [Accepted: 11/13/2018] [Indexed: 06/09/2023]
Abstract
Rationalization of water quality monitoring stations is nowadays applied in many countries. In some cases, missing spatial and temporal data from abandoned/inactive stations could be very important, hence the use of artificial neural networks (ANNs) for virtual water quality monitoring at inactive monitoring sites was investigated. The aim was to develop single-output and simultaneous ANNs for the spatial interpolation of 18 water quality parameters at single- and multi-inactive monitoring sites on the Danube River course through Serbia. Those different modeling approaches were considered in order to determine the most suitable combination of models. The variable selection and sensitivity analysis in the case of simultaneous models were performed using a modified procedure based on Monte Carlo Simulations (MCS). In general, the multi-target models tend to be more accurate than single-target ones, while single-output models outperform the simultaneous ones. Hence, for a particular monitoring network and set of water quality parameters, the optimal combination of models must be defined based on the models' accuracy and the computational effort needed. The MCS selection procedure has proved to be efficient only in the case of the simultaneous multi-target model. MCS-based analysis of input-output interactions has shown that all significant interactions in the case of the simultaneous single-target model are grouped as a complex cluster of interactions, where the majority of inputs influence several outputs. In the case of the multi-target model, those interactions were partitioned into five separate clusters, where the majority of them mimic the input-output interactions that are present in single-output models.
The modeling strategy for the study area was proposed on the basis of the performance of the created models (mean average percentage error < 10%): simultaneous multi-target model for pH, alkalinity, conductivity, hardness, dissolved oxygen, HCO₃⁻, SO₄²⁻ and Ca; single-output multi-target models for temperature and Cl⁻; simultaneous single-target models for Mg and CO₂; single-output single-target models for NO₃⁻.
Affiliation(s)
- Tatjana Mitrović: Jaroslav Cerni Institute for Development of Water Resources, Jaroslava Cernog 80, 11226 Belgrade, Serbia
- Davor Antanasijević: Innovation Center of the Faculty of Technology and Metallurgy, Karnegijeva 4, 11120 Belgrade, Serbia
- Saša Lazović: Institute of Physics Belgrade, University of Belgrade, Pregrevica 118, 11080 Belgrade, Serbia
- Aleksandra Perić-Grujić: University of Belgrade, Faculty of Technology and Metallurgy, Karnegijeva 4, 11120 Belgrade, Serbia
- Mirjana Ristić: University of Belgrade, Faculty of Technology and Metallurgy, Karnegijeva 4, 11120 Belgrade, Serbia
5. Adamović VM, Antanasijević DZ, Ćosović AR, Ristić MĐ, Pocajt VV. An artificial neural network approach for the estimation of the primary production of energy from municipal solid waste and its application to the Balkan countries. Waste Management (New York, N.Y.) 2018; 78:955-968. [PMID: 32559992 DOI: 10.1016/j.wasman.2018.07.012] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/21/2018] [Revised: 06/19/2018] [Accepted: 07/05/2018] [Indexed: 06/11/2023]
Abstract
Although the use of municipal solid waste to generate energy can decrease dependency on fossil fuels and consequently reduce greenhouse gas emissions and the areas that waste occupies, in many countries municipal solid waste is not recognized as a valuable resource and possible alternative fuel. The aim of this study is to develop a model for the prediction of primary energy production from municipal solid waste in the European countries and then to apply it to the Balkan countries in order to assess their potential in that field. For this purpose, a general regression neural network architecture was applied, and correlation and sensitivity analyses were used for optimisation of the model. The data for 16 countries from the European Union and Norway for the period 2006-2015 were used for the development of the model. The model with the best performance (coefficient of determination R² = 0.995 and mean absolute percentage error MAPE = 7.757%) was applied to the data for the Balkan countries from 2006 to 2015. The obtained results indicate that there is a significant potential for utilization of municipal solid waste for energy production, which should lead to substantial savings of fossil fuels, primarily lignite, which is the most common fossil fuel in the Balkans.
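The abstract above reports model quality as a mean absolute percentage error (MAPE). A minimal sketch of the usual definition, the mean of the absolute relative errors expressed in percent, is given below; the function name and the sample figures are hypothetical and do not come from the study.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any observed value is zero, so that case is rejected.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Hypothetical energy production figures in arbitrary units (not data from the paper).
observed_energy = [100.0, 200.0, 400.0]
predicted_energy = [110.0, 180.0, 400.0]
print(f"MAPE = {mape(observed_energy, predicted_energy):.2f}%")
```

Because each error is divided by the observed value, MAPE weights small observations heavily, which is worth keeping in mind when comparing countries of very different size.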
Affiliation(s)
- Vladimir M Adamović: Institute for Technology of Nuclear and Other Mineral Raw Materials, Bulevar Franše d'Eperea 86, 11000 Belgrade, Serbia
- Davor Z Antanasijević: University of Belgrade, Innovation Center of the Faculty of Technology and Metallurgy, Karnegijeva 4, 11120 Belgrade, Serbia
- Aleksandar R Ćosović: Institute for Technology of Nuclear and Other Mineral Raw Materials, Bulevar Franše d'Eperea 86, 11000 Belgrade, Serbia
- Mirjana Đ Ristić: University of Belgrade, Faculty of Technology and Metallurgy, Karnegijeva 4, 11120 Belgrade, Serbia
- Viktor V Pocajt: University of Belgrade, Faculty of Technology and Metallurgy, Karnegijeva 4, 11120 Belgrade, Serbia
6. Voza D, Vuković M. The assessment and prediction of temporal variations in surface water quality-a case study. Environmental Monitoring and Assessment 2018; 190:434. [PMID: 29951924 DOI: 10.1007/s10661-018-6814-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/22/2018] [Accepted: 06/18/2018] [Indexed: 06/08/2023]
Abstract
In order to optimize the processes of sampling, monitoring, and management, the initial aim of this paper was to develop a model for the definition and prediction of temporal changes of water quality. In the case of the Morava River Basin (Serbia), the patterns of temporal changes have been recognized by applying different multivariate statistical techniques. The results of the conducted cluster analysis are the indicators of the existence of the three monitoring periods: the low-water, transitional, and high-water periods, which is in accordance with changes in the water flow in the analyzed river basin. A possibility of reducing the initial data set and recognizing the main pollution sources was examined by carrying out the principal component/factor analysis. The results indicate that the natural factor has a dominant influence in temporal groups. In order to recognize the discriminatory water quality parameters, a discriminant analysis (DA) was carried out. Conducting the DA enabled a significant reduction in the data set by the extraction of two parameters (the water temperature and electrical conductivity). Furthermore, the artificial neural network technique was used for testing the possibility of predicting changes in the values of the discriminant factors in the monitoring periods. The reliability of this method for the prediction of temporal variations of both extracted parameters within all temporal clusters has been proven.
Affiliation(s)
- Danijela Voza: Technical Faculty in Bor, University of Belgrade, Vojske Jugoslavije 12, 19210, Bor, Serbia
- Milovan Vuković: Technical Faculty in Bor, University of Belgrade, Vojske Jugoslavije 12, 19210, Bor, Serbia
7. Šiljić Tomić A, Antanasijević D, Ristić M, Perić-Grujić A, Pocajt V. Application of experimental design for the optimization of artificial neural network-based water quality model: a case study of dissolved oxygen prediction. Environmental Science and Pollution Research International 2018; 25:9360-9370. [PMID: 29349736 DOI: 10.1007/s11356-018-1246-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 09/27/2017] [Accepted: 01/08/2018] [Indexed: 06/07/2023]
Abstract
This paper presents an application of experimental design for the optimization of an artificial neural network (ANN) for the prediction of dissolved oxygen (DO) content in the Danube River. The aim of this research was to obtain a more reliable ANN model that uses fewer monitoring records, by simultaneous optimization of the following model parameters: number of monitoring sites, number of historical monitoring data (expressed in years), and number of input water quality parameters used. A three-factor, three-level Box-Behnken experimental design was applied for simultaneous spatial, temporal, and input variables optimization of the ANN model. The prediction of DO was performed using a feed-forward back-propagation neural network (BPNN), while the selection of the most important inputs was done off-model using a multi-filter approach that combines a chi-square ranking in the first step with a correlation-based elimination in the second step. The contour plots of absolute and relative error response surfaces were utilized to determine the optimal values of the design factors. From the contour plots, two BPNN models that cover the entire Danube flow through Serbia are proposed: an upstream model (BPNN-UP) that covers 8 monitoring sites prior to Belgrade and uses 12 inputs measured in the 7-year period, and a downstream model (BPNN-DOWN) which covers 9 monitoring sites and uses 11 input parameters measured in the 6-year period. The main difference between the two models is that BPNN-UP utilizes inputs such as BOD, P, and PO₄³⁻, which is in accordance with the fact that this model covers the northern part of Serbia (Vojvodina Autonomous Province), which is well known for agricultural production and extensive use of fertilizers. Both models have shown very good agreement between measured and predicted DO (with R² ≥ 0.86) and demonstrated that they can effectively forecast DO content in the Danube River.
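The abstract above relies on a Box-Behnken design with three factors at three levels. As a rough sketch of what such a design looks like, the code below enumerates the standard coded runs: every pair of factors is set to its low and high level (-1, +1) while the third factor stays at its middle level (0), plus one or more center runs. The factor names follow the abstract, but the numeric levels in `LEVELS` are hypothetical placeholders, since the abstract does not state them.

```python
from itertools import combinations, product


def box_behnken_3(center_points=1):
    """Coded runs of a three-factor Box-Behnken design.

    Returns 12 edge-midpoint runs followed by `center_points` center runs.
    """
    runs = []
    for i, j in combinations(range(3), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0, 0, 0]          # the remaining factor stays at its middle level
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0, 0, 0)] * center_points)
    return runs


# Hypothetical (low, middle, high) levels; the abstract does not give the real ones.
LEVELS = {
    "n_sites": (5, 10, 15),
    "n_years": (3, 6, 9),
    "n_inputs": (6, 12, 18),
}


def decode(run):
    """Map a coded run (-1, 0, +1 per factor) to the assumed real levels."""
    return {name: values[code + 1] for (name, values), code in zip(LEVELS.items(), run)}


design = box_behnken_3()
for run in design:
    print(run, decode(run))
```

Each decoded run would correspond to one ANN configuration to train and score; fitting a response surface to those scores is what yields contour plots like the ones the abstract mentions.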
Affiliation(s)
- Aleksandra Šiljić Tomić: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Davor Antanasijević: Innovation Center of the Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Mirjana Ristić: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Aleksandra Perić-Grujić: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Viktor Pocajt: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
8. Ji X, Shang X, Dahlgren RA, Zhang M. Prediction of dissolved oxygen concentration in hypoxic river systems using support vector machine: a case study of Wen-Rui Tang River, China. Environmental Science and Pollution Research International 2017; 24:16062-16076. [PMID: 28537025 DOI: 10.1007/s11356-017-9243-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 12/17/2016] [Accepted: 05/09/2017] [Indexed: 06/07/2023]
Abstract
Accurate quantification of dissolved oxygen (DO) is critically important for managing water resources and controlling pollution. Artificial intelligence (AI) models have been successfully applied for modeling DO content in aquatic ecosystems with limited data. However, the efficacy of these AI models in predicting DO levels in hypoxic river systems having multiple pollution sources and complicated pollutant behaviors is unclear. Given this dilemma, we developed a promising AI model, known as support vector machine (SVM), to predict the DO concentration in a hypoxic river in southeastern China. Four different calibration models, specifically, multiple linear regression, back propagation neural network, general regression neural network, and SVM, were established, and their prediction accuracy was systematically investigated and compared. A total of 11 hydro-chemical variables were used as model inputs. These variables were measured bimonthly at eight sampling sites along the rural-suburban-urban portion of Wen-Rui Tang River from 2004 to 2008. The performances of the established models were assessed through the mean square error (MSE), determination coefficient (R²), and Nash-Sutcliffe (NS) model efficiency. The results indicated that the SVM model was superior to the other models in predicting DO concentration in Wen-Rui Tang River. For SVM, the MSE, R², and NS values for the testing subset were 0.9416 mg/L, 0.8646, and 0.8763, respectively. Sensitivity analysis showed that ammonium-nitrogen was the most significant input variable of the proposed SVM model. Overall, these results demonstrated that the proposed SVM model can efficiently predict water quality, especially for highly impaired and hypoxic river systems.
Affiliation(s)
- Xiaoliang Ji: Zhejiang Province Key Laboratory of Watershed Science and Health, Southern Zhejiang Water Research Institute (iWATER), Wenzhou Medical University, Wenzhou, 325035, China
- Xu Shang: Zhejiang Province Key Laboratory of Watershed Science and Health, Southern Zhejiang Water Research Institute (iWATER), Wenzhou Medical University, Wenzhou, 325035, China
- Randy A Dahlgren: Zhejiang Province Key Laboratory of Watershed Science and Health, Southern Zhejiang Water Research Institute (iWATER), Wenzhou Medical University, Wenzhou, 325035, China; Department of Land, Air and Water Resources, University of California, Davis, CA, 95616, USA
- Minghua Zhang: Zhejiang Province Key Laboratory of Watershed Science and Health, Southern Zhejiang Water Research Institute (iWATER), Wenzhou Medical University, Wenzhou, 325035, China; Department of Land, Air and Water Resources, University of California, Davis, CA, 95616, USA
9. Šiljić Tomić AN, Antanasijević DZ, Ristić MĐ, Perić-Grujić AA, Pocajt VV. Modeling the BOD of Danube River in Serbia using spatial, temporal, and input variables optimized artificial neural network models. Environmental Monitoring and Assessment 2016; 188:300. [PMID: 27094057 DOI: 10.1007/s10661-016-5308-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/28/2015] [Accepted: 04/14/2016] [Indexed: 06/05/2023]
Abstract
This paper describes the application of artificial neural network models for the prediction of biological oxygen demand (BOD) levels in the Danube River. Eighteen regularly monitored water quality parameters at 17 stations on the river stretch passing through Serbia were used as input variables. The optimization of the model was performed in three consecutive steps: firstly, the spatial influence of a monitoring station was examined; secondly, the monitoring period necessary to reach satisfactory performance was determined; and lastly, correlation analysis was applied to evaluate the relationship among water quality parameters. Root-mean-square error (RMSE) was used to evaluate model performance in the first two steps, whereas in the last step, multiple statistical indicators of performance were utilized. As a result, two optimized models were developed, a general regression neural network model (labeled GRNN-1) that covers the monitoring stations from the Danube inflow to the city of Novi Sad and a GRNN model (labeled GRNN-2) that covers the stations from the city of Novi Sad to the border with Romania. Both models demonstrated good agreement between the predicted and actually observed BOD values.
Affiliation(s)
- Aleksandra N Šiljić Tomić: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Davor Z Antanasijević: Innovation Center of the Faculty of Technology and Metallurgy, Karnegijeva 4, Belgrade, 11120, Serbia
- Mirjana Đ Ristić: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Aleksandra A Perić-Grujić: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia
- Viktor V Pocajt: Faculty of Technology and Metallurgy, University of Belgrade, Karnegijeva 4, Belgrade, 11120, Serbia