1
Liao Z, Zhang M, Chen Y, Zhang Z, Wang H. A "Prediction - Detection - Judgment" framework for sudden water contamination event detection with online monitoring. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 355:120496. [PMID: 38437742 DOI: 10.1016/j.jenvman.2024.120496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 02/16/2024] [Accepted: 02/22/2024] [Indexed: 03/06/2024]
Abstract
Contamination detection technology supports water quality management and protection in surface water. It is important to detect sudden contamination events in a timely manner against the dynamic variations that various interference factors cause in online water quality monitoring data. In this study, a framework named "Prediction - Detection - Judgment" is proposed and implemented with a method chain of "Time series increment - Hierarchical clustering - Bayes' theorem model". Time to detection is used as an evaluation index of contamination detection methods, along with the probability of detection and false alarm rate. The proposed method is tested with available public data and further applied at a river monitoring site. Results showed that the method could detect the contamination events with a 100% probability of detection, a 17% false alarm rate and a time to detection close to 4 monitoring intervals. The proposed time-to-detection index evaluates the timeliness of the method, and timely detection ensures that contamination events can be responded to and dealt with in time. The site application also demonstrates the feasibility and practicability of the proposed framework and its potential for extensive implementation.
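The three evaluation indices named in this abstract (probability of detection, false alarm rate, time to detection) can be illustrated with a minimal sketch. The definitions used here are assumptions for illustration only and are not taken from the paper: an event counts as detected if at least one alarm falls inside its interval, the false alarm rate is the share of alarms that fall outside every event, and time to detection is the mean number of monitoring intervals between an event's start and its first alarm. The function name `detection_metrics` and the sample data are hypothetical.

```python
from statistics import mean


def detection_metrics(events, alarms):
    """Compute (probability of detection, false alarm rate, time to detection).

    events: list of (start, end) inclusive time-step index pairs.
    alarms: list of time-step indices at which an alarm was raised.
    All three definitions are illustrative assumptions, not the paper's.
    """
    delays = []
    for start, end in events:
        hits = [a for a in alarms if start <= a <= end]
        if hits:
            # Delay from event start to the first alarm inside the event.
            delays.append(min(hits) - start)

    true_alarms = [a for a in alarms if any(s <= a <= e for s, e in events)]

    pd = len(delays) / len(events) if events else 0.0
    far = 1 - len(true_alarms) / len(alarms) if alarms else 0.0
    ttd = mean(delays) if delays else None
    return pd, far, ttd


# Hypothetical data: two contamination events and four alarms.
events = [(10, 20), (40, 50)]
alarms = [5, 14, 15, 43]
pd, far, ttd = detection_metrics(events, alarms)
print(pd, far, ttd)
```

With these made-up inputs, both events are detected, one of the four alarms is false, and the first alarms arrive 4 and 3 intervals after the event starts.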
Affiliation(s)
- Zhenliang Liao
- College of Civil Engineering and Architecture, Xinjiang University, Urumqi 830046, PR China; College of Environmental Science and Engineering, Tongji University, Shanghai 200092, PR China.
- Minhao Zhang
- College of Environmental Science and Engineering, Tongji University, Shanghai 200092, PR China
- Yun Chen
- College of Environmental Science and Engineering, Tongji University, Shanghai 200092, PR China; Water Conservancy Development Research Center, Taihu Basin Authority, PR China
- Zhiyu Zhang
- College of Environmental Science and Engineering, Tongji University, Shanghai 200092, PR China.
- Huijuan Wang
- College of Civil Engineering and Architecture, Xinjiang University, Urumqi 830046, PR China
2
Swain SS, Khura TK, Sahoo PK, Chobhe KA, Al-Ansari N, Kushwaha HL, Kushwaha NL, Panda KC, Lande SD, Singh C. Proportional impact prediction model of coating material on nitrate leaching of slow-release Urea Super Granules (USG) using machine learning and RSM technique. Sci Rep 2024; 14:3053. [PMID: 38321086 PMCID: PMC10847469 DOI: 10.1038/s41598-024-53410-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Accepted: 01/31/2024] [Indexed: 02/08/2024] Open
Abstract
An accurate assessment of nitrate leaching is important for efficient fertiliser utilisation and groundwater pollution reduction. However, past studies could not efficiently model nitrate leaching due to utilisation of conventional algorithms. To address the issue, the current research employed advanced machine learning algorithms, viz., Support Vector Machine, Artificial Neural Network, Random Forest, M5 Tree (M5P), Reduced Error Pruning Tree (REPTree) and Response Surface Methodology (RSM) to predict and optimize nitrate leaching. In this study, Urea Super Granules (USG) with three different coatings were used for the experiment in the soil columns, containing 1 kg soil with fertiliser placed in between. Statistical parameters, namely correlation coefficient, Mean Absolute Error, Willmott index, Root Mean Square Error and Nash-Sutcliffe efficiency were used to evaluate the performance of the ML techniques. In addition, the models were compared on the test set, where RSM outperformed the rest of the models irrespective of coating type. The ratio of neem oil/acacia oil (ml) : clay/sulfur (g) : age (days) for minimum nitrate leaching was found to be 2.61:1.67:2.4 for coating of USG with bentonite clay and neem oil without heating, 2.18:2:1 for bentonite clay and neem oil with heating, and 1.69:1.64:2.18 for coating USG with sulfur and acacia oil. The research would provide guidelines to researchers and policymakers to select the appropriate tool for precise prediction of nitrate leaching, which would optimise the yield and the benefit-cost ratio.
Affiliation(s)
- Sidhartha Sekhar Swain
- Division of Agricultural Engineering, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Tapan Kumar Khura
- Division of Agricultural Engineering, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Pramod Kumar Sahoo
- Division of Agricultural Engineering, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Kapil Atmaram Chobhe
- Division of Soil Science and Agricultural Chemistry, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Nadhir Al-Ansari
- Department of Civil, Environmental and Natural Resources Engineering, Lulea University of Technology, 97187, Lulea, Sweden.
- Hari Lal Kushwaha
- Division of Agricultural Engineering, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Nand Lal Kushwaha
- Division of Agricultural Engineering, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Kanhu Charan Panda
- Department of Soil Conservation, National PG College (Barhalganj), DDU Gorakhpur University, Gorakhpur, UP, 273402, India
- Satish Devram Lande
- Division of Agricultural Engineering, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Chandu Singh
- Division of Genetics, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
3
Ly QV, Tong NA, Lee BM, Nguyen MH, Trung HT, Le Nguyen P, Hoang THT, Hwang Y, Hur J. Improving algal bloom detection using spectroscopic analysis and machine learning: A case study in a large artificial reservoir, South Korea. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 901:166467. [PMID: 37611716 DOI: 10.1016/j.scitotenv.2023.166467] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/04/2023] [Revised: 08/17/2023] [Accepted: 08/19/2023] [Indexed: 08/25/2023]
Abstract
The prediction of algal blooms using traditional water quality indicators is expensive, labor-intensive, and time-consuming, making it challenging to meet the critical requirement of timely monitoring for prompt management. Using optical measures for forecasting algal blooms is a feasible and useful method to overcome these problems. This study explores the potential application of optical measures to enhance algal bloom prediction in terms of prediction accuracy and workload reduction, aided by machine learning (ML) models. Compared to absorption-derived parameters, commonly used fluorescence indices such as the fluorescence index (FI), humification index (HIX), biological index (BIX), and protein-like component improved the prediction accuracy. However, the prediction accuracy decreased when all optical indices were considered for computation due to increased noise and uncertainty in the models. With the exception of chemical oxygen demand (COD), this study successfully replaced biochemical oxygen demand (BOD), dissolved organic carbon (DOC), and nutrients with selected fluorescence indices, demonstrating relatively analogous performance in either training or testing data, with consistent and good coefficient of determination (R²) values of approximately 0.85 and 0.74, respectively. Among all models considered, ensemble learning models consistently outperformed conventional regression models and artificial neural networks (ANNs). However, there was a trade-off between accuracy and computation efficiency among the ensemble learning models (i.e., Stacking and XGBoost) for algal bloom prediction. Our study offers a glimpse of the potential application of spectroscopic measures to improve accuracy and efficiency in algal bloom prediction, but further work should be carried out in other water bodies to further validate our proposed hypothesis.
Affiliation(s)
- Quang Viet Ly
- Department of Environmental Engineering, Seoul National University of Science and Technology, Seoul 01811, South Korea
- Ngoc Anh Tong
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam
- Bo-Mi Lee
- Water Quality Assessment Research Division, National Institute of Environmental Research, Incheon 22689, South Korea
- Minh Hieu Nguyen
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam; School of Information and Communication Technology, Griffith University, Gold Coast, Australia
- Huynh Thanh Trung
- Ecole Polytechnique Federale de Lausanne, 1015 Lausanne, Switzerland
- Phi Le Nguyen
- School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam
- Thu-Huong T Hoang
- School of Chemistry and Life Science, Hanoi University of Science and Technology, Hanoi 10000, Vietnam
- Yuhoon Hwang
- Department of Environmental Engineering, Seoul National University of Science and Technology, Seoul 01811, South Korea
- Jin Hur
- Department of Environment and Energy, Sejong University, Seoul 05006, South Korea.
4
Dong J, Wang Z, Wu J, Huang J, Zhang C. A water quality prediction model based on signal decomposition and ensemble deep learning techniques. WATER SCIENCE AND TECHNOLOGY : A JOURNAL OF THE INTERNATIONAL ASSOCIATION ON WATER POLLUTION RESEARCH 2023; 88:2611-2632. [PMID: 38017681 PMCID: wst_2023_357 DOI: 10.2166/wst.2023.357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
Accurate water quality predictions are critical for water resource protection, and dissolved oxygen (DO) reflects overall river water quality and ecosystem health. This study proposes a hybrid model based on the fusion of signal decomposition and deep learning for predicting river water quality. Initially, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) is employed to split the internal series of DO into numerous intrinsic mode functions (IMFs). Subsequently, multi-scale fuzzy entropy (MFE) is employed to compute the entropy values for each IMF component. Time-varying filtered empirical mode decomposition (TVFEMD) is used to further extract features in high-frequency subsequences after linearly aggregating the high-frequency sequences. Finally, support vector machine (SVM) and long short-term memory (LSTM) neural networks are used to predict low- and high-frequency subsequences. Moreover, by comparing it with single models, models based on 'single layer decomposition-prediction-ensemble' and combination models using different methods, the feasibility of the proposed model in predicting water quality data for the Xinlian section of Fuhe River and the Chucha section of Ganjiang River was verified. As a result, the combined prediction approach developed in this work has improved generalizability and prediction accuracy, and it may be used to forecast water quality in complicated waters.
Affiliation(s)
- Jinghan Dong
- College of Marine Ecology and Environment, Shanghai Ocean University, Shanghai 201306, China
- Zhaocai Wang
- College of Information, Shanghai Ocean University, Shanghai 201306, China
- Junhao Wu
- State Key Laboratory of Estuarine and Coastal Research, East China Normal University, Shanghai 200241, China
- Jinghan Huang
- College of Economics and Management, Shanghai Ocean University, Shanghai 201306, China
- Can Zhang
- College of Information, Shanghai Ocean University, Shanghai 201306, China
5
Saxena B, Gaonkar M, Singh SK. Study of the effectiveness of wavelet genetic programming model for water quality analysis in the Uttar Pradesh region. ENVIRONMENTAL MONITORING AND ASSESSMENT 2023; 195:1010. [PMID: 37523098 DOI: 10.1007/s10661-023-11489-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Accepted: 06/08/2023] [Indexed: 08/01/2023]
Abstract
Water constitutes an essential part of the earth as it helps in making the environment greener and supports life. But water quality and availability are drastically affected by rising water pollution and poor sanitation. Water gets contaminated due to the excessive use of chemicals by industries and of fertilizers and pesticides by farmers. Not only surface water but also groundwater and river water are getting contaminated. Several published works in the Indian context have used different models for the prediction of water quality. Some of them performed poorly due to the presence of irrelevant and missing data in the training samples. Moreover, these studies have assessed water quality on the basis of biochemical oxygen demand (BOD), coliform and chemical oxygen demand (COD), whereas dissolved oxygen (DO) is one of the most important parameters in terms of water quality assessment as it is considered a key determinant of pollution. Thus, there is a strong need to categorically identify and visualize DO as one of the key components responsible for deteriorating the quality of water in the Indian context. The main objective of this work is to build a wavelet genetic programming (WGP)-based workflow model for the assessment of water quality in 13 rivers of the Uttar Pradesh region. The WGP model has the unique feature of discarding redundant and irrelevant data values from the source data. The proposed WGP model has given promising results, which can be attributed to two factors: firstly, the novel use of the Morlet wavelet in place of the widely popular Db wavelet as the mother wavelet, and secondly, the use of the MICE technique for missing value imputation in the pre-processing stage. The proposed model not only cleans the data but also demonstrates the feasibility of using DO values as one of the prime factors to assess the water quality.
Affiliation(s)
- Bhawna Saxena
- Department of Computer Science and Engineering & Information Technology, Jaypee Institute of Information Technology, Noida - 201309, UP, India
- Mansi Gaonkar
- Department of Computer Science and Engineering & Information Technology, Jaypee Institute of Information Technology, Noida - 201309, UP, India
- Sandeep Kumar Singh
- Department of Computer Science and Engineering & Information Technology, Jaypee Institute of Information Technology, Noida - 201309, UP, India.
6
Hu Y, Lyu L, Wang N, Zhou X, Fang M. Application of hybrid improved temporal convolution network model in time series prediction of river water quality. Sci Rep 2023; 13:11260. [PMID: 37438608 DOI: 10.1038/s41598-023-38465-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 07/08/2023] [Indexed: 07/14/2023] Open
Abstract
Time series prediction of river water quality is an important method to grasp the changes of river water quality and protect the river water environment. However, the time series data of river water quality have strong periodicity, seasonality and nonlinearity, which seriously affect the accuracy of river water quality prediction. In this paper, a new hybrid deep neural network model is proposed for river water quality prediction, which integrates the Savitzky-Golay (SG) filter, the STL time series decomposition method, the Self-attention mechanism, and the Temporal Convolutional Network (TCN). The SG filter can effectively remove the noise in the time series data of river water quality, and the STL technology can decompose the time series data into trend, seasonal and residual series. The decomposed trend series and residual series are input into the model combining the Self-attention mechanism and TCN respectively for training and prediction. In order to verify the proposed model, this study uses open-source water quality data and private water quality data to conduct experiments, and compares the model with other water quality prediction models. The experimental results show that our method achieves the best prediction results in the water quality data of two different rivers.
Affiliation(s)
- Yankun Hu
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, 110168, Liaoning, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Li Lyu
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, 110168, Liaoning, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Ning Wang
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, 110168, Liaoning, China.
- University of Chinese Academy of Sciences, Beijing, 100049, China.
- Xiaolei Zhou
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, 110168, Liaoning, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Meng Fang
- Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang, 110168, Liaoning, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
7
Bolick MM, Post CJ, Naser MZ, Mikhailova EA. Comparison of machine learning algorithms to predict dissolved oxygen in an urban stream. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023:10.1007/s11356-023-27481-5. [PMID: 37266780 DOI: 10.1007/s11356-023-27481-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/06/2022] [Accepted: 05/03/2023] [Indexed: 06/03/2023]
Abstract
Water quality monitoring for urban watersheds is critical to identify the negative urbanization impacts. This study sought to identify a successful predictive machine learning model with minimal parameters from easy-to-deploy, low-cost sensors to create a monitoring system for the urban stream network, Hunnicutt Creek, in Clemson, SC, USA. A multiple linear regression model was compared to machine learning algorithms k-nearest neighbor, decision tree, random forest, and gradient boosting. These algorithms were evaluated to understand which best predicted dissolved oxygen (DO) from water temperature, conductivity, turbidity, and water level change at four locations along the urban stream. The random forest algorithm had the highest performance in predicting DO for all four sites, with Nash-Sutcliffe model efficiency coefficient (NSE) scores > 0.9 at three sites and > 0.598 at the fourth site. The random forest model was further examined using explainable artificial intelligence (XAI) and found that temperature influenced the DO predictions for three of the four sites, but there were different water quality interactions depending on site location. Calculating the land cover type in each site's sub-watershed revealed that different amounts of impervious surface and vegetation influenced water quality and the resulting DO predictions. Overall, machine learning combined with land cover data helps decision-makers better understand the nuances of urban watersheds and the relationships between urban land cover and water quality.
Affiliation(s)
- Madeleine M Bolick
- Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, 29634, USA.
- Christopher J Post
- Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, 29634, USA
- Mohannad-Zeyad Naser
- Department of Civil and Environmental Engineering & Earth Sciences, Clemson University, Clemson, SC, 29634, USA
- Elena A Mikhailova
- Department of Forestry and Environmental Conservation, Clemson University, Clemson, SC, 29634, USA
8
Collaborative Energy Price Computing Based on Sarima-Ann and Asymmetric Stackelberg Games. Symmetry (Basel) 2023. [DOI: 10.3390/sym15020443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023] Open
Abstract
The energy trading problem in smart grids has been of great interest. In this paper, we focus on two problems: 1. Energy sellers’ inaccurate grasp of users’ real needs causes information asymmetry in transactions, making it difficult for energy sellers to develop more satisfactory pricing strategies for users based on those real needs. 2. The uneven variation of user demand causes the grid costs to increase. In this paper, we design a collaborative pricing strategy based on the seasonal autoregressive integrated moving average-artificial neural network (Sarima-Ann) and an asymmetric Stackelberg game. Specifically, we propose a dissatisfaction function for users and an incentive function for grid companies to construct a utility function for both parties, which introduces an incentive amount to achieve better results in equilibrating user demand while optimizing the transaction utility. In addition, we constructed a demand fluctuation function based on user demand data and introduced it into the game model to predict the demand by Sarima-Ann, which achieves better prediction accuracy. Finally, through simulation experiments, we demonstrate the effectiveness of our scheme in balancing demand and improving utility, and the superiority of our Sarima-Ann model in terms of forecasting accuracy. Specifically, the peak reduction can reach 94.1% and the total transaction utility increase can reach 4.6 × 10⁷, and better results can be achieved by adjusting the incentive rate. Our Sarima-Ann model improves accuracy by 64.95% over Arima and 64.47% over Sarima under MAE metric evaluation, and also shows superior accuracy under other metrics evaluation.
9
Do P, Chow CWK, Rameezdeen R, Gorjian N. Wastewater inflow time series forecasting at low temporal resolution using SARIMA model: a case study in South Australia. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:70984-70999. [PMID: 35595895 PMCID: PMC9515036 DOI: 10.1007/s11356-022-20777-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2022] [Accepted: 05/06/2022] [Indexed: 06/15/2023]
Abstract
Forecasts of wastewater inflow are considered a significant component in supporting the development of a real-time control (RTC) system for a wastewater pumping network and in achieving optimal operations. This paper aims to investigate patterns of the wastewater inflow behaviour and develop a seasonal autoregressive integrated moving average (SARIMA) forecasting model at low temporal resolution (hourly) for a short-term period of 7 days for a real network in South Australia, the Murray Bridge wastewater network/wastewater treatment plant (WWTP). Historical wastewater inflow data collected for a 32-month period (May 2016 to December 2018) were pre-processed (transformed into an hourly dataset) and then separated into two parts for training (80%) and testing (20%). Results reveal that seasonality is present in the wastewater inflow time series data, as it is heavily dependent on time of the day and day of the week. In addition, SARIMA(1,0,3)(2,1,2)₂₄ was found to be the best model to predict wastewater inflow, and its forecasting accuracy was determined based on evaluation criteria including the root mean square error (RMSE = 5.508), the mean absolute percentage error (MAPE = 20.78%) and the coefficient of determination (R² = 0.773). From the results, this model can provide wastewater operators with crucial information that supports more effective decision making in their daily tasks of operating their systems in real time.
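Two preprocessing ideas in this abstract lend themselves to a short illustration: the chronological 80%/20% train/test split, and the seasonal differencing implied by a seasonal period of 24 with a seasonal differencing order of 1 in the reported SARIMA orders. The sketch below is an assumption-laden toy example in plain Python, not the authors' pipeline; the helper names and the synthetic hourly series are hypothetical, and no SARIMA model is actually fitted.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series in order, without shuffling."""
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


def seasonal_difference(series, period=24):
    """Return y[t] - y[t - period], i.e. one seasonal difference (D = 1)."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


# Hypothetical hourly inflow: a repeating daily shape plus a slow day-to-day rise.
hourly = [(t % 24) + 0.5 * (t // 24) for t in range(240)]

train, test = chronological_split(hourly, 0.8)
diffed = seasonal_difference(train, period=24)

print(len(train), len(test), len(diffed))
print(set(diffed))
```

In this synthetic series the daily pattern cancels under seasonal differencing and only the constant day-to-day increment remains, which is the effect seasonal differencing is meant to have on data that depend strongly on the time of day.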
Affiliation(s)
- Phuong Do
- Sustainable Infrastructure and Resource Management (SIRM), UniSA STEM, University of South Australia, Mawson Lakes, Adelaide, SA, 5095, Australia
- Christopher W K Chow
- Sustainable Infrastructure and Resource Management (SIRM), UniSA STEM, University of South Australia, Mawson Lakes, Adelaide, SA, 5095, Australia.
- Future Industries Institute, University of South Australia, Adelaide, SA, 5095, Australia.
- Raufdeen Rameezdeen
- Sustainable Infrastructure and Resource Management (SIRM), UniSA STEM, University of South Australia, Mawson Lakes, Adelaide, SA, 5095, Australia
- Nima Gorjian
- Sustainable Infrastructure and Resource Management (SIRM), UniSA STEM, University of South Australia, Mawson Lakes, Adelaide, SA, 5095, Australia
- South Australian Water Corporation, Adelaide, South Australia, Australia
10
Ly QV, Truong VH, Ji B, Nguyen XC, Cho KH, Ngo HH, Zhang Z. Exploring potential machine learning application based on big data for prediction of wastewater quality from different full-scale wastewater treatment plants. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 832:154930. [PMID: 35390391 DOI: 10.1016/j.scitotenv.2022.154930] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/09/2022] [Revised: 03/26/2022] [Accepted: 03/26/2022] [Indexed: 06/14/2023]
Abstract
Water pollution generated from intensive anthropogenic activities has emerged as a critical issue concerning ecosystem balance and livelihoods worldwide. Although optimizing wastewater treatment efficiency is widely regarded as the foremost step to minimize pollutants released into the environment, this widespread application has encountered two major problems: firstly, the significant variation of influent wastewater constituents; secondly, complex treatment processes within wastewater treatment plants (WWTPs). Based on the data collected hourly using real-time sensors in three different full-scale WWTPs (24 h × 365 days × 3 WWTPs × 10 wastewater parameters), this work introduced the potential application of Machine Learning (ML) to predict wastewater quality. In this work, six different ML algorithms were examined and compared, varying from shallow to deep learning architectures including Seasonal Autoregressive Integrated Moving Average with eXogenous regressors (SARIMAX), Random Forest (RF), Support Vector Machine (SVM), Gradient Tree Boosting (GTB), Adaptive Neuro-Fuzzy Inference System (ANFIS) and Long Short-Term Memory (LSTM). These models were developed to detect total phosphorus in the outlet (Outlet-TP), which served as an output variable due to the rising concerns about the eutrophication problem. Irrespective of WWTPs, SARIMAX consistently demonstrated the best performance for regression estimation as evidenced by the lowest values of Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and the highest coefficient of determination (R²). In terms of computation efficiency, SARIMAX exhibited acceptable computation time, acknowledging the successful application of this algorithm for Outlet-TP modeling. In contrast, the complex structure of LSTM made it time-consuming and unstable coupled with noise, while other shallower architectures, i.e., RF, SVM, GTB, and ANFIS were unable to address large datasets with nonlinear and nonstationary behavior. Consequently, this study provides a reliable and accurate approach to forecast wastewater effluent quality, which is pivotal in terms of the socio-economic aspects of wastewater management.
Affiliation(s)
- Quang Viet Ly
- Institute of Environmental Engineering & Nano-Technology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China; Guangdong Provincial Engineering Research Center for Urban Water Recycling and Environmental Safety, Tsinghua-Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China; School of Environment, Tsinghua University, Beijing 100084, China
- Viet Hung Truong
- Faculty of Civil Engineering, Thuyloi University, 175 Tay Son, Dong Da, Hanoi 10000, Viet Nam
- Bingxuan Ji
- Institute of Environmental Engineering & Nano-Technology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China; Guangdong Provincial Engineering Research Center for Urban Water Recycling and Environmental Safety, Tsinghua-Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China; School of Environment, Tsinghua University, Beijing 100084, China
- Xuan Cuong Nguyen
- Institute of Research and Development, Duy Tan University, Danang 550000, Viet Nam
- Kyung Hwa Cho
- School of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, 50 UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan 689-798, South Korea
- Huu Hao Ngo
- School of Civil and Environmental Engineering, University of Technology Sydney, Sydney, NSW 2007, Australia
- Zhenghua Zhang
- Institute of Environmental Engineering & Nano-Technology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China; Guangdong Provincial Engineering Research Center for Urban Water Recycling and Environmental Safety, Tsinghua-Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China; School of Environment, Tsinghua University, Beijing 100084, China.
11
Khodakhah H, Aghelpour P, Hamedi Z. Comparing linear and non-linear data-driven approaches in monthly river flow prediction, based on the models SARIMA, LSSVM, ANFIS, and GMDH. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:21935-21954. [PMID: 34773585 DOI: 10.1007/s11356-021-17443-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/07/2021] [Accepted: 11/05/2021] [Indexed: 06/13/2023]
Abstract
River flow variations directly affect the hydro-climatological, environmental, and ecological characteristics of a region. Therefore, an accurate prediction of river flow can be critically important for water managers and planners. The present study aims to compare different data-driven models in predicting monthly flow. Two river catchments located in the Guilan province in Iran, where rivers play an essential role in agricultural production (mainly rice), are studied. The monthly river flow dataset for 1986-2015 was provided by the Guilan Regional Water Authority. The models belong to two different classes: stochastic and machine learning (ML) models. The stochastic model is the seasonal autoregressive integrated moving average (SARIMA), and the ML models are the least square support vector machine (LSSVM), adaptive neuro-fuzzy inference system (ANFIS), and group method of data handling (GMDH). The inputs were selected from the flow rates of the previous months by autocorrelation and partial autocorrelation functions (ACF and PACF). The data were divided into 75% for training and 25% for testing, and then the models were implemented. Predictions were evaluated by the criteria of root mean square error (RMSE), normalized RMSE (NRMSE), and the Nash-Sutcliffe (NS) coefficient. According to the values of these criteria during the test phase (RMSE = 1.138 cms, NRMSE = 0.109, and NS = 0.826), the SARIMA model was superior to its ML competitors. Among the ML models, GMDH had the best performance (RMSE = 1.290 cms, NRMSE = 0.124, and NS = 0.777) because it has more optimization parameters and sample space for network make-up. The models were also evaluated in hydrological drought conditions of both rivers. The results showed that the rivers' flow can be well predicted in drought conditions by using these models, especially the SARIMA stochastic model. According to the NRMSE values (ranging between 0.1 and 0.2), the accuracy of predictions falls in the appropriate range, and the present study shows promising results for the current approaches. Consequently, a comparison between the performance of linear stochastic models and complex black-box ML models reveals that linear stochastic models are more suitable for monthly river flow prediction in the study region.
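The three evaluation criteria named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say how NRMSE was normalized, so dividing by the mean observation is an assumption here, and the flow values are hypothetical.

```python
import math

def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def nrmse(obs, pred):
    """RMSE divided by the mean observation (assumed normalization)."""
    return rmse(obs, pred) / (sum(obs) / len(obs))

def nash_sutcliffe(obs, pred):
    """1 - SSE / SST: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

# Hypothetical monthly flows in cubic meters per second, not data from the study.
observed = [10.0, 12.0, 8.0, 14.0]
predicted = [11.0, 11.0, 9.0, 13.0]

print(rmse(observed, predicted))            # 1.0
print(nash_sutcliffe(observed, predicted))  # 0.8
```

Other NRMSE conventions divide by the range or the standard deviation of the observations, so values are only comparable across studies that use the same normalizer.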
Affiliation(s)
- Hedieh Khodakhah
- Department of Water Engineering, Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran
- Pouya Aghelpour
- Department of Water Engineering, Faculty of Agriculture, Bu-Ali Sina University, Hamedan, Iran.
- Zahra Hamedi
- Computer Science Department, University of Birmingham, Birmingham, UK
|
12
|
Nguyen XC, Ly QV, Nguyen TTH, Ngo HTT, Hu Y, Zhang Z. Potential application of machine learning for exploring adsorption mechanisms of pharmaceuticals onto biochars. CHEMOSPHERE 2022; 287:132203. [PMID: 34826908 DOI: 10.1016/j.chemosphere.2021.132203] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 06/25/2021] [Revised: 08/14/2021] [Accepted: 09/06/2021] [Indexed: 06/13/2023]
Abstract
The increasing accumulation of pharmaceuticals in aquatic ecosystems could impair freshwater quality and threaten human health. Although the adsorption of pharmaceuticals on biochars is one of the most cost-effective and eco-friendly removal methods, the wide variation of experimental designs and research aims among previous studies poses a significant challenge in selecting biochar for optimal removal. In this work, literature data of 1033 sets with 21 variables collected from 267 papers over ten years (2010-2020) covering 19 pharmaceuticals onto 88 biochars were assessed by different machine learning (ML) algorithms, i.e., Linear regression model (LM), Feed-forward neural networks (NNET), Deep neural networks (DNN), Cubist, K-nearest neighbor (KNN), and Random forest (RF), to predict equilibrium adsorption capacity (Qe) and explore adsorption mechanisms. LM showed the best performance in ranking the importance of input variables. Except for the initial concentration of pharmaceuticals, Qe was strongly governed by biochars' properties, including specific surface area (BET), pore volume (PV), and pore structure (PS), rather than pharmaceuticals' properties and experimental conditions. The most accurate model for estimating Qe was achieved by Cubist, followed by KNN, RF, KNN, NNET and LM. The generalization ability was observed by the tuned Cubist with 26 rules for the prediction of the unseen data. This study not only provides insightful evidence for data-based adsorption mechanisms of pharmaceuticals on biochars, but also offers a potential method to accurately predict the biochar adsorption performance without conducting any experiments, which will be of high interest in practice for water/wastewater treatment using biochars.
Affiliation(s)
- Xuan Cuong Nguyen
- Laboratory of Energy and Environmental Science, Institute of Research and Development, Duy Tan University, Da Nang, 550000, Vietnam; Faculty of Environmental and Chemical Engineering, Duy Tan University, Da Nang, 550000, Vietnam
- Quang Viet Ly
- Institute of Environmental Engineering & Nano-Technology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, Guangdong, China.
- Thi Thanh Huyen Nguyen
- Laboratory of Energy and Environmental Science, Institute of Research and Development, Duy Tan University, Da Nang, 550000, Vietnam; Faculty of Environmental and Chemical Engineering, Duy Tan University, Da Nang, 550000, Vietnam
- Hien Thi Thu Ngo
- Department of Public Health, Faculty of Health Sciences, Thang Long University, Hanoi, Vietnam
- Yunxia Hu
- State Key Laboratory of Separation Membranes and Membrane Processes, National Center for International Joint Research on Membrane Science and Technology, School of Materials Science and Engineering, Tiangong University, Tianjin, 300387, PR China
- Zhenghua Zhang
- Institute of Environmental Engineering & Nano-Technology, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, Guangdong, China.
|
13
|
Ly QV, Nguyen XC, Lê NC, Truong TD, Hoang THT, Park TJ, Maqbool T, Pyo J, Cho KH, Lee KS, Hur J. Application of Machine Learning for eutrophication analysis and algal bloom prediction in an urban river: A 10-year study of the Han River, South Korea. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021; 797:149040. [PMID: 34311376 DOI: 10.1016/j.scitotenv.2021.149040] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/20/2021] [Revised: 06/29/2021] [Accepted: 07/10/2021] [Indexed: 06/13/2023]
Abstract
The increasing release of nutrients to aquatic environments has led to great concern regarding eutrophication and the risk of unwanted algal blooms. Based on observational data of 20 water quality parameters measured on a monthly basis at 40 stations from 2011 to 2020, this study applied different Machine Learning (ML) algorithms to suggest the best option for algal bloom prediction in the Han River, a large river in South Korea. Eight different ML algorithms were categorized into several groups of statistical learning, regression family, and deep learning, and were then compared for their suitability to predict the chlorophyll-derived trophic index (TSI-Chla). ML algorithms helped identify the most important water quality parameters contributing to algal bloom prediction. The ML results confirmed that eutrophication and algal proliferation were governed by the complex interplay between nutrients (nitrogen and phosphorus), organic contaminants, and environmental factors. Of the models tested, the adaptive neuro-fuzzy inference system (ANFIS) exhibited the best performance, with consistently superior predictions both quantitatively (i.e., via regression) and qualitatively (i.e., via classification), as evidenced by the lowest mean absolute error (MAE) of 0.09 and the highest F1-score, Recall, and Precision of 0.97, 0.98, and 0.96, respectively. In a further step, a representative web application was constructed to help general users predict the trophic status of the Han River. This study demonstrated that ML techniques are not only promising for highly accurate water quality modeling of urban rivers, but can also reduce the time and labor intensity of experiments and the number of monitored water quality parameters, while providing further insights into the driving factors of water quality deterioration. They ultimately help devise proactive strategies for sustainable water management.
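As a quick consistency check on the classification figures quoted in this abstract, the F1-score can be recomputed from the reported Precision and Recall. The sketch below assumes the standard definition of F1 as the harmonic mean of the two; the abstract does not state whether the scores were averaged across trophic classes, in which case the identity need not hold exactly.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (single-class definition)."""
    return 2 * precision * recall / (precision + recall)

# Precision and Recall as reported in the abstract for ANFIS.
f1 = f1_score(precision=0.96, recall=0.98)
print(round(f1, 2))  # 0.97
```

The recomputed value agrees with the reported F1-score of 0.97 to two decimal places.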
Affiliation(s)
- Quang Viet Ly
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China
- Xuan Cuong Nguyen
- Laboratory of Energy and Environmental Science, Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam; Faculty of Environmental and Chemical Engineering, Duy Tan University, Da Nang 550000, Vietnam
- Ngoc C Lê
- School of Applied Mathematics and Informatics, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
- Tien-Dung Truong
- School of Applied Mathematics and Informatics, Hanoi University of Science and Technology, Hanoi 100000, Vietnam
- Thu-Huong T Hoang
- School of Environmental Science and Technology, Hanoi University of Science and Technology, Hanoi 100000, Vietnam.
- Tae Jun Park
- Department of Environment and Energy, Sejong University, Seoul 05006, South Korea
- Tahir Maqbool
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, Guangdong, China
- JongCheol Pyo
- Center for Environmental Data Strategy, Korea Environment Institute, Sejong 30147, South Korea
- Kyung Hwa Cho
- School of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, 50 UNIST-gil, Eonyang-eup, Ulju-gun, Ulsan 44919, South Korea
- Kwang-Sik Lee
- Korea Basic Science Institute, Yeongudanji-ro 162, Cheongwon-gu, Cheongju, Chungcheongbuk-do 28119, South Korea
- Jin Hur
- Department of Environment and Energy, Sejong University, Seoul 05006, South Korea.
|