1
Searcy RT, Boehm AB. Know Before You Go: Data-Driven Beach Water Quality Forecasting. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17930-17939. [PMID: 36472482 DOI: 10.1021/acs.est.2c05972] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
Forecasting environmental hazards is critical in preventing or building resilience to their impacts on human communities and ecosystems. Environmental data science is an emerging field that can be harnessed for forecasting, yet more work is needed to develop methodologies that can leverage increasingly large and complex data sets for decision support. Here, we design a data-driven framework that can, for the first time, forecast bacterial standard exceedances at marine beaches with 3 days lead time. Using historical data sets collected at two California sites, we train nearly 400 forecast models using statistical and machine learning techniques and test forecasts against predictions from both a naive "persistence" model and a baseline nowcast model. Overall, forecast models are found to have similar sensitivities and specificities to the persistence model, but significantly higher areas under the ROC curve (a metric distinguishing a model's ability to effectively parse classes across decision thresholds), suggesting that forecasts can provide enhanced information beyond past observations alone. Forecast model performance at all lead times was similar to that of nowcast models. Together, results suggest that integrating the forecasting framework developed in this study into beach management programs can enable better public notification and aid in proactive pollution and health risk management.
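The comparison described in this abstract (a naive persistence model scored by sensitivity, specificity, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0/1 encoding of exceedances, and the toy series are all assumptions made for illustration, and `lead` is counted in samples rather than days.

```python
from typing import List, Sequence, Tuple


def persistence_forecast(observed: Sequence[int], lead: int = 1) -> List[int]:
    """Naive baseline: forecast each sample as the value observed `lead` samples earlier.

    `observed` holds 1 for a standard exceedance and 0 otherwise (assumed encoding).
    The forecasts returned line up with observed[lead:].
    """
    if lead < 1 or lead >= len(observed):
        raise ValueError("lead must be at least 1 and shorter than the series")
    return list(observed[:-lead])


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy exceedance record (hypothetical data, not from the study).
    obs = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
    pred = persistence_forecast(obs, lead=1)
    truth = obs[1:]
    print(sensitivity_specificity(truth, pred))
    print(roc_auc(truth, pred))
```

Because a persistence model emits only hard 0/1 predictions, its ROC curve has a single operating point, whereas a model that outputs a continuous score can be evaluated across decision thresholds. That is one way to read the abstract's finding of similar sensitivity and specificity but higher AUC.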
Affiliation(s)
- Ryan T Searcy
  - Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
- Alexandria B Boehm
  - Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
2
Zhang SZ, Chen S, Jiang H. A back propagation neural network model for accurately predicting the removal efficiency of ammonia nitrogen in wastewater treatment plants using different biological processes. WATER RESEARCH 2022; 222:118908. [PMID: 35917670 DOI: 10.1016/j.watres.2022.118908] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/18/2022] [Revised: 07/14/2022] [Accepted: 07/21/2022] [Indexed: 06/15/2023]
Abstract
Accurately predicting the water quality of treated water from a wastewater treatment plant (WWTP) based on the obtained operating database is of great significance. However, it is difficult for common mechanistic models to work well. In this study, a back propagation artificial neural network (BPANN) model with high accuracy was developed to predict the denitrification efficiency based on a 1-year operating database. Standardization and principal component analysis (PCA) methods were used to preprocess the data, and the PCA-processed data exhibited the best accuracy. In three WWTPs adopting the anaerobic/anoxic/oxic (A2O) process, the ammonia nitrogen removal efficiency of WWTPs was successfully predicted by using five variables: inlet flow rate, pH value, original ammonia nitrogen concentration, chemical oxygen demand (COD) concentration, and total phosphorus concentration. Importantly, the obtained BPANN model can be effectively used for other widely used treatment processes, such as oxidation ditch (OD), sequencing batch reactor activated sludge process (SBR), membrane bioreactor (MBR), and cyclic activated sludge technology (CAST), by simply optimizing the training data ratios between 50/50 and 90/10. This is the first trial to set up a universal model for predicting the denitrification efficiency of WWTPs adopting common biological processes. The model could be used to choose the optimum treatment process in the new WWTP design or take action in advance to avoid the risk of excessive emissions when the already built WWTPs are subjected to sudden shocks.
Affiliation(s)
- Shu-Zhe Zhang
  - CAS Key Laboratory of Urban Pollutant Conversion, Department of Applied Chemistry, University of Science and Technology of China, Hefei 230026, China
- Shuo Chen
  - CAS Key Laboratory of Urban Pollutant Conversion, Department of Applied Chemistry, University of Science and Technology of China, Hefei 230026, China
- Hong Jiang
  - CAS Key Laboratory of Urban Pollutant Conversion, Department of Applied Chemistry, University of Science and Technology of China, Hefei 230026, China.
3
Research on Forest Conversation Analysis Using Autoregressive Neural Network-Based Model. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2022; 2022:3280928. [PMID: 35770125 PMCID: PMC9236798 DOI: 10.1155/2022/3280928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 05/11/2022] [Accepted: 05/16/2022] [Indexed: 11/18/2022]
Abstract
Forest biodiversity is an important component of biological diversity that should not be disregarded. The question of how to evaluate it has sparked scholarly inquiry and discussion. The purpose of this paper is to describe the principles of general linear regression, the selection of model variables in OLS autoregressive modelling, model coefficient testing, analysis of variance of autoregressive models, and model evaluation indicators in order to clarify the suitability of GWR models for solving biomass-related data problems. The GWR 4.0 program was used to create a spatially weighted autoregressive model. Model testing and an accuracy analysis were performed on the model. Following a comparison and study with the general linear regression model, it was discovered that the geographically weighted autoregressive model is better suited to defining spatially correlated data than the general linear regression model.
4
Yu JW, Kim JS, Li X, Jong YC, Kim KH, Ryang GI. Water quality forecasting based on data decomposition, fuzzy clustering and deep learning neural network. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2022; 303:119136. [PMID: 35283198 DOI: 10.1016/j.envpol.2022.119136] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/08/2021] [Revised: 02/12/2022] [Accepted: 03/09/2022] [Indexed: 06/14/2023]
Abstract
Water quality forecasting can provide useful information for public health protection and support water resources management. In order to forecast water quality more accurately, this paper proposes a novel hybrid model by combining data decomposition, fuzzy C-means clustering and bidirectional gated recurrent unit. Firstly, the original water quality data is decomposed into several subseries by empirical wavelet transform, and then, the decomposed subseries are recombined by fuzzy C-means clustering. Next, for each clustered series, bidirectional gated recurrent unit is applied to develop prediction model. Finally, the forecast result is obtained by the summation of the predictions for the subseries. The proposed forecast model is evaluated by the water quality data of Poyang Lake, China. Results show that the proposed forecast model provides highly accurate forecast result for all of the six water quality data: the average of MAPE of the forecast results for the six water quality datasets is 4.59% for 7 day ahead prediction. Furthermore, our model shows better forecast performance than the other models. Particularly, compared with the single BiGRU model, MAPE decreased by 32.86% in average. Results demonstrate that the proposed forecast model can be used effectively for water quality forecasting.
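Two steps in this abstract are simple enough to sketch: the final forecast is the sum of the per-subseries predictions, and accuracy is reported as mean absolute percentage error (MAPE). The following is a minimal illustration under stated assumptions, not the authors' implementation. The function names and numbers are hypothetical, and the decomposition, clustering, and BiGRU stages are not reproduced.

```python
from typing import List, Sequence


def sum_component_forecasts(components: Sequence[Sequence[float]]) -> List[float]:
    """Recombine per-subseries forecasts by element-wise summation."""
    if not components:
        raise ValueError("at least one component forecast is required")
    length = len(components[0])
    if any(len(c) != length for c in components):
        raise ValueError("all component forecasts must have the same length")
    return [sum(values) for values in zip(*components)]


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Undefined if any actual value is 0."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


if __name__ == "__main__":
    # Hypothetical forecasts for two recombined subseries over three time steps.
    low_freq = [7.0, 7.2, 7.1]
    high_freq = [0.3, -0.1, 0.2]
    combined = sum_component_forecasts([low_freq, high_freq])
    observed = [7.5, 7.0, 7.4]
    print(combined, mape(observed, combined))
```

MAPE is scale-free, so it allows a single average across the six water quality variables, but it becomes unstable when observed values approach zero.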
Affiliation(s)
- Jin-Won Yu
  - School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin, 300384, China; University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
- Ju-Song Kim
  - School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin, 300384, China; University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
- Xia Li
  - School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin, 300384, China.
- Yun-Chol Jong
  - University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
- Kwang-Hun Kim
  - University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
- Gwang-Il Ryang
  - University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
5
Ataeefard M, Tilebon SMS, Etezad SM, Mahdavi S. Intelligent modeling and optimization of environmentally friendly green enzymatic deinking of printed paper. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:39486-39499. [PMID: 35103941 DOI: 10.1007/s11356-021-15622-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2020] [Accepted: 07/20/2021] [Indexed: 06/14/2023]
Abstract
Nowadays, the paper industry supplies its required fibers either from primary fibers, including wood and plants, or waste papers, called secondary fibers. One of the most challenging recycling processes is deinking of papers digitally printed with electrophotographic ink. In order to produce optically high-quality paper from recycled waste papers, deinking step is required at the desired levels. In this work, the environmentally friendly green enzymatic deinking of printed paper was modeled and optimized via an innovative approach called artificial intelligence method. The effect of treatment temperature, treatment time, and enzyme dosage on mechanical properties (tensile and burst strengths) as well as optical properties (whiteness and brightness) of handsheet was investigated. The developed code can appropriately learn the non-linear behavior of deinking process, and make decisions according to the pattern constructed intelligently. Finally, multi-objective optimization at the specified treatment temperature, treatment time, and enzyme dosage was performed to identify the best conditions for enzyme-deinked handsheet (maximized mechanical and optical properties).
Affiliation(s)
- Maryam Ataeefard
  - Department of Printing Science and Technology, Institute for Color Science and Technology, Tehran, Iran.
- Seyed Masoud Etezad
  - Department of Environmental Research, Institute for Color Science and Technology, Tehran, Iran
- Saeed Mahdavi
  - Wood and Forest Products Division, Research Institute of Forest and Rangelands, Agricultural Research Education and Extension Organization (AREEO), Tehran, Iran
6
Prediction of Total Nitrogen and Phosphorus in Surface Water by Deep Learning Methods Based on Multi-Scale Feature Extraction. WATER 2022. [DOI: 10.3390/w14101643] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/27/2023]
Abstract
To improve the precision of water quality forecasting, the variational mode decomposition (VMD) method was used to denoise the total nitrogen (TN) and total phosphorus (TP) time series and obtained several high- and low-frequency components at four online surface water quality monitoring stations in Poyang Lake. For each of the aforementioned high-frequency components, a long short-term memory (LSTM) network was introduced to achieve excellent prediction results. Meanwhile, a novel metaheuristic optimization algorithm, called the chaos sparrow search algorithm (CSSA), was implemented to compute the optimal hyperparameters for the LSTM model. For each low-frequency component with periodic changes, the multiple linear regression model (MLR) was adopted for rapid and effective prediction. Finally, a novel combined water quality prediction model based on VMD-CSSA-LSTM-MLR (VCLM) was proposed and compared with nine prediction models. Results indicated that (1), for the three standalone models, LSTM performed best in terms of mean absolute error (MAE), mean absolute percentage error (MAPE), and the root mean square error (RMSE), as well as the Nash–Sutcliffe efficiency coefficient (NSE) and Kling–Gupta efficiency (KGE). (2) Compared with the standalone model, the decomposition and prediction of TN and TP into relatively stable sub-sequences can evidently improve the performance of the model. (3) Compared with CEEMDAN, VMD can extract the multiscale period and nonlinear information of the time series better. The experimental results proved that the averages of MAE, MAPE, RMSE, NSE, and KGE predicted by the VCLM model for TN are 0.1272, 8.09%, 0.1541, 0.9194, and 0.8862, respectively; those predicted by the VCLM model for TP are 0.0048, 10.83%, 0.0062, 0.9238, and 0.8914, respectively. 
The comprehensive performance of the model shows that the proposed hybrid VCLM model can be recommended as a promising model for online water quality prediction and comprehensive water environment management in lake systems.
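The abstract reports the Nash-Sutcliffe efficiency (NSE) and Kling-Gupta efficiency (KGE) alongside MAE and RMSE. As a hedged sketch of how these scores are commonly computed, the code below uses the standard NSE definition and the original 2009 form of KGE (correlation, variability ratio, bias ratio). The abstract does not state which KGE variant the authors used, so that choice is an assumption, as are the function names and sample values.

```python
import math
from typing import Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def mae(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches predicting the observed mean."""
    mean_obs = _mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency (2009 form): 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    mean_obs, mean_sim = _mean(obs), _mean(sim)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / len(sim))
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (sd_obs * sd_sim)   # linear correlation
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mean_sim / mean_obs    # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Hypothetical observed and simulated concentrations (mg/L).
    observed = [1.2, 1.5, 1.1, 1.8, 1.6]
    simulated = [1.3, 1.4, 1.2, 1.7, 1.5]
    print(mae(observed, simulated), rmse(observed, simulated))
    print(nse(observed, simulated), kge(observed, simulated))
```

Both NSE and KGE equal 1 for a perfect match. Unlike MAE and RMSE, they are dimensionless, which is why the TN and TP values in the abstract can be compared directly on those two scores.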
7
Tong X, You L, Zhang J, He Y, Gin KYH. Advancing prediction of emerging contaminants in a tropical reservoir with general water quality indicators based on a hybrid process and data-driven approach. JOURNAL OF HAZARDOUS MATERIALS 2022; 430:128492. [PMID: 35739673 DOI: 10.1016/j.jhazmat.2022.128492] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/09/2021] [Revised: 02/05/2022] [Accepted: 02/12/2022] [Indexed: 06/15/2023]
Abstract
Monitoring and predicting the occurrence and dynamic distributions of emerging contaminants (ECs) in the aquatic environment has always been a great challenge. This study aims to explore the potential of fully utilizing the advantages of combining traditional process-based models (PBMs) and data-driven models (DDMs) with general water quality indicators in terms of improving the accuracy and efficiency of predicting ECs in aquatic ecosystems. Two representative ECs, namely Bisphenol A (BPA) and N, N-diethyltoluamide (DEET), in a tropical reservoir were chosen for this study. A total of 36 DDMs based on different input datasets using Artificial Neural Networks (ANN) and Random Forests (RF) were examined in three case studies. The models were applied in prognosis validation based on easily accessible data on water quality indicators. Our results revealed that all the models yielded good fits when compared to the observed data. These new insights into the advantages using the combination of traditional PBMs and DDMs with general water quality datasets help to overcome the constraints in terms of model accuracy and efficiency as well as technical and budget limitations due to monitoring surveys and laboratory experiments in the study of fate and transport of ECs in aquatic environments.
Affiliation(s)
- Xuneng Tong
  - Department of Civil & Environmental Engineering, National University of Singapore, 1 Engineering Drive 2, Singapore 117576, Singapore
- Luhua You
  - E2S2-CREATE, NUS Environmental Research Institute, National University of Singapore, 1 Create way, Create Tower, #15-02, Singapore 138602, Singapore
- Jingjie Zhang
  - E2S2-CREATE, NUS Environmental Research Institute, National University of Singapore, 1 Create way, Create Tower, #15-02, Singapore 138602, Singapore; Shenzhen Municipal Engineering Lab of Environmental IoT Technologies, Southern University of Science and Technology, Shenzhen 518055, China; Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China.
- Yiliang He
  - School of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Karina Yew-Hoong Gin
  - Department of Civil & Environmental Engineering, National University of Singapore, 1 Engineering Drive 2, Singapore 117576, Singapore; E2S2-CREATE, NUS Environmental Research Institute, National University of Singapore, 1 Create way, Create Tower, #15-02, Singapore 138602, Singapore.
8
Li L, Qiao J, Yu G, Wang L, Li HY, Liao C, Zhu Z. Interpretable tree-based ensemble model for predicting beach water quality. WATER RESEARCH 2022; 211:118078. [PMID: 35066260 DOI: 10.1016/j.watres.2022.118078] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 08/14/2021] [Revised: 11/29/2021] [Accepted: 01/12/2022] [Indexed: 06/14/2023]
Abstract
Tree-based machine learning models based on environmental features offer low-cost and timely solutions for predicting microbial fecal contamination in beach water to inform the public of the health risk. However, many of these models are black boxes that are difficult for humans to understand, which may cause severe consequences such as unexplained decisions and failure in accountability. To develop interpretable predictive models for beach water quality, we evaluate five tree-based models, namely classification tree, random forest, CatBoost, XGBoost, and LightGBM, and employ a state-of-the-art explanation method SHAP to explain the models. When tested on the Escherichia coli (E. coli) concentration data collected from three beach sites along Lake Erie shores, LightGBM, followed by XGBoost, achieves the highest averaged precision and recall scores. For all three sites, both models suggest lake turbidity as the most important predictor, and elucidate the crucial role of accurate local data of wave height and rainfall in the model development. Local SHAP values further reveal the robustness of the importance of lake turbidity as its SHAP value increases nearly monotonically with its value and is minimally affected by other environmental factors. Moreover, we found an intriguing interaction between lake turbidity and day-of-year. This work suggests that the combination of LightGBM and SHAP has a promising potential to develop interpretable models for predicting microbial water quality in freshwater lakes.
Affiliation(s)
- Lingbo Li
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Jundong Qiao
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Guan Yu
  - Department of Biostatistics, University at Buffalo, The State University of New York, Buffalo, NY, USA
- Leizhi Wang
  - Nanjing Hydraulic Research Institute, State Key laboratory of Hydrology, Water Resources and Hydraulic Engineering & Science, Nanjing 210029, China
- Hong-Yi Li
  - Department of Civil and Environmental Engineering, University of Houston, Houston, TX, USA
- Chen Liao
  - Program for Computational and Systems Biology, Memorial Sloan-Kettering Cancer Center, NY, USA.
- Zhenduo Zhu
  - Department of Civil, Structural and Environmental Engineering, University at Buffalo, The State University of New York, Buffalo, NY, USA.
9
Jiang Y, Li C, Zhang Y, Zhao R, Yan K, Wang W. Data-driven method based on deep learning algorithm for detecting fat, oil, and grease (FOG) of sewer networks in urban commercial areas. WATER RESEARCH 2021; 207:117797. [PMID: 34731668 DOI: 10.1016/j.watres.2021.117797] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Revised: 09/17/2021] [Accepted: 10/20/2021] [Indexed: 06/13/2023]
Abstract
The content of fat, oil and grease (FOG) in the sewer network sediments is the key indicator for diagnosing sewer blockage and overflow. However, traditional FOG detection is time-consuming and costly, and mathematical models based on statistical methods fail to predict the FOG content with satisfactory accuracy. Herein, a data-driven FOG content prediction model based on a deep learning algorithm is proposed to achieve a more accurate prediction of FOG content. Meanwhile, global sensitivity analysis (GSA) is exploited to evaluate the contribution of input indicators to the output indicator (FOG) in the model, so that input indicators that have less impact on the prediction performance can be screened out, the best combination of input indicators can be determined, and the operation cost of the model can be reduced. To evaluate the effectiveness of the proposed model, a case study was conducted in a city in southern China. The experimental results indicate that the prediction model obtains good FOG estimations and performs well from a single site to multiple sites with a mean R² of 0.922, showing a good generalization performance. Through GSA, the key input indicators in the model were identified as pH, water temperature (T), relative humidity (RH), sewage flow (Flow), drinking water supply (DWS), velocity (V) and conductivity (σ), and input indicators such as air pressure (AP), population (Pop.), and liquid level (LV) can be removed without affecting the prediction accuracy of the model.
Affiliation(s)
- Yiqi Jiang
  - School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Chaolin Li
  - School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China; State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin, 150090, China.
- Yituo Zhang
  - School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Ruobin Zhao
  - School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Kefen Yan
  - School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Wenhui Wang
  - School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China.
10
Bourel M, Segura AM, Crisci C, López G, Sampognaro L, Vidal V, Kruk C, Piccini C, Perera G. Machine learning methods for imbalanced data set for prediction of faecal contamination in beach waters. WATER RESEARCH 2021; 202:117450. [PMID: 34352535 DOI: 10.1016/j.watres.2021.117450] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/11/2020] [Revised: 07/09/2021] [Accepted: 07/15/2021] [Indexed: 06/13/2023]
Abstract
Predicting water contamination by statistical models is a useful tool to manage health risk in recreational beaches. Extreme contamination events, i.e. those exceeding the regulatory standard, are generally rare with respect to bathing conditions, and thus the data are said to be imbalanced. Modeling and predicting those rare events present unique challenges. Here we introduce and evaluate several machine learning techniques and metrics to model imbalanced data and evaluate model performance. We do so by using a) simulated data sets and b) a real database with records of faecal coliform abundance monitored for 10 years in 21 recreational beaches in Uruguay (N ≈ 19000) using in situ and meteorological variables. We discuss advantages and disadvantages of the methods and provide a simple guide to building models for a general audience. We also provide R codes to reproduce model fitting and testing. We found that most machine learning techniques are sensitive to imbalance and require specific data pre-treatment (e.g. upsampling) to improve performance. Accuracy (i.e. correctly classified cases over total cases) is not adequate to evaluate model performance on imbalanced data sets. Instead, true positive rates (TPR) and false positive rates (FPR) are recommended. Among the 52 candidate algorithms tested, the stratified random forest showed the best performance, improving TPR by 50% with respect to the baseline (0.4), and outperformed the baseline in the evaluated metrics. Support vector machines combined with upsampling or the synthetic minority oversampling technique (SMOTE) performed well, similar to AdaBoost with SMOTE. These results suggest that combining modeling strategies is necessary to improve our capacity to anticipate water contamination and avoid health risk.
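The point about accuracy being misleading on imbalanced data, and the upsampling pre-treatment mentioned as a remedy, can be shown with a small sketch. This is not the R code the authors provide: it is an illustrative Python example with hypothetical function names and synthetic labels, and it uses plain random upsampling with replacement rather than SMOTE.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def classification_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, true positive rate, and false positive rate for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        "tpr": tp / (tp + fn) if tp + fn else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


def upsample_minority(rows: Sequence, labels: Sequence[int], seed: int = 0) -> Tuple[List, List[int]]:
    """Randomly duplicate minority-class rows (with replacement) until classes are balanced."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


if __name__ == "__main__":
    # Synthetic example: 5 exceedances in 100 samples.
    y_true = [1] * 5 + [0] * 95
    always_clean = [0] * 100  # a model that never predicts contamination
    print(classification_rates(y_true, always_clean))

    rows = list(range(100))  # stand-ins for feature vectors
    _, balanced_labels = upsample_minority(rows, y_true)
    print(Counter(balanced_labels))
```

The always-clean model scores 95% accuracy while detecting no exceedance at all, which is the failure mode TPR exposes. Upsampling should be applied to the training split only, so that duplicated minority cases do not leak into the data used for evaluation.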
Affiliation(s)
- Mathias Bourel
  - IMERL, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay; Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay.
- Angel M Segura
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Carolina Crisci
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Guzmán López
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Lia Sampognaro
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Victoria Vidal
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
- Carla Kruk
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay; Departamento de Microbiología, Instituto de Investigaciones Biológicas Clemente Estable, Ministerio de Educación y Cultura, Montevideo, Uruguay; Instituto de Ecología y Ciencias Ambientales, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
- Claudia Piccini
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay; Departamento de Microbiología, Instituto de Investigaciones Biológicas Clemente Estable, Ministerio de Educación y Cultura, Montevideo, Uruguay
- Gonzalo Perera
  - Departamento de Modelización Estadística de Datos e Inteligencia Artificial (MEDIA), Centro Universitario Regional Este, Universidad de la República, Rocha, Uruguay
11
Heasley C, Sanchez JJ, Tustin J, Young I. Systematic review of predictive models of microbial water quality at freshwater recreational beaches. PLoS One 2021; 16:e0256785. [PMID: 34437625 PMCID: PMC8389397 DOI: 10.1371/journal.pone.0256785] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/04/2021] [Accepted: 08/14/2021] [Indexed: 11/19/2022]
Abstract
Monitoring of fecal indicator bacteria at recreational waters is an important public health measure to minimize water-borne disease; however, traditional culture methods for quantifying bacteria can take 18-24 hours to obtain a result. To support real-time notifications of water quality, models using environmental variables have been created to predict indicator bacteria levels on the day of sampling. We conducted a systematic review of predictive models of fecal indicator bacteria at freshwater recreational sites in temperate climates to identify and describe the existing approaches, trends, and their performance to inform beach water management policies. We conducted a comprehensive search strategy, including five databases and grey literature, screened abstracts for relevance, and extracted data using structured forms. Data were descriptively summarized. A total of 53 relevant studies were identified. Most studies (n = 44, 83%) were conducted in the United States and evaluated water quality using E. coli as fecal indicator bacteria (n = 46, 87%). Studies were primarily conducted in lakes (n = 40, 75%) compared to rivers (n = 13, 25%). The most commonly reported predictive model-building method was multiple linear regression (n = 37, 70%). Frequently used predictors in best-fitting models included rainfall (n = 39, 74%), turbidity (n = 31, 58%), wave height (n = 24, 45%), and wind speed and direction (n = 25, 47%, and n = 23, 43%, respectively). Of the 19 (36%) studies that measured accuracy, predictive models averaged an 81.0% accuracy, and all but one were more accurate than traditional methods. Limitations identified by risk-of-bias assessment included not validating models (n = 21, 40%), limited reporting of whether modelling assumptions were met (n = 40, 75%), and lack of reporting on handling of missing data (n = 37, 70%).
Additional research is warranted on the utility and accuracy of more advanced predictive modelling methods, such as Bayesian networks and artificial neural networks, which were investigated in comparatively fewer studies, and on creating risk-of-bias tools for non-medical predictive modelling.
Affiliation(s)
- Cole Heasley
  - School of Occupational and Public Health, Ryerson University, Toronto, Ontario, Canada
- J. Johanna Sanchez
  - School of Occupational and Public Health, Ryerson University, Toronto, Ontario, Canada
- Jordan Tustin
  - School of Occupational and Public Health, Ryerson University, Toronto, Ontario, Canada
- Ian Young
  - School of Occupational and Public Health, Ryerson University, Toronto, Ontario, Canada
12
Safaie A, Weiskerger CJ, Nevers MB, Byappanahalli MN, Phanikumar MS. Evaluating the impacts of foreshore sand and birds on microbiological contamination at a freshwater beach. WATER RESEARCH 2021; 190:116671. [PMID: 33302038 DOI: 10.1016/j.watres.2020.116671] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/22/2020] [Revised: 10/29/2020] [Accepted: 11/23/2020] [Indexed: 06/12/2023]
Abstract
Beaches along the Great Lakes shorelines are important recreational and economic resources. However, contamination at the beaches can threaten their usage during the swimming season, potentially resulting in beach closures and/or advisories. Thus, understanding the dynamics that control nearshore water quality is integral to effective beach management. There have been significant improvements in this effort, including incorporating modeling (empirical, mechanistic) in recent years. Mechanistic modeling frameworks can contribute to this understanding of dynamics by determining sources and interactions that substantially impact fecal indicator bacteria concentrations, an index routinely used in water quality monitoring programs. To simulate E. coli concentrations at Jeorse Park beaches in southwest Lake Michigan, a coupled hydrodynamic and wave-current interaction model was developed that progressively added contaminant sources from river inputs, avian presence, bacteria-sediment interactions, and bacteria-sand-sediment interactions. Results indicated that riverine inputs affected E. coli concentrations at Jeorse Park beaches only marginally, while avian, shoreline sand, and sediment sources were much more substantial drivers of E. coli contamination at the beach. By including avian and riverine inputs, as well as bacteria-sand-sediment interactions at the beach, models can reasonably capture the variability in observed E. coli concentrations in nearshore water and bed sediments at Jeorse Park beaches. Consequently, it will be crucial to consider avian contamination sources and water-sand-sediment interactions in effective management of the beach for public health and as a recreational resource and to extend these findings to similar beaches affected by shoreline embayment.
Affiliation(s)
- Ammar Safaie: Department of Civil & Environmental Engineering, Michigan State University, East Lansing, MI 48824, United States
- Chelsea J Weiskerger: Department of Civil & Environmental Engineering, Michigan State University, East Lansing, MI 48824, United States
- Meredith B Nevers: U.S. Geological Survey, Great Lakes Science Center, Lake Michigan Ecological Research Station, 1574 N. County Road 300 E. Chesterton, Indiana 46304, United States
- Muruleedhara N Byappanahalli: U.S. Geological Survey, Great Lakes Science Center, Lake Michigan Ecological Research Station, 1574 N. County Road 300 E. Chesterton, Indiana 46304, United States
- Mantha S Phanikumar: Department of Civil & Environmental Engineering, Michigan State University, East Lansing, MI 48824, United States
13
Searcy RT, Boehm AB. A Day at the Beach: Enabling Coastal Water Quality Prediction with High-Frequency Sampling and Data-Driven Models. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:1908-1918. [PMID: 33471505 DOI: 10.1021/acs.est.0c06742] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 06/12/2023]
Abstract
To reduce the incidence of recreational waterborne illness, fecal indicator bacteria (FIB) are measured to assess water quality and inform beach management. Recently, predictive FIB models have been used to aid managers in making beach posting and closure decisions. However, those predictive models must be trained using rich historical data sets consisting of FIB and environmental data that span years, and many beaches lack such data sets. Here, we investigate whether water quality data collected during discrete, short-duration, high-frequency beach sampling events (e.g., samples collected at sub-hourly intervals for 24-48 h) are sufficient to train predictive models that can be used for beach management. We use data collected during six high-frequency sampling events at three California marine beaches and train a total of 126 models using common data-driven techniques. Tide, solar irradiation, water temperature, significant wave height, and offshore wind speed were found to be the most important environmental variables in the models. We validate the predictive performance of models using withheld data. Random forests are consistently the top-performing model type. Overall, we find that data-driven models trained using high-frequency FIB and environmental data perform well at predicting water quality and can be used to inform public health decisions at beaches.
Affiliation(s)
- Ryan T Searcy: Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
- Alexandria B Boehm: Department of Civil & Environmental Engineering, Stanford University, 473 Via Ortega, Stanford, California 94305, United States
14
Numerical Modeling of Microbial Fate and Transport in Natural Waters: Review and Implications for Normal and Extreme Storm Events. WATER 2020. [DOI: 10.3390/w12071876] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/26/2022]
Abstract
Degradation of water quality in recreational areas can be a substantial public health concern. Models can help beach managers make contemporaneous decisions to protect public health at recreational areas, via the use of microbial fate and transport simulation. Approaches to modeling microbial fate and transport vary widely in response to local hydrometeorological contexts, but many parameterizations include terms for base mortality, solar inactivation, and sedimentation of microbial contaminants. Models using these parameterizations can predict up to 87% of variation in observed microbial concentrations in nearshore water, with root mean squared errors ranging from 0.41 to 5.37 log10 Colony Forming Units (CFU) 100 mL−1. This indicates that some models predict microbial fate and transport more reliably than others and that there remains room for model improvement across the board. Model refinement will be integral to microbial fate and transport simulation in the face of less readily observable processes affecting water quality in nearshore areas. Management of contamination phenomena such as the release of storm-associated river plumes and the exchange of contaminants between water and sand at the beach can benefit greatly from optimized fate and transport modeling in the absence of directly observable data.
15
Brester C, Ryzhikov I, Siponen S, Jayaprakash B, Ikonen J, Pitkänen T, Miettinen IT, Torvinen E, Kolehmainen M. Potential and limitations of a pilot-scale drinking water distribution system for bacterial community predictive modelling. THE SCIENCE OF THE TOTAL ENVIRONMENT 2020; 717:137249. [PMID: 32092807 DOI: 10.1016/j.scitotenv.2020.137249] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/10/2019] [Revised: 02/09/2020] [Accepted: 02/09/2020] [Indexed: 06/10/2023]
Abstract
Waterborne disease outbreaks are a persistent and serious threat to public health according to reported incidents across the globe. Online drinking water quality monitoring technologies have evolved substantially and have become more accurate and accessible. However, using online measurements alone is unsuitable for detecting microbial regrowth, potentially including harmful species, ahead of time in the distribution systems. Alternatively, observational data could be collected periodically, e.g., once per week or once per month, and could include a representative set of variables (physicochemical water characteristics, disinfectant concentrations, and bacterial abundances), which would be a valuable source of knowledge for predictive modelling that aims to reveal pathogen-related threats. In this study, we utilised data collected from a pilot-scale drinking water distribution system. A data-driven random forest model was used for predictive modelling and was trained for nowcasting and forecasting abundances of bacterial groups. In all the experiments, we followed the realistic crossline scenario, in which the data used for training and for testing the models are collected from different pipelines. Although nowcasting was more accurate, the 1-week forecasts still accurately predicted the most abundant bacteria and their rapid increases and decreases. In the future, predictive modelling might be used as a tool in designing control measures for opportunistic pathogens that are able to multiply under favourable conditions in drinking water distribution systems (DWDS). Eventually, forecasts could provide practically useful information for controlling regrowth in DWDS.
Affiliation(s)
- Christina Brester: Department of Environmental and Biological Sciences, University of Eastern Finland, P.O. Box 1627, FI-70211 Kuopio, Finland
- Ivan Ryzhikov: Department of Environmental and Biological Sciences, University of Eastern Finland, P.O. Box 1627, FI-70211 Kuopio, Finland
- Sallamaari Siponen: Department of Environmental and Biological Sciences, University of Eastern Finland, P.O. Box 1627, FI-70211 Kuopio, Finland
- Balamuralikrishna Jayaprakash: Department of Health Security, Expert Microbiology Unit, National Institute for Health and Welfare, P.O. Box 95, FI-70701 Kuopio, Finland
- Jenni Ikonen: Department of Health Security, Expert Microbiology Unit, National Institute for Health and Welfare, P.O. Box 95, FI-70701 Kuopio, Finland
- Tarja Pitkänen: Department of Health Security, Expert Microbiology Unit, National Institute for Health and Welfare, P.O. Box 95, FI-70701 Kuopio, Finland
- Ilkka T Miettinen: Department of Health Security, Expert Microbiology Unit, National Institute for Health and Welfare, P.O. Box 95, FI-70701 Kuopio, Finland
- Eila Torvinen: Department of Environmental and Biological Sciences, University of Eastern Finland, P.O. Box 1627, FI-70211 Kuopio, Finland
- Mikko Kolehmainen: Department of Environmental and Biological Sciences, University of Eastern Finland, P.O. Box 1627, FI-70211 Kuopio, Finland
16
Zhang Y, Gao X, Smith K, Inial G, Liu S, Conil LB, Pan B. Integrating water quality and operation into prediction of water production in drinking water treatment plants by genetic algorithm enhanced artificial neural network. WATER RESEARCH 2019; 164:114888. [PMID: 31377525 DOI: 10.1016/j.watres.2019.114888] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 03/18/2019] [Revised: 06/25/2019] [Accepted: 07/18/2019] [Indexed: 06/10/2023]
Abstract
Stringent regulations and deteriorating source water quality could greatly influence the water production capacity of drinking water treatment plants (DWTPs). Using models to predict the performance of DWTPs under stress provides valuable information for decision making and future planning. A hybrid statistical model named HANN was established by combining an artificial neural network (ANN) with a genetic algorithm (GA), with the aim of forecasting the overall performance of DWTPs nationwide in China. Monthly data from 45 DWTPs across China were employed. Water quality parameters such as temperature and chemical oxygen demand (COD) and operational parameters such as electricity consumption and chemical consumption were selected as input variables, while drinking water production was employed as the output. Both preliminary data analysis and principal component analysis (PCA) suggested a clear non-linear relationship between the input and output variables. The structure of the HANN model was optimized using the lowest mean squared error (MSE) as the indicator. The resultant HANN model performed well when simulating the training datasets. Its predictive accuracy on the independent test datasets improved as more training datasets were supplied, and its performance was consistently higher than that of the independent multi-layered ANN models, with the coefficient of determination (R²) as the indicator, indicating that the HANN model was capable of capturing complex non-linear relationships and of extrapolation. Results from the accuracy test, Garson sensitivity analysis and analysis of variance (ANOVA) suggested that the quantity of water produced by DWTPs was closely linked to water quality and operational parameters. The scenario analysis showed that the HANN model was capable of predicting variation in water production from variations in the parameters, indicating that the HANN model could be a general management tool for decision makers and DWTP managers to make plans in advance of regulatory changes, source water quality variations and changes in market demand.
Affiliation(s)
- Yanyang Zhang: State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, 210023, China; Research Center for Environmental Nanotechnology (ReCENT), Nanjing University, Nanjing, 210023, China
- Xiang Gao: State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, 210023, China
- Kate Smith: School of Environment, Tsinghua University, Haidian District, Beijing, 100084, China
- Goulven Inial: Plastic Metal Technology (PMT), Veolia Water Technology, France
- Shuming Liu: School of Environment, Tsinghua University, Haidian District, Beijing, 100084, China
- Lenny B Conil: Veolia Research & Innovation (VeRI), Hong Kong, China
- Bingcai Pan: State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, 210023, China; Research Center for Environmental Nanotechnology (ReCENT), Nanjing University, Nanjing, 210023, China
17
Wang Y, Xu C, Zhang S, Yang L, Wang Z, Zhu Y, Yuan J. Development and evaluation of a deep learning approach for modeling seasonality and trends in hand-foot-mouth disease incidence in mainland China. Sci Rep 2019; 9:8046. [PMID: 31142826 PMCID: PMC6541597 DOI: 10.1038/s41598-019-44469-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 08/07/2018] [Accepted: 03/06/2019] [Indexed: 02/03/2023]
Abstract
The high incidence, seasonal pattern and frequent outbreaks of hand, foot, and mouth disease (HFMD) represent a threat for millions of children in mainland China. Advanced response measures are being used to address this. Here, we aimed to model the time series of notified HFMD cases from June 2008 to June 2018 with a long short-term memory (LSTM) network, and compared its performance with that of autoregressive integrated moving average (ARIMA) and nonlinear auto-regressive neural network (NAR) models. The results indicated that the best-fitting LSTM outperformed the best-performing NAR and seasonal ARIMA (SARIMA) models in forecasting, on both the modeling dataset and the two robustness-test datasets, yielding the lowest root mean square error, mean absolute error and mean absolute percentage error. The epidemic trends of HFMD remained stable during the study period, but reported cases stayed at significantly high levels with a notable high-risk seasonality in summer, and the incident cases projected by the LSTM would remain fairly high, with a slightly upward trend in the future. In this regard, the LSTM approach should be highlighted for forecasting HFMD epidemics, thereby assisting decision makers in making efficient decisions based on early detection of disease incidence.
Affiliation(s)
- Yongbin Wang: Department of Epidemiology and Health Statistics, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, P.R. China
- Chunjie Xu: Department of Occupational and Environmental Health, School of Public Health, Capital Medical University, Beijing, 100069, P.R. China
- Shengkui Zhang: Department of Epidemiology and Health Statistics, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, P.R. China
- Li Yang: Department of Epidemiology and Health Statistics, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, P.R. China
- Zhende Wang: Department of Epidemiology and Health Statistics, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, P.R. China
- Ying Zhu: Department of Epidemiology and Health Statistics, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, P.R. China
- Juxiang Yuan: Department of Epidemiology and Health Statistics, School of Public Health, North China University of Science and Technology, Tangshan, Hebei Province, P.R. China
18
Laureano-Rosario AE, Duncan AP, Symonds EM, Savic DA, Muller-Karger FE. Predicting culturable enterococci exceedances at Escambron Beach, San Juan, Puerto Rico using satellite remote sensing and artificial neural networks. JOURNAL OF WATER AND HEALTH 2019; 17:137-148. [PMID: 30758310 DOI: 10.2166/wh.2018.128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Predicting recreational water quality is key to protecting public health from exposure to wastewater-associated pathogens. It is not feasible to monitor recreational waters for all pathogens; therefore, monitoring programs use fecal indicator bacteria (FIB), such as enterococci, to identify wastewater pollution. Artificial neural networks (ANNs) were used to predict when culturable enterococci concentrations exceeded the U.S. Environmental Protection Agency (U.S. EPA) Recreational Water Quality Criteria (RWQC) at Escambron Beach, San Juan, Puerto Rico. Ten years of culturable enterococci data were analyzed together with satellite-derived sea surface temperature (SST), direct normal irradiance (DNI), turbidity, and dew point, along with local observations of precipitation and mean sea level (MSL). The factors identified as the most relevant for enterococci exceedance predictions based on the U.S. EPA RWQC were DNI, turbidity, cumulative 48 h precipitation, MSL, and SST; they predicted culturable enterococci exceedances with an accuracy of 75% and power greater than 60% based on the Receiver Operating Characteristic curve and F-Measure metrics. Results show the applicability of satellite-derived data and ANNs to predict recreational water quality at Escambron Beach. Future work should incorporate local sanitary survey data to predict risky recreational water conditions and protect human health.
Affiliation(s)
- Abdiel E Laureano-Rosario: College of Marine Science, University of South Florida, 140 7th Avenue South, Saint Petersburg, FL 33701, USA
- Andrew P Duncan: Centre for Water Systems, University of Exeter, Harrison Building, North Park Road, Exeter EX4 4QF, UK
- Erin M Symonds: College of Marine Science, University of South Florida, 140 7th Avenue South, Saint Petersburg, FL 33701, USA
- Dragan A Savic: Centre for Water Systems, University of Exeter, Harrison Building, North Park Road, Exeter EX4 4QF, UK
- Frank E Muller-Karger: College of Marine Science, University of South Florida, 140 7th Avenue South, Saint Petersburg, FL 33701, USA