1
Nong X, Lai C, Chen L, Wei J. A novel coupling interpretable machine learning framework for water quality prediction and environmental effect understanding in different flow discharge regulations of hydro-projects. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 950:175281. [PMID: 39117235 DOI: 10.1016/j.scitotenv.2024.175281] [Received: 06/21/2024] [Revised: 08/01/2024] [Accepted: 08/02/2024] [Indexed: 08/10/2024]
Abstract
Machine learning models (MLMs) have been increasingly used to forecast water pollution. However, the "black box" characteristic for understanding mechanism processes still limits the applicability of MLMs for water quality management in hydro-projects under complex and frequently artificial regulation. This study proposes an interpretable machine learning framework for water quality prediction coupled with a hydrodynamic (flow discharge) scenario-based Random Forest (RF) model with multiple model-agnostic techniques and quantifies global, local, and joint interpretations (i.e., partial dependence, individual conditional expectation, and accumulated local effects) of environmental factor implications. The framework was applied and verified to predict the permanganate index (CODMn) under different flow discharge regulation scenarios in the Middle Route of the South-to-North Water Diversion Project of China (MRSNWDPC). A total of 4664 sampling cases data matrices, including water quality, meteorological, and hydrological indicators from eight national stations along the main canal of the MRSNWDPC, were collected from May 2019 to December 2020. The results showed that the RF models were effective in forecasting CODMn in all flow discharge scenarios, with a mean square error, coefficient of determination, and mean absolute error of 0.006-0.026, 0.481-0.792, and 0.069-0.104, respectively, in the testing dataset. A global interpretation indicated that dissolved oxygen, flow discharge, and surface pressure are the three most important variables of CODMn. Local and joint interpretations indicated that the RF-based prediction model provides a basic understanding of the physical mechanisms of environmental systems. 
The proposed framework can effectively learn the fundamental environmental implications of water quality variations and provide reliable prediction performance, highlighting the importance of model interpretability for trustworthy machine learning applications in water management projects. This study provides scientific references for applying advanced data-driven MLMs to water quality forecasting and a reliable methodological framework for water quality management and similar hydro-projects.
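The partial dependence and individual conditional expectation (ICE) interpretations named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_model`, the sample rows, and the grid are hypothetical stand-ins for a fitted Random Forest and the monitoring data. An ICE curve varies one feature for a single sample while holding its other features fixed, and the partial dependence curve is the average of the ICE curves over all samples.

```python
from statistics import mean


def ice_curves(predict, rows, feature, grid):
    """One curve per sample: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = list(row)       # copy so the original sample is untouched
            modified[feature] = value  # overwrite only the feature of interest
            curve.append(predict(modified))
        curves.append(curve)
    return curves


def partial_dependence(predict, rows, feature, grid):
    """Average the ICE curves point by point to get the partial dependence."""
    curves = ice_curves(predict, rows, feature, grid)
    return [mean(column) for column in zip(*curves)]


def toy_model(x):
    """Hypothetical stand-in for a fitted model (assumption, not from the paper)."""
    return 2.0 * x[0] + x[1]


rows = [[1.0, 0.0], [2.0, 2.0], [3.0, 4.0]]  # hypothetical samples
grid = [0.0, 1.0, 2.0]                        # values tried for feature 0
pd_curve = partial_dependence(toy_model, rows, 0, grid)
```

Plotting the individual ICE curves alongside their average shows whether the average hides samples that respond differently to the same feature.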
Affiliation(s)
- Xizhi Nong
- College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China; State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084, China; Centre for Urban Sustainability and Resilience, Department of Civil, Environmental and Geomatic Engineering, University College London, London WC1E 6BT, UK; School of Computing and Engineering, University of West London, London W5 5RF, UK
- Cheng Lai
- College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China
- Lihua Chen
- College of Civil Engineering and Architecture, Guangxi University, Nanning 530004, China.
- Jiahua Wei
- State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084, China
2
Qu Q, Wang S, Hu X, Mu L. The impact of anthropogenic pressures on microbial diversity and river multifunctionality relationships on a global scale. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 950:175293. [PMID: 39111414 DOI: 10.1016/j.scitotenv.2024.175293] [Received: 02/28/2024] [Revised: 07/29/2024] [Accepted: 08/03/2024] [Indexed: 08/28/2024]
Abstract
Conserving biodiversity is crucial for maintaining essential ecosystem functions, as indicated by the positive relationships between biodiversity and ecosystem functioning. However, the impacts of declining biodiversity on ecosystem functions in response to mounting human pressures remain uncertain. This uncertainty arises from the complexity of trade-offs among human activities, climate change, river properties, and biodiversity, which have not been comprehensively addressed collectively. Here, we provide evidence that river biodiversity was significantly and positively associated with multifunctionality and contributed to key ecosystem functions such as microbially driven water purification, leaf litter decomposition and pathogen control. However, human pressure led to abrupt changes in microbial diversity and river multifunctionality relationships at a human pressure value of 0.5. In approximately 30 % (N = 58) of countries globally, the ratio of area above this threshold exceeded the global average (∼11 %), especially in Europe. Results show that human pressure affected ecosystem functions through direct effects and interactive effects. We provide more direct evidence that the nonadditive effects triggered by prevailing human pressure impact the multifunctionality of rivers globally. Under high levels of human stress, the beneficial effects of biodiversity on nutrient cycling, carbon storage, gross primary productivity, leaf litter decomposition, and pathogen control tend to diminish. Our findings highlight that considering interactions between human pressure and local abiotic and biotic factors is key for understanding the fate of river ecosystems under climate change and increasing human pressure.
Affiliation(s)
- Qian Qu
- Key Laboratory of Pollution Processes and Environmental Criteria (Ministry of Education), Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China
- Shuting Wang
- Key Laboratory of Pollution Processes and Environmental Criteria (Ministry of Education), Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China
- Xiangang Hu
- Key Laboratory of Pollution Processes and Environmental Criteria (Ministry of Education), Tianjin Key Laboratory of Environmental Remediation and Pollution Control, College of Environmental Science and Engineering, Nankai University, Tianjin 300350, China.
- Li Mu
- Tianjin Key Laboratory of Agro-Environment and Product Safety, Key Laboratory for Environmental Factors Controlling Agro-Product Quality Safety (Ministry of Agriculture and Rural Affairs), Institute of Agro-Environmental Protection, Ministry of Agriculture and Rural Affairs, 300191 Tianjin, China.
3
Patel L, Singh R, Gowd SC, Thottathil SD. Environmental determinants of aerobic methane oxidation in a tropical river network. WATER RESEARCH 2024; 265:122257. [PMID: 39178592 DOI: 10.1016/j.watres.2024.122257] [Received: 04/15/2024] [Revised: 07/08/2024] [Accepted: 08/12/2024] [Indexed: 08/26/2024]
Abstract
Aerobic methane oxidation (MOX) significantly reduces methane (CH4) emissions from inland water bodies and is, therefore, an important determinant of the global CH4 budget. Yet, the magnitude and controls of MOX rates in rivers - a quantitatively significant natural source of atmospheric CH4 - are poorly constrained. Here, we conducted a series of incubation experiments to understand the magnitude and environmental controls of MOX rates in tropical fluvial systems. We observed a large variability in MOX rate (0.03-3.45 μmol l⁻¹ d⁻¹) shaped by a suite of environmental variables. Accordingly, we developed an empirical model for MOX that incorporates key environmental drivers, including temperature, CH4, total phosphorus, and dissolved oxygen (O2) concentrations, based on the results of our incubation experiments. We show that the temperature dependency of MOX (activation energy: 0.66 ± 0.18 eV) is lower than that of sediment methanogenesis (0.71 ± 0.21 eV) in the studied tropical fluvial network. Furthermore, we observed a non-linear relationship between O2 concentration and MOX, with the highest MOX rate occurring at ∼135 μmol O2 l⁻¹; above or below this "optimal O2" concentration, the MOX rate shows a gradual decline. Together, our results suggest that the relatively lower temperature response of MOX compared to methanogenesis, along with the projected decrease of O2 concentration due to organic pollution, may cause elevated CH4 emission from tropical southeast Asian rivers. Since estimation of CH4 oxidation is often neglected in routine CH4 monitoring programs, the model developed here may help to integrate MOX rate into process-based models for fluvial CH4 budget.
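The two activation energies in this abstract can be compared with a short worked sketch. It assumes the common Boltzmann-Arrhenius form, rate ∝ exp(-Ea / (k·T)), which the abstract does not spell out, and the 20 °C and 30 °C temperatures are illustrative choices rather than values from the study. Only the central estimates (0.66 eV and 0.71 eV) are used, so the reported uncertainties are ignored here.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV per kelvin


def rate_ratio(activation_energy_ev, t_cold_k, t_warm_k):
    """Factor by which a rate increases when warming from t_cold_k to t_warm_k,
    assuming rate is proportional to exp(-Ea / (k * T))."""
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_cold_k - 1.0 / t_warm_k)
    return math.exp(exponent)


# Illustrative warming from 20 degC to 30 degC (assumed, not from the abstract).
T_COLD_K, T_WARM_K = 293.15, 303.15
mox_ratio = rate_ratio(0.66, T_COLD_K, T_WARM_K)             # aerobic methane oxidation
methanogenesis_ratio = rate_ratio(0.71, T_COLD_K, T_WARM_K)  # sediment methanogenesis

print(f"MOX rate increases by a factor of {mox_ratio:.2f}")
print(f"Methanogenesis rate increases by a factor of {methanogenesis_ratio:.2f}")
```

Under these assumptions the production term grows slightly faster with warming than the oxidation term, which is the direction of the effect the abstract describes.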
Affiliation(s)
- Latika Patel
- Department of Environmental Science and Engineering, SRM University-AP, Amaravati, Andhra Pradesh 522 502, India
- Rashmi Singh
- Department of Environmental Science and Engineering, SRM University-AP, Amaravati, Andhra Pradesh 522 502, India
- Sarath C Gowd
- Department of Environmental Science and Engineering, SRM University-AP, Amaravati, Andhra Pradesh 522 502, India
- Shoji D Thottathil
- Department of Environmental Science and Engineering, SRM University-AP, Amaravati, Andhra Pradesh 522 502, India.
4
Huang S, Wang Y, Xia J. Which riverine water quality parameters can be predicted by meteorologically-driven deep learning? THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 946:174357. [PMID: 38945234 DOI: 10.1016/j.scitotenv.2024.174357] [Received: 04/29/2024] [Revised: 06/26/2024] [Accepted: 06/27/2024] [Indexed: 07/02/2024]
Abstract
River water quality has been significantly impacted by climate change and extreme weather events worldwide. Despite a growing number of studies on deep learning techniques for river water quality management, which riverine water quality parameters can be well predicted by meteorologically-driven deep learning still requires further investigation. Here we explored the prediction performance of a traditional Recurrent Neural Network, a Long Short-Term Memory network (LSTM), and a Gated Recurrent Unit (GRU) using meteorological conditions as inputs in the Dahei River basin. We found that deep learning models (i.e., LSTM and GRU) demonstrated remarkable effectiveness in predicting multiple water quality parameters at daily scale, including water temperature, dissolved oxygen, electrical conductivity, chemical oxygen demand, ammonia nitrogen, total phosphorus, and total nitrogen, but not turbidity. The GRU model performed best with an average determination coefficient of 0.94. Compared to the daily-average prediction, the GRU model exhibited a limited error increment of 10-40 % for most water quality parameters when predicting daily extreme values (i.e., the maximum and minimum). Moreover, deep learning performed better when predicting multiple water quality parameters collectively than when predicting them individually, enabling a more comprehensive understanding of the river water quality dynamics from meteorological data. This study holds the promise of applying meteorologically-driven deep learning techniques for water quality prediction to a broader range of watersheds, particularly in chemically ungauged areas.
Affiliation(s)
- Sheng Huang
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China
- Yueling Wang
- Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China.
- Jun Xia
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China; Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China.
5
Huang S, Xia J, Wang Y, Wang G, She D, Lei J. Pollution loads in the middle-lower Yangtze river by coupling water quality models with machine learning. WATER RESEARCH 2024; 263:122191. [PMID: 39098157 DOI: 10.1016/j.watres.2024.122191] [Received: 10/28/2023] [Revised: 07/26/2024] [Accepted: 07/29/2024] [Indexed: 08/06/2024]
Abstract
Pollution control and environmental protection of the Yangtze River have received major attention in China. However, modeling the river's pollution load remains challenging due to limited monitoring and unclear spatiotemporal distribution of pollution sources. Specifically, the contribution of anthropogenic activities to the pollution has been underestimated in previous research. Here, we coupled a hydrodynamic-based water quality (HWQ) model with a machine learning (ML) model, namely an attention-based Gated Recurrent Unit, to decipher the daily pollution loads (i.e., chemical oxygen demand, COD; total phosphorus, TP) and their sources in the Middle-Lower Yangtze River from 2014 to 2018. The coupled HWQ-ML model outperformed the standalone ML model, with KGE values ranging from 0.77 to 0.91 for COD and from 0.47 to 0.64 for TP, while also reducing parameter uncertainty. When examining the relative contributions at the Middle Yangtze River Hankou cross-section, we observed that the main stream and tributaries, lateral anthropogenic discharges, and parameter uncertainty contributed 15, 66, and 19% to COD, and 58, 35, and 7% to TP, respectively. For the Lower Yangtze River Datong cross-section, the contributions were 6, 69, and 25% for COD and 41, 42, and 17% for TP. According to the attention weights of the coupled model, the primary drivers of lateral anthropogenic pollution sources, in descending order of importance, were temperature, date, and precipitation, reflecting seasonal pollution discharge, industrial effluent, and first flush effect and combined sewer overflows, respectively. This study emphasizes the synergy between physical modeling and machine learning, offering new insights into pollution load dynamics in the Yangtze River.
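For readers unfamiliar with the KGE score reported in this abstract, the following is a minimal sketch of the Kling-Gupta Efficiency. It assumes the original 2009 formulation (correlation, variability ratio, and bias ratio combined as a Euclidean distance from the ideal point); the abstract does not say which KGE variant the authors used, and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def kge(observed, simulated):
    """Kling-Gupta Efficiency (2009 form, assumed): 1 is a perfect match."""
    mu_o, mu_s = mean(observed), mean(simulated)
    sd_o, sd_s = pstdev(observed), pstdev(simulated)
    covariance = mean([(o - mu_o) * (s - mu_s) for o, s in zip(observed, simulated)])
    r = covariance / (sd_o * sd_s)  # linear correlation
    alpha = sd_s / sd_o             # variability ratio
    beta = mu_s / mu_o              # bias ratio
    return 1.0 - sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


observed = [2.0, 4.0, 6.0, 8.0]        # hypothetical daily loads
doubled = [2.0 * v for v in observed]  # perfectly correlated but biased simulation
score_perfect = kge(observed, observed)
score_biased = kge(observed, doubled)
```

The doubled series shows why KGE is stricter than correlation alone: it is perfectly correlated with the observations but is still penalized for its bias and inflated variability.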
Affiliation(s)
- Sheng Huang
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China; Institute for Water-Carbon Cycles & Carbon Neutrality, Wuhan University, Wuhan 430072, China; Department of Civil and Environmental Engineering, National University of Singapore, 117578, Singapore
- Jun Xia
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China; Institute for Water-Carbon Cycles & Carbon Neutrality, Wuhan University, Wuhan 430072, China; Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China.
- Yueling Wang
- Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
- Gangsheng Wang
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China; Institute for Water-Carbon Cycles & Carbon Neutrality, Wuhan University, Wuhan 430072, China.
- Dunxian She
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China; Institute for Water-Carbon Cycles & Carbon Neutrality, Wuhan University, Wuhan 430072, China
- Jiarui Lei
- Department of Civil and Environmental Engineering, National University of Singapore, 117578, Singapore
6
He H, Boehringer T, Schäfer B, Heppell K, Beck C. Analyzing spatio-temporal dynamics of dissolved oxygen for the River Thames using superstatistical methods and machine learning. Sci Rep 2024; 14:21288. [PMID: 39266599 DOI: 10.1038/s41598-024-72084-w] [Received: 06/28/2024] [Accepted: 09/03/2024] [Indexed: 09/14/2024] Open
Abstract
By employing superstatistical methods and machine learning, we analyze time series data of water quality indicators for the River Thames (UK). The indicators analyzed include dissolved oxygen, temperature, electrical conductivity, pH, ammonium, turbidity, and rainfall, with a specific focus on the dynamics of dissolved oxygen. After detrending, the probability density functions of dissolved oxygen fluctuations exhibit heavy tails that are effectively modeled using q-Gaussian distributions. Our findings indicate that the multiplicative Empirical Mode Decomposition method stands out as the most effective detrending technique, yielding the highest log-likelihood in nearly all fittings. We also observe that the optimally fitted width parameter of the q-Gaussian shows a negative correlation with the distance to the sea, highlighting the influence of geographical factors on water quality dynamics. In the context of same-time prediction of dissolved oxygen, regression analysis incorporating various water quality indicators and temporal features identifies the Light Gradient Boosting Machine as the best model. SHapley Additive exPlanations reveal that temperature, pH, and time of year play crucial roles in the predictions. Furthermore, we use the Transformer, a state-of-the-art machine learning model, to forecast dissolved oxygen concentrations. For long-term forecasting, the Informer model consistently delivers superior performance, achieving the lowest Mean Absolute Error (0.15) and Symmetric Mean Absolute Percentage Error (21.96%) with the 192 historical time steps that we used. This performance is attributed to the Informer's ProbSparse self-attention mechanism, which allows it to capture long-range dependencies in time-series data more effectively than other machine learning models. It effectively recognizes the half-life cycle of dissolved oxygen, with particular attention to critical periods such as morning to early afternoon, late evening to early morning, and key intervals between the 16th and 26th quarter-hours of the previous half-day. Our findings provide valuable insights for policymakers involved in ecological health assessments, aiding in accurate predictions of river water quality and the maintenance of healthy aquatic ecosystems.
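The heavy tails attributed to q-Gaussian distributions in this abstract can be made concrete with a small sketch. It uses the standard q-exponential definition, e_q(x) = [1 + (1 - q)x]^(1/(1 - q)), and leaves the density unnormalized; the parameter name `beta` (an inverse-width parameter) and the values q = 1.5 and x = 3 are illustrative assumptions, not fitted values from the paper.

```python
import math


def q_exponential(x, q):
    """q-exponential; reduces to the ordinary exponential as q approaches 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    # For q < 1 the function is cut off to zero where the base is not positive.
    return base ** (1.0 / (1.0 - q)) if base > 0.0 else 0.0


def q_gaussian_unnormalized(x, q, beta):
    """Unnormalized q-Gaussian density, e_q(-beta * x^2)."""
    return q_exponential(-beta * x * x, q)


tail_gaussian = q_gaussian_unnormalized(3.0, 1.0, 1.0)  # ordinary Gaussian shape
tail_heavy = q_gaussian_unnormalized(3.0, 1.5, 1.0)     # power-law tail for q > 1
```

For q > 1 the tail decays as a power law rather than exponentially, so large fluctuations are far more probable than under a Gaussian with the same `beta`.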
Affiliation(s)
- Hankun He
- Centre for Complex Systems, Queen Mary University of London, London, UK.
- Benjamin Schäfer
- Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Kate Heppell
- Chilterns National Landscape, High Wycombe, UK
- School of Geography, Queen Mary University of London, London, UK
- Christian Beck
- Centre for Complex Systems, Queen Mary University of London, London, UK
7
Elnabwy MT, Alshahri AH, El-Gamal AA. An integrated deep learning approach for modeling dissolved oxygen concentration at coastal inlets based on hydro-climatic parameters. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 367:122018. [PMID: 39111007 DOI: 10.1016/j.jenvman.2024.122018] [Received: 01/06/2024] [Revised: 07/15/2024] [Accepted: 07/26/2024] [Indexed: 08/15/2024]
Abstract
Climate change has a significant impact on dissolved oxygen (DO) concentrations, particularly in coastal inlets where numerous human activities occur. Due to the various water quality (WQ), hydrological, and climatic parameters that influence this phenomenon, predicting and modeling DO variation is a challenging process. Accordingly, this study introduces an innovative Deep Learning Neural Network (DLNN) methodology to model and predict DO concentrations for the Egyptian Rashid coastal inlet, leveraging field-recorded WQ and hydroclimatic datasets. Initially, statistical and exploratory data analyses are performed to provide a thorough understanding of the relationship between DO fluctuations and associated WQ and hydroclimatic stressors. As an initial step towards developing an effective DO predictive model, conventional Machine Learning (ML) approaches such as Gaussian Process Regression (GPR), Support Vector Regression (SVR), and Decision Tree Regressor (DTR) are employed. Subsequently, a DLNN approach is utilized to validate the prediction capabilities of the investigated conventional ML approaches. Finally, a sensitivity analysis is conducted to evaluate the impact of WQ and hydroclimatic parameters on predicted DO. The outcomes demonstrate that DLNN significantly improves DO prediction accuracy by 4% compared to the best-performing ML approach, achieving a Correlation Coefficient of 0.95 with a root mean square error (RMSE) of 0.42 mg/l. Solar radiation (SR), pH, water levels (WL), and atmospheric pressure (P) emerge as the most significant hydroclimatic parameters influencing DO fluctuations. Ultimately, the developed models could serve as effective indicators for coastal authorities to monitor DO changes resulting from accelerated climate change along the Egyptian coast.
Affiliation(s)
- Mohamed T Elnabwy
- Coastal Research Institute (CORI), National Water Research Center, Alexandria 21415, Egypt; Civil Engineering Department, Faculty of Engineering, Damietta University, New Damietta 34517, Egypt.
- Abdullah H Alshahri
- Department of Civil Engineering, College of Engineering, Taif University, P.O. Box 11099, Taif City 21974, Saudi Arabia.
- Ayman A El-Gamal
- Department of Marine Geology, Coastal Research Institute (CoRI), National Water Research Center, Alexandria 21415, Egypt.
8
Fu H, Zheng W, Duan W, Fang G, Duan X, Wang S, Feng C, Zhu S. Overlooked Roles and Transformation of Carbon-Centered Radicals Produced from Natural Organic Matter in a Thermally Activated Persulfate System. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:14949-14960. [PMID: 39126387 DOI: 10.1021/acs.est.4c06770] [Indexed: 08/12/2024]
Abstract
The presence and induced secondary reactions of natural organic matter (NOM) significantly affect the remediation efficacy of in situ chemical oxidation (ISCO) systems. However, it remains unclear how this process relates to organic radicals generated from reactions between the NOM and oxidants. The study, for the first time, reported the vital roles and transformation pathways of carbon-centered radicals (CCR•) derived from NOM in activated persulfate (PS) systems. Results showed that both typical terrestrial/aquatic NOM isolates and collected NOM samples produced CCR• by scavenging activated PS and greatly enhanced the dehalogenation performance under anoxic conditions. Under oxic conditions, newly formed CCR• could be oxidized by O2 and generate organic peroxide intermediates (ROO•) to catalytically yield additional •OH without the involvement of PS. Nuclear magnetic resonance (NMR) and Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS) results indicated that CCR• predominantly formed from carboxyl and aliphatic structures instead of aromatics within NOM through hydrogen abstraction and decarboxylation reactions by SO4•- or •OH. Specific anoxic reactions (i.e., dehalogenation and intramolecular cross-coupling reactions) further promoted the transformation of CCR• to more unsaturated and polymerized/condensed compounds. In contrast, oxic propagation of ROO• enhanced bond breakage/ring cleavage and degradation of CCR• due to the presence of additional •OH and self-decomposition. This study provides novel insights into the role of NOM and O2 in ISCO and the development of engineered strategies for creating organic radicals capable of enhancing the remediation of specific contaminants and recovering organic carbon.
Affiliation(s)
- Hengyi Fu
- The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, School of Environment and Energy, South China University of Technology, Guangzhou 510006, P. R. China
- Wenxiao Zheng
- The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, School of Environment and Energy, South China University of Technology, Guangzhou 510006, P. R. China
- Weijian Duan
- The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, School of Environment and Energy, South China University of Technology, Guangzhou 510006, P. R. China
- Guodong Fang
- Key Laboratory of Soil Environment and Pollution Remediation, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, P. R. China
- Xiaoguang Duan
- School of Chemical Engineering, The University of Adelaide, Adelaide, SA 5005, Australia
- Shaobin Wang
- School of Chemical Engineering, The University of Adelaide, Adelaide, SA 5005, Australia
- Chunhua Feng
- The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, School of Environment and Energy, South China University of Technology, Guangzhou 510006, P. R. China
- Shishu Zhu
- The Key Lab of Pollution Control and Ecosystem Restoration in Industry Clusters, Ministry of Education, School of Environment and Energy, South China University of Technology, Guangzhou 510006, P. R. China
9
Karbasi M, Ali M, Bateni SM, Jun C, Jamei M, Farooque AA, Yaseen ZM. Multi-step ahead forecasting of electrical conductivity in rivers by using a hybrid Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model enhanced by Boruta-XGBoost feature selection algorithm. Sci Rep 2024; 14:15051. [PMID: 38951605 PMCID: PMC11217395 DOI: 10.1038/s41598-024-65837-0] [Received: 03/25/2024] [Accepted: 06/24/2024] [Indexed: 07/03/2024] Open
Abstract
Electrical conductivity (EC) is widely recognized as one of the most essential water quality metrics for predicting salinity and mineralization. In the current research, the EC of two Australian rivers (Albert River and Barratta Creek) was forecasted for up to 10 days using a novel deep learning algorithm (Convolutional Neural Network combined with Long Short-Term Memory Model, CNN-LSTM). The Boruta-XGBoost feature selection method was used to determine the significant inputs (time series lagged data) to the model. To benchmark the Boruta-XGB-CNN-LSTM models, three machine learning approaches, multi-layer perceptron neural network (MLP), K-nearest neighbour (KNN), and extreme gradient boosting (XGBoost), were used. Different statistical metrics, such as correlation coefficient (R), root mean square error (RMSE), and mean absolute percentage error (MAPE), were used to assess the models' performance. From 10 years of data in both rivers, 7 years (2012-2018) were used as a training set, and 3 years (2019-2021) were used for testing the models. Application of the Boruta-XGB-CNN-LSTM model to one-day-ahead forecasting of EC showed that, at both stations, it forecast EC better than the other machine learning models on the test dataset (R = 0.9429, RMSE = 45.6896, MAPE = 5.9749 for Albert River, and R = 0.9215, RMSE = 43.8315, MAPE = 7.6029 for Barratta Creek). Given its better performance in both rivers, this model was used to forecast EC 3-10 days ahead. The results showed that the Boruta-XGB-CNN-LSTM model is capable of forecasting EC for the next 10 days, and that its performance decreased slightly as the forecasting horizon increased from 3 to 10 days. The results of this study show that the Boruta-XGB-CNN-LSTM model can be used as a good soft computing method for accurately predicting how the EC will change in rivers.
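The "time series lagged data" inputs and multi-day forecasting horizons described in this abstract can be sketched as a sliding-window transformation. This is a generic illustration, not the authors' preprocessing: the function name `make_windows`, the lag count, and the sample series are all hypothetical, and the feature selection step (Boruta-XGBoost) is not reproduced.

```python
def make_windows(series, n_lags, horizon):
    """Pair each window of `n_lags` past values with the value `horizon` steps ahead.

    horizon=1 targets the next observation; horizon=10 targets ten steps ahead.
    """
    inputs, targets = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        inputs.append(series[t - n_lags:t])      # the n_lags values before time t
        targets.append(series[t + horizon - 1])  # the value `horizon` steps after the window
    return inputs, targets


ec_series = [10, 11, 12, 13, 14, 15]  # hypothetical daily EC readings
lag_inputs, lag_targets = make_windows(ec_series, n_lags=3, horizon=2)
```

Each row of `lag_inputs` would then be a candidate feature vector, from which a selection method could keep only the lags that carry predictive information.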
Affiliation(s)
- Masoud Karbasi
- Water Engineering Department, Faculty of Agriculture, University of Zanjan, Zanjan, Iran.
- Mumtaz Ali
- UniSQ College, University of Southern Queensland, Springfield Campus, QLD, 4301, Australia
- Sayed M Bateni
- Department of Civil, Environmental and Construction Engineering and Water Resources Research Center, University of Hawaii at Manoa, Honolulu, HI, 96822, USA
- Changhyun Jun
- Department of Civil and Environmental Engineering, College of Engineering, Chung-Ang University, Seoul, Republic of Korea.
- Mehdi Jamei
- Faculty of Civil Engineering and Architecture, Shahid Chamran University of Ahvaz, Ahvaz, Iran
- New Era and Development in Civil Engineering Research Group, Scientific Research Center, Al-Ayen University, Thi-Qar, Nasiriyah, 64001, Iraq
- Aitazaz Ahsan Farooque
- Canadian Centre for Climate Change and Adaptation, University of Prince Edward Island, St Peters Bay, PE, Canada.
- Faculty of Sustainable Design Engineering, University of Prince Edward Island, Charlottetown, PE, C1A4P3, Canada.
- Zaher Mundher Yaseen
- Civil and Environmental Engineering Department, King Fahd University of Petroleum & Minerals, 31261, Dhahran, Saudi Arabia
10
Huang S, Xia J, Wang Y, Lei J, Wang G. Water quality prediction based on sparse dataset using enhanced machine learning. ENVIRONMENTAL SCIENCE AND ECOTECHNOLOGY 2024; 20:100402. [PMID: 38585199 PMCID: PMC10998092 DOI: 10.1016/j.ese.2024.100402] [Received: 03/02/2023] [Revised: 02/18/2024] [Accepted: 02/19/2024] [Indexed: 04/09/2024]
Abstract
Water quality in surface bodies remains a pressing issue worldwide. While some regions have rich water quality data, less attention is given to areas that lack sufficient data. Therefore, it is crucial to explore novel ways of managing source-oriented surface water pollution in scenarios with infrequent data collection such as weekly or monthly. Here we showed sparse-dataset-based prediction of water pollution using machine learning. We investigated the efficacy of a traditional Recurrent Neural Network alongside three Long Short-Term Memory (LSTM) models, integrated with the Load Estimator (LOADEST). The research was conducted at a river-lake confluence, an area with intricate hydrological patterns. We found that the Self-Attentive LSTM (SA-LSTM) model outperformed the other three machine learning models in predicting water quality, achieving Nash-Sutcliffe Efficiency (NSE) scores of 0.71 for CODMn and 0.57 for NH3N when utilizing LOADEST-augmented water quality data (referred to as the SA-LSTM-LOADEST model). The SA-LSTM-LOADEST model improved upon the standalone SA-LSTM model by reducing the Root Mean Square Error (RMSE) by 24.6% for CODMn and 21.3% for NH3N. Furthermore, the model maintained its predictive accuracy when data collection intervals were extended from weekly to monthly. Additionally, the SA-LSTM-LOADEST model demonstrated the capability to forecast pollution loads up to ten days in advance. This study shows promise for improving water quality modeling in regions with limited monitoring capabilities.
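The Nash-Sutcliffe Efficiency (NSE) scores quoted in this abstract follow a standard definition, sketched below: one minus the ratio of the model's squared error to the squared deviation of the observations from their mean. The sample values are hypothetical and are not data from the study.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_observed = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    squared_deviation = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - squared_error / squared_deviation


observed = [1.0, 2.0, 3.0, 4.0]  # hypothetical concentrations
mean_only = [2.5] * 4            # a "model" that always predicts the observed mean
nse_perfect = nse(observed, observed)
nse_baseline = nse(observed, mean_only)
```

Read against this baseline, the reported scores of 0.71 and 0.57 indicate that the model explains a substantial share of the variance that a constant mean prediction would miss.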
Affiliation(s)
- Sheng Huang
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China
- Institute for Water-Carbon Cycles and Carbon Neutrality, Wuhan University, Wuhan 430072, China
- Department of Civil and Environmental Engineering, National University of Singapore, 117578 Singapore
- Jun Xia
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China
- Institute for Water-Carbon Cycles and Carbon Neutrality, Wuhan University, Wuhan 430072, China
- Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
- Yueling Wang
- Key Laboratory of Water Cycle and Related Land Surface Processes, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
- Jiarui Lei
- Department of Civil and Environmental Engineering, National University of Singapore, 117578 Singapore
- Gangsheng Wang
- State Key Laboratory of Water Resources Engineering and Management, Wuhan University, Wuhan 430072, China
- Institute for Water-Carbon Cycles and Carbon Neutrality, Wuhan University, Wuhan 430072, China
|
11
|
Hu Y, Liu C, Wollheim WM, Jiao T, Ma M. A hybrid deep learning approach to predict hourly riverine nitrate concentrations using routine monitored data. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 360:121097. [PMID: 38733844 DOI: 10.1016/j.jenvman.2024.121097] [Received: 01/29/2024] [Revised: 04/26/2024] [Accepted: 05/04/2024] [Indexed: 05/13/2024]
Abstract
High-frequency data on nitrate (NO3-N) concentrations in waters are increasingly important for understanding watershed system behavior and for ecosystem management, so acquiring such data accurately and economically has become a key concern. This study used coupled deep learning neural networks and routinely monitored data to predict hourly NO3-N concentrations in a river. The hourly NO3-N concentration at the outlet of the Oyster River watershed in New Hampshire, USA, was predicted with a hybrid architecture coupling a Convolutional Neural Network and a Long Short-Term Memory model (CNN-LSTM). The routinely monitored data used for model training (river depth, water temperature, air temperature, precipitation, specific conductivity, pH, and dissolved oxygen concentration) were collected from a nested high-frequency monitoring network, while the high-frequency NO3-N concentration data obtained at the outlet were not included as inputs. The dataset was divided into training, validation, and testing portions at a ratio of 5:3:2. The hybrid CNN-LSTM model with different input lengths (1 d, 3 d, 7 d, 15 d, 30 d) performed comparably to, or even better than, other studies at lower frequencies, with mean Nash-Sutcliffe Efficiency values of 0.60-0.83. Models with shorter input lengths showed both higher accuracy and higher stability. Water level, water temperature, and pH at the monitoring sites were the main factors controlling forecasting performance. This study offers new insight into using deep learning networks with a coupled architecture and routinely monitored data for high-frequency riverine NO3-N concentration forecasting, along with suggestions on variable and input-length selection during preprocessing of input data.
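The 5:3:2 partition mentioned in this abstract can be sketched as follows. This is only an illustration, not the authors' code: it assumes the split is chronological (the abstract does not say), and the function name `split_by_ratio` and the toy series are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_by_ratio(
    samples: Sequence[float],
    ratios: Tuple[int, int, int] = (5, 3, 2),
) -> Tuple[List[float], List[float], List[float]]:
    """Split an ordered series into training, validation, and test parts.

    Assumption: the order of `samples` is preserved (no shuffling), so the
    test part is always the most recent stretch of the record.
    """
    total = sum(ratios)
    n = len(samples)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    train = list(samples[:n_train])
    val = list(samples[n_train:n_train + n_val])
    test = list(samples[n_train + n_val:])  # remainder goes to the test part
    return train, val, test


# Toy example: 20 hypothetical hourly values.
series = [float(i) for i in range(20)]
train, val, test = split_by_ratio(series)
print(len(train), len(val), len(test))  # 10 6 4
```

Keeping the series in time order avoids leaking future observations into training, which is a common choice for sequence models such as LSTMs; a shuffled split would need a different helper.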
Affiliation(s)
- Yue Hu
- State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, 610059, China
- Chuankun Liu
- Sichuan Academy of Environmental Policy and Planning, Department of Ecology and Environment of Sichuan Province, Chengdu, 610059, China.
- Wilfred M Wollheim
- Department of Natural Resources and Environment, University of New Hampshire, Durham, NH, 03824, USA
- Tong Jiao
- State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu, 610059, China
- Meng Ma
- China Institute of Water Resources and Hydropower Research, Beijing, 100048, China
|
12
|
An T, Feng K, Cheng P, Li R, Zhao Z, Xu X, Zhu L. Adaptive prediction for effluent quality of wastewater treatment plant: Improvement with a dual-stage attention-based LSTM network. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 359:120887. [PMID: 38678908 DOI: 10.1016/j.jenvman.2024.120887] [Received: 12/05/2023] [Revised: 04/04/2024] [Accepted: 04/10/2024] [Indexed: 05/01/2024]
Abstract
Accurate effluent prediction plays a crucial role in providing early warning of abnormal effluent and in adjusting feedforward control parameters during wastewater treatment. This study applied a dual-stage attention mechanism based on a long short-term memory network (DA-LSTM) to improve the accuracy of effluent quality prediction. The results showed that input attention (IA) and temporal attention (TA) significantly enhanced the prediction performance of LSTM. Specifically, IA could adaptively adjust feature weights to enhance robustness against input noise, with R² increased by 13.18%. To promote long-term memory ability, TA was used to increase the memory span from 96 h to 168 h. Compared to a single LSTM model, the DA-LSTM model improved prediction accuracy by 5.10%, 2.11%, and 14.47% for COD, TP, and TN, respectively. Additionally, DA-LSTM demonstrated excellent generalization performance in new scenarios, with the R² values for COD, TP, and TN increasing by 22.67%, 20.06%, and 17.14%, respectively, while the MAPE values decreased by 56.46%, 63.08%, and 42.79%. In conclusion, the DA-LSTM model demonstrated excellent prediction performance and generalization ability due to its advantages of feature-adaptive weighting and long-term memory focusing. This has forward-looking significance for achieving efficient early warning of abnormal operating conditions and timely management of control parameters.
Affiliation(s)
- Tong An
- College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Kuanliang Feng
- Zhejiang Supcon Information Technology Co., Ltd, Hangzhou, 310052, China
- Peijin Cheng
- College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Ruojia Li
- College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Zihao Zhao
- Shanghai Municipal Engineering Design Institute (group) Co., Ltd, Shanghai, 200092, China
- Xiangyang Xu
- College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China; Zhejiang Provincial Engineering Laboratory of Water Pollution Control, Hangzhou, 310058, China
- Liang Zhu
- College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China; Innovation Center of Yangtze River Delta, Zhejiang University, Jiashan, 314100, China; Zhejiang Provincial Engineering Laboratory of Water Pollution Control, Hangzhou, 310058, China.
|
13
|
Liu W, Lin S, Li X, Li W, Deng H, Fang H, Li W. Analysis of dissolved oxygen influencing factors and concentration prediction using input variable selection technique: A hybrid machine learning approach. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 357:120777. [PMID: 38581893 DOI: 10.1016/j.jenvman.2024.120777] [Received: 10/19/2023] [Revised: 02/29/2024] [Accepted: 03/26/2024] [Indexed: 04/08/2024]
Abstract
Accurate quantification of dissolved oxygen (DO) is critically important for the protection and management of aquatic ecosystems. Successful applications have utilized mechanistic and data-driven models to simulate DO content in aquatic ecosystems. However, mechanistic models present challenges due to their complex and difficult-to-solve conditions, making them less portable. Additionally, data-driven model predictions are hindered by the challenge of numerous input variables, which affect both the running speed and the prediction performance of the model. To address these challenges, water quality data and meteorological data of the Tanjiang River were obtained. The maximum information coefficient (MIC) input variable selection technique was employed to identify the primary environmental factors influencing DO changes. Furthermore, coupled with support vector regression (SVR), two models (SVR and MIC-SVR) were employed to estimate the DO concentration of the Tanjiang River, and the optimal model was established. The results indicated a shift in the primary pollution factor from ammonia nitrogen to total phosphorus after recent treatment in the Tanjiang River. In comparison with the SVR model, the root mean square error (RMSE) of the MIC-SVR model was reduced by 4.46%, and the Nash-Sutcliffe efficiency coefficient (NSE) was improved by 45.85%. In addition, a study of kernel function selection revealed that considering as many kernel functions as possible is necessary for improving the performance of the SVR model. In conclusion, the proposed MIC-SVR model serves as an effective tool to analyze the relationship between DO and environmental factors, identify the primary causes of low DO, and accurately predict the DO concentration in the Tanjiang River (especially in its middle and lower reaches), thus providing a reference for governmental decision-making on water environmental protection and water resource management.
Affiliation(s)
- Wei Liu
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China
- Shu Lin
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Xiaobao Li
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Wenjing Li
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Hong Deng
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China
- Huaiyang Fang
- The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China
- Weijie Li
- School of Environment and Energy, Guangdong Provincial Key Laboratory of Solid Wastes Pollution Control and Resource Recycling, South China University of Technology, Guangzhou, 510006, China; The Key Laboratory of Water and Air Pollution Control of Guangdong Province, State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, South China Institute of Environmental Sciences, Ministry of Ecology and Environment of the People's Republic of China, Guangzhou, 510535, China.
|
14
|
Saha G, Shen C, Duncan J, Cibin R. Performance evaluation of deep learning based stream nitrate concentration prediction model to fill stream nitrate data gaps at low-frequency nitrate monitoring basins. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 357:120721. [PMID: 38565027 DOI: 10.1016/j.jenvman.2024.120721] [Received: 11/02/2023] [Revised: 02/09/2024] [Accepted: 03/19/2024] [Indexed: 04/04/2024]
Abstract
Accurate and frequent nitrate estimates can provide valuable information on the nitrate transport dynamics. The study aimed to develop a data-driven modeling framework to estimate daily nitrate concentrations at low-frequency nitrate monitoring sites using the daily nitrate concentration and stream discharge information of a neighboring high-frequency nitrate monitoring site. A Long Short-Term Memory (LSTM) based deep learning (DL) modeling framework was developed to predict daily nitrate concentrations. The DL modeling framework performance was compared with two well-established statistical models, including LOADEST and WRTDS-Kalman, in three selected basins in Iowa, USA: Des Moines, Iowa, and Cedar River. The developed DL model performed well with NSE >0.70 and KGE >0.70 for 67% and 79% nitrate monitoring sites, respectively. DL and WRTDS-Kalman models performed better than the LOADEST in nitrate concentration and load estimation for all low-frequency sites. The average NSE performance of the DL model in daily nitrate estimation is 20% higher than that of the WRTDS-Kalman model at 18 out of 24 sites (75%). The WRTDS-Kalman model showed unrealistic fluctuations in the estimated daily nitrate time series when the model received limited observed nitrate data (less than 50) for simulation. The DL model indicated superior performance in winter months' nitrate prediction (60% of cases) compared to WRTDS-Kalman models (33% of cases). The DL model also better represented the exceedance days from the USEPA maximum contamination level (MCL). Both the DL and WRTDS-Kalman models demonstrated similar performance in annual stream nitrate load estimation, and estimated values are close to actual nitrate loads.
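The two skill scores quoted in this abstract, NSE and KGE, can be computed as in the sketch below. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not say which KGE variant was used, so the original 2009 formulation (correlation, variability ratio, bias ratio) is assumed, and the sample values are invented.

```python
import math
from typing import Sequence


def nse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def kge(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Kling-Gupta Efficiency (assumed 2009 form); 1 is a perfect fit.

    Undefined when either series is constant or the observed mean is zero.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_sim = sum(simulated) / n
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in simulated) / n)
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(observed, simulated)) / n
    r = cov / (sd_obs * sd_sim)   # linear correlation
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mean_sim / mean_obs    # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Invented daily nitrate values (mg/L), for illustration only.
obs = [2.0, 4.0, 6.0, 8.0]
sim = [2.5, 3.5, 6.5, 7.5]
print(round(nse(obs, sim), 3), round(kge(obs, sim), 3))
```

A site would count toward the "NSE >0.70" or "KGE >0.70" groups reported above when the corresponding function returns a value above 0.70 for that site's observed and estimated series.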
Affiliation(s)
- Gourab Saha
- Department of Agricultural and Biological Engineering, The Pennsylvania State University, United States
- Chaopeng Shen
- Department of Civil and Environmental Engineering, The Pennsylvania State University, United States
- Jonathan Duncan
- Department of Ecosystem Science and Management, The Pennsylvania State University, United States
- Raj Cibin
- Department of Agricultural and Biological Engineering, The Pennsylvania State University, United States; Department of Civil and Environmental Engineering, The Pennsylvania State University, United States.
| |
Collapse
|
15
|
Nallakaruppan MK, Gangadevi E, Shri ML, Balusamy B, Bhattacharya S, Selvarajan S. Reliable water quality prediction and parametric analysis using explainable AI models. Sci Rep 2024; 14:7520. [PMID: 38553492 PMCID: PMC10980827 DOI: 10.1038/s41598-024-56775-y] [Received: 10/21/2023] [Accepted: 03/11/2024] [Indexed: 04/02/2024]
Abstract
Water consumption underpins the physical health of most living species, so managing its purity and quality is essential, as contaminated water has the potential to create adverse health and environmental consequences. This creates a pressing need to measure, control, and monitor the quality of water. The primary contaminant present in water is Total Dissolved Solids (TDS), which is hard to filter out. There are various substances apart from mere solids, such as potassium, sodium, chlorides, lead, nitrate, cadmium, arsenic, and other pollutants. The proposed work aims to automate water quality estimation through Artificial Intelligence and uses Explainable Artificial Intelligence (XAI) to explain the most significant parameters contributing to the potability of water and the estimation of impurities. XAI offers the transparency and justifiability of a white-box model, whereas a Machine Learning (ML) model on its own is a black box that cannot describe the reasoning behind its classification. The proposed work uses various ML models, such as Logistic Regression, Support Vector Machine (SVM), Gaussian Naive Bayes, Decision Tree (DT), and Random Forest (RF), to classify whether the water is drinkable. The various representations of XAI, such as force plot, test patch, summary plot, dependency plot, and decision plot generated in SHAPELY explainer, explain the significant features, prediction score, feature importance, and justification behind the water quality estimation. The RF classifier is selected for the explanation and yields optimum Accuracy and F1-Score of 0.9999, with Precision and Recall of 0.9997 and 0.998, respectively. Thus, the work is an exploratory analysis of the estimation and management of water quality, with indicators associated with their significance. This is an emerging area of research, with a vision of addressing water quality for the future as well.
Affiliation(s)
- M K Nallakaruppan
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, 632014, India
- E Gangadevi
- Department of Computer Science, Loyola College, Chennai, Tamil Nadu, 600034, India
- M Lawanya Shri
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, 632014, India
- Sweta Bhattacharya
- School of Computer Science Engineering and Information Systems, Vellore Institute of Technology, Vellore, 632014, India
- Shitharth Selvarajan
- School of Built Environment, Engineering and Computing, Leeds Beckett University, Leeds, LS13HE, UK.
- Department of Computer Science, Kebri Dehar University, Kebri Dehar, Ethiopia.
|
16
|
Hu Y, Liu C, Wollheim WM. Prediction of riverine daily minimum dissolved oxygen concentrations using hybrid deep learning and routine hydrometeorological data. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 918:170383. [PMID: 38280612 DOI: 10.1016/j.scitotenv.2024.170383] [Received: 09/26/2023] [Revised: 01/12/2024] [Accepted: 01/21/2024] [Indexed: 01/29/2024]
Abstract
Dissolved oxygen (DO) depletion is a severe threat to aquatic ecosystems. Hence, using easily available routine hydrometeorological variables, without DO as an input, to predict the daily minimum DO concentration in rivers has great practical significance for watershed management. The daily minimum DO concentrations at the outlet of the Oyster River watershed in New Hampshire, USA, were predicted by a set of deep learning neural networks using meteorological data and high-frequency water level, water temperature, and specific conductance (CTD) data measured within the watershed. The dependent variable, DO concentration, was measured at the outlet. The dataset, spanning April 2013 to March 2018, was separated into training, validation, and test portions with a ratio of 5:3:3. A Long Short-Term Memory (LSTM) model and a hybrid Convolutional Neural Network (CNN-LSTM) model were trained and evaluated for predicting the daily minimum DO concentration. The hybrid CNN-LSTM model exhibited better predictive stability but comparable accuracy (mean R² = 0.865) relative to the pure LSTM model (mean R² = 0.839). Model performance (both stability and accuracy) was improved by aggregating the input data from the raw 15 min frequency to 24 h. Model performance did not benefit from including 'forecasted' meteorological data at the predicted time step in the input dataset. This study provides an efficient and low-cost approach to predicting water quality in rivers and streams in support of scientific watershed management.
Affiliation(s)
- Yue Hu
- State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology), Chengdu 610059, China
- Chuankun Liu
- Sichuan Academy of Environmental Policy and Planning, Department of Ecology and Environment of Sichuan Province, Chengdu 610059, China.
- Wilfred M Wollheim
- Department of Natural Resources and Environment, University of New Hampshire, Durham, NH 03824, USA
|
17
|
Jeong H, Byeon E, Lee JS, Kim HS, Sayed AEDH, Bo J, Wang M, Wang DZ, Park HG, Lee JS. Single and combined effects of increased temperature and methylmercury on different stages of the marine rotifer Brachionus plicatilis. JOURNAL OF HAZARDOUS MATERIALS 2024; 466:133448. [PMID: 38244454 DOI: 10.1016/j.jhazmat.2024.133448] [Received: 11/24/2023] [Revised: 12/24/2023] [Accepted: 01/03/2024] [Indexed: 01/22/2024]
Abstract
Rapid, anthropogenic activity-induced global warming is a severe problem that not only raises water temperatures but also shifts aquatic environments by increasing the bioavailability of heavy metals (HMs), with potentially complicated effects on aquatic organisms, including small aquatic invertebrates. For this paper, we investigated the combined effects of temperature (23 and 28 °C) and methylmercury (MeHg) by measuring physiological changes, bioaccumulation, oxidative stress, antioxidants, and the mitogen-activated protein kinase signaling pathway in the marine rotifer Brachionus plicatilis. High temperature and MeHg adversely affected the survival rate, lifespan, and population of rotifers, and bioaccumulation, oxidative stress, and biochemical reactions depended on the developmental stage, with neonates showing higher susceptibility than adults. These findings demonstrate that increased temperature enhances potentially toxic effects from MeHg, and susceptibility differs with the developmental stage. This study provides a comprehensive understanding of the combined effects of elevated temperature and MeHg on rotifers. ENVIRONMENTAL IMPLICATION: Methylmercury (MeHg) is a widespread and harmful heavy metal that can induce lethal effects on aquatic organisms in even trace amounts. The toxicity of metals can vary depending on various environmental conditions. In particular, rising temperatures are considered a major factor affecting bioavailability and toxicity by changing the sensitivity of organisms. However, there are few studies on the combinational effects of high temperatures and MeHg on aquatic animals, especially invertebrates. Our research would contribute to understanding the actual responses of aquatic organisms to complex aquatic environments.
Affiliation(s)
- Haksoo Jeong
- Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Eunjin Byeon
- Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea
- Jin-Sol Lee
- School of Pharmacy, Sungkyunkwan University, Suwon 16419, South Korea
- Hyung Sik Kim
- School of Pharmacy, Sungkyunkwan University, Suwon 16419, South Korea
- Alaa El-Din H Sayed
- Department of Zoology, Faculty of Science, Assiut University, Assiut 71516, Egypt
- Jun Bo
- Laboratory of Marine Biology and Ecology, Third Institute of Oceanography, Ministry of Natural Resources, Xiamen 361005, China
- Minghua Wang
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment & Ecology, Xiamen University, Xiamen 361102, China
- Da-Zhi Wang
- State Key Laboratory of Marine Environmental Science, College of the Environment & Ecology, Xiamen University, Xiamen 361102, China
- Heum Gi Park
- Department of Marine Ecology and Environment, College of Life Sciences, Gangneung-Wonju National University, Gangneung 25457, South Korea
- Jae-Seong Lee
- Department of Biological Sciences, College of Science, Sungkyunkwan University, Suwon 16419, South Korea.
|
18
|
Zhi W, Appling AP, Golden HE, Podgorski J, Li L. Deep learning for water quality. NATURE WATER 2024; 2:228-241. [PMID: 38846520 PMCID: PMC11151732 DOI: 10.1038/s44221-024-00202-z] [Received: 04/13/2023] [Accepted: 01/10/2024] [Indexed: 06/09/2024]
Abstract
Understanding and predicting the quality of inland waters are challenging, particularly in the context of intensifying climate extremes expected in the future. These challenges arise partly due to complex processes that regulate water quality, and arduous and expensive data collection that exacerbate the issue of data scarcity. Traditional process-based and statistical models often fall short in predicting water quality. In this Review, we posit that deep learning represents an underutilized yet promising approach that can unravel intricate structures and relationships in high-dimensional data. We demonstrate that deep learning methods can help address data scarcity by filling temporal and spatial gaps and aid in formulating and testing hypotheses via identifying influential drivers of water quality. This Review highlights the strengths and limitations of deep learning methods relative to traditional approaches, and underscores its potential as an emerging and indispensable approach in overcoming challenges and discovering new knowledge in water-quality sciences.
Affiliation(s)
- Wei Zhi
- The National Key Laboratory of Water Disaster Prevention, Yangtze Institute for Conservation and Development, Key Laboratory of Hydrologic-Cycle and Hydrodynamic-System of Ministry of Water Resources, Hohai University, Nanjing, China
- Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA, USA
- Heather E Golden
- Office of Research and Development, US Environmental Protection Agency, Cincinnati, OH, USA
- Joel Podgorski
- Department of Water Resources and Drinking Water, Swiss Federal Institute of Aquatic Science and Technology (EAWAG), Dübendorf, Switzerland
- Li Li
- Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA, USA
|
19
|
Liu X, Yue FJ, Guo TL, Li SL. High-frequency data significantly enhances the prediction ability of point and interval estimation. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 912:169289. [PMID: 38135069 DOI: 10.1016/j.scitotenv.2023.169289] [Received: 08/09/2023] [Revised: 10/08/2023] [Accepted: 12/09/2023] [Indexed: 12/24/2023]
Abstract
Accurate prediction of dissolved oxygen (DO) dynamics is crucial for understanding the influence of environmental factors on the stability of aquatic ecosystems. However, limited research has been conducted to determine the optimal frequency of water quality monitoring that ensures continuous assessment of water health while minimizing costs. To address these challenges, the present study developed a hybrid stochastic hydrological model (i.e., ARIMA-GARCH hybrid model) and machine learning (ML) models. The objective of this study is to identify the best-performing model and establish the optimal monitoring frequency. Results revealed that high-frequency DO monitoring data exhibit greater variability than low-frequency data. Moreover, the ARIMA-GARCH model demonstrates promising potential in predicting DO concentrations for low-frequency monitoring data, surpassing ML models in performance. Furthermore, increasing the monitoring frequency significantly improves the prediction accuracy of models, for both point predictions (R² values of 0.64 and 0.51 for daily detection versus 0.96 and 0.99 for every 15 min at CHQ and LHT, respectively) and interval predictions (RIW values of 2.00 and 1.55 for daily detection versus 0.02 and 0.16 for every 15 min at CHQ and LHT, respectively). Additionally, a 4-hourly monitoring frequency was found to be optimal for water quality assessment using each model. These findings identify the superior performance of the ARIMA-GARCH model and highlight the crucial role of monitoring frequency in enhancing DO prediction and improving model performance.
Affiliation(s)
- Xin Liu
- Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University, Tianjin 300072, China
- Fu-Jun Yue
- Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University, Tianjin 300072, China.
- Tian-Li Guo
- College of Water Resources and Architectural Engineering, Northwest A&F University, Yangling 712100, China
- Si-Liang Li
- Institute of Surface-Earth System Science, School of Earth System Science, Tianjin University, Tianjin 300072, China
|
20
|
Dong Y, Sun Y, Liu Z, Du Z, Wang J. Predicting dissolved oxygen level using Young's double-slit experiment optimizer-based weighting model. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2024; 351:119807. [PMID: 38100864 DOI: 10.1016/j.jenvman.2023.119807] [Received: 07/07/2023] [Revised: 11/30/2023] [Accepted: 12/06/2023] [Indexed: 12/17/2023]
Abstract
Accurate prediction of the dissolved oxygen level (DOL) is important for enhancing environmental conditions and facilitating water resource management. However, the irregularity and volatility inherent in DOL pose significant challenges to achieving precise forecasts. A single model usually suffers from low prediction accuracy, narrow application range, and difficult data acquisition. This study proposes a new weighted model that avoids these problems and could increase the prediction accuracy of the DOL. The weighting constructs of the proposed model (PWM) included eight neural networks and one statistical method and utilized Young's double-slit experimental optimizer as an intelligent weighting tool. To evaluate the effectiveness of PWM, simulations were conducted using real-world data acquired from the Tualatin River Basin in Oregon, United States. Empirical findings demonstrated that PWM outperforms both the statistical model and the individual machine learning models, and has the lowest mean absolute percentage error among all the weighted models. Across two real datasets, the PWM achieves average mean absolute percentage errors of 1.0216%, 1.4630%, and 1.7087% for one-, two-, and three-step predictions, respectively. This study shows that the PWM can effectively integrate the distinctive merits of deep learning methods, neural networks, and statistical models, thereby increasing forecasting accuracy and providing technical support for the sustainable development of regional water environments.
Affiliation(s)
- Ying Dong
- School of Statistics, Dongbei University of Finance and Economics, No. 217, Jianshan Road, Shahekou District, Dalian, Liaoning Province, 116025, China.
| | - Yuhuan Sun
- School of Statistics, Dongbei University of Finance and Economics, No. 217, Jianshan Road, Shahekou District, Dalian, Liaoning Province, 116025, China.
| | - Zhenkun Liu
- School of Management, Nanjing University of Posts and Telecommunications, No 66 Xinmofan Road, Gulou District, Nanjing, Jiangsu Province, 210023, China.
| | - Zhiyuan Du
- Department of Statistics, Virginia Polytechnic Institute and State University, 250 Drillfield Drive, Blacksburg, VA, 24060, United States.
| | - Jianzhou Wang
- Institute of Systems Engineering, Macau University of Science and Technology, Taipa Street, Macao, 999078, China.
|
21
|
Fu X, Jiang J, Wu X, Huang L, Han R, Li K, Liu C, Roy K, Chen J, Mahmoud NTA, Wang Z. Deep learning in water protection of resources, environment, and ecology: achievement and challenges. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2024; 31:14503-14536. [PMID: 38305966 DOI: 10.1007/s11356-024-31963-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/24/2023] [Accepted: 01/06/2024] [Indexed: 02/03/2024]
Abstract
Breathtaking economic development has taken a heavy toll on ecology, especially in the form of water pollution. Efficient water resource management has a long-term influence on the sustainable development of the economy and society. Economic development and ecological preservation are intertwined, and the growth of one is not possible without the other. Deep learning (DL) is ubiquitous in autonomous driving, medical imaging, speech recognition, etc. The spectacular success of deep learning comes from its power of richer representation of data. In view of the bright prospects of DL, this review comprehensively focuses on the development of DL applications in water resources management, water environment protection, and water ecology. First, the concept and modeling steps of DL are briefly introduced, including data preparation, algorithm selection, and model evaluation. Finally, the advantages and disadvantages of commonly used algorithms are analyzed according to their structures and mechanisms, and recommendations on the selection of DL algorithms for different studies, as well as prospects for the application and development of DL in water science, are proposed. This review provides references for solving a wider range of water-related problems and brings further insights into the intelligent development of water science.
Affiliation(s)
- Xiaohua Fu
- Ecological Environment Management and Assessment Center, Central South University of Forestry and Technology, Changsha, 410004, People's Republic of China
| | - Jie Jiang
- Ecological Environment Management and Assessment Center, Central South University of Forestry and Technology, Changsha, 410004, People's Republic of China
- State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
| | - Xie Wu
- China Railway Water Information Technology Co, LTD, Nanchang, 330000, People's Republic of China
| | - Lei Huang
- School of Environmental Science and Engineering, Guangzhou University, Guangzhou, 510006, People's Republic of China
| | - Rui Han
- China Environment Publishing Group, Beijing, 100062, People's Republic of China
| | - Kun Li
- Freeman Business School, Tulane University, New Orleans, LA, 70118, USA
- Guangzhou Huacai Environmental Protection Technology Co., Ltd, Guangzhou, 511480, People's Republic of China
| | - Chang Liu
- State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
| | - Kallol Roy
- Institute of Computer Science, University of Tartu, 51009, Tartu, Estonia
| | - Jianyu Chen
- State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China
| | | | - Zhenxing Wang
- State Environmental Protection Key Laboratory of Water Environmental Simulation and Pollution Control, Ministry of Ecology and Environment, South China Institute of Environmental Sciences, Guangzhou, 510655, People's Republic of China.
|
22
|
Schmidt JQ, Kerkez B. Machine Learning-Assisted, Process-Based Quality Control for Detecting Compromised Environmental Sensors. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:18058-18066. [PMID: 37582237 DOI: 10.1021/acs.est.3c00360] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/17/2023]
Abstract
Machine learning (ML) techniques promise to revolutionize environmental research and management, but collecting the necessary volumes of high-quality data remains challenging. Environmental sensors are often deployed under harsh conditions, requiring labor-intensive quality assurance and control (QAQC) processes. The need for manual QAQC is a major impediment to the scalability of these sensor networks. However, existing techniques for automated QAQC make strong assumptions about noise profiles in the data they filter that do not necessarily hold for broadly deployed environmental sensors. Toward the goal of increasing the volume of high-quality environmental data, we introduce an ML-assisted QAQC methodology that is robust to low signal-to-noise ratio data. Our approach embeds sensor measurements into a dynamical feature space and trains a binary classification algorithm (Support Vector Machine) to detect deviation from expected process dynamics, indicating whether a sensor has become compromised and requires maintenance. This strategy enables the automated detection of a wide variety of nonphysical signals. We apply the methodology to three novel data sets produced by 136 low-cost environmental sensors (stream level, drinking water pH, and drinking water electroconductivity), deployed by our group across 250,000 km² in Michigan, USA. The proposed methodology achieved accuracy scores of up to 0.97 and consistently outperformed state-of-the-art anomaly detection techniques.
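The abstract does not say how the dynamical feature space is constructed. One common way to turn a sensor time series into dynamical feature vectors is time-delay embedding, sketched below purely as an illustration; it is an assumption, not necessarily the authors' method. The function name `delay_embed` and its parameters `dim` and `lag` are hypothetical. Each resulting vector could then be passed to a binary classifier such as a Support Vector Machine.

```python
def delay_embed(series, dim, lag=1):
    """Time-delay embedding of a 1-D series.

    Returns vectors [x[i], x[i + lag], ..., x[i + (dim - 1) * lag]]
    for every index i at which the full vector fits in the series.
    """
    span = (dim - 1) * lag
    return [
        [series[i + k * lag] for k in range(dim)]
        for i in range(len(series) - span)
    ]


# Illustrative stream-level readings, invented for this sketch.
readings = [1, 2, 3, 4, 5]
print(delay_embed(readings, dim=3))         # three overlapping windows
print(delay_embed(readings, dim=2, lag=2))  # pairs two steps apart
```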
Affiliation(s)
- Jacquelyn Q Schmidt
- Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, Michigan 48109-2125, United States
| | - Branko Kerkez
- Department of Civil and Environmental Engineering, University of Michigan, Ann Arbor, Michigan 48109-2125, United States
|
23
|
Rahat SH, Steissberg T, Chang W, Chen X, Mandavya G, Tracy J, Wasti A, Atreya G, Saki S, Bhuiyan MAE, Ray P. Remote sensing-enabled machine learning for river water quality modeling under multidimensional uncertainty. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 898:165504. [PMID: 37459982 DOI: 10.1016/j.scitotenv.2023.165504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/12/2023] [Revised: 07/03/2023] [Accepted: 07/11/2023] [Indexed: 07/24/2023]
Abstract
Two fundamental problems have inhibited progress in the simulation of river water quality under climate (and other) uncertainty: 1) insufficient data, and 2) the inability of existing models to account for the complexity of factors (e.g., hydro-climatic, basin characteristics, land use features) affecting river water quality. To address these concerns this study presents a technique for augmenting limited ground-based observations of water quality variables with remote-sensed surface reflectance data by leveraging a machine learning model capable of accommodating the multidimensionality of water quality influences. Total Suspended Solids (TSS) can serve as a surrogate for chemical and biological pollutants of concern in surface water bodies. Historically, TSS data collection in the United States has been limited to the location of water treatment plants where state or federal agencies conduct regularly-scheduled water sampling. Mathematical models relating riverine TSS concentration to the explanatory factors have therefore been limited and the relationships between climate extremes and water contamination events have not been effectively diagnosed. This paper presents a method to identify these issues by utilizing a Long Short-Term Memory Network (LSTM) model trained on Moderate Resolution Imaging Spectroradiometer (MODIS) satellite reflectance data, which is calibrated to TSS data collected by the Ohio River Valley Water Sanitation Commission (ORSANCO). The methodology developed enables a thorough empirical analysis and data-driven algorithms able to account for spatial variability within the watershed and provide effective water quality prediction under uncertainty.
Affiliation(s)
- Saiful Haque Rahat
- Geosyntec Consultants, 920 SW 6th Ave, Suite 600, Portland, OR 97204, United States of America.
| | - Todd Steissberg
- U. S. Army Engineer Research and Development Center (ERDC), 707 Fourth St., Davis, CA 95616, United States of America
| | - Won Chang
- Department of Statistics, University of Cincinnati, 5516 French Hall, 2815, Commons Way, University of Cincinnati, Cincinnati, OH 45221, United States of America
| | - Xi Chen
- Department of Geography, University of Cincinnati, Braunstein Hall, A&S Geography, 0131, Cincinnati, OH 45221, United States of America
| | - Garima Mandavya
- Department of Chemical and Environmental Engineering, University of Cincinnati, 601, Engineering Research Center, Cincinnati, OH 45221-0012, United States of America
| | - Jacob Tracy
- Department of Chemical and Environmental Engineering, University of Cincinnati, 601, Engineering Research Center, Cincinnati, OH 45221-0012, United States of America
| | - Asphota Wasti
- Department of Chemical and Environmental Engineering, University of Cincinnati, 601, Engineering Research Center, Cincinnati, OH 45221-0012, United States of America
| | - Gaurav Atreya
- Department of Chemical and Environmental Engineering, University of Cincinnati, 601, Engineering Research Center, Cincinnati, OH 45221-0012, United States of America
| | - Shah Saki
- Department of Civil and Environmental Engineering, University of Connecticut, 261 Glenbrook Road Unit, 3037, Storrs, CT 06269-3037, United States of America
| | - Md Abul Ehsan Bhuiyan
- Climate Prediction Center, National Oceanic & Atmospheric Administration (NOAA), College Park, MD 20742, United States of America
| | - Patrick Ray
- Department of Chemical and Environmental Engineering, University of Cincinnati, 601, Engineering Research Center, Cincinnati, OH 45221-0012, United States of America
|
24
|
de Almeida RGB, Lamparelli MC, Dodds WK, Cunha DGF. Sampling frequency optimization of the water quality monitoring network in São Paulo State (Brazil) towards adaptive monitoring in a developing country. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:111113-111136. [PMID: 37798518 DOI: 10.1007/s11356-023-29998-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/03/2023] [Accepted: 09/17/2023] [Indexed: 10/07/2023]
Abstract
Water quality monitoring networks (WQMNs) that capture both the temporal and spatial dimensions are essential to provide reliable data for assessing water quality trends in surface waters, as well as for supporting initiatives to control anthropogenic activities. Meeting these monitoring goals as efficiently as possible is crucial, especially in developing countries where the financial resources are limited and the water quality degradation is accelerating. Here, we asked if sampling frequency could be reduced while maintaining the same degree of information as with bimonthly sampling in the São Paulo State (Brazil) WQMN. For this purpose, we considered data from 2004 to 2018 for 56 monitoring sites distributed into four out of 22 of the state's water resources management units (UGRHIs, "Unidades de Gerenciamento de Recursos Hídricos"). We ran statistical tests for identifying data redundancy among two-month periods in the dry and wet seasons, followed by objective criteria to develop a sampling frequency recommendation. Our results showed that the reduction would be feasible in three UGRHIs, with the number of annual samplings ranging from two to four (instead of the original six). In both seasons, dissolved oxygen and Escherichia coli required more frequent sampling than the other analyzed parameters to adequately capture variability. The recommendation was compatible with flexible monitoring strategies observed in well-structured WQMNs worldwide, since the suggested sampling frequencies were not the same for all UGRHIs. Our approach can contribute to establishing a methodology to reevaluate WQMNs, potentially resulting in less costly and more adaptive strategies in São Paulo State and other developing areas with similar challenges.
Affiliation(s)
| | - Marta Condé Lamparelli
- Companhia Ambiental do Estado de São Paulo (CETESB), Avenida Professor Frederico Hermann Júnior, 345 Alto de Pinheiros, São Paulo, SP, CEP 05459-900, Brazil
| | - Walter Kennedy Dodds
- Division of Biology, Kansas State University, 116 Ackert Hall, Manhattan, KS, 66506, USA
| | - Davi Gasparini Fernandes Cunha
- Departamento de Hidráulica e Saneamento, Escola de Engenharia de São Carlos, Universidade de São Paulo, Avenida Trabalhador São-Carlense, 400 Centro, Sao Carlos, SP, CEP 13566-590, Brazil
|
25
|
Cojbasic S, Dmitrasinovic S, Kostic M, Turk Sekulic M, Radonic J, Dodig A, Stojkovic M. Application of machine learning in river water quality management: a review. WATER SCIENCE AND TECHNOLOGY : A JOURNAL OF THE INTERNATIONAL ASSOCIATION ON WATER POLLUTION RESEARCH 2023; 88:2297-2308. [PMID: 37966184 PMCID: wst_2023_331 DOI: 10.2166/wst.2023.331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2023]
Abstract
Machine learning (ML), a branch of artificial intelligence (AI), has been increasingly used in environmental engineering due to its ability to analyze complex nonlinear problems (such as ones connected with water quality management) through a data-driven approach. This study provides an overview of different ML algorithms applied for monitoring and predicting river water quality. Different parameters could be monitored or predicted, such as dissolved oxygen (DO), biological and chemical oxygen demand (BOD and COD), turbidity levels, the concentration of different ions (such as Mg2+ and Ca2+), heavy metal or other pollutant concentrations, pH, temperature, and many more. Although many algorithms have been investigated for the prediction of river water quality, there are several which are most commonly used in engineering practice. These models mostly include so-called supervised learning algorithms, such as artificial neural network (ANN), support vector machine (SVM), random forest (RF), decision tree (DT), and deep learning (DL). To further enhance prediction power, novel hybrid algorithms could be used. However, the quality of prediction is not only dependent on the applied algorithm but also on the availability of previously mentioned water quality parameters, their selection, and the combination of input data used to train the ML model.
Affiliation(s)
- Sanja Cojbasic
- Faculty of Technical Sciences, Department of Environmental Engineering and Occupational Safety and Health, University of Novi Sad, Trg Dositeja Obradovica 6, 21000 Novi Sad, Serbia
| | - Sonja Dmitrasinovic
- Faculty of Technical Sciences, Department of Environmental Engineering and Occupational Safety and Health, University of Novi Sad, Trg Dositeja Obradovica 6, 21000 Novi Sad, Serbia
| | - Marija Kostic
- Faculty of Technical Sciences, Department of Environmental Engineering and Occupational Safety and Health, University of Novi Sad, Trg Dositeja Obradovica 6, 21000 Novi Sad, Serbia
| | - Maja Turk Sekulic
- Faculty of Technical Sciences, Department of Environmental Engineering and Occupational Safety and Health, University of Novi Sad, Trg Dositeja Obradovica 6, 21000 Novi Sad, Serbia
| | - Jelena Radonic
- Faculty of Technical Sciences, Department of Environmental Engineering and Occupational Safety and Health, University of Novi Sad, Trg Dositeja Obradovica 6, 21000 Novi Sad, Serbia
| | - Ana Dodig
- Institute for Artificial Intelligence R&D of Serbia, Fruskogorska 1, Novi Sad, Serbia
| | - Milan Stojkovic
- Institute for Artificial Intelligence R&D of Serbia, Fruskogorska 1, Novi Sad, Serbia
|
26
|
Saha GK, Rahmani F, Shen C, Li L, Cibin R. A deep learning-based novel approach to generate continuous daily stream nitrate concentration for nitrate data-sparse watersheds. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 878:162930. [PMID: 36934914 DOI: 10.1016/j.scitotenv.2023.162930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/25/2022] [Revised: 03/08/2023] [Accepted: 03/14/2023] [Indexed: 05/13/2023]
Abstract
High-frequency stream nitrate concentration provides critical insights into nutrient dynamics and can help to improve the effectiveness of management decisions to maintain a sustainable ecosystem. However, nitrate monitoring is conventionally conducted through lab analysis using in situ water samples and is typically at coarse temporal resolution. In the last decade, many agencies started collecting high-frequency (5-60 min intervals) nitrate data using optical sensors. The hypothesis of the study is that data-driven models can learn the trend and temporal variability in nitrate concentration from high-frequency sensor-based nitrate data in the region and generate continuous nitrate data for unavailable data periods and data-limited locations. A Long Short-Term Memory (LSTM) model-based framework was developed to estimate continuous daily stream nitrate for dozens of gauge locations in Iowa, USA. The promising results supported the hypothesis; the LSTM model demonstrated median test-period Nash-Sutcliffe efficiency (NSE) = 0.75 and RMSE = 1.53 mg/L for estimating continuous daily nitrate concentration in 42 sites, which are unprecedented performance levels. Twenty-one sites (50 % of all sites) and thirty-four sites (76 % of all sites) demonstrated NSE > 0.75 and 0.50, respectively. The average nitrate concentration of neighboring sites was identified as a crucial determinant of continuous daily nitrate concentration. Seasonal model performance evaluation showed that the model performed effectively in the summer and fall seasons. About 26 sites showed correlations >0.60 between estimated nitrate concentration and discharge. The concentration-discharge (c-Q) relationship analysis showed that the study watersheds had four dominant nitrate transport patterns from landscapes to streams with increasing discharge, with the flushing pattern being the most dominant one. Stream nitrate estimation is impeded by data inadequacy. The modeling framework can be used to generate temporally continuous nitrate data in nitrate data-limited regions with a nearby sensor-based nitrate gauge. Watershed planners and policymakers could utilize the continuous nitrate data to gain more information on the regional nitrate status and design conservation practices accordingly.
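For readers unfamiliar with the two skill scores quoted above, the following minimal sketch computes them from their standard definitions: NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))², and RMSE = sqrt(mean((obs - sim)²)). The function names `nse` and `rmse` and the sample values are hypothetical and are not taken from the study.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def rmse(observed, simulated):
    """Root mean square error, in the units of the observations."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative daily nitrate values (mg/L), invented for this sketch.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 3.0, 4.0, 5.0]
print(nse(obs, sim), rmse(obs, sim))
```

An NSE of 0.75, as reported for the median site, means the model's squared error is one quarter of the variance of the observations around their mean.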
Affiliation(s)
- Gourab Kumer Saha
- Department of Agricultural and Biological Engineering, The Pennsylvania State University, United States of America
| | - Farshid Rahmani
- Department of Civil and Environmental Engineering, The Pennsylvania State University, United States of America
| | - Chaopeng Shen
- Department of Civil and Environmental Engineering, The Pennsylvania State University, United States of America
| | - Li Li
- Department of Civil and Environmental Engineering, The Pennsylvania State University, United States of America
| | - Raj Cibin
- Department of Agricultural and Biological Engineering, The Pennsylvania State University, United States of America; Department of Civil and Environmental Engineering, The Pennsylvania State University, United States of America.
|
27
|
Wang Q, Li C, Hao D, Xu Y, Shi X, Liu T, Sun W, Zheng Z, Liu J, Li W, Liu W, Zheng J, Li F. A novel four-dimensional prediction model of soil heavy metal pollution: Geographical explanations beyond artificial intelligence "black box". JOURNAL OF HAZARDOUS MATERIALS 2023; 458:131900. [PMID: 37385097 DOI: 10.1016/j.jhazmat.2023.131900] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/14/2023] [Revised: 06/05/2023] [Accepted: 06/18/2023] [Indexed: 07/01/2023]
Abstract
The current artificial intelligence (AI)-based prediction approaches for soil pollutants are inadequate in estimating the geospatial source-sink processes and striking a balance between interpretability and accuracy, resulting in poor spatial extrapolation and generalization. In this study, we developed and tested a geographically interpretable four-dimensional AI prediction model for soil heavy metal (Cd) contents (4DGISHM) in Shaoguan city of China from 2016 to 2030. The 4DGISHM approach characterized spatio-temporal changes in source-sink processes of soil Cd by estimating spatio-temporal patterns and the effects of drivers and their interactions on soil Cd at local to regional scales using TreeExplainer-based SHAP and parallel ensemble AI algorithms. The results demonstrate that the prediction model achieved MSE and R² values of 0.012 and 0.938, respectively, at a spatial resolution of 1 km. The predicted areas exceeding the risk control values for soil Cd across Shaoguan from 2022 to 2030 increased by 22.92% under the baseline scenario. By 2030, enterprise and transportation emissions (SHAP values 0.23 and 0.12 mg/kg, respectively) were the major drivers. The influence of driver interactions on soil Cd was marginal. Our approach surpasses the limitations of the AI "black box" by integrating spatio-temporal source-sink explanation and accuracy. This advancement enables geographically precise prediction and control of soil pollutants.
Affiliation(s)
- Qi Wang
- National-Regional Joint Engineering Research Center for Soil Pollution Control and Remediation in South China, Guangdong Key Laboratory of Integrated Agro-environmental Pollution Control and Management, Institute of Eco-environmental and Soil Sciences, Guangdong Academy of Science, Guangzhou 510650, China
| | - Cangbai Li
- Key Laboratory for City Cluster Environmental Safety and Green Development of the Ministry of Education, School of Ecology, Environment and Resources, Guangdong University of Technology, Guangzhou 510006, China
| | - Dongmei Hao
- School of Management, Lanzhou University, Lanzhou 730099, China
| | - Yafei Xu
- School of Management, Lanzhou University, Lanzhou 730099, China
| | - Xuewen Shi
- School of Management, Lanzhou University, Lanzhou 730099, China
| | - Tongxu Liu
- National-Regional Joint Engineering Research Center for Soil Pollution Control and Remediation in South China, Guangdong Key Laboratory of Integrated Agro-environmental Pollution Control and Management, Institute of Eco-environmental and Soil Sciences, Guangdong Academy of Science, Guangzhou 510650, China
| | - Weimin Sun
- National-Regional Joint Engineering Research Center for Soil Pollution Control and Remediation in South China, Guangdong Key Laboratory of Integrated Agro-environmental Pollution Control and Management, Institute of Eco-environmental and Soil Sciences, Guangdong Academy of Science, Guangzhou 510650, China
| | - Zelong Zheng
- Guangzhou First People's Hospital, School of Medicine, South China University of Technology, Guangzhou 510180, China
| | - Jianfeng Liu
- Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
| | - Wanqi Li
- School of Management, Lanzhou University, Lanzhou 730099, China
| | - Wengang Liu
- School of Management, Lanzhou University, Lanzhou 730099, China
| | - Jiaxue Zheng
- School of Data Science and Artificial Intelligence, Dongbei University of Finance & Economics, Dalian 116025, China
| | - Fangbai Li
- National-Regional Joint Engineering Research Center for Soil Pollution Control and Remediation in South China, Guangdong Key Laboratory of Integrated Agro-environmental Pollution Control and Management, Institute of Eco-environmental and Soil Sciences, Guangdong Academy of Science, Guangzhou 510650, China.
|
28
|
Sheikholeslami R, Hall JW. Global patterns and key drivers of stream nitrogen concentration: A machine learning approach. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 868:161623. [PMID: 36657680 PMCID: PMC10933795 DOI: 10.1016/j.scitotenv.2023.161623] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 12/22/2022] [Accepted: 01/11/2023] [Indexed: 06/17/2023]
Abstract
Anthropogenic loading of nitrogen to river systems can pose serious health hazards and create critical environmental threats. Quantification of the magnitude and impact of freshwater nitrogen requires identifying key controls of nitrogen dynamics and analyzing both the past and present patterns of nitrogen flows. To tackle this challenge, we adopted a machine learning (ML) approach and built an ML-driven representation that captures spatiotemporal variability in nitrogen concentrations at the global scale. Our model uses random forests to regress a large sample of monthly measured stream nitrogen concentrations onto a set of 17 predictors with a spatial resolution of 0.5 degrees over 1990-2013, including observations within the pixel and upstream drivers. The model was validated with data from rivers outside the training dataset and was used to predict nitrogen concentrations in 520 major river basins of the world, including many with scarce or no observations. We predicted that the regions with highest median nitrogen concentrations in their rivers (in 2013) were: United States (Mississippi), Pakistan, Bangladesh, India (Indus, Ganges), China (Yellow, Yangtze, Yongding, Huai), and most of Europe (Rhine, Danube, Vistula, Thames, Trent, Severn). Other major hotspots were the river basins of the Sebou (Morocco), Nakdong (South Korea), Kitakami (Japan), and Egypt's Nile Delta. Our analysis showed that the rate of increase in nitrogen concentration between the 1990s and 2000s was greatest in rivers located in eastern China, eastern and central parts of Canada, the Baltic states, Pakistan, mainland southeast Asia, and south-eastern Australia. Using a new grouped variable importance measure, we also found that temporality (month of the year and cumulative month count) is the most influential predictor, followed by factors representing hydroclimatic conditions, diffuse nutrient emissions from agriculture, and topographic features. Our model can be further applied to assess strategies designed to reduce nitrogen pollution in freshwater bodies at large spatial scales.
Affiliation(s)
- Razi Sheikholeslami
- School of Geography and the Environment, University of Oxford, Oxford, UK; Environmental Change Institute, University of Oxford, Oxford, UK; Department of Civil Engineering, Sharif University of Technology, Tehran, Iran.
| | - Jim W Hall
- School of Geography and the Environment, University of Oxford, Oxford, UK; Environmental Change Institute, University of Oxford, Oxford, UK
|
29
|
Guo Z, Liu F, Duan Q, Wang W, Wan Q, Huang Y, Zhao Y, Liu L, Feng Y, Xian L, Gao H, Long Y, Yao D, Lee J. A spectral learning path for simultaneous multi-parameter detection of water quality. ENVIRONMENTAL RESEARCH 2023; 216:114812. [PMID: 36395862 DOI: 10.1016/j.envres.2022.114812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/28/2022] [Revised: 11/08/2022] [Accepted: 11/12/2022] [Indexed: 06/16/2023]
Abstract
Water quality parameters (WQP) are the most intuitive indicators of the environmental quality of a water body. Due to the complexity and variability of the chemical environment of water bodies, simple and rapid detection of multiple water quality parameters is a difficult task. In this paper, spectral images (named SPIs) and deep learning (DL) techniques were combined to construct an intelligent method for WQP detection. A novel spectroscopic instrument was used to obtain SPIs, which were converted into feature images of water chemistry and then combined with deep convolutional neural networks (CNNs) to train models and predict WQP. The results showed that the method of combining SPIs and DL has high accuracy and stability, and good prediction results with average relative error of each parameter (anions and cations, TOC, TP, TN, NO₃⁻-N, NH₃-N) at 1.3%, coefficient of determination (R²) of 0.996, root mean square error (RMSE) of 0.1, residual prediction deviation (RPD) of 16.2, and mean absolute error (MAE) of 0.067. The method can achieve rapid and accurate detection of high-dimensional water quality multi-parameters, and has the advantages of simple pre-processing and low cost. It can be applied not only to the intelligent detection of environmental waters, but also has the potential to be applied in chemical, biological and medical fields.
Affiliation(s)
- Zhiqiang Guo
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Fenli Liu
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Qiannan Duan
- Shaanxi Key Laboratory of Earth Surface System and Environmental Carrying Capacity, College of Urban and Environmental Sciences, Northwest University, Xi'an, 710127, China.
| | - Wenjing Wang
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Qianru Wan
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Yicai Huang
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Yuting Zhao
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Lu Liu
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Yunjin Feng
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China
| | - Libo Xian
- Xi'an 9th Sewage Treatment Plant, Chang'an Chengrun Operation Management Co., Ltd., Chang'an Urban Rural Development Co., Ltd., Xi'an, 710199, China
| | - Hang Gao
- Xi'an 9th Sewage Treatment Plant, Chang'an Chengrun Operation Management Co., Ltd., Chang'an Urban Rural Development Co., Ltd., Xi'an, 710199, China
| | - Yiwen Long
- Xi'an 9th Sewage Treatment Plant, Chang'an Chengrun Operation Management Co., Ltd., Chang'an Urban Rural Development Co., Ltd., Xi'an, 710199, China
| | - Dan Yao
- Xi'an 9th Sewage Treatment Plant, Chang'an Chengrun Operation Management Co., Ltd., Chang'an Urban Rural Development Co., Ltd., Xi'an, 710199, China
| | - Jianchao Lee
- Laboratory of Environmental Aquatic Chemistry, Department of Environmental Science, Shaanxi Normal University, Xi'an, 710062, China.
| |
|
30
|
Lian X, Zhao W, Gentine P. Recent global decline in rainfall interception loss due to altered rainfall regimes. Nat Commun 2022; 13:7642. [PMID: 36496496 PMCID: PMC9741630 DOI: 10.1038/s41467-022-35414-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/06/2022] [Accepted: 12/01/2022] [Indexed: 12/13/2022] Open
Abstract
Evaporative loss of interception (Ei) is the first process occurring during rainfall, yet its role in large-scale surface water balance has been largely underexplored. Here we show that Ei can be inferred from flux tower evapotranspiration measurements using physics-informed hybrid machine learning models built under wet versus dry conditions. Forced by satellite and reanalysis data, this framework provides an observationally constrained estimate of Ei, which is on average 84.1 ± 1.8 mm per year and accounts for 8.6 ± 0.2% of total rainfall globally during 2000-2020. Rainfall frequency regulates long-term average Ei changes, and rainfall intensity, rather than vegetation attributes, determines the fraction of Ei in gross precipitation (Ei/P). Rain events have become less frequent and more intense since 2000, driving a global decline in Ei (and Ei/P) by 4.9% (6.7%). This suggests that ongoing rainfall changes favor a partitioning towards more soil moisture and runoff, benefiting ecosystem functions but simultaneously increasing flood risks.
Affiliation(s)
- Xu Lian
- Department of Earth and Environmental Engineering, Columbia University, New York, NY, USA.
| | - Wenli Zhao
- Department of Earth and Environmental Engineering, Columbia University, New York, NY, USA
| | - Pierre Gentine
- Department of Earth and Environmental Engineering, Columbia University, New York, NY, USA
- Center for Learning the Earth with Artificial intelligence and Physics (LEAP), Columbia University, New York, NY, USA
| |
|
31
|
Sadayappan K, Kerins D, Shen C, Li L. Nitrate concentrations predominantly driven by human, climate, and soil properties in US rivers. WATER RESEARCH 2022; 226:119295. [PMID: 36323218 DOI: 10.1016/j.watres.2022.119295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 10/11/2022] [Accepted: 10/22/2022] [Indexed: 06/16/2023]
Abstract
Nitrate is one of the most widespread and persistent pollutants in our time. Our understanding of nitrate dynamics has advanced substantially in the past decades, although its predominant drivers across gradients of climate, land use, and geology have remained elusive. Here we collated nitrate data from 2061 rivers along with 32 watershed characteristic indexes and developed machine learning models to reconstruct long-term mean (multi-year average) nitrate concentrations in the contiguous United States (CONUS). The trained models show similarly satisfactory model performance and can predict nitrate concentrations in chemically-ungauged places with about 70% accuracy. Further analysis revealed that five (out of 32) indexes (drivers) can explain about 70% of spatial variations in mean nitrate concentrations. The five influential drivers are nitrogen application rates Nrate and urban area Aurban% (human drivers), mean annual precipitation and temperature (climate drivers), and sand percent Sand% (soil property driver). Nitrate concentrations in undeveloped sites are primarily modulated by climate and soil property; they decrease with increasing mean discharge and Sand%. Nitrate concentrations in agriculture and urban sites increase with Nrate and Aurban% until reaching their apparent maxima around 10,000 kg/km²/yr and around 25%, respectively. Results indicate that nitrate concentrations may remain similar or increase with growing human population. In addition, nitrate concentrations can increase even without human input, as warming escalates water demand and reduces mean discharge in many places. These results allude to a conceptual model that highlights the impacts of distinct drivers: while human drivers predominate nitrogen input to land and rivers, climate drivers and soil properties modulate its transport and transformation, the balance of which determines long-term mean concentrations. Such mechanism-based insights and forecasting capabilities are essential for water management as we expect changing climate and growing agriculture and urbanization.
Affiliation(s)
- Kayalvizhi Sadayappan
- Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Devon Kerins
- Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Chaopeng Shen
- Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Li Li
- Department of Civil and Environmental Engineering, The Pennsylvania State University, University Park, PA, USA.
| |
|
32
|
Abstract
Rapid advances in hardware and software, accompanied by public- and private-sector investment, have led to a new generation of data-driven computational tools. Recently, there has been a particular focus on deep learning, a class of machine learning algorithms that uses deep neural networks to identify patterns in large and heterogeneous datasets. These developments have been accompanied by both hype and scepticism by ecologists and others. This review describes the context in which deep learning methods have emerged, the deep learning methods most relevant to ecosystem ecologists, and some of the problem domains they have been applied to. Deep learning methods have high predictive performance in a range of ecological contexts, leveraging the large data resources now available. Furthermore, deep learning tools offer ecosystem ecologists new ways to learn about ecosystem dynamics. In particular, recent advances in interpretable machine learning and in developing hybrid approaches combining deep learning and mechanistic models provide a bridge between pure prediction and causal explanation. We conclude by looking at the opportunities that deep learning tools offer ecosystem ecologists and assess the challenges in interpretability that deep learning applications pose.
|
33
|
Li P, Hua P, Zhang J, Krebs P. Ecological risk and machine learning based source analyses of trace metals in typical surface water. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 838:155944. [PMID: 35588821 DOI: 10.1016/j.scitotenv.2022.155944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2022] [Revised: 04/06/2022] [Accepted: 05/10/2022] [Indexed: 06/15/2023]
Abstract
Surface water is threatened by trace metal pollution due to increasing anthropogenic activities. Therefore, an appropriate source identification was essential to reduce the ecological risk posed by the given pollutants. In this study, shallow and deep learning approaches trained by a registered environmental dataset of discharge sources were employed to classify the potential emission sources of trace metals in the Elbe River, Germany. The results showed that the overall concentration rank of the given metals was Zn (226.5 ± 526.5 μg·L⁻¹) > Ni (5.6 ± 4.7 μg·L⁻¹) > Cu (5.3 ± 5.8 μg·L⁻¹) > As (3.3 ± 3.7 μg·L⁻¹) > Pb (2.9 ± 5.2 μg·L⁻¹) > Cr (1.8 ± 2.5 μg·L⁻¹) > Cd (1.3 ± 3.1 μg·L⁻¹) in seven tributaries and the mainstream of the Elbe River, among which the tributary Triebisch had the highest risk quotient over 86. Random Forest outperformed other algorithms with the highest Kappa median values of 0.59 and the lowest Hamming-loss values of 0.22 in extraction of the majority voted class. Then, the source apportionment conducted by random forest suggested that wastewater disposal and metal industrial emissions were the source contributors in the tributary Triebisch (probabilities: 0.39, 0.3), upstream segment (0.45, 0.25), and downstream segment (0.32, 0.23) of the given river. Additional sources of mineral industry emissions were found in the upstream segment (0.21) and downstream segment (0.22). The data provided herein suggest that random forest would be an effective approach to identify pollutants in aquatic environments and could assist source-oriented adaptive management.
Affiliation(s)
- Peifeng Li
- Institute of Urban and Industrial Water Management, Technische Universität Dresden, 01062 Dresden, Germany
| | - Pei Hua
- SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, MOE Key Laboratory of Theoretical Chemistry of Environment, South China Normal University, 510006 Guangzhou, China; School of Environment, South China Normal University, University Town, 510006 Guangzhou, China
| | - Jin Zhang
- Yangtze Institute for Conservation and Development, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, 210098 Nanjing, China; State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, 830011 Urumqi, China.
| | - Peter Krebs
- Institute of Urban and Industrial Water Management, Technische Universität Dresden, 01062 Dresden, Germany
| |
|
34
|
Prediction of Dichloroethene Concentration in the Groundwater of a Contaminated Site Using XGBoost and LSTM. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022; 19:ijerph19159374. [PMID: 35954730 PMCID: PMC9367752 DOI: 10.3390/ijerph19159374] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/07/2022] [Revised: 07/22/2022] [Accepted: 07/27/2022] [Indexed: 02/04/2023]
Abstract
Chlorinated aliphatic hydrocarbons (CAHs) are widely used in agriculture and industry and have become some of the most common groundwater contaminants. Given the strong predictive performance of deep learning methods, long short-term memory (LSTM) and extreme gradient boosting (XGBoost) were used to forecast dichloroethene (DCE) concentrations in a pesticide-contaminated site undergoing natural attenuation. The input variables included BTEX, vinyl chloride (VC), and five water quality indicators. In this study, the predictive performances of LSTM and XGBoost were compared, and the influences of variables on the models' performances were evaluated. The results indicated XGBoost was more likely to capture DCE variation and was robust in high values, while the LSTM model presented better accuracy for all wells. The well with higher DCE concentrations would lower the model's accuracy, and its influence was more evident in XGBoost than LSTM. The explanation of the SHapley Additive exPlanations (SHAP) value of each variable indicated high consistency with the rules of biodegradation in the real environment. LSTM and XGBoost could predict DCE concentrations using only water quality variables, and LSTM performed better than XGBoost.
|
35
|
Xiong R, Zheng Y, Chen N, Tian Q, Liu W, Han F, Jiang S, Lu M, Zheng Y. Predicting Dynamic Riverine Nitrogen Export in Unmonitored Watersheds: Leveraging Insights of AI from Data-Rich Regions. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:10530-10542. [PMID: 35772808 DOI: 10.1021/acs.est.2c02232] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/15/2023]
Abstract
Terrestrial export of nitrogen is a critical Earth system process, but its global dynamics remain difficult to predict at a high spatiotemporal resolution. Here, we use deep learning (DL) to model daily riverine nitrogen export in response to hydrometeorological and anthropogenic drivers. Long short-term memory (LSTM) models for the daily concentration and flux of dissolved inorganic nitrogen (DIN) were built in a coastal watershed in southeastern China with a typical subtropical monsoon climate. The DL models exhibited excellent accuracy for both DIN concentration and flux, with Nash-Sutcliffe efficiency coefficients (NSEs) up to 0.67 and 0.92, respectively, a performance unlikely to be achieved by generic process-based models with comparable data quality. The flux model ensemble, without retraining, performed well (mean NSE = 0.32-0.84) in seven distinct watersheds in Asia, Europe, and North America, and retraining with multi-watershed data further improved the lowest NSE from 0.32 to 0.68. DL interpretation confirmed that interbasin consistency of riverine nitrogen export exists across different continents, which stems from the similarities in rainfall-runoff relationships. The multi-watershed flux model projects 0.60-12.4% increases in the nitrogen export to oceans from the studied watersheds under a 20% increase in fertilizer consumption, which rises to 6.7-20.1% with a 10% increase in runoff, indicating the synergistic effect of human activities and climate change. The DL-based method represents a successful case of explainable artificial intelligence in environmental science, providing a potential shortcut to a consistent understanding of the global daily-resolution dynamics of riverine nitrogen export under the currently limited data conditions.
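The Nash-Sutcliffe efficiency (NSE) quoted in this abstract compares model error against the variance of the observations: 1 is a perfect fit, and 0 means the model is no better than predicting the observed mean. A minimal sketch follows; the function name and the sample series are hypothetical and are not taken from the paper.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_ss / variance_ss


if __name__ == "__main__":
    # Hypothetical daily flux values, for illustration only.
    obs = [2.0, 4.0, 6.0, 8.0]
    sim = [2.5, 3.5, 6.5, 7.5]
    print(nash_sutcliffe_efficiency(obs, sim))
```

Because the denominator is the variance of the observations, NSE values are not directly comparable between watersheds or variables with very different variability, which is worth keeping in mind when reading the concentration (0.67) and flux (0.92) figures side by side.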
Affiliation(s)
- Rui Xiong
- School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR 999077, China
| | - Yi Zheng
- School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Shenzhen Municipal Engineering Lab of Environmental IoT Technologies, Southern University of Science and Technology, Shenzhen 518055, Guangdong Province, China
| | - Nengwang Chen
- Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Qing Tian
- Fujian Provincial Key Laboratory for Coastal Ecology and Environmental Studies, College of the Environment and Ecology, Xiamen University, Xiamen 361102, China
| | - Wei Liu
- School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| | - Feng Han
- School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| | - Shijie Jiang
- School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Department of Computational Hydrosystems, Helmholtz Centre for Environmental Research, Leipzig 04318, Germany
| | - Mengqian Lu
- Department of Civil and Environmental Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR 999077, China
| | - Yan Zheng
- School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| |
|
36
|
Forecasting Water Temperature in Cascade Reservoir Operation-Influenced River with Machine Learning Models. WATER 2022. [DOI: 10.3390/w14142146] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Water temperature (WT) is a critical control for various physical and biochemical processes in riverine systems. Although the prediction of river water temperature has been the subject of extensive research, very few studies have examined the relative importance of elements affecting WT and how to accurately estimate WT under the effects of cascaded dams. In this study, a series of potential influencing variables, such as air temperature, dew temperature, river discharge, day of year, wind speed and precipitation, were used to forecast daily river water temperature downstream of cascaded dams. First, the permutation importance of the influencing variables was ranked in six different machine learning models, including decision tree (DT), random forest (RF), gradient boosting (GB), adaptive boosting (AB), support vector regression (SVR) and multilayer perceptron neural network (MLPNN) models. The results showed that day of year (DOY) plays the most important role in each model for the prediction of WT, followed by flow and temperature, which are two commonly important factors in unregulated rivers. Then, combinations of the three most important inputs were used to develop the most parsimonious model based on the six machine learning models, where their performance was compared according to statistical metrics. The results demonstrated that GB3 and RF3 gave the most accurate forecasts for the training dataset and the test dataset, respectively. Overall, the results showed that the machine learning model could be effectively applied to predict river water temperature under the regulation of cascaded dams.
|
37
|
Yu JW, Kim JS, Li X, Jong YC, Kim KH, Ryang GI. Water quality forecasting based on data decomposition, fuzzy clustering and deep learning neural network. ENVIRONMENTAL POLLUTION (BARKING, ESSEX : 1987) 2022; 303:119136. [PMID: 35283198 DOI: 10.1016/j.envpol.2022.119136] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/08/2021] [Revised: 02/12/2022] [Accepted: 03/09/2022] [Indexed: 06/14/2023]
Abstract
Water quality forecasting can provide useful information for public health protection and support water resources management. To forecast water quality more accurately, this paper proposes a novel hybrid model that combines data decomposition, fuzzy C-means clustering, and a bidirectional gated recurrent unit (BiGRU). First, the original water quality data are decomposed into several subseries by empirical wavelet transform, and the decomposed subseries are then recombined by fuzzy C-means clustering. Next, for each clustered series, a BiGRU is applied to develop a prediction model. Finally, the forecast result is obtained by summing the predictions for the subseries. The proposed forecast model is evaluated on the water quality data of Poyang Lake, China. Results show that the proposed model provides highly accurate forecasts for all six water quality datasets: the average MAPE of the forecast results for the six datasets is 4.59% for 7-day-ahead prediction. Furthermore, the model shows better forecast performance than the other models. In particular, compared with the single BiGRU model, MAPE decreased by 32.86% on average. Results demonstrate that the proposed forecast model can be used effectively for water quality forecasting.
Affiliation(s)
- Jin-Won Yu
- School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin, 300384, China; University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
| | - Ju-Song Kim
- School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin, 300384, China; University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
| | - Xia Li
- School of Environmental Science and Safety Engineering, Tianjin University of Technology, Tianjin, 300384, China.
| | - Yun-Chol Jong
- University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
| | - Kwang-Hun Kim
- University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
| | - Gwang-Il Ryang
- University of Science, Pyongyang, 999091, Democratic People's Republic of Korea
| |
|
38
|
Jiang Y, Li C, Song H, Wang W. Deep learning model based on urban multi-source data for predicting heavy metals (Cu, Zn, Ni, Cr) in industrial sewer networks. JOURNAL OF HAZARDOUS MATERIALS 2022; 432:128732. [PMID: 35334271 DOI: 10.1016/j.jhazmat.2022.128732] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/07/2022] [Revised: 03/14/2022] [Accepted: 03/15/2022] [Indexed: 06/14/2023]
Abstract
The high concentrations of heavy metals in municipal industrial sewer networks will seriously impact the microorganisms of the activated sludge in the wastewater treatment plant (WWTP), thus deteriorating the effluent quality and destroying the stability of sewage treatment. Therefore, timely prediction and early warning of heavy metal concentrations in industrial sewer networks is crucial. However, due to the complex sources of heavy metals in industrial sewer networks, traditional physical modeling and linear methods cannot establish an accurate prediction model. Herein, we developed a Gated Recurrent Unit (GRU) neural network model based on a deep learning algorithm for predicting the concentrations of heavy metals in industrial sewer networks. To train the GRU model, we used low-cost and easy-to-obtain urban multi-source data, including socio-environmental indicator data, air environmental indicator data, water quantity indicator data, and easily measurable water quality indicator data. The model was applied to predict the concentrations of heavy metals (Cu, Zn, Ni, and Cr) in the sewer networks of an industrial area in southern China. The results are compared with the commonly used Artificial Neural Network (ANN) model. In this study, it was shown that the GRU had better prediction performance for Cu, Zn, Ni, and Cr concentrations, with the average R² significantly increased by 12.35%, 11.94%, 9.21%, and 8.13%, respectively, compared to ANN predictions. The sensitivity analysis based on Shapley (SHAP) values revealed that conductivity (σ), temperature (T), pH, and sewage flow (Flow) contributed significantly to the prediction results of the model. Furthermore, the three input variables including air pressure (AP), land area (A), and population (Pop.) were removed without affecting the prediction performance of the model, which maximized the modeling efficiency and reduced the operational cost. This study provides an economical and feasible technical method for early warning of abnormal heavy metal concentrations in urban industrial sewer networks.
Affiliation(s)
- Yiqi Jiang
- School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, China
| | - Chaolin Li
- School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, China; State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150090, China.
| | - Hongxing Song
- Shenzhen Hydrology and Water Quality Center, Shenzhen 518038, China
| | - Wenhui Wang
- School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen 518055, China.
| |
|
39
|
Zhu M, Wang J, Yang X, Zhang Y, Zhang L, Ren H, Wu B, Ye L. A review of the application of machine learning in water quality evaluation. ECO-ENVIRONMENT & HEALTH (ONLINE) 2022; 1:107-116. [PMID: 38075524 PMCID: PMC10702893 DOI: 10.1016/j.eehl.2022.06.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 02/01/2022] [Revised: 05/19/2022] [Accepted: 06/01/2022] [Indexed: 12/31/2023]
Abstract
With the rapid increase in the volume of data on the aquatic environment, machine learning has become an important tool for data analysis, classification, and prediction. Unlike traditional models used in water-related research, data-driven models based on machine learning can efficiently solve more complex nonlinear problems. In water environment research, models and conclusions derived from machine learning have been applied to the construction, monitoring, simulation, evaluation, and optimization of various water treatment and management systems. Additionally, machine learning can provide solutions for water pollution control, water quality improvement, and watershed ecosystem security management. In this review, we describe the cases in which machine learning algorithms have been applied to evaluate the water quality in different water environments, such as surface water, groundwater, drinking water, sewage, and seawater. Furthermore, we propose possible future applications of machine learning approaches to water environments.
Affiliation(s)
- Mengyuan Zhu
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Jiawei Wang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Xiao Yang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Yu Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Linyu Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Hongqiang Ren
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Bing Wu
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| | - Lin Ye
- State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, China
| |
|
40
|
Ahmed S, Abdul-Aziz OI. Metabolic scaling of stream dissolved oxygen across the U.S. Atlantic Coast. THE SCIENCE OF THE TOTAL ENVIRONMENT 2022; 821:153292. [PMID: 35066036 DOI: 10.1016/j.scitotenv.2022.153292] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2021] [Revised: 01/16/2022] [Accepted: 01/16/2022] [Indexed: 06/14/2023]
Abstract
We investigated the hypothesis of emergent 'biogeochemical' similitude (parametric reduction) and scaling of dissolved oxygen (DO) in coastal streams across the U.S. Atlantic Coast by employing dimensional analysis methodology from fluid mechanics and hydraulic engineering. Two mechanistically meaningful dimensionless numbers were discovered as the stream 'metabolic' number and the fraction of 'DO saturation' number. The 'metabolic' number represented the synergistic control on stream DO from various climatic, hydrologic, biochemical, and ecological drivers (e.g., water temperature, atmospheric pressure, stream width and depth, total phosphorus, pH, and salinity). A graphical exploration of the 'metabolic' versus the 'DO saturation' numbers led to collapse of data during 1998-2015 from diverse coastal streams into an emergent process diagram, indicating three metabolism regimes (high, transitional, and low). The high and low metabolism regimes were, respectively, characterized by the most and least favorable environmental conditions for stream DO depletion-through reduced dissolution and reaeration, as well as increased organic decomposition, respiration, and nitrification. The emergent process diagram led to a generalized power law scaling relationship of the 'DO saturation' number as a function of the 'metabolic' number (exponent ~ 1/3; Nash-Sutcliffe Efficiency, NSE = 0.83-0.85). The metabolic scaling law was leveraged to develop a generalized empirical model to successfully predict DO in diverse streams across the U.S. Atlantic Coast (NSE = 0.83). The emergent process diagram, metabolic scaling law, and prediction model of DO would help understand and manage water quality and ecosystem health of coastal streams in the U.S. and elsewhere.
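A power-law scaling relationship of the form y = a·x^b, such as the one described in this abstract, becomes a straight line in log-log space, so the exponent b can be estimated by ordinary least squares on the logarithms. The sketch below illustrates that general technique only; it is not the authors' procedure, and the function name and synthetic data are hypothetical.

```python
import math


def fit_power_law(x_values, y_values):
    """Fit y = a * x**b by least squares on (log x, log y).

    Returns (a, b). All inputs must be strictly positive.
    """
    log_x = [math.log(v) for v in x_values]
    log_y = [math.log(v) for v in y_values]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    exponent = sxy / sxx                                  # slope in log-log space
    coefficient = math.exp(mean_y - exponent * mean_x)    # exp(intercept)
    return coefficient, exponent


if __name__ == "__main__":
    # Synthetic data generated with a = 2 and b = 1/3, for illustration only.
    xs = [1.0, 8.0, 27.0, 64.0, 125.0]
    ys = [2.0 * x ** (1.0 / 3.0) for x in xs]
    a, b = fit_power_law(xs, ys)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

Fitting in log space weights relative rather than absolute errors, which is one reason a goodness-of-fit measure such as NSE is usually reported on the back-transformed values as well.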
Affiliation(s)
- Shakil Ahmed
- Department of Civil and Environmental Engineering, West Virginia University, 395 Evansdale Drive, Morgantown, WV 26506-6103, USA; Department of Civil Engineering, East West University, Aftabnagar, Dhaka 1212, Bangladesh
| | - Omar I Abdul-Aziz
- Department of Civil and Environmental Engineering, West Virginia University, 395 Evansdale Drive, Morgantown, WV 26506-6103, USA.
| |
|
41
|
Land-Use Impact on Water Quality of the Opak Sub-Watershed, Yogyakarta, Indonesia. SUSTAINABILITY 2022. [DOI: 10.3390/su14074346] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
The integrated monitoring system of water quality is eminently reliant on water quality trend data. This study aims to obtain water quality patterns related to land-use change over a periodic observation in the Opak sub-watershed, Indonesia, both from a seasonal and spatial point of view. Landsat image data from 2013 to 2020 and water quality data comprising 25 parameters were compiled and analyzed. This study observed that land use remarkably correlated to water quality, especially the building area representing the dense population and various anthropogenic activities, to pollute the water sources. Three types of pollutant sources were identified using principal component analysis (PCA), including domestic, industrial, and agricultural activities, which all influenced the variance in river water quality. The use of spatiotemporal-based and multivariate analysis was to interpret water quality trend data, which can help the stakeholders to monitor pollution and take control in the Opak sub-watershed. The results investigated 17 out of 25 water quality parameters, which showed an increasing trend from upstream to downstream during the observation time. The concentration of biological oxygen demand over five days (BOD5), chemical oxygen demand (COD), nitrite, sulfide, phenol, phosphate, oil and grease, lead, Escherichia coli (E. coli), and total coli, surpassed the water quality standard through spatial analysis.
|
42
|
Inner Dynamic Detection and Prediction of Water Quality Based on CEEMDAN and GA-SVM Models. REMOTE SENSING 2022. [DOI: 10.3390/rs14071714] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Urban water quality is facing strongly adverse degradation in rapidly developing areas. However, there exists a huge challenge to estimating the inner features and predicting the variation of long-term water quality due to the lack of related monitoring data and the complexity of urban water systems. Fortunately, multi-remote sensing data, such as nighttime light and evapotranspiration (ET), provide scientific data support and reasonably reveal the variation mechanisms. Here, we develop an integrated decomposition-reclassification-prediction method for water quality by integrating the CEEMDAN method, the RF method, and the genetic algorithm-support vector machine model (GA-SVM). The degression of the long-term water quality was decomposed and reclassified into three different frequency terms, i.e., high-frequency, low-frequency, and trend terms, to reveal the inner mechanism and dynamics in the CEEMDAN method. The RF method was then used to identify the teleconnection and the significance of the selected driving factors. More importantly, the GA-SVM model was designed with two types of model schemes, which were the data-driven model (GA-SVMd) and the integrated CEEMDAN-GA-SVM model (defined as GA-SVMc model), in order to predict urban water quality. Results revealed that the high-frequency terms for NH3-N and TN had a major contribution to the water quality and were mainly dominated by hydrometeorological factors such as ET, rainfall, and the dynamics of the lake water table. The trend terms revealed that the water quality continuously deteriorated during the study period; the terms were mainly regulated by the land use and land cover (LULC), land metrics, population, and yearly rainfall. The prediction results confirmed that the integrated GA-SVMc model had better performance than single data-driven models (such as the GA-SVM model). Our study supports that the integrated method reveals variation rules in water quality and provides early warning and guidance for reducing the water pollutant concentration.
|
43
|
Kundu S, Pal S, Talukdar S, Mahato S, Singha P. Integration of satellite image-derived temperature and water depth for assessing fish habitability in dam controlled flood plain wetland. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2022; 29:28083-28097. [PMID: 34988818 DOI: 10.1007/s11356-021-17869-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2021] [Accepted: 11/26/2021] [Indexed: 06/14/2023]
Abstract
The present study investigated changes in temperature conducive to fish habitability during the summer months in a hydrologically modified wetland following the damming of a river. Satellite image-derived temperature and depth data calibrated with field data were used to analyse fish habitability and the presence of thermally optimum habitable zones for several species, such as Labeo rohita, Cirrhinus mrigala, tilapia, small shrimp, and catfish. The study was conducted both at the water's surface and at the optimum depth of survival. The analysis shows that a larger part of the wetland lost its aquatic habitat during the post-dam period, and the existing wetlands have suffered significant shallowing of water depth. This has resulted in a shrinking of the thermally optimum area of fish survival in relation to surface water temperature (from 100.09 to 74.24 km² before the dam to 93.97 to 0 km² after the dam) and an improvement in the optimum habitable condition in the comfortable depth niche of survival. In the post-dam period, it increased from 75.49 to 99.76%. Since the damming effect causes a 30.53 to 100% depletion of the optimum depth niche, improving the thermal environment has no effect on fish habitability. More water must be released from dams for restoration. Image-derived depth and temperature data calibrated with field information were successfully applied in data-sparse conditions, and the approach is recommended for future work.
Affiliation(s)
- Sonali Kundu: Department of Geography, University of Gour Banga, Malda, India
- Swades Pal: Department of Geography, University of Gour Banga, Malda, India
- Swapan Talukdar: Department of Geography, Faculty of Natural Science, Jamia Millia Islamia, New Delhi, 110025, India
- Susanta Mahato: Special Centre for Disaster Research, Jawaharlal Nehru University, New Delhi, 110 067, India
- Pankaj Singha: Department of Geography, University of Gour Banga, Malda, India
|
44
|
Machine learning to predict effective reaction rates in 3D porous media from pore structural features. Sci Rep 2022; 12:5486. [PMID: 35361834 PMCID: PMC8971379 DOI: 10.1038/s41598-022-09495-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/21/2022] [Accepted: 03/24/2022] [Indexed: 12/03/2022] [Open Access]
Abstract
Large discrepancies between well-mixed reaction rates and effective reaction rates estimated under fluid flow conditions have been a major issue for predicting reactive transport in porous media systems. In this study, we introduce a framework that accurately predicts effective reaction rates directly from pore structural features by combining 3D pore-scale numerical simulations with machine learning (ML). We first perform pore-scale reactive transport simulations with fluid–solid reactions in hundreds of porous media and calculate effective reaction rates from pore-scale concentration fields. We then train a Random Forests model with 11 pore structural features and effective reaction rates to quantify the importance of structural features in determining effective reaction rates. Based on the importance information, we train artificial neural networks with varying numbers of features and demonstrate that effective reaction rates can be accurately predicted with only three pore structural features: specific surface, pore sphericity, and coordination number. Finally, global sensitivity analyses using the ML model elucidate how the three structural features affect effective reaction rates. The proposed framework enables accurate predictions of effective reaction rates directly from a few measurable pore structural features, and the framework is readily applicable to a wide range of applications involving porous media flows.
|
45
|
Stream Temperature Predictions for River Basin Management in the Pacific Northwest and Mid-Atlantic Regions Using Machine Learning. WATER 2022. [DOI: 10.3390/w14071032] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 02/05/2023]
Abstract
Stream temperature (Ts) is an important water quality parameter that affects ecosystem health and human water use for beneficial purposes. Accurate Ts predictions at different spatial and temporal scales can inform water management decisions that account for the effects of changing climate and extreme events. In particular, widespread predictions of Ts in unmonitored stream reaches can enable decision makers to be responsive to changes caused by unforeseen disturbances. In this study, we demonstrate the use of classical machine learning (ML) models, support vector regression and gradient boosted trees (XGBoost), for monthly Ts predictions in 78 pristine and human-impacted catchments of the Mid-Atlantic and Pacific Northwest hydrologic regions spanning different geologies, climate, and land use. The ML models were trained using long-term monitoring data from 1980–2020 for three scenarios: (1) temporal predictions at a single site, (2) temporal predictions for multiple sites within a region, and (3) spatiotemporal predictions in unmonitored basins (PUB). In the first two scenarios, the ML models predicted Ts with median root mean squared errors (RMSE) of 0.69–0.84 °C and 0.92–1.02 °C across different model types for the temporal predictions at single and multiple sites, respectively. For the PUB scenario, we used a bootstrap aggregation approach using models trained with different subsets of data, for which an ensemble XGBoost implementation outperformed all other modeling configurations (median RMSE 0.62 °C). The ML models improved median monthly Ts estimates compared to baseline statistical multi-linear regression models by 15–48% depending on the site and scenario. Air temperature was found to be the primary driver of monthly Ts for all sites, with secondary influence of month of the year (seasonality) and solar radiation, while discharge was a significant predictor at only 10 sites. 
The predictive performance of the ML models was robust to configuration changes in model setup and inputs, but was influenced by the distance to the nearest dam with RMSE <1 °C at sites situated greater than 16 and 44 km from a dam for the temporal single site and regional scenarios, and over 1.4 km from a dam for the PUB scenario. Our results show that classical ML models with solely meteorological inputs can be used for spatial and temporal predictions of monthly Ts in pristine and managed basins with reasonable (<1 °C) accuracy for most locations.
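As an illustration of the bootstrap aggregation idea mentioned in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes a one-variable least-squares line (stream temperature regressed on air temperature) as a stand-in for the XGBoost learners used in the study; the function names `fit_line` and `bagged_predict` and all parameter values are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate resample (all x identical): fall back to the mean of y.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bagged_predict(xs, ys, x_new, n_models=25, seed=0):
    """Average the predictions of models fitted on bootstrap resamples.

    Assumption: a simple linear model replaces the tree ensembles
    described in the abstract; only the resample-fit-average pattern
    is the point of this sketch.
    """
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        # Draw n indices with replacement (one bootstrap resample).
        idx = [rng.randrange(n) for _ in range(n)]
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return sum(predictions) / len(predictions)
```

Calling `bagged_predict(air_temps, stream_temps, 12.0)` on paired monthly series would return the ensemble-mean estimate for an air temperature of 12 °C; the spread of the individual predictions could also serve as a rough uncertainty indicator.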
|
46
|
Modeling Multistep Ahead Dissolved Oxygen Concentration Using Improved Support Vector Machines by a Hybrid Metaheuristic Algorithm. SUSTAINABILITY 2022. [DOI: 10.3390/su14063470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
Dissolved oxygen (DO) concentration is an important water-quality parameter, and its estimation is very important for aquatic ecosystems, drinking water resources, and agro-industrial activities. In the present study, a new support vector machine (SVM) method, improved by a hybrid firefly algorithm–particle swarm optimization (FFAPSO), is proposed for the accurate estimation of DO. Daily pH, temperature (T), electrical conductivity (EC), river discharge (Q), and DO data from Fountain Creek near Fountain, the United States, were used for the model development. Various combinations of pH, T, EC, and Q were used as inputs to the models to estimate DO. The outcomes of the proposed SVM–FFAPSO model were compared with those of the SVM–PSO, SVM–FFA, and standalone SVM models with respect to the root mean square error (RMSE), the mean absolute error (MAE), the Nash–Sutcliffe efficiency (NSE), and the determination coefficient (R2), as well as graphical methods such as scatterplots and Taylor and violin charts. The SVM–FFAPSO showed superior performance to the other methods in the estimation of DO. The best model of each method was also assessed in multistep-ahead (from 1- to 7-day ahead) DO prediction, and the comparison confirmed the superiority of the proposed method. The general outcomes recommend the use of SVM–FFAPSO in DO modeling, and this method can be useful for decision-makers in urban water planning and management.
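The four goodness-of-fit statistics named in this abstract can be written out in a few lines. The sketch below uses the standard textbook definitions and takes R2 to be the squared Pearson correlation between observed and simulated values, which is an assumption, since the abstract does not state which definition the authors used. The function names are hypothetical.

```python
import math


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def r2(obs, sim):
    """Determination coefficient, assumed here to be squared Pearson correlation."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)
```

Note that NSE and this form of R2 can disagree: a simulation with a constant bias keeps a perfect correlation (R2 = 1) while its NSE drops, which is one reason studies often report both.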
|
47
|
|
48
|
Jiang Y, Li C, Zhang Y, Zhao R, Yan K, Wang W. Data-driven method based on deep learning algorithm for detecting fat, oil, and grease (FOG) of sewer networks in urban commercial areas. WATER RESEARCH 2021; 207:117797. [PMID: 34731668 DOI: 10.1016/j.watres.2021.117797] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Revised: 09/17/2021] [Accepted: 10/20/2021] [Indexed: 06/13/2023]
Abstract
The content of fat, oil and grease (FOG) in sewer network sediments is the key indicator for diagnosing sewer blockage and overflow. However, traditional FOG detection is time-consuming and costly, and mathematical models based on statistical methods fail to predict FOG content with satisfactory accuracy. Herein, a data-driven FOG content prediction model based on a deep learning algorithm is proposed to achieve a more accurate prediction of FOG content. Meanwhile, global sensitivity analysis (GSA) is used to evaluate the contribution of input indicators to the output indicator (FOG) in the model, so that input indicators with less impact on the prediction performance can be screened out, the best combination of input indicators can be determined, and the operation cost of the model can be reduced. To evaluate the effectiveness of the proposed model, a case study was conducted in a city in southern China. The experimental results indicate that the prediction model obtains good FOG estimations and performs well from a single site to multiple sites with a mean R2 of 0.922, showing good generalization performance. Through GSA, the key input indicators in the model were identified as pH, water temperature (T), relative humidity (RH), sewage flow (Flow), drinking water supply (DWS), velocity (V), and conductivity (σ), whereas input indicators such as air pressure (AP), population (Pop.), and liquid level (LV) can be removed without affecting the prediction accuracy of the model.
Affiliation(s)
- Yiqi Jiang: School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Chaolin Li: School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China; State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin, 150090, China
- Yituo Zhang: School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Ruobin Zhao: School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Kefen Yan: School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
- Wenhui Wang: School of Civil and Environmental Engineering, Harbin Institute of Technology, Shenzhen, 518055, China
|
49
|
Huang R, Ma C, Ma J, Huangfu X, He Q. Machine learning in natural and engineered water systems. WATER RESEARCH 2021; 205:117666. [PMID: 34560616 DOI: 10.1016/j.watres.2021.117666] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 05/31/2021] [Revised: 09/01/2021] [Accepted: 09/11/2021] [Indexed: 06/13/2023]
Abstract
Water resources of desired quality and quantity are the foundation for human survival and sustainable development. To better protect the water environment and conserve water resources, efficient water management, purification, and transportation are of critical importance. In recent years, machine learning (ML) has exhibited its practicability, reliability, and high efficiency in numerous applications; furthermore, it has solved conventional and emerging problems in both natural and engineered water systems. For example, ML can predict various water quality indicators in situ and real-time by considering the complex interactions among water-related variables. ML approaches can also solve emerging pollution problems with proven rules or universal mechanisms summarized from the related research. Moreover, by applying image recognition technology to analyze the relationships between image information and physicochemical properties of the research object, ML can effectively identify and characterize specific contaminants. In view of the bright prospects of ML, this review comprehensively summarizes the development of ML applications in natural and engineered water systems. First, the concept and modeling steps of ML are briefly introduced, including data preparation, algorithm selection and model evaluation. In addition, comprehensive applications of ML in recent studies, including predicting water quality, mapping groundwater contaminants, classifying water resources, tracing contaminant sources, and evaluating pollutant toxicity in natural water systems, as well as modeling treatment techniques, assisting characterization analysis, purifying and distributing drinking water, and collecting and treating sewage water in engineered water systems, are summarized. 
Finally, the advantages and disadvantages of commonly used algorithms are analyzed according to their structures and mechanisms, and recommendations on the selection of ML algorithms for different studies, as well as prospects on the application and development of ML in water science are proposed. This review provides references for solving a wider range of water-related problems and brings further insights into the intelligent development of water science.
Affiliation(s)
- Ruixing Huang: Key Laboratory of Eco-environments in the Three Gorges Reservoir Region, Ministry of Education, College of Environmental and Ecology, Chongqing University, Chongqing 400044, China; State Key Laboratory of Urban Water Resource and Environment, School of Municipal and Environmental Engineering, Harbin Institute of Technology, Harbin 150090, China
- Chengxue Ma: Key Laboratory of Eco-environments in the Three Gorges Reservoir Region, Ministry of Education, College of Environmental and Ecology, Chongqing University, Chongqing 400044, China; State Key Laboratory of Urban Water Resource and Environment, School of Municipal and Environmental Engineering, Harbin Institute of Technology, Harbin 150090, China
- Jun Ma: State Key Laboratory of Urban Water Resource and Environment, School of Municipal and Environmental Engineering, Harbin Institute of Technology, Harbin 150090, China
- Xiaoliu Huangfu: Key Laboratory of Eco-environments in the Three Gorges Reservoir Region, Ministry of Education, College of Environmental and Ecology, Chongqing University, Chongqing 400044, China
- Qiang He: Key Laboratory of Eco-environments in the Three Gorges Reservoir Region, Ministry of Education, College of Environmental and Ecology, Chongqing University, Chongqing 400044, China
|
50
|
Tsai WP, Feng D, Pan M, Beck H, Lawson K, Yang Y, Liu J, Shen C. From calibration to parameter learning: Harnessing the scaling effects of big data in geoscientific modeling. Nat Commun 2021; 12:5988. [PMID: 34645796 PMCID: PMC8514470 DOI: 10.1038/s41467-021-26107-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/13/2021] [Accepted: 09/17/2021] [Indexed: 11/11/2022] [Open Access]
Abstract
The behaviors and skills of models in many geosciences (e.g., hydrology and ecosystem sciences) strongly depend on spatially-varying parameters that need calibration. A well-calibrated model can reasonably propagate information from observations to unobserved variables via model physics, but traditional calibration is highly inefficient and results in non-unique solutions. Here we propose a novel differentiable parameter learning (dPL) framework that efficiently learns a global mapping between inputs (and optionally responses) and parameters. Crucially, dPL exhibits beneficial scaling curves not previously demonstrated to geoscientists: as training data increases, dPL achieves better performance, more physical coherence, and better generalizability (across space and uncalibrated variables), all with orders-of-magnitude lower computational cost. We demonstrate examples that learned from soil moisture and streamflow, where dPL drastically outperformed existing evolutionary and regionalization methods, or required only ~12.5% of the training data to achieve similar performance. The generic scheme promotes the integration of deep learning and process-based models, without mandating reimplementation. Much effort is invested in calibrating model parameters for accurate outputs, but established methods can be inefficient and generic. By learning from big datasets, a new differentiable framework for model parameterization outperforms state-of-the-art methods and produces more physically coherent results, using a fraction of the training data, computational power, and time. The method promotes a deep integration of machine learning with process-based geoscientific models.
Affiliation(s)
- Wen-Ping Tsai: Civil and Environmental Engineering, Pennsylvania State University, University Park, PA, USA
- Dapeng Feng: Civil and Environmental Engineering, Pennsylvania State University, University Park, PA, USA
- Ming Pan: Center for Western Weather and Water Extremes, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA; Civil and Environmental Engineering, Princeton University, Princeton, NJ, USA
- Kathryn Lawson: Civil and Environmental Engineering, Pennsylvania State University, University Park, PA, USA; HydroSapient, Inc, State College, PA, USA
- Yuan Yang: Department of Hydraulic Engineering, Tsinghua University, Beijing, China; Institute of Science and Technology, China Three Gorges Corporation, Beijing, China
- Jiangtao Liu: Civil and Environmental Engineering, Pennsylvania State University, University Park, PA, USA
- Chaopeng Shen: Civil and Environmental Engineering, Pennsylvania State University, University Park, PA, USA; HydroSapient, Inc, State College, PA, USA
|