1
Qiu Y, Li Z, Zhang T, Zhang P. Predicting aqueous sorption of organic pollutants on microplastics with machine learning. WATER RESEARCH 2023; 244:120503. [PMID: 37639990 DOI: 10.1016/j.watres.2023.120503] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/31/2023] [Revised: 08/17/2023] [Accepted: 08/18/2023] [Indexed: 08/31/2023]
Abstract
Microplastics (MPs) are ubiquitously distributed in freshwater systems and they can determine the environmental fate of organic pollutants (OPs) via sorption interactions. However, the diverse physicochemical properties of MPs and the wide range of OP species make a deeper understanding of sorption mechanisms challenging. Traditional isotherm-based sorption models are limited in their universality since they normally only consider the nature and characteristics of either sorbents or sorbates individually. Therefore, only specific equilibrium concentrations or specific sorption isotherms can be used to predict sorption. To systematically evaluate and predict OP sorption under the influence of both MP and OP properties, we collected 475 sorption data points from peer-reviewed publications and developed a poly-parameter-linear-free-energy-relationship-embedded machine learning method to analyze the collected sorption datasets. Models of different algorithms were compared, and the genetic algorithm and support vector machine hybrid model displayed the best prediction performance (R² of 0.93 and root-mean-square error of 0.07). Finally, comparison results of three feature importance analysis tools (forward stepwise method, Shapley method, and global sensitivity analysis) showed that chemical properties of MPs, excess molar refraction, and hydrogen-bonding interaction of OPs contribute the most to sorption, reflecting the dominant sorption mechanisms of hydrophobic partitioning, hydrogen bond formation, and π-π interaction, respectively. This study presents a novel sorbate-sorbent-based ML model with a wide applicability to expand our capacity in understanding the complicated process and mechanism of OP sorption on MPs.
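For orientation, a poly-parameter linear free energy relationship of the kind named in this abstract is commonly written in the Abraham solvation form. The abstract does not give the exact equation the authors embedded, so the expression below is the general textbook form, offered as an assumption rather than as the authors' model:

```latex
\log K = c + eE + sS + aA + bB + vV
```

Here $E$ is the excess molar refraction, $S$ the dipolarity/polarizability, $A$ and $B$ the hydrogen-bond acidity and basicity, and $V$ the McGowan characteristic volume of the sorbate; the lowercase letters are system-specific coefficients obtained by fitting.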
Affiliation(s)
- Ye Qiu
- Department of Civil and Environmental Engineering, Faculty of Science and Technology, University of Macau, Taipa, Macau SAR
- Zhejun Li
- Department of Civil and Environmental Engineering, Faculty of Science and Technology, University of Macau, Taipa, Macau SAR
- Tong Zhang
- College of Environmental Science and Engineering, Tianjin Key Laboratory of Environmental Remediation and Pollution Control, Nankai University, 38 Tongyan Rd., Tianjin 300350, China
- Ping Zhang
- Department of Civil and Environmental Engineering, Faculty of Science and Technology, University of Macau, Taipa, Macau SAR.
2
Vakarelska E, Nedyalkova M, Vasighi M, Simeonov V. Persistent organic pollutants (POPs) - QSPR classification models by means of Machine learning strategies. CHEMOSPHERE 2022; 287:132189. [PMID: 34826905 DOI: 10.1016/j.chemosphere.2021.132189] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/15/2021] [Revised: 08/20/2021] [Accepted: 09/04/2021] [Indexed: 06/13/2023]
Abstract
Persistent organic pollutants (POPs) are toxic chemicals with a very slow degradation rate and a global negative impact. Their physicochemical properties combine with the complex effects of long-term POP accumulation in the environment and transport through the food chain. That is why POPs have been linked to adverse effects on human and animal health. They circulate globally via different environmental pathways and can be detected in regions far from their source of origin. The primary goal of the present study is to classify various representatives of POPs using different theoretical descriptors (molecular, structural) in order to develop quantitative structure-property relationship (QSPR) models for predicting important properties of POPs. Multivariate statistical methods such as hierarchical cluster analysis, principal components analysis and self-organizing maps were applied to reach excellent partitioning of 149 representatives of POPs into 4 classes using the ten most appropriate descriptors (out of 63) selected by a variable reduction procedure. The defined classes could be applied as a pattern recognition tool for new and unidentified POPs, based only on structural properties that similar molecules may have. The additional self-organizing maps (SOM) technique made it possible to visualize the feature space and investigate possible patterns and similarities between POP molecules. It helps confirm the proper classification into four classes. Based on the SOM results, the effect of each variable and the pattern formation are presented.
Affiliation(s)
- Ekaterina Vakarelska
- Department of Inorganic Chemistry, University of Sofia "St. Kl. Okhridski", Sofia, Bulgaria
- Miroslava Nedyalkova
- Department of Inorganic Chemistry, University of Sofia "St. Kl. Okhridski", Sofia, Bulgaria.
- Mahdi Vasighi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
- Vasil Simeonov
- Department of Analytical Chemistry, University of Sofia "St. Kl. Okhridski", Sofia, Bulgaria
3
Zhu T, Gu L, Chen M, Sun F. Exploring QSPR models for predicting PUF-air partition coefficients of organic compounds with linear and nonlinear approaches. CHEMOSPHERE 2021; 266:128962. [PMID: 33218721 DOI: 10.1016/j.chemosphere.2020.128962] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/23/2020] [Revised: 11/05/2020] [Accepted: 11/10/2020] [Indexed: 06/11/2023]
Abstract
Partition coefficients are important parameters for measuring the concentration of chemicals by passive sampling devices. Considering the wide application of polyurethane foam (PUF) in passive air sampling, this work developed several quantitative structure-property relationship (QSPR) models to predict PUF-air partition coefficients (KPUF-air) using linear (multiple linear regression, MLR) and non-linear (artificial neural network, ANN, and support vector machine, SVM) machine learning methods. All of the models were built on a dataset of 170 compounds comprising 9 distinct classes. A series of statistical parameters and validation results showed that the models had good prediction ability, robustness and goodness-of-fit. Furthermore, interpretation of the molecular descriptors indicated that ionization potential, molecular bond, hydrophilicity, molecular size and valence electron number had a dominating influence on the adsorption process of chemicals. Overall, the obtained models were all established with extensive applicability domains, and thus can be used as effective tools to predict the KPUF-air of new organic compounds or of those that have not been synthesized yet, which, in turn, could help researchers better understand the mechanistic basis of the adsorption behavior of PUF.
Affiliation(s)
- Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China.
- Liming Gu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China
- Ming Chen
- School of Civil Engineering, Southeast University, Nanjing, 210096, China
- Feng Sun
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou, 225127, Jiangsu, China
4
Zhu T, Chen W, Singh RP, Cui Y. Versatile in silico modeling of partition coefficients of organic compounds in polydimethylsiloxane using linear and nonlinear methods. JOURNAL OF HAZARDOUS MATERIALS 2020; 399:123012. [PMID: 32544766 DOI: 10.1016/j.jhazmat.2020.123012] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/07/2020] [Revised: 05/15/2020] [Accepted: 05/20/2020] [Indexed: 06/11/2023]
Abstract
The environmental fate, behavior and effects of hazardous organic compounds have recently received great attention in diverse environmental phases, including water, atmosphere, soil and sediment. Because polydimethylsiloxane (PDMS) fibers have been validated for wide application in determining partition behavior in passive sampling, this work established several in silico models to predict PDMS-water (KPDMS-w), PDMS-air (KPDMS-a) and PDMS-seawater (KPDMS-sw) partition coefficients of diverse chemicals. This is an attempt to combine a conventional linear method with a popular nonlinear algorithm for the estimation of partition coefficients between PDMS and different environmental media. All of the developed models showed satisfactory goodness-of-fit with a high adjusted correlation coefficient (R²adj) and were shown to be robust, stable and predictive by various internal and external validation techniques covering a wide series of statistical checks. Moreover, interpretation of the selected descriptors showed that hydrophobicity, polarizability, charge distribution and molecular size of the compounds contributed significantly to the models. Based on the broad applicability domains (ADs), the current study provides suitable tools to fill the experimental data gap for other compounds and to help researchers better understand the mechanistic basis of the adsorption behavior of PDMS.
Affiliation(s)
- Tengyi Zhu
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China.
- Wenxuan Chen
- School of Environmental Science and Engineering, Yangzhou University, Yangzhou 225127, Jiangsu, China
- Yanran Cui
- Institute for Integrated Catalysis, Pacific Northwest National Laboratory, P.O. Box 999, Richland, WA 99354, United States
5
Jiao L, Bing S, Wang X, Xia D, Li H. Predicting the Aqueous Solubility of PCDD/Fs by using QSPR Method Based on the Molecular Distance-Edge Vector Index. Polycycl Aromat Compd 2015. [DOI: 10.1080/10406638.2015.1028588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Long Jiao
- College of Chemistry and Chemical Engineering, Xi’an Shiyou University, Xi’an, P.R. China
- Shan Bing
- College of Chemistry and Chemical Engineering, Xi’an Shiyou University, Xi’an, P.R. China
- Xiaofei Wang
- College of Chemistry and Chemical Engineering, Xi’an Shiyou University, Xi’an, P.R. China
- Donghui Xia
- School of Chemistry and Environmental Science, Shaanxi University of Technology, Hanzhong, P.R. China
- Hua Li
- College of Chemistry and Materials Science, Northwest University, Xi’an, P.R. China
6
Parinet J, Julien M, Nun P, Robins RJ, Remaud G, Höhener P. Predicting equilibrium vapour pressure isotope effects by using artificial neural networks or multi-linear regression - A quantitative structure property relationship approach. CHEMOSPHERE 2015; 134:521-527. [PMID: 25559176 DOI: 10.1016/j.chemosphere.2014.10.079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2014] [Revised: 10/13/2014] [Accepted: 10/19/2014] [Indexed: 06/04/2023]
Abstract
We aim to predict the effect of structure and isotopic substitution on the equilibrium vapour pressure isotope effect of various organic compounds (alcohols, acids, alkanes, alkenes and aromatics) at intermediate temperatures. We explore quantitative structure property relationships using an artificial neural network (ANN), the multi-layer perceptron (MLP), and compare its performance with that of multi-linear regression (MLR). These approaches are based on the relationship between the molecular structure (organic chain, polar functions, type of functions, type of isotope involved) of the organic compounds and their equilibrium vapour pressure. A data set of 130 equilibrium vapour pressure isotope effects was used: 112 were used in the training set and the remaining 18 were used for the test/validation dataset. Two sets of descriptors were tested: a full set with all the descriptors (number of ¹²C, ¹³C, ¹⁶O, ¹⁸O, ¹H, ²H, OH functions, OD functions, CO functions, Connolly Solvent Accessible Surface Area (CSA) and temperature) and a reduced set of descriptors. The dependent variable (the output) is the natural logarithm of the ratio of vapour pressures (ln R), expressed as light/heavy as in the classical literature. Since the database is rather small, the leave-one-out procedure was used to validate both models. Considering the higher determination coefficients and lower error values, it is concluded that the multi-layer perceptron provided better results than multi-linear regression. The stepwise regression procedure is a useful tool to reduce the number of descriptors. To our knowledge, a Quantitative Structure Property Relationship (QSPR) approach for isotopic studies is novel.
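The leave-one-out procedure mentioned in this abstract can be sketched as follows. This is a minimal illustration with a single descriptor and ordinary least squares, not the authors' MLR or MLP setup; the function names, the example data, and the choice of Q² = 1 − PRESS/TSS (with the overall mean in TSS) are all assumptions made for the sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated Q^2 = 1 - PRESS / TSS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with sample i held out, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    tss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / tss


# Hypothetical descriptor values and responses, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
response = [1.0, 3.2, 4.9, 7.1, 9.0]
print(f"LOO Q^2 = {loo_q2(descriptor, response):.4f}")
```

Each sample is predicted by a model that never saw it, which is why the procedure suits small data sets such as the 130 values described above.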
Affiliation(s)
- Julien Parinet
- Aix-Marseille Université, Laboratoire Chimie Environnement, FRE 3416-CNRS, Marseille, France
- Maxime Julien
- Université de Nantes, Chimie et Interdisciplinarité: Synthèse, Analyse et Modélisation, UMR 6230-CNRS, Nantes, France
- Pierrick Nun
- Université de Nantes, Chimie et Interdisciplinarité: Synthèse, Analyse et Modélisation, UMR 6230-CNRS, Nantes, France
- Richard J Robins
- Université de Nantes, Chimie et Interdisciplinarité: Synthèse, Analyse et Modélisation, UMR 6230-CNRS, Nantes, France
- Gerald Remaud
- Université de Nantes, Chimie et Interdisciplinarité: Synthèse, Analyse et Modélisation, UMR 6230-CNRS, Nantes, France
- Patrick Höhener
- Aix-Marseille Université, Laboratoire Chimie Environnement, FRE 3416-CNRS, Marseille, France.
7
Jiao L, Wang X, Bing S, Xue Z, Li H. QSPR study on the photolysis half-life of PCDD/Fs adsorbed on spruce (Picea abies (L.) Karst.) needle surfaces under sunlight irradiation by using a molecular distance-edge vector index. RSC Adv 2015. [DOI: 10.1039/c4ra14178d] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
QSPR study on the photolysis half-life of PCDD/Fs adsorbed to spruce (Picea abies (L.) Karst.) needle surfaces under sunlight irradiation.
Affiliation(s)
- Long Jiao
- College of Chemistry and Chemical Engineering, Xi'an Shiyou University, Xi'an 710065, P. R. China; College of Chemistry and Materials Science
- Xiaofei Wang
- College of Chemistry and Chemical Engineering, Xi'an Shiyou University, Xi'an 710065, P. R. China
- Shan Bing
- College of Chemistry and Chemical Engineering, Xi'an Shiyou University, Xi'an 710065, P. R. China
- Zhiwei Xue
- No. 203 Research Institute of Nuclear Industry, Xianyang 712000, P. R. China
- Hua Li
- College of Chemistry and Materials Science, Northwest University, Xi'an 710069, P. R. China
8
Zhou J, Xu Z, Chen S. Simulation and prediction of the thuringiensin abiotic degradation processes in aqueous solution by a radius basis function neural network model. CHEMOSPHERE 2013; 91:442-447. [PMID: 23273880 DOI: 10.1016/j.chemosphere.2012.11.062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2012] [Accepted: 11/24/2012] [Indexed: 06/01/2023]
Abstract
The thuringiensin abiotic degradation processes in aqueous solution under different conditions, with a pH range of 5.0-9.0 and a temperature range of 10-40°C, were systematically investigated by an exponential decay model and a radius basis function (RBF) neural network model, respectively. The half-lives of thuringiensin calculated by the exponential decay model ranged from 2.72 d to 16.19 d under the different conditions mentioned above. Furthermore, an RBF model with accuracy of 0.1 and SPREAD value 5 was employed to model the degradation processes. The results showed that the model could simulate and predict the degradation processes well. Both the half-lives and the prediction data showed that thuringiensin was an easily degradable antibiotic, which could be an important factor in the evaluation of its safety.
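To make the reported half-lives concrete, the sketch below converts them to rate constants, assuming the exponential decay model is the usual first-order form C(t) = C0·exp(−kt) with t½ = ln 2 / k. The abstract does not spell out the model equation, so that form, the function names, and the 7-day horizon are assumptions for illustration.

```python
import math


def rate_constant(half_life_d):
    """First-order rate constant (per day) from a half-life in days."""
    return math.log(2) / half_life_d


def remaining_fraction(t_d, half_life_d):
    """Fraction C(t)/C0 left after t_d days under first-order decay."""
    return math.exp(-rate_constant(half_life_d) * t_d)


# The two extreme half-lives reported in the abstract (days).
for half_life in (2.72, 16.19):
    k = rate_constant(half_life)
    left = remaining_fraction(7.0, half_life)
    print(f"t1/2 = {half_life} d: k = {k:.4f} per day, "
          f"fraction left after 7 d = {left:.3f}")
```

Under this assumption the fraction remaining after one half-life is exactly 0.5, which gives a quick sanity check on any fitted rate constant.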
Affiliation(s)
- Jingwen Zhou
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, 1800 Lihu Road, Wuxi, Jiangsu 214122, China
9
Xu HY, Zou JW, Min JQ, Wang W. A quantitative structure-property relationship analysis of soot-water partition coefficients for persistent organic pollutants. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2012; 80:1-5. [PMID: 22377400 DOI: 10.1016/j.ecoenv.2012.02.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2011] [Revised: 01/31/2012] [Accepted: 02/02/2012] [Indexed: 05/31/2023]
Abstract
Geometrical optimization and electrostatic potential calculations have been performed at the HF/6-31G level of theory for the investigated persistent organic pollutants (POPs). A number of statistically based parameters have been obtained. The relationship between the soot-water partition coefficients (log K_SC) of POPs and the structural descriptors has been established by the multiple linear regression method. The result shows that the quantities derived from the electrostatic potential, V̄_s⁻ and V_s,max, together with the molecular surface area (A_S) and the energy of the highest occupied molecular orbital (E_HOMO), can be used to express the quantitative relationship between structure and log K_SC (QSPR) of POPs. The predictive capability of the model has been demonstrated by leave-one-out cross-validation with a cross-validated correlation coefficient of 0.9797. The predictive power of this model was further examined for the external test set, with a correlation coefficient of 0.9811 between observed and predicted log K_SC, validating the robustness and good predictive ability of our model. Furthermore, in order to investigate the applicability of these parameters derived from the electrostatic potential in predicting soot-water partition coefficients for organic pollutants, eleven polycyclic aromatic hydrocarbons (PAHs), eleven polychlorinated biphenyls (PCBs) and nine phenyl urea herbicides (PUHs) from another source have also been studied. The QSPR models established may provide a new powerful method for predicting soot-water partition coefficients (log K_SC) of organic pollutants.
Affiliation(s)
- Hui-Ying Xu
- College of Biology & Environment Engineering, Zhejiang Shuren University, Hangzhou 310015, China.