1
|
Huang BY, Lü QX, Tang ZX, Tang Z, Chen HP, Yang XP, Zhao FJ, Wang P. Machine learning methods to predict cadmium (Cd) concentration in rice grain and support soil management at a regional scale. FUNDAMENTAL RESEARCH 2024; 4:1196-1205. [PMID: 39431142 PMCID: PMC11489518 DOI: 10.1016/j.fmre.2023.02.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Revised: 01/15/2023] [Accepted: 02/28/2023] [Indexed: 03/12/2023] Open
Abstract
Rice is a major dietary source of the toxic metal cadmium (Cd). Concentration of Cd in rice grain varies widely at the regional scale, and it is challenging to predict grain Cd concentration using soil properties. The lack of reliable predictive models hampers management of contaminated soils. Here, we conducted a three-year survey of 601 pairs of soil and rice samples at a regional scale. Approximately 78.3% of the soil samples exceeded the soil screening values for Cd in China, and 53.9% of rice grain samples exceeded the Chinese maximum permissible limit for Cd. Predictive models were developed using multiple linear regression and machine learning methods. The correlations between rice grain Cd and soil total Cd concentrations were poor (R² < 0.17). Both linear regression and machine learning methods identified four key factors that significantly affect grain Cd concentrations: Fe-Mn oxide bound Cd, soil pH, field soil moisture content, and the concentration of soil reducible Mn. The machine learning-based support vector machine model showed the best performance (R² = 0.87) in predicting grain Cd concentrations at a regional scale, followed by the random forest model (R² = 0.67) and the back propagation neural network model (R² = 0.64). Scenario simulations revealed that liming soil to a target pH of 6.5 could be one of the most cost-effective approaches to reduce the exceedance of Cd in rice grain. Taken together, these results show that machine learning methods can be used to predict Cd concentration in rice grain reliably at a regional scale and to support soil management and safe rice production.
Affiliation(s)
- Bo-Yang Huang
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Qi-Xin Lü
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Zhi-Xian Tang
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Zhong Tang
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Hong-Ping Chen
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Xin-Ping Yang
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Fang-Jie Zhao
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
| | - Peng Wang
- Jiangsu Collaborative Innovation Center for Solid Organic Waste Resource Utilization, College of Resources and Environmental Sciences, Nanjing Agricultural University, Nanjing 210095, China
- Centre for Agriculture and Health, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing 210095, China
| |
|
2
|
Xiang T, Liu Y, Guo Y, Zhang J, Liu J, Yao L, Mao Y, Yang X, Liu J, Liu R, Jin X, Shi J, Qu G, Jiang G. Occurrence and Prioritization of Human Androgen Receptor Disruptors in Sewage Sludges Across China. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:10309-10321. [PMID: 38795035 DOI: 10.1021/acs.est.4c02476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2024]
Abstract
The global practice of reusing sewage sludge in agriculture and its landfill disposal reintroduces environmental contaminants, posing risks to human and ecological health. This study screened sewage sludge from 30 Chinese cities for androgen receptor (AR) disruptors, utilizing a disruptor list from the Toxicology in the 21st Century program (Tox21), and identified 25 agonists and 33 antagonists across diverse use categories. Predominantly, the natural products 5α-dihydrotestosterone and thymidine emerged as agonists, whereas the industrial intermediate caprolactam was the principal antagonist. In-house bioassays for identified disruptors displayed good alignment with Tox21 potency data, validating the use of Tox21 toxicity data for theoretical toxicity estimation. Potency calculations revealed 5α-dihydrotestosterone and two pharmaceuticals (17β-trenbolone and testosterone isocaproate) as the most potent AR agonists and three dyes (rhodamine 6G, Victoria blue BO, and gentian violet) as the most potent antagonists. Theoretical effect contribution evaluations prioritized 5α-dihydrotestosterone and testosterone isocaproate as high-risk AR agonists and caprolactam, rhodamine 6G, and 8-hydroxyquinoline (as a biocide and a preservative) as key antagonists. Notably, 16 agonists and 20 antagonists were newly reported in the sludge, many exhibiting significant detection frequencies, concentrations, and/or toxicities, demanding future scrutiny. Our study presents an efficient strategy for estimating environmental sample toxicity and identifying key toxicants, thereby supporting the development of appropriate sludge management strategies.
Affiliation(s)
- Tongtong Xiang
- College of Sciences, Northeastern University, Shenyang 110004, China
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Yanna Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Yunhe Guo
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- College of Environmental and Resource Science, Zhejiang University, Hangzhou 310058, China
| | - Jie Zhang
- School of Environmental Science and Engineering, Shandong University, Qingdao 266237, China
| | - Jifu Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310000, China
| | - Linlin Yao
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Yuxiang Mao
- School of Resources and Environment, Henan Polytechnic University, Jiaozuo 454000, China
| | - Xiaoxi Yang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Jun Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Runzeng Liu
- Shandong Key Laboratory of Environmental Processes and Health, School of Environmental Science and Engineering, Shandong University, Qingdao 266237, China
| | - Xiaoting Jin
- Department of Occupational Health and Environmental Health, School of Public Health, Qingdao University, Qingdao 266071, China
| | - Jianbo Shi
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- School of Environmental Studies, China University of Geosciences, Wuhan 430074, China
| | - Guangbo Qu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310000, China
| | - Guibin Jiang
- College of Sciences, Northeastern University, Shenyang 110004, China
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- College of Environmental and Resource Science, Zhejiang University, Hangzhou 310058, China
- School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310000, China
| |
|
3
|
Zhao L, Xue Q, Zhang H, Hao Y, Yi H, Liu X, Pan W, Fu J, Zhang A. CatNet: Sequence-based deep learning with cross-attention mechanism for identifying endocrine-disrupting chemicals. JOURNAL OF HAZARDOUS MATERIALS 2024; 465:133055. [PMID: 38016311 DOI: 10.1016/j.jhazmat.2023.133055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/02/2023] [Accepted: 11/20/2023] [Indexed: 11/30/2023]
Abstract
Endocrine-disrupting chemicals (EDCs) pose significant environmental and health risks due to their potential to interfere with nuclear receptors (NRs), key regulators of physiological processes. Despite the evident risks, the majority of existing research narrows its focus to the interaction between compounds and an individual NR target, neglecting a comprehensive assessment across the entire NR family. In response, this study assembled a comprehensive human NR dataset, capturing 49,244 interactions between 35,467 unique compounds and 42 NRs. We introduced a cross-attention network framework, "CatNet", innovatively integrating compound and protein representations through cross-attention mechanisms. The results showed that the CatNet model achieved excellent performance, with an area under the receiver operating characteristic curve (AUCROC) of 0.916 on the test set, and exhibited reliable generalization on unseen compound-NR pairs. A distinguishing feature of our research is its capacity to expand to novel targets. Beyond its predictive accuracy, CatNet offers a valuable mechanistic perspective on compound-NR interactions through feature visualization. Augmenting the utility of our research, we have also developed a graphical user interface, empowering researchers to predict chemical binding to diverse NRs. Our model enables the prediction of human NR-related EDCs and shows the potential to identify EDCs related to other targets.
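The cross-attention idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the CatNet architecture: it assumes plain scaled dot-product attention in which compound token vectors act as queries and protein token vectors act as keys and values, and it omits the learned projection matrices and multiple heads a real model would use. The token vectors and the names `cross_attention`, `compound_tokens`, and `protein_tokens` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query row attends over all key/value rows.

    Returns (outputs, weights); weights[i][j] is how strongly query i
    attends to key j. No learned projections are applied in this sketch.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical toy embeddings: 2 compound tokens, 3 protein tokens, dimension 2.
compound_tokens = [[1.0, 0.0], [0.0, 1.0]]
protein_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended, attn = cross_attention(compound_tokens, protein_tokens, protein_tokens)
```

Each row of `attn` is a probability distribution over protein tokens, which is the kind of quantity that attention-based feature visualization typically inspects.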
Affiliation(s)
- Lu Zhao
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China
| | - Qiao Xue
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China.
| | - Huazhou Zhang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China
| | - Yuxing Hao
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China
| | - Hang Yi
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China
| | - Xian Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
| | - Wenxiao Pan
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
| | - Jianjie Fu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310012, PR China
| | - Aiqian Zhang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310012, PR China.
| |
|
4
|
Wiriyarattanakul A, Xie W, Toopradab B, Wiriyarattanakul S, Shi L, Rungrotmongkol T, Maitarad P. Comparative Study of Machine Learning-Based QSAR Modeling of Anti-inflammatory Compounds from Durian Extraction. ACS OMEGA 2024; 9:7817-7826. [PMID: 38405441 PMCID: PMC10882656 DOI: 10.1021/acsomega.3c07386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2023] [Revised: 11/15/2023] [Accepted: 12/27/2023] [Indexed: 02/27/2024]
Abstract
Quantitative structure-activity relationship (QSAR) analysis, an in silico methodology, offers enhanced efficiency and cost effectiveness in investigating anti-inflammatory activity. In this study, a comprehensive comparative analysis employing four machine learning algorithms (random forest (RF), gradient boosting regression (GBR), support vector regression (SVR), and artificial neural networks (ANNs)) was conducted to elucidate the activities of naturally derived compounds from durian extraction. The analysis was grounded in the exploration of structural attributes encompassing steric and electrostatic properties. Notably, the nonlinear SVR model, utilizing five key features, exhibited superior performance compared to the other models. It demonstrated exceptional predictive accuracy for both the training and external test datasets, yielding R² values of 0.907 and 0.812, respectively, with corresponding RMSE values of 0.123 and 0.097. The study outcomes underscore the significance of specific structural factors (denoted as shadow ratio, dipole z, methyl, ellipsoidal volume, and methoxy) in determining anti-inflammatory efficacy. Thus, the findings highlight the potential of molecular simulations and machine learning as alternative avenues for the rational design of novel anti-inflammatory agents.
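For readers unfamiliar with the two metrics quoted above, the following sketch shows how R² (coefficient of determination) and RMSE are conventionally computed from observed and predicted values. The numbers are made up for illustration and are not data from the study; the function names are hypothetical.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Illustrative values only (not from the paper).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print(round(r_squared(observed, predicted), 3))  # 0.98
print(round(rmse(observed, predicted), 3))       # 0.158
```

R² is unitless, while RMSE carries the unit of the modeled activity, so the two are best read together when comparing training and external test performance.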
Affiliation(s)
- Amphawan Wiriyarattanakul
- Program in Chemistry, Faculty of Science and Technology, Uttaradit Rajabhat University, Uttaradit 53000, Thailand
| | - Wanting Xie
- Research Center of Nano Science and Technology, College of Sciences, Shanghai University, Shanghai 200444, P. R. China
| | - Borwornlak Toopradab
- Center of Excellence in Structural and Computational Biology, Department of Biochemistry, Chulalongkorn University, Bangkok 10330, Thailand
- Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
| | - Sopon Wiriyarattanakul
- Program in Computer Science, Faculty of Science and Technology, Uttaradit Rajabhat University, Uttaradit 53000, Thailand
| | - Liyi Shi
- Research Center of Nano Science and Technology, College of Sciences, Shanghai University, Shanghai 200444, P. R. China
- Emerging Industries Institute Shanghai University, Jiaxing, Zhejiang 314006, P. R. China
| | - Thanyada Rungrotmongkol
- Center of Excellence in Structural and Computational Biology, Department of Biochemistry, Chulalongkorn University, Bangkok 10330, Thailand
- Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
| | - Phornphimon Maitarad
- Research Center of Nano Science and Technology, College of Sciences, Shanghai University, Shanghai 200444, P. R. China
| |
|
5
|
Shi W, Lin K, Zhao Y, Li Z, Zhou T. Toward a comprehensive understanding of alicyclic compounds: Bio-effects perspective and deep learning approach. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 912:168927. [PMID: 38042202 DOI: 10.1016/j.scitotenv.2023.168927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 11/17/2023] [Accepted: 11/25/2023] [Indexed: 12/04/2023]
Abstract
The escalating use of alicyclic compounds in modern industrial production has led to a rapid increase of these substances in the environment, posing significant health hazards. Addressing this challenge necessitates a comprehensive understanding of these compounds, which can be achieved through a deep learning approach. Graph neural networks (GNNs), known for their extraordinary ability to process graph data with rich relationships, have been employed in various molecular prediction tasks. In this study, alicyclic molecules screened from PCBA, ToxCast, and Tox21 were assembled into two prediction datasets, one for general bioactivity and one for activity against biological targets. GNN-based models were trained on the two datasets, with Attentive FP and PAGTN achieving the best performance on each, respectively. In addition, alicyclic carbon atoms make the greatest contribution to biological activity, which indicates that alicyclic structures have a significant impact on the carbon atoms' contribution. Moreover, there is a large number of active molecules in other public datasets, indicating that alicyclic compounds deserve more attention in POPs control. This study uncovered deeper structure-activity relationships within these compounds, offering new perspectives and methodologies for academic research in the field.
Affiliation(s)
- Wenjie Shi
- The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China.
| | - Kunsen Lin
- The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China.
| | - Youcai Zhao
- The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China; Shanghai Institute of Pollution Control and Ecological Security, 1515 North Zhongshan Rd. (No. 2), Shanghai 200092, PR China
| | - Zongsheng Li
- The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China
| | - Tao Zhou
- The State Key Laboratory of Pollution Control and Resource Reuse, School of Environmental Science and Engineering, Tongji University, 1239 Siping Road, Shanghai 200092, China; Shanghai Institute of Pollution Control and Ecological Security, 1515 North Zhongshan Rd. (No. 2), Shanghai 200092, PR China.
| |
|
6
|
Yang Z, Wang L, Yang Y, Pang X, Sun Y, Liang Y, Cao H. Screening of the Antagonistic Activity of Potential Bisphenol A Alternatives toward the Androgen Receptor Using Machine Learning and Molecular Dynamics Simulation. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:2817-2829. [PMID: 38291630 DOI: 10.1021/acs.est.3c09779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Over the past few decades, extensive research has indicated that exposure to bisphenol A (BPA) increases the health risks in humans. Toxicological studies have demonstrated that BPA can bind to the androgen receptor (AR), resulting in endocrine-disrupting effects. In recent investigations, many alternatives to BPA have been detected in various environmental media as major pollutants. However, related experimental evaluations of BPA alternatives have not been systematically implemented for the assessment of chemical safety and the effects of structural characteristics on the antagonistic activity of the AR. To promote the green development of BPA alternatives, high-throughput toxicological screening is fundamental for prioritizing chemical tests. Therefore, we proposed a hybrid deep learning architecture that combines molecular descriptors and molecular graphs to predict AR antagonistic activity. Compared to previous models, this hybrid architecture can extract substantial chemical information from various molecular representations to improve the model's generalization ability for BPA alternatives. Our predictions suggest that lignin-derivable bisguaiacols, as alternatives to BPA, are likely to be nonantagonist for AR compared to bisphenol analogues. Additionally, molecular dynamics (MD) simulations identified the dihydrotestosterone-bound pocket, rather than the surface, as the major binding site of bisphenol analogues. The conformational changes of key helix H12 from an agonistic to an antagonistic conformation can be evaluated qualitatively by accelerated MD simulations to explain the underlying mechanism. Overall, our computational study is helpful for toxicological screening of BPA alternatives and the design of environmentally friendly BPA alternatives.
Affiliation(s)
- Zeguo Yang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| | - Ling Wang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| | - Ying Yang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| | - Xudi Pang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| | - Yuzhen Sun
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| | - Yong Liang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| | - Huiming Cao
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
| |
|
7
|
Wu X, Ren J, Xu Q, Xiao Y, Li X, Peng Y. Priority screening of contaminant of emerging concern (CECs) in surface water from drinking water sources in the lower reaches of the Yangtze River based on exposure-activity ratios (EARs). THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 856:159016. [PMID: 36162578 DOI: 10.1016/j.scitotenv.2022.159016] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/14/2022] [Revised: 09/20/2022] [Accepted: 09/20/2022] [Indexed: 06/16/2023]
Abstract
Surface water provides ecological services such as drinking water supply. However, contaminants of emerging concern (CECs) are of rising concern because they are ubiquitously detected in surface water and pose potential risks to the aquatic environment and human health. This study investigated the occurrence of 165 CECs in surface water from drinking water source areas along the lower reaches of the Yangtze River to prioritize the CECs and to estimate potential biological activity based on the exposure-activity ratio (EAR). A total of 70 CECs were detected in the surface water at least once at the selected 17 sampling sites, and their concentrations ranged from 0.592 to 4650 ng/L. Twenty-four CECs were detected at each site, and these were mostly pharmaceutical and personal care products and pesticides. Sucralose, 1H-benzotriazole and carbendazim were the most common CECs with high median concentrations in the study area. Specifically, sucralose, an artificial sweetener, was present at each site with the highest median concentration (3010 ng/L), which indicated that anthropogenic inputs are an important source of contaminants. Medroxyprogesterone and trenbolone were identified as the priority contaminants of interest, with maximum EARchemical values of 0.389 and 0.183, respectively. Among all the sites, the highest cumulative EARmixture value was found at Nantong City (0.765), which indicated that this site could have a relatively greater potential for biological effects, and these effects were mainly due to medroxyprogesterone and trenbolone. In regard to the bioactivity of all detected CECs, nuclear receptors showed the greatest potential bioactivity in this region, particularly androgen receptor-mediated bioactivity, which most likely affects organisms residing in the source water area. These results suggest that the drinking water sources from the studied region are contaminated with CECs, and highlight the prioritization of future monitoring and research to protect source waters.
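The abstract does not spell out how the EAR values were calculated. The sketch below assumes the commonly used definition, in which a chemical's measured concentration is divided by an in vitro activity concentration (for example, a ToxCast activity concentration at cutoff), and a site's mixture value is the sum of the per-chemical ratios. The chemical names, concentrations, and function names are hypothetical and are not taken from the study.

```python
def exposure_activity_ratio(concentration, activity_concentration):
    """EAR for one chemical: measured concentration / in vitro activity concentration.

    Both arguments must be expressed in the same unit (e.g. micromolar).
    """
    if activity_concentration <= 0:
        raise ValueError("activity concentration must be positive")
    return concentration / activity_concentration


def ear_mixture(site_concentrations, activity_concentrations):
    """Cumulative EAR at one site, assuming simple additivity across chemicals.

    Chemicals without a known activity concentration are skipped.
    """
    return sum(
        exposure_activity_ratio(conc, activity_concentrations[name])
        for name, conc in site_concentrations.items()
        if name in activity_concentrations
    )


# Hypothetical values in micromolar (not data from the study).
site = {"chem_a": 0.002, "chem_b": 0.0005, "chem_c": 0.01}
acc = {"chem_a": 0.01, "chem_b": 0.005}  # chem_c has no assay data

per_chemical = {name: exposure_activity_ratio(site[name], acc[name]) for name in acc}
total = ear_mixture(site, acc)
```

Under this assumed definition, a chemical with a low environmental concentration can still dominate the mixture value if its activity concentration is also very low, which is consistent with potent steroids ranking as priority contaminants.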
Affiliation(s)
- Xinyi Wu
- Research and Development Center for Watershed Environmental Eco-Engineering, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai 519087, China
| | - Jinzhi Ren
- College of Life Science, Jinan University, Guangzhou 510000, China
| | - Qiang Xu
- School of the Environment, Nanjing University, Nanjing 210023, China
| | - Yao Xiao
- Research and Development Center for Watershed Environmental Eco-Engineering, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai 519087, China
| | - Xia Li
- Research and Development Center for Watershed Environmental Eco-Engineering, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai 519087, China; School of Environment, Beijing Normal University, Beijing 100875, China
| | - Ying Peng
- Research and Development Center for Watershed Environmental Eco-Engineering, Advanced Institute of Natural Sciences, Beijing Normal University, Zhuhai 519087, China; School of Environment, Beijing Normal University, Beijing 100875, China; School of the Environment, Nanjing University, Nanjing 210023, China.
| |
|
8
|
Zhao Q, Yu Y, Gao Y, Shen L, Cui S, Gou Y, Zhang C, Zhuang S, Jiang G. Machine Learning-Based Models with High Accuracy and Broad Applicability Domains for Screening PMT/vPvM Substances. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:17880-17889. [PMID: 36475377 DOI: 10.1021/acs.est.2c06155] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 06/17/2023]
Abstract
Persistent, mobile, and toxic (PMT) substances and very persistent and very mobile (vPvM) substances can be transported over long distances from various sources, increasing the public health risk. Rapid, high-throughput screening of PMT/vPvM substances is thus warranted to support risk prevention and mitigation measures. Herein, we construct a machine learning-based screening system integrated with five models for high-throughput classification of PMT/vPvM substances. The models are constructed with 44 971 substances by conventional learning, deep learning, and ensemble learning algorithms, among which LightGBM and XGBoost outperform other algorithms with metrics exceeding 0.900. Good model interpretability is achieved through the number of free halogen atoms (fr_halogen) and the logarithm of partition coefficient (MolLogP) as the two most critical molecular descriptors representing the persistence and mobility of substances, respectively. Our screening system exhibits a great generalization capability with area under the receiver operating characteristic curve (AUROC) above 0.951 and is successfully applied to the persistent organic pollutants (POPs), prioritized PMT/vPvM substances, and pesticides. The screening system constructed in this study can serve as an efficient and reliable tool for high-throughput risk assessment and the prioritization of managing emerging contaminants.
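As background for the AUROC figure quoted above, the following sketch computes AUROC through its rank interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are invented for illustration, and the function name `auroc` is hypothetical; this is not the evaluation code used in the study.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels; scores: matching classifier scores.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Illustrative labels and scores (not from the paper).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

value = auroc(labels, scores)  # 8 of 9 positive-negative pairs are ranked correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; for datasets of tens of thousands of substances a sort-based implementation would be the usual choice.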
Affiliation(s)
- Qiming Zhao
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yang Yu
- Solid Waste and Chemicals Management Center, Ministry of Ecology and Environment of the People's Republic of China, Beijing 100029, China
| | - Yuchen Gao
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
| | - Lilai Shen
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
| | - Shixuan Cui
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
- Women's Reproductive Health Key Laboratory of Zhejiang Province, Women's Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, China
| | - Yiyuan Gou
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
| | - Chunlong Zhang
- Department of Environmental Sciences, University of Houston-Clear Lake, 2700 Bay Area Blvd., Houston, Texas 77058, United States
| | - Shulin Zhuang
- Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
- Women's Reproductive Health Key Laboratory of Zhejiang Province, Women's Hospital, School of Medicine, Zhejiang University, Hangzhou 310006, China
| | - Guibin Jiang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| |
|
9
|
Wang J, Lou C, Liu G, Li W, Wu Z, Tang Y. Profiling prediction of nuclear receptor modulators with multi-task deep learning methods: toward the virtual screening. Brief Bioinform 2022; 23:6673852. [PMID: 35998896 DOI: 10.1093/bib/bbac351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 07/13/2022] [Accepted: 07/27/2022] [Indexed: 11/13/2022] Open
Abstract
Nuclear receptors (NRs) are ligand-activated transcription factors, which constitute one of the most important targets for drug discovery. Current computational strategies mainly focus on a single target, and the transfer of learned knowledge among NRs was not considered yet. Herein we proposed a novel computational framework named NR-Profiler for prediction of potential NR modulators with high affinity and specificity. First, we built a comprehensive NR data set including 42 684 interactions to connect 42 NRs and 31 033 compounds. Then, we used multi-task deep neural network and multi-task graph convolutional neural network architectures to construct multi-task multi-classification models. To improve the predictive capability and robustness, we built a consensus model with an area under the receiver operating characteristic curve (AUC) = 0.883. Compared with conventional machine learning and structure-based approaches, the consensus model showed better performance in external validation. Using this consensus model, we demonstrated the practical value of NR-Profiler in virtual screening for NRs. In addition, we designed a selectivity score to quantitatively measure the specificity of NR modulators. Finally, we developed a freely available standalone software for users to make profiling predictions for their compounds of interest. In summary, our NR-Profiler provides a useful tool for NR-profiling prediction and is expected to facilitate NR-based drug discovery.
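As a rough illustration of the consensus idea and the AUC metric mentioned in this abstract, the sketch below averages the predicted probabilities of two models and scores the result with a rank-based AUC. The abstract does not say how the NR-Profiler consensus is formed, so simple averaging is an assumption made here, and the names `consensus`, `roc_auc`, `dnn_probs`, and `gcn_probs` as well as the toy numbers are hypothetical.

```python
def consensus(prob_a, prob_b):
    """Average two models' predicted probabilities (assumed consensus rule)."""
    return [(a + b) / 2 for a, b in zip(prob_a, prob_b)]


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [1, 0, 1, 0]
dnn_probs = [0.9, 0.4, 0.3, 0.2]   # hypothetical multi-task DNN outputs
gcn_probs = [0.7, 0.6, 0.8, 0.1]   # hypothetical multi-task GCN outputs

print(roc_auc(labels, dnn_probs))
print(roc_auc(labels, consensus(dnn_probs, gcn_probs)))
```

In a multi-task setting the same calculation would be repeated per receptor, using only the compounds that have a label for that receptor.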
Affiliation(s)
- Jiye Wang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Chaofeng Lou
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Guixia Liu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Weihua Li
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Zengrui Wu
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| | - Yun Tang
- Shanghai Frontiers Science Center of Optogenetic Techniques for Cell Metabolism, School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
| |
|
10
|
Zushi Y. Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC-MS. Anal Chem 2022; 94:9149-9157. [PMID: 35700270 PMCID: PMC9246259 DOI: 10.1021/acs.analchem.2c01667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
With advances in machine learning (ML) techniques, the quantitative structure–activity relationship (QSAR) approach is becoming popular for evaluating chemicals. However, the QSAR approach requires that the chemical structure of the target compound is known and that it should be convertible to molecular descriptors. These requirements lead to limitations in predicting the properties and toxicities of chemicals distributed in the environment as in the PubChem database; the structural information on only 14% of compounds is available. This study proposes a new ML-based QSAR approach that can predict the properties and toxicities of compounds using analytical descriptors of mass spectrum and retention index obtained via gas chromatography–mass spectrometry without requiring exact structural information. The model was developed based on the XGBoost ML method. The root-mean-square errors (RMSEs) for log Ko-w, log (molecular weight), melting point, boiling point, log (vapor pressure), log (water solubility), log (LD50) (rat, oral), and log (LD50) (mouse, oral) are 0.97, 0.052, 51, 23, 0.74, 1.1, 0.74, and 0.6, respectively. The model performed well on a chemical standard mixture measurement, with similar results to those of model validation. It also performed well on a measurement of contaminated oil with spectral deconvolution. These results indicate that the model is suitable for investigating unknown-structured chemicals detected in measurements. Any online user can execute the model through a web application named Detective-QSAR (http://www.mixture-platform.net/Detective_QSAR_Med_Open/). The analytical descriptor-based approach is expected to create new opportunities for the evaluation of unknown chemicals around us.
Affiliation(s)
- Yasuyuki Zushi
- Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology, 16-1 Onogawa, Tsukuba, Ibaraki 305-8506, Japan; Graduate School of Science and Technology, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
| |
|
11
|
Sui S, Liu H, Yang X. Research Progress of the Endocrine-Disrupting Effects of Disinfection Byproducts. J Xenobiot 2022; 12:145-157. [PMID: 35893263 PMCID: PMC9326600 DOI: 10.3390/jox12030013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/30/2022] [Revised: 06/20/2022] [Accepted: 06/22/2022] [Indexed: 11/16/2022]
Abstract
Since 1974, more than 800 disinfection byproducts (DBPs) have been identified from disinfected drinking water, swimming pool water, wastewaters, etc. Some DBPs are recognized as contaminants of high environmental concern because they may induce many detrimental health (e.g., cancer, cytotoxicity, and genotoxicity) and/or ecological (e.g., acute toxicity and development toxicity on alga, crustacean, and fish) effects. However, the information on whether DBPs may elicit potential endocrine-disrupting effects in human and wildlife is scarce. It is the major objective of this paper to summarize the reported potential endocrine-disrupting effects of the identified DBPs in the view of adverse outcome pathways (AOPs). In this regard, we introduce the potential molecular initiating events (MIEs), key events (KEs), and adverse outcomes (AOs) associated with exposure to specific DBPs. The present evidence indicates that the endocrine system of organism can be perturbed by certain DBPs through some MIEs, including hormone receptor-mediated mechanisms and non-receptor-mediated mechanisms (e.g., hormone transport protein). Lastly, the gaps in our knowledge of the endocrine-disrupting effects of DBPs are highlighted, and critical directions for future studies are proposed.
|
12
|
Jeong J, Choi J. Artificial Intelligence-Based Toxicity Prediction of Environmental Chemicals: Future Directions for Chemical Management Applications. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:7532-7543. [PMID: 35666838 DOI: 10.1021/acs.est.1c07413] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Indexed: 06/15/2023]
Abstract
Recently, research on the development of artificial intelligence (AI)-based computational toxicology models that predict toxicity without the use of animal testing has emerged because of the rapid development of computer technology. Various computational toxicology techniques that predict toxicity based on the structure of chemical substances are gaining attention, including the quantitative structure-activity relationship. To understand the recent development of these models, we analyzed the databases, molecular descriptors, fingerprints, and algorithms considered in recent studies. Based on a selection of 96 papers published since 2014, we found that AI models have been developed to predict approximately 30 different toxicity end points using more than 20 toxicity databases. For model development, molecular access system and extended-connectivity fingerprints are the most commonly used molecular descriptors. The most used algorithm among the machine learning techniques is the random forest, while the most used algorithm among the deep learning techniques is a deep neural network. The use of AI technology in the development of toxicity prediction models is a new concept that will aid in achieving a scientific accord and meet regulatory applications. The comprehensive overview provided in this study will provide a useful guide for the further development and application of toxicity prediction models.
Affiliation(s)
- Jaeseong Jeong
- School of Environmental Engineering, University of Seoul, 163 Seoulsiripdae-ro, Dongdaemun-gu, Seoul 02504, South Korea
| | - Jinhee Choi
- School of Environmental Engineering, University of Seoul, 163 Seoulsiripdae-ro, Dongdaemun-gu, Seoul 02504, South Korea
| |
|
13
|
Sellami A, Réau M, Montes M, Lagarde N. Review of in silico studies dedicated to the nuclear receptor family: Therapeutic prospects and toxicological concerns. Front Endocrinol (Lausanne) 2022; 13:986016. [PMID: 36176461 PMCID: PMC9513233 DOI: 10.3389/fendo.2022.986016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022]
Abstract
Being in the center of both therapeutic and toxicological concerns, NRs are widely studied for drug discovery application but also to unravel the potential toxicity of environmental compounds such as pesticides, cosmetics or additives. High throughput screening campaigns (HTS) are largely used to detect compounds able to interact with this protein family for both therapeutic and toxicological purposes. These methods lead to a large amount of data requiring the use of computational approaches for a robust and correct analysis and interpretation. The output data can be used to build predictive models to forecast the behavior of new chemicals based on their in vitro activities. This article is a review of the studies published in the last decade and dedicated to NR ligands in silico prediction for both therapeutic and toxicological purposes. Over 100 articles concerning 14 NR subfamilies were carefully read and analyzed in order to retrieve the most commonly used computational methods to develop predictive models, to retrieve the databases deployed in the model building process and to pinpoint some of the limitations they faced.
|
14
|
García-Sosa AT. Androgen Receptor Binding Category Prediction with Deep Neural Networks and Structure-, Ligand-, and Statistically Based Features. Molecules 2021; 26:1285. [PMID: 33652992 PMCID: PMC7956632 DOI: 10.3390/molecules26051285] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/15/2021] [Revised: 02/23/2021] [Accepted: 02/24/2021] [Indexed: 01/10/2023]
Abstract
Substances that can modify the androgen receptor pathway in humans and animals are entering the environment and food chain with the proven ability to disrupt hormonal systems and leading to toxicity and adverse effects on reproduction, brain development, and prostate cancer, among others. State-of-the-art databases with experimental data of human, chimp, and rat effects by chemicals have been used to build machine-learning classifiers and regressors and to evaluate these on independent sets. Different featurizations, algorithms, and protein structures lead to different results, with deep neural networks (DNNs) on user-defined physicochemically relevant features developed for this work outperforming graph convolutional, random forest, and large featurizations. The results show that these user-provided structure-, ligand-, and statistically based features and specific DNNs provided the best results as determined by AUC (0.87), MCC (0.47), and other metrics and by their interpretability and chemical meaning of the descriptors/features. In addition, the same features in the DNN method performed better than in a multivariate logistic model: validation MCC = 0.468 and training MCC = 0.868 for the present work compared to evaluation set MCC = 0.2036 and training set MCC = 0.5364 for the multivariate logistic regression on the full, unbalanced set. Techniques of this type may improve AR and toxicity description and prediction, improving assessment and design of compounds. Source code and data are available on github.
|
15
|
Lane TR, Foil DH, Minerali E, Urbina F, Zorn KM, Ekins S. Bioactivity Comparison across Multiple Machine Learning Algorithms Using over 5000 Datasets for Drug Discovery. Mol Pharm 2020; 18:403-415. [PMID: 33325717 DOI: 10.1021/acs.molpharmaceut.0c01013] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/14/2022]
Abstract
Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and applications beyond. In recent studies, we and others have applied multiple machine learning algorithms and modeling metrics and, in some cases, compared molecular descriptors to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL in order to evaluate machine learning methods of interest to them. The largest of these types of studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from ChEMBL for use with the ECFP6 fingerprint and in comparison of our proprietary software Assay Central with random forest, k-nearest neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (three layers). Model performance was assessed using an array of fivefold cross-validation metrics including area-under-the-curve, F1 score, Cohen's kappa, and Matthews correlation coefficient. Based on ranked normalized scores for the metrics or datasets, all methods appeared comparable, while the distance from the top indicated that Assay Central and support vector classification were comparable. Unlike prior studies which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case. If anything, Assay Central may have been at a slight advantage as the activity cutoff for each of the over 5000 datasets representing over 570,000 unique compounds was based on Assay Central performance, although support vector classification seems to be a strong competitor. We also applied Assay Central to perform prospective predictions for the toxicity targets PXR and hERG to further validate these models. This work appears to be the largest scale comparison of these machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors, and machine learning algorithms and further refine the methods for evaluating and comparing such models.
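For readers unfamiliar with the classification metrics named in this abstract, the following minimal sketch computes F1 score, Cohen's kappa, and the Matthews correlation coefficient from the four cells of a binary confusion matrix, using their standard textbook definitions. It is not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical. Area-under-the-curve is omitted because it needs ranked scores rather than counts.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return F1, Cohen's kappa, and MCC for one binary confusion matrix."""
    n = tp + fp + fn + tn

    # F1: harmonic mean of precision and recall, written in count form.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    # MCC: correlation between predicted and true labels.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"f1": f1, "kappa": kappa, "mcc": mcc}


# Invented counts for one hypothetical cross-validation fold.
fold = binary_metrics(tp=40, fp=10, fn=20, tn=30)
print(fold)
```

In a fivefold cross-validation, a function like this would be applied to each held-out fold and the five results averaged.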
Affiliation(s)
- Thomas R Lane
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Daniel H Foil
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Eni Minerali
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Fabio Urbina
- Department of Cell Biology and Physiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7545, United States
| | - Kimberley M Zorn
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| |
|
16
|
Zorn KM, Foil DH, Lane TR, Hillwalker W, Feifarek DJ, Jones F, Klaren WD, Brinkman AM, Ekins S. Comparing Machine Learning Models for Aromatase (P450 19A1). ENVIRONMENTAL SCIENCE & TECHNOLOGY 2020; 54:15546-15555. [PMID: 33207874 PMCID: PMC8194505 DOI: 10.1021/acs.est.0c05771] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 05/12/2023]
Abstract
Aromatase, or cytochrome P450 19A1, catalyzes the aromatization of androgens to estrogens within the body. Changes in the activity of this enzyme can produce hormonal imbalances that can be detrimental to sexual and skeletal development. Inhibition of this enzyme can occur with drugs and natural products as well as environmental chemicals. Therefore, predicting potential endocrine disruption via exogenous chemicals requires that aromatase inhibition be considered in addition to androgen and estrogen pathway interference. Bayesian machine learning methods can be used for prospective prediction from the molecular structure without the need for experimental data. Herein, the generation and evaluation of multiple machine learning models utilizing different sources of aromatase inhibition data are described. These models are applied to two test sets for external validation with molecules relevant to drug discovery from the public domain. In addition, the performance of multiple machine learning algorithms was evaluated by comparing internal five-fold cross-validation statistics of the training data. These methods to predict aromatase inhibition from molecular structure, when used in concert with estrogen and androgen machine learning models, allow for a more holistic assessment of endocrine-disrupting potential of chemicals with limited empirical data and enable the reduction of the use of hazardous substances.
Affiliation(s)
- Kimberley M. Zorn
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, USA
| | - Daniel H. Foil
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, USA
| | - Thomas R. Lane
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, USA
| | - Wendy Hillwalker
- Global Product Safety, SC Johnson and Son, Inc., Racine, WI, USA
| | | | - Frank Jones
- Global Product Safety, SC Johnson and Son, Inc., Racine, WI, USA
| | | | | | - Sean Ekins
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC, USA
| |
|