1
Zhao L, Xue Q, Zhang H, Hao Y, Yi H, Liu X, Pan W, Fu J, Zhang A. CatNet: Sequence-based deep learning with cross-attention mechanism for identifying endocrine-disrupting chemicals. JOURNAL OF HAZARDOUS MATERIALS 2024; 465:133055. [PMID: 38016311 DOI: 10.1016/j.jhazmat.2023.133055] [Received: 09/19/2023] [Revised: 11/02/2023] [Accepted: 11/20/2023] [Indexed: 11/30/2023]
Abstract
Endocrine-disrupting chemicals (EDCs) pose significant environmental and health risks due to their potential to interfere with nuclear receptors (NRs), key regulators of physiological processes. Despite the evident risks, the majority of existing research narrows its focus to the interaction between compounds and an individual NR target, neglecting a comprehensive assessment across the entire NR family. In response, this study assembled a comprehensive human NR dataset, capturing 49,244 interactions between 35,467 unique compounds and 42 NRs. We introduced a cross-attention network framework, "CatNet", which integrates compound and protein representations through cross-attention mechanisms. The results showed that the CatNet model achieved excellent performance, with an area under the receiver operating characteristic curve (AUCROC) of 0.916 on the test set, and exhibited reliable generalization on unseen compound-NR pairs. A distinguishing feature of our research is its capacity to expand to novel targets. Beyond its predictive accuracy, CatNet offers a valuable mechanistic perspective on compound-NR interactions through feature visualization. Augmenting the utility of our research, we have also developed a graphical user interface, empowering researchers to predict chemical binding to diverse NRs. Our model enables the prediction of human NR-related EDCs and shows the potential to identify EDCs related to other targets.
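As a rough illustration of the cross-attention idea named in the abstract, the sketch below lets each compound token attend over a set of protein tokens using scaled dot-product attention. It is a minimal sketch, not CatNet's implementation: the abstract gives no architectural details, so the function names, the toy two-dimensional embeddings, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    Each query row (here: a compound token) attends over all key/value
    rows (here: protein tokens). Returns the attended outputs and the
    attention weights. No learned projections are applied (assumption).
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Hypothetical toy embeddings: 2 compound tokens, 3 protein tokens.
compound_tokens = [[1.0, 0.0], [0.0, 1.0]]
protein_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = cross_attention(compound_tokens, protein_tokens, protein_tokens)
```

Each row of `attn` sums to 1 and indicates how strongly one compound token weights each protein token; per-token weights of this kind are what attention-based feature visualization usually displays.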
Affiliation(s)
- Lu Zhao
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China
- Qiao Xue
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China.
- Huazhou Zhang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China
- Yuxing Hao
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China
- Hang Yi
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China
- Xian Liu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
- Wenxiao Pan
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China
- Jianjie Fu
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310012, PR China
- Aiqian Zhang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, PR China; Sino-Danish College, University of Chinese Academy of Sciences, Beijing 100049, PR China; College of Resources and Environment, University of Chinese Academy of Sciences, Beijing 100190, PR China; School of Environment, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310012, PR China.
2
Cao H, Peng J, Zhou Z, Yang Z, Wang L, Sun Y, Wang Y, Liang Y. Investigation of the Binding Fraction of PFAS in Human Plasma and Underlying Mechanisms Based on Machine Learning and Molecular Dynamics Simulation. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:17762-17773. [PMID: 36282672 DOI: 10.1021/acs.est.2c04400] [Indexed: 06/16/2023]
Abstract
More than 7000 per- and polyfluorinated alkyl substances (PFAS) have been documented in the U.S. Environmental Protection Agency's CompTox Chemicals database. These PFAS can be used in a broad range of industrial and consumer applications but may pose potential environmental issues and health risks. However, little is known about emerging PFAS bioaccumulation to assess their chemical safety. This study focuses specifically on the large and high-quality data set of fluorochemicals from the related environmental and pharmaceutical chemicals databases, and machine learning (ML) models were developed for the classification prediction of the unbound fraction of compounds in plasma. A comprehensive evaluation of the ML models shows that the best blending model yields an accuracy of 0.901 for the test set. The predictions suggest that most PFAS (∼92%) have a high binding fraction in plasma. Introduction of alkaline amino groups is likely to reduce the binding affinities of PFAS with plasma proteins. Molecular dynamics simulations indicate a clear distinction between the high and low binding fractions of PFAS. These computational workflows can be used to predict the bioaccumulation of emerging PFAS and are also helpful for the molecular design of PFAS to prevent the release of high-bioaccumulation compounds into the environment.
Affiliation(s)
- Huiming Cao
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Jianhua Peng
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zhen Zhou
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Zeguo Yang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Ling Wang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yuzhen Sun
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
- Yawei Wang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Yong Liang
- Hubei Key Laboratory of Environmental and Health Effects of Persistent Toxic Substances, School of Environment and Health, Jianghan University, Wuhan 430056, China
3
Tan H, Gao P, Luo Y, Gou X, Xia P, Wang P, Yan L, Zhang S, Guo J, Zhang X, Yu H, Shi W. Are New Phthalate Ester Substitutes Safer than Traditional DBP and DiBP? Comparative Endocrine-Disrupting Analyses on Zebrafish Using In Vivo, Transcriptome, and In Silico Approaches. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:13744-13756. [PMID: 37677100 DOI: 10.1021/acs.est.3c03282] [Indexed: 09/09/2023]
Abstract
Although previous studies have confirmed the association between phthalate esters (PAEs) exposure and endocrine disorders in humans, few studies to date have systematically assessed the threats of new PAE alternatives to endocrine disruptions. Herein, zebrafish embryos were continuously exposed to two PAEs [di-n-butyl phthalate (DBP) and diisobutyl phthalate (DiBP)], two structurally related alternatives [diisononyl phthalate (DINP) and diisononyl hexahydrophthalate (DINCH)], and two non-PAE substitutes [dipropylene glycol dibenzoate (DGD) and glyceryl triacetate (GTA)], and the endocrine-disrupting effects were investigated during the early stages (8-48 hpf). For five endogenous hormones, including progesterone, testosterone, 17β-estradiol, triiodothyronine (T3), and cortisol, the tested chemicals disturbed the contents of at least one hormone at environmentally relevant concentrations (≤3.9 μM), except DINCH and GTA. Then, the concentration-dependent reduced zebrafish transcriptome analysis was performed. Thyroid hormone (TH)- and androgen/estrogen-regulated adverse outcome pathways (AOPs) were the two types of biological pathways most sensitive to PAE exposure. Notably, six compounds disrupted four TH-mediated AOPs, from the inhibition of deiodinases (molecular initiating event, MIE), a decrease in T3 levels (key event, KE), to mortality (adverse outcome, AO) with the quantitatively linear relationships between MIE-KE (|r| = 0.96, p = 0.002), KE-AO (|r| = 0.88, p = 0.02), and MIE-AO (|r| = 0.89, p = 0.02). Multiple structural analyses showed that benzoic acid is the critical toxicogenic fragment. Our data will facilitate the screening and development of green alternatives.
Affiliation(s)
- Haoyue Tan
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, Jiangsu, China
- Pan Gao
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Yiwen Luo
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Xiao Gou
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Pu Xia
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Pingping Wang
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Lu Yan
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Shaoqing Zhang
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Jing Guo
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, Jiangsu, China
- Xiaowei Zhang
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, Jiangsu, China
- Hongxia Yu
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, Jiangsu, China
- Wei Shi
- State Key Laboratory of Pollution Control and Resources Reuse, School of Environment, Nanjing University, Nanjing 210023, Jiangsu, China
- Jiangsu Province Ecology and Environment Protection Key Laboratory of Chemical Safety and Health Risk, Nanjing 210023, Jiangsu, China
4
Liu W, Wang Z, Chen J, Tang W, Wang H. Machine Learning Model for Screening Thyroid Stimulating Hormone Receptor Agonists Based on Updated Datasets and Improved Applicability Domain Metrics. Chem Res Toxicol 2023. [PMID: 37209109 DOI: 10.1021/acs.chemrestox.3c00074] [Indexed: 05/22/2023]
Abstract
Machine learning (ML) models for screening endocrine-disrupting chemicals (EDCs), such as thyroid stimulating hormone receptor (TSHR) agonists, are essential for sound management of chemicals. Previous models for screening TSHR agonists were built on imbalanced datasets and lacked applicability domain (AD) characterization essential for regulatory application. Herein, an updated TSHR agonist dataset was built, for which the ratio of active to inactive compounds greatly increased to 1:2.6, and chemical spaces of structure-activity landscapes (SALs) were enhanced. Resulting models based on 7 molecular representations and 4 ML algorithms were proven to outperform previous ones. Weighted similarity density (ρs) and weighted inconsistency of activities (IA) were proposed to characterize the SALs, and a state-of-the-art AD characterization methodology ADSAL{ρs, IA} was established. An optimal classifier developed with PubChem fingerprints and the random forest algorithm, coupled with ADSAL{ρs ≥ 0.15, IA ≤ 0.65}, exhibited good performance on the validation set with the area under the receiver operating characteristic curve being 0.984 and balanced accuracy being 0.941 and identified 90 TSHR agonist classes that could not be found previously. The classifier together with the ADSAL{ρs, IA} may serve as efficient tools for screening EDCs, and the AD characterization methodology may be applied to other ML models.
Affiliation(s)
- Wenjia Liu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhongyu Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Weihao Tang
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Haobo Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
5
Wu Y, Li M, Shen J, Pu X, Guo Y. A consensual machine-learning-assisted QSAR model for effective bioactivity prediction of xanthine oxidase inhibitors using molecular fingerprints. Mol Divers 2023:10.1007/s11030-023-10649-z. [PMID: 37043162 DOI: 10.1007/s11030-023-10649-z] [Received: 02/23/2023] [Accepted: 04/06/2023] [Indexed: 04/13/2023]
Abstract
Xanthine oxidase inhibitors (XOIs) have been widely studied due to their promising potential as safe and effective therapeutics for hyperuricemia and gout. Currently available XOI molecules have been developed in different experiments, but they show wide structural diversity and significantly varying bioactivities. It is therefore of great practical significance to present a consensual QSAR model for effective bioactivity prediction of XOIs based on a systematic compilation of these XOIs across different experiments. In this work, 249 XOIs belonging to 16 scaffolds were collected and integrated into a consensual dataset by introducing the concept of IC50 values relative to allopurinol (RIC50). Extended connectivity fingerprints (ECFPs) were employed to represent XOI molecules. Feature selection by a machine-learning method indicated 54 crucial fingerprints to be valuable for predicting the inhibitory potency (IP) of XOIs. The optimal predictor yielded promising performance in different cross-validation tests. In addition, an external validation on 43 XOIs and a case study on febuxostat also provided satisfactory results, indicating strong generalization of the predictor. The predictor was interpreted by the Shapley additive explanations (SHAP) method, which revealed several important substructures by mapping the featured fingerprints to molecular structures. Then, 15 new molecules were designed and predicted by the predictor to show higher IP than febuxostat. Finally, molecular docking simulation was performed to gain insight into the molecular binding mode with the xanthine oxidase (XO) enzyme, showing that molecules with a selenazole moiety, cyano group and isopropyl group tended to yield higher IP. The absorption, distribution, metabolism, excretion and toxicity (ADMET) prediction results further supported the potential of these novel XOIs as drug candidates. Overall, this work presents a QSAR model for accurate prediction of the IP of XOIs and is expected to provide new insights for further structure-guided design of novel XOIs.
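The abstract does not spell out how RIC50 is computed. The following is a minimal sketch, assuming it is the ratio of a compound's IC50 to the IC50 of allopurinol measured in the same experiment, so that values below 1 mean the compound is more potent than the reference. The function name and all numbers are hypothetical.

```python
def relative_ic50(ic50_compound: float, ic50_allopurinol: float) -> float:
    """IC50 of a compound divided by the allopurinol IC50 from the same assay.

    This ratio is an assumed definition of RIC50; the abstract only says
    that IC50 values are expressed relative to allopurinol.
    """
    if ic50_compound <= 0 or ic50_allopurinol <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_compound / ic50_allopurinol


# Hypothetical measurements; units only need to match within each assay.
assay_a = relative_ic50(ic50_compound=2.0, ic50_allopurinol=8.0)
assay_b = relative_ic50(ic50_compound=0.5, ic50_allopurinol=2.0)
print(assay_a, assay_b)  # 0.25 0.25
```

Both hypothetical assays give the same relative value even though the raw IC50 values differ fourfold, which is the motivation for normalizing to a shared reference before pooling data from different experiments.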
Affiliation(s)
- Yanling Wu
- College of Chemistry, Sichuan University, Chengdu, 610064, China
- Menglong Li
- College of Chemistry, Sichuan University, Chengdu, 610064, China
- Jinru Shen
- College of Chemistry, Sichuan University, Chengdu, 610064, China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu, 610064, China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu, 610064, China.
6
Xu X, Wang C, Gui B, Yuan X, Li C, Zhao Y, Martyniuk CJ, Su L. Application of machine learning to predict the inhibitory activity of organic chemicals on thyroid stimulating hormone receptor. ENVIRONMENTAL RESEARCH 2022; 212:113175. [PMID: 35351457 DOI: 10.1016/j.envres.2022.113175] [Received: 12/27/2021] [Revised: 03/04/2022] [Accepted: 03/22/2022] [Indexed: 06/14/2023]
Abstract
With the promotion of carbon neutrality, it is also important to synchronously promote the assessment and sustainable management of chemicals so as to protect public health. Humans and animals are possibly exposed to endocrine disruptors that have inhibitory effects on thyroid stimulating hormone receptor (TSHR). As such, it is important to identify chemicals that inhibit TSHR and to develop models to predict their inhibitory activity. In this study, 5952 compounds derived from a cyclic adenosine monophosphate (cAMP) analysis, a key signaling pathway in thyrocytes, were used to establish a binary classification model comparing methods that included random forest (RF), extreme gradient boosting (XGB), and logistic regression (LR). The prediction model based on RF showed the highest identification accuracy for revealing chemicals that may inhibit TSHR. For the RF model, recall was 0.89, balanced accuracy was 0.85, and the area under the receiver operating characteristic (ROC) curve (AUC) was 0.92, indicating that the model had very high predictive capacity. The lowest CDocker energy (CE) and CDocker interaction energy (CIE) for chemicals and TSHR were determined and were subsequently introduced into the predictive model as descriptors. A regression model, extreme gradient boosting regression (XGBR), was successfully established, yielding R² = 0.65 for predicting the inhibitory activity of active compounds. Parameters that included dissociation characteristics, molecular structure, and binding energy were all key factors in the predictive model. We demonstrate that QSAR models are useful approaches, not only for identifying chemicals that inhibit TSHR, but also for predicting the inhibitory activity of active compounds.
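For readers less familiar with the three classification metrics quoted in this abstract, the sketch below computes recall, balanced accuracy, and ROC AUC from scratch on a small toy set. The labels, scores, and the 0.5 decision threshold are hypothetical and have nothing to do with the study's data; the AUC is computed as the fraction of active/inactive pairs ranked correctly, with ties counted as half.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp, fp, tn, fn


def recall(tp, fn):
    """Fraction of true actives that the model flags as active."""
    return tp / (tp + fn)


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity; robust to class imbalance."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def roc_auc(y_true, scores):
    """Probability that a random active outscores a random inactive."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 actives, 6 inactives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1, 0.2, 0.4, 0.2, 0.6, 0.75]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
rec = recall(tp, fn)
bal_acc = balanced_accuracy(tp, fp, tn, fn)
auc = roc_auc(y_true, scores)
```

On this toy set recall is 0.75 while specificity is only 4/6, so balanced accuracy lands between the two; reporting it alongside recall guards against a model that looks good on one class only.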
Affiliation(s)
- Xiaotian Xu
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Chen Wang
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Bingxin Gui
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Xiangyi Yuan
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Chao Li
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Yuanhui Zhao
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Christopher J Martyniuk
- Center for Environmental and Human Toxicology, Department of Physiological Sciences, College of Veterinary Medicine, UF Genetics Institute, Interdisciplinary Program in Biomedical Sciences Neuroscience, University of Florida, Gainesville, FL, 32611, USA
- Limin Su
- State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China.
7
Wang H, Wang Z, Chen J, Liu W. Graph Attention Network Model with Defined Applicability Domains for Screening PBT Chemicals. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:6774-6785. [PMID: 35475611 DOI: 10.1021/acs.est.2c00765] [Indexed: 06/14/2023]
Abstract
In silico models for screening environmentally persistent, bio-accumulative, and toxic (PBT) substances are necessary for sound management of chemicals. Due to the complex structure-activity landscapes (SALs) on the PBT attributes, previous models for screening PBT chemicals lack either applicability domain (AD) characterizations or interpretability, restricting their applications. Herein, graph attention networks (GATs), a novel neural network architecture, were introduced to construct models for screening PBT chemicals. Results show that the GAT model not only outperformed those in previous studies but also exhibited interpretability since it optimizes attention weight parameters (PAW) that indicate contributions of each atom to the PBT attributes. An AD characterization termed ADFP-AC, which considers both molecular fingerprint (FP) similarities and compounds at activity cliffs (ACs) of SALs, was proposed to describe the ADs, which further assured the performance of the GAT model. Eight previously unidentified classes of compounds were identified as PBT chemicals from the Inventory of Existing Chemical Substances in China. The GAT model together with the ADFP-AC characterization may serve as efficient tools for screening PBT chemicals, and the modeling methodology can be applied to other physicochemical, environmental, behavioral, and toxicological parameters of chemicals that are necessary for their risk assessment and management.
Affiliation(s)
- Haobo Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Zhongyu Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Wenjia Liu
- Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
8
Ramaprasad ASE, Smith MT, McCoy D, Hubbard AE, La Merrill MA, Durkin KA. Predicting the binding of small molecules to nuclear receptors using machine learning. Brief Bioinform 2022; 23:6563938. [PMID: 35383362 PMCID: PMC9116378 DOI: 10.1093/bib/bbac114] [Received: 12/13/2021] [Revised: 03/07/2022] [Accepted: 03/09/2022] [Indexed: 12/14/2022] Open
Abstract
Nuclear receptors (NRs) are important biological targets of endocrine-disrupting chemicals (EDCs). Identifying chemicals that can act as EDCs and modulate the function of NRs is difficult because of the time and cost of in vitro and in vivo screening to determine the potential hazards of the 100 000s of chemicals that humans are exposed to. Hence, there is a need for computational approaches to prioritize chemicals for biological testing. Machine learning (ML) techniques are alternative methods that can quickly screen millions of chemicals and identify those that may be an EDC. Computational models of chemical binding to multiple NRs have begun to emerge. Recently, a Nuclear Receptor Activity (NuRA) dataset, describing experimentally derived small-molecule activity against various NRs has been created. We have used the NuRA dataset to develop an ensemble of ML-based models to predict the agonism, antagonism, binding and effector binding of small molecules to nine different human NRs. We defined the applicability domain of the ML models as a measure of Tanimoto similarity to the molecules in the training set, which enhanced the performance of the developed classifiers. We further developed a user-friendly web server named 'NR-ToxPred' to predict the binding of chemicals to the nine NRs using the best-performing models for each receptor. This web server is freely accessible at http://nr-toxpred.cchem.berkeley.edu. Users can upload individual chemicals using Simplified Molecular-Input Line-Entry System, CAS numbers or sketch the molecule in the provided space to predict the compound's activity against the different NRs and predict the binding mode for each.
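The applicability-domain idea mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming fingerprints are represented as sets of "on" bit indices and that a query counts as in-domain when its highest Tanimoto similarity to any training molecule reaches a cutoff. The nearest-neighbour rule, the 0.5 cutoff, and the toy fingerprints are assumptions; the abstract does not state the exact rule used by NR-ToxPred.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def max_similarity(query: set, training_set: list) -> float:
    """Highest Tanimoto similarity between the query and any training molecule."""
    return max(tanimoto(query, fp) for fp in training_set)


def in_domain(query: set, training_set: list, threshold: float = 0.5) -> bool:
    """True if the query is at least `threshold`-similar to some training molecule."""
    return max_similarity(query, training_set) >= threshold


# Hypothetical fingerprints given as sets of "on" bit indices.
training_fps = [{1, 2, 3, 4}, {10, 11, 12}]
print(in_domain({1, 2, 3, 5}, training_fps))  # True: similarity 3/5 to the first
print(in_domain({20, 21}, training_fps))      # False: shares no bits with either
```

Predictions for out-of-domain queries would typically be flagged or withheld rather than reported, which is how an applicability domain can raise the measured performance of a classifier.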
Affiliation(s)
- Martyn T Smith
- Divisions of Environmental Health Sciences and Biostatistics, School of Public Health, University of California Berkeley, CA 94720, USA
- David McCoy
- Divisions of Environmental Health Sciences and Biostatistics, School of Public Health, University of California Berkeley, CA 94720, USA
- Alan E Hubbard
- Divisions of Environmental Health Sciences and Biostatistics, School of Public Health, University of California Berkeley, CA 94720, USA
- Michele A La Merrill
- Department of Environmental Toxicology, University of California, Davis, CA 95616, USA
- Kathleen A Durkin
- Molecular Graphics and Computation Facility, College of Chemistry, University of California, Berkeley, CA 94720, USA
9
Nakarin F, Boonpalit K, Kinchagawat J, Wachiraphan P, Rungrotmongkol T, Nutanong S. Assisting Multitargeted Ligand Affinity Prediction of Receptor Tyrosine Kinases Associated Nonsmall Cell Lung Cancer Treatment with Multitasking Principal Neighborhood Aggregation. Molecules 2022; 27:molecules27041226. [PMID: 35209011 PMCID: PMC8878292 DOI: 10.3390/molecules27041226] [Received: 12/24/2021] [Revised: 01/30/2022] [Accepted: 01/31/2022] [Indexed: 11/16/2022] Open
Abstract
A multitargeted therapeutic approach with hybrid drugs is a promising strategy to enhance anticancer efficiency and overcome drug resistance in nonsmall cell lung cancer (NSCLC) treatment. Estimating affinities of small molecules against targets of interest typically proceeds as a preliminary action for recent drug discovery in the pharmaceutical industry. In this investigation, we employed machine learning models to provide a computationally affordable means for computer-aided screening to accelerate the discovery of potential drug compounds. In particular, we introduced a quantitative structure–activity-relationship (QSAR)-based multitask learning model to facilitate an in silico screening system of multitargeted drug development. Our method combines a recently developed graph-based neural network architecture, principal neighborhood aggregation (PNA), with a descriptor-based deep neural network supporting synergistic utilization of molecular graph and fingerprint features. The model was built from more than ten thousand affinity-reported ligands of seven crucial receptor tyrosine kinases in NSCLC from two public data sources. As a result, our multitask model demonstrated better performance than all other benchmark models, as well as achieving satisfying predictive ability regarding applicable QSAR criteria for most tasks within the model’s applicability. Since our model could potentially be a screening tool for practical use, we have provided a freely accessible model implementation platform with a tutorial, offering a first step in the long journey of cancer drug development.
Affiliation(s)
- Fahsai Nakarin
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
  - Correspondence: Tel.: +66-33-014-444
- Kajjana Boonpalit
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
- Jiramet Kinchagawat
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
- Patcharapol Wachiraphan
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
- Thanyada Rungrotmongkol
  - Center of Excellence in Biocatalyst and Sustainable Biotechnology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
- Sarana Nutanong
  - School of Information Science and Technology, Vidyasirimedhi Institute of Science and Technology (VISTEC), Rayong 21210, Thailand
|
10
|
Wang Z, Chen J, Hong H. Developing QSAR Models with Defined Applicability Domains on PPARγ Binding Affinity Using Large Data Sets and Machine Learning Algorithms. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2021; 55:6857-6866. [PMID: 33914508 DOI: 10.1021/acs.est.0c07040] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 05/20/2023]
Abstract
Chemicals may cause adverse effects on human health through binding to peroxisome proliferator-activated receptor γ (PPARγ). Hence, binding affinity is useful for evaluating chemicals with potential endocrine-disrupting effects. Quantitative structure-activity relationship (QSAR) regression models with defined applicability domains (ADs) are important to enable efficient screening of chemicals with PPARγ binding activity. However, lack of large data sets hindered the development of QSAR models. In this study, based on PPARγ binding affinity data sets curated from various sources, 30 QSAR models were developed using molecular fingerprints, two-dimensional descriptors, and five machine learning algorithms. Structure-activity landscapes (SALs) of the training compounds were described by network-like similarity graphs (NSGs). Based on the NSGs, local discontinuity scores were calculated and found to be positively correlated with the cross-validation absolute prediction errors of the models using the different training sets, descriptors, and algorithms. Moreover, innovative ADs were defined based on pairwise similarities between compounds and were found to outperform some conventional ADs. The curated data sets and developed regression models could be useful for evaluating PPARγ-involved adverse effects of chemicals. The SAL analysis and the innovative ADs could facilitate understanding of prediction results from QSAR models.
Affiliation(s)
- Zhongyu Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079, United States
|
11
|
Tang W, Chen J, Hong H. Development of classification models for predicting inhibition of mitochondrial fusion and fission using machine learning methods. CHEMOSPHERE 2020; 273:128567. [PMID: 34756375 DOI: 10.1016/j.chemosphere.2020.128567] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/10/2020] [Revised: 10/03/2020] [Accepted: 10/06/2020] [Indexed: 06/13/2023]
Abstract
Mitochondrial fusion and fission are processes that maintain mitochondrial function when cells respond to environmental stresses. Disruption of mitochondrial fusion and fission influences cell health and can cause adverse events such as neurodegenerative disorders. It is critical to identify environmental chemicals that can disrupt mitochondrial fusion and fission. However, experimentally testing all chemicals is not practical because experimental methods are time-consuming and costly. Quantitative structure-activity relationship (QSAR) modeling is an attractive approach for evaluating the potential of chemicals to disrupt mitochondrial fusion and fission. In this study, QSAR models were developed for identifying chemicals capable of inhibiting mitochondrial fusion and fission using machine learning algorithms (i.e., random forest, logistic regression, Bernoulli naive Bayes, and deep neural network). One hundred iterations of five-fold cross validations and external validations showed that the best model on mitochondrial fusion had area under the receiver operating characteristic curve (AUC) of 82.8% and 78.1%, respectively; and the best model for mitochondrial fission yielded AUC of 84.3% and 97.5%, respectively. Furthermore, 45 and 56 structural alerts were identified for inhibition of mitochondrial fusion and fission, respectively. The results demonstrated that the models and the structural alerts could be useful for screening chemicals that inhibit mitochondrial fusion and fission.
Affiliation(s)
- Weihao Tang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, 116024, China
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Rd, Jefferson, AR, 72079, USA
|