1
Masand VH, Al-Hussain SA, Alzahrani AY, Al-Mutairi AA, Hussien RA, Samad A, Zaki MEA. Estrogen Receptor Alpha Binders for Hormone-Dependent Forms of Breast Cancer: e-QSAR and Molecular Docking Supported by X-ray Resolved Structures. ACS Omega 2024; 9:16759-16774. [PMID: 38617692 PMCID: PMC11007693 DOI: 10.1021/acsomega.4c00906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2024] [Revised: 03/16/2024] [Accepted: 03/19/2024] [Indexed: 04/16/2024]
Abstract
Cancer, a life-disturbing and lethal disease with a high global impact, causes significant economic, social, and health challenges. Breast cancer refers to the abnormal growth of cells originating from breast tissues. Hormone-dependent forms of breast cancer, such as those influenced by estrogen, prompt the exploration of estrogen receptors as targets for potential therapeutic interventions. In this study, we conducted e-QSAR, molecular docking, and molecular dynamics analyses on a diverse set of inhibitors targeting estrogen receptor alpha (ER-α). The e-QSAR model is based on a genetic algorithm combined with multilinear regression analysis. The newly developed model balances predictive accuracy with mechanistic insight and adheres to the OECD guidelines. The e-QSAR model pointed out that sp2-hybridized carbon and nitrogen atoms are important atoms governing binding profiles. In addition, a specific combination of H-bond donors and acceptors with carbon, nitrogen, and ring sulfur atoms also plays a crucial role. The results are supported by molecular docking, MD simulations, and X-ray-resolved structures. The novel results could be useful for future drug development for ER-α.
Affiliation(s)
- Vijay H Masand
  - Department of Chemistry, Vidya Bharati Mahavidyalaya, Amravati 444 602, Maharashtra, India
- Sami A Al-Hussain
  - Department of Chemistry, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh 11623, Saudi Arabia
- Abdullah Y Alzahrani
  - Department of Chemistry, Faculty of Science and Arts, King Khalid University, Mohail 61421, Saudi Arabia
- Aamal A Al-Mutairi
  - Department of Chemistry, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh 11623, Saudi Arabia
- Rania A Hussien
  - Department of Chemistry, Faculty of Science, Al-Baha University, Al-Baha 65799, Kingdom of Saudi Arabia
- Abdul Samad
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Tishk International University, Erbil 44001, Iraq
- Magdi E A Zaki
  - Department of Chemistry, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh 11623, Saudi Arabia
2
Pusparini RT, Krisnadhi AA, Firdayani. MATH: A Deep Learning Approach in QSAR for Estrogen Receptor Alpha Inhibitors. Molecules 2023; 28:5843. [PMID: 37570812 PMCID: PMC10421274 DOI: 10.3390/molecules28155843] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/31/2023] [Revised: 07/24/2023] [Accepted: 07/24/2023] [Indexed: 08/13/2023] Open
Abstract
Breast cancer ranks as the second leading cause of death among women, but early screening and self-awareness can help prevent it. Hormone therapy drugs that target estrogen levels offer potential treatments. However, conventional drug discovery entails extensive, costly processes. This study presents a framework for analyzing the quantitative structure-activity relationship (QSAR) of estrogen receptor alpha inhibitors. Our approach utilizes supervised learning, integrating self-attention Transformer and molecular graph information, to predict estrogen receptor alpha inhibitors. We established five classification models for predicting these inhibitors in breast cancer. Among these models, our proposed MATH model achieved remarkable precision, recall, F1 score, and specificity, with values of 0.952, 0.972, 0.960, and 0.922, respectively, alongside an ROC AUC of 0.977. MATH exhibited robust performance, suggesting its potential to assist pharmaceutical and health researchers in identifying candidate compounds for estrogen receptor alpha inhibitors and guiding drug discovery pathways.
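The four classification metrics quoted in this abstract all derive from a binary confusion matrix. The following is a minimal sketch of those definitions, not code from the paper; the function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)      # fraction of predicted actives that are truly active
    recall = tp / (tp + fn)         # fraction of true actives that were found (sensitivity)
    specificity = tn / (tn + fp)    # fraction of true inactives that were rejected
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts, NOT taken from the MATH study.
metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting specificity alongside precision and recall matters when actives and inactives are unevenly represented, because a high F1 score alone does not show how well inactives are rejected.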
Affiliation(s)
- Rizki Triyani Pusparini
  - Tokopedia-UI AI Center of Excellence, Faculty of Computer Science, Universitas Indonesia, Depok 16424, Indonesia
  - Research Center for Vaccine and Drugs, Research Organization for Health, National Research and Innovation Agency (BRIN), Jakarta 10340, Indonesia
- Adila Alfa Krisnadhi
  - Tokopedia-UI AI Center of Excellence, Faculty of Computer Science, Universitas Indonesia, Depok 16424, Indonesia
- Firdayani
  - Research Center for Vaccine and Drugs, Research Organization for Health, National Research and Innovation Agency (BRIN), Jakarta 10340, Indonesia
3
Chung E, Russo DP, Ciallella HL, Wang YT, Wu M, Aleksunes LM, Zhu H. Data-Driven Quantitative Structure-Activity Relationship Modeling for Human Carcinogenicity by Chronic Oral Exposure. Environmental Science & Technology 2023; 57:6573-6588. [PMID: 37040559 PMCID: PMC10134506 DOI: 10.1021/acs.est.3c00648] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/24/2023] [Revised: 03/28/2023] [Accepted: 03/29/2023] [Indexed: 06/19/2023]
Abstract
Traditional methodologies for assessing chemical toxicity are expensive and time-consuming. Computational modeling approaches have emerged as low-cost alternatives, especially those used to develop quantitative structure-activity relationship (QSAR) models. However, conventional QSAR models have limited training data, leading to low predictivity for new compounds. We developed a data-driven modeling approach for constructing carcinogenicity-related models and used these models to identify potential new human carcinogens. To this goal, we used a probe carcinogen dataset from the US Environmental Protection Agency's Integrated Risk Information System (IRIS) to identify relevant PubChem bioassays. Responses of 25 PubChem assays were significantly relevant to carcinogenicity. Eight assays inferred carcinogenicity predictivity and were selected for QSAR model training. Using 5 machine learning algorithms and 3 types of chemical fingerprints, 15 QSAR models were developed for each PubChem assay dataset. These models showed acceptable predictivity during 5-fold cross-validation (average CCR = 0.71). Using our QSAR models, we can correctly predict and rank 342 IRIS compounds' carcinogenic potentials (PPV = 0.72). The models predicted potential new carcinogens, which were validated by a literature search. This study portends an automated technique that can be applied to prioritize potential toxicants using validated QSAR models based on extensive training sets from public data resources.
Affiliation(s)
- Elena Chung
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
- Daniel P. Russo
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
- Heather L. Ciallella
  - Department of Toxicology, Cuyahoga County Medical Examiner’s Office, 11001 Cedar Avenue, Cleveland, Ohio 44106, United States
- Yu-Tang Wang
  - Institute of Agro-Products Processing Science and Technology, Chinese Academy of Agricultural Sciences/Key Laboratory of Agro-Products Processing, Ministry of Agriculture, Beijing 100193, China
- Min Wu
  - School of Life Science and Technology, China Pharmaceutical University, No. 24, Tong Jia Xiang, Nanjing 210009, China
- Lauren M. Aleksunes
  - Department of Pharmacology and Toxicology, Rutgers University, Ernest Mario School of Pharmacy, 170 Frelinghuysen Road, Piscataway, New Jersey 08854, United States
- Hao Zhu
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
4
Ciallella HL, Russo DP, Sharma S, Li Y, Sloter E, Sweet L, Huang H, Zhu H. Predicting Prenatal Developmental Toxicity Based On the Combination of Chemical Structures and Biological Data. Environmental Science & Technology 2022; 56:5984-5998. [PMID: 35451820 PMCID: PMC9191745 DOI: 10.1021/acs.est.2c01040] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 05/05/2023]
Abstract
For hazard identification, classification, and labeling purposes, animal testing guidelines are required by law to evaluate the developmental toxicity potential of new and existing chemical products. However, guideline developmental toxicity studies are costly, time-consuming, and require many laboratory animals. Computational modeling has emerged as a promising, animal-sparing, and cost-effective method for evaluating the developmental toxicity potential of chemicals, such as endocrine disruptors, without the use of animals. We aimed to develop a predictive and explainable computational model for developmental toxicants. To this end, a comprehensive dataset of 1244 chemicals with developmental toxicity classifications was curated from public repositories and literature sources. Data from 2140 toxicological high-throughput screening assays were extracted from PubChem and the ToxCast program for this dataset and combined with information about 834 chemical fragments to group assays based on their chemical-mechanistic relationships. This effort revealed two assay clusters containing 83 and 76 assays, respectively, with high positive predictive rates for developmental toxicants identified with animal testing guidelines (PPV = 72.4 and 77.3% during cross-validation). These two assay clusters can be used as developmental toxicity models and were applied to predict new chemicals for external validation. This study provides a new strategy for constructing alternative chemical developmental toxicity evaluations that can be replicated for other toxicity modeling studies.
Affiliation(s)
- Heather L. Ciallella
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
- Daniel P. Russo
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, 08102, USA
- Swati Sharma
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
- Yafan Li
  - The Lubrizol Corporation, Wickliffe, OH, 44092, USA
- Eddie Sloter
  - The Lubrizol Corporation, Wickliffe, OH, 44092, USA
- Len Sweet
  - The Lubrizol Corporation, Wickliffe, OH, 44092, USA
- Heng Huang
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- Hao Zhu
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, 08102, USA
  - Corresponding author: Hao Zhu, 201 South Broadway, Joint Health Sciences Center, Rutgers University, Camden, New Jersey 08103; Telephone: (856) 225-6781
5
Russo DP, Zhu H. High-Throughput Screening Assay Profiling for Large Chemical Databases. Methods Mol Biol 2022; 2474:125-132. [PMID: 35294761 DOI: 10.1007/978-1-0716-2213-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
High-throughput screening (HTS) techniques are increasingly being adopted by a variety of fields of toxicology. Notably, large-scale research efforts from government, industrial, and academic laboratories are screening millions of chemicals against a variety of biomolecular targets, producing an enormous amount of publicly available HTS assay data. These HTS assay data provide toxicologists important information on how chemicals interact with different biomolecular targets and provide illustrations of potential toxicity mechanisms. Open public data repositories, such as the National Institutes of Health's PubChem ( http://pubchem.ncbi.nlm.nih.gov ), were established to accept, store, and share HTS data. Through the PubChem website, users can rapidly obtain the PubChem assay results for compounds by using different chemical identifiers (including SMILES, InChIKey, IUPAC names, etc.). However, obtaining these data in a user-friendly format suitable for modeling and other informatics analysis (e.g., gathering PubChem data for hundreds or thousands of chemicals in a modeling friendly format) directly through the PubChem web portal is not feasible. This chapter aims to introduce two approaches to obtain the HTS assay results for large datasets of compounds from the PubChem portal. First, programmatic access via PubChem's PUG-REST web service using the Python programming language will be described. Second, most users, who lack programming skills, can directly obtain PubChem data for a large set of compounds by using the freely available Chemical In vitro-In vivo Profiling (CIIPro) portal ( http://www.ciipro.rutgers.edu ).
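The programmatic route described here can be sketched with PubChem's PUG-REST service. This is a minimal illustration and not the chapter's own code: the helper names `assay_summary_url` and `fetch_assay_summary` are hypothetical, and the sketch assumes the per-compound `assaysummary` operation, which returns the assay results recorded for a given compound identifier (CID).

```python
import json
import urllib.request

# Base URL of the PUG-REST web service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def assay_summary_url(cid: int, fmt: str = "JSON") -> str:
    """Build the PUG-REST URL for the bioassay summary of one compound."""
    return f"{PUG_REST_BASE}/compound/cid/{cid}/assaysummary/{fmt}"


def fetch_assay_summary(cid: int, timeout: float = 30.0) -> dict:
    """Download and decode the assay summary for one CID (needs network access)."""
    with urllib.request.urlopen(assay_summary_url(cid), timeout=timeout) as response:
        return json.load(response)


# Building the URL needs no network; CID 2244 (aspirin) is used as an example.
print(assay_summary_url(2244))
```

Calling `fetch_assay_summary(2244)` would retrieve the summary itself. For hundreds or thousands of compounds, requests should be paced and batched in line with PubChem's usage policy.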
Affiliation(s)
- Daniel P Russo
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, USA
- Hao Zhu
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, USA
6
Elucidation of Agonist and Antagonist Dynamic Binding Patterns in ER-α by Integration of Molecular Docking, Molecular Dynamics Simulations and Quantum Mechanical Calculations. Int J Mol Sci 2021; 22:9371. [PMID: 34502280 PMCID: PMC8431471 DOI: 10.3390/ijms22179371] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/27/2021] [Revised: 08/25/2021] [Accepted: 08/27/2021] [Indexed: 12/13/2022] Open
Abstract
Estrogen receptor alpha (ERα) is a ligand-dependent transcriptional factor in the nuclear receptor superfamily. Many structures of ERα bound with agonists and antagonists have been determined. However, the dynamic binding patterns of agonists and antagonists in the binding site of ERα remain unclear. Therefore, we performed molecular docking, molecular dynamics (MD) simulations, and quantum mechanical calculations to elucidate agonist and antagonist dynamic binding patterns in ERα. 17β-estradiol (E2) and 4-hydroxytamoxifen (OHT) were docked in the ligand binding pockets of the agonist and antagonist bound ERα. The best complex conformations from molecular docking were subjected to 100 nanosecond MD simulations. Hierarchical clustering was conducted to group the structures in the trajectory from MD simulations. The representative structure from each cluster was selected to calculate the binding interaction energy value for elucidation of the dynamic binding patterns of agonists and antagonists in the binding site of ERα. The binding interaction energy analysis revealed that OHT binds ERα more tightly in the antagonist conformer, while E2 prefers the agonist conformer. The results may help identify ERα antagonists as drug candidates and facilitate risk assessment of chemicals through ER-mediated responses.
7
Wilm A, Garcia de Lomana M, Stork C, Mathai N, Hirte S, Norinder U, Kühnl J, Kirchmair J. Predicting the Skin Sensitization Potential of Small Molecules with Machine Learning Models Trained on Biologically Meaningful Descriptors. Pharmaceuticals (Basel) 2021; 14:790. [PMID: 34451887 PMCID: PMC8402010 DOI: 10.3390/ph14080790] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/14/2021] [Revised: 08/03/2021] [Accepted: 08/06/2021] [Indexed: 02/06/2023] Open
Abstract
In recent years, a number of machine learning models for the prediction of the skin sensitization potential of small organic molecules have been reported and become available. These models generally perform well within their applicability domains but, as a result of the use of molecular fingerprints and other non-intuitive descriptors, the interpretability of the existing models is limited. The aim of this work is to develop a strategy to replace the non-intuitive features by predicted outcomes of bioassays. We show that such replacement is indeed possible and that as few as ten interpretable, predicted bioactivities are sufficient to reach competitive performance. On a holdout data set of 257 compounds, the best model (“Skin Doctor CP:Bio”) obtained an efficiency of 0.82 and an MCC of 0.52 (at the significance level of 0.20). Skin Doctor CP:Bio is available free of charge for academic research. The modeling strategies explored in this work are easily transferable and could be adopted for the development of more interpretable machine learning models for the prediction of the bioactivity and toxicity of small organic compounds.
Affiliation(s)
- Anke Wilm
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
  - HITeC e.V., 22527 Hamburg, Germany
- Marina Garcia de Lomana
  - Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090 Vienna, Austria
- Conrad Stork
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
- Neann Mathai
  - Computational Biology Unit (CBU), Department of Chemistry, University of Bergen, N-5020 Bergen, Norway
- Steffen Hirte
  - Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090 Vienna, Austria
- Ulf Norinder
  - MTM Research Centre, School of Science and Technology, Örebro University, SE-70182 Örebro, Sweden
  - Department of Computer and Systems Sciences, Stockholm University, SE-16407 Kista, Sweden
  - Department of Pharmaceutical Biosciences, Uppsala University, SE-75124 Uppsala, Sweden
- Jochen Kühnl
  - Front End Innovation, Beiersdorf AG, 22529 Hamburg, Germany
- Johannes Kirchmair
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
  - Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090 Vienna, Austria
  - Correspondence: Tel.: +43-1-4277-55104
8
Ciallella HL, Russo DP, Aleksunes LM, Grimm FA, Zhu H. Revealing Adverse Outcome Pathways from Public High-Throughput Screening Data to Evaluate New Toxicants by a Knowledge-Based Deep Neural Network Approach. Environmental Science & Technology 2021; 55:10875-10887. [PMID: 34304572 PMCID: PMC8713073 DOI: 10.1021/acs.est.1c02656] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 05/16/2023]
Abstract
Traditional experimental testing to identify endocrine disruptors that enhance estrogenic signaling relies on expensive and labor-intensive experiments. We sought to design a knowledge-based deep neural network (k-DNN) approach to reveal and organize public high-throughput screening data for compounds with nuclear estrogen receptor α and β (ERα and ERβ) binding potentials. The target activity was rodent uterotrophic bioactivity driven by ERα/ERβ activations. After training, the resultant network successfully inferred critical relationships among ERα/ERβ target bioassays, shown as weights of 6521 edges between 1071 neurons. The resultant network uses an adverse outcome pathway (AOP) framework to mimic the signaling pathway initiated by ERα and identify compounds that mimic endogenous estrogens (i.e., estrogen mimetics). The k-DNN can predict estrogen mimetics by activating neurons representing several events in the ERα/ERβ signaling pathway. Therefore, this virtual pathway model, starting from a compound's chemistry initiating ERα activation and ending with rodent uterotrophic bioactivity, can efficiently and accurately prioritize new estrogen mimetics (AUC = 0.864-0.927). This k-DNN method is a potential universal computational toxicology strategy to utilize public high-throughput screening data to characterize hazards and prioritize potentially toxic compounds.
Affiliation(s)
- Heather L Ciallella
  - Center for Computational and Integrative Biology, Rutgers University Camden, Camden, New Jersey 08103, United States
- Daniel P Russo
  - Center for Computational and Integrative Biology, Rutgers University Camden, Camden, New Jersey 08103, United States
  - Department of Chemistry, Rutgers University Camden, Camden, New Jersey 08102, United States
- Lauren M Aleksunes
  - Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, New Jersey 08854, United States
- Fabian A Grimm
  - ExxonMobil Biomedical Sciences, Inc., Annandale, New Jersey 08801, United States
- Hao Zhu
  - Center for Computational and Integrative Biology, Rutgers University Camden, Camden, New Jersey 08103, United States
  - Department of Chemistry, Rutgers University Camden, Camden, New Jersey 08102, United States
9
Schaduangrat N, Malik AA, Nantasenamat C. ERpred: a web server for the prediction of subtype-specific estrogen receptor antagonists. PeerJ 2021; 9:e11716. [PMID: 34285834 PMCID: PMC8274494 DOI: 10.7717/peerj.11716] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/28/2020] [Accepted: 06/11/2021] [Indexed: 11/22/2022] Open
Abstract
Estrogen receptors alpha and beta (ERα and ERβ) are responsible for breast cancer metastasis through their involvement in clinical outcomes. Estradiol and hormone replacement therapy target both ERs, but this often leads to an increased risk of breast and endometrial cancers as well as thromboembolism. A major challenge is posed for the development of compounds possessing ER subtype specificity. Herein, we present a large-scale classification structure-activity relationship (CSAR) study of inhibitors from the ChEMBL database which consisted of an initial set of 11,618 compounds for ERα and 7,810 compounds for ERβ. The IC50 was selected as the bioactivity unit for further investigation and, after the data curation process, this led to a final data set of 1,593 and 1,281 compounds for ERα and ERβ, respectively. We employed the random forest (RF) algorithm for model building and, of the 12 fingerprint types, models built using the PubChem fingerprint were the most robust (Ac of 94.65% and 92.25% and Matthews correlation coefficient (MCC) of 89% and 76% for ERα and ERβ, respectively) and were therefore selected for feature interpretation. Results indicated the importance of features pertaining to aromatic rings, nitrogen-containing functional groups and aliphatic hydrocarbons. Finally, the model was deployed as the publicly available web server called ERpred at http://codes.bio/erpred where users can submit SMILES notation as the input query for prediction of the bioactivity against ERα and ERβ.
Affiliation(s)
- Nalini Schaduangrat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
- Aijaz Ahmad Malik
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
- Chanin Nantasenamat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
10
Wang Z, Chen J, Hong H. Developing QSAR Models with Defined Applicability Domains on PPARγ Binding Affinity Using Large Data Sets and Machine Learning Algorithms. Environmental Science & Technology 2021; 55:6857-6866. [PMID: 33914508 DOI: 10.1021/acs.est.0c07040] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Indexed: 05/20/2023]
Abstract
Chemicals may cause adverse effects on human health through binding to peroxisome proliferator-activated receptor γ (PPARγ). Hence, binding affinity is useful for evaluating chemicals with potential endocrine-disrupting effects. Quantitative structure-activity relationship (QSAR) regression models with defined applicability domains (ADs) are important to enable efficient screening of chemicals with PPARγ binding activity. However, lack of large data sets hindered the development of QSAR models. In this study, based on PPARγ binding affinity data sets curated from various sources, 30 QSAR models were developed using molecular fingerprints, two-dimensional descriptors, and five machine learning algorithms. Structure-activity landscapes (SALs) of the training compounds were described by network-like similarity graphs (NSGs). Based on the NSGs, local discontinuity scores were calculated and found to be positively correlated with the cross-validation absolute prediction errors of the models using the different training sets, descriptors, and algorithms. Moreover, innovative ADs were defined based on pairwise similarities between compounds and were found to outperform some conventional ADs. The curated data sets and developed regression models could be useful for evaluating PPARγ-involved adverse effects of chemicals. The SAL analysis and the innovative ADs could facilitate understanding of prediction results from QSAR models.
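A similarity-based applicability domain can be illustrated with a simple nearest-neighbor rule. The sketch below is a generic example and not the AD definition developed in this paper: it assumes fingerprints are represented as sets of on-bit indices, uses the Tanimoto coefficient as the pairwise similarity, and applies a hypothetical threshold of 0.5. The names `tanimoto` and `in_domain` are illustrative.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def in_domain(query: set, training: list, threshold: float = 0.5) -> bool:
    """Treat a query as in-domain if its nearest training compound is similar enough."""
    nearest = max((tanimoto(query, fp) for fp in training), default=0.0)
    return nearest >= threshold


# Toy fingerprints (hypothetical bit indices, not real descriptors).
training_set = [{2, 3, 4}, {7, 8}]
print(in_domain({1, 2, 3}, training_set))  # nearest similarity is 2/4 = 0.5
print(in_domain({9}, training_set))        # shares no bits with any training compound
```

Predictions for compounds that fall outside the domain would typically be flagged as less reliable rather than discarded outright.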
Affiliation(s)
- Zhongyu Wang
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Jingwen Chen
  - Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079, United States
11
Ciallella HL, Russo DP, Aleksunes LM, Grimm FA, Zhu H. Predictive modeling of estrogen receptor agonism, antagonism, and binding activities using machine- and deep-learning approaches. Lab Invest 2021; 101:490-502. [PMID: 32778734 PMCID: PMC7873171 DOI: 10.1038/s41374-020-00477-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/03/2020] [Revised: 07/19/2020] [Accepted: 07/21/2020] [Indexed: 11/23/2022] Open
Abstract
As defined by the World Health Organization, an endocrine disruptor is an exogenous substance or mixture that alters function(s) of the endocrine system and consequently causes adverse health effects in an intact organism, its progeny, or (sub)populations. Traditional experimental testing regimens to identify toxicants that induce endocrine disruption can be expensive and time-consuming. Computational modeling has emerged as a promising and cost-effective alternative method for screening and prioritizing potentially endocrine-active compounds. The efficient identification of suitable chemical descriptors and machine-learning algorithms, including deep learning, is a considerable challenge for computational toxicology studies. Here, we sought to apply classic machine-learning algorithms and deep-learning approaches to a panel of over 7500 compounds tested against 18 Toxicity Forecaster assays related to nuclear estrogen receptor (ERα and ERβ) activity. Three binary fingerprints (Extended Connectivity FingerPrints, Functional Connectivity FingerPrints, and Molecular ACCess System) were used as chemical descriptors in this study. Each descriptor was combined with four machine-learning and two deep-learning (normal and multitask neural networks) approaches to construct models for all 18 ER assays. The resulting model performance was evaluated using the area under the receiver-operating curve (AUC) values obtained from a fivefold cross-validation procedure. The results showed that individual models have AUC values that range from 0.56 to 0.86. External validation was conducted using two additional sets of compounds (n = 592 and n = 966) with established interactions with nuclear ER demonstrated through experimentation. An agonist, antagonist, or binding score was determined for each compound by averaging its predicted probabilities in relevant assay models as an external validation, yielding AUC values ranging from 0.63 to 0.91. The results suggest that multitask neural networks offer advantages when modeling mechanistically related endpoints. Consensus predictions based on the average values of individual models remain the best modeling strategy for computational toxicity evaluations.
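The consensus scoring described in this abstract, averaging a compound's predicted probabilities across the relevant assay models, can be sketched as follows. This is an illustration only and not the authors' code; the model names and probability values are hypothetical.

```python
from statistics import mean


def consensus_score(probabilities: dict) -> float:
    """Average the predicted probabilities of the individual models for one compound."""
    return mean(probabilities.values())


# Hypothetical predicted probabilities from three agonist-assay models for one compound.
agonist_predictions = {
    "assay_model_a": 0.9,
    "assay_model_b": 0.7,
    "assay_model_c": 0.8,
}

score = consensus_score(agonist_predictions)
print(f"agonist consensus score: {score:.2f}")
```

Averaging reduces the influence of any single poorly performing model, which is one common explanation for why consensus predictions tend to be more stable than individual ones.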
Affiliation(s)
- Heather L Ciallella
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
- Daniel P Russo
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
- Lauren M Aleksunes
  - Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ, USA
- Fabian A Grimm
  - ExxonMobil Biomedical Sciences, Inc., Annandale, NJ, USA
- Hao Zhu
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, USA
12
Idakwo G, Thangapandian S, Luttrell J, Li Y, Wang N, Zhou Z, Hong H, Yang B, Zhang C, Gong P. Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets. J Cheminform 2020; 12:66. [PMID: 33372637 PMCID: PMC7592558 DOI: 10.1186/s13321-020-00468-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 12/13/2019] [Accepted: 10/13/2020] [Indexed: 12/14/2022] Open
Abstract
The specificity of toxicant-target biomolecule interactions lends to the very imbalanced nature of many toxicity datasets, causing poor performance in Structure–Activity Relationship (SAR)-based chemical classification. Undersampling and oversampling are representative techniques for handling such an imbalance challenge. However, removing inactive chemical compound instances from the majority class using an undersampling technique can result in information loss, whereas increasing active toxicant instances in the minority class by interpolation tends to introduce artificial minority instances that often cross into the majority class space, giving rise to class overlapping and a higher false prediction rate. In this study, in order to improve the prediction accuracy of imbalanced learning, we employed SMOTEENN, a combination of Synthetic Minority Over-sampling Technique (SMOTE) and Edited Nearest Neighbor (ENN) algorithms, to oversample the minority class by creating synthetic samples, followed by cleaning the mislabeled instances. We chose the highly imbalanced Tox21 dataset, which consisted of 12 in vitro bioassays for > 10,000 chemicals that were distributed unevenly between binary classes. With Random Forest (RF) as the base classifier and bagging as the ensemble strategy, we applied four hybrid learning methods, i.e., RF without imbalance handling (RF), RF with Random Undersampling (RUS), RF with SMOTE (SMO), and RF with SMOTEENN (SMN). The performance of the four learning methods was compared using nine evaluation metrics, among which F1 score, Matthews correlation coefficient and Brier score provided a more consistent assessment of the overall performance across the 12 datasets. The Friedman’s aligned ranks test and the subsequent Bergmann-Hommel post hoc test showed that SMN significantly outperformed the other three methods. We also found that a strong negative correlation existed between the prediction accuracy and the imbalance ratio (IR), which is defined as the number of inactive compounds divided by the number of active compounds. SMN became less effective when IR exceeded a certain threshold (e.g., > 28). The ability to separate the few active compounds from the vast amounts of inactive ones is of great importance in computational toxicology. This work demonstrates that the performance of SAR-based, imbalanced chemical toxicity classification can be significantly improved through the use of data rebalancing.
Affiliation(s)
- Gabriel Idakwo
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Sundar Thangapandian
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
| | - Joseph Luttrell
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Yan Li
- Bennett Aerospace Inc, Cary, NC, 27518, USA
| | - Nan Wang
- Department of Computer Science, New Jersey City University, Jersey City, NJ, 07305, USA
| | - Zhaoxian Zhou
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Centre for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Bei Yang
- School of Information & Engineering, Zhengzhou University, Zhengzhou, 450000, China
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA.
| | - Ping Gong
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA.
|
13
|
Zhao L, Ciallella HL, Aleksunes LM, Zhu H. Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling. Drug Discov Today 2020; 25:1624-1638. [PMID: 32663517 PMCID: PMC7572559 DOI: 10.1016/j.drudis.2020.07.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 02/19/2020] [Revised: 06/26/2020] [Accepted: 07/06/2020] [Indexed: 02/06/2023]
Abstract
Advancing a new drug to market requires substantial investments in time as well as financial resources. Crucial bioactivities for drug candidates, including their efficacy, pharmacokinetics (PK), and adverse effects, need to be investigated during drug development. With advancements in chemical synthesis and biological screening technologies over the past decade, a large amount of biological data points for millions of small molecules have been generated and are stored in various databases. These accumulated data, combined with new machine learning (ML) approaches, such as deep learning, have shown great potential to provide insights into relevant chemical structures to predict in vitro, in vivo, and clinical outcomes, thereby advancing drug discovery and development in the big data era.
Affiliation(s)
- Linlin Zhao
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Heather L Ciallella
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Lauren M Aleksunes
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ 08854, USA
| | - Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA; Department of Chemistry, Rutgers University, Camden, NJ 08102, USA.
|
14
|
Bafna D, Ban F, Rennie PS, Singh K, Cherkasov A. Computer-Aided Ligand Discovery for Estrogen Receptor Alpha. Int J Mol Sci 2020; 21:E4193. [PMID: 32545494 PMCID: PMC7352601 DOI: 10.3390/ijms21124193] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 04/28/2020] [Revised: 05/30/2020] [Accepted: 06/09/2020] [Indexed: 02/08/2023] Open
Abstract
Breast cancer (BCa) is one of the most predominantly diagnosed cancers in women. Notably, 70% of BCa diagnoses are Estrogen Receptor α positive (ERα+) making it a critical therapeutic target. With that, the two subtypes of ER, ERα and ERβ, have contrasting effects on BCa cells. While ERα promotes cancerous activities, ERβ isoform exhibits inhibitory effects on the same. ER-directed small molecule drug discovery for BCa has provided the FDA approved drugs tamoxifen, toremifene, raloxifene and fulvestrant that all bind to the estrogen binding site of the receptor. These ER-directed inhibitors are non-selective in nature and may eventually induce resistance in BCa cells as well as increase the risk of endometrial cancer development. Thus, there is an urgent need to develop novel drugs with alternative ERα targeting mechanisms that can overcome the limitations of conventional anti-ERα therapies. Several functional sites on ERα, such as Activation Function-2 (AF2), DNA binding domain (DBD), and F-domain, have been recently considered as potential targets in the context of drug research and discovery. In this review, we summarize methods of computer-aided drug design (CADD) that have been employed to analyze and explore potential targetable sites on ERα, discuss recent advancement of ERα inhibitor development, and highlight the potential opportunities and challenges of future ERα-directed drug discovery.
Affiliation(s)
| | | | | | | | - Artem Cherkasov
- Vancouver Prostate Centre, University of British Columbia, 2660 Oak Street, Vancouver, BC V6H 3Z6, Canada; (D.B.); (F.B.); (P.S.R.); (K.S.)
|
15
|
Wang Z, Chen J, Hong H. Applicability Domains Enhance Application of PPARγ Agonist Classifiers Trained by Drug-like Compounds to Environmental Chemicals. Chem Res Toxicol 2020; 33:1382-1388. [DOI: 10.1021/acs.chemrestox.9b00498] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022]
Affiliation(s)
- Zhongyu Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
| | - Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, United States
|
16
|
Matsuzaka Y, Uesawa Y. DeepSnap-Deep Learning Approach Predicts Progesterone Receptor Antagonist Activity With High Performance. Front Bioeng Biotechnol 2020; 7:485. [PMID: 32039185 PMCID: PMC6987043 DOI: 10.3389/fbioe.2019.00485] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 11/01/2019] [Accepted: 12/30/2019] [Indexed: 12/16/2022] Open
Abstract
The progesterone receptor (PR) is an important therapeutic target for many malignancies and endocrine disorders due to its role in controlling ovulation and pregnancy via the reproductive cycle. Therefore, the modulation of PR activity using its agonists and antagonists is receiving increasing interest as a novel treatment strategy. However, clinical trials using PR modulators have not yet yielded conclusive evidence. Recently, increasing evidence from several fields shows that the classification of chemical compounds, including agonists and antagonists, can be done with recent improvements in deep learning (DL) using deep neural networks. Therefore, we recently proposed a novel DL-based quantitative structure-activity relationship (QSAR) strategy using transfer learning to build prediction models for agonists and antagonists. By employing this novel approach, referred to as the DeepSnap-DL method, which uses images captured from three-dimensional (3D) chemical structures at multiple angles as input data for the DL classification, we constructed prediction models of PR antagonists in this study. Here, the DeepSnap-DL method showed high-performance prediction of PR antagonists through optimization of some parameters and image adjustment from 3D structures. Furthermore, comparison of the prediction models from this approach with conventional machine learning (ML) methods indicated that the DeepSnap-DL method outperformed these ML methods. Therefore, the models built by DeepSnap-DL would be a powerful tool not only for the QSAR field in predicting physiological and agonist/antagonist activities, toxicity, and molecular binding, but also for identifying biological or pathological phenomena.
Affiliation(s)
| | - Yoshihiro Uesawa
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo, Japan
|
17
|
Schneider M, Pons JL, Bourguet W, Labesse G. Towards accurate high-throughput ligand affinity prediction by exploiting structural ensembles, docking metrics and ligand similarity. Bioinformatics 2020; 36:160-168. [PMID: 31350558 PMCID: PMC6956784 DOI: 10.1093/bioinformatics/btz538] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/25/2019] [Revised: 05/29/2019] [Accepted: 07/19/2019] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Nowadays, virtual screening (VS) plays a major role in the process of drug development. Nonetheless, an accurate estimation of binding affinities, which is crucial at all stages, is not trivial and may require target-specific fine-tuning. Furthermore, drug design also requires improved predictions for putative secondary targets, among which is Estrogen Receptor alpha (ERα). RESULTS: VS based on combinations of Structure-Based VS (SBVS) and Ligand-Based VS (LBVS) is gaining momentum to improve VS performance. In this study, we propose an integrated approach using ligand docking on multiple structural ensembles to reflect receptor flexibility. Then, we investigate the impact of the two different types of features (structure-based and ligand molecular descriptors) on affinity predictions using a random forest algorithm. We find that ligand-based features have lower predictive power (rP = 0.69, R² = 0.47) than structure-based features (rP = 0.78, R² = 0.60). Their combination maintains high accuracy (rP = 0.73, R² = 0.50) on the internal test set, but it shows superior robustness on external datasets. Further improvement, and extending the training dataset to include xenobiotics, leads to a novel high-throughput affinity prediction method for ERα ligands (rP = 0.85, R² = 0.71). The presented prediction tool is provided to the community as a dedicated satellite of the @TOME server, in which one can upload a ligand dataset in mol2 format and have the ligands docked and their affinities predicted. AVAILABILITY AND IMPLEMENTATION: http://edmon.cbs.cnrs.fr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Melanie Schneider
- Centre de Biochimie Structurale, CNRS, INSERM, Univ Montpellier, 34090 Montpellier, France
| | - Jean-Luc Pons
- Centre de Biochimie Structurale, CNRS, INSERM, Univ Montpellier, 34090 Montpellier, France
| | - William Bourguet
- Centre de Biochimie Structurale, CNRS, INSERM, Univ Montpellier, 34090 Montpellier, France
| | - Gilles Labesse
- Centre de Biochimie Structurale, CNRS, INSERM, Univ Montpellier, 34090 Montpellier, France
|
18
|
Abstract
Due to the massive data sets available for drug candidates, modern drug discovery has advanced to the big data era. Central to this shift is the development of artificial intelligence approaches to implementing innovative modeling based on the dynamic, heterogeneous, and large nature of drug data sets. As a result, recently developed artificial intelligence approaches such as deep learning and relevant modeling studies provide new solutions to efficacy and safety evaluations of drug candidates based on big data modeling and analysis. The resulting models provided deep insights into the continuum from chemical structure to in vitro, in vivo, and clinical outcomes. The relevant novel data mining, curation, and management techniques provided critical support to recent modeling studies. In summary, the new advancement of artificial intelligence in the big data era has paved the road to future rational drug development and optimization, which will have a significant impact on drug discovery procedures and, eventually, public health.
Affiliation(s)
- Hao Zhu
- Department of Chemistry and Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA;
|
19
|
Guo Y, Zhao L, Zhang X, Zhu H. Using a hybrid read-across method to evaluate chemical toxicity based on chemical structure and biological data. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2019; 178:178-187. [PMID: 31004930 PMCID: PMC6508079 DOI: 10.1016/j.ecoenv.2019.04.019] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/08/2019] [Revised: 04/05/2019] [Accepted: 04/07/2019] [Indexed: 05/08/2023]
Abstract
Read-across has become a primary approach to fill data gaps for chemical safety assessments. Chemical similarity based on structure, reactivity, and physicochemical property information is a traditional approach applied for read-across toxicity studies. However, toxicity mechanisms are usually complicated in a biological system, so only using chemical similarity to perform the read-across for new compounds was not satisfactory for most toxicity endpoints, especially when the chemically similar compounds show dissimilar toxicities. This study aims to develop an enhanced read-across method for chemical toxicity predictions. To this end, we used two large toxicity datasets for read-across purposes. One consists of 3979 compounds with Ames mutagenicity data, and the other contains 7332 compounds with rat acute oral toxicity data. First, biological data for all compounds in these two datasets were obtained by querying thousands of PubChem bioassays. The PubChem bioassays with at least five compounds from either of these two datasets showing active responses were selected to generate comprehensive bioprofiles. The read-across studies were performed by using chemical similarity search only and also by using a hybrid similarity search based on both chemical descriptors and bioprofiles. Compared to traditional read-across based on chemical similarity, the hybrid read-across approach showed improved accuracy of predictions for both Ames mutagenicity and acute oral toxicity. Furthermore, we could illustrate potential toxicity mechanisms by analyzing the bioprofiles used for this hybrid read-across study. The results of this study indicate that the new hybrid read-across approach could be an applicable computational tool for chemical toxicity predictions. In this way, the bottleneck of traditional read-across studies can be overcome by introducing public biological data into the traditional process. The incorporation of bioprofiles generated from the additional biological data for compounds can partially solve the "activity cliff" issue and reveal their potential toxicity mechanisms. This study leads to a promising direction to utilize data-driven approaches for computational toxicology studies in the big data era.
Affiliation(s)
- Yajie Guo
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, China
| | - Linlin Zhao
- Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA
| | - Xiaoyi Zhang
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing, China.
| | - Hao Zhu
- Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, USA; Department of Chemistry, Rutgers University, Camden, NJ, USA.
|
20
|
Liu X, Zhu Y, Liu T, Xue Q, Tian F, Yuan Y, Zhao C. Exploring toxicity of perfluorinated compounds through complex network and pathway modeling. J Biomol Struct Dyn 2019; 38:2604-2612. [PMID: 31244379 DOI: 10.1080/07391102.2019.1637281] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023]
Abstract
Perfluorinated compounds (PFCs) have serious impacts on human health, as they can interfere with the body's signal pathways and affect the normal hormone balance of humans. PFCs were reported to bind to many proteins, causing a series of biological effects. It was quite possible that the in vivo action of PFCs was not limited to a single target or a single pathway, suggesting the toxic effect was due to the disturbance of a protein or gene network rather than the modification of a single target protein or gene. Thus, a PFCs-targets interaction network was constructed, and significant differences in the characteristics of complex networks between the branched PFCs and linear PFCs were observed. A molecular dynamics simulation showed that the binding ability of the branched PFCs to the target protein was much weaker than that of the linear PFCs, explaining why the branched PFCs differed significantly from the linear PFCs in terms of complex network characteristics. In addition, four target genes were identified as the central node genes of the network. The four target genes were shown to influence some diseases, which suggested a high correlation between PFCs and these diseases, including obesity, hepatocellular carcinoma and diabetes. The present work is helpful for developing new approaches to identify the key toxic targets of compounds and to explore the toxicity effects on pathways. Abbreviations: AR, androgen receptor; BPA, bisphenol A; ESR1, estrogen receptor 1; ESR2, estrogen receptor 2; GLTP, glycolipid transfer protein; HbF, fetal hemoglobin; HBG1, hemoglobin subunit γ-1; hERα, human ERα; HSD17B1, hydroxysteroid 17-β dehydrogenase 1; KEGG, Kyoto Encyclopedia of Genes and Genomes; MD, molecular dynamics simulation; PFCs, perfluorinated compounds; PFOA, perfluorooctanoic acid; PFOS, perfluorooctane sulfonate; POPs, persistent organic pollutants; RMSD, root-mean-square deviation; SHBG, sex hormone binding globulin; SPC/E, extended simple point charge model; TR, thyroid hormone receptor. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Xinhe Liu
- School of Pharmacy, Lanzhou University, Lanzhou, China
| | - Yu Zhu
- Department of Ecology and Environment of Gansu Province, Lanzhou, China
| | - Tingting Liu
- Gansu Provincial Maternity and Child-care Hospital, Lanzhou, China
| | - Qiao Xue
- Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China
| | - Fang Tian
- School of Pharmacy, Lanzhou University, Lanzhou, China
| | - Yongna Yuan
- School of Information Science & Engineering, Lanzhou University, Lanzhou, China
| | - Chunyan Zhao
- School of Pharmacy, Lanzhou University, Lanzhou, China
|
21
|
|
22
|
In silico identification of endogenous and exogenous agonists of Estrogen-related receptor α. ACTA ACUST UNITED AC 2019. [DOI: 10.1016/j.comtox.2019.01.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 02/06/2023]
|
23
|
Russo DP, Zorn KM, Clark AM, Zhu H, Ekins S. Comparing Multiple Machine Learning Algorithms and Metrics for Estrogen Receptor Binding Prediction. Mol Pharm 2018; 15:4361-4370. [PMID: 30114914 PMCID: PMC6181119 DOI: 10.1021/acs.molpharmaceut.8b00546] [Citation(s) in RCA: 91] [Impact Index Per Article: 15.2] [Indexed: 11/29/2022]
Abstract
Many chemicals that disrupt endocrine function have been linked to a variety of adverse biological outcomes. However, screening for endocrine disruption using in vitro or in vivo approaches is costly and time-consuming. Computational methods, e.g., quantitative structure-activity relationship models, have become more reliable due to bigger training sets, increased computing power, and advanced machine learning algorithms, such as multilayered artificial neural networks. Machine learning models can be used to predict compounds for endocrine disrupting capabilities, such as binding to the estrogen receptor (ER), and allow for prioritization and further testing. In this work, an exhaustive comparison of multiple machine learning algorithms, chemical spaces, and evaluation metrics for ER binding was performed on public data sets curated using in-house cheminformatics software (Assay Central). Chemical features utilized in modeling consisted of binary fingerprints (ECFP6, FCFP6, ToxPrint, or MACCS keys) and continuous molecular descriptors from RDKit. Each feature set was subjected to classic machine learning algorithms (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, Support Vector Machine) and Deep Neural Networks (DNN). Models were evaluated using a variety of metrics: recall, precision, F1-score, accuracy, area under the receiver operating characteristic curve, Cohen's Kappa, and Matthews correlation coefficient. For predicting compounds within the training set, DNN has an accuracy higher than that of other methods; however, in 5-fold cross validation and external test set predictions, DNN and most classic machine learning models perform similarly regardless of the data set or molecular descriptors used. We have also used the rank normalized scores as a performance-criteria for each machine learning method, and Random Forest performed best on the validation set when ranked by metric or by data sets. These results suggest classic machine learning algorithms may be sufficient to develop high quality predictive models of ER activity.
Affiliation(s)
- Daniel P. Russo
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, 08102, USA
- first author
| | - Kimberley M. Zorn
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- first author
| | - Alex M. Clark
- Molecular Materials Informatics, Inc., Montreal, Quebec, Canada
| | - Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, 08102, USA
| | - Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
|
24
|
Luechtefeld T, Rowlands C, Hartung T. Big-data and machine learning to revamp computational toxicology and its use in risk assessment. Toxicol Res (Camb) 2018; 7:732-744. [PMID: 30310652 PMCID: PMC6116175 DOI: 10.1039/c8tx00051d] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 02/15/2018] [Accepted: 04/20/2018] [Indexed: 01/08/2023] Open
Abstract
The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases based on regulatory data has allowed reproducibility assessment of animal models, which highlight weaknesses in traditional in vivo methods. This should lower the bars for the introduction of new approaches and represents a benchmark that is achievable for any alternative method validated against these methods. Quantitative Structure Activity Relationships (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases, however, also have made apparent some of the challenges facing computational modeling, including validation challenges, model interpretation issues, and model selection issues. A first implementation of machine learning-based predictions termed REACHacross achieved unprecedented sensitivities of >80% with specificities >70% in predicting the six most common acute and topical hazards covering about two thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as the data they are based on, create challenges and opportunities for the use of computational methods.
Affiliation(s)
- Thomas Luechtefeld
- Center for Alternatives to Animal Testing at Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD 21205, USA
| | - Craig Rowlands
- Underwriters Laboratories (UL), UL Product Supply Chain Intelligence, 333 Pfingsten Road, Northbrook, IL 60062, USA
| | - Thomas Hartung
- Center for Alternatives to Animal Testing at Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe Street, Baltimore, MD 21205, USA
|
25
|
Russo DP, Kim MT, Wang W, Pinolini D, Shende S, Strickland J, Hartung T, Zhu H. CIIPro: a new read-across portal to fill data gaps using public large-scale chemical and biological data. Bioinformatics 2018; 33:464-466. [PMID: 28172359 DOI: 10.1093/bioinformatics/btw640] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/08/2016] [Revised: 09/29/2016] [Accepted: 10/05/2016] [Indexed: 11/14/2022] Open
Abstract
Summary: We have developed a public Chemical In vitro–In vivo Profiling (CIIPro) portal, which can automatically extract in vitro biological data from public resources (i.e. PubChem) for user-supplied compounds. For compounds with in vivo target activity data (e.g. animal toxicity testing results), the integrated cheminformatics algorithm will optimize the extracted biological data using in vitro–in vivo correlations. The resulting in vitro biological data for target compounds can be used for read-across risk assessment of target compounds. Additionally, the CIIPro portal can identify the most similar compounds based on their optimized bioprofiles. The CIIPro portal provides new powerful assessment capabilities to the scientific community and can be easily integrated with other cheminformatics tools. Availability and Implementation: ciipro.rutgers.edu. Contact: danrusso@scarletmail.rutgers.edu or hao.zhu99@rutgers.edu
Affiliation(s)
- Daniel P Russo
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
| | - Marlene T Kim
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA; Department of Chemistry, Rutgers University, Camden, NJ, USA
| | - Wenyi Wang
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
| | - Daniel Pinolini
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
| | - Sunil Shende
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA; Department of Computer Science, Rutgers University, Camden, NJ, USA
| | | | - Thomas Hartung
- Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, MD, USA; University of Konstanz, CAAT-Europe, Konstanz, Germany
| | - Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA; Department of Chemistry, Rutgers University, Camden, NJ, USA
|
26
|
Andersson N, Arena M, Auteri D, Barmaz S, Grignard E, Kienzler A, Lepper P, Lostia AM, Munn S, Parra Morte JM, Pellizzato F, Tarazona J, Terron A, Van der Linden S. Guidance for the identification of endocrine disruptors in the context of Regulations (EU) No 528/2012 and (EC) No 1107/2009. EFSA J 2018; 16:e05311. [PMID: 32625944 PMCID: PMC7009395 DOI: 10.2903/j.efsa.2018.5311] [Citation(s) in RCA: 162] [Impact Index Per Article: 27.0] [Indexed: 12/22/2022] Open
Abstract
This Guidance describes how to perform hazard identification for endocrine‐disrupting properties by following the scientific criteria which are outlined in Commission Delegated Regulation (EU) 2017/2100 and Commission Regulation (EU) 2018/605 for biocidal products and plant protection products, respectively. This publication is linked to the following EFSA Supporting Publications article: http://onlinelibrary.wiley.com/doi/10.2903/sp.efsa.2018.EN-1447/full
|
27
|
Cronin MT, Richarz AN. Relationship Between Adverse Outcome Pathways and Chemistry-Based In Silico Models to Predict Toxicity. ACTA ACUST UNITED AC 2017. [DOI: 10.1089/aivt.2017.0021] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 01/08/2023]
Affiliation(s)
- Mark T.D. Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, England
| | - Andrea-Nicole Richarz
- European Commission, Joint Research Centre, Directorate for Health, Consumers and Reference Materials, Ispra, Italy
|
28
|
Wong JC, Zidar J, Ho J, Wang Y, Lee KK, Zheng J, Sullivan MB, You X, Kriegel R. Assessment of several machine learning methods towards reliable prediction of hormone receptor binding affinity. ACTA ACUST UNITED AC 2017. [DOI: 10.1016/j.cdc.2017.05.002] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/31/2022]
|