1
Mansouri K, Moreira-Filho JT, Lowe CN, Charest N, Martin T, Tkachenko V, Judson R, Conway M, Kleinstreuer NC, Williams AJ. Free and open-source QSAR-ready workflow for automated standardization of chemical structures in support of QSAR modeling. J Cheminform 2024; 16:19. [PMID: 38378618 PMCID: PMC10880251 DOI: 10.1186/s13321-024-00814-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 02/10/2024] [Indexed: 02/22/2024] Open
Abstract
The rapid increase of publicly available chemical structures and associated experimental data presents a valuable opportunity to build robust QSAR models for applications in different fields. However, the common concern is the quality of both the chemical structure information and associated experimental data. This is especially true when those data are collected from multiple sources as chemical substance mappings can contain many duplicate structures and molecular inconsistencies. Such issues can impact the resulting molecular descriptors and their mappings to experimental data and, subsequently, the quality of the derived models in terms of accuracy, repeatability, and reliability. Herein we describe the development of an automated workflow to standardize chemical structures according to a set of standard rules and generate two and/or three-dimensional "QSAR-ready" forms prior to the calculation of molecular descriptors. The workflow was designed in the KNIME workflow environment and consists of three high-level steps. First, a structure encoding is read, and then the resulting in-memory representation is cross-referenced with any existing identifiers for consistency. Finally, the structure is standardized using a series of operations including desalting, stripping of stereochemistry (for two-dimensional structures), standardization of tautomers and nitro groups, valence correction, neutralization when possible, and then removal of duplicates. This workflow was initially developed to support collaborative modeling QSAR projects to ensure consistency of the results from the different participants. It was then updated and generalized for other modeling applications. This included modification of the "QSAR-ready" workflow to generate "MS-ready structures" to support the generation of substance mappings and searches for software applications related to non-targeted analysis mass spectrometry. 
Both QSAR and MS-ready workflows are freely available in KNIME, via standalone versions on GitHub, and as docker container resources for the scientific community. Scientific contribution: This work pioneers an automated workflow in KNIME, systematically standardizing chemical structures to ensure their readiness for QSAR modeling and broader scientific applications. By addressing data quality concerns through desalting, stereochemistry stripping, and normalization, it optimizes molecular descriptors' accuracy and reliability. The freely available resources in KNIME, GitHub, and docker containers democratize access, benefiting collaborative research and advancing diverse modeling endeavors in chemistry and mass spectrometry.
Affiliation(s)
- Kamel Mansouri
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- José T Moreira-Filho
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Charles N Lowe
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Nathaniel Charest
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Todd Martin
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Richard Judson
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Mike Conway
  - National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Nicole C Kleinstreuer
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Antony J Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
2
Edwards SW, Nelms M, Hench VK, Ponder J, Sullivan K. Mapping Mechanistic Pathways of Acute Oral Systemic Toxicity Using Chemical Structure and Bioactivity Measurements. Frontiers in Toxicology 2022; 4:824094. [PMID: 35295211 PMCID: PMC8915918 DOI: 10.3389/ftox.2022.824094] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/28/2021] [Accepted: 01/31/2022] [Indexed: 12/16/2022] Open
Abstract
Regulatory agencies around the world have committed to reducing or eliminating animal testing for establishing chemical safety. Adverse outcome pathways can facilitate replacement by providing a mechanistic framework for identifying the appropriate non-animal methods and connecting them to apical adverse outcomes. This study separated 11,992 chemicals with curated rat oral acute toxicity information into clusters of structurally similar compounds. Each cluster was then assigned one or more ToxCast/Tox21 assays by looking for the minimum number of assays required to record at least one positive hit call below cytotoxicity for all acutely toxic chemicals in the cluster. When structural information is used to select assays for testing, none of the chemicals required more than four assays and 98% required two assays or less. Both the structure-based clusters and activity from the associated assays were significantly associated with the GHS toxicity classification of the chemicals, which suggests that a combination of bioactivity and structural information could be as reproducible as traditional in vivo studies. Predictivity is improved when the in vitro assay directly corresponds to the mechanism of toxicity, but many indirect assays showed promise as well. Given the lower cost of in vitro testing, a small assay battery including both general cytotoxicity assays and two or more orthogonal assays targeting the toxicological mechanism could be used to improve performance further. This approach illustrates the promise of combining existing in silico approaches, such as the Collaborative Acute Toxicity Modeling Suite (CATMoS), with structure-based bioactivity information as part of an efficient tiered testing strategy that can reduce or eliminate animal testing for acute oral toxicity.
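The assay-selection step described in this abstract (finding a small set of assays so that every acutely toxic chemical in a cluster has at least one positive hit) resembles a set-cover problem. The sketch below is only an illustration under stated assumptions: it uses a greedy heuristic, which does not guarantee the true minimum and is not claimed to be the authors' procedure, and the function name `greedy_assay_cover`, the `hits` mapping, and the assay and chemical labels are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_assay_cover(hits: Dict[str, Set[str]], chemicals: Set[str]) -> List[str]:
    """Greedily pick assays until every coverable chemical has at least one hit.

    hits maps a (hypothetical) assay name to the set of chemicals it flags as active.
    This is a heuristic: the result is small but not guaranteed to be minimal.
    """
    uncovered = set(chemicals)
    chosen: List[str] = []
    while uncovered:
        # Pick the assay covering the most still-uncovered chemicals;
        # iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(hits), key=lambda a: len(hits[a] & uncovered), default=None)
        if best is None or not (hits[best] & uncovered):
            break  # remaining chemicals are not hit by any assay
        chosen.append(best)
        uncovered -= hits[best]
    return chosen


# Illustrative data only (not taken from the study).
example_hits = {
    "assay_A": {"chem1", "chem2"},
    "assay_B": {"chem2", "chem3"},
    "assay_C": {"chem3"},
}
selected = greedy_assay_cover(example_hits, {"chem1", "chem2", "chem3"})
```

With the illustrative data, two assays suffice to cover all three chemicals. An exact minimum would require an exhaustive or integer-programming search, which is practical only for small clusters.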
Affiliation(s)
- Stephen W. Edwards
  - GenOmics, Bioinformatics, and Translational Research Center, RTI International, Research Triangle Park, Durham, NC, United States
- Mark Nelms
  - GenOmics, Bioinformatics, and Translational Research Center, RTI International, Research Triangle Park, Durham, NC, United States
- Virginia K. Hench
  - GenOmics, Bioinformatics, and Translational Research Center, RTI International, Research Triangle Park, Durham, NC, United States
- Jessica Ponder
  - Physicians Committee for Responsible Medicine, Washington, DC, United States
- Kristie Sullivan
  - Physicians Committee for Responsible Medicine, Washington, DC, United States
3
Allen TEH, Nelms MD, Edwards SW, Goodman JM, Gutsell S, Russell PJ. In Silico Guidance for In Vitro Androgen and Glucocorticoid Receptor ToxCast Assays. Environmental Science & Technology 2020; 54:7461-7470. [PMID: 32432465 DOI: 10.1021/acs.est.0c01105] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Molecular initiating events (MIEs) are key events in adverse outcome pathways that link molecular chemistry to target biology. As they are based on chemistry, these interactions are excellent targets for computational chemistry approaches to in silico modeling. In this work, we aim to link ligand chemical structures to MIEs for androgen receptor (AR) and glucocorticoid receptor (GR) binding using ToxCast data. This has been done using an automated computational algorithm to perform maximal common substructure searches on chemical binders for each target from the ToxCast dataset. The models developed show a high level of accuracy, correctly assigning 87.20% of AR binders and 96.81% of GR binders in a 25% test set using holdout cross-validation. The 2D structural alerts developed can be used as in silico models to predict these MIEs and as guidance for in vitro ToxCast assays to confirm hits. These models can target such experimental work, reducing the number of assays to be performed to gain required toxicological insight. Development of these models has also allowed some structural alerts to be identified as predictors for agonist or antagonist behavior at the receptor target. This work represents a first step in using computational methods to guide and target experimental approaches.
Affiliation(s)
- Timothy E H Allen
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
  - MRC Toxicology Unit, University of Cambridge, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, U.K
- Mark D Nelms
  - Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee 37830, United States
  - Integrated Systems Toxicology Division, National Health and Environmental Effects Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
- Stephen W Edwards
  - Integrated Systems Toxicology Division, National Health and Environmental Effects Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, North Carolina 27709, United States
- Jonathan M Goodman
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
- Steve Gutsell
  - Unilever Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, U.K
- Paul J Russell
  - Unilever Safety and Environmental Assurance Centre, Colworth Science Park, Sharnbrook, Bedfordshire MK44 1LQ, U.K
4
Mansouri K, Kleinstreuer N, Abdelaziz AM, Alberga D, Alves VM, Andersson PL, Andrade CH, Bai F, Balabin I, Ballabio D, Benfenati E, Bhhatarai B, Boyer S, Chen J, Consonni V, Farag S, Fourches D, García-Sosa AT, Gramatica P, Grisoni F, Grulke CM, Hong H, Horvath D, Hu X, Huang R, Jeliazkova N, Li J, Li X, Liu H, Manganelli S, Mangiatordi GF, Maran U, Marcou G, Martin T, Muratov E, Nguyen DT, Nicolotti O, Nikolov NG, Norinder U, Papa E, Petitjean M, Piir G, Pogodin P, Poroikov V, Qiao X, Richard AM, Roncaglioni A, Ruiz P, Rupakheti C, Sakkiah S, Sangion A, Schramm KW, Selvaraj C, Shah I, Sild S, Sun L, Taboureau O, Tang Y, Tetko IV, Todeschini R, Tong W, Trisciuzzi D, Tropsha A, Van Den Driessche G, Varnek A, Wang Z, Wedebye EB, Williams AJ, Xie H, Zakharov AV, Zheng Z, Judson RS. CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity. Environmental Health Perspectives 2020; 128:27002. [PMID: 32074470 DOI: 10.23645/epacomptox.5176876] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 05/24/2023]
Abstract
BACKGROUND: Endocrine disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling.
OBJECTIVES: In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) efforts, which follows the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP).
METHODS: The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays.
RESULTS: The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided averaged predictive accuracy of approximately 80% for the evaluation set.
DISCUSSION: The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ∼875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.
Affiliation(s)
- Kamel Mansouri
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
  - ScitoVation LLC, Research Triangle Park, North Carolina, USA
  - Integrated Laboratory Systems, Inc., Morrisville, North Carolina, USA
- Nicole Kleinstreuer
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
- Ahmed M Abdelaziz
  - Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
- Domenico Alberga
  - Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
- Vinicius M Alves
  - Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
  - Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Carolina H Andrade
  - Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
- Fang Bai
  - School of Pharmacy, Lanzhou University, China
- Ilya Balabin
  - Information Systems & Global Solutions (IS&GS), Lockheed Martin, USA
- Davide Ballabio
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Emilio Benfenati
  - Istituto di Ricerche Farmacologiche "Mario Negri", IRCCS, Milan, Italy
- Barun Bhhatarai
  - QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
- Scott Boyer
  - Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
- Jingwen Chen
  - School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Viviana Consonni
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Sherif Farag
  - Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Denis Fourches
  - Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
- Paola Gramatica
  - QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
- Francesca Grisoni
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Chris M Grulke
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Dragos Horvath
  - Laboratoire de Chémoinformatique-UMR7140, University of Strasbourg/CNRS, Strasbourg, France
- Xin Hu
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Ruili Huang
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Jiazhong Li
  - School of Pharmacy, Lanzhou University, China
- Xuehua Li
  - School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Serena Manganelli
  - Istituto di Ricerche Farmacologiche "Mario Negri", IRCCS, Milan, Italy
- Uko Maran
  - Institute of Chemistry, University of Tartu, Tartu, Estonia
- Gilles Marcou
  - Laboratoire de Chémoinformatique-UMR7140, University of Strasbourg/CNRS, Strasbourg, France
- Todd Martin
  - National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
- Eugene Muratov
  - Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Dac-Trung Nguyen
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Orazio Nicolotti
  - Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
- Nikolai G Nikolov
  - Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
- Ulf Norinder
  - Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
- Ester Papa
  - QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
- Michel Petitjean
  - Computational Modeling of Protein-Ligand Interactions (CMPLI)-INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Universite de Paris, Paris, France
- Geven Piir
  - Institute of Chemistry, University of Tartu, Tartu, Estonia
- Pavel Pogodin
  - Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
- Vladimir Poroikov
  - Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
- Xianliang Qiao
  - School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Ann M Richard
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- Patricia Ruiz
  - Computational Toxicology and Methods Development Laboratory, Division of Toxicology and Human Health Sciences, Agency for Toxic Substances and Disease Registry, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
- Chetan Rupakheti
  - National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
  - Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, Illinois, USA
- Sugunadevi Sakkiah
  - Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Alessandro Sangion
  - QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
- Karl-Werner Schramm
  - Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
- Chandrabose Selvaraj
  - Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Imran Shah
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- Sulev Sild
  - Institute of Chemistry, University of Tartu, Tartu, Estonia
- Lixia Sun
  - Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Olivier Taboureau
  - Computational Modeling of Protein-Ligand Interactions (CMPLI)-INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Universite de Paris, Paris, France
- Yun Tang
  - Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Igor V Tetko
  - BIGCHEM GmbH, Neuherberg, Germany
  - Helmholtz Zentrum Muenchen - German Research Center for Environmental Health (GmbH), Neuherberg, Germany
- Roberto Todeschini
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Alexander Tropsha
  - Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- George Van Den Driessche
  - Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
- Alexandre Varnek
  - Laboratoire de Chémoinformatique-UMR7140, University of Strasbourg/CNRS, Strasbourg, France
- Zhongyu Wang
  - School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Eva B Wedebye
  - Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
- Antony J Williams
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- Hongbin Xie
  - School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
- Alexey V Zakharov
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Ziye Zheng
  - Chemistry Department, Umeå University, Umeå, Sweden
- Richard S Judson
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
5
Mansouri K, Kleinstreuer N, Abdelaziz AM, Alberga D, Alves VM, Andersson PL, Andrade CH, Bai F, Balabin I, Ballabio D, Benfenati E, Bhhatarai B, Boyer S, Chen J, Consonni V, Farag S, Fourches D, García-Sosa AT, Gramatica P, Grisoni F, Grulke CM, Hong H, Horvath D, Hu X, Huang R, Jeliazkova N, Li J, Li X, Liu H, Manganelli S, Mangiatordi GF, Maran U, Marcou G, Martin T, Muratov E, Nguyen DT, Nicolotti O, Nikolov NG, Norinder U, Papa E, Petitjean M, Piir G, Pogodin P, Poroikov V, Qiao X, Richard AM, Roncaglioni A, Ruiz P, Rupakheti C, Sakkiah S, Sangion A, Schramm KW, Selvaraj C, Shah I, Sild S, Sun L, Taboureau O, Tang Y, Tetko IV, Todeschini R, Tong W, Trisciuzzi D, Tropsha A, Van Den Driessche G, Varnek A, Wang Z, Wedebye EB, Williams AJ, Xie H, Zakharov AV, Zheng Z, Judson RS. CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity. Environmental Health Perspectives 2020; 128:27002. [PMID: 32074470 PMCID: PMC7064318 DOI: 10.1289/ehp5580] [Citation(s) in RCA: 104] [Impact Index Per Article: 26.0] [Received: 05/06/2019] [Revised: 11/27/2019] [Accepted: 12/05/2019] [Indexed: 05/04/2023]
Abstract
BACKGROUND: Endocrine disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling.
OBJECTIVES: In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) efforts, which follows the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP).
METHODS: The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays.
RESULTS: The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided averaged predictive accuracy of approximately 80% for the evaluation set.
DISCUSSION: The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ∼875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.
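The idea of combining many single-model predictions into a consensus call can be illustrated with a minimal sketch. The abstract does not state how the CoMPARA consensus was computed, so the unweighted majority vote below is an assumption chosen for simplicity, and the names `consensus_call`, `consensus_accuracy`, and the toy predictions are hypothetical.

```python
from typing import Dict, List


def consensus_call(predictions: List[int]) -> int:
    """Return 1 (active) if strictly more than half of the models predict active, else 0."""
    return int(sum(predictions) * 2 > len(predictions))


def consensus_accuracy(model_preds: Dict[str, List[int]], truth: List[int]) -> float:
    """Fraction of chemicals whose majority-vote call matches the reference label.

    model_preds maps a (hypothetical) model name to its 0/1 calls, one per chemical,
    in the same order as truth.
    """
    calls = [
        consensus_call([preds[i] for preds in model_preds.values()])
        for i in range(len(truth))
    ]
    return sum(c == t for c, t in zip(calls, truth)) / len(truth)


# Toy data only: three models, four chemicals that are all active in the reference set.
toy_preds = {
    "model_1": [1, 0, 1, 0],
    "model_2": [1, 1, 0, 0],
    "model_3": [0, 1, 1, 0],
}
toy_accuracy = consensus_accuracy(toy_preds, [1, 1, 1, 1])
```

In the toy data each model is right on only two of four chemicals, while the majority vote is right on three, which is the kind of gain over single-model approaches that motivates a consensus. A weighted scheme would be a natural variation.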
Affiliation(s)
- Kamel Mansouri
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- ScitoVation LLC, Research Triangle Park, North Carolina, USA
- Integrated Laboratory Systems, Inc., Morrisville, North Carolina, USA
| | - Nicole Kleinstreuer
- National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
| | - Ahmed M. Abdelaziz
- Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
| | - Domenico Alberga
- Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
| | - Vinicius M. Alves
- Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | | | - Carolina H. Andrade
- Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
| | - Fang Bai
- School of Pharmacy, Lanzhou University, China
| | - Ilya Balabin
- Information Systems & Global Solutions (IS&GS), Lockheed Martin, USA
| | - Davide Ballabio
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Emilio Benfenati
- Istituto di Ricerche Farmacologiche “Mario Negri”, IRCCS, Milan, Italy
| | - Barun Bhhatarai
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Scott Boyer
- Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
| | - Jingwen Chen
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Viviana Consonni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Sherif Farag
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Denis Fourches
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
| | | | - Paola Gramatica
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Francesca Grisoni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Chris M. Grulke
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Dragos Horvath
- Laboratoire de Chémoinformatique—UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Xin Hu
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | | | - Jiazhong Li
- School of Pharmacy, Lanzhou University, China
| | - Xuehua Li
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | | | - Serena Manganelli
- Istituto di Ricerche Farmacologiche “Mario Negri”, IRCCS, Milan, Italy
| | | | - Uko Maran
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Gilles Marcou
- Laboratoire de Chémoinformatique—UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Todd Martin
- National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
| | - Eugene Muratov
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Dac-Trung Nguyen
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Orazio Nicolotti
- Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
| | - Nikolai G. Nikolov
- Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
| | - Ulf Norinder
- Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
| | - Ester Papa
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Michel Petitjean
- Computational Modeling of Protein-Ligand Interactions (CMPLI)–INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Université de Paris, Paris, France
| | - Geven Piir
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Pavel Pogodin
- Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
| | - Vladimir Poroikov
- Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
| | - Xianliang Qiao
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Ann M. Richard
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Patricia Ruiz
- Computational Toxicology and Methods Development Laboratory, Division of Toxicology and Human Health Sciences, Agency for Toxic Substances and Disease Registry, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Chetan Rupakheti
- National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, Illinois, USA
| | - Sugunadevi Sakkiah
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Alessandro Sangion
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Karl-Werner Schramm
- Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
| | - Chandrabose Selvaraj
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Imran Shah
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Sulev Sild
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Lixia Sun
- Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Olivier Taboureau
- Computational Modeling of Protein-Ligand Interactions (CMPLI)–INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Université de Paris, Paris, France
| | - Yun Tang
- Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Igor V. Tetko
- BIGCHEM GmbH, Neuherberg, Germany
- Helmholtz Zentrum Muenchen – German Research Center for Environmental Health (GmbH), Neuherberg, Germany
| | - Roberto Todeschini
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Alexander Tropsha
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - George Van Den Driessche
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique—UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Zhongyu Wang
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Eva B. Wedebye
- Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
| | - Antony J. Williams
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Hongbin Xie
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Alexey V. Zakharov
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Ziye Zheng
- Chemistry Department, Umeå University, Umeå, Sweden
| | - Richard S. Judson
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| |
|