1
Lunghini F, Fava A, Pisapia V, Sacco F, Iaconis D, Beccari AR. ProfhEX: AI-based platform for small molecules liability profiling. J Cheminform 2023; 15:60. [PMID: 37296454 DOI: 10.1186/s13321-023-00728-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 05/28/2023] [Indexed: 06/12/2023] Open
Abstract
Off-target drug interactions are a major reason for candidate failure in the drug discovery process. Anticipating a drug's potential adverse effects in the early stages is necessary to minimize health risks to patients, animal testing, and economic costs. With the constantly increasing size of virtual screening libraries, AI-driven methods can be exploited as first-tier screening tools to provide liability estimation for drug candidates. In this work we present ProfhEX, an AI-driven suite of 46 OECD-compliant machine learning models that can profile small molecules on 7 relevant liability groups: cardiovascular, central nervous system, gastrointestinal, endocrine, renal, pulmonary and immune system toxicities. Experimental affinity data were collected from public and commercial data sources. The entire chemical space comprised 289,202 activity data points for a total of 210,116 unique compounds, spanning 46 targets with dataset sizes ranging from 819 to 18,896. Gradient boosting and random forest algorithms were initially employed and ensembled for the selection of a champion model. Models were validated according to the OECD principles, including robust internal (cross validation, bootstrap, y-scrambling) and external validation. Champion models achieved an average Pearson correlation coefficient of 0.84 (SD of 0.05), a coefficient of determination (R²) of 0.68 (SD of 0.1) and a root mean squared error of 0.69 (SD of 0.08). All liability groups showed good hit-detection power with an average enrichment factor at 5% of 13.1 (SD of 4.5) and AUC of 0.92 (SD of 0.05). Benchmarking against existing tools demonstrated the predictive power of ProfhEX models for large-scale liability profiling. This platform will be further expanded with the inclusion of new targets and through complementary modelling approaches, such as structure and pharmacophore-based models. ProfhEX is freely accessible at the following address: https://profhex.exscalate.eu/ .
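The enrichment factor at 5% quoted in the abstract can be illustrated with a minimal sketch. It assumes the common definition (hit rate among the top-scored fraction divided by the overall hit rate); the abstract does not spell out the formula the authors used, and the function name, arguments, and example data below are hypothetical.

```python
def enrichment_factor(scores, is_active, fraction=0.05):
    """Enrichment factor at `fraction` (assumed standard definition).

    scores    -- predicted values, higher meaning "more likely a hit"
    is_active -- 1/0 (or True/False) labels, same order as scores
    """
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equal length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")

    # Size of the top-ranked slice, never smaller than one compound.
    n_top = max(1, round(fraction * n_total))

    # Rank compounds from highest to lowest score and count hits in the slice.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)

    return (hits_in_top / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Hypothetical library: 100 compounds, 5 actives, all ranked first.
    example_scores = [100 - i for i in range(100)]
    example_labels = [1] * 5 + [0] * 95
    print(enrichment_factor(example_scores, example_labels))
```

Under this definition a value of 1 corresponds to random ranking, so larger values indicate that hits are concentrated at the top of the ranked list.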
Affiliation(s)
- Filippo Lunghini
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123, Naples, Italy
- Anna Fava
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123, Naples, Italy
- Vincenzo Pisapia
  - Professional Service Department, SAS Institute, Via Darwin 20/22, 20143, Milan, Italy
- Francesco Sacco
  - Professional Service Department, SAS Institute, Via Darwin 20/22, 20143, Milan, Italy
- Daniela Iaconis
  - EXSCALATE, Dompé Farmaceutici SpA, Via Tommaso de Amicis 95, 80123, Naples, Italy
2
Wright PSR, Smith GF, Briggs KA, Thomas R, Maglennon G, Mikulskis P, Chapman M, Greene N, Phillips BU, Bender A. Retrospective analysis of the potential use of virtual control groups in preclinical toxicity assessment using the eTOX database. Regul Toxicol Pharmacol 2023; 138:105309. [PMID: 36481280 DOI: 10.1016/j.yrtph.2022.105309] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/08/2022] [Revised: 11/17/2022] [Accepted: 11/25/2022] [Indexed: 12/12/2022]
Abstract
Virtual Control Groups (VCGs) based on Historical Control Data (HCD) in preclinical toxicity testing have the potential to reduce animal usage. As a case study we retrospectively analyzed the impact of replacing Concurrent Control Groups (CCGs) with VCGs on the treatment-relatedness of 28 selected histopathological findings reported in either rat or dog in the eTOX database. We developed a novel methodology whereby statistical predictions of treatment-relatedness using either CCGs or VCGs of varying covariate similarity to CCGs were compared to designations from original toxicologist reports; and changes in agreement were used to quantify changes in study outcomes. Generally, the best agreement was achieved when CCGs were replaced with VCGs with the highest level of similarity; the same species, strain, sex, administration route, and vehicle. For example, balanced accuracies for rat findings were 0.704 (predictions based on CCGs) vs. 0.702 (predictions based on VCGs). Moreover, we identified covariates which resulted in poorer identification of treatment-relatedness. This was related to an increasing incidence rate divergence in HCD relative to CCGs. Future databases which collect data at the individual animal level including study details such as animal age and testing facility are required to build adequate VCGs to accurately identify treatment-related effects.
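The balanced accuracies compared in the abstract can be read against a small sketch of the metric. It assumes the usual definition (the mean of sensitivity and specificity); the abstract does not state the exact computation, and the labels in the example are hypothetical, not taken from eTOX.

```python
def balanced_accuracy(reported, predicted):
    """Mean of sensitivity and specificity for binary labels.

    reported  -- 1 if the original toxicologist report called the finding
                 treatment-related, else 0 (hypothetical encoding)
    predicted -- 1 if the statistical test called it treatment-related, else 0
    """
    pairs = list(zip(reported, predicted))
    tp = sum(1 for r, p in pairs if r and p)
    fn = sum(1 for r, p in pairs if r and not p)
    tn = sum(1 for r, p in pairs if not r and not p)
    fp = sum(1 for r, p in pairs if not r and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in `reported`")

    sensitivity = tp / (tp + fn)  # recall on treatment-related findings
    specificity = tn / (tn + fp)  # recall on non-treatment-related findings
    return 0.5 * (sensitivity + specificity)


if __name__ == "__main__":
    # Hypothetical designations for ten study findings.
    report = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    with_virtual_controls = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    print(balanced_accuracy(report, with_virtual_controls))
```

Because each class contributes equally, the metric is not inflated when treatment-related findings are much rarer than unrelated ones.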
Affiliation(s)
- Graham F Smith
  - AstraZeneca, Data Science and AI, Clinical Pharmacology and Safety Sciences, R&D, Cambridge, United Kingdom
- Gareth Maglennon
  - AstraZeneca, Oncology Pathology, Clinical Pharmacology and Safety Sciences, R&D, Melbourn, United Kingdom
- Paulius Mikulskis
  - AstraZeneca, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, Gothenburg, Sweden
- Melissa Chapman
  - AstraZeneca, Toxicology, Clinical Pharmacology and Safety Sciences, R&D, Melbourn, United Kingdom
- Nigel Greene
  - AstraZeneca, Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, Waltham, MA, USA
- Benjamin U Phillips
  - AstraZeneca, Data Sciences and Quantitative Biology, Discovery Sciences, Cambridge Biomedical Campus, Cambridge, United Kingdom
- Andreas Bender
  - University of Cambridge, Chemistry, Cambridge, United Kingdom
3
Kolmar SS, Grulke CM. The effect of noise on the predictive limit of QSAR models. J Cheminform 2021; 13:92. [PMID: 34823605 PMCID: PMC8613965 DOI: 10.1186/s13321-021-00571-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/15/2021] [Accepted: 11/14/2021] [Indexed: 01/09/2023] Open
Abstract
A key challenge in the field of Quantitative Structure Activity Relationships (QSAR) is how to effectively treat experimental error in the training and evaluation of computational models. It is often assumed in the field of QSAR that models cannot produce predictions which are more accurate than their training data. Additionally, it is implicitly assumed, by necessity, that data points in test sets or validation sets do not contain error, and that each data point is a population mean. This work proposes the hypothesis that QSAR models can make predictions which are more accurate than their training data and that the error-free test set assumption leads to a significant misevaluation of model performance. This work used 8 datasets with six different common QSAR endpoints, because different endpoints should have different amounts of experimental error associated with varying complexity of the measurements. Up to 15 levels of simulated Gaussian distributed random error were added to the datasets, and models were built on the error-laden datasets using five different algorithms. The models were trained on the error-laden data, evaluated on error-laden test sets, and evaluated on error-free test sets. The results show that for each level of added error, the RMSE for evaluation on the error-free test sets was always better. The results support the hypothesis that, at least under the conditions of Gaussian distributed random error, QSAR models can make predictions which are more accurate than their training data, and that the evaluation of models on error-laden test and validation sets may give a flawed measure of model performance. These results have implications for how QSAR models are evaluated, especially for disciplines where experimental error is very large, such as in computational toxicology.
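The experimental design described in the abstract can be mimicked with a toy simulation. This is only a sketch under stated assumptions: a hypothetical one-descriptor linear "truth" and an ordinary least squares fit stand in for the real datasets and the five algorithms used in the paper, and all names and parameter values are illustrative.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def noise_experiment(sigma, n_train=200, n_test=200, seed=0):
    """Train on error-laden data; score on error-laden and error-free test sets."""
    rng = random.Random(seed)

    def truth(x):
        # Hypothetical error-free structure-activity relationship.
        return 2.0 * x + 1.0

    x_train = [rng.uniform(0.0, 10.0) for _ in range(n_train)]
    y_train = [truth(x) + rng.gauss(0.0, sigma) for x in x_train]

    x_test = [rng.uniform(0.0, 10.0) for _ in range(n_test)]
    y_test_clean = [truth(x) for x in x_test]
    y_test_noisy = [y + rng.gauss(0.0, sigma) for y in y_test_clean]

    slope, intercept = fit_line(x_train, y_train)
    predictions = [slope * x + intercept for x in x_test]
    return rmse(predictions, y_test_noisy), rmse(predictions, y_test_clean)


if __name__ == "__main__":
    for sigma in (0.5, 1.0, 2.0):
        noisy, clean = noise_experiment(sigma)
        print(f"sigma={sigma}: RMSE vs noisy test={noisy:.3f}, vs error-free test={clean:.3f}")
```

In this toy setting the model averages over many noisy training points, so its error against the error-free test values is expected to fall below the noise level `sigma` of any single training measurement, which is the effect the abstract hypothesizes.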
Affiliation(s)
- Scott S Kolmar
  - Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC, USA
- Christopher M Grulke
  - Center for Computational Toxicology and Exposure, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC, USA
4
Pradeep P, Friedman KP, Judson R. Structure-based QSAR Models to Predict Repeat Dose Toxicity Points of Departure. Comput Toxicol 2020; 16:100139. [PMID: 34017928 DOI: 10.1016/j.comtox.2020.100139] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 10/23/2022]
Abstract
Human health risk assessment for environmental chemical exposure is limited by a vast majority of chemicals with little or no experimental in vivo toxicity data. Data gap filling techniques, such as quantitative structure activity relationship (QSAR) models based on chemical structure information, can predict hazard in the absence of experimental data. Risk assessment requires identification of a quantitative point-of-departure (POD) value, the point on the dose-response curve that marks the beginning of a low-dose extrapolation. This study presents two sets of QSAR models to predict POD values (PODQSAR) for repeat dose toxicity. For training and validation, a publicly available in vivo toxicity dataset for 3592 chemicals was compiled using the U.S. Environmental Protection Agency's Toxicity Value database (ToxValDB). The first set of QSAR models predict point-estimates of POD values (PODQSAR) using structural and physicochemical descriptors for repeat dose study types and species combinations. A random forest QSAR model using study type and species as descriptors showed the best performance, with an external test set root mean square error (RMSE) of 0.71 log10-mg/kg/day and coefficient of determination (R2) of 0.53. The second set of QSAR models predict the 95% confidence intervals for PODQSAR using a constructed POD distribution with a mean equal to the median POD value and a standard deviation of 0.5 log10-mg/kg/day, based on previously published typical study-to-study variability that may lead to uncertainty in model predictions. Bootstrap resampling of the pre-generated POD distribution was used to derive point-estimates and 95% confidence intervals for each POD prediction. 
Enrichment analysis to evaluate the accuracy of PODQSAR showed that 80% of the 5% most potent chemicals were found in the top 20% of the most potent chemical predictions, suggesting that the repeat dose POD QSAR models presented here may help inform screening level human health risk assessments in the absence of other data.
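The enrichment statement in the abstract (the share of the 5% most potent chemicals that fall within the top 20% of predictions) can be sketched as a simple overlap calculation. The function below is a hypothetical illustration, assuming that lower POD values mean higher potency; it is not the authors' code, and the example values are made up.

```python
def potent_recovery(observed_pod, predicted_pod, observed_fraction=0.05, predicted_fraction=0.20):
    """Fraction of the most potent observed chemicals found among the most potent predictions.

    observed_pod  -- observed POD values (e.g. log10 mg/kg/day); lower = more potent
    predicted_pod -- predicted POD values in the same order and units
    """
    if len(observed_pod) != len(predicted_pod) or not observed_pod:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed_pod)
    n_observed = max(1, round(observed_fraction * n))
    n_predicted = max(1, round(predicted_fraction * n))

    # Indices sorted from most potent (lowest POD) to least potent.
    by_observed = sorted(range(n), key=lambda i: observed_pod[i])
    by_predicted = sorted(range(n), key=lambda i: predicted_pod[i])

    most_potent = set(by_observed[:n_observed])
    flagged = set(by_predicted[:n_predicted])
    return len(most_potent & flagged) / n_observed


if __name__ == "__main__":
    # Hypothetical data: 100 chemicals, predictions equal to observations
    # except that the single most potent chemical is badly mispredicted.
    observed = [float(i) for i in range(100)]
    predicted = list(observed)
    predicted[0] = 1000.0
    print(potent_recovery(observed, predicted))
```

A screening-level use would treat the flagged 20% as the subset to prioritize, so this ratio describes how many of the truly most potent chemicals that subset would capture.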
Affiliation(s)
- Prachi Pradeep
  - Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee; Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
- Katie Paul Friedman
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
- Richard Judson
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
5
Ly Pham L, Watford S, Pradeep P, Martin MT, Thomas R, Judson R, Setzer RW, Paul Friedman K. Variability in in vivo studies: Defining the upper limit of performance for predictions of systemic effect levels. Comput Toxicol 2020; 15:100126. [PMID: 33426408 PMCID: PMC7787987 DOI: 10.1016/j.comtox.2020.100126] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 11/25/2022]
Abstract
New approach methodologies (NAMs) for chemical hazard assessment are often evaluated via comparison to animal studies; however, variability in animal study data limits NAM accuracy. The US EPA Toxicity Reference Database (ToxRefDB) enables consideration of variability in effect levels, including the lowest effect level (LEL) for a treatment-related effect and the lowest observable adverse effect level (LOAEL) defined by expert review, from subacute, subchronic, chronic, multi-generation reproductive, and developmental toxicity studies. The objectives of this work were to quantify the variance within systemic LEL and LOAEL values, defined as potency values for effects in adult or parental animals only, and to estimate the upper limit of NAM prediction accuracy. Multiple linear regression (MLR) and augmented cell means (ACM) models were used to quantify the total variance, and the fraction of variance in systemic LEL and LOAEL values explained by available study descriptors (e.g., administration route, study type). The MLR approach considered each study descriptor as an independent contributor to variance, whereas the ACM approach combined categorical descriptors into cells to define replicates. Using these approaches, total variance in systemic LEL and LOAEL values (in log10-mg/kg/day units) ranged from 0.74 to 0.92. Unexplained variance in LEL and LOAEL values, approximated by the residual mean square error (MSE), ranged from 0.20-0.39. Considering subchronic, chronic, or developmental study designs separately resulted in similar values. Based on the relationship between MSE and R-squared for goodness-of-fit, the maximal R-squared may approach 55 to 73% for a NAM-based predictive model of systemic toxicity using these data as reference. 
The root mean square error (RMSE) ranged from 0.47 to 0.63 log10-mg/kg/day, depending on dataset and regression approach, suggesting that a two-sided minimum prediction interval for systemic effect levels may have a width of 58 to 284-fold. These findings suggest quantitative considerations for building scientific confidence in NAM-based systemic toxicity predictions.
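The two quantitative bounds in the abstract can be approximated with a short calculation. This sketch assumes that the R-squared ceiling is 1 minus the ratio of unexplained variance (MSE) to total variance, and that a two-sided 95% prediction interval spans roughly plus or minus 1.96 RMSE in log10 units; neither formula is spelled out in the abstract, and the function names are hypothetical. Because the abstract reports rounded inputs, the outputs land near the reported ranges without reproducing them exactly.

```python
def max_r_squared(residual_mse, total_variance):
    """Upper bound on R-squared when `residual_mse` of the variance is unexplainable."""
    if total_variance <= 0:
        raise ValueError("total_variance must be positive")
    return 1.0 - residual_mse / total_variance


def interval_fold_width(rmse_log10, z=1.96):
    """Width of a two-sided interval (+/- z * RMSE in log10 units) as a fold range."""
    return 10.0 ** (2.0 * z * rmse_log10)


if __name__ == "__main__":
    # Inputs taken from the rounded figures quoted in the abstract.
    print(f"R-squared ceiling for MSE 0.20, variance 0.74: {max_r_squared(0.20, 0.74):.2f}")
    for rmse_value in (0.47, 0.63):
        print(f"RMSE {rmse_value}: interval spans about {interval_fold_width(rmse_value):.0f}-fold")
```

The practical reading is that a predictive model evaluated against these in vivo data cannot be expected to explain all of the variance, since part of it reflects study-to-study variability rather than chemistry.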
Affiliation(s)
- Ly Ly Pham
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
  - Oak Ridge Institute for Science and Education, 100 ORAU Way, Oak Ridge, TN 37830
- Sean Watford
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
  - ORAU, contractor to U.S. Environmental Protection Agency through the National Student Services Contract, 100 ORAU Way, Oak Ridge, TN 37830
- Prachi Pradeep
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
  - Oak Ridge Institute for Science and Education, 100 ORAU Way, Oak Ridge, TN 37830
- Matthew T. Martin
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
  - Currently at Global Investigative Toxicology, Drug Safety Research and Development, Pfizer Inc., 445 Eastern Point Road, Groton, CT 06340
- Russell Thomas
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- Richard Judson
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- R. Woodrow Setzer
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
- Katie Paul Friedman
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, 27711, USA
6
Sheffield TY, Judson RS. Ensemble QSAR Modeling to Predict Multispecies Fish Toxicity Lethal Concentrations and Points of Departure. Environ Sci Technol 2019; 53:12793-12802. [PMID: 31560848 PMCID: PMC7047609 DOI: 10.1021/acs.est.9b03957] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 06/10/2023]
Abstract
QSAR modeling can be used to aid testing prioritization of the thousands of chemical substances for which no ecological toxicity data are available. We drew on the U.S. Environmental Protection Agency's ECOTOX database with additional data from ECHA to build a large data set containing in vivo test data on fish for thousands of chemical substances. This was used to create QSAR models to predict two types of end points: acute LC50 (median lethal concentration) and points of departure similar to the NOEC (no observed effect concentration) for any duration (named the "LC50" and "NOEC" models, respectively). These models used study covariates, such as species and exposure route, as features to facilitate the simultaneous use of varied data types. A novel method of substituting taxonomy groups for species dummy variables was introduced to maximize generalizability to different species. A stacked ensemble of three machine learning methods (random forest, gradient boosted trees, and support vector regression) was implemented to best make use of a large data set with many descriptors. The LC50 and NOEC models predicted end points within 1 order of magnitude 81% and 76% of the time, respectively, and had RMSEs of roughly 0.83 and 0.98 log10(mg/L), respectively. Benchmarks against the existing TEST and ECOSAR tools suggest improved prediction accuracy.
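The two accuracy figures reported in the abstract (share of predictions within 1 order of magnitude, and RMSE in log10(mg/L)) can be computed as in the following sketch. The function name and the concentrations in the example are hypothetical, and the sketch assumes both metrics are taken on log10-transformed concentrations.

```python
import math


def log10_accuracy(observed_mg_per_l, predicted_mg_per_l):
    """Return (RMSE in log10 units, fraction of predictions within one order of magnitude)."""
    if len(observed_mg_per_l) != len(predicted_mg_per_l) or not observed_mg_per_l:
        raise ValueError("inputs must be non-empty and of equal length")

    # Signed prediction errors on the log10 scale.
    errors = [
        math.log10(pred) - math.log10(obs)
        for obs, pred in zip(observed_mg_per_l, predicted_mg_per_l)
    ]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    within_one_order = sum(1 for e in errors if abs(e) <= 1.0) / len(errors)
    return rmse, within_one_order


if __name__ == "__main__":
    # Hypothetical LC50 values in mg/L for four chemicals.
    observed = [1.0, 10.0, 100.0, 1000.0]
    predicted = [1.0, 50.0, 100.0, 1.0]
    rmse_value, fraction = log10_accuracy(observed, predicted)
    print(f"RMSE = {rmse_value:.2f} log10(mg/L); within one order of magnitude: {fraction:.0%}")
```

Reporting both numbers is useful because a few very large misses can dominate the RMSE while leaving the within-one-order fraction largely unchanged.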
Affiliation(s)
- Thomas Y. Sheffield
  - U.S. Department of Energy, Oak Ridge Institute for Science and Education, Oak Ridge, TN, 37830, USA
- Richard S. Judson
  - U.S. Environmental Protection Agency, National Center for Computational Toxicology, Research Triangle Park, NC, 27709, USA
7
Watford S, Ly Pham L, Wignall J, Shin R, Martin MT, Friedman KP. ToxRefDB version 2.0: Improved utility for predictive and retrospective toxicology analyses. Reprod Toxicol 2019; 89:145-158. [PMID: 31340180 PMCID: PMC6944327 DOI: 10.1016/j.reprotox.2019.07.012] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Received: 02/27/2019] [Revised: 05/31/2019] [Accepted: 07/12/2019] [Indexed: 02/08/2023]
Abstract
The Toxicity Reference Database (ToxRefDB) structures information from over 5000 in vivo toxicity studies, conducted largely to guidelines or specifications from the US Environmental Protection Agency and the National Toxicology Program, into a public resource for training and validation of predictive models. Herein, ToxRefDB version 2.0 (ToxRefDBv2) development is described. Endpoints were annotated (e.g. required, not required) according to guidelines for subacute, subchronic, chronic, developmental, and multigenerational reproductive designs, distinguishing negative responses from untested. Quantitative data were extracted, and dose-response models for nearly 28,000 datasets from nearly 400 endpoints were generated using Benchmark Dose (BMD) Modeling Software and stored. Implementation of controlled vocabulary improved data quality; standardization to guideline requirements and cross-referencing with the Unified Medical Language System (UMLS) connect ToxRefDBv2 observations to vocabularies linked to UMLS, including PubMed medical subject headings. ToxRefDBv2 allows for increased connections to other resources and has greatly enhanced quantitative and qualitative utility for predictive toxicology.
Affiliation(s)
- Sean Watford
  - ORAU, Contractor to U.S. Environmental Protection Agency through the National Student Services Contract, United States; National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency, United States
- Ly Ly Pham
  - ORAU, Contractor to U.S. Environmental Protection Agency through the National Student Services Contract, United States; ORISE Postdoctoral Research Participant, United States
- Matthew T Martin
  - ORAU, Contractor to U.S. Environmental Protection Agency through the National Student Services Contract, United States; Currently at Drug Safety Research and Development, Global Investigative Toxicology, Pfizer, Groton, CT, United States
- Katie Paul Friedman
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency, United States
8
Thomas RS, Bahadori T, Buckley TJ, Cowden J, Deisenroth C, Dionisio KL, Frithsen JB, Grulke CM, Gwinn MR, Harrill JA, Higuchi M, Houck KA, Hughes MF, Hunter ES, Isaacs KK, Judson RS, Knudsen TB, Lambert JC, Linnenbrink M, Martin TM, Newton SR, Padilla S, Patlewicz G, Paul-Friedman K, Phillips KA, Richard AM, Sams R, Shafer TJ, Setzer RW, Shah I, Simmons JE, Simmons SO, Singh A, Sobus JR, Strynar M, Swank A, Tornero-Valez R, Ulrich EM, Villeneuve DL, Wambaugh JF, Wetmore BA, Williams AJ. The Next Generation Blueprint of Computational Toxicology at the U.S. Environmental Protection Agency. Toxicol Sci 2019; 169:317-332. [PMID: 30835285 PMCID: PMC6542711 DOI: 10.1093/toxsci/kfz058] [Citation(s) in RCA: 211] [Impact Index Per Article: 42.2] [Indexed: 12/15/2022] Open
Abstract
The U.S. Environmental Protection Agency (EPA) is faced with the challenge of efficiently and credibly evaluating chemical safety often with limited or no available toxicity data. The expanding number of chemicals found in commerce and the environment, coupled with time and resource requirements for traditional toxicity testing and exposure characterization, continue to underscore the need for new approaches. In 2005, EPA charted a new course to address this challenge by embracing computational toxicology (CompTox) and investing in the technologies and capabilities to push the field forward. The return on this investment has been demonstrated through results and applications across a range of human and environmental health problems, as well as initial application to regulatory decision-making within programs such as the EPA's Endocrine Disruptor Screening Program. The CompTox initiative at EPA is more than a decade old. This manuscript presents a blueprint to guide the strategic and operational direction over the next 5 years. The primary goal is to obtain broader acceptance of the CompTox approaches for application to higher tier regulatory decisions, such as chemical assessments. To achieve this goal, the blueprint expands and refines the use of high-throughput and computational modeling approaches to transform the components in chemical risk assessment, while systematically addressing key challenges that have hindered progress. In addition, the blueprint outlines additional investments in cross-cutting efforts to characterize uncertainty and variability, develop software and information technology tools, provide outreach and training, and establish scientific confidence for application to different public health and environmental regulatory decisions.
Affiliation(s)
- Russell S. Thomas
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Tina Bahadori
  - National Center for Environmental Assessment, Office of Research and Development, US Environmental Protection Agency
- Timothy J. Buckley
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- John Cowden
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Chad Deisenroth
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Kathie L. Dionisio
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Jeffrey B. Frithsen
  - Chemical Safety for Sustainability National Research Program, Office of Research and Development, US Environmental Protection Agency
- Christopher M. Grulke
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Maureen R. Gwinn
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Joshua A. Harrill
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Mark Higuchi
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Keith A. Houck
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Michael F. Hughes
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- E. Sidney Hunter
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Kristin K. Isaacs
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Richard S. Judson
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Thomas B. Knudsen
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Jason C. Lambert
  - National Center for Environmental Assessment, Office of Research and Development, US Environmental Protection Agency
- Monica Linnenbrink
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Todd M. Martin
  - National Risk Management Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Seth R. Newton
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Stephanie Padilla
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Grace Patlewicz
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Katie Paul-Friedman
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Katherine A. Phillips
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Ann M. Richard
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Reeder Sams
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Timothy J. Shafer
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- R. Woodrow Setzer
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Imran Shah
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Jane E. Simmons
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Steven O. Simmons
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Amar Singh
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Jon R. Sobus
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Mark Strynar
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Adam Swank
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Rogelio Tornero-Valez
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Elin M. Ulrich
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Daniel L Villeneuve
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- John F. Wambaugh
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Barbara A. Wetmore
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Antony J. Williams
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
9
Pinches MD, Thomas R, Porter R, Camidge L, Briggs K. Curation and analysis of clinical pathology parameters and histopathologic findings from eTOXsys, a large database project (eTOX) for toxicologic studies. Regul Toxicol Pharmacol 2019; 107:104396. [PMID: 31128168 DOI: 10.1016/j.yrtph.2019.05.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/01/2019] [Revised: 05/07/2019] [Accepted: 05/21/2019] [Indexed: 11/30/2022]
Abstract
Large data sharing projects amongst the pharmaceutical industry have the potential to generate new insights using data on a scale that has not been previously available. A retrospective analysis of the preclinical toxicology data collected as part of the eTOX project was conducted with the aim to provide background rates and treatment-related value analysis on both clinical pathology and histopathology datasets. Incorporated into this analysis was an extensive data consolidation task to standardise all data. Reference intervals for common clinical pathology parameters in rat and dog were generated, alongside background histopathology incidence rates in the liver, heart and kidney. Systematically applied decision thresholds allowed consistent relabelling of data points considered anomalous, and maximum fold change estimates. Relabelling of anomalous data points was conducted for the histopathology data using a Bayesian model to identify dose-dependent increases in pathologies. The results of this study allow: newly generated data to be analysed using the same methodology, rates and distributions to be used when building predictive dose-response models, and the possibility to correlate clinical pathology findings with concurrent histopathology findings. In the first half of this paper we discuss data curation, in the second half we report on the analytical methods and results.
Affiliation(s)
- Mark D Pinches
  - Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK
- Robert Thomas
  - Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK
- Rosemary Porter
  - Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK
- Lucinda Camidge
  - Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK
- Katharine Briggs
  - Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds, LS11 5PS, UK