1
Seal S, Williams D, Hosseini-Gerami L, Mahale M, Carpenter AE, Spjuth O, Bender A. Improved Detection of Drug-Induced Liver Injury by Integrating Predicted In Vivo and In Vitro Data. Chem Res Toxicol 2024; 37:1290-1305. [PMID: 38981058 PMCID: PMC11337212 DOI: 10.1021/acs.chemrestox.4c00015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 06/27/2024] [Accepted: 07/01/2024] [Indexed: 07/11/2024]
Abstract
Drug-induced liver injury (DILI) has been a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. Over the last decade, the existing suite of in vitro proxy-DILI assays has generally improved at identifying compounds with hepatotoxicity. However, there is considerable interest in enhancing the in silico prediction of DILI because it allows for evaluating large sets of compounds more quickly and cost-effectively, particularly in the early stages of projects. In this study, we examine ML models for DILI prediction that first predict nine proxy-DILI labels and then use them as features in addition to chemical structural features to predict DILI. The features include in vitro (e.g., mitochondrial toxicity, bile salt export pump inhibition) data, in vivo (e.g., preclinical rat hepatotoxicity studies) data, pharmacokinetic parameters of maximum concentration, structural fingerprints, and physicochemical parameters. We trained DILI-prediction models on 888 compounds from the DILI data set (composed of DILIst and DILIrank) and tested them on a held-out external test set of 223 compounds from the DILI data set. The best model, DILIPredictor, attained an AUC-PR of 0.79. This model enabled the detection of the top 25 toxic compounds (2.68 LR+, positive likelihood ratio) compared to models using only structural features (1.65 LR+ score). Using feature interpretation from DILIPredictor, we identified the chemical substructures causing DILI and differentiated cases of DILI caused by compounds in animals but not in humans. For example, DILIPredictor correctly recognized 2-butoxyethanol as nontoxic in humans despite its hepatotoxicity in mouse models. Overall, the DILIPredictor model improves the detection of compounds causing DILI with an improved differentiation between animal and human sensitivity and the potential for mechanism evaluation.
DILIPredictor required only chemical structures as input for prediction and is publicly available at https://broad.io/DILIPredictor for use via web interface and with all code available for download.
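The abstract reports model quality as a positive likelihood ratio (LR+). As a minimal sketch of the standard definition, LR+ = sensitivity / (1 - specificity), the snippet below computes it from binary labels. The function name and the toy labels are illustrative assumptions; they are not the authors' code or data, and the abstract does not state exactly how the top-25 ranking was turned into an LR+ value.

```python
def positive_likelihood_ratio(y_true, y_pred):
    """Return LR+ = sensitivity / (1 - specificity) for binary labels (1 = toxic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    if false_positive_rate == 0:
        return float("inf")  # no false positives: LR+ is unbounded
    return sensitivity / false_positive_rate


# Toy labels for illustration only (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
lr_plus = positive_likelihood_ratio(y_true, y_pred)
print(f"LR+ = {lr_plus:.2f}")
```

An LR+ above 1 means a positive prediction is more likely for a truly toxic compound than for a nontoxic one; larger values indicate stronger enrichment.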
Affiliation(s)
- Srijit Seal
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, United Kingdom
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, United States
| | - Dominic Williams
- Safety Innovation, Clinical Pharmacology and Safety Sciences, AstraZeneca, Cambridge CB4 0FZ, United Kingdom
- Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, United Kingdom
| | - Layla Hosseini-Gerami
- Ignota Laboratories, County Hall, Westminster Bridge Rd, London SE1 7PB, United Kingdom
| | - Manas Mahale
- Bombay College of Pharmacy Kalina Santacruz (E), Mumbai 400 098, India
| | - Anne E. Carpenter
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02141, United States
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, Uppsala SE-75124, Sweden
| | - Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge CB2 1EW, United Kingdom
2
Dee W, Sequeira I, Lobley A, Slabaugh G. Cell-vision fusion: A Swin transformer-based approach for predicting kinase inhibitor mechanism of action from Cell Painting data. iScience 2024; 27:110511. [PMID: 39175778 PMCID: PMC11340608 DOI: 10.1016/j.isci.2024.110511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 04/08/2024] [Accepted: 07/11/2024] [Indexed: 08/24/2024] Open
Abstract
Image-based profiling of the cellular response to drug compounds has proven effective at characterizing the morphological changes resulting from perturbation experiments. As data availability increases, however, there are growing demands for novel deep-learning methods. We applied the SwinV2 computer vision architecture to predict the mechanism of action of 10 kinase inhibitor compounds directly from Cell Painting images. This method outperforms the standard approach of using image-based profiles (IBPs), which are multidimensional feature set representations generated by bioimaging software. Furthermore, our fusion approach (cell-vision fusion, which combines three data modalities: images, IBPs, and chemical structures) achieved 69.79% accuracy and 70.56% F1 score, 4.20% and 5.49% higher, respectively, than the best-performing IBP method. We provide three techniques, specific to Cell Painting images, which enable deep-learning architectures to train effectively and demonstrate approaches to combat the significant batch effects present in large Cell Painting datasets.
Affiliation(s)
- William Dee
- Digital Environment Research Institute (DERI), Queen Mary University of London, London E1 1HH, UK
- Centre for Oral Immunobiology and Regenerative Medicine, Barts Centre for Squamous Cancer, Institute of Dentistry, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AD, UK
- Exscientia Plc, The Schrödinger Building Oxford Science Park, Oxford OX4 4GE, UK
| | - Ines Sequeira
- Centre for Oral Immunobiology and Regenerative Medicine, Barts Centre for Squamous Cancer, Institute of Dentistry, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London E1 2AD, UK
| | - Anna Lobley
- Exscientia Plc, The Schrödinger Building Oxford Science Park, Oxford OX4 4GE, UK
| | - Gregory Slabaugh
- Digital Environment Research Institute (DERI), Queen Mary University of London, London E1 1HH, UK
3
Odje F, Meijer D, von Coburg E, van der Hooft JJJ, Dunst S, Medema MH, Volkamer A. Unleashing the potential of cell painting assays for compound activities and hazards prediction. FRONTIERS IN TOXICOLOGY 2024; 6:1401036. [PMID: 39086553 PMCID: PMC11288911 DOI: 10.3389/ftox.2024.1401036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Accepted: 06/14/2024] [Indexed: 08/02/2024] Open
Abstract
The cell painting (CP) assay has emerged as a potent imaging-based high-throughput phenotypic profiling (HTPP) tool that provides comprehensive input data for in silico prediction of compound activities and potential hazards in drug discovery and toxicology. CP enables the rapid, multiplexed investigation of various molecular mechanisms for thousands of compounds at the single-cell level. The resulting large volumes of image data provide great opportunities but also pose challenges to image and data analysis routines as well as property prediction models. This review addresses the integration of CP-based phenotypic data, together with or as a substitute for structural information from compounds, into machine learning (ML) and deep learning (DL) models to predict compound activities for various human-relevant disease endpoints and to identify the underlying modes-of-action (MoA) while avoiding unnecessary animal testing. The successful application of CP in combination with powerful ML/DL models promises further advances in understanding compound responses of cells, guiding therapeutic development and risk assessment. Therefore, this review highlights the importance of unlocking the potential of CP assays when combined with molecular fingerprints for compound evaluation and discusses the current challenges that are associated with this approach.
Affiliation(s)
- Floriane Odje
- Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | - David Meijer
- Bioinformatics Group, Wageningen University, Wageningen, Netherlands
| | - Elena von Coburg
- Department Experimental Toxicology and ZEBET, German Federal Institute for Risk Assessment (BfR), German Centre for the Protection of Laboratory Animals (Bf3R), Berlin, Germany
| | | | - Sebastian Dunst
- Department Experimental Toxicology and ZEBET, German Federal Institute for Risk Assessment (BfR), German Centre for the Protection of Laboratory Animals (Bf3R), Berlin, Germany
| | - Marnix H. Medema
- Bioinformatics Group, Wageningen University, Wageningen, Netherlands
| | - Andrea Volkamer
- Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarbrücken, Germany
4
Manen-Freixa L, Antolin AA. Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery. Expert Opin Drug Discov 2024:1-27. [PMID: 39004919 DOI: 10.1080/17460441.2024.2376643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
INTRODUCTION: Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy, profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology. AREAS COVERED: This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit using selected examples. EXPERT OPINION: Polypharmacology prediction has made impressive progress over the last decades and contributed to identifying many off-targets. However, data incompleteness currently limits the ability of most approaches to comprehensively predict selectivity. Moreover, our limited agreement on model assessment challenges the identification of the best algorithms, which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI holds much potential to improve polypharmacology prediction and de-risk drug discovery.
Affiliation(s)
- Leticia Manen-Freixa
- Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
| | - Albert A Antolin
- Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Center for Cancer Drug Discovery, The Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK
5
Saha US, Vendruscolo M, Carpenter AE, Singh S, Bender A, Seal S. Step Forward Cross Validation for Bioactivity Prediction: Out of Distribution Validation in Drug Discovery. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.02.601740. [PMID: 39005404 PMCID: PMC11245006 DOI: 10.1101/2024.07.02.601740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Recent advances in machine learning methods for materials science have significantly enhanced accurate predictions of the properties of novel materials. Here, we explore whether these advances can be adapted to drug discovery by addressing the problem of prospective validation - the assessment of the performance of a method on out-of-distribution data. First, we tested whether k-fold n-step forward cross-validation could improve the accuracy of out-of-distribution small molecule bioactivity predictions. We found that it is more helpful than conventional random split cross-validation in describing the accuracy of a model in real-world drug discovery settings. We also analyzed discovery yield and novelty error, finding that these two metrics provide an understanding of the applicability domain of models and an assessment of their ability to predict molecules with desirable bioactivity compared to other small molecules. Based on these results, we recommend incorporating a k-fold n-step forward cross-validation and these metrics when building state-of-the-art models for bioactivity prediction in drug discovery.
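The step-forward idea in this abstract can be illustrated with a generic expanding-window split: samples are sorted by an ordering key, cut into contiguous blocks, and each fold trains on all earlier blocks and tests on a block that lies `n_step` blocks ahead. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name, the `n_blocks` and `n_step` parameters, and the choice of ordering key (for example a date or a property value) are hypothetical, since the abstract does not specify them.

```python
def step_forward_splits(keys, n_blocks, n_step=1):
    """Yield (train_idx, test_idx) pairs for an expanding-window split.

    keys     -- one sortable value per sample (hypothetical ordering key)
    n_blocks -- number of contiguous blocks the sorted samples are cut into
    n_step   -- how many blocks ahead of the training window the test block lies
    """
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    base, extra = divmod(len(order), n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)  # earlier blocks absorb the remainder
        blocks.append(order[start:start + size])
        start += size
    for i in range(1, n_blocks - n_step + 1):
        train = [idx for block in blocks[:i] for idx in block]
        test = blocks[i + n_step - 1]
        yield train, test


# Toy ordering keys for illustration only.
keys = [5, 3, 9, 1, 7, 2, 8, 0, 6, 4]
splits = list(step_forward_splits(keys, n_blocks=5, n_step=1))
for train, test in splits:
    print(len(train), "train ->", test)
```

Unlike a random split, every test sample here sits beyond the range of the training samples along the ordering key, which is what makes the evaluation out-of-distribution with respect to that key.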
Affiliation(s)
| | | | | | | | - Andreas Bender
- Department of Chemistry, University of Cambridge, UK
- STAR-UBB Institute, Babeş-Bolyai University, Cluj-Napoca, Romania
| | - Srijit Seal
- Department of Chemistry, University of Cambridge, UK
- Broad Institute of MIT and Harvard, Cambridge, MA, US
6
Liu G, Seal S, Arevalo J, Liang Z, Carpenter AE, Jiang M, Singh S. Learning Molecular Representation in a Cell. ARXIV 2024:arXiv:2406.12056v2. [PMID: 38947938 PMCID: PMC11213146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Predicting drug efficacy and safety in vivo requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations. However, current molecular representation learning methods do not provide a comprehensive view of cell states under these perturbations and struggle to remove noise, hindering model generalization. We introduce the Information Alignment (InfoAlign) approach to learn molecular representations through the information bottleneck method in cells. We integrate molecules and cellular response data as nodes into a context graph, connecting them with weighted edges based on chemical, biological, and computational criteria. For each molecule in a training batch, InfoAlign optimizes the encoder's latent representation with a minimality objective to discard redundant structural information. A sufficiency objective decodes the representation to align with different feature spaces from the molecule's neighborhood in the context graph. We demonstrate that the proposed sufficiency objective for alignment is tighter than existing encoder-based contrastive methods. Empirically, we validate representations from InfoAlign in two downstream tasks: molecular property prediction against up to 19 baseline methods across four datasets, plus zero-shot molecule-morphology matching.
7
Seal S, Williams DP, Hosseini-Gerami L, Mahale M, Carpenter AE, Spjuth O, Bender A. Improved Detection of Drug-Induced Liver Injury by Integrating Predicted in vivo and in vitro Data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.10.575128. [PMID: 38895462 PMCID: PMC11185581 DOI: 10.1101/2024.01.10.575128] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/21/2024]
Abstract
Drug-induced liver injury (DILI) has been a significant challenge in drug discovery, often leading to clinical trial failures and necessitating drug withdrawals. The existing suite of in vitro proxy-DILI assays is generally effective at identifying compounds with hepatotoxicity. However, there is considerable interest in enhancing in silico prediction of DILI because it allows for the evaluation of large sets of compounds more quickly and cost-effectively, particularly in the early stages of projects. In this study, we examine ML models for DILI prediction that first predict nine proxy-DILI labels and then use them as features in addition to chemical structural features to predict DILI. The features include in vitro (e.g., mitochondrial toxicity, bile salt export pump inhibition) data, in vivo (e.g., preclinical rat hepatotoxicity studies) data, pharmacokinetic parameters of maximum concentration, structural fingerprints, and physicochemical parameters. We trained DILI-prediction models on 888 compounds from the DILIst dataset and tested them on a held-out external test set of 223 compounds from the DILIst dataset. The best model, DILIPredictor, attained an AUC-ROC of 0.79. This model enabled the detection of the top 25 toxic compounds compared to models using only structural features (2.68 LR+ score). Using feature interpretation from DILIPredictor, we were able to identify the chemical substructures causing DILI as well as differentiate cases where DILI is caused by compounds in animals but not in humans. For example, DILIPredictor correctly recognized 2-butoxyethanol as non-toxic in humans despite its hepatotoxicity in mouse models. Overall, the DILIPredictor model improves the detection of compounds causing DILI with an improved differentiation between animal and human sensitivity as well as the potential for mechanism evaluation.
DILIPredictor is publicly available at https://broad.io/DILIPredictor for use via web interface and with all code available for download and local implementation via https://pypi.org/project/dilipred/.
Affiliation(s)
- Srijit Seal
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, CB2 1EW, Cambridge, United Kingdom
- Imaging Platform, Broad Institute of MIT and Harvard, US
| | - Dominic P. Williams
- Safety Innovation, Clinical Pharmacology and Safety Sciences, AstraZeneca, Cambridge CB4 0FZ, United Kingdom
- Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge CB4 0FZ, United Kingdom
| | | | - Manas Mahale
- Bombay College of Pharmacy Kalina Santacruz (E), Mumbai 400 098, India
| | | | - Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124, Uppsala, Sweden
| | - Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, CB2 1EW, Cambridge, United Kingdom
8
Seal S, Trapotsi MA, Spjuth O, Singh S, Carreras-Puigvert J, Greene N, Bender A, Carpenter AE. A Decade in a Systematic Review: The Evolution and Impact of Cell Painting. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.04.592531. [PMID: 38766203 PMCID: PMC11100607 DOI: 10.1101/2024.05.04.592531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
High-content image-based assays have fueled significant discoveries in the life sciences in the past decade (2013-2023), including novel insights into disease etiology, mechanism of action, new therapeutics, and toxicology predictions. Here, we systematically review the substantial methodological advancements and applications of Cell Painting. Advancements include improvements in the Cell Painting protocol, assay adaptations for different types of perturbations and applications, and improved methodologies for feature extraction, quality control, and batch effect correction. Moreover, machine learning methods recently surpassed classical approaches in their ability to extract biologically useful information from Cell Painting images. Cell Painting data have been used alone or in combination with other -omics data to decipher the mechanism of action of a compound, its toxicity profile, and many other biological effects. Overall, key methodological advances have expanded Cell Painting's ability to capture cellular responses to various perturbations. Future advances will likely lie in advancing computational and experimental techniques, developing new publicly available datasets, and integrating them with other high-content data types.
Affiliation(s)
- Srijit Seal
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW, Cambridge, United Kingdom
| | - Maria-Anna Trapotsi
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, 1 Francis Crick Avenue, Cambridge, CB2 0AA, United Kingdom
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124, Uppsala, Sweden
| | - Shantanu Singh
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, 1 Francis Crick Avenue, Cambridge, CB2 0AA, United Kingdom
| | - Jordi Carreras-Puigvert
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124, Uppsala, Sweden
| | - Nigel Greene
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, 35 Gatehouse Drive, Waltham, MA 02451, USA
| | - Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW, Cambridge, United Kingdom
| | - Anne E. Carpenter
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States
9
Seal S, Trapotsi MA, Spjuth O, Singh S, Carreras-Puigvert J, Greene N, Bender A, Carpenter AE. A Decade in a Systematic Review: The Evolution and Impact of Cell Painting. ARXIV 2024:arXiv:2405.02767v1. [PMID: 38745696 PMCID: PMC11092692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
High-content image-based assays have fueled significant discoveries in the life sciences in the past decade (2013-2023), including novel insights into disease etiology, mechanism of action, new therapeutics, and toxicology predictions. Here, we systematically review the substantial methodological advancements and applications of Cell Painting. Advancements include improvements in the Cell Painting protocol, assay adaptations for different types of perturbations and applications, and improved methodologies for feature extraction, quality control, and batch effect correction. Moreover, machine learning methods recently surpassed classical approaches in their ability to extract biologically useful information from Cell Painting images. Cell Painting data have been used alone or in combination with other -omics data to decipher the mechanism of action of a compound, its toxicity profile, and many other biological effects. Overall, key methodological advances have expanded Cell Painting's ability to capture cellular responses to various perturbations. Future advances will likely lie in advancing computational and experimental techniques, developing new publicly available datasets, and integrating them with other high-content data types.
Affiliation(s)
- Srijit Seal
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW, Cambridge, United Kingdom
| | - Maria-Anna Trapotsi
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, 1 Francis Crick Avenue, Cambridge, CB2 0AA, United Kingdom
| | - Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124, Uppsala, Sweden
| | - Shantanu Singh
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, 1 Francis Crick Avenue, Cambridge, CB2 0AA, United Kingdom
| | - Jordi Carreras-Puigvert
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124, Uppsala, Sweden
| | - Nigel Greene
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, 35 Gatehouse Drive, Waltham, MA 02451, USA
| | - Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, CB2 1EW, Cambridge, United Kingdom
| | - Anne E. Carpenter
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States
10
Fredin Haslum J, Lardeau CH, Karlsson J, Turkki R, Leuchowius KJ, Smith K, Müllers E. Cell Painting-based bioactivity prediction boosts high-throughput screening hit-rates and compound diversity. Nat Commun 2024; 15:3470. [PMID: 38658534 PMCID: PMC11043326 DOI: 10.1038/s41467-024-47171-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 03/22/2024] [Indexed: 04/26/2024] Open
Abstract
Identifying active compounds for a target is a time- and resource-intensive task in early drug discovery. Accurate bioactivity prediction using morphological profiles could streamline the process, enabling smaller, more focused compound screens. We investigate the potential of deep learning on unrefined single-concentration activity readouts and Cell Painting data, to predict compound activity across 140 diverse assays. We observe an average ROC-AUC of 0.744 ± 0.108 with 62% of assays achieving ≥0.7, 30% ≥0.8, and 7% ≥0.9. In many cases, the high prediction performance can be achieved using only brightfield images instead of multichannel fluorescence images. A comprehensive analysis shows that Cell Painting-based bioactivity prediction is robust across assay types, technologies, and target classes, with cell-based assays and kinase targets being particularly well-suited for prediction. Experimental validation confirms the enrichment of active compounds. Our findings indicate that models trained on Cell Painting data, combined with a small set of single-concentration data points, can reliably predict the activity of a compound library across diverse targets and assays while maintaining high hit rates and scaffold diversity. This approach has the potential to reduce the size of screening campaigns, saving time and resources, and enabling primary screening with more complex assays.
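The abstract summarizes per-assay ROC-AUC values as a mean with a spread and as the fraction of assays above fixed thresholds. The sketch below shows one way such a summary could be computed, using the pairwise-ranking definition of ROC-AUC (the probability that a random active is scored above a random inactive, with ties counted as one half). The function names and all numbers are illustrative assumptions, not the authors' code or results, and the abstract does not say whether the reported spread is a population or sample standard deviation.

```python
import statistics


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (active, inactive) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fraction_at_least(values, threshold):
    """Fraction of values that reach the given threshold."""
    return sum(v >= threshold for v in values) / len(values)


# One toy assay: two actives, two inactives (illustrative scores only).
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

# Hypothetical per-assay AUCs, summarized in the style of the abstract.
assay_aucs = [toy_auc, 0.92, 0.66, 0.81]
mean_auc = statistics.mean(assay_aucs)
spread = statistics.pstdev(assay_aucs)  # population SD, chosen here as an assumption
print(f"ROC-AUC {mean_auc:.3f} ± {spread:.3f}")
for threshold in (0.7, 0.8, 0.9):
    print(f">= {threshold}: {fraction_at_least(assay_aucs, threshold):.0%}")
```

The pairwise form is quadratic in the number of compounds, which is acceptable for a small illustration; a rank-based implementation would be preferable for screening-scale data.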
Affiliation(s)
- Johan Fredin Haslum
- KTH Royal Institute of Technology, Stockholm, Sweden
- Science for Life Laboratory, Stockholm, Sweden
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
| | | | - Johan Karlsson
- Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Riku Turkki
- Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | | | - Kevin Smith
- KTH Royal Institute of Technology, Stockholm, Sweden
- Science for Life Laboratory, Stockholm, Sweden
| | - Erik Müllers
- Research and Early Development, Cardiovascular, Renal and Metabolism (CVRM), BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
11
Ghiandoni GM, Evertsson E, Riley DJ, Tyrchan C, Rathi PC. Augmenting DMTA using predictive AI modelling at AstraZeneca. Drug Discov Today 2024; 29:103945. [PMID: 38460568 DOI: 10.1016/j.drudis.2024.103945] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 03/11/2024]
Abstract
Design-Make-Test-Analyse (DMTA) is the discovery cycle through which molecules are designed, synthesised, and assayed to produce data that in turn are analysed to inform the next iteration. The process is repeated until viable drug candidates are identified, often requiring many cycles before reaching a sweet spot. The advent of artificial intelligence (AI) and cloud computing presents an opportunity to innovate drug discovery to reduce the number of cycles needed to yield a candidate. Here, we present the Predictive Insight Platform (PIP), a cloud-native modelling platform developed at AstraZeneca. The impact of PIP in each step of DMTA, as well as its architecture, integration, and usage, are discussed and used to provide insights into the future of drug discovery.
Affiliation(s)
- Gian Marco Ghiandoni
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
| | - Emma Evertsson
- Research and Early Development, Respiratory and Immunology (R&I), Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden, Mölndal, SE 43183, Sweden
| | - David J Riley
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
| | - Christian Tyrchan
- Research and Early Development, Respiratory and Immunology (R&I), Biopharmaceuticals R&D, AstraZeneca, Pepparedsleden, Mölndal, SE 43183, Sweden
| | - Prakash Chandra Rathi
- Augmented DMTA Platform, R&D IT, AstraZeneca, The Discovery Centre (DISC), Francis Crick Avenue, Cambridge CB2 0AA, UK
12
Pahl I, Pahl A, Hauk A, Budde D, Sievers S, Fruth L, Menzel R. Assessing biologic/toxicologic effects of extractables from plastic contact materials for advanced therapy manufacturing using cell painting assay and cytotoxicity screening. Sci Rep 2024; 14:5933. [PMID: 38467674 PMCID: PMC10928227 DOI: 10.1038/s41598-024-55952-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 02/29/2024] [Indexed: 03/13/2024] Open
Abstract
Plastic components are essential in the pharmaceutical industry, encompassing container closure systems, laboratory handling equipment, and single-use systems. As part of their material qualification process, studies on interactions between plastic contact materials and process solutions or drug products are conducted. The assessment of single-use systems includes their potential impact on patient safety, product quality, and process performance. This is particularly crucial in cell and gene therapy applications since interactions with the plastic contact material may result in an adverse effect on the isolated therapeutic human cells. We utilized the cell painting assay (CPA), a non-targeted method, for profiling the morphological characteristics of U2OS human osteosarcoma cells in contact with chemicals related to plastic contact materials. Specifically, we conducted a comprehensive analysis of 45 common plastic extractables, and two extracts from single-use systems. Results of the CPA are compared with a standard cytotoxicity assay, an osteogenesis differentiation assay, and in silico toxicity predictions. The findings of this feasibility study demonstrate that the device extracts and most of the tested compounds do not evoke any measurable biological changes on the cells (induction ≤ 5%) among the 579 cell features measured at concentrations ≤ 50 µM. CPA can serve as an important assay to reveal unique information not accessible through quantitative structure-activity relationship analysis and vice versa. The results highlight the need for a combination of in vitro and in silico methods in a comprehensive assessment of single-use equipment utilized in advanced therapy medicinal products manufacturing.
Affiliation(s)
- Ina Pahl
- Sartorius Stedim Biotech GmbH, August-Spindler-Str. 11, 37079, Göttingen, Germany
- Axel Pahl
- Compound Management and Screening Center, MPI of Molecular Physiology, Otto-Hahn-Str. 11, 44227, Dortmund, Germany
- Armin Hauk
- Sartorius Stedim Biotech GmbH, August-Spindler-Str. 11, 37079, Göttingen, Germany
- Dana Budde
- Sartorius Stedim Biotech GmbH, August-Spindler-Str. 11, 37079, Göttingen, Germany
- Sonja Sievers
- Compound Management and Screening Center, MPI of Molecular Physiology, Otto-Hahn-Str. 11, 44227, Dortmund, Germany
- Lothar Fruth
- Tox Expert GmbH, An der Feldscheide 1, 37083, Göttingen, Germany
- Roberto Menzel
- Sartorius Stedim Biotech GmbH, August-Spindler-Str. 11, 37079, Göttingen, Germany
|
13
|
Seal S, Carreras-Puigvert J, Singh S, Carpenter AE, Spjuth O, Bender A. From pixels to phenotypes: Integrating image-based profiling with cell health data as BioMorph features improves interpretability. Mol Biol Cell 2024; 35:mr2. [PMID: 38170589 PMCID: PMC10916876 DOI: 10.1091/mbc.e23-08-0298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 12/07/2023] [Accepted: 12/22/2023] [Indexed: 01/05/2024] Open
Abstract
Cell Painting assays generate morphological profiles that are versatile descriptors of biological systems and have been used to predict in vitro and in vivo drug effects. However, Cell Painting features extracted from classical software such as CellProfiler are based on statistical calculations and often not readily biologically interpretable. In this study, we propose a new feature space, which we call BioMorph, that maps these Cell Painting features with readouts from comprehensive Cell Health assays. We validated that the resulting BioMorph space effectively connected compounds not only with the morphological features associated with their bioactivity but with deeper insights into phenotypic characteristics and cellular processes associated with the given bioactivity. The BioMorph space revealed the mechanism of action for individual compounds, including dual-acting compounds such as emetine, an inhibitor of both protein synthesis and DNA replication. Overall, BioMorph space offers a biologically relevant way to interpret the cell morphological features derived using software such as CellProfiler and to generate hypotheses for experimental validation.
Affiliation(s)
- Srijit Seal
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge MA 02142
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Jordi Carreras-Puigvert
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, 752 37 Uppsala, Sweden
- Shantanu Singh
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge MA 02142
- Anne E Carpenter
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge MA 02142
- Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, 752 37 Uppsala, Sweden
- Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
|
14
|
Seal S, Spjuth O, Hosseini-Gerami L, García-Ortegón M, Singh S, Bender A, Carpenter AE. Insights into Drug Cardiotoxicity from Biological and Chemical Data: The First Public Classifiers for FDA Drug-Induced Cardiotoxicity Rank. J Chem Inf Model 2024; 64:1172-1186. [PMID: 38300851 PMCID: PMC10900289 DOI: 10.1021/acs.jcim.3c01834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/11/2024] [Accepted: 01/16/2024] [Indexed: 02/03/2024]
Abstract
Drug-induced cardiotoxicity (DICT) is a major concern in drug development, accounting for 10-14% of postmarket withdrawals. In this study, we explored the capabilities of chemical and biological data to predict cardiotoxicity, using the recently released DICTrank data set from the United States FDA. We found that such data, including protein targets, especially those related to ion channels (e.g., hERG), physicochemical properties (e.g., electrotopological state), and peak concentration in plasma offer strong predictive ability for DICT. Compounds annotated with mechanisms of action such as cyclooxygenase inhibition could distinguish between most-concern and no-concern DICT. Cell Painting features for ER stress discerned most-concern cardiotoxic from nontoxic compounds. Models based on physicochemical properties provided substantial predictive accuracy (AUCPR = 0.93). With the availability of omics data in the future, using biological data promises enhanced predictability and deeper mechanistic insights, paving the way for safer drug development. All models from this study are available at https://broad.io/DICTrank_Predictor.
Affiliation(s)
- Srijit Seal
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Ola Spjuth
- Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Box 591, SE-75124 Uppsala, Sweden
- Layla Hosseini-Gerami
- Ignota Labs, The Bradfield Centre, Cambridge Science Park, County Hall, Westminster Bridge Road, Cambridge CB4 0GA, U.K.
- Miguel García-Ortegón
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Shantanu Singh
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
- Andreas Bender
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
- Anne E. Carpenter
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
|
15
|
Horne R, Wilson-Godber J, González Díaz A, Brotzakis ZF, Seal S, Gregory RC, Possenti A, Chia S, Vendruscolo M. Using Generative Modeling to Endow with Potency Initially Inert Compounds with Good Bioavailability and Low Toxicity. J Chem Inf Model 2024; 64:590-596. [PMID: 38261763 PMCID: PMC10865343 DOI: 10.1021/acs.jcim.3c01777] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/04/2023] [Revised: 12/10/2023] [Accepted: 12/12/2023] [Indexed: 01/25/2024]
Abstract
In the early stages of drug development, large chemical libraries are typically screened to identify compounds of promising potency against the chosen targets. Often, however, the resulting hit compounds tend to have poor drug metabolism and pharmacokinetics (DMPK), with negative developability features that may be difficult to eliminate. Therefore, starting the drug discovery process with a "null library", compounds that have highly desirable DMPK properties but no potency against the chosen targets, could be advantageous. Here, we explore the opportunities offered by machine learning to realize this strategy in the case of the inhibition of α-synuclein aggregation, a process associated with Parkinson's disease. We apply MolDQN, a generative machine learning method, to build an inhibitory activity against α-synuclein aggregation into an initial inactive compound with good DMPK properties. Our results illustrate how generative modeling can be used to endow initially inert compounds with desirable developability properties.
Affiliation(s)
- Robert I. Horne
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Jared Wilson-Godber
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Alicia González Díaz
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Z. Faidon Brotzakis
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Srijit Seal
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
- Rebecca C. Gregory
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Andrea Possenti
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Sean Chia
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
- Bioprocessing Technology Institute, Agency for Science, Technology and Research (A*STAR), 138668 Singapore, Singapore
- Michele Vendruscolo
- Centre for Misfolding Diseases, Department of Chemistry, University of Cambridge, Cambridge CB2 1EW, United Kingdom
|
16
|
Seal S, Spjuth O, Hosseini-Gerami L, García-Ortegón M, Singh S, Bender A, Carpenter AE. Insights into Drug Cardiotoxicity from Biological and Chemical Data: The First Public Classifiers for FDA DICTrank. bioRxiv 2023:2023.10.15.562398. [PMID: 37905146 PMCID: PMC10614794 DOI: 10.1101/2023.10.15.562398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Drug-induced cardiotoxicity (DICT) is a major concern in drug development, accounting for 10-14% of postmarket withdrawals. In this study, we explored the capabilities of various chemical and biological data to predict cardiotoxicity, using the recently released Drug-Induced Cardiotoxicity Rank (DICTrank) dataset from the United States FDA. We analyzed a diverse set of data sources, including physicochemical properties, annotated mechanisms of action (MOA), Cell Painting, Gene Expression, and more, to identify indications of cardiotoxicity. We found that such data, including protein targets, especially those related to ion channels (such as hERG), physicochemical properties (such as electrotopological state) as well as peak concentration in plasma offer strong predictive ability as well as valuable insights into DICT. We also found compounds annotated with particular mechanisms of action, such as cyclooxygenase inhibition, could distinguish between most-concern and no-concern DICT compounds. Cell Painting features related to ER stress discern the most-concern cardiotoxic compounds from non-toxic compounds. While models based on physicochemical properties currently provide substantial predictive accuracy (AUCPR = 0.93), this study also underscores the potential benefits of incorporating more comprehensive biological data in future DICT predictive models. With the availability of -omics data in the future, using biological data promises enhanced predictability and delivers deeper mechanistic insights, paving the way for safer therapeutic drug development. All models and data used in this study are publicly released at https://broad.io/DICTrank_Predictor.
Affiliation(s)
- Srijit Seal
- Imaging Platform, Broad Institute of MIT and Harvard, US
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, Sweden
- Shantanu Singh
- Imaging Platform, Broad Institute of MIT and Harvard, US
|