1
Theisen R, Wang T, Ravikumar B, Rahman R, Cichońska A. Leveraging multiple data types for improved compound-kinase bioactivity prediction. Nat Commun 2024; 15:7596. [PMID: 39217147 PMCID: PMC11365929 DOI: 10.1038/s41467-024-52055-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Accepted: 08/21/2024] [Indexed: 09/04/2024]
Abstract
Machine learning provides efficient ways to map compound-kinase interactions. However, diverse bioactivity data types, including single-dose and multi-dose-response assay results, present challenges. Traditional models utilize only multi-dose data, overlooking information contained in single-dose measurements. Here, we propose a machine learning methodology for compound-kinase activity prediction that leverages both single-dose and dose-response data. We demonstrate that our two-stage approach yields accurate activity predictions and significantly improves model performance compared to training solely on dose-response labels. This superior performance is consistent across five diverse machine learning methods. Using the best performing model, we carried out extensive experimental profiling on a total of 347 selected compound-kinase pairs, achieving a high hit rate of 40% and a negative predictive value of 78%. We show that these rates can be improved further by incorporating model uncertainty estimates into the compound selection process. By integrating multiple activity data types, we demonstrate that our approach holds promise for facilitating the development of training activity datasets in a more efficient and cost-effective way.
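The two validation metrics quoted in this abstract can be illustrated with a short sketch. It assumes that "hit rate" means the fraction of predicted-active pairs confirmed active in the assay, and that negative predictive value is the fraction of predicted-inactive pairs confirmed inactive. The abstract does not give the underlying counts, so the numbers below are hypothetical and chosen only to reproduce the reported percentages.

```python
def hit_rate(confirmed_active: int, predicted_active: int) -> float:
    """Fraction of predicted-active pairs confirmed active (assumed definition)."""
    return confirmed_active / predicted_active


def negative_predictive_value(confirmed_inactive: int, predicted_inactive: int) -> float:
    """Fraction of predicted-inactive pairs confirmed inactive."""
    return confirmed_inactive / predicted_inactive


# Hypothetical counts, NOT taken from the study.
example_hit_rate = hit_rate(confirmed_active=40, predicted_active=100)
example_npv = negative_predictive_value(confirmed_inactive=78, predicted_inactive=100)

print(f"hit rate: {example_hit_rate:.0%}")
print(f"NPV: {example_npv:.0%}")
```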
Affiliation(s)
- Ryan Theisen
  - Harmonic Discovery Inc., New York City, NY, USA.
2
Hao Y, Li B, Huang D, Wu S, Wang T, Fu L, Liu X. Developing a Semi-Supervised Approach Using a PU-Learning-Based Data Augmentation Strategy for Multitarget Drug Discovery. Int J Mol Sci 2024; 25:8239. [PMID: 39125808 PMCID: PMC11312053 DOI: 10.3390/ijms25158239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 07/26/2024] [Accepted: 07/26/2024] [Indexed: 08/12/2024]
Abstract
Multifactorial diseases demand therapeutics that can modulate multiple targets for enhanced safety and efficacy, yet the clinical approval of multitarget drugs remains rare. The integration of machine learning (ML) and deep learning (DL) in drug discovery has revolutionized virtual screening. This study investigates the synergy between ML/DL methodologies, molecular representations, and data augmentation strategies. Notably, we found that SVM can match or even surpass the performance of state-of-the-art DL methods. However, conventional data augmentation often involves a trade-off between the true positive rate and false positive rate. To address this, we introduce Negative-Augmented PU-bagging (NAPU-bagging) SVM, a novel semi-supervised learning framework. By leveraging ensemble SVM classifiers trained on resampled bags containing positive, negative, and unlabeled data, our approach is capable of managing false positive rates while maintaining high recall rates. We applied this method to the identification of multitarget-directed ligands (MTDLs), where high recall rates are critical for compiling a list of interaction candidate compounds. Case studies demonstrate that NAPU-bagging SVM can identify structurally novel MTDL hits for ALK-EGFR with favorable docking scores and binding modes, as well as pan-agonists for dopamine receptors. The NAPU-bagging SVM methodology should serve as a promising avenue to virtual screening, especially for the discovery of MTDLs.
Affiliation(s)
- Yang Hao
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZX, UK
- Bo Li
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZX, UK
- Daiyun Huang
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
  - School of Life Sciences, Fudan University, Shanghai 200092, China
- Sijin Wu
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Tianjun Wang
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
  - Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZX, UK
- Lei Fu
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
- Xin Liu
  - Wisdom Lake Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou 215123, China
3
Catacutan DB, Alexander J, Arnold A, Stokes JM. Machine learning in preclinical drug discovery. Nat Chem Biol 2024:10.1038/s41589-024-01679-1. [PMID: 39030362 DOI: 10.1038/s41589-024-01679-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 06/13/2024] [Indexed: 07/21/2024]
Abstract
Drug-discovery and drug-development endeavors are laborious, costly and time consuming. These programs can take upward of 12 years and cost US $2.5 billion, with a failure rate of more than 90%. Machine learning (ML) presents an opportunity to improve the drug-discovery process. Indeed, with the growing abundance of public and private large-scale biological and chemical datasets, ML techniques are becoming well positioned as useful tools that can augment the traditional drug-development process. In this Perspective, we discuss the integration of algorithmic methods throughout the preclinical phases of drug discovery. Specifically, we highlight an array of ML-based efforts, across diverse disease areas, to accelerate initial hit discovery, mechanism-of-action (MOA) elucidation and chemical property optimization. With advances in the application of ML across diverse therapeutic areas, we posit that fully ML-integrated drug-discovery pipelines will define the future of drug-development programs.
Affiliation(s)
- Denise B Catacutan
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jeremie Alexander
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Autumn Arnold
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jonathan M Stokes
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada.
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada.
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada.
4
Huang D, Xie J. EMPDTA: An End-to-End Multimodal Representation Learning Framework with Pocket Online Detection for Drug-Target Affinity Prediction. Molecules 2024; 29:2912. [PMID: 38930976 PMCID: PMC11206982 DOI: 10.3390/molecules29122912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 06/15/2024] [Accepted: 06/17/2024] [Indexed: 06/28/2024]
Abstract
Accurately predicting drug-target interactions is a critical yet challenging task in drug discovery. Traditionally, pocket detection and drug-target affinity prediction have been treated as separate aspects of drug-target interaction, with few methods combining these tasks within a unified deep learning system to accelerate drug development. In this study, we propose EMPDTA, an end-to-end framework that integrates protein pocket prediction and drug-target affinity prediction to provide a comprehensive understanding of drug-target interactions. The EMPDTA framework consists of three main modules: pocket online detection, multimodal representation learning for affinity prediction, and multi-task joint training. The performance and potential of the proposed framework have been validated across diverse benchmark datasets, achieving robust results in both tasks. Furthermore, the visualization results of the predicted pockets demonstrate accurate pocket detection, confirming the effectiveness of our framework.
Affiliation(s)
- Jiang Xie
  - School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
5
Gutkin E, Gusev F, Gentile F, Ban F, Koby SB, Narangoda C, Isayev O, Cherkasov A, Kurnikova MG. In silico screening of LRRK2 WDR domain inhibitors using deep docking and free energy simulations. Chem Sci 2024; 15:8800-8812. [PMID: 38873063 PMCID: PMC11168082 DOI: 10.1039/d3sc06880c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 04/10/2024] [Indexed: 06/15/2024]
Abstract
The Critical Assessment of Computational Hit-Finding Experiments (CACHE) Challenge series is focused on identifying small molecule inhibitors of protein targets using computational methods. Each challenge contains two phases, hit-finding and follow-up optimization, each of which is followed by experimental validation of the computational predictions. For the CACHE Challenge #1, the Leucine-Rich Repeat Kinase 2 (LRRK2) WD40 Repeat (WDR) domain was selected as the target for in silico hit-finding and optimization. Mutations in LRRK2 are the most common genetic cause of the familial form of Parkinson's disease. The LRRK2 WDR domain is an understudied drug target with no known molecular inhibitors. Herein we detail the first phase of our winning submission to the CACHE Challenge #1. We developed a framework for the high-throughput structure-based virtual screening of a chemically diverse small molecule space. Hit identification was performed using the large-scale Deep Docking (DD) protocol followed by absolute binding free energy (ABFE) simulations. ABFEs were computed using an automated molecular dynamics (MD)-based thermodynamic integration (TI) approach. 4.1 billion ligands from Enamine REAL were screened with DD followed by ABFEs computed by MD TI for 793 ligands. 76 ligands were prioritized for experimental validation, with 59 compounds successfully synthesized and 5 compounds identified as hits, yielding an 8.5% hit rate. Our results demonstrate the efficacy of the combined DD and ABFE approaches for hit identification for a target with no previously known hits. This approach is widely applicable for the efficient screening of ultra-large chemical libraries as well as rigorous protein-ligand binding affinity estimation leveraging modern computational resources.
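The screening funnel reported in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the stage labels and variable names are illustrative, and it assumes the reported hit rate is computed against the synthesized compounds (5 of 59), which matches the quoted 8.5%.

```python
# Stage counts as stated in the abstract; the labels are illustrative.
funnel = [
    ("screened with Deep Docking", 4_100_000_000),
    ("ABFE computed by MD TI", 793),
    ("prioritized for validation", 76),
    ("successfully synthesized", 59),
    ("confirmed hits", 5),
]

# Fraction of compounds retained at each step of the funnel.
for (prev_name, prev_n), (name, n) in zip(funnel, funnel[1:]):
    print(f"{prev_name} -> {name}: {n / prev_n:.3g}")

# Hit rate relative to synthesized compounds (assumed denominator).
hit_rate_percent = 100 * funnel[-1][1] / funnel[-2][1]
print(f"hit rate: {hit_rate_percent:.1f}%")
```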
Affiliation(s)
- Evgeny Gutkin
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Filipp Gusev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Francesco Gentile
  - Department of Chemistry and Biomolecular Sciences, University of Ottawa, Ottawa, ON, Canada
  - Ottawa Institute of Systems Biology, Ottawa, ON, Canada
- Fuqiang Ban
  - Vancouver Prostate Centre, The University of British Columbia, Vancouver, BC, Canada
- S Benjamin Koby
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Chamali Narangoda
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Artem Cherkasov
  - Vancouver Prostate Centre, The University of British Columbia, Vancouver, BC, Canada
- Maria G Kurnikova
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
6
Munson BP, Chen M, Bogosian A, Kreisberg JF, Licon K, Abagyan R, Kuenzi BM, Ideker T. De novo generation of multi-target compounds using deep generative chemistry. Nat Commun 2024; 15:3636. [PMID: 38710699 DOI: 10.1038/s41467-024-47120-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 03/18/2024] [Indexed: 05/08/2024]
Abstract
Polypharmacology drugs (compounds that inhibit multiple proteins) have many applications but are difficult to design. To address this challenge, we have developed POLYGON, an approach to polypharmacology based on generative reinforcement learning. POLYGON embeds chemical space and iteratively samples it to generate new molecular structures; these are rewarded by the predicted ability to inhibit each of two protein targets and by drug-likeness and ease-of-synthesis. In binding data for >100,000 compounds, POLYGON correctly recognizes polypharmacology interactions with 82.5% accuracy. We subsequently generate de novo compounds targeting ten pairs of proteins with documented co-dependency. Docking analysis indicates that top structures bind their two targets with low free energies and similar 3D orientations to canonical single-protein inhibitors. We synthesize 32 compounds targeting MEK1 and mTOR, with most yielding >50% reduction in each protein activity and in cell viability when dosed at 1-10 μM. These results support the potential of generative modeling for polypharmacology.
Affiliation(s)
- Brenton P Munson
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Michael Chen
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Audrey Bogosian
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Jason F Kreisberg
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Katherine Licon
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Ruben Abagyan
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Brent M Kuenzi
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA
- Trey Ideker
  - Division of Human Genomics and Precision Medicine, Department of Medicine, University of California San Diego, La Jolla, CA, 92093, USA.
  - Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA.
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, 92093, USA.
7
Vidović D, Waller A, Holmes J, Sklar LA, Schürer SC. Best practices for managing and disseminating resources and outreach and evaluating the impact of the IDG Consortium. Drug Discov Today 2024; 29:103953. [PMID: 38508231 PMCID: PMC11335350 DOI: 10.1016/j.drudis.2024.103953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 03/08/2024] [Accepted: 03/14/2024] [Indexed: 03/22/2024]
Abstract
The Illuminating the Druggable Genome (IDG) consortium generated reagents, biological model systems, data, informatic databases, and computational tools. The Resource Dissemination and Outreach Center (RDOC) played a central administrative role, organized internal meetings, fostered collaboration, and coordinated consortium-wide efforts. The RDOC developed and deployed a Resource Management System (RMS) to enable efficient workflows for collecting, accessing, validating, registering, and publishing resource metadata. IDG policies for repositories and standardized representations of resources were established, adopting the FAIR (findable, accessible, interoperable, reusable) principles. The RDOC also developed metrics of IDG impact. Outreach initiatives included digital content, the Protein Illumination Timeline (representing milestones in generating data and reagents), the Target Watch publication series, the e-IDG Symposium series, and leveraging social media platforms.
Affiliation(s)
- Dušica Vidović
  - Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL, USA; Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL, USA
- Anna Waller
  - Department of Pathology, Health Sciences Center, University of New Mexico, Albuquerque, NM, USA
- Jayme Holmes
  - Translational Informatics Division, Department of Internal Medicine, University of New Mexico School of Medicine, Albuquerque, NM, USA
- Larry A Sklar
  - Department of Pathology, Health Sciences Center, University of New Mexico, Albuquerque, NM, USA; Autophagy, Inflammation, & Metabolism (AIM) Center, University of New Mexico, Albuquerque, NM, USA
- Stephan C Schürer
  - Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL, USA; Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL, USA; Frost Institute for Data Science & Computing, University of Miami, Miami, FL, USA.
8
Hu J, Allen BK, Stathias V, Ayad NG, Schürer SC. Kinome-Wide Virtual Screening by Multi-Task Deep Learning. Int J Mol Sci 2024; 25:2538. [PMID: 38473785 PMCID: PMC10932040 DOI: 10.3390/ijms25052538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 02/04/2024] [Accepted: 02/17/2024] [Indexed: 03/14/2024]
Abstract
Deep learning is a machine learning technique to model high-level abstractions in data by utilizing a graph composed of multiple processing layers that experience various linear and non-linear transformations. This technique has been shown to perform well for applications in drug discovery, utilizing structural features of small molecules to predict activity. Here, we report a large-scale study to predict the activity of small molecules across the human kinome, a major family of drug targets, particularly in anti-cancer agents. While small-molecule kinase inhibitors exhibit impressive clinical efficacy in several different diseases, resistance often arises through adaptive kinome reprogramming or subpopulation diversity. Polypharmacology and combination therapies offer potential therapeutic strategies for patients with resistant diseases. Their development would benefit from a more comprehensive and dense knowledge of small-molecule inhibition across the human kinome. Leveraging over 650,000 bioactivity annotations for more than 300,000 small molecules, we evaluated multiple machine learning methods to predict the small-molecule inhibition of 342 kinases across the human kinome. Our results demonstrated that multi-task deep neural networks outperformed classical single-task methods, offering the potential for conducting large-scale virtual screening, predicting activity profiles, and bridging the gaps in the available data.
Affiliation(s)
- Jiaming Hu
  - Dr. John T. Macdonald Foundation Department of Human Genetics and John P. Hussman Institute for Human Genomics, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
  - Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- Bryce K. Allen
  - Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
  - Institute for Data Science & Computing, University of Miami, Miami, FL 33136, USA
- Vasileios Stathias
  - Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- Nagi G. Ayad
  - Center for Therapeutic Innovation, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
  - Miami Project to Cure Paralysis, Department of Psychiatry and Behavioral Sciences, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
  - Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
- Stephan C. Schürer
  - Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
  - Institute for Data Science & Computing, University of Miami, Miami, FL 33136, USA
  - Center for Therapeutic Innovation, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
  - Sylvester Comprehensive Cancer Center, Miller School of Medicine, University of Miami, Miami, FL 33136, USA
9
Roskoski R. Properties of FDA-approved small molecule protein kinase inhibitors: A 2024 update. Pharmacol Res 2024; 200:107059. [PMID: 38216005 DOI: 10.1016/j.phrs.2024.107059] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 01/04/2024] [Indexed: 01/14/2024]
Abstract
Owing to the dysregulation of protein kinase activity in many diseases including cancer, this enzyme family has become one of the most important drug targets in the 21st century. There are 80 FDA-approved therapeutic agents that target about two dozen different protein kinases and seven of these drugs were approved in 2023. Of the approved drugs, thirteen target protein-serine/threonine protein kinases, four are directed against dual specificity protein kinases (MEK1/2), twenty block nonreceptor protein-tyrosine kinases, and 43 inhibit receptor protein-tyrosine kinases. The data indicate that 69 of these drugs are prescribed for the treatment of neoplasms. Six drugs (abrocitinib, baricitinib, deucravacitinib, ritlecitinib, tofacitinib, upadacitinib) are used for the treatment of inflammatory diseases (atopic dermatitis, rheumatoid arthritis, psoriasis, alopecia areata, and ulcerative colitis). Of the 80 approved drugs, nearly two dozen are used in the treatment of multiple diseases. The following seven drugs received FDA approval in 2023: capivasertib (HER2-positive breast cancer), fruquintinib (metastatic colorectal cancer), momelotinib (myelofibrosis), pirtobrutinib (mantle cell lymphoma, chronic lymphocytic leukemia, small lymphocytic lymphoma), quizartinib (Flt3-mutant acute myelogenous leukemia), repotrectinib (ROS1-positive lung cancer), and ritlecitinib (alopecia areata). All of the FDA-approved drugs are orally effective with the exception of netarsudil, temsirolimus, and trilaciclib. This review summarizes the physicochemical properties of all 80 FDA-approved small molecule protein kinase inhibitors including the molecular weight, number of hydrogen bond donors/acceptors, polar surface area, potency, solubility, lipophilic efficiency, and ligand efficiency.
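As a quick consistency check on the figures in this abstract, the per-class drug counts can be summed and compared with the stated total of 80. The sketch below uses only numbers quoted in the abstract; the dictionary keys are shortened labels chosen for illustration.

```python
# Counts of approved drugs by target class, as quoted in the abstract.
drugs_by_target_class = {
    "protein-serine/threonine kinases": 13,
    "dual specificity kinases (MEK1/2)": 4,
    "nonreceptor protein-tyrosine kinases": 20,
    "receptor protein-tyrosine kinases": 43,
}

total_approved = sum(drugs_by_target_class.values())
print(f"total approved drugs: {total_approved}")

# Share of the approved drugs prescribed for neoplasms (69 per the abstract).
neoplasm_share = 69 / total_approved
print(f"prescribed for neoplasms: {neoplasm_share:.1%}")
```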
Affiliation(s)
- Robert Roskoski
  - Blue Ridge Institute for Medical Research, 221 Haywood Knolls Drive, Hendersonville, NC 28791, United States.
10
Cichońska A, Ravikumar B, Rahman R. AI for targeted polypharmacology: The next frontier in drug discovery. Curr Opin Struct Biol 2024; 84:102771. [PMID: 38215530 DOI: 10.1016/j.sbi.2023.102771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 11/30/2023] [Accepted: 12/20/2023] [Indexed: 01/14/2024]
Abstract
In drug discovery, targeted polypharmacology, i.e., targeting multiple molecular targets with a single drug, is redefining therapeutic design to address complex diseases. Pre-selected pharmacological profiles, as exemplified in kinase drugs, promise enhanced efficacy and reduced toxicity. Historically, many such drugs were discovered serendipitously, limiting predictability and efficacy, but currently artificial intelligence (AI) offers a transformative solution. Machine learning and deep learning techniques enable modeling protein structures, generating novel compounds, and decoding their polypharmacological effects, opening an avenue for more systematic and predictive multi-target drug design. This review explores the use of AI in identifying synergistic co-targets and delineating them from anti-targets that lead to adverse effects, and then discusses advances in AI-enabled docking, generative chemistry, and proteochemometric modeling of proteome-wide compound interactions, in the context of polypharmacology. We also provide insights into challenges ahead.
11
Outhwaite IR, Singh S, Berger BT, Knapp S, Chodera JD, Seeliger MA. Death by a thousand cuts through kinase inhibitor combinations that maximize selectivity and enable rational multitargeting. eLife 2023; 12:e86189. [PMID: 38047771 PMCID: PMC10769483 DOI: 10.7554/elife.86189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2023] [Accepted: 12/03/2023] [Indexed: 12/05/2023]
Abstract
Kinase inhibitors are successful therapeutics in the treatment of cancers and autoimmune diseases and are useful tools in biomedical research. However, the high sequence and structural conservation of the catalytic kinase domain complicate the development of selective kinase inhibitors. Inhibition of off-target kinases makes it difficult to study the mechanism of inhibitors in biological systems. Current efforts focus on the development of inhibitors with improved selectivity. Here, we present an alternative solution to this problem by combining inhibitors with divergent off-target effects. We develop a multicompound-multitarget scoring (MMS) method that combines inhibitors to maximize target inhibition and to minimize off-target inhibition. Additionally, this framework enables optimization of inhibitor combinations for multiple on-targets. Using MMS with published kinase inhibitor datasets we determine potent inhibitor combinations for target kinases with better selectivity than the most selective single inhibitor and validate the predicted effect and selectivity of inhibitor combinations using in vitro and in cellulo techniques. MMS greatly enhances selectivity in rational multitargeting applications. The MMS framework is generalizable to other non-kinase biological targets where compound selectivity is a challenge and diverse compound libraries are available.
Affiliation(s)
- Ian R Outhwaite
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, United States
- Sukrit Singh
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, United States
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, United States
- Benedict-Tilman Berger
  - Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
  - Structural Genomics Consortium, Buchmann Institute for Life Sciences, Goethe University Frankfurt, Frankfurt am Main, Germany
- Stefan Knapp
  - Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
  - Structural Genomics Consortium, Buchmann Institute for Life Sciences, Goethe University Frankfurt, Frankfurt am Main, Germany
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, United States
- Markus A Seeliger
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, United States
12
Luo Y, Liu Y, Peng J. Calibrated geometric deep learning improves kinase-drug binding predictions. Nat Mach Intell 2023; 5:1390-1401. [PMID: 38962391 PMCID: PMC11221792 DOI: 10.1038/s42256-023-00751-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/20/2022] [Accepted: 09/29/2023] [Indexed: 07/05/2024]
Abstract
Protein kinases regulate various cellular functions and hold significant pharmacological promise in cancer and other diseases. Although kinase inhibitors are one of the largest groups of approved drugs, much of the human kinome remains unexplored but potentially druggable. Computational approaches, such as machine learning, offer efficient solutions for exploring kinase-compound interactions and uncovering novel binding activities. Despite the increasing availability of three-dimensional (3D) protein and compound structures, existing methods predominantly focus on exploiting local features from one-dimensional protein sequences and two-dimensional molecular graphs to predict binding affinities, overlooking the 3D nature of the binding process. Here we present KDBNet, a deep learning algorithm that incorporates 3D protein and molecule structure data to predict binding affinities. KDBNet uses graph neural networks to learn structure representations of protein binding pockets and drug molecules, capturing the geometric and spatial characteristics of binding activity. In addition, we introduce an algorithm to quantify and calibrate the uncertainties of KDBNet's predictions, enhancing its utility in model-guided discovery in chemical or protein space. Experiments demonstrated that KDBNet outperforms existing deep learning models in predicting kinase-drug binding affinities. The uncertainties estimated by KDBNet are informative and well-calibrated with respect to prediction errors. When integrated with a Bayesian optimization framework, KDBNet enables data-efficient active learning and accelerates the exploration and exploitation of diverse high-binding kinase-drug pairs.
Affiliation(s)
- Yunan Luo
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- These authors contributed equally: Yunan Luo, Yang Liu
- Yang Liu
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
- These authors contributed equally: Yunan Luo, Yang Liu
- Jian Peng
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL, USA
|
13
|
Brahma R, Shin JM, Cho KH. KinScan: AI-based rapid profiling of activity across the kinome. Brief Bioinform 2023; 24:bbad396. [PMID: 37985454 DOI: 10.1093/bib/bbad396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Revised: 09/22/2023] [Accepted: 10/14/2023] [Indexed: 11/22/2023] Open
Abstract
Kinases play a vital role in regulating essential cellular processes, including cell cycle progression, growth, apoptosis, and metabolism, by catalyzing the transfer of phosphate groups from adenosine triphosphate to substrates. Their dysregulation has been closely associated with numerous diseases, including cancer development, making them attractive targets for drug discovery. However, accurately predicting the binding affinity between chemical compounds and kinase targets remains challenging due to the highly conserved structural similarities across the kinome. To address this limitation, we present KinScan, a novel computational approach that leverages large-scale bioactivity data and integrates the Multi-Scale Context Aware Transformer framework to construct a virtual profiling model encompassing 391 protein kinases. The developed model demonstrates exceptional prediction capability, distinguishing between kinases by utilizing structurally aligned kinase binding site features derived from multiple sequence alignment for fast and accurate predictions. Through extensive validation and benchmarking, KinScan demonstrated its robust predictive power and generalizability for large-scale kinome-wide profiling and selectivity, uncovering associations with specific diseases and providing valuable insights into kinase activity profiles of compounds. Furthermore, we deployed a web platform for end-to-end profiling and selectivity analysis, accessible at https://kinscan.drugonix.com/softwares/kinscan.
Affiliation(s)
- Rahul Brahma
- School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
- Jae-Min Shin
- AzothBio, Rm. DA724 Hyundai Knowledge Industry Center, Hanam-si, Gyeonggi-do, Republic of Korea
- Kwang-Hwi Cho
- School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
|
14
|
Ong WJG, Kirubakaran P, Karanicolas J. Poor Generalization by Current Deep Learning Models for Predicting Binding Affinities of Kinase Inhibitors. bioRxiv 2023:2023.09.04.556234. [PMID: 37732243 PMCID: PMC10508770 DOI: 10.1101/2023.09.04.556234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
The extreme surge of interest over the past decade surrounding the use of neural networks has inspired many groups to deploy them for predicting binding affinities of drug-like molecules to their receptors. A model that can accurately make such predictions has the potential to screen large chemical libraries and help streamline the drug discovery process. However, despite reports of models that accurately predict quantitative inhibition using protein kinase sequences and inhibitors' SMILES strings, it is still unclear whether these models can generalize to previously unseen data. Here, we build a Convolutional Neural Network (CNN) analogous to those previously reported and evaluate the model over four datasets commonly used for inhibitor/kinase predictions. We find that the model performs comparably to those previously reported, provided that the individual data points are randomly split between the training set and the test set. However, model performance is dramatically deteriorated when all data for a given inhibitor is placed together in the same training/testing fold, implying that information leakage underlies the models' performance. Through comparison to simple models in which the SMILES strings are tokenized, or in which test set predictions are simply copied from the closest training set data points, we demonstrate that there is essentially no generalization whatsoever in this model. In other words, the model has not learned anything about molecular interactions, and does not provide any benefit over much simpler and more transparent models. These observations strongly point to the need for richer structure-based encodings, to obtain useful prospective predictions of not-yet-synthesized candidate inhibitors.
Affiliation(s)
- Wern Juin Gabriel Ong
- Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
- Bowdoin College, Brunswick, ME 04011
- Palani Kirubakaran
- Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
- John Karanicolas
- Cancer Signaling & Microenvironment Program, Fox Chase Cancer Center, Philadelphia, PA 19111
|
15
|
Kanev GK, Zhang Y, Kooistra AJ, Bender A, Leurs R, Bailey D, Würdinger T, de Graaf C, de Esch IJP, Westerman BA. Predicting the target landscape of kinase inhibitors using 3D convolutional neural networks. PLoS Comput Biol 2023; 19:e1011301. [PMID: 37669273 PMCID: PMC10508635 DOI: 10.1371/journal.pcbi.1011301] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 09/19/2023] [Accepted: 06/25/2023] [Indexed: 09/07/2023] Open
Abstract
Many therapies in clinical trials are based on single drug-single target relationships. To further extend this concept to multi-target approaches using multi-targeted drugs, we developed a machine learning pipeline to unravel the target landscape of kinase inhibitors. This pipeline, which we call 3D-KINEssence, uses a new type of protein fingerprints (3D FP) based on the structure of kinases generated through a 3D convolutional neural network (3D-CNN). These 3D-CNN kinase fingerprints were matched to molecular Morgan fingerprints to predict the targets of each respective kinase inhibitor based on available bioactivity data. The performance of the pipeline was evaluated on two test sets: a sparse drug-target set where each drug is matched in most cases to a single target and also on a densely-covered drug-target set where each drug is matched to most if not all targets. This latter set is more challenging to train, given its non-exclusive character. Our model's root-mean-square error (RMSE) based on the two datasets was 0.68 and 0.8, respectively. These results indicate that 3D FP can predict the target landscape of kinase inhibitors at around 0.8 log units of bioactivity. Our strategy can be utilized in proteochemometric or chemogenomic workflows by consolidating the target landscape of kinase inhibitors.
Affiliation(s)
- Georgi K. Kanev
- Division of Medicinal Chemistry, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Department of Neurosurgery, Amsterdam University Medical Centers, Cancer Center Amsterdam, Brain Tumor Center Amsterdam, Amsterdam, The Netherlands
- Yaran Zhang
- Department of Neurosurgery, Amsterdam University Medical Centers, Cancer Center Amsterdam, Brain Tumor Center Amsterdam, Amsterdam, The Netherlands
- Albert J. Kooistra
- Division of Medicinal Chemistry, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Department of Drug Design and Pharmacology, University of Copenhagen, Copenhagen, Denmark
- Andreas Bender
- Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
- Rob Leurs
- Division of Medicinal Chemistry, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- David Bailey
- The WINDOW consortium, www.window-consortium.org
- IOTA Pharmaceuticals Ltd, St Johns Innovation Centre, Cambridge, United Kingdom
- Thomas Würdinger
- Department of Neurosurgery, Amsterdam University Medical Centers, Cancer Center Amsterdam, Brain Tumor Center Amsterdam, Amsterdam, The Netherlands
- The WINDOW consortium, www.window-consortium.org
- Chris de Graaf
- Division of Medicinal Chemistry, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Iwan J. P. de Esch
- Division of Medicinal Chemistry, Amsterdam Institute of Molecular and Life Sciences (AIMMS), Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
- Bart A. Westerman
- Department of Neurosurgery, Amsterdam University Medical Centers, Cancer Center Amsterdam, Brain Tumor Center Amsterdam, Amsterdam, The Netherlands
- The WINDOW consortium, www.window-consortium.org
|
16
|
Flanary VL, Fisher JL, Wilk EJ, Howton TC, Lasseigne BN. Computational Advancements in Cancer Combination Therapy Prediction. JCO Precis Oncol 2023; 7:e2300261. [PMID: 37824797 DOI: 10.1200/po.23.00261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 07/20/2023] [Accepted: 08/15/2023] [Indexed: 10/14/2023] Open
Abstract
Given the high attrition rate of de novo drug discovery and limited efficacy of single-agent therapies in cancer treatment, combination therapy prediction through in silico drug repurposing has risen as a time- and cost-effective alternative for identifying novel and potentially efficacious therapies for cancer. The purpose of this review is to provide an introduction to computational methods for cancer combination therapy prediction and to summarize recent studies that implement each of these methods. A systematic search of the PubMed database was performed, focusing on studies published within the past 10 years. Our search included reviews and articles of ongoing and retrospective studies. We prioritized articles with findings that suggest considerations for improving combination therapy prediction methods over providing a meta-analysis of all currently available cancer combination therapy prediction methods. Computational methods used for drug combination therapy prediction in cancer research include networks, regression-based machine learning, classifier machine learning models, and deep learning approaches. Each method class has its own advantages and disadvantages, so careful consideration is needed to determine the most suitable class when designing a combination therapy prediction method. Future directions to improve current combination therapy prediction technology include incorporation of disease pathobiology, drug characteristics, patient multiomics data, and drug-drug interactions to determine maximally efficacious and tolerable drug regimens for cancer. As computational methods improve in their capability to integrate patient, drug, and disease data, more comprehensive models can be developed to more accurately predict safe and efficacious combination drug therapies for cancer and other complex diseases.
Affiliation(s)
- Victoria L Flanary
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL
- Jennifer L Fisher
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL
- Elizabeth J Wilk
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL
- Timothy C Howton
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL
- Brittany N Lasseigne
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, The University of Alabama at Birmingham, Birmingham, AL
|
17
|
Roskoski R. Small molecule protein kinase inhibitors approved by regulatory agencies outside of the United States. Pharmacol Res 2023; 194:106847. [PMID: 37454916 DOI: 10.1016/j.phrs.2023.106847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 07/03/2023] [Indexed: 07/18/2023]
Abstract
Owing to genetic alterations and overexpression, the dysregulation of protein kinases plays a significant role in the pathogenesis of many autoimmune and neoplastic disorders and protein kinase antagonists have become an important drug target. Although the efficacy of imatinib in the treatment of chronic myelogenous leukemia in the United States in 2001 was the main driver of protein kinase inhibitor drug discovery, this was preceded by the approval of fasudil (a ROCK antagonist) in Japan in 1995 for the treatment of cerebral vasospasm. There are 21 small molecule protein kinase inhibitors that are approved in China, Japan, Europe, and South Korea that are not approved in the United States and 75 FDA-approved inhibitors in the United States. Of the 21 agents, eleven target receptor protein-tyrosine kinases, eight inhibit nonreceptor protein-tyrosine kinases, and two block protein-serine/threonine kinases. All 21 drugs are orally bioavailable or topically effective. Of the non-FDA approved drugs, sixteen are prescribed for the treatment of neoplastic diseases, three are directed toward inflammatory disorders, one is used for glaucoma, and fasudil is used in the management of vasospasm. The leading targets of kinase inhibitors approved by both international regulatory agencies and by the FDA are members of the EGFR family, the VEGFR family, and the JAK family. One-third of the 21 internationally approved drugs are not compliant with Lipinski's rule of five for orally bioavailable drugs. The rule of five relies on four parameters including molecular weight, number of hydrogen bond donors and acceptors, and the Log of the partition coefficient.
Affiliation(s)
- Robert Roskoski
- Blue Ridge Institute for Medical Research, 221 Haywood Knolls Drive, Hendersonville, NC 28791-8717, United States.
|
18
|
Oršolić D, Šmuc T. Dynamic applicability domain (dAD): compound-target binding affinity estimates with local conformal prediction. Bioinformatics 2023; 39:btad465. [PMID: 37594752 PMCID: PMC10457664 DOI: 10.1093/bioinformatics/btad465] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/30/2022] [Revised: 04/26/2023] [Accepted: 08/17/2023] [Indexed: 08/19/2023] Open
Abstract
MOTIVATION: Increasing efforts are being made in the field of machine learning to advance the learning of robust and accurate models from experimentally measured data and enable more efficient drug discovery processes. The prediction of binding affinity is one of the most frequent tasks of compound bioactivity modelling. Learned models for binding affinity prediction are assessed by their average performance on unseen samples, but point predictions are typically not provided with a rigorous confidence assessment. Approaches such as the conformal predictor framework equip conventional models with a more rigorous assessment of confidence for individual point predictions. In this article, we extend the inductive conformal prediction framework for interaction data, in particular the compound-target binding affinity prediction task. The new framework is based on dynamically defined calibration sets that are specific for each testing pair and provides prediction assessment in the context of calibration pairs from its compound-target neighbourhood, enabling improved estimates based on the local properties of the prediction model. RESULTS: The effectiveness of the approach is benchmarked on several publicly available datasets and tested in realistic use-case scenarios with increasing levels of difficulty on a complex compound-target binding affinity space. We demonstrate that in such scenarios, a novel approach combining the applicability domain paradigm with the conformal prediction framework produces superior confidence assessment with valid and more informative prediction regions compared to other 'state-of-the-art' conformal prediction approaches. AVAILABILITY AND IMPLEMENTATION: Dataset and the code are available on GitHub (https://github.com/mlkr-rbi/dAD).
Affiliation(s)
- Davor Oršolić
- Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, Zagreb 10000, Croatia
- Tomislav Šmuc
- Division of Electronics, Ruđer Bošković Institute, Bijenička cesta 54, Zagreb 10000, Croatia
|
19
|
Wang Y, Wang C, Zhou Z, Si J, Li S, Zeng Y, Deng Y, Chen Z. Advances in Simple, Rapid, and Contamination-Free Instantaneous Nucleic Acid Devices for Pathogen Detection. Biosensors 2023; 13:732. [PMID: 37504131 PMCID: PMC10377012 DOI: 10.3390/bios13070732] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/22/2023] [Revised: 07/05/2023] [Accepted: 07/12/2023] [Indexed: 07/29/2023]
Abstract
Pathogenic pathogens invade the human body through various pathways, causing damage to host cells, tissues, and their functions, ultimately leading to the development of diseases and posing a threat to human health. The rapid and accurate detection of pathogenic pathogens in humans is crucial and pressing. Nucleic acid detection offers advantages such as higher sensitivity, accuracy, and specificity compared to antibody and antigen detection methods. However, conventional nucleic acid testing is time-consuming, labor-intensive, and requires sophisticated equipment and specialized medical personnel. Therefore, this review focuses on advanced nucleic acid testing systems that aim to address the issues of testing time, portability, degree of automation, and cross-contamination. These systems include extraction-free rapid nucleic acid testing, fully automated extraction, amplification, and detection, as well as fully enclosed testing and commercial nucleic acid testing equipment. Additionally, the biochemical methods used for extraction, amplification, and detection in nucleic acid testing are briefly described. We hope that this review will inspire further research and the development of more suitable extraction-free reagents and fully automated testing devices for rapid, point-of-care diagnostics.
Affiliation(s)
- Yue Wang
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou 412007, China
- Chengming Wang
- Department of Cardiovascular Medicine, The Affiliated Zhuzhou Hospital Xiangya Medical College, Central South University, Zhuzhou 412000, China
- Zepeng Zhou
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou 412007, China
- Jiajia Si
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou 412007, China
- Song Li
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou 412007, China
- Yezhan Zeng
- School of Electrical and Information Engineering, Hunan University of Technology, Zhuzhou 412007, China
- Yan Deng
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou 412007, China
- Zhu Chen
- Hunan Key Laboratory of Biomedical Nanomaterials and Devices, Hunan University of Technology, Zhuzhou 412007, China
|
20
|
Luukkonen S, Meijer E, Tricarico GA, Hofmans J, Stouten PFW, van Westen GJP, Lenselink EB. Large-Scale Modeling of Sparse Protein Kinase Activity Data. J Chem Inf Model 2023. [PMID: 37294674 DOI: 10.1021/acs.jcim.3c00132] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/11/2023]
Abstract
Protein kinases are a protein family that plays an important role in several complex diseases such as cancer and cardiovascular and immunological diseases. Protein kinases have conserved ATP binding sites, which when targeted can lead to similar activities of inhibitors against different kinases. This can be exploited to create multitarget drugs. On the other hand, selectivity (lack of similar activities) is desirable in order to avoid toxicity issues. There is a vast amount of protein kinase activity data in the public domain, which can be used in many different ways. Multitask machine learning models are expected to excel for these kinds of data sets because they can learn from implicit correlations between tasks (in this case activities against a variety of kinases). However, multitask modeling of sparse data poses two major challenges: (i) creating a balanced train-test split without data leakage and (ii) handling missing data. In this work, we construct a protein kinase benchmark set composed of two balanced splits without data leakage, using random and dissimilarity-driven cluster-based mechanisms, respectively. This data set can be used for benchmarking and developing protein kinase activity prediction models. Overall, the performance on the dissimilarity-driven cluster-based split is lower than on random split-based sets for all models, indicating poor generalizability of models. Nevertheless, we show that multitask deep learning models, on this very sparse data set, outperform single-task deep learning and tree-based models. Finally, we demonstrate that data imputation does not improve the performance of (multitask) models on this benchmark set.
Affiliation(s)
- Sohvi Luukkonen
- Leiden Academic Centre of Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Erik Meijer
- Leiden Academic Centre of Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Johan Hofmans
- Galapagos NV, Generaal De Wittelaan L11 A3, 2800 Mechelen, Belgium
- Pieter F W Stouten
- Leiden Academic Centre of Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
- Galapagos NV, Generaal De Wittelaan L11 A3, 2800 Mechelen, Belgium
- Stouten Pharma Consultancy BV, Kempenarestraat 47, 2860 Sint-Katelijne-Waver, Belgium
- Gerard J P van Westen
- Leiden Academic Centre of Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands
|
21
|
Qian WW, Wei JN, Sanchez-Lengeling B, Lee BK, Luo Y, Vlot M, Dechering K, Peng J, Gerkin RC, Wiltschko AB. Metabolic activity organizes olfactory representations. eLife 2023; 12:e82502. [PMID: 37129358 PMCID: PMC10154027 DOI: 10.7554/elife.82502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2022] [Accepted: 04/11/2023] [Indexed: 05/03/2023] Open
Abstract
Hearing and vision sensory systems are tuned to the natural statistics of acoustic and electromagnetic energy on earth and are evolved to be sensitive in ethologically relevant ranges. But what are the natural statistics of odors, and how do olfactory systems exploit them? Dissecting an accurate machine learning model (Lee et al., 2022) for human odor perception, we find a computable representation for odor at the molecular level that can predict the odor-evoked receptor, neural, and behavioral responses of nearly all terrestrial organisms studied in olfactory neuroscience. Using this olfactory representation (principal odor map [POM]), we find that odorous compounds with similar POM representations are more likely to co-occur within a substance and be metabolically closely related; metabolic reaction sequences (Caspi et al., 2014) also follow smooth paths in POM despite large jumps in molecular structure. Just as the brain's visual representations have evolved around the natural statistics of light and shapes, the natural statistics of metabolism appear to shape the brain's representation of the olfactory world.
Affiliation(s)
- Wesley W Qian
- Osmo, Cambridge, United States
- Google Research, Brain Team, Cambridge, United States
- Brian K Lee
- Google Research, Brain Team, Cambridge, United States
- Yunan Luo
- Department of Computer Science, University of Illinois, Urbana, United States
- Jian Peng
- Department of Computer Science, University of Illinois, Urbana, United States
- Richard C Gerkin
- Osmo, Cambridge, United States
- Google Research, Brain Team, Cambridge, United States
|
22
|
Yousefi N, Yazdani-Jahromi M, Tayebi A, Kolanthai E, Neal CJ, Banerjee T, Gosai A, Balasubramanian G, Seal S, Ozmen Garibay O. BindingSite-AugmentedDTA: enabling a next-generation pipeline for interpretable prediction models in drug repurposing. Brief Bioinform 2023; 24:7140297. [PMID: 37096593 DOI: 10.1093/bib/bbad136] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/11/2022] [Revised: 03/02/2022] [Accepted: 03/16/2023] [Indexed: 04/26/2023] Open
Abstract
While research into drug-target interaction (DTI) prediction is fairly mature, generalizability and interpretability are not always addressed in the existing works in this field. In this paper, we propose a deep learning (DL)-based framework, called BindingSite-AugmentedDTA, which improves drug-target affinity (DTA) predictions by reducing the search space of potential-binding sites of the protein, thus making the binding affinity prediction more efficient and accurate. Our BindingSite-AugmentedDTA is highly generalizable as it can be integrated with any DL-based regression model, while it significantly improves their prediction performance. Also, unlike many existing models, our model is highly interpretable due to its architecture and self-attention mechanism, which can provide a deeper understanding of its underlying prediction mechanism by mapping attention weights back to protein-binding sites. The computational results confirm that our framework can enhance the prediction performance of seven state-of-the-art DTA prediction algorithms in terms of four widely used evaluation metrics, including concordance index, mean squared error, modified squared correlation coefficient ($r^2_m$) and the area under the precision curve. We also contribute to three benchmark drug-target interaction datasets by including additional information on 3D structure of all proteins contained in those datasets, which include the two most commonly used datasets, namely Kiba and Davis, as well as the data from IDG-DREAM drug-kinase binding prediction challenge. Furthermore, we experimentally validate the practical potential of our proposed framework through in-lab experiments. The relatively high agreement between computationally predicted and experimentally observed binding interactions supports the potential of our framework as the next-generation pipeline for prediction models in drug repurposing.
Affiliation(s)
- Niloofar Yousefi
- Industrial Engineering and Management Systems, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
- Mehdi Yazdani-Jahromi
- Computer Science, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
- Aida Tayebi
- Industrial Engineering and Management Systems, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
- Elayaraja Kolanthai
- College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Craig J Neal
- College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Tanumoy Banerjee
- Department of Mechanical Engineering and Mechanics, Lehigh University, Bethlehem 18015, PA, USA
- Ganesh Balasubramanian
- Department of Mechanical Engineering and Mechanics, Lehigh University, Bethlehem 18015, PA, USA
- Sudipta Seal
- College of Medicine, Bionix Cluster, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Advanced Materials Processing and Analysis Center, Department of Materials Science and Engineering, University of Central Florida, 4000 Central Florida Blvd., Orlando 32816, FL, USA
- Ozlem Ozmen Garibay
- Industrial Engineering and Management Systems, University of Central Florida, 32816, 4000 Central Florida Blvd., Orlando, FL, USA
|
23
|
Roskoski R. Rule of five violations among the FDA-approved small molecule protein kinase inhibitors. Pharmacol Res 2023; 191:106774. [PMID: 37075870 DOI: 10.1016/j.phrs.2023.106774] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 04/14/2023] [Accepted: 04/16/2023] [Indexed: 04/21/2023]
Abstract
Because genetic alterations including mutations, overexpression, translocations, and dysregulation of protein kinases are involved in the pathogenesis of many illnesses, this enzyme family is the target of many drug discovery programs in the pharmaceutical industry. Overall, the US FDA has approved 74 small molecule protein kinase inhibitors, nearly all of which are orally effective. Of the 74 approved drugs, forty block receptor protein-tyrosine kinases, eighteen target nonreceptor protein-tyrosine kinases, twelve are directed against protein-serine/threonine protein kinases, and four target dual specificity protein kinases. The data indicate that 63 of these medicinals are approved for the management of neoplasms (51 against solid tumors such as breast, colon, and lung cancers, eight against nonsolid tumors such as leukemia, and four against both types of tumors). Seven of the FDA-approved kinase inhibitors form covalent bonds with their target enzymes and they are accordingly classified as TCIs (targeted covalent inhibitors). Medicinal chemists have examined the physicochemical properties of drugs that are orally effective. Lipinski's rule of five (Ro5) is a computational procedure that is used to estimate solubility, membrane permeability, and pharmacological effectiveness in the drug-discovery setting. It relies on four parameters including molecular weight, number of hydrogen bond donors and acceptors, and the Log of the partition coefficient. Other important descriptors include the lipophilic efficiency, the polar surface area, and the number of rotatable bonds and aromatic rings. We tabulated these and other properties of the FDA-approved kinase inhibitors. Of the 74 approved drugs, 30 fail to comply with the rule of five.
Collapse
Affiliation(s)
- Robert Roskoski
- Blue Ridge Institute for Medical Research, 221 Haywood Knolls Drive, Hendersonville, NC 28791-8717, United States.
|
24
|
Sadybekov AV, Katritch V. Computational approaches streamlining drug discovery. Nature 2023; 616:673-685. [PMID: 37100941 DOI: 10.1038/s41586-023-05905-z] [Citation(s) in RCA: 184] [Impact Index Per Article: 184.0] [Received: 04/21/2022] [Accepted: 03/01/2023] [Indexed: 04/28/2023]
Abstract
Computer-aided drug discovery has been around for decades, although the past few years have seen a tectonic shift towards embracing computational technologies in both academia and pharma. This shift is largely defined by the flood of data on ligand properties and binding to therapeutic targets and their 3D structures, abundant computing capacities and the advent of on-demand virtual libraries of drug-like small molecules in their billions. Taking full advantage of these resources requires fast computational methods for effective ligand screening. This includes structure-based virtual screening of gigascale chemical spaces, further facilitated by fast iterative screening approaches. Highly synergistic are developments in deep learning predictions of ligand properties and target activities in lieu of receptor structure. Here we review recent advances in ligand discovery technologies, their potential for reshaping the whole process of drug discovery and development, as well as the challenges they encounter. We also discuss how the rapid identification of highly diverse, potent, target-selective and drug-like ligands to protein targets can democratize the drug discovery process, presenting new opportunities for the cost-effective development of safer and more effective small-molecule treatments.
Affiliation(s)
- Anastasiia V Sadybekov
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Center for New Technologies in Drug Discovery and Development, Bridge Institute, Michelson Center for Convergent Biosciences, University of Southern California, Los Angeles, CA, USA
- Vsevolod Katritch
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA.
- Center for New Technologies in Drug Discovery and Development, Bridge Institute, Michelson Center for Convergent Biosciences, University of Southern California, Los Angeles, CA, USA.
- Department of Chemistry, University of Southern California, Los Angeles, CA, USA.
|
25
|
Lien ST, Lin TE, Hsieh JH, Sung TY, Chen JH, Hsu KC. Establishment of extensive artificial intelligence models for kinase inhibitor prediction: Identification of novel PDGFRB inhibitors. Comput Biol Med 2023; 156:106722. [PMID: 36878123 DOI: 10.1016/j.compbiomed.2023.106722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 02/16/2023] [Accepted: 02/26/2023] [Indexed: 03/06/2023]
Abstract
Identifying hit compounds is an important step in drug development. Unfortunately, this process continues to be a challenging task. Several machine learning models have been generated to aid in simplifying and improving the prediction of candidate compounds. Models tuned for predicting kinase inhibitors have been established. However, an effective model can be limited by the size of the chosen training dataset. In this study, we tested several machine learning models to predict potential kinase inhibitors. A dataset was curated from a number of publicly available repositories. This resulted in a comprehensive dataset covering more than half of the human kinome. More than 2,000 kinase models were established using different model approaches. The performances of the models were compared, and the Keras-MLP model was determined to be the best performing model. The model was then used to screen a chemical library for potential inhibitors targeting platelet-derived growth factor receptor-β (PDGFRB). Several PDGFRB candidates were selected, and in vitro assays confirmed four compounds with PDGFRB inhibitory activity and IC50 values in the nanomolar range. These results show the effectiveness of machine learning models trained on the reported dataset. This report would aid in the establishment of machine learning models as well as in the discovery of novel kinase inhibitors.
Affiliation(s)
- Ssu-Ting Lien
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Tony Eight Lin
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Jui-Hua Hsieh
- Division of Translational Toxicology, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, NC, USA
- Tzu-Ying Sung
- Biomedical Translation Research Center, Academia Sinica, Taipei, Taiwan
- Jun-Hong Chen
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Kai-Cheng Hsu
- Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Ph.D. Program for Cancer Molecular Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan; Ph.D. Program in Drug Discovery and Development Industry, College of Pharmacy, Taipei Medical University, Taipei, Taiwan; Cancer Center, Wan Fang Hospital, Taipei Medical University, Taipei, Taiwan; TMU Research Center of Cancer Translational Medicine, Taipei Medical University, Taipei, Taiwan; TMU Research Center of Drug Discovery, Taipei Medical University, Taipei, Taiwan.
|
26
|
Atas Guvenilir H, Doğan T. How to approach machine learning-based prediction of drug/compound-target interactions. J Cheminform 2023; 15:16. [PMID: 36747300 PMCID: PMC9901167 DOI: 10.1186/s13321-023-00689-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/02/2022] [Accepted: 01/30/2023] [Indexed: 02/08/2023] Open
Abstract
The identification of drug/compound-target interactions (DTIs) constitutes the basis of drug discovery, for which computational predictive approaches have been developed. As a relatively new data-driven paradigm, proteochemometric (PCM) modeling utilizes both protein and compound properties as a pair at the input level and processes them via statistical/machine learning. The representation of input samples (i.e., proteins and their ligands) in the form of quantitative feature vectors is crucial for the extraction of interaction-related properties during the artificial learning and subsequent prediction of DTIs. Lately, the representation learning approach, in which input samples are automatically featurized via training and applying a machine/deep learning model, has been utilized in biomedical sciences. In this study, we performed a comprehensive investigation of different computational approaches/techniques for protein featurization (including both conventional approaches and the novel learned embeddings), data preparation and exploration, machine learning-based modeling, and performance evaluation with the aim of achieving better data representations and more successful learning in DTI prediction. For this, we first constructed realistic and challenging benchmark datasets on small, medium, and large scales to be used as reliable gold standards for specific DTI modeling tasks. We developed and applied a network analysis-based splitting strategy to divide datasets into structurally different training and test folds. Using these datasets together with various featurization methods, we trained and tested DTI prediction models and evaluated their performance from different angles. 
Our main findings can be summarized under three items: (i) random splitting of datasets into train and test folds leads to near-complete data memorization and produces highly over-optimistic results, and should therefore be avoided; (ii) learned protein sequence embeddings work well in DTI prediction and offer high potential, even though interaction-related properties (e.g., structures) of proteins are unused during their self-supervised model training; and (iii) during the learning process, PCM models tend to rely heavily on compound features while partially ignoring protein features, primarily due to the inherent bias in DTI data, indicating the requirement for new and unbiased datasets. We hope this study will aid researchers in designing robust and high-performing data-driven DTI prediction systems that have real-world translational value in drug discovery.
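The contrast between random and structure-aware splitting can be sketched as follows. This is not the authors' network-analysis strategy, whose details the abstract does not give. It is a simplified stand-in that links items judged similar (the `similar_pairs` input is assumed to come from some thresholded similarity measure), finds connected components, and assigns each whole component to either the training or the test fold, so that no test item has a linked neighbour in training. All function names are hypothetical.

```python
from collections import defaultdict


def connected_components(n_items, edges):
    """Group item indices 0..n_items-1 into connected components (union-find)."""
    parent = list(range(n_items))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for i in range(n_items):
        groups[find(i)].append(i)
    return list(groups.values())


def component_split(n_items, similar_pairs, test_fraction=0.2):
    """Assign whole components to the test fold, smallest first,
    for as long as the fold stays within test_fraction of all items."""
    components = sorted(connected_components(n_items, similar_pairs), key=len)
    budget = test_fraction * n_items
    train, test = [], []
    for component in components:
        if len(test) + len(component) <= budget:
            test.extend(component)
        else:
            train.extend(component)
    return sorted(train), sorted(test)


# Ten hypothetical items; each pair marks two items considered similar.
pairs = [(0, 1), (1, 2), (3, 4), (5, 6), (6, 7), (7, 8)]
train_idx, test_idx = component_split(10, pairs, test_fraction=0.5)
```

Filling the test fold from the smallest components first is an arbitrary choice made for brevity; the test fold may end up smaller than the requested fraction when the remaining components are too large to fit.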
Affiliation(s)
- Heval Atas Guvenilir
- Biological Data Science Laboratory, Department of Computer Engineering, Hacettepe University, Ankara, Turkey
- Department of Health Informatics, Graduate School of Informatics, METU, Ankara, Turkey
- Tunca Doğan
- Biological Data Science Laboratory, Department of Computer Engineering, Hacettepe University, Ankara, Turkey.
- Institute of Informatics, Hacettepe University, Ankara, Turkey.
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey.
|
27
|
Outhwaite IR, Singh S, Berger BT, Knapp S, Chodera JD, Seeliger MA. Death by a Thousand Cuts – Combining Kinase Inhibitors for Selective Target Inhibition and Rational Polypharmacology. bioRxiv 2023:2023.01.13.523972. [PMID: 36711619 PMCID: PMC9882273 DOI: 10.1101/2023.01.13.523972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2023]
Abstract
Kinase inhibitors are successful therapeutics in the treatment of cancers and autoimmune diseases and are useful tools in biomedical research. The high sequence and structural conservation of the catalytic kinase domain complicates the development of specific kinase inhibitors. As a consequence, most kinase inhibitors also inhibit off-target kinases, which complicates the interpretation of phenotypic responses. Additionally, inhibition of off-targets may cause toxicity in patients. Therefore, highly selective kinase inhibition is a major goal in both biomedical research and clinical practice. Currently, efforts to improve selective kinase inhibition are dominated by the development of new kinase inhibitors. Here, we present an alternative solution to this problem by combining inhibitors with divergent off-target activities. We have developed a multicompound-multitarget scoring (MMS) framework that combines inhibitors to maximize target inhibition and to minimize off-target inhibition. Additionally, this framework enables rational polypharmacology by allowing optimization of inhibitor combinations against multiple selected on-targets and off-targets. Using MMS with previously published chemogenomic kinase inhibitor datasets, we determine inhibitor combinations that achieve potent activity against a target kinase and that are more selective than the most selective single inhibitor against that target. We validate the calculated effect and selectivity of a combination of inhibitors using the in cellulo NanoBRET assay. The MMS framework is generalizable to other pharmacological targets where compound specificity is a challenge and diverse compound libraries are available.
|
28
|
Roskoski R. Properties of FDA-approved small molecule protein kinase inhibitors: A 2023 update. Pharmacol Res 2023; 187:106552. [PMID: 36403719 DOI: 10.1016/j.phrs.2022.106552] [Citation(s) in RCA: 111] [Impact Index Per Article: 111.0] [Received: 11/08/2022] [Accepted: 11/08/2022] [Indexed: 11/18/2022]
Abstract
Owing to the dysregulation of protein kinase activity in many diseases including cancer, this enzyme family has become one of the most important drug targets in the 21st century. There are 72 FDA-approved therapeutic agents that target about two dozen different protein kinases and three of these drugs were approved in 2022. Of the approved drugs, twelve target protein-serine/threonine protein kinases, four are directed against dual specificity protein kinases (MEK1/2), sixteen block nonreceptor protein-tyrosine kinases, and 40 target receptor protein-tyrosine kinases. The data indicate that 62 of these drugs are prescribed for the treatment of neoplasms (57 against solid tumors including breast, lung, and colon, ten against nonsolid tumors such as leukemia, and four against both solid and nonsolid tumors: acalabrutinib, ibrutinib, imatinib, and midostaurin). Four drugs (abrocitinib, baricitinib, tofacitinib, upadacitinib) are used for the treatment of inflammatory diseases (atopic dermatitis, psoriatic arthritis, rheumatoid arthritis, Crohn disease, and ulcerative colitis). Of the 72 approved drugs, eighteen are used in the treatment of multiple diseases. The following three drugs received FDA approval in 2022 for the treatment of these specified diseases: abrocitinib (atopic dermatitis), futibatinib (cholangiocarcinomas), pacritinib (myelofibrosis). All of the FDA-approved drugs are orally effective with the exception of netarsudil, temsirolimus, and trilaciclib. This review summarizes the physicochemical properties of all 72 FDA-approved small molecule protein kinase inhibitors including lipophilic efficiency and ligand efficiency.
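The abstract mentions lipophilic efficiency and ligand efficiency without defining them. The sketch below uses definitions that are common in medicinal chemistry and may differ in detail from those in the review: lipophilic efficiency as pIC50 minus logP, and ligand efficiency as roughly 1.37 × pIC50 divided by the heavy-atom count (in kcal/mol per heavy atom, where 1.37 approximates 2.303·RT near room temperature). Function names and input values are hypothetical.

```python
import math


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nM to pIC50 = -log10(IC50 in mol/L)."""
    return 9.0 - math.log10(ic50_nm)


def lipophilic_efficiency(pic50: float, log_p: float) -> float:
    """LipE = pIC50 - logP (assumed definition)."""
    return pic50 - log_p


def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """LE ~ 1.37 * pIC50 / heavy-atom count, in kcal/mol per heavy atom
    (assumed definition; 1.37 approximates 2.303 * R * T near 300 K)."""
    return 1.37 * pic50 / heavy_atoms


# Made-up inhibitor: IC50 = 10 nM, logP = 3.0, 30 heavy atoms.
pic50 = pic50_from_nanomolar(10.0)
lipe = lipophilic_efficiency(pic50, log_p=3.0)
le = ligand_efficiency(pic50, heavy_atoms=30)
```

Both metrics normalise potency, one by lipophilicity and the other by molecular size, which is why they are often tabulated alongside the Ro5 descriptors.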
|
29
|
Wang Y, Aldahdooh J, Hu Y, Yang H, Vähä-Koskela M, Tang J, Tanoli Z. DrugRepo: a novel approach to repurposing drugs based on chemical and genomic features. Sci Rep 2022; 12:21116. [PMID: 36477604 PMCID: PMC9729186 DOI: 10.1038/s41598-022-24980-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/21/2022] [Accepted: 11/23/2022] [Indexed: 12/12/2022] Open
Abstract
The drug development process consumes 9-12 years and approximately one billion US dollars in costs. Due to the high financial and time costs required by the traditional drug discovery paradigm, repurposing old drugs to treat cancer and rare diseases is becoming popular. Computational approaches are mainly data-driven and involve a systematic analysis of different data types leading to the formulation of repurposing hypotheses. This study presents a novel scoring algorithm based on chemical and genomic data to repurpose drugs for 669 diseases from 22 groups, including various cancers, musculoskeletal, infections, cardiovascular, and skin diseases. The data types used to design the scoring algorithm are chemical structures, drug-target interactions (DTI), pathways, and disease-gene associations. The repurposing scoring algorithm is strengthened by integrating the most comprehensive manually curated datasets for each data type. At DrugRepo score ≥ 0.4, we repurposed 516 approved drugs across 545 diseases. Moreover, hundreds of novel predicted compounds can be matched with ongoing studies at clinical trials. Our analysis is supported by a web tool available at: http://drugrepo.org/.
Affiliation(s)
- Yinyin Wang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Jehad Aldahdooh
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Yingying Hu
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Hongbin Yang
- Department of Chemistry, University of Cambridge, Cambridge, UK
- Markus Vähä-Koskela
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Jing Tang
- Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland.
- Ziaurrehman Tanoli
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland.
- BioICAWtech, Helsinki, Finland.
|
30
|
Raies A, Tulodziecka E, Stainer J, Middleton L, Dhindsa RS, Hill P, Engkvist O, Harper AR, Petrovski S, Vitsios D. DrugnomeAI is an ensemble machine-learning framework for predicting druggability of candidate drug targets. Commun Biol 2022; 5:1291. [PMID: 36434048 PMCID: PMC9700683 DOI: 10.1038/s42003-022-04245-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/26/2022] [Accepted: 11/09/2022] [Indexed: 11/27/2022] Open
Abstract
The druggability of targets is a crucial consideration in drug target selection. Here, we adopt a stochastic semi-supervised ML framework to develop DrugnomeAI, which estimates the druggability likelihood for every protein-coding gene in the human exome. DrugnomeAI integrates gene-level properties from 15 sources resulting in 324 features. The tool generates exome-wide predictions based on labelled sets of known drug targets (median AUC: 0.97), highlighting features from protein-protein interaction networks as top predictors. DrugnomeAI provides generic as well as specialised models stratified by disease type or drug therapeutic modality. The top-ranking DrugnomeAI genes were significantly enriched for genes previously selected for clinical development programs (p value < 1 × 10⁻³⁰⁸) and for genes achieving genome-wide significance in phenome-wide association studies of 450K UK Biobank exomes for binary (p value = 1.7 × 10⁻⁵) and quantitative traits (p value = 1.6 × 10⁻⁷). We accompany our method with a web application (http://drugnomeai.public.cgr.astrazeneca.com) to visualise the druggability predictions and the key features that define gene druggability, per disease type and modality.
Affiliation(s)
- Arwa Raies
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Ewa Tulodziecka
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- James Stainer
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Lawrence Middleton
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Ryan S Dhindsa
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, USA
- Pamela Hill
- Emerging Innovations, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Waltham, MA, USA
- Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Andrew R Harper
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Slavé Petrovski
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK
- Department of Medicine, University of Melbourne, Austin Health, Melbourne, VIC, Australia
- Dimitrios Vitsios
- Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK.
|
31
|
Zhou Y, Al‐Jarf R, Alavi A, Nguyen TB, Rodrigues CHM, Pires DEV, Ascher DB. kinCSM: Using graph-based signatures to predict small molecule CDK2 inhibitors. Protein Sci 2022; 31:e4453. [PMID: 36305769 PMCID: PMC9597374 DOI: 10.1002/pro.4453] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/31/2022] [Revised: 09/14/2022] [Accepted: 09/15/2022] [Indexed: 11/20/2022]
Abstract
Protein phosphorylation acts as an essential on/off switch in many cellular signaling pathways. This has led to ongoing interest in targeting kinases for therapeutic intervention. Computer-aided drug discovery has proven a useful and cost-effective approach for facilitating prioritization and enrichment of screening libraries, but limited effort has been devoted to providing insights into what makes a potent kinase inhibitor. To fill this gap, here we developed kinCSM, an integrative computational tool capable of accurately identifying potent cyclin-dependent kinase 2 (CDK2) inhibitors, quantitatively predicting CDK2 ligand-kinase inhibition constants (pKi) and classifying different types of inhibitors based on their favorable binding modes. kinCSM predictive models were built using supervised learning and leveraged the concept of graph-based signatures to capture both physicochemical and geometric properties of small molecules. CDK2 inhibitors were accurately identified with Matthews correlation coefficients (MCC) of up to 0.74, and inhibition constants were predicted with Pearson's correlation of up to 0.76, with consistent performances of 0.66 and 0.68, respectively, on a nonredundant blind test. kinCSM was also able to identify the potential type of inhibition for a given molecule, achieving MCC of up to 0.80 on cross-validation and 0.73 on the blind test. Analysis of molecular composition revealed enriched chemical fragments in CDK2 inhibitors and in different types of inhibitors, which provides insights into the molecular mechanisms behind ligand-kinase interactions. kinCSM will be an invaluable tool to guide future kinase drug discovery. To aid the fast and accurate screening of CDK2 inhibitors, kinCSM is freely available at https://biosig.lab.uq.edu.au/kin_csm/.
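For readers unfamiliar with the MCC values quoted above, the following minimal sketch computes the Matthews correlation coefficient for a binary classifier from confusion-matrix counts. The function name and the example counts are illustrative only and are unrelated to the kinCSM results.

```python
import math


def mcc_from_counts(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts: every prediction correct, every one wrong, and chance level.
perfect = mcc_from_counts(tp=50, tn=50, fp=0, fn=0)
inverted = mcc_from_counts(tp=0, tn=0, fp=50, fn=50)
chance = mcc_from_counts(tp=25, tn=25, fp=25, fn=25)
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which makes it less sensitive than accuracy to imbalance between active and inactive compounds.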
Affiliation(s)
- Yunzhuo Zhou
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Raghad Al‐Jarf
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Azadeh Alavi
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Thanh Binh Nguyen
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Carlos H. M. Rodrigues
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Douglas E. V. Pires
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
- David B. Ascher
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Queensland, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
|
32
|
Wang T, Pulkkinen OI, Aittokallio T. Target-specific compound selectivity for multi-target drug discovery and repurposing. Front Pharmacol 2022; 13:1003480. [PMID: 36225560 PMCID: PMC9549418 DOI: 10.3389/fphar.2022.1003480] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/26/2022] [Accepted: 08/15/2022] [Indexed: 11/13/2022] Open
Abstract
Most drug molecules modulate multiple target proteins, leading either to therapeutic effects or unwanted side effects. Such target promiscuity partly contributes to high attrition rates, leads to wasted costs and time in the current drug discovery process, and makes the assessment of compound selectivity an important factor in drug development and repurposing efforts. Traditionally, selectivity of a compound is characterized in terms of its target activity profile (wide or narrow), which can be quantified using various statistical and information theoretic metrics. Even though the existing selectivity metrics are widely used for characterizing the overall selectivity of a compound, they fall short in quantifying how selective the compound is against a particular target protein (e.g., disease target of interest). We therefore extended the concept of compound selectivity towards target-specific selectivity, defined as the potency of a compound to bind to the particular protein in comparison to the other potential targets. We decompose the target-specific selectivity into two components: 1) the compound’s potency against the target of interest (absolute potency), and 2) the compound’s potency against the other targets (relative potency). The maximally selective compound-target pairs are then identified as a solution of a bi-objective optimization problem that simultaneously optimizes these two potency metrics. In computational experiments carried out using a large-scale kinase inhibitor dataset, which represents a wide range of polypharmacological activities, we show how the optimization-based selectivity scoring offers a systematic approach to finding both potent and selective compounds against given kinase targets. Compared to the existing selectivity metrics, we show how the target-specific selectivity provides additional insights into the target selectivity and promiscuity of multi-targeting kinase inhibitors.
Even though the selectivity score is shown to be relatively robust against both missing bioactivity values and the dataset size, we further developed a permutation-based procedure to calculate empirical p-values to assess the statistical significance of the observed selectivity of a compound-target pair in the given bioactivity dataset. We present several case studies that show how the target-specific selectivity can distinguish between highly selective and broadly-active kinase inhibitors, hence facilitating the discovery or repurposing of multi-targeting drugs.
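The bi-objective idea can be illustrated with a small Pareto-front sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: absolute potency is taken as the compound's activity value (e.g., a pKd-like score, higher is stronger) against the target of interest, and relative potency as that value minus the mean activity against all other kinases in the profile. Compound names, kinase names, and values are made up, and each profile needs at least two kinases.

```python
def potency_components(profile: dict[str, float], target: str) -> tuple[float, float]:
    """Return (absolute, relative) potency of one compound for `target`.

    Assumed definitions: absolute = activity against the target;
    relative = absolute minus the mean activity against the other kinases.
    """
    absolute = profile[target]
    others = [value for kinase, value in profile.items() if kinase != target]
    relative = absolute - sum(others) / len(others)
    return absolute, relative


def pareto_front(points: dict[str, tuple[float, float]]) -> list[str]:
    """Names of compounds not dominated by any other (both objectives maximized)."""
    front = []
    for name, (a, r) in points.items():
        dominated = any(
            a2 >= a and r2 >= r and (a2 > a or r2 > r)
            for other, (a2, r2) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical activity profiles over three kinases.
profiles = {
    "cpd_A": {"K1": 9.0, "K2": 5.0, "K3": 5.0},   # potent and fairly selective
    "cpd_B": {"K1": 8.0, "K2": 7.5, "K3": 7.5},   # potent but broadly active
    "cpd_C": {"K1": 7.0, "K2": 2.0, "K3": 2.0},   # less potent, most selective
}
scores = {name: potency_components(p, "K1") for name, p in profiles.items()}
selected = pareto_front(scores)
```

A Pareto front avoids collapsing the two potency components into a single weighted score; the permutation-based p-values described in the abstract would be a separate step and are not covered by this sketch.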
Affiliation(s)
- Tianduanyi Wang
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Department of Computer Science, Aalto University, Espoo, Finland
- Otto I. Pulkkinen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
- Department of Mathematics and Statistics and InFLAMES Research Flagship, University of Turku, Turku, Finland
- Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
- Department of Mathematics and Statistics and InFLAMES Research Flagship, University of Turku, Turku, Finland
- Institute for Cancer Research, Department of Cancer Genetics, Oslo University Hospital, Oslo, Norway
- Oslo Centre for Biostatistics and Epidemiology (OCBE), Faculty of Medicine, University of Oslo, Oslo, Norway
- *Correspondence: Tero Aittokallio,
|
33
|
Aldahdooh J, Vähä-Koskela M, Tang J, Tanoli Z. Using BERT to identify drug-target interactions from whole PubMed. BMC Bioinformatics 2022; 23:245. [PMID: 35729494 PMCID: PMC9214985 DOI: 10.1186/s12859-022-04768-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2021] [Accepted: 06/03/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Drug-target interactions (DTIs) are critical for drug repurposing and elucidation of drug mechanisms, and are manually curated by large databases, such as ChEMBL, BindingDB, DrugBank and DrugTargetCommons. However, the number of curated articles likely constitutes only a fraction of all the articles that contain experimentally determined DTIs. Finding such articles and extracting the experimental information is a challenging task, and there is a pressing need for systematic approaches to assist the curation of DTIs. To this end, we applied Bidirectional Encoder Representations from Transformers (BERT) to identify such articles. Because DTI data intimately depend on the type of assays used to generate them, we also aimed to incorporate functions to predict the assay format. RESULTS Our novel method identified 0.6 million articles (along with drug and protein information) that were not previously included in public DTI databases. Using 10-fold cross-validation, we obtained ~99% accuracy for identifying articles containing quantitative drug-target profiles. The micro-averaged F1 score for the prediction of assay format is 88%, which leaves room for improvement in future studies. CONCLUSION The BERT model in this study is robust and the proposed pipeline can be used to identify previously overlooked articles containing quantitative DTIs. Overall, our method provides a significant advancement in machine-assisted DTI extraction and curation. We expect it to be a useful addition to drug mechanism discovery and repurposing.
Collapse
Affiliation(s)
- Jehad Aldahdooh
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Doctoral Programme in Computer Science, University of Helsinki, Helsinki, Finland
- Markus Vähä-Koskela
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Jing Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Ziaurrehman Tanoli
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - BioICAWtech, Helsinki, Finland
|
34
|
Bender A, Schneider N, Segler M, Patrick Walters W, Engkvist O, Rodrigues T. Evaluation guidelines for machine learning tools in the chemical sciences. Nat Rev Chem 2022; 6:428-442. [PMID: 37117429 DOI: 10.1038/s41570-022-00391-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Accepted: 04/13/2022] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (or the impossibility of) comparing and assessing the relevance of new algorithms. Ultimately, this may delay the digitalization of chemistry at scale and confuse method developers, experimentalists, reviewers and journal editors. In this Perspective, we critically discuss a set of method development and evaluation guidelines for different types of ML-based publications, emphasizing supervised learning. We provide a diverse collection of examples from various authors and disciplines in chemistry. While taking into account varying accessibility across research groups, our recommendations focus on reporting completeness and standardizing comparisons between tools. We aim to further contribute to improved ML transparency and credibility by suggesting a checklist of retro-/prospective tests and dissecting their importance. We envisage that the wide adoption and continuous update of best practices will encourage an informed use of ML on real-world problems related to the chemical sciences.
|
35
|
Douglass EF, Allaway RJ, Szalai B, Wang W, Tian T, Fernández-Torras A, Realubit R, Karan C, Zheng S, Pessia A, Tanoli Z, Jafari M, Wan F, Li S, Xiong Y, Duran-Frigola M, Bertoni M, Badia-i-Mompel P, Mateo L, Guitart-Pla O, Chung V, Tang J, Zeng J, Aloy P, Saez-Rodriguez J, Guinney J, Gerhard DS, Califano A. A community challenge for a pancancer drug mechanism of action inference from perturbational profile data. Cell Rep Med 2022; 3:100492. [PMID: 35106508 PMCID: PMC8784774 DOI: 10.1016/j.xcrm.2021.100492] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 02/05/2021] [Revised: 08/08/2021] [Accepted: 12/15/2021] [Indexed: 12/14/2022]
Abstract
The Columbia Cancer Target Discovery and Development (CTD2) Center is developing PANACEA, a resource comprising dose-responses and RNA sequencing (RNA-seq) profiles of 25 cell lines perturbed with ∼400 clinical oncology drugs, to study a tumor-specific drug mechanism of action. Here, this resource serves as the basis for a DREAM Challenge assessing the accuracy and sensitivity of computational algorithms for de novo drug polypharmacology predictions. Dose-response and perturbational profiles for 32 kinase inhibitors are provided to 21 teams who are blind to the identity of the compounds. The teams are asked to predict high-affinity binding targets of each compound among ∼1,300 targets cataloged in DrugBank. The best performing methods leverage gene expression profile similarity analysis as well as deep-learning methodologies trained on individual datasets. This study lays the foundation for future integrative analyses of pharmacogenomic data, reconciliation of polypharmacology effects in different tumor contexts, and insights into network-based assessments of drug mechanisms of action.
- Drug-perturbed RNA sequencing data can be used to identify drug targets
- Technology-based drug-target definitions often subsume literature definitions
- Literature and screening datasets provide complementary information on drug mechanisms
Affiliation(s)
- Eugene F. Douglass
  - Department of Systems Biology, Columbia University Irving Medical Center, 1130 Saint Nicholas Ave., New York, NY 10032, USA
  - Pharmaceutical and Biomedical Sciences, University of Georgia, 250 W. Green Street, Athens, GA 30602, USA
- Robert J. Allaway
  - Computational Oncology Group, Sage Bionetworks, 2901 Third Ave., Ste 330, Seattle, WA 98121, USA
- Bence Szalai
  - Semmelweis University, Faculty of Medicine, Department of Physiology, Budapest, Hungary
- Wenyu Wang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Tingzhong Tian
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Adrià Fernández-Torras
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Ron Realubit
  - Department of Systems Biology, Columbia University Irving Medical Center, 1130 Saint Nicholas Ave., New York, NY 10032, USA
- Charles Karan
  - Department of Systems Biology, Columbia University Irving Medical Center, 1130 Saint Nicholas Ave., New York, NY 10032, USA
- Shuyu Zheng
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Alberto Pessia
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Ziaurrehman Tanoli
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Mohieddin Jafari
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Fangping Wan
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Shuya Li
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Yuanpeng Xiong
  - Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
- Miquel Duran-Frigola
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Martino Bertoni
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Pau Badia-i-Mompel
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Lídia Mateo
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Oriol Guitart-Pla
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Verena Chung
  - Computational Oncology Group, Sage Bionetworks, 2901 Third Ave., Ste 330, Seattle, WA 98121, USA
- Jing Tang
  - Research Program in Systems Oncology, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- Jianyang Zeng
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
  - MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing 100084, China
- Patrick Aloy
  - Joint IRB-BSC-CRG Program in Computational Biology, Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Justin Guinney
  - Computational Oncology Group, Sage Bionetworks, 2901 Third Ave., Ste 330, Seattle, WA 98121, USA
- Daniela S. Gerhard
  - Office of Cancer Genomics, National Cancer Institute, NIH, Bethesda, MD 20892, USA
- Andrea Califano
  - Department of Systems Biology, Columbia University Irving Medical Center, 1130 Saint Nicholas Ave., New York, NY 10032, USA
  - Herbert Irving Comprehensive Cancer Center, Columbia University Irving Medical Center, 1130 Saint Nicholas Ave., New York, NY 10032, USA
  - Department of Medicine, Columbia University Irving Medical Center, 630 W 168th Street, New York, NY 10032, USA
  - Department of Biochemistry & Molecular Biophysics, Columbia University Irving Medical Center, 701 W 168th Street, New York, NY 10032, USA
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, 622 W 168th Street, New York, NY 10032, USA
  - Corresponding author
|
36
|
Polypharmacology: The science of multi-targeting molecules. Pharmacol Res 2022; 176:106055. [PMID: 34990865 DOI: 10.1016/j.phrs.2021.106055] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 11/15/2021] [Revised: 12/23/2021] [Accepted: 12/31/2021] [Indexed: 12/28/2022]
Abstract
Polypharmacology is the concept that a single molecule can interact with two or more targets simultaneously. It offers many advantages over conventional single-targeting molecules. A multi-targeting drug is much more efficacious because of its cumulative efficacy at all of its individual targets, which makes it more effective in complex and multifactorial diseases like cancer, where multiple proteins and pathways are involved in the onset and development of the disease. For a molecule to be polypharmacologic in nature, it needs to possess promiscuity, which is the ability to interact with multiple targets, and at the same time avoid binding to antitargets, which would otherwise result in off-target adverse effects. Certain structural features and physicochemical properties, when present, help researchers predict whether a designed molecule will possess promiscuity. Promiscuity can also be identified via advanced state-of-the-art computational methods. In this review, we also elaborate on the methods by which one can intentionally incorporate promiscuity into molecules and make them polypharmacologic. The polypharmacology paradigm of "one drug-multiple targets" has numerous applications, especially in drug repurposing, where an already established drug is redeveloped for a new indication. Though designing a polypharmacological drug is much more difficult than designing a single-targeting drug, with the current technologies and information regarding different diseases and chemical functional groups, it is plausible for researchers to intentionally design a polypharmacological drug and unlock its advantages.
|
37
|
Kong W, Midena G, Chen Y, Athanasiadis P, Wang T, Rousu J, He L, Aittokallio T. Systematic review of computational methods for drug combination prediction. Comput Struct Biotechnol J 2022; 20:2807-2814. [PMID: 35685365 PMCID: PMC9168078 DOI: 10.1016/j.csbj.2022.05.055] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/14/2022] [Revised: 05/27/2022] [Accepted: 05/27/2022] [Indexed: 12/26/2022] Open
Abstract
Synergistic effects between drugs are rare, highly context-dependent and patient-specific. Hence, there is a need to develop novel approaches to stratify patients for optimal therapy regimens, especially in the context of personalized design of combinatorial treatments. Computational methods enable systematic in-silico screening of combination effects, and can thereby prioritize the most potent combinations for further testing among the massive number of potential combinations. To help researchers choose a prediction method that best fits various real-world applications, we carried out a systematic literature review of 117 computational methods developed to date for drug combination prediction, and classified the methods in terms of their combination prediction tasks and input data requirements. Most current methods focus on prediction or classification of combination synergy, and only a few methods consider the efficacy and potential toxicity of the combinations, which are the key determinants of therapeutic success of drug treatments. Furthermore, there is a need to further develop methods that enable dose-specific predictions of combination effects across multiple doses, which is important for clinical translation of the predictions, as well as model-based identification of biomarkers predictive of heterogeneous drug combination responses. Even if most of the computational methods reviewed focus on anticancer applications, many of the modelling approaches are also applicable to antiviral and other diseases or indications.
|
38
|
Roskoski R. Properties of FDA-approved small molecule protein kinase inhibitors: A 2022 update. Pharmacol Res 2021; 175:106037. [PMID: 34921994 DOI: 10.1016/j.phrs.2021.106037] [Citation(s) in RCA: 127] [Impact Index Per Article: 42.3] [Received: 12/10/2021] [Accepted: 12/12/2021] [Indexed: 01/03/2023]
Abstract
Owing to the dysregulation of protein kinase activity in many diseases including cancer, this enzyme family has become one of the most important drug targets in the 21st century. There are 68 FDA-approved therapeutic agents that target about two dozen different protein kinases and six of these drugs were approved in 2021. Of the approved drugs, twelve target protein-serine/threonine protein kinases, four are directed against dual specificity protein kinases (MEK1/2), thirteen block nonreceptor protein-tyrosine kinases, and 39 target receptor protein-tyrosine kinases. The data indicate that 58 of these drugs are prescribed for the treatment of neoplasms (49 against solid tumors including breast, lung, and colon, five against nonsolid tumors such as leukemias, and four against both solid and nonsolid tumors: acalabrutinib, ibrutinib, imatinib, and midostaurin). Three drugs (baricitinib, tofacitinib, upadacitinib) are used for the treatment of inflammatory diseases including rheumatoid arthritis. Of the 68 approved drugs, eighteen are used in the treatment of multiple diseases. The following six drugs received FDA approval in 2021 for the treatment of these specified diseases: belumosudil (graft vs. host disease), infigratinib (cholangiocarcinomas), mobocertinib and tepotinib (specific forms of non-small cell lung cancer), tivozanib (renal cell carcinoma), and trilaciclib (to decrease chemotherapy-induced myelosuppression). All of the FDA-approved drugs are orally effective with the exception of netarsudil, temsirolimus, and the newly approved trilaciclib. This review summarizes the physicochemical properties of all 68 FDA-approved small molecule protein kinase inhibitors including lipophilic efficiency and ligand efficiency.
Affiliation(s)
- Robert Roskoski
  - Blue Ridge Institute for Medical Research, 3754 Brevard Road, Suite 106, Box 19, Horse Shoe, NC 28742-8814, United States
|
39
|
Born J, Huynh T, Stroobants A, Cornell WD, Manica M. Active Site Sequence Representations of Human Kinases Outperform Full Sequence Representations for Affinity Prediction and Inhibitor Generation: 3D Effects in a 1D Model. J Chem Inf Model 2021; 62:240-257. [PMID: 34905358 DOI: 10.1021/acs.jcim.1c00889] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022]
Abstract
Recent advances in deep learning have enabled the development of large-scale multimodal models for virtual screening and de novo molecular design. The human kinome with its abundant sequence and inhibitor data presents an attractive opportunity to develop proteochemometric models that exploit the size and internal diversity of this family of targets. Here, we challenge a standard practice in sequence-based affinity prediction models: instead of leveraging the full primary structure of proteins, each target is represented by a sequence of 29 discontiguous residues defining the ATP binding site. In kinase-ligand binding affinity prediction, our results show that the reduced active site sequence representation is not only computationally more efficient but consistently yields significantly higher performance than the full primary structure. This trend persists across different models, data sets, and performance metrics and holds true when predicting pIC50 for both unseen ligands and kinases. Our interpretability analysis reveals a potential explanation for the superiority of the active site models: whereas only mild statistical effects about the extraction of three-dimensional (3D) interaction sites take place in the full sequence models, the active site models are equipped with an implicit but strong inductive bias about the 3D structure stemming from the discontiguity of the active sites. Moreover, in direct comparisons, our models perform similarly to or better than previous state-of-the-art approaches in affinity prediction. We then investigate a de novo molecular design task and find that the active site provides benefits in computational efficiency, but otherwise, both kinase representations yield similar optimized affinities (for both SMILES- and SELFIES-based molecular generators). Our work challenges the assumption that the full primary structure is indispensable for modeling human kinases.
Affiliation(s)
- Jannis Born
  - IBM Research Europe, 8804 Rüschlikon, Switzerland
  - Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
- Tien Huynh
  - IBM Research, Yorktown Heights, New York 10598, United States
- Astrid Stroobants
  - Department of Chemistry, Imperial College London, SW7 2AZ London, United Kingdom
- Wendy D Cornell
  - IBM Research, Yorktown Heights, New York 10598, United States
|
40
|
Xiong Z, Jeon M, Allaway RJ, Kang J, Park D, Lee J, Jeon H, Ko M, Jiang H, Zheng M, Tan AC, Guo X, Dang KK, Tropsha A, Hecht C, Das TK, Carlson HA, Abagyan R, Guinney J, Schlessinger A, Cagan R. Crowdsourced identification of multi-target kinase inhibitors for RET- and TAU- based disease: The Multi-Targeting Drug DREAM Challenge. PLoS Comput Biol 2021; 17:e1009302. [PMID: 34520464 PMCID: PMC8483411 DOI: 10.1371/journal.pcbi.1009302] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/26/2021] [Revised: 09/30/2021] [Accepted: 07/23/2021] [Indexed: 01/22/2023] Open
Abstract
A continuing challenge in modern medicine is the identification of safer and more efficacious drugs. Precision therapeutics, which have one molecular target, have long been promised to be safer and more effective than traditional therapies. This approach has proven to be challenging for multiple reasons, including lack of efficacy, rapidly acquired drug resistance, and narrow patient eligibility criteria. An alternative approach is the development of drugs that address the overall disease network by targeting multiple biological targets ('polypharmacology'). Rational development of these molecules will require improved methods for predicting single chemical structures that target multiple drug targets. To address this need, we developed the Multi-Targeting Drug DREAM Challenge, in which we challenged participants to predict single chemical entities that target pro-targets but avoid anti-targets for two unrelated diseases: RET-based tumors and a common form of inherited Tauopathy. Here, we report the results of this DREAM Challenge and the development of two neural network-based machine learning approaches that were applied to the challenge of rational polypharmacology. Together, these platforms provide a potentially useful first step towards developing lead therapeutic compounds that address disease complexity through rational polypharmacology.
Affiliation(s)
- Zhaoping Xiong
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China
- Minji Jeon
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Jaewoo Kang
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
  - Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Republic of Korea
- Donghyeon Park
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Jinhyuk Lee
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Hwisang Jeon
  - Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Republic of Korea
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Miyoung Ko
  - Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
- Hualiang Jiang
  - Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Aik Choon Tan
  - Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, Florida, United States of America
- Xindi Guo
  - Sage Bionetworks, Seattle, Washington, United States of America
- Kristen K. Dang
  - Sage Bionetworks, Seattle, Washington, United States of America
- Alex Tropsha
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Chana Hecht
  - Department of Cell, Developmental, and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Tirtha K. Das
  - Department of Cell, Developmental, and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Heather A. Carlson
  - Department of Medicinal Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
- Ruben Abagyan
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, California, United States of America
- Justin Guinney
  - Sage Bionetworks, Seattle, Washington, United States of America
- Avner Schlessinger
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
- Ross Cagan
  - Department of Cell, Developmental, and Regenerative Biology, Icahn School of Medicine at Mount Sinai, New York City, New York, United States of America
  - Institute of Cancer Sciences, University of Glasgow; Glasgow, Scotland, United Kingdom
|