1
Jabarin A, Shtar G, Feinshtein V, Mazuz E, Shapira B, Ben-Shabat S, Rokach L. Eravacycline, an antibacterial drug, repurposed for pancreatic cancer therapy: insights from a molecular-based deep learning model. Brief Bioinform 2024; 25:bbae108. [PMID: 38647152 PMCID: PMC11033730 DOI: 10.1093/bib/bbae108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Revised: 02/04/2024] [Accepted: 02/25/2024] [Indexed: 04/25/2024] Open
Abstract
BACKGROUND: Pancreatic ductal adenocarcinoma (PDAC) remains a serious threat to health, with limited effective therapeutic options, especially due to advanced stage at diagnosis and its inherent resistance to chemotherapy, making it one of the leading causes of cancer-related deaths worldwide. The lack of clear treatment directions underscores the urgent need for innovative approaches to address and manage this deadly condition. In this research, we repurpose drugs with potential anti-cancer activity using machine learning (ML).
METHODS: We tackle the problem by using a neural network trained on drug-target interaction information enriched with drug-drug interaction information, which has not been used for anti-cancer drug repurposing before. We focus on eravacycline, an antibacterial drug, which was selected and evaluated to assess its anti-cancer effects.
RESULTS: Eravacycline significantly inhibited the proliferation and migration of BxPC-3 cells and induced apoptosis.
CONCLUSION: Our study highlights the potential of drug repurposing for cancer treatment using ML. Eravacycline showed promising results in inhibiting cancer cell proliferation, migration and inducing apoptosis in PDAC. These findings demonstrate that our developed ML drug repurposing models can be applied to a wide range of new oncology therapeutics, to identify potential anti-cancer agents. This highlights the potential and presents a promising approach for identifying new therapeutic options.
Affiliation(s)
- Adi Jabarin
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev (BGU), P.O.B. 653, Beer-Sheva 8410501, Israel
- Guy Shtar
  - Department of Information Systems and Software Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva 8410501, Israel
- Valeria Feinshtein
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev (BGU), P.O.B. 653, Beer-Sheva 8410501, Israel
- Eyal Mazuz
  - Department of Information Systems and Software Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva 8410501, Israel
- Bracha Shapira
  - Department of Information Systems and Software Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva 8410501, Israel
- Shimon Ben-Shabat
  - Department of Clinical Biochemistry and Pharmacology, Ben-Gurion University of the Negev (BGU), P.O.B. 653, Beer-Sheva 8410501, Israel
- Lior Rokach
  - Department of Information Systems and Software Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva 8410501, Israel
2
Shtar G, Solomon A, Mazuz E, Rokach L, Shapira B. A simplified similarity-based approach for drug-drug interaction prediction. PLoS One 2023; 18:e0293629. [PMID: 37943768 PMCID: PMC10635435 DOI: 10.1371/journal.pone.0293629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 10/17/2023] [Indexed: 11/12/2023] Open
Abstract
Drug-drug interactions (DDIs) are a critical component of drug safety surveillance. Laboratory studies aimed at detecting DDIs are typically difficult, expensive, and time-consuming; therefore, developing in-silico methods is critical. Machine learning-based approaches for DDI prediction have been developed; however, in many cases, their ability to achieve high accuracy relies on data only available towards the end of the molecule lifecycle. Here, we propose a simple yet effective similarity-based method for preclinical DDI prediction where only the chemical structure is available. We test the model on new, unseen drugs. To focus on the preclinical problem setting, we conducted a retrospective analysis and tested the models on drugs that were added to a later version of the DrugBank database. We extend an existing method, adjacency matrix factorization with propagation (AMFP), to support unseen molecules by applying a new lookup mechanism to the drugs' chemical structure, lookup adjacency matrix factorization with propagation (LAMFP). We show that using an ensemble of different similarity measures improves the results. We also demonstrate that Chemprop, a message-passing neural network, can be used for DDI prediction. In computational experiments, LAMFP results in high accuracy, with an area under the receiver operating characteristic curve of 0.82 for interactions involving a new drug and an existing drug and for interactions involving only existing drugs. Moreover, LAMFP outperforms state-of-the-art, complex graph neural network DDI prediction methods.
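The general idea of predicting interactions for an unseen drug from structural similarity alone can be sketched as follows. This is a toy illustration, not LAMFP: the feature sets, drug names, and the similarity-weighted voting rule in `score_new_drug` are all assumptions made for the example, and the abstract does not specify the lookup mechanism at this level of detail.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def score_new_drug(new_features, known_features, known_interactions, candidate, k=2):
    """Score a possible interaction between an unseen drug and `candidate`.

    The k known drugs most similar to the new drug vote, weighted by their
    similarity, on whether they are recorded as interacting with `candidate`.
    """
    neighbours = sorted(
        ((tanimoto(new_features, feats), name)
         for name, feats in known_features.items() if name != candidate),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return 0.0
    hits = sum(sim for sim, name in neighbours
               if frozenset((name, candidate)) in known_interactions)
    return hits / total


# Hypothetical structural feature sets (e.g. fingerprint bit indices).
known_features = {
    "drug_a": {1, 2, 3, 4},
    "drug_b": {1, 2, 3, 5},
    "drug_c": {7, 8, 9},
    "drug_d": {6, 7, 8},
}
# Hypothetical known interactions, stored as unordered pairs.
known_interactions = {
    frozenset(("drug_a", "drug_d")),
    frozenset(("drug_b", "drug_d")),
}
new_drug = {1, 2, 3, 6}

score_d = score_new_drug(new_drug, known_features, known_interactions, "drug_d")
score_c = score_new_drug(new_drug, known_features, known_interactions, "drug_c")
```

In this toy data the new drug's two nearest neighbours both interact with `drug_d` and neither interacts with `drug_c`, so `score_d` is 1.0 and `score_c` is 0.0.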
Affiliation(s)
- Guy Shtar
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
  - Department of Information Systems, University of Haifa, Haifa, Israel
- Adir Solomon
  - Department of Information Systems, University of Haifa, Haifa, Israel
- Eyal Mazuz
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Lior Rokach
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Bracha Shapira
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
3
Mazuz E, Shtar G, Kutsky N, Rokach L, Shapira B. Pretrained transformer models for predicting the withdrawal of drugs from the market. Bioinformatics 2023; 39:btad519. [PMID: 37610328 PMCID: PMC10469107 DOI: 10.1093/bioinformatics/btad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 07/24/2023] [Accepted: 08/22/2023] [Indexed: 08/24/2023] Open
Abstract
MOTIVATION: The process of drug discovery is notoriously complex, costing an average of 2.6 billion dollars and taking ∼13 years to bring a new drug to the market. The success rate for new drugs is alarmingly low (around 0.0001%), and severe adverse drug reactions (ADRs) frequently occur, some of which may even result in death. Early identification of potential ADRs is critical to improve the efficiency and safety of the drug development process.
RESULTS: In this study, we employed pretrained large language models (LLMs) to predict the likelihood of a drug being withdrawn from the market due to safety concerns. Our method achieved an area under the curve (AUC) of over 0.75 through cross-database validation, outperforming classical machine learning models and graph-based models. Notably, our pretrained LLMs successfully identified over 50% of the drugs that were subsequently withdrawn, when predictions were made on a subset of drugs with inconsistent labeling between the training and test sets.
AVAILABILITY AND IMPLEMENTATION: The code and datasets are available at https://github.com/eyalmazuz/DrugWithdrawn.
Affiliation(s)
- Eyal Mazuz
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Guy Shtar
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Nir Kutsky
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Lior Rokach
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
- Bracha Shapira
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva, 8410501, Israel
4
Mazuz E, Shtar G, Shapira B, Rokach L. Molecule generation using transformers and policy gradient reinforcement learning. Sci Rep 2023; 13:8799. [PMID: 37258546 DOI: 10.1038/s41598-023-35648-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/08/2022] [Accepted: 05/22/2023] [Indexed: 06/02/2023] Open
Abstract
Generating novel valid molecules is often a difficult task, because the vast chemical space relies on the intuition of experienced chemists. In recent years, deep learning models have helped accelerate this process. These advanced models can also help identify suitable molecules for disease treatment. In this paper, we propose Taiga, a transformer-based architecture for the generation of molecules with desired properties. Using a two-stage approach, we first treat the problem as a language modeling task of predicting the next token, using SMILES strings. Then, we use reinforcement learning to optimize molecular properties such as QED. This approach allows our model to learn the underlying rules of chemistry and more easily optimize for molecules with desired properties. Our evaluation of Taiga, which was performed with multiple datasets and tasks, shows that Taiga is comparable to, or even outperforms, state-of-the-art baselines for molecule optimization, with improvements in the QED ranging from 2 to over 20 percent. The improvement was demonstrated both on datasets containing lead molecules and random molecules. We also show that with its two stages, Taiga is capable of generating molecules with higher biological property scores than the same model without reinforcement learning.
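As a rough illustration of the second stage described in the abstract above, here is a minimal policy-gradient (REINFORCE) sketch. It is not Taiga: a small table of logits stands in for the transformer, the token set `VOCAB` is hypothetical, and `toy_reward` (the fraction of "C" tokens) is a stand-in for a real property score such as QED. The hyperparameters are likewise assumptions chosen for the example.

```python
import math
import random

VOCAB = ["C", "N", "O", "F"]  # hypothetical token set, not a real SMILES vocabulary


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_reward(tokens):
    # Stand-in for a property score such as QED: fraction of "C" tokens.
    return tokens.count("C") / len(tokens)


def reinforce(reward_fn, seq_len=5, steps=300, lr=0.1, seed=0):
    """Train a position-independent token policy with REINFORCE."""
    rng = random.Random(seed)
    logits = [0.0] * len(VOCAB)  # one logit per token (assumption: no context)
    baseline = 0.0               # running mean reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        sampled = rng.choices(range(len(VOCAB)), weights=probs, k=seq_len)
        reward = reward_fn([VOCAB[i] for i in sampled])
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)
        # Gradient of sum_t log p(token_t) w.r.t. logit j is count_j - seq_len * p_j.
        grad = [-seq_len * p for p in probs]
        for i in sampled:
            grad[i] += 1.0
        logits = [w + lr * advantage * g for w, g in zip(logits, grad)]
    return softmax(logits)


final_probs = reinforce(toy_reward)
```

The policy starts uniform over the four tokens; because only sequences containing "C" earn reward, the update shifts probability mass toward that token. A real system would replace the logit table with a pretrained sequence model and the toy reward with a chemistry-aware scoring function.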
Affiliation(s)
- Eyal Mazuz
  - Ben-Gurion University of the Negev, Beersheba, Israel
- Guy Shtar
  - Ben-Gurion University of the Negev, Beersheba, Israel
- Lior Rokach
  - Ben-Gurion University of the Negev, Beersheba, Israel
5
Shtar G, Greenstein-Messica A, Mazuz E, Rokach L, Shapira B. Predicting drug characteristics using biomedical text embedding. BMC Bioinformatics 2022; 23:526. [PMID: 36476573 PMCID: PMC9730627 DOI: 10.1186/s12859-022-05083-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 11/25/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Drug-drug interactions (DDIs) are preventable causes of medical injuries and often result in doctor and emergency room visits. Previous research demonstrates the effectiveness of using matrix completion approaches based on known drug interactions to predict unknown drug-drug interactions. However, in the case of a new drug, where there is limited or no knowledge regarding the drug's existing interactions, such an approach is unsuitable, and other drug's preferences can be used to accurately predict new drug-drug interactions.
METHODS: We propose adjacency biomedical text embedding (ABTE) to address this limitation by using a hybrid approach which combines known drugs' interactions and the drug's biomedical text embeddings to predict the DDIs of both new and well-known drugs.
RESULTS: Our evaluation demonstrates the superiority of this approach compared to recently published DDI prediction models and matrix factorization-based approaches. Furthermore, we compared the use of different text embedding methods in ABTE, and found that the concept embedding approach, which involves biomedical information in the embedding process, provides the highest performance for this task. Additionally, we demonstrate the effectiveness of leveraging biomedical text embedding for additional drugs' biomedical prediction task by presenting text embedding's contribution to a multi-modal pregnancy drug safety classification.
CONCLUSION: Text and concept embeddings created by analyzing domain-specific large-scale biomedical corpora can be used for predicting drug-related properties such as drug-drug interactions and drug safety prediction. Prediction models based on the embeddings resulted in comparable results to hand-crafted features; however, text embeddings do not require manual categorization or data collection and rely solely on the published literature.
Affiliation(s)
- Guy Shtar
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Asnat Greenstein-Messica
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Eyal Mazuz
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Lior Rokach
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Bracha Shapira
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
6
Shtar G, Rokach L, Shapira B, Kohn E, Berkovitch M, Berlin M. Explainable multimodal machine learning model for classifying pregnancy drug safety. Bioinformatics 2022; 38:1102-1109. [PMID: 34791058 DOI: 10.1093/bioinformatics/btab769] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/27/2021] [Revised: 10/07/2021] [Accepted: 11/04/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Teratogenic drugs can cause severe fetal malformation and therefore have critical impact on the health of the fetus, yet the teratogenic risks are unknown for most approved drugs. This article proposes an explainable machine learning model for classifying pregnancy drug safety based on multimodal data and suggests an orthogonal ensemble for modeling multimodal data. To train the proposed model, we created a set of labeled drugs by processing over 100 000 textual responses collected by a large teratology information service. Structured textual information is incorporated into the model by applying clustering analysis to textual features.
RESULTS: We report an area under the receiver operating characteristic curve (AUC) of 0.891 using cross-validation and an AUC of 0.904 for cross-expert validation. Our findings suggest the safety of two drugs during pregnancy, Varenicline and Mebeverine, and suggest that Meloxicam, an NSAID, is of higher risk; according to existing data, the safety of these three drugs during pregnancy is unknown. We also present a web-based application that enables physicians to examine a specific drug and its risk factors.
AVAILABILITY AND IMPLEMENTATION: The code and data are available from https://github.com/goolig/drug_safety_pregnancy_prediction.git.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Guy Shtar
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Lior Rokach
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Bracha Shapira
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Elkana Kohn
  - Clinical Pharmacology and Toxicology Unit, Drug Consultation Center, Shamir Medical Center (Assaf Harofeh), Zerifin, Affiliated to Sackler Faculty of Medicine, Tel-Aviv University, Tel Aviv, Israel
- Matitiahu Berkovitch
  - Clinical Pharmacology and Toxicology Unit, Drug Consultation Center, Shamir Medical Center (Assaf Harofeh), Zerifin, Affiliated to Sackler Faculty of Medicine, Tel-Aviv University, Tel Aviv, Israel
- Maya Berlin
  - Clinical Pharmacology and Toxicology Unit, Drug Consultation Center, Shamir Medical Center (Assaf Harofeh), Zerifin, Affiliated to Sackler Faculty of Medicine, Tel-Aviv University, Tel Aviv, Israel
7
Shtar G, Rokach L, Shapira B, Berkovitch M, Dinavitser N, Cohen R, De Haan T, Kohn E, Berlin M. Artificial Intelligence – Game Changer in the Teratology Information Service. Reprod Toxicol 2020. [DOI: 10.1016/j.reprotox.2020.04.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
8
Shtar G, Rokach L, Shapira B, Nissan R, Hershkovitz A. Using Machine Learning to Predict Rehabilitation Outcomes in Postacute Hip Fracture Patients. Arch Phys Med Rehabil 2020; 102:386-394. [PMID: 32949551 DOI: 10.1016/j.apmr.2020.08.011] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/12/2020] [Revised: 07/12/2020] [Accepted: 08/12/2020] [Indexed: 12/18/2022]
Abstract
OBJECTIVE: To use machine learning-based methods in designing a predictive model of rehabilitation outcomes for postacute hip fracture patients.
DESIGN: A retrospective analysis using linear models, AdaBoost, CatBoost, ExtraTrees, K-Nearest Neighbors, RandomForest, Support vector machine, XGBoost, and voting of all models to develop and validate a predictive model.
SETTING: A university-affiliated 300-bed major postacute geriatric rehabilitation center.
PARTICIPANTS: Consecutive hip fracture patients (N=1625) admitted to a postacute rehabilitation department.
MAIN OUTCOME MEASURES: The FIM instrument, motor FIM (mFIM), and the relative functional gain on mFIM (mFIM effectiveness) as a continuous and binary variable. Ten predictive models were created: base models (linear/logistic regression), and 8 machine learning models (AdaBoost, CatBoost, ExtraTrees, K-Nearest Neighbors, RandomForest, Support vector machine, XGBoost, and a voting ensemble). R² was used to evaluate their performance in predicting a continuous outcome variable, and the area under the receiver operating characteristic curve was used to evaluate the binary outcome. A paired 2-tailed t test compared the results of the different models.
RESULTS: Machine learning-based models yielded better results than the linear and logistic regression models in predicting rehabilitation outcomes. The 3 most important predictors of the mFIM effectiveness score were the Mini Mental State Examination (MMSE), prefracture mFIM scores, and age. The 3 most important predictors of the discharge mFIM score were the admission mFIM, MMSE, and prefracture mFIM scores. The most contributing factors for favorable outcomes (mFIM effectiveness > median) with higher prediction confidence level were high MMSE (25.7±2.8), high prefracture mFIM (81.5±7.8), and high admission mFIM (48.6±8) scores. We present a simple prediction instrument for estimating the expected performance of postacute hip fracture patients.
CONCLUSIONS: The use of machine learning models to predict rehabilitation outcomes of postacute hip fracture patients is superior to linear and logistic regression models. The higher the MMSE, prefracture mFIM, and admission mFIM scores are, the higher the confidence levels of the predicted parameters.
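The two evaluation metrics named in the abstract above, the coefficient of determination for a continuous outcome and the area under the ROC curve for a binary outcome, can be computed from first principles as in the sketch below. The function names and the toy inputs are assumptions for illustration and have no connection to the study's data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def roc_auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example values.
auc_example = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
r2_example = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Here `auc_example` is 0.75, because three of the four positive-negative pairs are ordered correctly, and `r2_example` is 0.0, because predicting the mean everywhere explains none of the variance.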
Affiliation(s)
- Guy Shtar
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Lior Rokach
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Bracha Shapira
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Ran Nissan
  - 'Beit Rivka' Geriatric Rehabilitation Center, Petach Tikva, Israel
- Avital Hershkovitz
  - 'Beit Rivka' Geriatric Rehabilitation Center, Petach Tikva, Israel; Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
9
Shtar G, Rokach L, Shapira B. Detecting drug-drug interactions using artificial neural networks and classic graph similarity measures. PLoS One 2019; 14:e0219796. [PMID: 31369568 PMCID: PMC6675052 DOI: 10.1371/journal.pone.0219796] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 03/13/2019] [Accepted: 07/01/2019] [Indexed: 11/19/2022] Open
Abstract
Drug-drug interactions are preventable causes of medical injuries and often result in doctor and emergency room visits. Computational techniques can be used to predict potential drug-drug interactions. We approach the drug-drug interaction prediction problem as a link prediction problem and present two novel methods for drug-drug interaction prediction based on artificial neural networks and factor propagation over graph nodes: adjacency matrix factorization (AMF) and adjacency matrix factorization with propagation (AMFP). We conduct a retrospective analysis by training our models on a previous release of the DrugBank database with 1,141 drugs and 45,296 drug-drug interactions and evaluate the results on a later version of DrugBank with 1,440 drugs and 248,146 drug-drug interactions. Additionally, we perform a holdout analysis using DrugBank. We report an area under the receiver operating characteristic curve score of 0.807 and 0.990 for the retrospective and holdout analyses respectively. Finally, we create an ensemble-based classifier using AMF, AMFP, and existing link prediction methods and obtain an area under the receiver operating characteristic curve of 0.814 and 0.991 for the retrospective and the holdout analyses. We demonstrate that AMF and AMFP provide state of the art results compared to existing methods and that the ensemble-based classifier improves the performance by combining various predictors. Additionally, we compare our methods with multi-source data-based predictors using cross-validation. In the multi-source data comparison, our methods outperform various ensembles created using 29 different predictors based on several data sources. These results suggest that AMF, AMFP, and the proposed ensemble-based classifier can provide important information during drug development and regarding drug prescription given only partial or noisy data. 
Additionally, the results indicate that the interaction network (known DDIs) is the most useful data source for identifying potential DDIs and that our methods take advantage of it better than the other methods investigated. The methods we present can also be used to solve other link prediction problems. Drug embeddings (compressed representations) created when training our models using the interaction network have been made public.
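Framing interaction prediction as link prediction, as the abstract above does, can be illustrated with one classic neighbourhood similarity measure. The sketch below ranks unobserved pairs in a toy interaction graph by the Jaccard coefficient of their neighbour sets. The choice of Jaccard, the node names, and the edges are assumptions for the example; the abstract does not list which similarity measures were used, and this is not AMF or AMFP.

```python
from itertools import combinations


def build_graph(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def jaccard(graph, u, v):
    """Shared neighbours divided by all neighbours of either node."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def rank_missing_links(graph):
    """Return unobserved pairs, most likely link first."""
    candidates = [(u, v) for u, v in combinations(sorted(graph), 2)
                  if v not in graph[u]]
    return sorted(candidates, key=lambda pair: jaccard(graph, *pair), reverse=True)


# Hypothetical known interactions between drugs A to E.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
toy_graph = build_graph(toy_edges)
ranking = rank_missing_links(toy_graph)
```

In this toy graph the pair (A, D) ranks first because A and D share two of their three combined neighbours, while (A, E) ranks last because they share none. In a retrospective setup like the one described, such a ranking would be scored against interactions that appear only in a later database release.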
Affiliation(s)
- Guy Shtar
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Lior Rokach
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Bracha Shapira
  - Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel