1
Kelleci Çelik F, Doğan S, Karaduman G. Drug-induced torsadogenicity prediction model: An explainable machine learning-driven quantitative structure-toxicity relationship approach. Comput Biol Med 2024; 182:109209. [PMID: 39332120 DOI: 10.1016/j.compbiomed.2024.109209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2024] [Revised: 09/03/2024] [Accepted: 09/23/2024] [Indexed: 09/29/2024]
Abstract
Drug-induced Torsade de Pointes (TdP), a life-threatening polymorphic ventricular tachyarrhythmia, emerges due to the cardiotoxic effects of pharmaceuticals. The need for precise mechanisms and clinical biomarkers to detect this adverse effect presents substantial challenges in drug safety assessment. In this study, we propose that analyzing the physicochemical properties of pharmaceuticals can provide valuable insights into their potential for torsadogenic cardiotoxicity. Our research centers on estimating TdP risk based on the molecular structure of drugs. We introduce a novel quantitative structure-toxicity relationship (QSTR) prediction model that leverages an in silico approach developed by adopting the 4R rule in laboratory animals. This approach eliminates the need for animal testing, saves time, and reduces cost. Our algorithm has successfully predicted the torsadogenic risks of various pharmaceutical compounds. To develop this model, we employed Support Vector Machine (SVM) and ensemble techniques, including Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). We enhanced the model's predictive accuracy through a rigorous two-step feature selection process. Furthermore, we utilized the SHapley Additive exPlanations (SHAP) technique to explain the prediction of torsadogenic risk, particularly within the RF model. This study represents a significant step towards creating a robust QSTR model, which can serve as an early screening tool for assessing the torsadogenic potential of pharmaceutical candidates or existing drugs. By incorporating molecular structure-based insights, we aim to enhance drug safety evaluation and minimize the risks of drug-induced TdP, ultimately benefiting both patients and the pharmaceutical industry.
Affiliation(s)
- Feyza Kelleci Çelik
- Karamanoğlu Mehmetbey University, Vocational School of Health Services, 70200, Karaman, Turkey.
- Seyyide Doğan
- Karamanoğlu Mehmetbey University, Faculty of Economics and Administrative Science, 70200, Karaman, Turkey
- Gül Karaduman
- Karamanoğlu Mehmetbey University, Department of Mathematics, 70100, Karaman, Turkey
2
Kelleci Çelik F, Karaduman G. Computational modeling of air pollutants for aquatic risk: Prediction of ecological toxicity and exploring structural characteristics. Chemosphere 2024; 366:143501. [PMID: 39384138 DOI: 10.1016/j.chemosphere.2024.143501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 09/22/2024] [Accepted: 10/05/2024] [Indexed: 10/11/2024]
Abstract
Assessing the aquatic toxicity originating from air pollutants is essential to sustaining water resources and maintaining ecosystem safety. Quantitative structure-activity relationship (QSAR) models provide a computational tool for predicting pollutant toxicity, facilitating the identification and evaluation of contaminants and of the structural fragments responsible for their toxicity. One-vs-all (OvA) QSAR is a tailored approach to multi-class QSAR problems. The study aims to assign airborne pollutants to five distinct aquatic hazard categories using OvA-QSAR modeling on a dataset of 254 air contaminants. This QSAR analysis reveals the critical descriptors of air pollutants to target for molecular modification. Various factors, including the selection of relevant mechanistic descriptors, data quality, and outliers, determine the reliability of QSAR models. By employing feature selection and outlier identification approaches, the robustness and accuracy of our QSAR models were significantly increased, leading to more reliable predictions in chemical hazard assessment. The results revealed that models using the Random Forest algorithm performed best on the selected descriptors, with internal and external validation accuracies ranging from 71.90% to 97.53% and from 76.47% to 98.03%, respectively. This study indicated that the aquatic risk of air contaminants might be attributed predominantly to their sp3/sp2 carbon ratio, hydrogen-bond acceptor capability, hydrophilicity/lipophilicity, and van der Waals volumes. These structural features can be critical in developing innovative strategies to mitigate or avoid the chemicals' harmful effects. By supporting air quality improvement, this study contributes to the rapid implementation of measures to protect aquatic ecosystems affected by air pollution.
Affiliation(s)
- Feyza Kelleci Çelik
- Karamanoglu Mehmetbey University, Vocational School of Health Services, 70200, Karaman, Turkey.
- Gul Karaduman
- Karamanoglu Mehmetbey University, Department of Mathematics, 70100, Karaman, Turkey.
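As an illustration of the one-vs-all (OvA) scheme named in the abstract above, the following is a minimal sketch, not the authors' implementation. It shows how a single multi-class hazard label can be decomposed into one binary target per class, and how per-class scores can be combined by taking the highest one. The category names, the sample labels, and the `scorers` lambdas are hypothetical stand-ins; in a real workflow each scorer would be a trained binary classifier such as a Random Forest.

```python
from typing import Callable, Dict, List, Sequence


def one_vs_all_labels(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Turn one multi-class label vector into one binary vector per class."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def predict_one_vs_all(
    scorers: Dict[str, Callable[[Sequence[float]], float]],
    x: Sequence[float],
) -> str:
    """Return the class whose binary scorer gives sample x the highest score."""
    return max(scorers, key=lambda c: scorers[c](x))


# Hypothetical hazard categories for six compounds (not data from the study).
labels = ["cat1", "cat3", "cat2", "cat1", "cat5", "cat4"]
binary_targets = one_vs_all_labels(labels)
print(binary_targets["cat1"])  # [1, 0, 0, 1, 0, 0]

# Hypothetical stand-ins for trained "this class vs. the rest" classifiers.
scorers = {
    "cat1": lambda x: 0.2 * x[0],
    "cat2": lambda x: 0.9 - x[0],
    "cat3": lambda x: 0.5,
}
print(predict_one_vs_all(scorers, [0.1]))  # cat2
```

Each binary target can be modeled and validated on its own, which is consistent with the abstract reporting a range of accuracies rather than a single figure.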
3
Zhu Y, Zhang Y, Li X, Wang L. 3MTox: A motif-level graph-based multi-view chemical language model for toxicity identification with deep interpretation. Journal of Hazardous Materials 2024; 476:135114. [PMID: 38986414 DOI: 10.1016/j.jhazmat.2024.135114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2024] [Revised: 06/24/2024] [Accepted: 07/04/2024] [Indexed: 07/12/2024]
Abstract
Toxicity identification plays a key role in maintaining human health, as it can alert humans to the potential hazards caused by long-term exposure to a wide variety of chemical compounds. Experimental methods for determining toxicity are time-consuming and costly, while computational methods offer an alternative for the early identification of toxicity. For example, some classical ML and DL methods demonstrate excellent performance in toxicity prediction. However, these methods also have defects, such as over-reliance on hand-crafted features and a tendency to overfit. Proposing novel models with superior prediction performance is still an urgent task. In this study, we propose a motif-level graph-based multi-view pretraining language model, called 3MTox, for toxicity identification. The 3MTox model uses Bidirectional Encoder Representations from Transformers (BERT) as the backbone framework and a motif graph as input. The results of extensive experiments showed that our 3MTox model achieved state-of-the-art performance on toxicity benchmark datasets and outperformed the baseline models considered. In addition, the interpretability of the model ensures that it can quickly and accurately identify toxicity sites in a given molecule, thereby contributing to the determination of toxicity status and associated analyses. We think that the 3MTox model is among the most promising tools currently available for toxicity identification.
Affiliation(s)
- Yingying Zhu
- Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Yanhong Zhang
- Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Xinze Li
- Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China
- Ling Wang
- Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Joint International Research Laboratory of Synthetic Biology and Medicine, Ministry of Education, Guangdong Provincial Engineering and Technology Research Center of Biopharmaceuticals, School of Biology and Biological Engineering, South China University of Technology, Guangzhou 510006, China.
4
Khan MZI, Ren JN, Cao C, Ye HYX, Wang H, Guo YM, Yang JR, Chen JZ. Comprehensive hepatotoxicity prediction: ensemble model integrating machine learning and deep learning. Front Pharmacol 2024; 15:1441587. [PMID: 39234116 PMCID: PMC11373136 DOI: 10.3389/fphar.2024.1441587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2024] [Accepted: 07/24/2024] [Indexed: 09/06/2024] Open
Abstract
Background: Chemicals may lead to acute liver injuries, posing a serious threat to human health. Establishing the precise safety profile of a compound is challenging due to the complex and expensive testing procedures. In silico approaches can help identify the potential risk of drug candidates in the initial stage of drug development, thereby reducing development costs. Methods: In the current study, QSAR models for hepatotoxicity prediction were developed using an ensemble strategy that integrates machine learning (ML) and deep learning (DL) algorithms across various molecular features. A large dataset of 2588 chemicals and drugs was randomly divided into training (80%) and test (20%) sets, followed by the training of individual base models using diverse ML or DL algorithms based on three different kinds of descriptors and fingerprints. Feature selection approaches were employed to optimize the models based on their performance. Hybrid ensemble approaches were further utilized to determine the method with the best performance. Results: The voting ensemble classifier emerged as the optimal model, achieving a prediction accuracy of 80.26%, an AUC of 82.84%, and a recall of over 93%, followed by the bagging and stacking ensemble classifiers. The model was further verified by an external test set, internal 10-fold cross-validation, and rigorous benchmark training, exhibiting much better reliability than the published models. Conclusion: The proposed ensemble model offers a dependable, well-performing assessment of the risk of chemicals and drugs inducing liver damage.
Affiliation(s)
- Jia-Nan Ren
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Cheng Cao
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Polytechnic Institute, Zhejiang University, Hangzhou, China
- Hong-Yu-Xiang Ye
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Hao Wang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Ya-Min Guo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Jin-Rong Yang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Polytechnic Institute, Zhejiang University, Hangzhou, China
- Jian-Zhong Chen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
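The random 80/20 split and the voting ensemble mentioned in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published pipeline: the abstract does not say whether hard or soft voting was used, so hard (majority) voting is assumed here, and the helper names `train_test_split` and `majority_vote`, the seed, and the toy predictions are all hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def train_test_split(
    items: Sequence, test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List, List]:
    """Shuffle the items reproducibly and hold out `test_fraction` of them."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Hard voting: for each sample, keep the label most base models chose.

    `predictions` holds one list of labels per base model. With an odd
    number of binary classifiers there are no ties to resolve.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Compound indices stand in for the 2588 chemicals and drugs in the dataset.
train_ids, test_ids = train_test_split(range(2588))
print(len(train_ids), len(test_ids))  # 2070 518

# Hypothetical labels from three base models for four test compounds.
model_outputs = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
print(majority_vote(model_outputs))  # [1, 0, 1, 1]
```

Soft voting would instead average predicted probabilities before thresholding; which variant suits a given dataset is an empirical question.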
5
Abou Hajal A, Al Meslamani AZ. Overcoming barriers to machine learning applications in toxicity prediction. Expert Opin Drug Metab Toxicol 2024; 20:549-553. [PMID: 38088128 DOI: 10.1080/17425255.2023.2294939] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/17/2023] [Accepted: 12/11/2023] [Indexed: 07/25/2024]
Affiliation(s)
- Abdallah Abou Hajal
- College of Pharmacy, Al Ain University, Abu Dhabi, United Arab Emirates
- AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi, United Arab Emirates
- Ahmad Z Al Meslamani
- College of Pharmacy, Al Ain University, Abu Dhabi, United Arab Emirates
- AAU Health and Biomedical Research Center, Al Ain University, Abu Dhabi, United Arab Emirates
6
Umemori Y, Handa K, Yoshimura S, Kageyama M, Iijima T. Development of a Novel In Silico Classification Model to Assess Reactive Metabolite Formation in the Cysteine Trapping Assay and Investigation of Important Substructures. Biomolecules 2024; 14:535. [PMID: 38785942 PMCID: PMC11117661 DOI: 10.3390/biom14050535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 04/25/2024] [Accepted: 04/26/2024] [Indexed: 05/25/2024] Open
Abstract
Predicting whether a compound can cause drug-induced liver injury (DILI) is difficult due to the complexity of drug mechanisms. The cysteine trapping assay is a method for detecting reactive metabolites that bind covalently to microsomes. However, it is cumbersome to use 35S isotope-labeled cysteine for this assay. Therefore, we constructed an in silico classification model for predicting a positive/negative outcome in the cysteine trapping assay. We collected 475 compounds (436 in-house compounds and 39 publicly available drugs) with experimental data generated in this study; the results comprised 248 positives and 227 negatives. Using a Message Passing Neural Network (MPNN) and Random Forest (RF) with extended connectivity fingerprint (ECFP) 4, we built machine learning models to predict the covalent binding risk of compounds. In the time-split dataset, the AUC-ROC values of MPNN and RF were 0.625 and 0.559 in the hold-out test, respectively. This result suggests that the MPNN model has higher predictivity than RF in the time-split dataset. Hence, we conclude that the in silico MPNN classification model for the cysteine trapping assay has better predictive power. Furthermore, most of the substructures that contributed positively to the cysteine trapping assay were consistent with previous results.
Affiliation(s)
- Koichi Handa
- DMPK Research Department, Teijin Institute for Bio-Medical Research, TEIJIN PHARMA LIMITED, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan; (Y.U.); (S.Y.); (M.K.); (T.I.)
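The time-split hold-out evaluation and the AUC-ROC metric reported in the abstract above can be illustrated with the short sketch below. It is an assumption-laden example rather than the authors' code: `time_split` and `auc_roc` are hypothetical helper names, the records and scores are invented, and AUC-ROC is computed through its pairwise-ranking definition (the probability that a randomly chosen positive is scored above a randomly chosen negative).

```python
from typing import List, Sequence, Tuple


def time_split(
    records: Sequence[Tuple[int, str]], cutoff: int
) -> Tuple[List[Tuple[int, str]], List[Tuple[int, str]]]:
    """Records dated before `cutoff` form the training set, the rest the hold-out."""
    train = [r for r in records if r[0] < cutoff]
    test = [r for r in records if r[0] >= cutoff]
    return train, test


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of positive/negative pairs ranked correctly (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical (measurement year, compound id) records, not data from the study.
records = [(2019, "A"), (2020, "B"), (2021, "C"), (2022, "D")]
train, test = time_split(records, cutoff=2021)
print([r[1] for r in train], [r[1] for r in test])  # ['A', 'B'] ['C', 'D']

# Hypothetical assay outcomes (1 = positive) and model scores on a hold-out set.
print(auc_roc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3]))  # 0.75
```

Because a value of 0.5 corresponds to random ranking, this definition also gives a baseline against which the reported 0.625 and 0.559 can be read.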
7
Karaduman G, Kelleci Çelik F. Towards safer pesticide management: A quantitative structure-activity relationship based hazard prediction model. The Science of the Total Environment 2024; 916:170173. [PMID: 38266732 DOI: 10.1016/j.scitotenv.2024.170173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 01/07/2024] [Accepted: 01/13/2024] [Indexed: 01/26/2024]
Abstract
Pesticides are recognized as common environmental contaminants. The potential pesticide hazard to non-target organisms, including various mammal species, is a global concern that requires a comprehensive risk assessment. To assess the toxic effects of pesticides at an early stage, a toxicological risk analysis is conducted to determine pesticide hazard levels. The World Health Organization (WHO) has established five pesticide hazard classes based on lethal dose (LD50) values to perform these assessments. In this paper, we have developed one-vs-all quantitative structure-activity relationship (OvA-QSAR) models using five machine-learning techniques with the selected optimum molecular descriptors. Descriptor selection was conducted based on correlation to evaluate the relevance and significance of individual features in our dataset. Our OvA-QSAR models were built using a dataset obtained from the WHO, covering a wide range of chemical pesticides. These models can predict the hazard category for a pesticide within the five available categories. Notably, our experiments demonstrate the outstanding performance and robustness of the Random Forest (RF) model in addressing the challenge of multi-class classification with the selected descriptors.
Affiliation(s)
- Gül Karaduman
- Karamanoğlu Mehmetbey University, Vocational School of Health Services, 70200 Karaman, Turkey; University of Texas at Arlington, Department of Mathematics, Arlington, TX 76019-0408, USA.
- Feyza Kelleci Çelik
- Karamanoğlu Mehmetbey University, Vocational School of Health Services, 70200 Karaman, Turkey.