1
Banerjee A, Roy K. The application of chemical similarity measures in an unconventional modeling framework c-RASAR along with dimensionality reduction techniques to a representative hepatotoxicity dataset. Sci Rep 2024; 14:20812. [PMID: 39242880 PMCID: PMC11379871 DOI: 10.1038/s41598-024-71892-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 09/02/2024] [Indexed: 09/09/2024] Open
Abstract
With the exponential progress in the field of cheminformatics, the conventional modeling approaches have so far been to employ supervised and unsupervised machine learning (ML) and deep learning models, utilizing the standard molecular descriptors, which represent the structural, physicochemical, and electronic properties of a particular compound. Deviating from the conventional approach, in this investigation, we have employed the classification Read-Across Structure-Activity Relationship (c-RASAR), which involves the amalgamation of the concepts of classification-based quantitative structure-activity relationship (QSAR) and Read-Across to incorporate Read-Across-derived similarity and error-based descriptors into a statistical and machine learning modeling framework. ML models developed from these RASAR descriptors use similarity-based information from the close source neighbors of a particular query compound. We have employed different classification modeling algorithms on the selected QSAR and RASAR descriptors to develop predictive models for efficient prediction of query compounds' hepatotoxicity. The predictivity of each of these models was evaluated on a large number of test set compounds. The best-performing model was also used to screen a true external data set. The concepts of explainable AI (XAI) coupled with Read-Across were used to interpret the contributions of the RASAR descriptors in the best c-RASAR model and to explain the chemical diversity in the dataset. The application of various unsupervised dimensionality reduction techniques like t-SNE and UMAP and the supervised ARKA framework showed the usefulness of the RASAR descriptors over the selected QSAR descriptors in their ability to group similar compounds, enhancing the modelability of the dataset and efficiently identifying activity cliffs. Furthermore, the activity cliffs were also identified from Read-Across by observing the nature of compounds constituting the nearest neighbors for a particular query compound. On comparing our simple linear c-RASAR model with the previously reported models developed using the same dataset derived from the US FDA Orange Book (https://www.accessdata.fda.gov/scripts/cder/ob/index.cfm), it was observed that our model is simple, reproducible, transferable, and highly predictive. The performance of the LDA c-RASAR model on the true external set supersedes that of the previously reported work. Therefore, the present simple LDA c-RASAR model can efficiently be used to predict the hepatotoxicity of query chemicals.
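The idea of deriving descriptors from a query compound's close source neighbors can be illustrated with a minimal sketch. This is not the authors' c-RASAR implementation: the Tanimoto similarity on sets of fingerprint bits, the choice of k, and the descriptor names (`max_sim`, `mean_sim`, `weighted_pos_fraction`) are all assumptions made here for illustration only.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across_descriptors(query: set, sources: list, k: int = 3) -> dict:
    """Compute simple similarity-based descriptors for one query compound.

    sources: non-empty list of (fingerprint_bits, label) pairs,
             with label 1 = toxic and 0 = non-toxic (hypothetical encoding).
    """
    # Rank the source compounds by similarity to the query; keep the k closest.
    scored = sorted(
        ((tanimoto(query, fp), label) for fp, label in sources),
        reverse=True,
    )[:k]
    sims = [s for s, _ in scored]
    total = sum(sims)
    # Similarity-weighted fraction of toxic compounds among the close neighbours.
    weighted_pos = sum(s for s, lab in scored if lab == 1) / total if total else 0.0
    return {
        "max_sim": max(sims),
        "mean_sim": total / len(sims),
        "weighted_pos_fraction": weighted_pos,
    }


# Toy data: bit sets stand in for real molecular fingerprints.
toy_sources = [({1, 2, 3}, 1), ({1, 2, 4}, 1), ({7, 8}, 0), ({1, 9}, 0)]
toy_descriptors = read_across_descriptors({1, 2, 3}, toy_sources, k=3)
```

Descriptors of this kind could then be passed to any classifier (the abstract mentions LDA) in place of, or alongside, conventional structural descriptors.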
Affiliation(s)
- Arkaprava Banerjee
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India
- Kunal Roy
  - Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India
2
Zhang Y, Tian Y, Yan A. A SAR and QSAR study on 3CLpro inhibitors of SARS-CoV-2 using machine learning methods. SAR and QSAR in Environmental Research 2024; 35:531-563. [PMID: 39077983 DOI: 10.1080/1062936x.2024.2375513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 06/27/2024] [Indexed: 07/31/2024]
Abstract
The 3C-like Proteinase (3CLpro) of novel coronaviruses is intricately linked to viral replication, making it a crucial target for antiviral agents. In this study, we employed two fingerprint descriptors (ECFP_4 and MACCS) to comprehensively characterize 889 compounds in our dataset. We constructed 24 classification models using machine learning algorithms, including Support Vector Machine (SVM), Random Forest (RF), extreme Gradient Boosting (XGBoost), and Deep Neural Networks (DNN). Among these models, the DNN- and ECFP_4-based Model 1D_2 achieved the most promising results, with a Matthews correlation coefficient (MCC) value of 0.796 in the 5-fold cross-validation and 0.722 on the test set. The applicability domains of the models were analysed using dSTD-PRO calculations. The collected 889 compounds were clustered by the K-means algorithm, and the relationships between structural fragments and inhibitory activities of the highly active compounds were analysed for the 10 obtained subsets. In addition, based on 464 3CLpro inhibitors, 27 QSAR models were constructed using three machine learning algorithms with a minimum root mean square error (RMSE) of 0.509 on the test set. The applicability domains of the models and the structure-activity relationships reflected by the descriptors were also analysed.
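For readers unfamiliar with the Matthews correlation coefficient quoted above, the following is a minimal sketch of its standard definition for binary labels. It is not taken from the cited study; the function name and the toy labels are illustrative assumptions.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels only; these are not data from the study.
toy_mcc = matthews_corrcoef([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, which is why it is often preferred over accuracy when active and inactive classes are imbalanced.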
Affiliation(s)
- Y Zhang
  - State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, Beijing University of Chemical Technology, Beijing, P. R. China
- Y Tian
  - State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, Beijing University of Chemical Technology, Beijing, P. R. China
- A Yan
  - State Key Laboratory of Chemical Resource Engineering, Department of Pharmaceutical Engineering, Beijing University of Chemical Technology, Beijing, P. R. China
3
Sulimov AV, Ilin IS, Tashchilova AS, Kondakova OA, Kutov DC, Sulimov VB. Docking and other computing tools in drug design against SARS-CoV-2. SAR and QSAR in Environmental Research 2024; 35:91-136. [PMID: 38353209 DOI: 10.1080/1062936x.2024.2306336] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 01/10/2024] [Indexed: 02/16/2024]
Abstract
The use of computer simulation methods has become an indispensable component in identifying drugs against the SARS-CoV-2 coronavirus. There is a huge body of literature on the application of molecular modelling to predict inhibitors against target proteins of SARS-CoV-2. To keep our review clear and readable, we limited ourselves primarily to works that use computational methods to find inhibitors and test the predicted compounds experimentally either in target protein assays or in cell culture with live SARS-CoV-2. Some works reporting the experimental discovery of such inhibitors without computer modelling are included as examples of success. Some computational works without experimental confirmation are also included where their simulation methods or the databases they use attracted our attention. This review collects studies that use various molecular modelling methods: docking, molecular dynamics, quantum mechanics, machine learning, and others. Most of these studies are based on docking, and other methods are used mainly for post-processing to select the best compounds among those found through docking. Simulation methods are presented concisely, information is also provided on databases of organic compounds that can be useful for virtual screening, and the review itself is structured in accordance with coronavirus target proteins.
Affiliation(s)
- A V Sulimov
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- I S Ilin
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- A S Tashchilova
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- O A Kondakova
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- D C Kutov
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
- V B Sulimov
  - Dimonta Ltd., Moscow, Russia
  - Research Computing Center, Lomonosov Moscow State University, Moscow, Russia
4
Wang Q, Lu X, Jia R, Yan X, Wang J, Zhao L, Zhong R, Sun G. Recent advances in chemometric modelling of inhibitors against SARS-CoV-2. Heliyon 2024; 10:e24209. [PMID: 38293468 PMCID: PMC10826659 DOI: 10.1016/j.heliyon.2024.e24209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 01/02/2024] [Accepted: 01/04/2024] [Indexed: 02/01/2024] Open
Abstract
The outbreak of the novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has caused great harm to all countries worldwide. This disease can be prevented by vaccination and managed using various treatment methods, including injections, oral medications, or aerosol therapies. However, the selection of suitable compounds for the research and development of anti-SARS-CoV-2 drugs is a daunting task because of the vast databases of available compounds. The traditional process of drug research and development is time-consuming, labour-intensive, and costly. The application of chemometrics can significantly expedite drug R&D. This is particularly necessary and important for drug development against pandemic public emergency diseases, such as COVID-19. Through various chemometric techniques, such as quantitative structure-activity relationship (QSAR) modelling, molecular docking, and molecular dynamics (MD) simulations, compounds with inhibitory activity against SARS-CoV-2 can be quickly screened, allowing researchers to focus on the few prioritised candidates. In addition, the ADMET properties of the screened candidate compounds should be further explored to promote the successful discovery of anti-SARS-CoV-2 drugs. In this case, considerable time and economic costs can be saved while minimising the need for extensive animal experiments, in line with the 3R principles. This paper focuses on recent advances in chemometric modelling studies of COVID-19-related inhibitors, highlights current limitations, and outlines potential future directions for development.
Affiliation(s)
- Qianqian Wang
  - Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
- Xinyi Lu
  - Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
- Runqing Jia
  - Department of Biology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
- Xinlong Yan
  - Department of Biology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
- Jianhua Wang
  - Beijing Municipal Key Laboratory of Child Development and Nutriomics, Translational Medicine Laboratory, Capital Institute of Pediatrics, Beijing 100124, PR China
- Lijiao Zhao
  - Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
- Rugang Zhong
  - Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
- Guohui Sun
  - Beijing Key Laboratory of Environmental and Viral Oncology, Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
5
Kumar A, Kumar V, Podder T, Ojha PK. First report on ecotoxicological QSTR and I-QSTR modeling for the prediction of acute ecotoxicity of diverse organic chemicals against three protozoan species. Chemosphere 2023:139066. [PMID: 37257655 DOI: 10.1016/j.chemosphere.2023.139066] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/31/2023] [Revised: 05/15/2023] [Accepted: 05/27/2023] [Indexed: 06/02/2023]
Abstract
Recent years have witnessed an upsurge of interest in assessing the toxicity of organic chemicals that have harmful impacts on the environment. In this investigation, we have developed regression-based quantitative structure-toxicity relationship (QSTR) models against three protozoan species (Entosiphon sulcatum, Uronema parduczi, and Chilomonas paramecium) using three descriptor combinations (ETA indices only, non-ETA descriptors only, and both ETA and non-ETA descriptors) to examine the key structural features that determine the toxicity of chemicals toward protozoa. Interspecies models (i-QSTRs) were also generated for efficient data gap-filling of toxicity databases. The internal and external validation metrics of the validated models suggested that the models are statistically reliable and robust. Additionally, using these validated models, we screened the DrugBank database containing 11,300 pharmaceuticals to assess their ecotoxicological properties. The features appearing in the models suggested that nonpolar characteristics, electronegativity, hydrogen bonding, π-π, and hydrophobic interactions are responsible for chemical toxicity toward protozoa. The validated models may be utilized for the development of eco-friendly drugs and chemicals, for data gap-filling of toxicity databases for regulatory purposes and research, and to decrease the use of toxic and hazardous chemicals in the environment.
Affiliation(s)
- Ankur Kumar
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Vinay Kumar
  - Drug Theoretics and Cheminformatics Laboratory (DTC Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Trina Podder
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
- Probir Kumar Ojha
  - Drug Discovery and Development Laboratory (DDD Lab), Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700032, India
6
Daoui O, Elkhattabi S, Chtita S. Rational identification of small molecules derived from 9,10-dihydrophenanthrene as potential inhibitors of 3CLpro enzyme for COVID-19 therapy: a computer-aided drug design approach. Struct Chem 2022; 33:1667-1690. [PMID: 35818588 PMCID: PMC9261181 DOI: 10.1007/s11224-022-02004-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 05/08/2022] [Accepted: 06/23/2022] [Indexed: 01/11/2023]
Abstract
Small molecules such as 9,10-dihydrophenanthrene derivatives show remarkable inhibitory activity against SARS-CoV-2 3CLpro and COVID-19 proliferation, with a strong correlation between their structures and bioactivity. Therefore, these small compounds could be suitable for clinical pharmaceutical use against COVID-19. The objective of this study was to remodel the structures of 9,10-dihydrophenanthrene derivatives to achieve a powerful biological activity against 3CLpro and favorable pharmacokinetic properties for drug design and discovery. Therefore, by the use of bioinformatics techniques, we developed robust 3D-QSAR models that are capable of describing the structure-activity relationship for 46 molecules based on 9,10-dihydrophenanthrene derivatives using CoMFA/SE ($R^2 = 0.97$, $Q^2 = 0.81$, $R^2_{pred} = 0.95$, ${}^cR^2_p = 0.71$) and CoMSIA/SEHDA ($R^2 = 0.94$, $Q^2 = 0.76$, $R^2_{pred} = 0.91$, ${}^cR^2_p = 0.65$) techniques. Accordingly, 96 lead compounds were generated based on a template molecule that showed the highest observed activity in vitro (T40, pIC50 = 5.81), and their activities and bioavailability were predicted in silico. The rational screening outputs of 3D-QSAR, molecular docking, ADMET, and MM-GBSA led to the identification of 9 novel modeled molecules as potent noncovalent drugs against SARS-CoV-2 3CLpro. Finally, by molecular dynamics simulations, the stability and structural dynamics of 3CLpro in its free and complexed forms (PDB code: 6LU7) were discussed in the presence of samples of 9,10-dihydrophenanthrene derivatives in an aqueous environment. Overall, the retrosynthesis of the proposed drug compounds in this study and the evaluation of their bioactivity in vitro and in vivo may be interesting for designing and discovering a new drug effective against COVID-19. Supplementary Information: The online version contains supplementary material available at 10.1007/s11224-022-02004-z.
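The abstract quotes four validation statistics without defining them. The definitions below are the ones commonly used in the QSAR literature and are given only as a reading aid; the cited study may use variants. The notation is an assumption made here: $\hat{y}_{i(-i)}$ is the leave-one-out prediction for training compound $i$, $\bar{y}_{train}$ is the mean observed training response, and $R_r^2$ is the mean squared correlation coefficient of the Y-randomized models.

```latex
\begin{align}
R^2 &= 1 - \frac{\sum_{i \in \mathrm{train}} (y_i - \hat{y}_i)^2}
                {\sum_{i \in \mathrm{train}} (y_i - \bar{y}_{train})^2} \\
Q^2 &= 1 - \frac{\sum_{i \in \mathrm{train}} (y_i - \hat{y}_{i(-i)})^2}
                {\sum_{i \in \mathrm{train}} (y_i - \bar{y}_{train})^2} \\
R^2_{pred} &= 1 - \frac{\sum_{j \in \mathrm{test}} (y_j - \hat{y}_j)^2}
                       {\sum_{j \in \mathrm{test}} (y_j - \bar{y}_{train})^2} \\
{}^cR^2_p &= R \sqrt{R^2 - R_r^2}
\end{align}
```

Under these definitions, $R^2$ measures goodness of fit, $Q^2$ internal (cross-validated) predictivity, $R^2_{pred}$ external predictivity on held-out compounds, and ${}^cR^2_p$ penalizes models whose fit is not clearly better than that obtained with randomized responses.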
Affiliation(s)
- Ossama Daoui
  - Laboratory of Engineering, Systems and Applications, National School of Applied Sciences, Sidi Mohamed Ben Abdellah-Fez University, BP Box 72, Fez, Morocco
- Souad Elkhattabi
  - Laboratory of Engineering, Systems and Applications, National School of Applied Sciences, Sidi Mohamed Ben Abdellah-Fez University, BP Box 72, Fez, Morocco
- Samir Chtita
  - Laboratory of Analytical and Molecular Chemistry, Faculty of Sciences Ben M’Sik, Hassan II University of Casablanca, B.P 7955 Casablanca, Morocco