1
|
Zapadka M, Dekowski P, Kupcewicz B. HATS5m as an Example of GETAWAY Molecular Descriptor in Assessing the Similarity/Diversity of the Structural Features of 4-Thiazolidinone. Int J Mol Sci 2022; 23:ijms23126576. [PMID: 35743020 PMCID: PMC9223869 DOI: 10.3390/ijms23126576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2022] [Revised: 04/30/2022] [Accepted: 06/10/2022] [Indexed: 11/29/2022] Open
Abstract
Among the various methods for drug design, the approach using molecular descriptors for quantitative structure–activity relationships (QSAR) holds promise for the prediction of innovative molecular structures with bespoke pharmacological activity. Despite the growing number of successful potential applications, QSAR models often remain hard to interpret. The difficulty arises from the use of advanced chemometric or machine learning methods on the one hand, and the complexity of molecular descriptors on the other hand. Thus, there is a need to interpret molecular descriptors for identifying the features of molecules crucial for desirable activity. For example, the development of structure–activity modeling of different molecule endpoints confirmed the usefulness of H-GETAWAY (H-GEometry, Topology, and Atom-Weights AssemblY) descriptors in molecular sciences. However, compared with other 3D molecular descriptors, H-GETAWAY interpretation is much more complicated. The present study provides insights into the interpretation of the HATS5m descriptor (H-GETAWAY) concerning the molecular structures of the 4-thiazolidinone derivatives with antitrypanosomal activity. According to the published study, an increase in antitrypanosomal activity is associated with both a decrease and an increase in HATS5m (leverage-weighted autocorrelation with lag 5, weighted by atomic masses) values. The substructure-based method explored how the changes in molecular features affect the HATS5m value. Based on this approach, we proposed substituents that translate into low and high HATS5m. The detailed interpretation of H-GETAWAY descriptors requires the consideration of three elements: weighting scheme, leverages, and the Dirac delta function. Particular attention should be paid to the impact of chemical compounds’ size and shape and the leverage values of individual atoms.
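For orientation, the three elements named in the abstract (weighting scheme, leverages, delta function) appear explicitly in the usual definition of the HATS autocorrelation descriptors. The form below follows the general GETAWAY literature rather than this abstract, so the notation is an assumption made here for illustration: $A$ is the number of atoms, $\mathbf{M}$ the matrix of centred atomic coordinates, $h_{ii}$ the diagonal elements (leverages) of the molecular influence matrix $\mathbf{H}$, $w_i$ the atomic weighting, and $d_{ij}$ the topological distance between atoms $i$ and $j$.

```latex
\mathbf{H} = \mathbf{M}\,(\mathbf{M}^{\mathsf{T}}\mathbf{M})^{-1}\,\mathbf{M}^{\mathsf{T}},
\qquad
\mathrm{HATS}_k(w) = \sum_{i=1}^{A-1} \sum_{j>i} (w_i\, h_{ii})\,(w_j\, h_{jj})\;\delta(k;\, d_{ij}),
\qquad
\delta(k;\, d_{ij}) =
\begin{cases}
1, & d_{ij} = k \\
0, & \text{otherwise}
\end{cases}
```

Under this definition, HATS5m corresponds to $k = 5$ with $w_i$ taken as the atomic mass, so only atom pairs separated by exactly five bonds contribute, each through the product of its mass-weighted leverages.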
Affiliation(s)
- Mariusz Zapadka
  - Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Nicolaus Copernicus University in Toruń, Jurasza 2, 85-089 Bydgoszcz, Poland
  - Correspondence: (M.Z.); (B.K.)
- Przemysław Dekowski
  - New Technologies Department, Softmaks.pl Sp. z o.o., Kraszewskiego 1, 85-241 Bydgoszcz, Poland
- Bogumiła Kupcewicz
  - Department of Inorganic and Analytical Chemistry, Faculty of Pharmacy, Nicolaus Copernicus University in Toruń, Jurasza 2, 85-089 Bydgoszcz, Poland
  - Correspondence: (M.Z.); (B.K.)
|
2
|
Application of Multivariate Adaptive Regression Splines (MARSplines) for Predicting Antitumor Activity of Anthrapyrazole Derivatives. Int J Mol Sci 2022; 23:ijms23095132. [PMID: 35563523 PMCID: PMC9104800 DOI: 10.3390/ijms23095132] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/02/2022] [Revised: 04/26/2022] [Accepted: 04/29/2022] [Indexed: 02/01/2023] Open
Abstract
An approach using multivariate adaptive regression splines (MARSplines) was applied for quantitative structure–activity relationship studies of the antitumor activity of anthrapyrazoles. In the first stage, the structures of anthrapyrazole derivatives were subjected to geometrical optimization by the AM1 method using the Polak–Ribiere algorithm. In the next step, a data set of 73 compounds was encoded by over 2500 calculated molecular descriptors. It was shown that fourteen independent variables appearing in the statistically significant MARS model (i.e., descriptors belonging to 3D-MoRSE, 2D autocorrelations, GETAWAY, Burden eigenvalues and RDF descriptors) significantly affect the antitumor activity of anthrapyrazole compounds. The study confirmed the benefit of using a modern machine learning algorithm, since the high predictive power of the obtained model proved useful for the prediction of antitumor activity against murine leukemia L1210. It could certainly be considered as a tool for predicting activity against other cancer cell lines.
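As background on the model form, MARS builds its prediction from hinge basis functions of the type max(0, x − t) and max(0, t − x). The sketch below evaluates a small additive hinge model in Python. The knots, coefficients, and the names `hinge` and `mars_predict` are hypothetical illustrations and are not taken from the anthrapyrazole model; interaction (product) terms, which MARS can also use, are left out for brevity.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) for direction=+1, max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model for a single descriptor value x.

    terms is a list of (coefficient, knot, direction) tuples, one per hinge term.
    """
    return intercept + sum(c * hinge(x, t, d) for c, t, d in terms)


# Hypothetical one-descriptor model: the first term is zero below 2.0 and rises
# with slope 1.5 above it; the second, mirrored term is non-zero only below 1.0.
example_terms = [(1.5, 2.0, 1), (-0.5, 1.0, -1)]
```

Calling `mars_predict(x, 1.0, example_terms)` for a few values of `x` shows the piecewise-linear response that makes each selected descriptor's contribution comparatively easy to read off.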
|
3
|
Kaneko H. Examining variable selection methods for the predictive performance of regression models and the proportion of selected variables and selected random variables. Heliyon 2021; 7:e07356. [PMID: 34195450 PMCID: PMC8237311 DOI: 10.1016/j.heliyon.2021.e07356] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/22/2021] [Revised: 05/02/2021] [Accepted: 06/16/2021] [Indexed: 11/24/2022] Open
Abstract
The selection of a descriptor, X, is crucial for improving the interpretation and prediction accuracy of a regression model. In this study, the prediction accuracy of models constructed using the selected X was determined and the results of variable selection, according to the number of selected X and number of selected variables that are unrelated to an objective variable, such as activities and properties (y), were investigated to evaluate the variable or feature selection methods. Variable selection methods include least absolute shrinkage and selection operator, genetic algorithm-based partial least squares, genetic algorithm-based support vector regression, and Boruta. Several regression analysis methods were used to test the prediction accuracy of the model constructed using the selected X. The characteristics of each variable selection method were analyzed using eight datasets. The results showed that even when variables unrelated to y were selected by variable selection and the number of unrelated variables was the same as the number of the original variables, a regression model with good accuracy, which ignores the influence of such noise variables, can be constructed by applying various regression analysis methods. Additionally, the variables related to y must not be deleted. These findings provide a basis for improving the variable selection methods.
Affiliation(s)
- Hiromasa Kaneko
  - Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
|
4
|
Winkler DA. Use of Artificial Intelligence and Machine Learning for Discovery of Drugs for Neglected Tropical Diseases. Front Chem 2021; 9:614073. [PMID: 33791277 PMCID: PMC8005575 DOI: 10.3389/fchem.2021.614073] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/05/2020] [Accepted: 01/18/2021] [Indexed: 12/11/2022] Open
Abstract
Neglected tropical diseases continue to create high levels of morbidity and mortality in a sizeable fraction of the world’s population, despite ongoing research into new treatments. Some of the most important technological developments that have accelerated drug discovery for diseases of affluent countries have not flowed down to neglected tropical disease drug discovery. Pharmaceutical development business models, cost of developing new drug treatments and subsequent costs to patients, and accessibility of technologies to scientists in most of the affected countries are some of the reasons for this low uptake and slow development relative to that for common diseases in developed countries. Computational methods are starting to make significant inroads into discovery of drugs for neglected tropical diseases due to the increasing availability of large databases that can be used to train machine learning (ML) models, increasing accuracy of these methods, lower entry barrier for researchers, and widespread availability of public domain machine learning codes. Here, the application of artificial intelligence, largely the subset called machine learning, to modelling and prediction of biological activities and discovery of new drugs for neglected tropical diseases is summarized. The pathways for the development of machine learning methods in the short to medium term and the use of other artificial intelligence methods for drug discovery are discussed. The current roadblocks to, and likely impacts of, synergistic new technological developments on the use of ML methods for neglected tropical disease drug discovery in the future are also discussed.
Affiliation(s)
- David A Winkler
  - Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia; Latrobe Institute for Molecular Science, La Trobe University, Bundoora, VIC, Australia; School of Pharmacy, University of Nottingham, Nottingham, United Kingdom; CSIRO Data61, Pullenvale, QLD, Australia
|
5
|
Yang B, Si H, Zhai H. QSAR Studies on the IC50 of a Class of Thiazolidinone/Thiazolide Based Hybrids as Antitrypanosomal Agents. Lett Drug Des Discov 2021. [DOI: 10.2174/1570180817999201102200015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
Background: Trypanosomiasis is a widespread zoonotic disease, and the existing drugs are not enough to prevent and treat it.
Objective: This study aimed to build a quantitative structure-activity relationship model from the chemical structures of a class of thiazolidone/thiazolidamide based hybrids. The model was used to screen new antitrypanosomal agents and predict the properties of composite molecules.
Methods: All compounds were randomly divided into a training set and a test set. A large number of descriptors were calculated by the software, then some of the best descriptors were selected to build the models. The linear model was built by the heuristic method and the nonlinear model was built by the gene expression programming method.
Results: In the heuristic method, the values of R², R²cv, F and S² were 0.581, 0.457, 14.053 and 15.311, respectively. In gene expression programming, R² and S² were 0.715 and 10.997 in the training set and 0.617 and 22.778 in the test set. The results showed that the relative number of S atoms and the minimum bond order of an H atom had a significant positive contribution to IC50. Meanwhile, the relative number of double bonds and the count of hydrogen-bonding acceptor sites had a great negative impact on IC50.
Conclusion: Both the heuristic method and gene expression programming had a good predictive performance. In comparison, the gene expression programming method fitted well with the experimental values, and it was expected to be beneficial in the synthesis of new antitrypanosomal drugs.
Affiliation(s)
- Bo Yang
  - College of Veterinary Medicine, Nanjing Agricultural University, Nanjing 210095, China
- Hongzong Si
  - Institute for Computational Science and Engineering, Qingdao University, Qingdao 266071, China
- Honglin Zhai
  - Department of Chemistry, Lanzhou University, Lanzhou 730000, China
|
6
|
Kryshchyshyn A, Kaminskyy D, Karpenko O, Gzella A, Grellier P, Lesyk R. Thiazolidinone/thiazole based hybrids - New class of antitrypanosomal agents. Eur J Med Chem 2019; 174:292-308. [PMID: 31051403 DOI: 10.1016/j.ejmech.2019.04.052] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 02/07/2019] [Revised: 04/17/2019] [Accepted: 04/17/2019] [Indexed: 12/22/2022]
Abstract
Different compounds have been investigated as potent drugs for trypanosomiasis treatment, but no new drug has been marketed in the past 3 decades. 4-Thiazolidinone/thiazole as privileged structures and cyclic analogs of thiosemicarbazides are well-known scaffolds in novel antitrypanosomal agent design. We present here the design and synthesis of new hybrid molecules bearing thiazolidinone/thiazole cores linked by the hydrazone group with various molecular fragments. Structure optimization led to compounds with phenyl-indole or phenyl-imidazo[2,1-b][1,3,4]thiadiazole moieties showing excellent antitrypanosomal activity towards Trypanosoma brucei brucei and Trypanosoma brucei gambiense. Biological study identified compounds with submicromolar IC50 values, good selectivity indexes, relatively low cytotoxicity towards human primary fibroblasts, and low acute toxicity.
Affiliation(s)
- Anna Kryshchyshyn
  - Department of Pharmaceutical, Organic and Bioorganic Chemistry, Danylo Halytsky Lviv National Medical University, Pekarska 69, Lviv, 79010, Ukraine
- Danylo Kaminskyy
  - Department of Pharmaceutical, Organic and Bioorganic Chemistry, Danylo Halytsky Lviv National Medical University, Pekarska 69, Lviv, 79010, Ukraine
- Andrzej Gzella
  - Department of Organic Chemistry, Poznan University of Medical Sciences, Grunwaldzka 6, Poznan, 60-780, Poland
- Philippe Grellier
  - National Museum of Natural History, UMR 7245 CNRS-MNHN, Team BAMEE, CP 52, 57 Rue Cuvier, 75005, Paris, France
- Roman Lesyk
  - Department of Pharmaceutical, Organic and Bioorganic Chemistry, Danylo Halytsky Lviv National Medical University, Pekarska 69, Lviv, 79010, Ukraine; Department of Public Health, Dietetics and Lifestyle Disorders, Faculty of Medicine, University of Information Technology and Management in Rzeszow, Sucharskiego 2, 35-225 Rzeszow, Poland
|
7
|
Ancuceanu R, Dinu M, Neaga I, Laszlo FG, Boda D. Development of QSAR machine learning-based models to forecast the effect of substances on malignant melanoma cells. Oncol Lett 2019; 17:4188-4196. [PMID: 31007759 PMCID: PMC6466999 DOI: 10.3892/ol.2019.10068] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 09/21/2018] [Accepted: 11/15/2018] [Indexed: 11/20/2022] Open
Abstract
SK-MEL-5 is a human melanoma cell line that has been used in various in vitro studies to explore new therapies against melanoma. In this study, we report on the development of quantitative structure-activity relationship (QSAR) models able to predict the cytotoxic effect of diverse chemical compounds on this cancer cell line. The dataset of cytotoxic and inactive compounds was downloaded from the PubChem database. It contains the data for all chemical compounds for which cytotoxicity results expressed as GI50 were recorded. In total, 13 blocks of molecular descriptors were computed and used, after appropriate pre-processing, in building QSAR models with four machine learning classifiers: random forest (RF), gradient boosting, support vector machine and random k-nearest neighbors. Among the 186 models reported, none had a positive predictive value (PPV) higher than 0.90 in both nested cross-validation and external dataset testing, but 7 models had a PPV higher than 0.85 in both evaluations, all seven using the RF algorithm as a classifier and topological descriptors, information indices, 2D-autocorrelation descriptors, P-VSA-like descriptors, and edge-adjacency descriptors as sets of features used for classification. The y-scrambling test was associated with considerably worse performance (confirming the non-random character of the models), and the applicability domain was assessed through three different methods.
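The y-scrambling check mentioned in the abstract can be illustrated with a minimal Python sketch: the response values are randomly permuted, the model is refitted, and the score on permuted data is compared with the score on the true data. The example deliberately swaps in a one-variable least-squares fit scored by R² in place of the classifiers and PPV used in the study; the toy data and the names `r_squared` and `y_scrambling` are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a one-variable least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def y_scrambling(x, y, n_rounds=50, seed=0):
    """Return (score on the true y, mean score over n_rounds random permutations of y)."""
    rng = random.Random(seed)
    true_score = r_squared(x, y)
    scrambled_scores = []
    for _ in range(n_rounds):
        y_perm = list(y)
        rng.shuffle(y_perm)  # break the descriptor-response pairing
        scrambled_scores.append(r_squared(x, y_perm))
    return true_score, sum(scrambled_scores) / n_rounds


# Toy data (hypothetical): a nearly linear descriptor-response relationship.
x_toy = [float(i) for i in range(20)]
y_toy = [2.0 * xi + 0.1 * ((7 * i) % 5 - 2) for i, xi in enumerate(x_toy)]
true_r2, scrambled_r2 = y_scrambling(x_toy, y_toy)
```

A score on the true responses that is far above the average scrambled score is the pattern read as evidence that a model is not a chance correlation.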
Affiliation(s)
- Robert Ancuceanu
  - Department of Pharmaceutical Botany and Cell Biology, Faculty of Pharmacy, 'Carol Davila' University of Medicine and Pharmacy, 020956 Bucharest, Romania
- Mihaela Dinu
  - Department of Pharmaceutical Botany and Cell Biology, Faculty of Pharmacy, 'Carol Davila' University of Medicine and Pharmacy, 020956 Bucharest, Romania
- Iana Neaga
  - Department of Public Health and Management, Faculty of Medicine, 'Carol Davila' University of Medicine and Pharmacy, 050463 Bucharest, Romania
- Fekete Gyula Laszlo
  - Department of Dermatology, University of Medicine and Pharmacy of Târgu Mureş, 540142 Târgu Mureş, Romania
- Daniel Boda
  - Dermatology Research Laboratory, 'Carol Davila' University of Medicine and Pharmacy, 050474 Bucharest, Romania
|
8
|
Sheikhpour R, Sarram MA, Rezaeian M, Sheikhpour E. QSAR modelling using combined simple competitive learning networks and RBF neural networks. SAR and QSAR in Environmental Research 2018; 29:257-276. [PMID: 29372662 DOI: 10.1080/1062936x.2018.1424030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/14/2017] [Accepted: 01/02/2018] [Indexed: 06/07/2023]
Abstract
The aim of this study was to propose a QSAR modelling approach based on the combination of simple competitive learning (SCL) networks with radial basis function (RBF) neural networks for predicting the biological activity of chemical compounds. The proposed QSAR method consisted of two phases. In the first phase, an SCL network was applied to determine the centres of an RBF neural network. In the second phase, the RBF neural network was used to predict the biological activity of various phenols and Rho kinase (ROCK) inhibitors. The predictive ability of the proposed QSAR models was evaluated and compared with other QSAR models using external validation. The results of this study showed that the proposed QSAR modelling approach leads to better performances than other models in predicting the biological activity of chemical compounds. This indicated the efficiency of simple competitive learning networks in determining the centres of RBF neural networks.
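The two-phase scheme described in the abstract can be sketched in pure Python: simple competitive learning places the RBF centres, then the output weights are fitted on Gaussian activations. This is a sketch under stated assumptions, not the authors' implementation. The deterministic centre initialisation, the shared width `sigma`, the LMS (delta-rule) weight fit without a bias term, the toy data, and all function names are hypothetical choices made for brevity; `n_centres` is assumed not to exceed the number of samples.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def scl_centres(samples, n_centres, lr=0.1, epochs=50):
    """Phase 1, simple competitive learning: only the nearest centre moves toward each sample."""
    step = len(samples) // n_centres
    centres = [list(samples[i * step]) for i in range(n_centres)]  # evenly spaced initial picks
    for _ in range(epochs):
        for x in samples:
            winner = min(centres, key=lambda c: sq_dist(x, c))
            for d in range(len(x)):
                winner[d] += lr * (x[d] - winner[d])
    return centres


def rbf_features(x, centres, sigma):
    """Gaussian activation of every hidden unit for input x."""
    return [math.exp(-sq_dist(x, c) / (2.0 * sigma ** 2)) for c in centres]


def fit_output_weights(samples, targets, centres, sigma, lr=0.05, epochs=500):
    """Phase 2: fit linear output weights with the LMS rule (no bias term in this sketch)."""
    weights = [0.0] * len(centres)
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            phi = rbf_features(x, centres, sigma)
            err = y - sum(w * p for w, p in zip(weights, phi))
            for j, p in enumerate(phi):
                weights[j] += lr * err * p
    return weights


def rbf_predict(x, centres, sigma, weights):
    """Network output: weighted sum of the Gaussian activations."""
    return sum(w * p for w, p in zip(weights, rbf_features(x, centres, sigma)))


# Toy data (hypothetical): two well-separated clusters with activities 0 and 1.
X_toy = [[0.0], [0.2], [-0.2], [5.0], [5.2], [4.8]]
y_toy = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
centres = scl_centres(X_toy, n_centres=2)
weights = fit_output_weights(X_toy, y_toy, centres, sigma=1.0)
```

On this toy set each centre settles inside one cluster, and `rbf_predict` returns values close to that cluster's activity for nearby inputs.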
Affiliation(s)
- R Sheikhpour
  - Department of Computer Engineering, Yazd University, Yazd, Iran
- M A Sarram
  - Department of Computer Engineering, Yazd University, Yazd, Iran
- M Rezaeian
  - Department of Computer Engineering, Yazd University, Yazd, Iran
- E Sheikhpour
  - Hematology and Oncology Research Center, Shahid Sadoughi University of Medical Sciences, Yazd, Iran
|