Reference Citation Analysis

For: Polishchuk PG, Muratov EN, Artemenko AG, Kolumbin OG, Muratov NN, Kuz’min VE. Application of Random Forest Approach to QSAR Prediction of Aquatic Toxicity. J Chem Inf Model 2009;49:2481-8. [DOI: 10.1021/ci900203n] [Citation(s) in RCA: 122] [Impact Index Per Article: 8.1] [Indexed: 11/29/2022]

Cited by Other Article(s)

Kim D, Jeong J, Choi J. Identification of Optimal Machine Learning Algorithms and Molecular Fingerprints for Explainable Toxicity Prediction Models Using ToxCast/Tox21 Bioassay Data. ACS Omega 2024;9:37934-37941. [PMID: 39281924 PMCID: PMC11391437 DOI: 10.1021/acsomega.4c04474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/18/2024]

Abstract

Recent studies have primarily focused on introducing novel frameworks to enhance the predictive power of toxicity prediction models by refining molecular representation methods and algorithms. However, these methods are inherently complex and are often difficult to understand and explain, leading to barriers in their regulatory adoption and validation. Therefore, it is necessary to select the optimal model, considering not only model performance but also interpretability. This study aimed to identify the optimal combination of molecular fingerprints (pattern-based versus algorithm-based) and machine learning algorithms (simple versus complex) for developing explainable toxicity prediction models through a comprehensive investigation of the ToxCast/Tox21 bioassay data set. For 1092 ToxCast/Tox21 assays, five molecular fingerprints (MACCS, Morgan, RDKit, Layered, and Patterned) and six algorithms (MLP, GBT, Random Forest, kNN, Logistic Regression, and Naïve Bayes) were used to train the models. Results showed that 35 models achieved acceptable performance (F1 score or accuracy of 0.8 or higher). Among the combinations, either MACCS or Morgan, paired with Random Forest, demonstrated robust performance compared with other molecular fingerprints and algorithms. MACCS and Random Forest are valuable, even when prioritizing interpretability. Consequently, the MACCS-Random Forest combination models based on four assays, targeting G protein-coupled receptor and kinase, were identified, and they can be used to discern specific structural features or patterns in chemical compounds, offering explainable insights into toxicity-related chemical structures. This study indicates the importance of not disregarding simple models when assessing both predictivity and interpretability within the context of chemical feature-based Tox21 data analysis.

Zhao X, Kong Y, Ji Y, Xin X, Chen L, Chen G, Yu C. Classification models for predicting the bioactivity of pan-TRK inhibitors and SAR analysis. Mol Divers 2024;28:2077-2097. [PMID: 37910346 DOI: 10.1007/s11030-023-10735-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 09/22/2023] [Indexed: 11/03/2023]

Daghighi A, Casanola-Martin GM, Iduoku K, Kusic H, González-Díaz H, Rasulev B. Multi-Endpoint Acute Toxicity Assessment of Organic Compounds Using Large-Scale Machine Learning Modeling. Environmental Science & Technology 2024;58:10116-10127. [PMID: 38797941 DOI: 10.1021/acs.est.4c01017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]

Melo-Filho CC, Su G, Liu K, Muratov EN, Tropsha A, Liu J. Modeling interactions between Heparan sulfate and proteins based on the Heparan sulfate microarray analysis. Glycobiology 2024;34:cwae039. [PMID: 38836441 PMCID: PMC11180703 DOI: 10.1093/glycob/cwae039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2024] [Revised: 04/30/2024] [Accepted: 05/29/2024] [Indexed: 06/06/2024]

Puri D, Lee D, Khankal DV, Thakur MS, Alfaisal FM, Alam S, Kumar R, Khan MA. Decision Tree-Based Modeling of the Aeration Effectiveness of Circular Plunging Jets. ACS Omega 2023;8:38950-38960. [PMID: 37901507 PMCID: PMC10601425 DOI: 10.1021/acsomega.3c03375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 09/13/2023] [Indexed: 10/31/2023]

Dias-Silva JR, Oliveira VM, Sanches-Neto FO, Wilhelms RZ, Queiroz Júnior LHK. SpectraFP: a new spectra-based descriptor to aid in cheminformatics, molecular characterization and search algorithm applications. Phys Chem Chem Phys 2023. [PMID: 37378661 DOI: 10.1039/d3cp00734k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2023]

Abstract

We have developed an algorithm to generate a new spectra-based descriptor, called SpectraFP, in order to digitalize the chemical shifts of ¹³C NMR spectra, as well as potentially important data from other spectroscopic techniques. This descriptor is a fingerprint vector with defined sizes and values of 0 and 1, with the ability to correct chemical shift fluctuations. To explore the applicability of SpectraFP, we outlined two application scenarios: (1) the prediction of six functional groups by machine learning (ML) models and (2) the search for structures based on the similarity between the query spectrum and spectra in an experimental database, both in the SpectraFP format. For each functional group, five ML models were built and validated following the OECD principles: internal and external validations, applicability domains, and mechanistic interpretations. All the models resulted in high goodness-of-fit for the training and test sets with MCC respectively between 0.626 and 0.909 and 0.653 and 0.917, and J ranging from 0.812 to 0.957 and 0.825 to 0.961. Using the SHAP (SHapley Additive exPlanations) approach, the mechanistic interpretations of the models were explored; the results indicated that the most important variables for model decision making were coherent with the expected chemical shifts for each functional group. Several metrics, including Tanimoto, geometric, arithmetic, and Tversky, can be used to perform the similarity calculation for the search algorithm. This algorithm can also incorporate additional variables, such as the correction parameter and the difference between the amount of signals in the query spectrum and the database spectra, while preserving its high performance speed. We hope that our descriptor can link information from spectroscopic/spectrometric techniques with ML models to expand the possibilities in understanding the field of cheminformatics. 
All databases and algorithms developed for this work are open source and freely accessible.
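
The abstract above names Tanimoto and Tversky among the metrics usable for comparing a query fingerprint with database fingerprints. As a rough illustration only, the following sketch computes both on plain 0/1 vectors. It does not reproduce the SpectraFP algorithm itself (the binning of chemical shifts and the correction parameter are not described here), and the function names and example vectors are hypothetical.

```python
from typing import Sequence


def _counts(a: Sequence[int], b: Sequence[int]) -> tuple:
    """Return (bits set in both, bits set only in a, bits set only in b)."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    only_a = sum(1 for x, y in zip(a, b) if x and not y)
    only_b = sum(1 for x, y in zip(a, b) if y and not x)
    return both, only_a, only_b


def tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto (Jaccard) similarity: shared on-bits / on-bits in either vector."""
    both, only_a, only_b = _counts(a, b)
    union = both + only_a + only_b
    # Convention chosen for this sketch: two all-zero vectors count as identical.
    return both / union if union else 1.0


def tversky(a: Sequence[int], b: Sequence[int],
            alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky index; alpha and beta weight the bits unique to a and to b."""
    both, only_a, only_b = _counts(a, b)
    denom = both + alpha * only_a + beta * only_b
    return both / denom if denom else 1.0


# Hypothetical 6-bit fingerprints, for illustration only.
query = [1, 1, 0, 1, 0, 0]
reference = [1, 0, 0, 1, 1, 0]
print(tanimoto(query, reference))
print(tversky(query, reference, alpha=1.0, beta=1.0))
```

With alpha = beta = 1 the Tversky index reduces to Tanimoto, and with alpha = beta = 0.5 it reduces to the Dice coefficient, which is why a single weighted function can cover several of the listed metrics.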

Sharma B, Chenthamarakshan V, Dhurandhar A, Pereira S, Hendler JA, Dordick JS, Das P. Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations. Sci Rep 2023;13:4908. [PMID: 36966203 PMCID: PMC10039880 DOI: 10.1038/s41598-023-31169-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/28/2022] [Accepted: 03/07/2023] [Indexed: 03/27/2023]

Abstract

Explainable machine learning for molecular toxicity prediction is a promising approach for efficient drug development and chemical safety. A predictive ML model of toxicity can reduce experimental cost and time while mitigating ethical concerns by significantly reducing animal and clinical testing. Herein, we use a deep learning framework for simultaneously modeling in vitro, in vivo, and clinical toxicity data. Two different molecular input representations are used: Morgan fingerprints and pre-trained SMILES embeddings. A multi-task deep learning model accurately predicts toxicity for all endpoints, including clinical, as indicated by the area under the Receiver Operator Characteristic curve and balanced accuracy. In particular, pre-trained molecular SMILES embeddings as input to the multi-task model improved clinical toxicity predictions compared to existing models in the MoleculeNet benchmark. Additionally, our multitask approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo and clinical platforms. Through both the multi-task model and transfer learning, we were able to indicate the minimal need of in vivo data for clinical toxicity predictions. To provide confidence and explain the model's predictions, we adapt a post-hoc contrastive explanation method that returns pertinent positive and negative features, which correspond well to known mutagenic and reactive toxicophores, such as unsubstituted bonded heteroatoms, aromatic amines, and Michael receptors. Furthermore, toxicophore recovery by pertinent feature analysis captures more of the in vitro (53%) and in vivo (56%), rather than of the clinical (8%), endpoints, and indeed uncovers a preference in known toxicophore data towards in vitro and in vivo experimental data. To our knowledge, this is the first contrastive explanation, using both present and absent substructures, for predictions of clinical and in vivo molecular toxicity.

Hernandez-Betancur JD, Ruiz-Mercado GJ, Martin M. Predicting Chemical End-of-Life Scenarios Using Structure-Based Classification Models. ACS Sustainable Chemistry & Engineering 2023;11:3594-3602. [PMID: 36911873 PMCID: PMC9993395 DOI: 10.1021/acssuschemeng.2c05662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 02/10/2023] [Indexed: 06/18/2023]

Xu Z, Chughtai H, Tian L, Liu L, Roy JF, Bayen S. Development of quantitative structure-retention relationship models to improve the identification of leachables in food packaging using non-targeted analysis. Talanta 2023;253:123861. [PMID: 36095943 DOI: 10.1016/j.talanta.2022.123861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Revised: 08/15/2022] [Accepted: 08/17/2022] [Indexed: 12/13/2022]

Nascimben M, Rimondini L. Molecular Toxicity Virtual Screening Applying a Quantized Computational SNN-Based Framework. Molecules 2023;28:1342. [PMID: 36771009 PMCID: PMC9919191 DOI: 10.3390/molecules28031342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 01/27/2023] [Accepted: 01/29/2023] [Indexed: 02/04/2023]

Belfield SJ, Cronin MTD, Enoch SJ, Firman JW. Guidance for good practice in the application of machine learning in development of toxicological quantitative structure-activity relationships (QSARs). PLoS One 2023;18:e0282924. [PMID: 37163504 PMCID: PMC10171609 DOI: 10.1371/journal.pone.0282924] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/11/2022] [Accepted: 02/26/2023] [Indexed: 05/12/2023]

Abstract

Recent years have seen a substantial growth in the adoption of machine learning approaches for the purposes of quantitative structure-activity relationship (QSAR) development. Such a trend has coincided with a desire to see a shift in the focus of methodology employed within chemical safety assessment: away from traditional reliance upon animal-intensive in vivo protocols, and towards increased application of in silico (or computational) predictive toxicology. With QSAR central amongst techniques applied in this area, the emergence of algorithms trained through machine learning with the objective of toxicity estimation has, quite naturally, arisen. On account of the pattern-recognition capabilities of the underlying methods, the statistical power of the ensuing models is potentially considerable, appropriate for the handling even of vast, heterogeneous datasets. However, such potency comes at a price, manifesting as the general practical deficits observed with respect to the reproducibility, interpretability and generalisability of the resulting tools. Unsurprisingly, these elements have served to hinder broader uptake (most notably within a regulatory setting). Areas of uncertainty liable to accompany (and hence detract from applicability of) toxicological QSAR have previously been highlighted, accompanied by the forwarding of suggestions for "best practice" aimed at mitigation of their influence. However, the scope of such exercises has remained limited to "classical" QSAR: that conducted through use of linear regression and related techniques, with the adoption of comparatively few features or descriptors. Accordingly, the intention of this study has been to extend the remit of best practice guidance, so as to address concerns specific to employment of machine learning within the field. In doing so, the impact of strategies aimed at enhancing the transparency (feature importance, feature reduction), generalisability (cross-validation) and predictive power (hyperparameter optimisation) of algorithms, trained upon real toxicity data through six common learning approaches, is evaluated.

Abrahamsson D, Siddharth A, Robinson JF, Soshilov A, Elmore S, Cogliano V, Ng C, Khan E, Ashton R, Chiu WA, Fung J, Zeise L, Woodruff TJ. Modeling the transplacental transfer of small molecules using machine learning: a case study on per- and polyfluorinated substances (PFAS). Journal of Exposure Science & Environmental Epidemiology 2022;32:808-819. [PMID: 36207486 PMCID: PMC9742309 DOI: 10.1038/s41370-022-00481-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/18/2021] [Revised: 09/14/2022] [Accepted: 09/15/2022] [Indexed: 05/10/2023]

Abstract

BACKGROUND

Despite their large numbers and widespread use, very little is known about the extent to which per- and polyfluoroalkyl substances (PFAS) can cross the placenta and expose the developing fetus.

OBJECTIVE

The aim of our study is to develop a computational approach that can be used to evaluate the extent to which small molecules, and in particular PFAS, can cross the placenta and partition to cord blood.

METHODS

We collected experimental values of the concentration ratio between cord and maternal blood (R_CM) for 260 chemical compounds and calculated their physicochemical descriptors using the cheminformatics package Mordred. We used the compiled database to train and test an artificial neural network (ANN), and then applied the best-performing model to predict R_CM for a large dataset of PFAS chemicals (n = 7982). Finally, we examined the calculated physicochemical descriptors of the chemicals to identify which properties correlated significantly with R_CM.

RESULTS

We determined that 7855 compounds were within the applicability domain and 127 compounds were outside the applicability domain of our model. Our predictions of R_CM for PFAS suggested that 3623 compounds had a log R_CM > 0, indicating preferential partitioning to cord blood. Some examples of these compounds were bisphenol AF, 2,2-bis(4-aminophenyl)hexafluoropropane, and nonafluoro-tert-butyl 3-methylbutyrate.

SIGNIFICANCE

These observations have important public health implications as many PFAS have been shown to interfere with fetal development. In addition, as these compounds are highly persistent and many of them can readily cross the placenta, they are expected to remain in the population for a long time as they are being passed from parent to offspring.

IMPACT

Understanding the behavior of chemicals in the human body during pregnancy is critical in preventing harmful exposures during critical periods of development. Many chemicals can cross the placenta and expose the fetus; however, the mechanism by which this transport occurs is not well understood. In our study, we developed a machine learning model that describes the transplacental transfer of chemicals as a function of their physicochemical properties. The model was then used to make predictions for a set of 7982 per- and polyfluorinated alkyl substances that are listed on EPA's CompTox Chemicals Dashboard. The model can be applied to make predictions for other chemical categories of interest, such as plasticizers and pesticides. Accurate predictions of R_CM can help scientists and regulators to prioritize chemicals that have the potential to cause harm by exposing the fetus.

Affiliation(s)

Dimitri Abrahamsson: Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California, San Francisco, 490 Illinois Street, San Francisco, CA, 94143, USA.
Adi Siddharth: Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California, San Francisco, 490 Illinois Street, San Francisco, CA, 94143, USA.
Joshua F Robinson: Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California, San Francisco, 490 Illinois Street, San Francisco, CA, 94143, USA.
Anatoly Soshilov: California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1001 I St, Sacramento, CA, 95814, USA; California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1515 Clay St, Oakland, CA, 94612, USA.
Sarah Elmore: California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1001 I St, Sacramento, CA, 95814, USA; California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1515 Clay St, Oakland, CA, 94612, USA.
Vincent Cogliano: California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1001 I St, Sacramento, CA, 95814, USA; California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1515 Clay St, Oakland, CA, 94612, USA.
Carla Ng: Department of Civil and Environmental Engineering, University of Pittsburgh, 3700 O'Hara St, Pittsburgh, PA, 15261, USA.
Elaine Khan: California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1001 I St, Sacramento, CA, 95814, USA; California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1515 Clay St, Oakland, CA, 94612, USA.
Randolph Ashton: Wisconsin Institute for Discovery, University of Wisconsin, Madison, 330 N Orchard St, Madison, WI, 53715, USA; The Stem Cell and Regenerative Medicine Center, University of Wisconsin, Madison, 1111 Highland Avenue, Madison, WI, 53705, USA; Department of Biomedical Engineering, University of Wisconsin - Madison, 1550 Engineering Drive, Madison, WI, 53706, USA.
Weihsueh A Chiu: Department of Veterinary Physiology and Pharmacology, School of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX, 77843, USA.
Jennifer Fung: Department of Obstetrics, Gynecology, and Reproductive Science and the Center of Reproductive Science, University of California, San Francisco, San Francisco, CA, 94143-2240, USA.
Lauren Zeise: California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1001 I St, Sacramento, CA, 95814, USA; California Environmental Protection Agency, Office of Environmental Health Hazard Assessment, 1515 Clay St, Oakland, CA, 94612, USA.
Tracey J Woodruff: Department of Obstetrics, Gynecology and Reproductive Sciences, Program on Reproductive Health and the Environment, University of California, San Francisco, 490 Illinois Street, San Francisco, CA, 94143, USA.

Gao Z, Xia R, Zhang P. Prediction of anti-proliferation effect of [1,2,3]triazolo[4,5-d]pyrimidine derivatives by random forest and mix-kernel function SVM with PSO. Chem Pharm Bull (Tokyo) 2022;70:684-693. [PMID: 35922903 DOI: 10.1248/cpb.c22-00376] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]

Yoo JE, Rho M. Large-Scale Survey Data Analysis with Penalized Regression: A Monte Carlo Simulation on Missing Categorical Predictors. Multivariate Behavioral Research 2022;57:642-657. [PMID: 33703972 DOI: 10.1080/00273171.2021.1891856] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/12/2023]

Ji Y, Li R, Tian Y, Chen G, Yan A. Classification models and SAR analysis on thromboxane A₂ synthase inhibitors by machine learning methods. SAR and QSAR in Environmental Research 2022;33:429-462. [PMID: 35678125 DOI: 10.1080/1062936x.2022.2078880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Accepted: 05/11/2022] [Indexed: 06/15/2023]

Prediction of the Neurotoxic Potential of Chemicals Based on Modelling of Molecular Initiating Events Upstream of the Adverse Outcome Pathways of (Developmental) Neurotoxicity. Int J Mol Sci 2022;23:3053. [PMID: 35328472 PMCID: PMC8954925 DOI: 10.3390/ijms23063053] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/08/2022] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 12/23/2022]

Carrera GVSM, Inês J, Bernardes CES, Klimenko K, Shimizu K, Canongia Lopes JN. The Solubility of Gases in Ionic Liquids: A Chemoinformatic Predictive and Interpretable Approach. Chemphyschem 2021;22:2190-2200. [PMID: 34464013 DOI: 10.1002/cphc.202100632] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/30/2021] [Indexed: 11/07/2022]

Gajewicz-Skretna A, Furuhama A, Yamamoto H, Suzuki N. Generating accurate in silico predictions of acute aquatic toxicity for a range of organic chemicals: Towards similarity-based machine learning methods. Chemosphere 2021;280:130681. [PMID: 34162070 DOI: 10.1016/j.chemosphere.2021.130681] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/02/2021] [Revised: 04/21/2021] [Accepted: 04/22/2021] [Indexed: 06/13/2023]

Fernandes PO, Martins DM, de Souza Bozzi A, Martins JPA, de Moraes AH, Maltarollo VG. Molecular insights on ABL kinase activation using tree-based machine learning models and molecular docking. Mol Divers 2021;25:1301-1314. [PMID: 34191245 PMCID: PMC8241884 DOI: 10.1007/s11030-021-10261-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/31/2021] [Accepted: 06/18/2021] [Indexed: 12/14/2022]

Kuz’min V, Artemenko A, Ognichenko L, Hromov A, Kosinskaya A, Stelmakh S, Sessions ZL, Muratov EN. Simplex representation of molecular structure as universal QSAR/QSPR tool. Struct Chem 2021;32:1365-1392. [PMID: 34177203 PMCID: PMC8218296 DOI: 10.1007/s11224-021-01793-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/15/2021] [Accepted: 05/07/2021] [Indexed: 10/24/2022]

Jiang J, Wang R, Wei GW. GGL-Tox: Geometric Graph Learning for Toxicity Prediction. J Chem Inf Model 2021;61:1691-1700. [PMID: 33719422 PMCID: PMC8155789 DOI: 10.1021/acs.jcim.0c01294] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 01/08/2023]

Schmidt F. Computational Toxicology. Systems Medicine 2021. [DOI: 10.1016/b978-0-12-801238-3.11534-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]

Idakwo G, Thangapandian S, Luttrell J, Li Y, Wang N, Zhou Z, Hong H, Yang B, Zhang C, Gong P. Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets. J Cheminform 2020;12:66. [PMID: 33372637 PMCID: PMC7592558 DOI: 10.1186/s13321-020-00468-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 12/13/2019] [Accepted: 10/13/2020] [Indexed: 12/14/2022]

Abstract

The specificity of toxicant-target biomolecule interactions lends to the very imbalanced nature of many toxicity datasets, causing poor performance in Structure–Activity Relationship (SAR)-based chemical classification. Undersampling and oversampling are representative techniques for handling such an imbalance challenge. However, removing inactive chemical compound instances from the majority class using an undersampling technique can result in information loss, whereas increasing active toxicant instances in the minority class by interpolation tends to introduce artificial minority instances that often cross into the majority class space, giving rise to class overlapping and a higher false prediction rate. In this study, in order to improve the prediction accuracy of imbalanced learning, we employed SMOTEENN, a combination of Synthetic Minority Over-sampling Technique (SMOTE) and Edited Nearest Neighbor (ENN) algorithms, to oversample the minority class by creating synthetic samples, followed by cleaning the mislabeled instances. We chose the highly imbalanced Tox21 dataset, which consisted of 12 in vitro bioassays for > 10,000 chemicals that were distributed unevenly between binary classes. With Random Forest (RF) as the base classifier and bagging as the ensemble strategy, we applied four hybrid learning methods, i.e., RF without imbalance handling (RF), RF with Random Undersampling (RUS), RF with SMOTE (SMO), and RF with SMOTEENN (SMN). The performance of the four learning methods was compared using nine evaluation metrics, among which F₁ score, Matthews correlation coefficient and Brier score provided a more consistent assessment of the overall performance across the 12 datasets. The Friedman’s aligned ranks test and the subsequent Bergmann-Hommel post hoc test showed that SMN significantly outperformed the other three methods. 
We also found that a strong negative correlation existed between the prediction accuracy and the imbalance ratio (IR), which is defined as the number of inactive compounds divided by the number of active compounds. SMN became less effective when IR exceeded a certain threshold (e.g., > 28). The ability to separate the few active compounds from the vast amounts of inactive ones is of great importance in computational toxicology. This work demonstrates that the performance of SAR-based, imbalanced chemical toxicity classification can be significantly improved through the use of data rebalancing.
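
The abstract above defines the imbalance ratio (IR) as the number of inactive compounds divided by the number of active compounds, and singles out the F₁ score and the Matthews correlation coefficient among its evaluation metrics. As a minimal sketch under stated assumptions, the standard textbook formulas for these quantities can be written from confusion-matrix counts as follows. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def imbalance_ratio(n_inactive: int, n_active: int) -> float:
    """IR as defined in the abstract: inactive count divided by active count."""
    return n_inactive / n_active


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from the four confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical assay: 100 actives and 2900 inactives.
tp, fn = 60, 40      # actives predicted active / inactive
fp, tn = 30, 2870    # inactives predicted active / inactive

print("IR:", imbalance_ratio(tn + fp, tp + fn))
print("accuracy:", (tp + tn) / (tp + tn + fp + fn))
print("F1:", f1_score(tp, fp, fn))
print("MCC:", mcc(tp, tn, fp, fn))
```

In this made-up case plain accuracy is dominated by the large inactive class, while F₁ and MCC respond to the 40 missed actives, which is one reason metrics of this kind are often preferred for imbalanced sets.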

Wang Y, Chen X. A joint optimization QSAR model of fathead minnow acute toxicity based on a radial basis function neural network and its consensus modeling. RSC Adv 2020;10:21292-21308. [PMID: 35518745 PMCID: PMC9054390 DOI: 10.1039/d0ra02701d] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 03/24/2020] [Accepted: 05/24/2020] [Indexed: 01/07/2023]

Abstract

Acute toxicity of the fathead minnow (Pimephales promelas) is an important indicator to evaluate the hazards and risks of compounds in aquatic environments. The aim of our study is to explore the predictive power of the quantitative structure-activity relationship (QSAR) model based on a radial basis function (RBF) neural network with the joint optimization method to study the acute toxicity mechanism, and to develop a potential acute toxicity prediction model, for fathead minnow. To ensure the symmetry and fairness of the data splitting and to generate multiple chemically diverse training and validation sets, we used a self-organizing mapping (SOM) neural network to split the modeling dataset (containing 955 compounds) characterized by PaDEL-descriptors. After preliminary selection of descriptors via the mean decrease impurity method, a hybrid quantum particle swarm optimization (HQPSO) algorithm was used to jointly optimize the parameters of RBF and select the key descriptors. We established 20 RBF-based QSAR models, and the statistical results showed that the 10-fold cross-validation results (R²_cv10) and the adjusted coefficients of determination (R²_adj) were all greater than 0.7 and 0.8, respectively. The Q²_ext of these models was between 0.6480 and 0.7317, and the R²_ext was between 0.6563 and 0.7318. Combined with the frequency and importance of the descriptors used in RBF-based models, and the correlation between the descriptors and acute toxicity, we concluded that the water distribution coefficient, molar refractivity, and first ionization potential are important factors affecting the acute toxicity of fathead minnow. A consensus QSAR model with RBF-based models was established; this model showed good performance with R² = 0.9118, R²_cv10 = 0.7632, and Q²_ext = 0.7430. A frequency weighted and distance (FWD)-based application domain (AD) definition method was proposed, and the outliers were analyzed carefully. Compared with previous studies, the method proposed in this paper has obvious advantages, and its robustness and external predictive power are also better than those of the XGBoost-based model. It is an effective QSAR modeling method.
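
The abstract above reports coefficients of determination for the training data and an external Q² for held-out compounds, but it does not state the formulas used. As a hedged sketch, the following shows one commonly used pair of definitions: the ordinary R², and an external Q² that measures test-set error against the training-set mean. Whether the paper uses this exact Q² variant is an assumption; the function names and numbers are hypothetical.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def q2_external(y_ext: Sequence[float], y_ext_pred: Sequence[float],
                y_train_mean: float) -> float:
    """External Q^2 (assumed variant): test-set residuals are compared with
    deviations of the test-set values from the TRAINING-set mean."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_ext, y_ext_pred))
    ss_tot = sum((t - y_train_mean) ** 2 for t in y_ext)
    return 1.0 - ss_res / ss_tot


# Made-up toxicity values for four external compounds.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted))
print(q2_external(observed, predicted, y_train_mean=2.0))
```

The two numbers differ only through the reference mean in the denominator, so they coincide when the external set happens to have the same mean as the training set.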

Mozafari Z, Arab Chamjangali M, Beglari M, Doosti R. The efficiency of ligand-receptor interaction information alone as new descriptors in QSAR modeling via random forest artificial neural network. Chem Biol Drug Des 2020;96:812-824. [PMID: 32259386 DOI: 10.1111/cbdd.13690] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/27/2019] [Revised: 02/15/2020] [Accepted: 03/15/2020] [Indexed: 11/28/2022]

Chen CH, Tanaka K, Kotera M, Funatsu K. Comparison and improvement of the predictability and interpretability with ensemble learning models in QSPR applications. J Cheminform 2020;12:19. [PMID: 33430997 PMCID: PMC7106596 DOI: 10.1186/s13321-020-0417-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2018] [Accepted: 02/05/2020] [Indexed: 12/23/2022] Open

Jiao Z, Yuan S, Zhang Z, Wang Q. Machine learning prediction of hydrocarbon mixture lower flammability limits using quantitative structure‐property relationship models. PROCESS SAFETY PROGRESS 2019. [DOI: 10.1002/prs.12103] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Toxicity Prediction Method Based on Multi-Channel Convolutional Neural Network. Molecules 2019;24:molecules24183383. [PMID: 31533341 PMCID: PMC6766985 DOI: 10.3390/molecules24183383] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2019] [Revised: 09/03/2019] [Accepted: 09/13/2019] [Indexed: 02/08/2023] Open

Zhang Y, Zhao J, Wang Y, Fan Y, Zhu L, Yang Y, Chen X, Lu T, Chen Y, Liu H. Prediction of hERG K+ channel blockage using deep neural networks. Chem Biol Drug Des 2019;94:1973-1985. [PMID: 31394026 DOI: 10.1111/cbdd.13600] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/24/2019] [Revised: 07/23/2019] [Accepted: 07/30/2019] [Indexed: 01/08/2023]

Gadaleta D, Vuković K, Toma C, Lavado GJ, Karmaus AL, Mansouri K, Kleinstreuer NC, Benfenati E, Roncaglioni A. SAR and QSAR modeling of a large collection of LD₅₀ rat acute oral toxicity data. J Cheminform 2019;11:58. [PMID: 33430989 PMCID: PMC6717335 DOI: 10.1186/s13321-019-0383-2] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/05/2019] [Accepted: 08/13/2019] [Indexed: 11/10/2022] Open

Suthar M. Applying several machine learning approaches for prediction of unconfined compressive strength of stabilized pond ashes. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04411-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Majumdar S, Basak SC, Lungu CN, Diudea MV, Grunwald GD. Finding Needles in a Haystack: Determining Key Molecular Descriptors Associated with the Blood-brain Barrier Entry of Chemical Compounds Using Machine Learning. Mol Inform 2019;38:e1800164. [PMID: 31322827 DOI: 10.1002/minf.201800164] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2018] [Accepted: 04/11/2019] [Indexed: 12/23/2022]

Assessment of the cardiovascular adverse effects of drug-drug interactions through a combined analysis of spontaneous reports and predicted drug-target interactions. PLoS Comput Biol 2019;15:e1006851. [PMID: 31323029 PMCID: PMC6668846 DOI: 10.1371/journal.pcbi.1006851] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2019] [Revised: 07/31/2019] [Accepted: 06/29/2019] [Indexed: 12/11/2022] Open

Abstract

Adverse drug effects (ADEs) are one of the leading causes of death in developed countries and are the main reason for drug recalls from the market, whereas the ADEs that are associated with action on the cardiovascular system are the most dangerous and widespread. The treatment of human diseases often requires the intake of several drugs, which can lead to undesirable drug-drug interactions (DDIs), thus causing an increase in the frequency and severity of ADEs. An evaluation of DDI-induced ADEs is a nontrivial task and requires numerous experimental and clinical studies. Therefore, we developed a computational approach to assess the cardiovascular ADEs of DDIs. This approach is based on the combined analysis of spontaneous reports (SRs) and predicted drug-target interactions to estimate the five cardiovascular ADEs that are induced by DDIs, namely, myocardial infarction, ischemic stroke, ventricular tachycardia, cardiac failure, and arterial hypertension. We applied a method based on least absolute shrinkage and selection operator (LASSO) logistic regression to SRs for the identification of interacting pairs of drugs causing corresponding ADEs, as well as noninteracting pairs of drugs. As a result, five datasets containing, on average, 3100 potentially ADE-causing and non-ADE-causing drug pairs were created. The obtained data, along with information on the interaction of drugs with 1553 human targets predicted by PASS Targets software, were used to create five classification models using the Random Forest method. The average area under the ROC curve of the obtained models, sensitivity, specificity and balanced accuracy were 0.837, 0.764, 0.754 and 0.759, respectively. The predicted drug targets were also used to hypothesize the potential mechanisms of DDI-induced ventricular tachycardia for the top-scoring drug pairs. 
The five classification models created can be used to identify drug combinations that are potentially the most or least dangerous for the cardiovascular system.
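
The reported averages are mutually consistent: balanced accuracy is the mean of sensitivity and specificity, and (0.764 + 0.754) / 2 = 0.759. A minimal sketch of these metrics, computed from confusion-matrix counts, follows. The function names are illustrative and are not taken from the paper.

```python
def sensitivity(tp, fn):
    """True positive rate: fraction of actual positives that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: fraction of actual negatives that were rejected."""
    return tn / (tn + fp)


def balanced_accuracy(sens, spec):
    """Mean of sensitivity and specificity."""
    return (sens + spec) / 2


# Check against the averages quoted in the abstract.
print(round(balanced_accuracy(0.764, 0.754), 3))  # 0.759
```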

Cortés-Ciriano I, Bender A. KekuleScope: prediction of cancer cell line sensitivity and compound potency using convolutional neural networks trained on compound images. J Cheminform 2019;11:41. [PMID: 31218493 PMCID: PMC6582521 DOI: 10.1186/s13321-019-0364-5] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2019] [Accepted: 06/09/2019] [Indexed: 02/08/2023] Open

Abstract

The application of convolutional neural networks (ConvNets) to harness high-content screening images or 2D compound representations is gaining increasing attention in drug discovery. However, existing applications often require large data sets for training, or sophisticated pretraining schemes. Here, we show using 33 IC50 data sets from ChEMBL 23 that the in vitro activity of compounds on cancer cell lines and protein targets can be accurately predicted on a continuous scale from their Kekulé structure representations alone by extending existing architectures (AlexNet, DenseNet-201, ResNet152 and VGG-19), which were pretrained on unrelated image data sets. We show that the predictive power of the generated models, which just require standard 2D compound representations as input, is comparable to that of Random Forest (RF) models and fully-connected Deep Neural Networks trained on circular (Morgan) fingerprints. Notably, including additional fully-connected layers further increases the predictive power of the ConvNets by up to 10%. Analysis of the predictions generated by RF models and ConvNets shows that by simply averaging the output of the RF models and ConvNets we obtain significantly lower errors in prediction for multiple data sets than those obtained with either model alone, although the effect size is small, indicating that the features extracted by the convolutional layers of the ConvNets provide complementary predictive signal to Morgan fingerprints. Lastly, we show that multi-task ConvNets trained on compound images make it possible to model COX isoform selectivity on a continuous scale with errors in prediction comparable to the uncertainty of the data. Overall, in this work we present a set of ConvNet architectures for the prediction of compound activity from their Kekulé structure representations with state-of-the-art performance, that require no generation of compound descriptors or use of sophisticated image processing techniques.
The code needed to reproduce the results presented in this study and all the data sets are provided at https://github.com/isidroc/kekulescope.
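
The averaging step described in the abstract can be illustrated with a toy example. The numbers below are invented for illustration and are not from the study. They are chosen so that the two models err in opposite directions, which is the situation in which a simple average reduces the error.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def average_predictions(pred_a, pred_b):
    """Element-wise mean of two models' predictions."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


# Hypothetical pIC50 values: one model overpredicts, the other underpredicts.
observed = [5.0, 6.0, 7.0]
rf_pred = [5.5, 6.5, 7.5]
convnet_pred = [4.7, 5.7, 6.7]
ensemble_pred = average_predictions(rf_pred, convnet_pred)

print(rmse(observed, rf_pred), rmse(observed, convnet_pred),
      rmse(observed, ensemble_pred))
```

When the two models make correlated errors in the same direction, the average falls between them and offers no such gain, which is in line with the small effect size the authors report.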

Melo-Filho CC, Braga RC, Muratov EN, Franco CH, Moraes CB, Freitas-Junior LH, Andrade CH. Discovery of new potent hits against intracellular Trypanosoma cruzi by QSAR-based virtual screening. Eur J Med Chem 2018;163:649-659. [PMID: 30562700 DOI: 10.1016/j.ejmech.2018.11.062] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2018] [Revised: 11/21/2018] [Accepted: 11/23/2018] [Indexed: 12/17/2022]

Abstract

Chagas disease is a neglected tropical disease (NTD) caused by the protozoan parasite Trypanosoma cruzi and is primarily transmitted to humans by the feces of infected Triatominae insects during their blood meal. The disease affects 6-8 million people, mostly in Latin American countries, and kills more people in the region each year than any other parasite-borne disease, including malaria. Moreover, patient numbers are currently increasing in non-endemic, developed countries, such as Australia, Japan, Canada, and the United States. The treatment is limited to one drug, benznidazole, which is only effective in the acute phase of the disease and is very toxic. Thus, there is an urgent need to develop new, safer, and effective drugs against the chronic phase of Chagas disease. Using QSAR-based virtual screening followed by in vitro experimental evaluation, we report herein the identification of novel potent and selective hits against the T. cruzi intracellular stage. We developed and validated binary QSAR models for prediction of anti-trypanosomal activity and cytotoxicity against mammalian cells using the best practices for QSAR modeling. These models were then used for virtual screening of a commercial database, leading to the identification of 39 virtual hits. Further in vitro assays showed that seven compounds were potent against intracellular T. cruzi at submicromolar concentrations (EC₅₀ < 1 μM) and were very selective (SI > 30). Furthermore, six other compounds also met the hit criteria for Chagas disease, showing activity at low micromolar concentrations (EC₅₀ < 10 μM) against intracellular T. cruzi and selectivity (SI > 15). Moreover, we performed a multi-parameter analysis for the comparison of tested compounds regarding their balance between potency, selectivity, and predicted ADMET properties.
In future studies, the most promising compounds will be submitted to additional in vitro and in vivo assays in an acute model of Chagas disease and can be further optimized for the development of new promising drug candidates against this important yet neglected disease.
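
The two activity tiers quoted in the abstract can be expressed as a small classification rule. This is a sketch under one stated assumption: the abstract does not define the selectivity index, so SI is taken here as the ratio of mammalian-cell CC₅₀ to anti-parasite EC₅₀, which is the usual convention. The function and tier names are hypothetical.

```python
def selectivity_index(cc50_um, ec50_um):
    """Assumed definition: SI = CC50 (mammalian cells) / EC50 (parasite)."""
    return cc50_um / ec50_um


def classify_compound(ec50_um, cc50_um):
    """Apply the thresholds quoted in the abstract (concentrations in uM)."""
    si = selectivity_index(cc50_um, ec50_um)
    if ec50_um < 1 and si > 30:
        return "potent"    # submicromolar and very selective
    if ec50_um < 10 and si > 15:
        return "hit"       # low micromolar and selective
    return "inactive"      # fails both sets of criteria


print(classify_compound(ec50_um=0.5, cc50_um=20.0))  # potent
```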

Kensert A, Alvarsson J, Norinder U, Spjuth O. Evaluating parameters for ligand-based modeling with random forest on sparse data sets. J Cheminform 2018;10:49. [PMID: 30306349 PMCID: PMC6755600 DOI: 10.1186/s13321-018-0304-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2018] [Accepted: 10/03/2018] [Indexed: 11/10/2022] Open

Wu Y, Wang G. Machine Learning Based Toxicity Prediction: From Chemical Structural Description to Transcriptome Analysis. Int J Mol Sci 2018;19:E2358. [PMID: 30103448 PMCID: PMC6121588 DOI: 10.3390/ijms19082358] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2018] [Revised: 07/31/2018] [Accepted: 08/08/2018] [Indexed: 02/07/2023] Open

Majumdar S, Basak SC, Lungu CN, Diudea MV, Grunwald GD. Mathematical structural descriptors and mutagenicity assessment: a study with congeneric and diverse datasets. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2018;29:579-590. [PMID: 30025481 DOI: 10.1080/1062936x.2018.1496475] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/04/2018] [Accepted: 07/01/2018] [Indexed: 06/08/2023]

Gadaleta D, Manganelli S, Roncaglioni A, Toma C, Benfenati E, Mombelli E. QSAR Modeling of ToxCast Assays Relevant to the Molecular Initiating Events of AOPs Leading to Hepatic Steatosis. J Chem Inf Model 2018;58:1501-1517. [PMID: 29949360 DOI: 10.1021/acs.jcim.8b00297] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]

Mayr A, Klambauer G, Unterthiner T, Steijaert M, Wegner JK, Ceulemans H, Clevert DA, Hochreiter S. Large-scale comparison of machine learning methods for drug target prediction on ChEMBL. Chem Sci 2018;9:5441-5451. [PMID: 30155234 PMCID: PMC6011237 DOI: 10.1039/c8sc00148k] [Citation(s) in RCA: 262] [Impact Index Per Article: 43.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2018] [Accepted: 05/16/2018] [Indexed: 12/24/2022] Open

Piras P, Sheridan R, Sherer EC, Schafer W, Welch CJ, Roussel C. Modeling and predicting chiral stationary phase enantioselectivity: An efficient random forest classifier using an optimally balanced training dataset and an aggregation strategy. J Sep Sci 2018;41:1365-1375. [PMID: 29383846 DOI: 10.1002/jssc.201701334] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2017] [Revised: 01/17/2018] [Accepted: 01/17/2018] [Indexed: 11/10/2022]

Simm J, Klambauer G, Arany A, Steijaert M, Wegner JK, Gustin E, Chupakhin V, Chong YT, Vialard J, Buijnsters P, Velter I, Vapirev A, Singh S, Carpenter AE, Wuyts R, Hochreiter S, Moreau Y, Ceulemans H. Repurposing High-Throughput Image Assays Enables Biological Activity Prediction for Drug Discovery. Cell Chem Biol 2018;25:611-618.e3. [PMID: 29503208 DOI: 10.1016/j.chembiol.2018.01.015] [Citation(s) in RCA: 127] [Impact Index Per Article: 21.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2017] [Revised: 10/31/2017] [Accepted: 01/29/2018] [Indexed: 12/19/2022]

Kaneko H. Discussion on Regression Methods Based on Ensemble Learning and Applicability Domains of Linear Submodels. J Chem Inf Model 2018;58:480-489. [PMID: 29425038 DOI: 10.1021/acs.jcim.7b00649] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]

Baskin II. Machine Learning Methods in Computational Toxicology. Methods Mol Biol 2018;1800:119-139. [PMID: 29934890 DOI: 10.1007/978-1-4939-7899-1_5] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Polishchuk P. Interpretation of Quantitative Structure–Activity Relationship Models: Past, Present, and Future. J Chem Inf Model 2017;57:2618-2639. [DOI: 10.1021/acs.jcim.7b00274] [Citation(s) in RCA: 120] [Impact Index Per Article: 17.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/01/2023]

Klimenko K, Lyakhov S, Shibinskaya M, Karpenko A, Marcou G, Horvath D, Zenkova M, Goncharova E, Amirkhanov R, Krysko A, Andronati S, Levandovskiy I, Polishchuk P, Kuz'min V, Varnek A. Virtual screening, synthesis and biological evaluation of DNA intercalating antiviral agents. Bioorg Med Chem Lett 2017;27:3915-3919. [PMID: 28666733 DOI: 10.1016/j.bmcl.2017.06.035] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2017] [Revised: 06/09/2017] [Accepted: 06/11/2017] [Indexed: 01/01/2023]

Affiliation(s)

Kyrylo Klimenko Laboratoire de Chemoinformatique, (UMR 7140 CNRS/UniStra), Université de Strasbourg, 4, rue B. Pascal, Strasbourg 67000, France; A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Sergey Lyakhov A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Marina Shibinskaya A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Alexander Karpenko A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Gilles Marcou Laboratoire de Chemoinformatique, (UMR 7140 CNRS/UniStra), Université de Strasbourg, 4, rue B. Pascal, Strasbourg 67000, France
Dragos Horvath Laboratoire de Chemoinformatique, (UMR 7140 CNRS/UniStra), Université de Strasbourg, 4, rue B. Pascal, Strasbourg 67000, France
Marina Zenkova Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia
Elena Goncharova Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia
Rinat Amirkhanov Institute of Chemical Biology and Fundamental Medicine, Siberian Branch of Russian Academy of Sciences, 8 Lavrentiev Avenue, Novosibirsk 630090, Russia
Andrei Krysko A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Sergei Andronati A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Igor Levandovskiy Department of Organic Chemistry, Kiev Polytechnic Institute, Pr. Pobedy 37, 03056 Kiev, Ukraine
Pavel Polishchuk A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine; Institute of Molecular and Translational Medicine, Palacky University Olomouc, Hněvotínská 1333/5, Olomouc 779 00, Czech Republic
Victor Kuz'min A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
Alexandre Varnek Laboratoire de Chemoinformatique, (UMR 7140 CNRS/UniStra), Université de Strasbourg, 4, rue B. Pascal, Strasbourg 67000, France; Federal University of Kazan, Kremlevskaya str., 18, Kazan, Russia.

Zhao P, Liu B, Wang C. Hepatotoxicity evaluation of traditional Chinese medicines using a computational molecular model. Clin Toxicol (Phila) 2017;55:996-1000. [PMID: 28594241 DOI: 10.1080/15563650.2017.1333123] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]

Muratov E, Lewis M, Fourches D, Tropsha A, Cox WC. Computer-Assisted Decision Support for Student Admissions Based on Their Predicted Academic Performance. AMERICAN JOURNAL OF PHARMACEUTICAL EDUCATION 2017;81:46. [PMID: 28496266 PMCID: PMC5423062 DOI: 10.5688/ajpe81346] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/13/2015] [Accepted: 04/20/2016] [Indexed: 05/22/2023]

Dearden JC. The History and Development of Quantitative Structure-Activity Relationships (QSARs). Oncology 2017. [DOI: 10.4018/978-1-5225-0549-5.ch003] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Nattee C, Khamsemanan N, Lawtrakul L, Toochinda P, Hannongbua S. A novel prediction approach for antimalarial activities of Trimethoprim, Pyrimethamine, and Cycloguanil analogues using extremely randomized trees. J Mol Graph Model 2016;71:13-27. [PMID: 27835827 DOI: 10.1016/j.jmgm.2016.09.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2015] [Revised: 09/19/2016] [Accepted: 09/20/2016] [Indexed: 10/20/2022]

Abstract

Malaria is still one of the most serious diseases in tropical regions. This is due in part to the high resistance of Plasmodium, the parasite that causes the disease, to available drugs. New potent compounds with high clinical utility are urgently needed. In this work, we created a novel model using a regression tree to study structure-activity relationships and predict the inhibition constant, K_i, of three different antimalarial analogues (Trimethoprim, Pyrimethamine, and Cycloguanil) based on their molecular descriptors. To the best of our knowledge, this work is the first attempt to study the structure-activity relationships of all three analogues combined. The most relevant descriptors and appropriate parameters of the regression tree are harvested using extremely randomized trees. These descriptors are water accessible surface area, log of the aqueous solubility, total hydrophobic van der Waals surface area, and molecular refractivity. Out of all possible combinations of these selected parameters and descriptors, the tree with the strongest coefficient of determination is selected to be our prediction model. Predicted K_i values from the proposed model show a strong coefficient of determination, R²=0.996, to experimental K_i values. From the structure of the regression tree, compounds with high accessible surface area of all hydrophobic atoms (ASA_H) and low aqueous solubility of inhibitors (Log S) generally possess low K_i values. Our prediction model can also be utilized as a screening test for new antimalarial drug compounds, which may reduce the time and expense of new drug development. New compounds with high predicted K_i should be excluded from further drug development. It is also our inference that a threshold of ASA_H greater than 575.80 and Log S less than or equal to -4.36 is a sufficient condition for a new compound to possess a low K_i.
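
The threshold rule stated at the end of the abstract can be written as a simple screening check. This is a minimal sketch: the function name is hypothetical, and because the abstract describes the rule as a sufficient condition only, a False result says nothing about a compound's K_i.

```python
ASA_H_THRESHOLD = 575.80   # from the abstract; units not stated there
LOG_S_THRESHOLD = -4.36    # from the abstract


def meets_low_ki_rule(asa_h, log_s):
    """Return True when both thresholds from the abstract are satisfied.

    The rule is sufficient, not necessary: compounds that fail it may
    still have a low K_i and need the full model to be assessed.
    """
    return asa_h > ASA_H_THRESHOLD and log_s <= LOG_S_THRESHOLD


print(meets_low_ki_rule(asa_h=600.0, log_s=-5.0))  # True
```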
