1
|
Obradović D, Stavrianidi A, Fedorova E, Bogojević A, Shpigun O, Buryak A, Lazović S. A comparative study of the predictive performance of different descriptor calculation tools: Molecular-based elution order modeling and interpretation of retention mechanism for isomeric compounds from METLIN database. J Chromatogr A 2024; 1719:464731. [PMID: 38377661 DOI: 10.1016/j.chroma.2024.464731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/08/2024] [Accepted: 02/09/2024] [Indexed: 02/22/2024]
Abstract
In the pharmaceutical industry, the need for analytical standards is a bottleneck for comprehensive evaluation and quality control of intermediate and end products. These are complex mixtures containing structurally related molecules. In this regard, chromatographic peak annotation, especially for critical pairs of isomers and closest structural analogs, can be supported by using a Quantitative Structure Retention Relationship (QSRR) approach. In our study, we investigated the fundamental basis of the reversed-phase (RP) retention mechanism for 1141 isomeric compounds from the METLIN SMRT dataset. Nine different descriptor calculation tools combined with different feature selection methods (genetic algorithm (GA), stepwise, Boruta) and machine learning (ML) approaches (support vector machine (SVM), multiple linear regression (MLR), random forest (RF), XGBoost) were applied to provide a reliable molecular structure-based interpretation of RP retention behaviour of the isomeric compounds. Strict internal and external validation metrics were used to select models with the best predictive capabilities (r_test > 0.73, order of elution > 60%). For the developed models, mean absolute errors were in the range of 60 to 110 s. Stepwise and GA showed the most suitable performance as descriptor selection methods, while SVM and XGBoost modeling gave satisfactory predictive characteristics in most cases. Validation performed on the published experimental data for structurally related pharmaceutical compounds confirmed the best accuracy of MLR modeling in combination with GA feature selection of general physico-chemical properties. The resulting models will be useful for the prediction of separation and identification of structurally related compounds in pharmaceutical analysis, providing a simultaneous understanding of the interaction mechanisms leading to their retention under RP conditions.
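The abstract reports two validation figures, an elution-order agreement and a mean absolute error in seconds, without defining how the former is computed. The sketch below shows one plausible reading, assuming that "order of elution" means the fraction of compound pairs whose predicted order matches the observed order; that pairwise definition, the function names, and the retention times are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def elution_order_accuracy(observed, predicted):
    """Fraction of compound pairs whose predicted elution order matches the observed one.

    Pairs with identical observed retention times carry no order information
    and are skipped. This pairwise definition is an assumption.
    """
    pairs = [
        (i, j)
        for i, j in combinations(range(len(observed)), 2)
        if observed[i] != observed[j]
    ]
    if not pairs:
        raise ValueError("need at least two compounds with distinct observed times")
    # A pair is ordered correctly when both differences have the same sign.
    correct = sum(
        (observed[i] - observed[j]) * (predicted[i] - predicted[j]) > 0
        for i, j in pairs
    )
    return correct / len(pairs)


def mean_absolute_error(observed, predicted):
    """Average absolute deviation, in the same unit as the inputs (here seconds)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical retention times in seconds for four isomers.
observed_rt = [310.0, 355.0, 402.0, 468.0]
predicted_rt = [330.0, 340.0, 450.0, 430.0]

order_accuracy = elution_order_accuracy(observed_rt, predicted_rt)
mae_seconds = mean_absolute_error(observed_rt, predicted_rt)
print(f"order agreement: {order_accuracy:.2%}, MAE: {mae_seconds:.2f} s")
```

A pairwise measure of this kind can be high even when the absolute error is large, which may be why the two figures are reported side by side.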
Affiliation(s)
- Darija Obradović
- Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
| | - Andrey Stavrianidi
- Chemistry Department, Lomonosov Moscow State University, 1/3 Leninskie Gory, GSP-1, Moscow 119991, Russia; A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia.
| | - Elizaveta Fedorova
- A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
| | - Aleksandar Bogojević
- Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
| | - Oleg Shpigun
- Chemistry Department, Lomonosov Moscow State University, 1/3 Leninskie Gory, GSP-1, Moscow 119991, Russia
| | - Aleksey Buryak
- A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, 31 Leninsky Prospect, GSP-1, Moscow 119071, Russia
| | - Saša Lazović
- Institute of Physics Belgrade, National Institute of the Republic of Serbia, Pregrevica 118, Belgrade 11080, Serbia
| |
|
2
|
Ibrahim AE, El Gohary NA, Aboushady D, Samir L, Karim SEA, Herz M, Salman BI, Al-Harrasi A, Hanafi R, El Deeb S. Recent advances in chiral selectors immobilization and chiral mobile phase additives in liquid chromatographic enantio-separations: A review. J Chromatogr A 2023; 1706:464214. [PMID: 37506464 DOI: 10.1016/j.chroma.2023.464214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 07/10/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023]
Abstract
For decades now, the separation of chiral enantiomers of drugs has been gaining the interest and attention of researchers. In 1991, the first guidelines for the development of chiral drugs were released by the US-FDA. Since then, the development of chromatographic enantioseparation tools has been fast and varied, aiming at creating a suitable environment in which the physically and chemically identical enantiomers can be separated. Among those tools are the immobilization of chiral selectors (CS) on different stationary phases and the use of chiral mobile phase additives (CMPA), both of which have been developed and studied extensively. This review article highlights the major advances in the immobilization of CS together with their different recognition mechanisms, as well as CMPA as a cheaper and successful alternative to chiral stationary phases. Moreover, the role of molecular modeling tools as a preliminary step in the choice of CS, for evaluating possible interactions with different ligands, is pointed out. Illustrations of reported methods and updates for immobilized CS and CMPA are included.
Affiliation(s)
- Adel Ehab Ibrahim
- Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Port-Said University, Port-Said 42511, Egypt; Natural and Medical Sciences Research Center, University of Nizwa, P.O. Box 33, Birkat Al Mauz, Nizwa 616, Sultanate of Oman
| | - Nesrine Abdelrehim El Gohary
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Dina Aboushady
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Liza Samir
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Shereen Ekram Abdel Karim
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Magy Herz
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Baher I Salman
- Pharmaceutical Analytical Chemistry Department, Faculty of Pharmacy, Al-Azhar University, Assiut Branch, Assiut, 71524, Egypt
| | - Ahmed Al-Harrasi
- Natural and Medical Sciences Research Center, University of Nizwa, P.O. Box 33, Birkat Al Mauz, Nizwa 616, Sultanate of Oman
| | - Rasha Hanafi
- Pharmaceutical Chemistry Department, Faculty of Pharmacy and Biotechnology, German University in Cairo, Cairo 11835, Egypt
| | - Sami El Deeb
- Institute of Medicinal and Pharmaceutical Chemistry, Technische Universität Braunschweig, Braunschweig 38092, Germany; Institute of Pharmacy, Freie Universität Berlin, Königin-Luise-Str. 2+4, 14195 Berlin, Germany.
| |
|
3
|
Emonts J, Buyel J. An overview of descriptors to capture protein properties - Tools and perspectives in the context of QSAR modeling. Comput Struct Biotechnol J 2023; 21:3234-3247. [PMID: 38213891 PMCID: PMC10781719 DOI: 10.1016/j.csbj.2023.05.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/13/2023] [Revised: 05/23/2023] [Accepted: 05/23/2023] [Indexed: 01/13/2024] Open
Abstract
Proteins are important ingredients in food and feed, they are the active components of many pharmaceutical products, and they are necessary, in the form of enzymes, for the success of many technical processes. However, production can be challenging, especially when using heterologous host cells such as bacteria to express and assemble recombinant mammalian proteins. The manufacturability of proteins can be hindered by low solubility, a tendency to aggregate, or inefficient purification. Tools such as in silico protein engineering and models that predict separation criteria can overcome these issues but usually require the complex shape and surface properties of proteins to be represented by a small number of quantitative numeric values known as descriptors, as similarly used to capture the features of small molecules. Here, we review the current status of protein descriptors, especially for application in quantitative structure activity relationship (QSAR) models. First, we describe the complexity of proteins and the properties that descriptors must accommodate. Then we introduce descriptors of shape and surface properties that quantify the global and local features of proteins. Finally, we highlight the current limitations of protein descriptors and propose strategies for the derivation of novel protein descriptors that are more informative.
Affiliation(s)
- J. Emonts
- Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Germany
| | - J.F. Buyel
- University of Natural Resources and Life Sciences, Vienna (BOKU), Department of Biotechnology (DBT), Institute of Bioprocess Science and Engineering (IBSE), Muthgasse 18, 1190 Vienna, Austria
- Institute for Molecular Biotechnology, Worringerweg 1, RWTH Aachen University, 52074 Aachen, Germany
| |
|
4
|
Cuesta SA, Moreno M, López RA, Mora JR, Paz JL, Márquez EA. ElectroPredictor: An Application to Predict Mayr's Electrophilicity E through Implementation of an Ensemble Model Based on Machine Learning Algorithms. J Chem Inf Model 2023; 63:507-521. [PMID: 36594600 DOI: 10.1021/acs.jcim.2c01367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023]
Abstract
Electrophilicity (E) is one of the most important parameters to understand the reactivity of an organic molecule. Although the theoretical electrophilicity index (ω) has been associated with E in a small homologous series, the use of ω to predict E in a structurally heterogeneous set of compounds is not a trivial task. In this study, a robust ensemble model is created using Mayr's database of reactivity parameters. A combination of topological and quantum mechanical descriptors and different machine learning algorithms is employed for the model's development. The predictability of the model is assessed using different statistical parameters, and its validation is examined, including a training/test partition, an applicability domain, and a y-scrambling test. The global ensemble model presents a Q²_5-fold of 0.909 and a Q²_ext of 0.912, demonstrating an excellent predictability performance of E values and showing that ω is not a good descriptor for the prediction of E, especially for the case of neutral compounds. ElectroPredictor, a noncommercial Python application (https://github.com/mmoreno1/ElectroPredictor), is developed to predict E. QM9, a well-known large dataset containing 133885 neutral molecules, is used to perform a virtual screening (94.0% coverage). Finally, the 10 most electrophilic molecules are analyzed as possible new Mayr's electrophiles, which have not yet been experimentally tested. This study confirms the necessity to build an ensemble model using nonlinear machine learning algorithms, topographic descriptors, and separating molecules into charged and neutral compounds to predict E with precision.
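The y-scrambling test named in the abstract checks that a model's fit is not a chance correlation: the response values are shuffled repeatedly, the model is refitted each time, and the scrambled scores should fall far below the score on the real data. A minimal sketch of the idea follows, assuming a single descriptor and an ordinary least-squares fit. The data, function names, and number of rounds are hypothetical; the paper's own ensemble model and descriptor set are not reproduced here.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, equal to R^2 of a one-descriptor least-squares fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def y_scrambling(x, y, n_rounds=200, seed=0):
    """Refit after shuffling the response; returns the R^2 of every scrambled round."""
    rng = random.Random(seed)  # fixed seed keeps the illustration reproducible
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)
        rng.shuffle(shuffled)
        scores.append(r_squared(x, shuffled))
    return scores


# Hypothetical descriptor values and a response that depends on them almost linearly.
descriptor = [float(i) for i in range(10)]
response = [2.0 * v + (0.3 if i % 2 == 0 else -0.3) for i, v in enumerate(descriptor)]

true_r2 = r_squared(descriptor, response)
scrambled = y_scrambling(descriptor, response)
mean_scrambled_r2 = sum(scrambled) / len(scrambled)
print(f"R^2 on real data: {true_r2:.3f}, mean R^2 after scrambling: {mean_scrambled_r2:.3f}")
```

A large gap between the two numbers suggests the relationship is real; if scrambled responses score nearly as well, the original fit is suspect.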
Affiliation(s)
- Sebastián A Cuesta
- Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
- Department of Chemistry, Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, U.K.
| | - Martín Moreno
- Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
| | - Romina A López
- Colegio San Ignacio de Loyola - Fe y Alegría, Ministerio de Educación, Quito 170901, Ecuador
| | - José R Mora
- Instituto de Simulación Computacional (ISC-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador
| | - José Luis Paz
- Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, Cercado de Lima, Lima 15081, Peru
| | - Edgar A Márquez
- Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
| |
|
5
|
Metwally AA, Nayel AA, Hathout RM. In silico prediction of siRNA ionizable-lipid nanoparticles In vivo efficacy: Machine learning modeling based on formulation and molecular descriptors. Front Mol Biosci 2022; 9:1042720. [PMID: 36619167 PMCID: PMC9811823 DOI: 10.3389/fmolb.2022.1042720] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/12/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open
Abstract
In silico prediction of the in vivo efficacy of siRNA ionizable-lipid nanoparticles is desirable as it can save time and resources dedicated to wet-lab experimentation. This study aims to computationally predict the in vivo efficacy of siRNA nanoparticles. A data set containing 120 entries was prepared by combining molecular descriptors of the ionizable lipids together with two nanoparticle formulation characteristics. Input descriptor combinations were selected by an evolutionary algorithm. Artificial neural networks, support vector machines and partial least squares regression were used for QSAR modeling. Depending on how the data set is split, two training sets and two external validation sets were prepared. Training and validation sets contained 90 and 30 entries, respectively. The results showed the successful prediction of validation set log (siRNA dose) with R²_val = 0.86-0.89 and 0.75-0.80 for validation sets one and two, respectively. Artificial neural networks resulted in the best R²_val for both validation sets. For predictions that have high bias, improvement of R²_val from 0.47 to 0.96 was achieved by selecting the training set lipids lying within the applicability domain. In conclusion, in vivo performance of siRNA nanoparticles was successfully predicted by combining cheminformatics with machine learning techniques.
Affiliation(s)
- Abdelkader A. Metwally
- Department of Pharmaceutics, Faculty of Pharmacy, Health Sciences Center, Kuwait University, Kuwait City, Kuwait; Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
| | - Amira A. Nayel
- Clinical Pharmacy Department, Alexandria Ophthalmology Hospital, Alexandria, Egypt; Department of Clinical Pharmacy and Pharmacy Practice, Faculty of Pharmacy, Alexandria University, Alexandria, Egypt
| | - Rania M. Hathout
- Department of Pharmaceutics and Industrial Pharmacy, Faculty of Pharmacy, Ain Shams University, Cairo, Egypt
| |
|
6
|
González-Castañeda Y, Marrero-Ponce Y, Guerra JO, Echevarría-Díaz Y, Pérez N, Pérez-Giménez F, Simonet AM, Macías FA, Nogueiras CM, Olazabal E, Serrano H. Computational discovery of novel anthelmintic natural compounds from Agave Brittoniana trel. Spp. Brachypus. BIONATURA 2022. [DOI: 10.21931/rb/2022.07.04.53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022] Open
Abstract
Helminth infections are a medical problem in the world nowadays. This report used bond-based 2D quadratic indices, a bond-level QuBiLs-MAS molecular descriptor family, and Linear Discriminant Analysis (LDA) to obtain a quantitative linear model that discriminates between anthelmintic and non-anthelmintic drug-like organic compounds. The model obtained correctly classified 87.46% and 81.82% of the training and external data sets, respectively. The developed model was used in a virtual screening to predict the biological activity of all chemicals (19) previously obtained and chemically characterized by some authors of this report from Agave brittoniana Trel. spp. Brachypus. The model identified several metabolites (12) as possible anthelmintics, and a group of 5 novel natural products was tested in an in vitro assay against Fasciola hepatica (100% effectivity at 500 µg/mL). Finally, the two best hits were evaluated in vivo in BALB/c mice against the same helminth parasite using a 25 mg/kg dose. Compound 8 (Karatavinoside A) showed an efficacy of 92.2% in vivo. It is important to remark that this natural compound exhibits similar-to-superior activity as triclabendazole, the best human fasciolicide available in the market against Fasciola hepatica, resulting in a novel lead scaffold with anti-helminthic activity.
Keywords: TOMOCOMD-CARDD Software; QuBiLs-MAS, nonstochastic and stochastic bond-based quadratic indices; LDA-based QSAR model; Computational Screening, Anthelmintic Agent; Agave brittoniana Trel. spp. Brachypus, Fasciola hepatica.
Affiliation(s)
- Yeniel González-Castañeda
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA)
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
| | - Jose O. Guerra
- Chemistry Department, Faculty of Chemistry-Pharmacy. Universidad Central “Marta Abreu” de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
| | - Yunaimy Echevarría-Díaz
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE)
| | - Noel Pérez
- Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
| | - Facundo Pérez-Giménez
- Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
| | - Ana M. Simonet
- Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
| | - Francisco A. Macías
- Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
| | - Clara M. Nogueiras
- Departamento de Química Orgánica, Facultad de Química, Universidad de La Habana
| | - Ervelio Olazabal
- Chemical Bioactive Center. Universidad Central “Marta Abreu” de Las Villas, Santa Clara
| | - Hector Serrano
- Chemical Bioactive Center. Universidad Central “Marta Abreu” de Las Villas, Santa Clara
| |
|
7
|
Searching glycolate oxidase inhibitors based on QSAR, molecular docking, and molecular dynamic simulation approaches. Sci Rep 2022; 12:19969. [PMID: 36402831 PMCID: PMC9675741 DOI: 10.1038/s41598-022-24196-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Accepted: 11/11/2022] [Indexed: 11/21/2022] Open
Abstract
Primary hyperoxaluria type 1 (PHT1) treatment is mainly focused on inhibiting the enzyme glycolate oxidase, which plays a pivotal role in the production of glyoxylate, which in turn undergoes oxidation to produce oxalate. When the renal secretion capacity is exceeded, calcium oxalate forms stones that accumulate in the kidneys. In this respect, detailed QSAR analysis, molecular docking, and dynamics simulations of a series of inhibitors containing glycolic, glyoxylic, and salicylic acid groups have been performed employing different regression machine learning techniques. Three robust models with fewer than 9 descriptors, assessed by tenfold cross-validation (Q²_CV) and external validation (Q²_EXT), were found: MLR1 (Q²_CV = 0.893, Q²_EXT = 0.897), RF1 (Q²_CV = 0.889, Q²_EXT = 0.907), and IBK1 (Q²_CV = 0.891, Q²_EXT = 0.907). An ensemble model was built by averaging the predicted pIC50 of the three models, obtaining a Q²_EXT of 0.933. Physicochemical properties such as charge, electronegativity, hardness, softness, van der Waals volume, and polarizability were considered as attributes to build the models. To gain more insight into the potential biological activity of the compounds studied herein, docking and dynamics analyses were carried out, finding that hydrophobic and polar residues show important interactions with the ligands. A screening of the DrugBank database V.5.1.7 was performed, leading to the proposal of seven commercial drugs within the applicability domain of the models that can be suggested as possible PHT1 treatments.
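The ensemble described in the abstract averages the pIC50 predicted by three models and scores the result with an external Q². A minimal sketch of that arithmetic follows, assuming the common definition Q²_ext = 1 - PRESS / SS, with SS taken around the training-set mean. The abstract does not state which Q² variant was used, and all pIC50 values below are invented for illustration.

```python
def q2_ext(y_true, y_pred, y_train_mean):
    """External Q^2 as 1 - PRESS / SS, with SS computed around the training-set mean.

    Several Q^2 variants exist; this particular one is an assumption.
    """
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss = sum((t - y_train_mean) ** 2 for t in y_true)
    return 1.0 - press / ss


def ensemble_average(*model_predictions):
    """Compound-wise mean of the predictions from any number of models."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


# Hypothetical external-set pIC50 values and predictions from three models.
y_external = [5.0, 6.0, 7.0, 8.0]
training_mean = 6.5
pred_mlr = [5.4, 6.3, 6.6, 7.7]
pred_rf = [4.7, 5.8, 7.3, 8.4]
pred_ibk = [5.2, 5.6, 7.4, 7.9]

pred_ensemble = ensemble_average(pred_mlr, pred_rf, pred_ibk)

for name, pred in [("MLR", pred_mlr), ("RF", pred_rf),
                   ("IBK", pred_ibk), ("ensemble", pred_ensemble)]:
    print(f"{name}: Q2_ext = {q2_ext(y_external, pred, training_mean):.3f}")
```

In this toy data the individual errors partly cancel, so the averaged prediction scores higher than any single model. That mirrors the pattern reported in the abstract but is not guaranteed in general.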
|
8
|
Diéguez-Santana K, Casañola-Martin GM, Torres R, Rasulev B, Green JR, González-Díaz H. Machine Learning Study of Metabolic Networks vs ChEMBL Data of Antibacterial Compounds. Mol Pharm 2022; 19:2151-2163. [PMID: 35671399 PMCID: PMC9986951 DOI: 10.1021/acs.molpharmaceut.2c00029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Antibacterial drugs (AD) change the metabolic status of bacteria, contributing to bacterial death. However, antibiotic resistance and the emergence of multidrug-resistant bacteria increase interest in understanding metabolic network (MN) mutations and the interaction of AD vs MN. In this study, we employed the IFPTML = Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML) algorithm on a huge dataset from the ChEMBL database, which contains >155,000 AD assays vs >40 MNs of multiple bacteria species. We built a linear discriminant analysis (LDA) and 17 ML models centered on the linear index and based on atoms to predict antibacterial compounds. The IFPTML-LDA model presented the following results for the training subset: specificity (Sp) = 76% out of 70,000 cases, sensitivity (Sn) = 70%, and Accuracy (Acc) = 73%. The same model also presented the following results for the validation subsets: Sp = 76%, Sn = 70%, and Acc = 73.1%. Among the IFPTML nonlinear models, the k nearest neighbors (KNN) showed the best results with Sn = 99.2%, Sp = 95.5%, Acc = 97.4%, and Area Under Receiver Operating Characteristic (AUROC) = 0.998 in training sets. In the validation series, the Random Forest had the best results: Sn = 93.96% and Sp = 87.02% (AUROC = 0.945). The IFPTML linear and nonlinear models regarding the ADs vs MNs have good statistical parameters, and they could contribute toward finding new metabolic mutations in antibiotic resistance and reducing time/costs in antibacterial drug research.
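The sensitivity, specificity, and accuracy figures quoted in the abstract all derive from the four cells of a confusion matrix. A minimal sketch of those formulas follows. The counts are invented and balanced purely so the arithmetic is easy to follow; they are not the study's actual case numbers.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of true positives recovered
    specificity = tn / (tn + fp)          # share of true negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 active and 100 inactive cases.
sn, sp, acc = classification_metrics(tp=70, fn=30, tn=76, fp=24)
print(f"Sn = {sn:.1%}, Sp = {sp:.1%}, Acc = {acc:.1%}")
```

With equal class sizes, accuracy is simply the mean of sensitivity and specificity. With unbalanced classes it is weighted toward the larger class, so all three numbers are worth reporting.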
Affiliation(s)
- Karel Diéguez-Santana
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain; Universidad Regional Amazónica IKIAM, Tena, Napo 150150, Ecuador
| | - Gerardo M Casañola-Martin
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States; Department of Systems and Computer Engineering, Carleton University, K1S5B6 Ottawa, Ontario, Canada
| | - Roldan Torres
- Universidad Regional Amazónica IKIAM, Tena, Napo 150150, Ecuador
| | - Bakhtiyor Rasulev
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
| | - James R Green
- Department of Systems and Computer Engineering, Carleton University, K1S5B6 Ottawa, Ontario, Canada
| | - Humbert González-Díaz
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain; BIOFISIKA, Basque Center for Biophysics CSIC-UPVEH, 48940 Leioa, Spain; IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
| |
|
9
|
In Silico Antiprotozoal Evaluation of 1,4-Naphthoquinone Derivatives against Chagas and Leishmaniasis Diseases Using QSAR, Molecular Docking, and ADME Approaches. Pharmaceuticals (Basel) 2022; 15:ph15060687. [PMID: 35745607 PMCID: PMC9228275 DOI: 10.3390/ph15060687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 05/24/2022] [Accepted: 05/27/2022] [Indexed: 12/04/2022] Open
Abstract
Chagas and leishmaniasis are two neglected diseases considered as public health problems worldwide, for which there is no effective, low-cost, and low-toxicity treatment for the host. Naphthoquinones are ligands with redox properties involved in oxidative biological processes with a wide variety of activities, including antiparasitic. In this work, in silico methods of quantitative structure–activity relationship (QSAR), molecular docking, and calculation of ADME (absorption, distribution, metabolism, and excretion) properties were used to evaluate naphthoquinone derivatives with unknown antiprotozoal activity. QSAR models were developed for predicting antiparasitic activity against Trypanosoma cruzi, Leishmania amazonensis, and Leishmania infantum, as well as a QSAR model for toxicity. Most of the evaluated ligands presented high antiparasitic activity. According to the docking results, the family of triazole derivatives presented the best affinity with the different macromolecular targets. The ADME results showed that most of the evaluated compounds present adequate conditions to be administered orally. Naphthoquinone derivatives show good biological activity results, depending on the substituents attached to the quinone ring, and perhaps the potential to be converted into drugs or starting molecules.
|
10
|
Halder AK, Moura AS, Cordeiro MNDS. Moving Average-Based Multitasking In Silico Classification Modeling: Where Do We Stand and What Is Next? Int J Mol Sci 2022; 23:ijms23094937. [PMID: 35563327 PMCID: PMC9099502 DOI: 10.3390/ijms23094937] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/28/2022] [Revised: 04/24/2022] [Accepted: 04/28/2022] [Indexed: 01/27/2023] Open
Abstract
Conventional in silico modeling is often viewed as 'one-target' or 'single-task' computer-aided modeling since it mainly relies on forecasting an endpoint of interest from similar input data. Multitasking or multitarget in silico modeling, in contrast, embraces a set of computational techniques that efficiently integrate multiple types of input data for setting up unique in silico models able to predict the outcome(s) relating to various experimental and/or theoretical conditions. The latter, specifically, based upon the Box-Jenkins moving average approach, has been applied in the last decade to several research fields including drug and materials design, environmental sciences, and nanotechnology. The present review discusses the current status of multitasking computer-aided modeling efforts, meanwhile describing both the existing challenges and future opportunities of its underlying techniques. Some important applications are also discussed to exemplify the ability of multitasking modeling in deriving holistic and reliable in silico classification-based models as well as in designing new chemical entities, either through fragment-based design or virtual screening. Focus will also be given to some software recently developed to automate and accelerate such types of modeling. Overall, this review may serve as a guideline for researchers to grasp the scope of multitasking computer-aided modeling as a promising in silico tool.
Affiliation(s)
- Amit Kumar Halder
- LAQV@REQUIMTE, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Dr. B. C. Roy College of Pharmacy and Allied Health Sciences, Dr. Meghnad Saha Sarani, Bidhannagar, Durgapur 713212, West Bengal, India
| | - Ana S. Moura
- LAQV@REQUIMTE, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - Maria Natália D. S. Cordeiro
- LAQV@REQUIMTE, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Correspondence: Tel.: +35-12-2040-2502
| |
|
11
|
De Gauquier P, Vanommeslaeghe K, Heyden YV, Mangelings D. Modelling approaches for chiral chromatography on polysaccharide-based and macrocyclic antibiotic chiral selectors: A review. Anal Chim Acta 2022; 1198:338861. [DOI: 10.1016/j.aca.2021.338861] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/22/2021] [Revised: 07/12/2021] [Accepted: 07/19/2021] [Indexed: 12/25/2022]
|
12
|
PTML Modeling for Pancreatic Cancer Research: In Silico Design of Simultaneous Multi-Protein and Multi-Cell Inhibitors. Biomedicines 2022; 10:biomedicines10020491. [PMID: 35203699 PMCID: PMC8962338 DOI: 10.3390/biomedicines10020491] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/29/2022] [Revised: 02/10/2022] [Accepted: 02/15/2022] [Indexed: 02/07/2023] Open
Abstract
Pancreatic cancer (PANC) is a dangerous type of cancer that is a major cause of mortality worldwide and exhibits a remarkably poor prognosis. To date, discovering anti-PANC agents remains a very complex and expensive process. Computational approaches can accelerate the search for anti-PANC agents. We report for the first time two models that combined perturbation theory with machine learning via a multilayer perceptron network (PTML-MLP) to perform the virtual design and prediction of molecules that can simultaneously inhibit multiple PANC cell lines and PANC-related proteins, such as caspase-1, tumor necrosis factor-alpha (TNF-alpha), and the insulin-like growth factor 1 receptor (IGF1R). Both PTML-MLP models exhibited accuracies higher than 78%. Using the interpretation from one of the PTML-MLP models as a guideline, we extracted different molecular fragments desirable for the inhibition of the PANC cell lines and the aforementioned PANC-related proteins and then assembled some of those fragments to form three new molecules. The two PTML-MLP models predicted the designed molecules as potentially versatile anti-PANC agents through inhibition of the three PANC-related proteins and multiple PANC cell lines. Conclusions: This work opens new horizons for the application of the PTML modeling methodology to anticancer research.
|
13
|
Liu J, Guo W, Sakkiah S, Ji Z, Yavas G, Zou W, Chen M, Tong W, Patterson TA, Hong H. Machine Learning Models for Predicting Liver Toxicity. Methods Mol Biol 2022; 2425:393-415. [PMID: 35188640 DOI: 10.1007/978-1-0716-1960-5_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]
Abstract
Liver toxicity is a major adverse drug reaction that accounts for drug failure in clinical trials and withdrawal from the market. Therefore, predicting potential liver toxicity at an early stage in drug discovery is crucial to reduce costs and the potential for drug failure. However, current in vivo animal toxicity testing is very expensive and time consuming. As an alternative approach, various machine learning models have been developed to predict potential liver toxicity in humans. This chapter reviews current advances in the development and application of machine learning models for prediction of potential liver toxicity in humans and discusses possible improvements to liver toxicity prediction.
Affiliation(s)
- Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Sugunadevi Sakkiah
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Zuowei Ji
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Gokhan Yavas
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Wen Zou
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Minjun Chen
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA.
|
14
|
Quevedo-Tumailli V, Ortega-Tenezaca B, González-Díaz H. IFPTML Mapping of Drug Graphs with Protein and Chromosome Structural Networks vs. Pre-Clinical Assay Information for Discovery of Antimalarial Compounds. Int J Mol Sci 2021; 22:ijms222313066. [PMID: 34884870 PMCID: PMC8657696 DOI: 10.3390/ijms222313066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/18/2021] [Revised: 11/23/2021] [Accepted: 11/24/2021] [Indexed: 11/16/2022]
Abstract
Parasite species of the genus Plasmodium cause malaria, which remains a major global health problem due to parasite resistance to available antimalarial drugs and increasing treatment costs. Consequently, computational prediction of new antimalarial compounds with novel targets in the proteome of Plasmodium sp. is a very important goal for the pharmaceutical industry. We can expect that the success of a pre-clinical assay depends on the conditions of the assay per se, the chemical structure of the drug, the structure of the target protein, as well as on factors governing the expression of this protein in the proteome, such as gene (deoxyribonucleic acid, DNA) sequence and/or chromosome structure. However, there are no reports of computational models that consider all these factors simultaneously. Some of the difficulties for this kind of analysis are the dispersion of data across different datasets and the high heterogeneity of the data. In this work, we analyzed three databases, ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information, Genome Data Viewer), to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential antimalarial compounds, including numeric descriptors (variables) for the structure of compounds as well as a large amount of information about the conditions of the assays. The NCBI-GDV and UniProt datasets include the sequences of genes and proteins and their functions. In addition, we created two partitions (cassayj = caj and cdataj = cdj) of categorical variables from the ChEMBL dataset. These partitions contain variables that encode information about the experimental conditions of preclinical assays (caj) or about the nature and quality of data (cdj). These categorical variables include information about 22 parameters of biological activity (ca0), 28 target proteins (ca1), 9 organisms of assay (ca2), etc. We also created another partition (cprotj = cpj) including categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes (cp0), 10 chromosomes (cp1), gene orientation (cp2), and 31 protein functions (cp3). We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to map all this information (from the three databases) and train a predictive model. Shannon’s entropy measure Shk (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes on the same information scale. Perturbation Theory Operators (PTOs) in the form of Moving Average (MA) operators were used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC model showed the best performance, with Sensitivity Sn(%) = 83.6/85.1 and Specificity Sp(%) = 89.8/89.7 for the training/validation sets, respectively. This model could become a useful tool for the optimization of preclinical assays of new antimalarial compounds vs. different proteins in the proteome of Plasmodium.
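The moving-average perturbation operators mentioned in this abstract can be illustrated with a minimal sketch: each structural variable is replaced by its deviation from the mean of all cases sharing the same categorical condition. The function name, the toy entropy values and the condition labels below are hypothetical and are not taken from the paper; the authors' actual operators may differ in detail.

```python
from collections import defaultdict
from statistics import mean


def moving_average_deviations(values, conditions):
    """Return value - mean(values sharing the same condition) for each case.

    values:     numeric structural variables (e.g. Shannon entropy values)
    conditions: categorical labels of the same length (e.g. target protein)
    """
    groups = defaultdict(list)
    for value, condition in zip(values, conditions):
        groups[condition].append(value)
    # Expected value of the variable within each subset of cases
    group_means = {c: mean(vs) for c, vs in groups.items()}
    return [v - group_means[c] for v, c in zip(values, conditions)]


# Hypothetical toy data: four compounds assayed against two targets
entropy = [2.0, 4.0, 3.0, 5.0]
target = ["protein_A", "protein_A", "protein_B", "protein_B"]
deviations = moving_average_deviations(entropy, target)
print(deviations)
```

The resulting deviations, rather than the raw variables, would then serve as inputs to a classifier, so that one model can cover many assay conditions at once.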
Affiliation(s)
- Viviana Quevedo-Tumailli
- Grupo RNASA-IMEDIR, Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain; (V.Q.-T.); (B.O.-T.)
- Research Department, Puyo Campus, Universidad Estatal Amazónica, Puyo 160150, Ecuador
| | - Bernabe Ortega-Tenezaca
- Grupo RNASA-IMEDIR, Department of Computer Science, University of A Coruña, 15071 A Coruña, Spain; (V.Q.-T.); (B.O.-T.)
- Information and Communications Technology Management Department, Puyo Campus, Universidad Estatal Amazónica, Puyo 160150, Ecuador
| | - Humberto González-Díaz
- Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- BIOFISIKA, Basque Centre for Biophysics, CSIC-UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
- Correspondence: Tel.: +34-94-601-3547
|
15
|
Saavedra LM, Duchowicz PR. Predicting zebrafish (Danio rerio) embryo developmental toxicity through a non-conformational QSAR approach. The Science of the Total Environment 2021; 796:148820. [PMID: 34328907 DOI: 10.1016/j.scitotenv.2021.148820] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/08/2021] [Revised: 06/11/2021] [Accepted: 06/29/2021] [Indexed: 06/13/2023]
Abstract
For many years, the frequent use of synthetic chemicals in the manufacture of veterinary drugs and pest control products has had negative effects on human health and on other non-target organisms, creating the need for a practical and suitable methodology for early risk identification of several thousand commercial compounds. The zebrafish (Danio rerio) embryo has emerged as a sustainable animal model for measuring developmental toxicity, an endpoint that is included in the regulatory procedures to approve chemicals, avoiding conventional and costly toxicity assays based on animal testing. In this context, the Quantitative Structure-Activity Relationships (QSAR) theory is applied to develop a predictive model based on a well-defined zebrafish embryo developmental toxicity database reported by the ToxCast™ Phase I chemical library of the Environmental Protection Agency (U.S. EPA). By means of four freely available software packages, a set of 28,038 non-conformational descriptors that encode the largest amount of permanent structural features is readily calculated. The Replacement Method (RM) variable subset selection technique provided the best regression models. Thereby, a linear QSAR model with proper statistical quality (R²train = 0.64, RMSEtrain = 0.49) is established in agreement with the Organization for Economic Co-operation and Development principles, meeting each internal (leave-one-out, leave-15%-out, VIF and Y-randomization) and external (R²test, R²m, Q²F1, Q²F2, Q²F3 and CCC) validation criterion. The present QSAR approach provides a useful computational tool to estimate the zebrafish developmental toxicity of new, untested or hypothetical compounds, and it can help address the general lack of QSAR models in the literature for predicting this endpoint.
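As a reading aid for the statistics quoted in this abstract, the sketch below computes a coefficient of determination and a root-mean-square error from observed and predicted values. It uses the common textbook definitions; the abstract does not state which exact variant the authors used, and the numbers are invented for illustration.

```python
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical toxicity values and model predictions
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

Computing both on the training set and on a held-out test set gives train and test figures analogous to those reported above.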
Affiliation(s)
- Laura M Saavedra
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900 La Plata, Argentina.
| | - Pablo R Duchowicz
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900 La Plata, Argentina.
|
16
|
Calle L, Marrero-Ponce Y, Mora JR. Molecular simulation of the (GPx)-like antioxidant activity of ebselen derivatives through machine learning techniques. Molecular Simulation 2021. [DOI: 10.1080/08927022.2021.1975039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Affiliation(s)
- Luis Calle
- Facultad de Ciencias Médicas, Instituto de Investigación e Innovación en Salud Integral (ISAIN), Universidad Católica Santiago de Guayaquil, Guayaquil, Ecuador
- Faculty of Pharmacy, University of Granada, Granada, Spain
| | - Yovani Marrero-Ponce
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Quito, Ecuador
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Universidad San Francisco de Quito, Quito, Ecuador
| | - José R. Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito, Quito, Ecuador
|
17
|
Computational Drug Repurposing for Antituberculosis Therapy: Discovery of Multi-Strain Inhibitors. Antibiotics (Basel) 2021; 10:antibiotics10081005. [PMID: 34439055 PMCID: PMC8388932 DOI: 10.3390/antibiotics10081005] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/01/2021] [Revised: 08/15/2021] [Accepted: 08/17/2021] [Indexed: 12/13/2022]
Abstract
Tuberculosis remains the most afflicting infectious disease known to humankind, with one quarter of the population estimated to have it in the latent state. Discovering antituberculosis drugs is a challenging, complex, expensive, and time-consuming task. To overcome the substantial costs and accelerate drug discovery and development, drug repurposing has emerged as an attractive alternative for finding new applications for “old” drugs, with computational approaches playing an essential role by filtering the chemical space. This work reports the first multi-condition model based on quantitative structure–activity relationships and an ensemble of neural networks (mtc-QSAR-EL) for the virtual screening of potential antituberculosis agents able to act as multi-strain inhibitors. The mtc-QSAR-EL model exhibited an accuracy higher than 85%. A physicochemical and fragment-based structural interpretation of this model was provided, and a large dataset of agency-regulated chemicals was virtually screened, with the mtc-QSAR-EL model identifying already proven antituberculosis drugs while proposing chemicals with great potential to be experimentally repurposed as antituberculosis (multi-strain inhibitor) agents. Some of the most promising molecules identified by the mtc-QSAR-EL model as antituberculosis agents were also confirmed by another computational approach, supporting the capabilities of the mtc-QSAR-EL model as an efficient tool for computational drug repurposing.
|
18
|
Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021; 19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022]
Abstract
Drug discovery aims at finding new compounds with specific chemical properties for the treatment of diseases. In recent years, this search has acquired an important computer science component, driven by the rapid spread of machine learning techniques as they have become widely accessible. Given the objectives set by the Precision Medicine initiative and the new challenges it has generated, it is necessary to establish robust, standard and reproducible computational methodologies to achieve them. Currently, predictive models based on Machine Learning have gained great importance in the step prior to preclinical studies. This stage drastically reduces costs and research times in the discovery of new drugs. This review article focuses on how these new methodologies have been used in recent years of research. Analyzing the state of the art in this field will give us an idea of where cheminformatics will develop in the short term, the limitations it presents and the positive results it has achieved. This review will focus mainly on the methods used to model the molecular data, as well as the biological problems addressed and the Machine Learning algorithms used for drug discovery in recent years.
Key Words
- ADMET, Absorption, distribution, metabolism, elimination and toxicity
- ADR, Adverse Drug Reaction
- AI, Artificial Intelligence
- ANN, Artificial Neural Networks
- APFP, Atom Pairs 2d FingerPrint
- AUC, Area under the Curve
- BBB, Blood–Brain barrier
- CDK, Chemical Development Kit
- CNN, Convolutional Neural Networks
- CNS, Central Nervous System
- CPI, Compound-protein interaction
- CV, Cross Validation
- Cheminformatics
- DL, Deep Learning
- DNA, Deoxyribonucleic acid
- Deep Learning
- Drug Discovery
- ECFP, Extended Connectivity Fingerprints
- FDA, Food and Drug Administration
- FNN, Fully Connected Neural Networks
- FP, Fingerprints
- FS, Feature Selection
- GCN, Graph Convolutional Networks
- GEO, Gene Expression Omnibus
- GNN, Graph Neural Networks
- GO, Gene Ontology
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MACCS, Molecular ACCess System
- MCC, Matthews correlation coefficient
- MD, Molecular Descriptors
- MKL, Multiple Kernel Learning
- ML, Machine Learning
- Machine Learning
- Molecular Descriptors
- NB, Naive Bayes
- OOB, Out of Bag
- PCA, Principal Component Analysis
- QSAR
- QSAR, Quantitative structure–activity relationship
- RF, Random Forest
- RNA, Ribonucleic Acid
- SMILES, simplified molecular-input line-entry system
- SVM, Support Vector Machines
- TCGA, The Cancer Genome Atlas
- WHO, World Health Organization
- t-SNE, t-Distributed Stochastic Neighbor Embedding
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
| | - Nereida Rodríguez-Fernández
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Francisco Cedrón
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Francisco J. Novoa
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Adrian Carballal
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
| | - Victor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
| | - Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
| | - Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
- CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
- Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
|
19
|
Tripathi MK, Nath A, Singh TP, Ethayathulla AS, Kaur P. Evolving scenario of big data and Artificial Intelligence (AI) in drug discovery. Mol Divers 2021; 25:1439-1460. [PMID: 34159484 PMCID: PMC8219515 DOI: 10.1007/s11030-021-10256-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/31/2021] [Accepted: 06/14/2021] [Indexed: 12/24/2022]
Abstract
The accumulation of massive data in the plethora of Cheminformatics databases has made the role of big data and artificial intelligence (AI) indispensable in drug design. This has necessitated the development of newer algorithms and architectures to mine these databases and fulfil the specific needs of various drug discovery processes such as virtual drug screening, de novo molecule design and discovery in this big data era. The development of deep learning neural networks and their variants with the corresponding increase in chemical data has resulted in a paradigm shift in information mining pertaining to the chemical space. The present review summarizes the role of big data and AI techniques currently being implemented to satisfy the ever-increasing research demands in drug discovery pipelines.
Affiliation(s)
- Manish Kumar Tripathi
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India
| | - Abhigyan Nath
- Department of Biochemistry, Pt. Jawahar Lal Nehru Memorial Medical College, Raipur, 492001, India
| | - Tej P Singh
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India
| | - A S Ethayathulla
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India
| | - Punit Kaur
- Department of Biophysics, All India Institute of Medical Sciences, New Delhi, 110029, India.
|
20
|
Duchowicz PR, Bennardi DO, Ortiz EV, Comelli NC. QSAR models for insecticidal properties of plant essential oils on the housefly (Musca domestica L.). SAR and QSAR in Environmental Research 2021; 32:395-410. [PMID: 33870800 DOI: 10.1080/1062936x.2021.1905711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2020] [Accepted: 03/16/2021] [Indexed: 06/12/2023]
Abstract
The fumigant and topical activities exhibited by 27 plant-derived essential oils (EOs) on the adult M. domestica housefly are predicted through the Quantitative Structure-Activity Relationship (QSAR) theory. These molecular structure-based calculations are performed on 253 structurally diverse compounds from the EOs, where the number of constituents in each essential oil mixture varies between 2 and 24. A large set of 86,048 non-conformational mixture descriptors is derived as linear combinations of the molecular descriptors of the EO components. Two strategies are compared for the mixture descriptor formulation, which consider or avoid the use of the chemical composition. The multivariable linear regression QSAR models of the present work are useful for fumigant and topical applications, describing predictive parallelisms for the insecticidal activity of the analysed complex mixtures.
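The idea of a mixture descriptor built as a linear combination of component descriptors can be sketched as follows. This is only an illustration: the function name and the toy numbers are hypothetical, and using equal weights for the composition-free strategy is an assumption, since the abstract does not specify how that variant is formulated.

```python
def mixture_descriptor(component_descriptors, fractions=None):
    """Combine per-component descriptor vectors into one mixture vector.

    component_descriptors: one descriptor vector (list of floats) per component
    fractions: relative amounts of each component; if None, composition is
               ignored and equal weights are used (an assumed convention)
    """
    n_components = len(component_descriptors)
    if fractions is None:
        weights = [1.0 / n_components] * n_components
    else:
        total = sum(fractions)
        weights = [f / total for f in fractions]  # normalise to sum to 1
    n_descriptors = len(component_descriptors[0])
    return [
        sum(w * d[k] for w, d in zip(weights, component_descriptors))
        for k in range(n_descriptors)
    ]


# Hypothetical essential oil with two constituents and two descriptors each
components = [[1.0, 10.0], [3.0, 30.0]]
weighted = mixture_descriptor(components, fractions=[0.75, 0.25])
unweighted = mixture_descriptor(components)
print(weighted, unweighted)
```

Comparing models built on the weighted and unweighted vectors mirrors the two strategies contrasted in the abstract.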
Affiliation(s)
- P R Duchowicz
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, La Plata, Argentina
| | - D O Bennardi
- Cátedra de Química Orgánica, Facultad de Ciencias Agrarias y Forestales, La Plata, Argentina
| | - E V Ortiz
- Instituto de Monitoreo y Control de la Degradación Geoambiental (IMCoDeG), CONICET, Facultad de Tecnología y Ciencias Aplicadas, Universidad Nacional de Catamarca, Catamarca, Argentina
| | - N C Comelli
- Centro de Investigaciones y Transferencia de Catamarca (CITCA), CONICET, Universidad Nacional de Catamarca, Catamarca, Argentina
- Facultad de Ciencias Agrarias, Universidad Nacional de Catamarca, Catamarca, Argentina
|
21
|
Halder AK, Dias Soeiro Cordeiro MN. QSAR-Co-X: an open source toolkit for multitarget QSAR modelling. J Cheminform 2021; 13:29. [PMID: 33858509 PMCID: PMC8048082 DOI: 10.1186/s13321-021-00508-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/06/2020] [Accepted: 03/31/2021] [Indexed: 12/02/2022]
Abstract
Quantitative structure activity relationships (QSAR) modelling is a well-known computational tool, often used in a wide variety of applications. Yet one of the major drawbacks of conventional QSAR modelling is that models are set up based on a limited number of experimental and/or theoretical conditions. To overcome this, the so-called multitasking or multitarget QSAR (mt-QSAR) approaches have emerged as new computational tools able to integrate diverse chemical and biological data into a single model equation, thus extending and improving the reliability of this type of modelling. We have developed QSAR-Co-X, an open source Python-based toolkit (available to download at https://github.com/ncordeirfcup/QSAR-Co-X) for supporting mt-QSAR modelling following the Box-Jenkins moving average approach. The new toolkit embodies several functionalities for dataset selection and curation plus computation of descriptors, for setting up linear and non-linear models, as well as for a comprehensive results analysis. The workflow within this toolkit is guided by a set of multiple statistical parameters and graphical outputs for assessing both the predictivity and the robustness of the derived mt-QSAR models. To monitor and demonstrate the functionalities of the designed toolkit, four case studies pertaining to previously reported datasets are examined here. We believe that this new toolkit, along with our previously launched QSAR-Co code, will significantly contribute to making mt-QSAR modelling widely and routinely applicable.
Affiliation(s)
- Amit Kumar Halder
- LAQV@REQUIMTE/Faculty of Sciences, University of Porto, 4169-007, Porto, Portugal.
|
22
|
Structure Driven Prediction of Chromatographic Retention Times: Applications to Pharmaceutical Analysis. Int J Mol Sci 2021; 22:ijms22083848. [PMID: 33917733 PMCID: PMC8068189 DOI: 10.3390/ijms22083848] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/26/2021] [Revised: 04/04/2021] [Accepted: 04/06/2021] [Indexed: 11/17/2022]
Abstract
Pharmaceutical drug development relies heavily on the use of Reversed-Phase Liquid Chromatography methods. These methods are used to characterize active pharmaceutical ingredients and drug products by separating the main component from related substances such as process related impurities or main component degradation products. The results presented here indicate that retention models based on Quantitative Structure Retention Relationships can be used for de-risking methods used in pharmaceutical analysis and for the identification of optimal conditions for separation of known sample constituents from postulated/hypothetical components. The prediction of retention times for hypothetical components in established methods is highly valuable as these compounds are not usually readily available for analysis. Here we discuss the development and optimization of retention models, selection of the most relevant structural molecular descriptors, regression model building and validation. We also present a practical example applied to chromatographic method development and discuss the accuracy of these models on selection of optimal separation parameters.
|
23
|
Kleandrova VV, Scotti L, Bezerra Mendonça Junior FJ, Muratov E, Scotti MT, Speck-Planche A. QSAR Modeling for Multi-Target Drug Discovery: Designing Simultaneous Inhibitors of Proteins in Diverse Pathogenic Parasites. Front Chem 2021; 9:634663. [PMID: 33777898 PMCID: PMC7987820 DOI: 10.3389/fchem.2021.634663] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/28/2020] [Accepted: 01/22/2021] [Indexed: 11/21/2022]
Abstract
Parasitic diseases remain as unresolved health issues worldwide. While for some parasites the treatments involve drug combinations with serious side effects, for others, chemical therapies are inefficient due to the emergence of drug resistance. This urges the search for novel antiparasitic agents able to act through multiple mechanisms of action. Here, we report the first multi-target model based on quantitative structure-activity relationships and a multilayer perceptron neural network (mt-QSAR-MLP) to virtually design and predict versatile inhibitors of proteins involved in the survival and/or infectivity of different pathogenic parasites. The mt-QSAR-MLP model exhibited high accuracy (>80%) in both training and test sets for the classification/prediction of protein inhibitors. Several fragments were directly extracted from the physicochemical and structural interpretations of the molecular descriptors in the mt-QSAR-MLP model. Such interpretations enabled the generation of four molecules that were predicted as multi-target inhibitors against at least three of the five parasitic proteins reported here with two of the molecules being predicted to inhibit all the proteins. Docking calculations converged with the mt-QSAR-MLP model regarding the multi-target profile of the designed molecules. The designed molecules exhibited drug-like properties, complying with Lipinski’s rule of five, as well as Ghose’s filter and Veber’s guidelines.
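The drug-likeness check mentioned at the end of this abstract can be sketched for Lipinski's rule of five, whose thresholds are molecular weight of at most 500 Da, logP of at most 5, at most 5 hydrogen-bond donors and at most 10 hydrogen-bond acceptors. The class, the function names and the example property values are hypothetical; the properties are assumed to have been computed beforehand with a cheminformatics tool, and tolerating one violation by default is a common convention rather than something stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class MolProps:
    """Precomputed molecular properties (assumed to come from another tool)."""
    mol_weight: float   # g/mol
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(p: MolProps) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])


def passes_rule_of_five(p: MolProps, max_violations: int = 1) -> bool:
    """True if the number of violations does not exceed max_violations."""
    return lipinski_violations(p) <= max_violations


# Hypothetical designed molecules
small = MolProps(mol_weight=350.4, logp=2.1, h_donors=2, h_acceptors=5)
bulky = MolProps(mol_weight=650.0, logp=6.2, h_donors=2, h_acceptors=5)
print(passes_rule_of_five(small), passes_rule_of_five(bulky))
```

Setting `max_violations=0` gives the strict reading in which a molecule must satisfy all four thresholds.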
Affiliation(s)
- Valeria V Kleandrova
- Laboratory of Fundamental and Applied Research of Quality and Technology of Food Production, Moscow State University of Food Production, Moscow, Russian Federation
| | - Luciana Scotti
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
| | | | - Eugene Muratov
- Laboratory for Molecular Modeling, The UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States
| | - Marcus T Scotti
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
| | - Alejandro Speck-Planche
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
|
24
|
Cuesta SA, Mora JR, Márquez EA. In Silico Screening of the DrugBank Database to Search for Possible Drugs against SARS-CoV-2. Molecules 2021; 26:1100. [PMID: 33669720 PMCID: PMC7923184 DOI: 10.3390/molecules26041100] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/13/2021] [Revised: 02/11/2021] [Accepted: 02/16/2021] [Indexed: 12/29/2022]
Abstract
Coronavirus disease 2019 (COVID-19) is responsible for more than 1.80 M deaths worldwide. A Quantitative Structure-Activity Relationships (QSAR) model is developed based on experimental pIC50 values reported for a structurally diverse dataset. A robust model with only five descriptors is found, with values of R² = 0.897, Q²LOO = 0.854, and Q²ext = 0.876, complying with all the parameters established in Tropsha's validation test. The analysis of the applicability domain (AD) reveals coverage of about 90% for the external test set. Docking and molecular dynamics analyses are performed on the three most relevant biological targets for SARS-CoV-2: main protease, papain-like protease, and RNA-dependent RNA polymerase. A screening of the DrugBank database is executed, predicting the pIC50 value of 6664 drugs, which are in the AD of the model (coverage = 79%). Fifty-seven possible potent anti-COVID-19 candidates with pIC50 values > 6.6 are identified, and based on a pharmacophore modelling analysis, four compounds of this set can be suggested as potent candidates to be potential inhibitors of SARS-CoV-2. Finally, the biological activity of the compounds was related to the shapes of their frontier molecular orbitals.
Affiliation(s)
- Sebastián A. Cuesta
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Colegio Politécnico, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador;
| | - José R. Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Colegio Politécnico, Universidad San Francisco de Quito, Diego de Robles y Vía Interoceánica, Quito 170901, Ecuador;
| | - Edgar A. Márquez
- Grupo de Investigaciones en Química y Biología, Departamento de Química y Biología, Facultad de Ciencias Exactas, Universidad del Norte, Carrera 51B, Km 5, vía Puerto Colombia, Barranquilla 081007, Colombia
| |
Collapse
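The abstract of the entry above quotes a fitted R² and a leave-one-out cross-validated Q²LOO. As a minimal sketch of how such statistics are commonly computed, the Python below uses a single-descriptor least-squares line rather than the five-descriptor model of the cited paper. The function names and the convention Q²LOO = 1 - PRESS/SS_tot (with SS_tot taken about the overall mean) are assumptions for illustration, not details taken from the source.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def r2(xs, ys):
    """Coefficient of determination of the model fitted on all data."""
    a, b = fit_line(xs, ys)
    y_bar = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q2_loo(xs, ys):
    """Leave-one-out Q2: each point is predicted by a model fitted without it."""
    y_bar = mean(ys)
    press = 0.0  # predictive residual sum of squares
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - press / ss_tot
```

For a least-squares model Q²LOO cannot exceed R², so a small gap between the two (as in the values quoted above) is usually read as a sign that the fit does not hinge on individual compounds.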
|
25
|
QSAR models for the fumigant activity prediction of essential oils. J Mol Graph Model 2020; 101:107751. [DOI: 10.1016/j.jmgm.2020.107751] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/03/2020] [Revised: 08/20/2020] [Accepted: 09/04/2020] [Indexed: 12/23/2022]
|
26
|
Kleandrova VV, Scotti MT, Scotti L, Nayarisseri A, Speck-Planche A. Cell-based multi-target QSAR model for design of virtual versatile inhibitors of liver cancer cell lines. SAR QSAR Environ Res 2020; 31:815-836. [PMID: 32967475 DOI: 10.1080/1062936x.2020.1818617] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/14/2020] [Accepted: 08/31/2020] [Indexed: 06/11/2023]
Abstract
Liver cancers are among the leading fatal diseases among malignant neoplasms. Current chemotherapeutic treatments used to fight these illnesses have become less efficient in terms of both efficacy and safety. Therefore, there is a great need to search for new anti-liver cancer agents, and this can be accelerated by using computer-aided drug discovery approaches. In this work, we report the development of the first cell-based multi-target model based on quantitative structure-activity relationships (CBMT-QSAR) for the design and prediction of chemicals as anticancer agents against 17 liver cancer cell lines. Having good quality and predictive power (accuracy higher than 80% in both the training and test sets), the CBMT-QSAR model was employed as a tool to directly extract suitable fragments from the physicochemical and structural interpretations of the molecular descriptors. Some of these desirable fragments were assembled, leading to the virtual design of eight molecules with drug-like properties, six of which were predicted as versatile anticancer agents against the 17 liver cancer cell lines reported here.
Affiliation(s)
- V V Kleandrova
- Laboratory of Fundamental and Applied Research of Quality and Technology of Food Production, Moscow State University of Food Production, Moscow, Russian Federation
- M T Scotti
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
- L Scotti
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
- A Nayarisseri
- In Silico Research Laboratory, Eminent Biosciences, Indore, Madhya Pradesh, India
- A Speck-Planche
- Postgraduate Program in Natural and Synthetic Bioactive Products, Federal University of Paraíba, João Pessoa, Brazil
|
27
|
Barigye SJ, Gómez-Ganau S, Serrano-Candelas E, Gozalbes R. PeptiDesCalculator: Software for computation of peptide descriptors. Definition, implementation and case studies for 9 bioactivity endpoints. Proteins 2020; 89:174-184. [PMID: 32881068 DOI: 10.1002/prot.26003] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/12/2020] [Revised: 08/05/2020] [Accepted: 08/27/2020] [Indexed: 11/09/2022]
Abstract
We present a novel Java-based program denominated PeptiDesCalculator for computing peptide descriptors. These descriptors include redefinitions of known protein parameters to suit the peptide domain, generalization schemes for the global description of peptide characteristics, and empirical descriptors based on experimental evidence on peptide stability and interaction propensity. The PeptiDesCalculator software provides a user-friendly Graphical User Interface (GUI) and is parallelized to maximize the use of the computational resources available in current workstations. The PeptiDesCalculator indices are employed in modeling 8 peptide bioactivity endpoints, demonstrating satisfactory behavior. Moreover, we compare the performance of a support vector machine (SVM) classifier built using 15 PeptiDesCalculator indices with that of a recently reported deep neural network (DNN) antimicrobial activity classifier, demonstrating comparable test set performance notwithstanding the remarkably lower number of degrees of freedom of the former. This software will facilitate the development of in silico models for the prediction of peptide properties.
Affiliation(s)
- Stephen J Barigye
- ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain; MolDrug AI Systems SL, Valencia, Spain
- Sergi Gómez-Ganau
- ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain; Eurofins Agroscience Services Regulatory Spain SL, Valencia, Spain
- Eva Serrano-Candelas
- ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain
- Rafael Gozalbes
- ProtoQSAR SL, Centro Europeo de Empresas Innovadoras (CEEI), Parque Tecnológico de Valencia, Valencia, Spain; MolDrug AI Systems SL, Valencia, Spain
|
28
|
Kleandrova VV, Speck-Planche A. PTML Modeling for Alzheimer’s Disease: Design and Prediction of Virtual Multi-Target Inhibitors of GSK3B, HDAC1, and HDAC6. Curr Top Med Chem 2020; 20:1661-1676. [DOI: 10.2174/1568026620666200607190951] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/19/2019] [Revised: 12/12/2019] [Accepted: 01/05/2020] [Indexed: 01/23/2023]
Abstract
Background:
Alzheimer’s disease is characterized by a progressive pattern of cognitive and functional impairment, which ultimately leads to death. Computational approaches have played an important role in the context of drug discovery for anti-Alzheimer's therapies. However, most of the computational models reported to date have been focused on only one protein associated with Alzheimer's, while relying on small datasets of structurally related molecules.
Objective:
We introduce the first model combining perturbation theory and machine learning based on artificial neural networks (PTML-ANN) for simultaneous prediction and design of inhibitors of three Alzheimer’s disease-related proteins, namely glycogen synthase kinase 3 beta (GSK3B), histone deacetylase 1 (HDAC1), and histone deacetylase 6 (HDAC6).
Methods:
The PTML-ANN model was obtained from a dataset retrieved from ChEMBL, and it relied on a classification approach to predict chemicals as active or inactive.
Results:
The PTML-ANN model displayed sensitivity and specificity higher than 85% in both training and test sets. The physicochemical and structural interpretation of the molecular descriptors in the model permitted the direct extraction of fragments suggested to favorably contribute to enhancing the multitarget inhibitory activity. Based on this information, we assembled ten molecules from several fragments with positive contributions. Seven of these molecules were predicted as triple target inhibitors while the remaining three were predicted as dual-target inhibitors. The estimated physicochemical properties of the designed molecules complied with Lipinski’s rule of five and its variants.
Conclusion:
This work opens new horizons toward the design of multi-target inhibitors for anti-Alzheimer's therapies.
Affiliation(s)
- Valeria V. Kleandrova
- Laboratory of Fundamental and Applied Research of Quality and Technology of Food Production, Moscow State University of Food Production, Volokolamskoe Shosse 11, 125080, Moscow, Russian Federation
- Alejandro Speck-Planche
- Programa Institucional de Fomento a la Investigacion, Desarrollo e Innovacion, Universidad Tecnologica Metropolitana, Ignacio Valdivieso 2409, P.O. Box 8940577, San Joaquin, Santiago, Chile
|
29
|
Mora JR, Marrero-Ponce Y, García-Jacas CR, Suarez Causado A. Ensemble Models Based on QuBiLS-MAS Features and Shallow Learning for the Prediction of Drug-Induced Liver Toxicity: Improving Deep Learning and Traditional Approaches. Chem Res Toxicol 2020; 33:1855-1873. [PMID: 32406679 DOI: 10.1021/acs.chemrestox.0c00030] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/16/2022]
Abstract
Drug-induced liver injury (DILI) is a key safety issue in the drug discovery pipeline and a regulatory concern. Thus, many in silico tools have been proposed to improve the hepatotoxicity prediction of organic-type chemicals. Here, classifiers for the prediction of DILI were developed by using QuBiLS-MAS 0-2.5D molecular descriptors and shallow machine learning techniques, on a training set composed of 1075 molecules. The best ensemble model built, E13, was obtained with good statistical parameters for the learning series, namely: accuracy = 0.840, sensitivity = 0.890, specificity = 0.761, Matthews correlation coefficient = 0.660, and area under the ROC curve = 0.904. The model was also satisfactorily evaluated with a Y-scrambling test, repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external validation was carried out by using two test sets and five external test sets, with an average accuracy of 0.854 (±0.062) and a coverage of 98.4% according to its applicability domain. A statistical comparison of the performance of the E13 model with results and tools reported in the literature (e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead) was also performed. In general, E13 presented the best global performance in all experiments. The sum of the ranking differences procedure provided a very similar grouping pattern to that of the M-ANOVA statistical analysis, where E13 was identified as the best model for DILI predictions. A noncommercial and fully cross-platform software for DILI prediction was also developed, which is freely available at http://tomocomd.com/apps/ptoxra. This software was used for the screening of seven data sets, containing natural products, leads, toxic materials, and FDA-approved drugs, to assess the usefulness of the QSAR models in the DILI labeling of organic substances; it was found that 50-92% of the evaluated molecules are positive-DILI compounds. All in all, it can be stated that the E13 model is a relevant method for the prediction of DILI risk in humans, as it shows the best results among all of the methods analyzed.
Affiliation(s)
- Jose R Mora
- Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito (USFQ), Quito 17-1200-841, Ecuador; Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y Vía Interoceánica, Quito 17-1200-841, Ecuador
- Yovani Marrero-Ponce
- Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y Vía Interoceánica, Quito 17-1200-841, Ecuador; Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, and Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito, Pichincha 170157, Ecuador
- César R García-Jacas
- Cátedras Conacyt-Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
- Amileth Suarez Causado
- Grupo de Investigación Prometeus & Biomedicina Aplicada a las Ciencias Clínicas, Área de Bioquímica, Campus de Zaragocilla, Facultad de Medicina, Universidad de Cartagena, Cartagena de Indias 130001, Colombia
|
30
|
Toropova AP, Duchowicz PR, Saavedra LM, Castro EA, Toropov AA. The Use of the Index of Ideality of Correlation to Build Up Models for Bioconcentration Factor. Mol Inform 2020; 39:e1900070. [DOI: 10.1002/minf.201900070] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/12/2019] [Accepted: 12/24/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Alla P. Toropova
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
- Pablo R. Duchowicz
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900 La Plata, Argentina
- Laura M. Saavedra
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900 La Plata, Argentina
- Eduardo A. Castro
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900 La Plata, Argentina
- Andrey A. Toropov
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Science, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
|
31
|
Fioressi SE, Bacelo DE, Aranda JF, Duchowicz PR. Prediction of the aqueous solubility of diverse compounds by 2D-QSPR. J Mol Liq 2020. [DOI: 10.1016/j.molliq.2020.112572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
32
|
Contreras-Torres E, Marrero-Ponce Y, Terán JE, García-Jacas CR, Brizuela CA, Sánchez-Rodríguez JC. MuLiMs-MCoMPAs: A Novel Multiplatform Framework to Compute Tensor Algebra-Based Three-Dimensional Protein Descriptors. J Chem Inf Model 2020; 60:1042-1059. [PMID: 31663741 DOI: 10.1021/acs.jcim.9b00629] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]
Abstract
This report introduces the MuLiMs-MCoMPAs software (acronym for Multi-Linear Maps based on N-Metric and Contact Matrices of 3D Protein and Amino-acid weightings), designed to compute tensor-based 3D protein structural descriptors by applying two- and three-linear algebraic forms. Moreover, these descriptors contemplate generalizing components such as novel 3D protein structural representations, (dis)similarity metrics, and multimetrics to extract geometrical related information between two and three amino acids, weighting schemes based on amino acid properties, matrix normalization procedures that consider simple-stochastic and mutual probability transformations, topological and geometrical cutoffs, amino acid, and group-based MD calculations, and aggregation operators for merging amino acidic and group MDs. The MuLiMs-MCoMPAs software, which belongs to the ToMoCoMD-CAMPS suite, was developed in Java (version 1.8) using the Chemistry Development Kit (CDK) (version 1.4.19) and the Jmol libraries. This software implemented a divide-and-conquer strategy to parallelize the computation of the indices as well as modules for data preprocessing and batch computing functionalities. Furthermore, it consists of two components: (i) a desktop-graphical user interface (GUI) and (ii) an API library. The relevance of this novel approach is demonstrated through two analyses that considered Shannon's entropy-based variability and a principal component analysis. These studies showed that the MuLiMs-MCoMPAs' three-linear descriptor family contains higher informational entropy than several other descriptors generated with available computation tools. Moreover, the MuLiMs-MCoMPAs indices capture additional orthogonal information to the one codified by the available calculation approaches. As a result, two sets of suggested theoretical configurations that contain 13648 two-linear indices and 20263 three-linear indices are available for download at tomocomd.com . 
Furthermore, as a demonstration of the applicability and easy integration of the MuLiMs library into a QSAR-based expert system, a software application (ProStAF) was generated to predict SCOP protein structural classes and folding rate. It can thus be anticipated that the MuLiMs-MCoMPAs framework will turn into a valuable contribution to the chem- and bioinformatics research fields.
Affiliation(s)
- Ernesto Contreras-Torres
- Computer-Aided Molecular "Biosilico" Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Cumbayá, Quito, Ecuador; Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- Yovani Marrero-Ponce
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador; Grupo GINUMED, Facultad de Salud, Programa de Medicina, Corporacion Universitaria Rafal Nuñez, Cartagena, Colombia; Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, 46010 Valéncia, Spain
- Julio E Terán
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador; Grupo de Química Computacional y Teórica, Departamento de Ingeniería Química, Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Pichincha, Ecuador
- César R García-Jacas
- Cátedras Conacyt-Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, México
- Carlos A Brizuela
- Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, México
|
33
|
Saavedra LM, Romanelli GP, Duchowicz PR. A non-conformational QSAR study for plant-derived larvicides against Zika Aedes aegypti L. vector. Environ Sci Pollut Res Int 2020; 27:6205-6214. [PMID: 31865579 DOI: 10.1007/s11356-019-06630-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2019] [Accepted: 09/25/2019] [Indexed: 06/10/2023]
Abstract
A set of 263 plant-derived compounds with larvicidal activity against the Aedes aegypti L. (Diptera: Culicidae) vector is collected from the literature and studied by means of a non-conformational quantitative structure-activity relationships (QSAR) approach. The balanced subsets method (BSM) is employed to split the complete dataset into training, validation and test sets. From 26,775 freely available molecular descriptors, the most relevant structural features of compounds affecting the bioactivity are taken. The molecular descriptors are calculated with four freeware tools: PaDEL, Mold2, EPI Suite and QuBiLs-MAS. The replacement method (RM) variable subset selection technique leads to the best linear regression models. A successful QSAR equation involves seven conformation-independent molecular descriptors, fulfilling the evaluated internal (loo, l30%o, VIF and Y-randomization) and external (test set with Ntest = 65 compounds) validation criteria. The practical application of this QSAR model reveals promising predicted values for some natural compounds with unknown experimental larvicidal activity. Therefore, the present model constitutes the first one based on a large molecular set, being a useful computational tool for identifying and guiding the synthesis of new active molecules inspired by natural products.
Affiliation(s)
- Laura M Saavedra
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900, La Plata, Argentina
- Gustavo P Romanelli
- Departamento de Química, Facultad de Ciencias Exactas, CONICET, UNLP, Centro de Investigación y Desarrollo en Ciencias Aplicadas "Dr. J.J. Ronco" (CINDECA), Calle 47 No. 257, B1900AJK, La Plata, Argentina
- Cátedra de Química Orgánica, Centro de Investigación en Sanidad Vegetal (CISaV), Facultad de Ciencias Agrarias y Forestales, Universidad Nacional de La Plata, Calles 60 y 119 s/n, B1904AAN, La Plata, Argentina
- Pablo R Duchowicz
- Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA), CONICET, UNLP, Diag. 113 y 64, C.C. 16, Sucursal 4, 1900, La Plata, Argentina
|
34
|
Duchowicz PR, Aranda JF, Bacelo DE, Fioressi SE. QSPR study of the Henry’s law constant for heterogeneous compounds. Chem Eng Res Des 2020. [DOI: 10.1016/j.cherd.2019.12.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
|
35
|
Marrero-Ponce Y, Teran JE, Contreras-Torres E, García-Jacas CR, Perez-Castillo Y, Cubillan N, Peréz-Giménez F, Valdés-Martini JR. LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs. J Theor Biol 2020; 485:110039. [DOI: 10.1016/j.jtbi.2019.110039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/08/2019] [Revised: 09/11/2019] [Accepted: 10/02/2019] [Indexed: 11/28/2022]
|
36
|
Halder AK, Giri AK, Cordeiro MNDS. Multi-Target Chemometric Modelling, Fragment Analysis and Virtual Screening with ERK Inhibitors as Potential Anticancer Agents. Molecules 2019; 24:3909. [PMID: 31671605 PMCID: PMC6864583 DOI: 10.3390/molecules24213909] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 09/25/2019] [Revised: 10/21/2019] [Accepted: 10/25/2019] [Indexed: 02/07/2023]
Abstract
Two isoforms of extracellular regulated kinase (ERK), namely ERK-1 and ERK-2, are associated with several cellular processes, the aberration of which leads to cancer. The ERK-1/2 inhibitors are thus considered as potential agents for cancer therapy. Multitarget quantitative structure–activity relationship (mt-QSAR) models based on the Box–Jenkins approach were developed with a dataset containing 6400 ERK inhibitors assayed under different experimental conditions. The first mt-QSAR linear model was built with linear discriminant analysis (LDA) and provided information regarding the structural requirements for better activity. This linear model was also utilised for a fragment analysis to estimate the contributions of ring fragments towards ERK inhibition. Then, the random forest (RF) technique was employed to produce highly predictive non-linear mt-QSAR models, which were used for screening the Asinex kinase library and identifying the most promising virtual hits. The fragment analysis results justified the selection of the hits retrieved through such virtual screening. The latter were subsequently subjected to molecular docking and molecular dynamics simulations to understand their possible interactions with ERK enzymes. The present work, which utilises in-silico techniques such as multitarget chemometric modelling, fragment analysis, virtual screening, molecular docking and dynamics, may provide important guidelines to facilitate the discovery of novel ERK inhibitors.
Affiliation(s)
- Amit Kumar Halder
- Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
- Amal Kanta Giri
- Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
|
37
|
When global and local molecular descriptors are more than the sum of its parts: Simple, But Not Simpler? Mol Divers 2019; 24:913-932. [PMID: 31659696 DOI: 10.1007/s11030-019-10002-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/31/2019] [Accepted: 10/09/2019] [Indexed: 01/29/2023]
Abstract
In this report, we introduce a set of aggregation operators (AOs) to calculate global and local (group and atom type) molecular descriptors (MDs) as a generalization of the classical approach of molecular encoding using the sum of the atomic (or fragment) contributions. These AOs are implemented in a new and free software denominated MD-LOVIs (http://tomocomd.com/md-lovis), which allows for the calculation of MDs from an atomic weights vector and LOVIs (local vertex invariants). This software was developed in the Java programming language and employs the Chemistry Development Kit (CDK) library for handling chemical structures and calculating atomic weights. An analysis of the complexities of the algorithms presented herein demonstrates that these aspects were efficiently implemented. The calculation speed experiments show that the MD-LOVIs software behaves satisfactorily when compared to programs such as Padel, CDKDescriptor, DRAGON and Bluecal. Shannon's entropy (SE)-based variability studies demonstrate that MD-LOVIs yields indices with greater information content than those of popular academic and commercial software. A principal component analysis reveals that our approach captures chemical information orthogonal to that codified by the DRAGON, Padel and Mold2 software, as a result of the several generalizations in MD-LOVIs not used in other programs. Lastly, three QSARs were built using multiple linear regression with genetic algorithms, and the statistical parameters of these models demonstrate that the MD-LOVIs indices obtained with AOs yield better performance than those obtained when the summation operator is used exclusively. Moreover, the MD-LOVIs indices yield models with comparable to superior performance when compared to other QSAR methodologies reported in the literature, despite their simplicity. The studies performed herein collectively demonstrate that the MD-LOVIs software generates indices that are as simple as possible, but not simpler, and that the use of AOs enhances the diversity of the chemical information codified, which consequently improves the performance of traditional MDs.
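The abstract above ranks descriptor sets by Shannon's entropy as a measure of variability. As a rough illustration of the idea, the Python sketch below discretizes one descriptor column into equal-width bins and computes its entropy in bits. The binning scheme, the default bin count and the function name are assumptions made for this example; the abstract does not state how the cited study discretizes descriptor values.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=10):
    """Shannon entropy (bits) of one descriptor after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum is clamped into the last bin.
    bins = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())
```

Under this scheme the entropy is bounded above by log2(n_bins), reached when the compounds are spread evenly over all bins, so descriptors that separate the compounds more uniformly score higher.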
|
38
|
Terán JE, Marrero-Ponce Y, Contreras-Torres E, García-Jacas CR, Vivas-Reyes R, Terán E, Torres FJ. Tensor Algebra-based Geometrical (3D) Biomacro-Molecular Descriptors for Protein Research: Theory, Applications and Comparison with other Methods. Sci Rep 2019; 9:11391. [PMID: 31388082 PMCID: PMC6684663 DOI: 10.1038/s41598-019-47858-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/22/2019] [Accepted: 07/22/2019] [Indexed: 11/16/2022]
Abstract
In this report, a new type of three-dimensional (3D) biomacro-molecular descriptor for proteins is proposed. These descriptors make use of multi-linear algebra concepts based on the application of 3-linear forms (i.e., Canonical Trilinear (Tr), Trilinear Cubic (TrC), Trilinear-Quadratic-Bilinear (TrQB) and so on) as a specific case of the N-linear algebraic forms. The definitions of the kth 3-tuple similarity-dissimilarity spatial matrices (Tensor’s Form) are used for the transformation and representation of the chemical information available in the relationships between three amino acids of a protein. Several metrics (Minkowski-type, wave-edge, etc.) and multi-metrics (Triangle area, Bond-angle, etc.) are proposed for extracting the interaction information, as well as probabilistic transformations (e.g., simple stochastic and mutual probability) to achieve matrix normalization. A generalized procedure is proposed in which amino acid level-based indices are fused together by using aggregation operators for descriptor calculation. The results obtained demonstrate that the newly proposed 3D biomacro-molecular indices perform better than other approaches in SCOP-based discrimination and in the prediction of the folding rate of proteins by using simple linear parametric models. It can be concluded that the proposed method allows the definition of 3D biomacro-molecular descriptors that contain orthogonal information capable of providing better models for applications in protein science.
Collapse
Affiliation(s)
- Julio E Terán
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador.,Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, and Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador. .,Universidad de San Buenaventura - Cartagena - Facultad de Ciencias de la Salud - Grupo de Investigación Microbiología & Ambiente (GIMA) - Calle Real de Ternera, Diagonal 32, No. 30-966, Cartagena, Código postal: 1300 10, Colombia.
| | - Ernesto Contreras-Torres
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
| | - César R García-Jacas
- Cátedras CONACYT - Departamento de Ciencia de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, Mexico
| | - Ricardo Vivas-Reyes
- Grupo de Química Cuántica y Teórica de la Universidad de Cartagena-Facultad de Ciencias Exactas y Naturales. Programa de Química. Campus de San Pablo and Grupo GINUMED Corporacion Universitaria Rafal Nuñez. Facultad de Salud. Programa de Medicina., Cartagena, Colombia.,Grupo CipTec, Facultad de Ingenierias. Fundacion Universitaria Tecnologico Comfenalco - Cartagena, Cartagena, Bolívar, Colombia
| | - Enrique Terán
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
| | - F Javier Torres
- Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, and Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador
| |
|
39
|
Serra A, Önlü S, Coretto P, Greco D. An integrated quantitative structure and mechanism of action-activity relationship model of human serum albumin binding. J Cheminform 2019; 11:38. [PMID: 31172382 PMCID: PMC6551915 DOI: 10.1186/s13321-019-0359-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/16/2019] [Accepted: 05/22/2019] [Indexed: 01/27/2023]
Abstract
Background: Traditional quantitative structure-activity relationship models usually neglect the molecular alterations happening in the exposed systems (the mechanism of action, MOA) that mediate between structural properties of compounds and phenotypic effects of an exposure. Results: Here, we propose a computational strategy that integrates molecular descriptors and MOA information to better explain the mechanisms underlying biological endpoints of interest. By applying our methodology, we obtained a statistically robust and validated model to predict the binding affinity to human serum albumin. Our model is also able to provide new avenues for the interpretation of chemical-biological interactions. Conclusion: Our observations suggest that integrated quantitative models of structural and MOA-activity relationships are promising complementary tools in the arsenal of strategies aiming at developing new safe- and useful-by-design compounds.
Affiliation(s)
- Angela Serra
- Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere, Finland
| | - Serli Önlü
- Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere, Finland.,Corporate Product Safety/Henkel AG & Co. KGaA, Düsseldorf, Germany
| | - Pietro Coretto
- DISES, STATLAB, University of Salerno, Giovanni Paolo II 132, Fisciano, Italy
| | - Dario Greco
- Faculty of Medicine and Health Technology, Tampere University, Arvo Ylpön katu 34, Tampere, Finland.,Institute of Biotechnology, University of Helsinki, Helsinki, Finland.,BioMediTech institute, Tampere University, Tampere, Finland.
| |
|
40
|
García-Jacas CR, Marrero-Ponce Y, Cortés-Guzmán F, Suárez-Lezcano J, Martinez-Rios FO, García-González LA, Pupo-Meriño M, Martinez-Mayorga K. Enhancing Acute Oral Toxicity Predictions by using Consensus Modeling and Algebraic Form-Based 0D-to-2D Molecular Encodes. Chem Res Toxicol 2019; 32:1178-1192. [PMID: 31066547 DOI: 10.1021/acs.chemrestox.9b00011] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Quantitative structure-activity relationships (QSAR) are introduced to predict acute oral toxicity (AOT), by using the QuBiLS-MAS (acronym for quadratic, bilinear and N-Linear maps based on graph-theoretic electronic-density matrices and atomic weightings) framework for the molecular encoding. Three training sets were employed to build the models: EPA training set (5931 compounds), EPA-full training set (7413 compounds), and Zhu training set (10 152 compounds). Additionally, the EPA test set (1482 compounds) was used for the validation of the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were employed for the assessment of all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R_EPA-training ≥ 0.75 (R = correlation coefficient) and MAE_EPA-training ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator and denoted as M19 and M22, as well as a consensus model based on the weighted average operator and denoted as M24, were selected as the best ones for each training set considered. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE_test-AD = 0.4044, MAE_ProTox-AD = 0.4067 and MAE_T3DB-AD = 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively; whereas model M22 (built on the EPA-full set) and model M24 (built on the Zhu set) present MAE_ProTox-AD = 0.3992 and MAE_T3DB-AD = 0.2286, and MAE_ProTox-AD = 0.3773 and MAE_T3DB-AD = 0.2471 on the two external sets accounted for, respectively. These outcomes were compared and statistically validated with respect to 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance.
In addition, a retrospective study on 261 withdrawn drugs due to their toxic/side effects was performed, to assess the usefulness of prospectively using the QSAR models proposed in the labeling of chemicals. A comparison with regard to the methods from the literature was also made. As a result, model M22 has the best ability of labeling a compound as toxic according to the globally harmonized system of classification and labeling of chemicals. Therefore, it can be concluded that the models proposed, especially model M22, constitute prominent tools for studying AOT, at providing the best results among all the methods examined. A freely available software was also developed to be used in virtual screening tasks ( http://tomocomd.com/apps/ptoxra ).
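The retention and consensus steps described in this abstract can be illustrated with a minimal sketch. It assumes predictions are plain Python lists; the function names (`retain_base_models`, `consensus_min`, `consensus_weighted_average`) are hypothetical and are not taken from the authors' software. Only the thresholds (R ≥ 0.75, MAE ≤ 0.5) and the two operators (minimum, weighted average) come from the abstract.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def retain_base_models(y_train, train_preds, r_min=0.75, mae_max=0.5):
    """Keep the names of base models that meet both training thresholds."""
    return [
        name
        for name, pred in train_preds.items()
        if pearson_r(y_train, pred) >= r_min and mae(y_train, pred) <= mae_max
    ]


def consensus_min(preds):
    """Minimum operator: per compound, take the smallest base prediction."""
    return [min(column) for column in zip(*preds)]


def consensus_weighted_average(preds, weights):
    """Weighted average operator: per compound, average base predictions."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*preds)
    ]
```

Here `preds` is a list with one prediction vector per retained base model. How the weights of the weighted average were chosen is not stated in the abstract, so they are left as an input.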
Affiliation(s)
- César R García-Jacas
- Departamento de Ciencias de la Computación , Centro de Investigación Científica y de Educación Superior de Ensenada , Ensenada , Baja California , México
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional, Colegio de Ciencias de la Salud , Escuela de Medicina, Edificio de Especialidades Médicas , Quito , Pichincha , Ecuador.,Grupo de Investigación Ambiental, Programas Ambientales, Facultad de Ingenierías , Fundacion Universitaria Tecnologico Comfenalco-Cartagena , Cr44 DN 30 A, 91 , Cartagena , Bolívar , Colombia
| | - Fernando Cortés-Guzmán
- Instituto de Química , Universidad Nacional Autónoma de México , Ciudad de México , México
| | - José Suárez-Lezcano
- Pontificia Universidad Católica del Ecuador Sede Esmeraldas , Esmeraldas , Ecuador
| | | | - Luis A García-González
- Grupo de Investigación de Bioinformática , Universidad de las Ciencias Informáticas , La Habana , Cuba
| | - Mario Pupo-Meriño
- Grupo de Investigación de Bioinformática , Universidad de las Ciencias Informáticas , La Habana , Cuba
| | | |
|
41
|
Pham-The H, Cabrera-Pérez MÁ, Nam NH, Castillo-Garit JA, Rasulev B, Le-Thi-Thu H, Casañola-Martin GM. In Silico Assessment of ADME Properties: Advances in Caco-2 Cell Monolayer Permeability Modeling. Curr Top Med Chem 2019; 18:2209-2229. [PMID: 30499410 DOI: 10.2174/1568026619666181130140350] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/25/2018] [Revised: 10/16/2018] [Accepted: 11/19/2018] [Indexed: 11/22/2022]
Abstract
One of the main goals of in silico Caco-2 cell permeability models is to identify drug substances with high human intestinal absorption (HIA). For more than a decade, several in silico Caco-2 models have been built using a wide range of modeling techniques; nevertheless, their capacity for extrapolation to intestinal absorption is still doubtful. Three main problems explain the modest capacity of the models obtained: the inter- and/or intra-laboratory variability of the collected data, the influence of the metabolism mechanism, and the inconsistent in vitro-in vivo correlation (IVIVC) of Caco-2 cell permeability. This review paper sums up the recent advances and limitations of current modeling approaches, and presents some possible solutions to improve the applicability of in silico Caco-2 permeability models for absorption property profiling, taking into account the above-mentioned issues.
Affiliation(s)
- Hai Pham-The
- Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hanoi, Vietnam
| | - Miguel Á Cabrera-Pérez
- Unit of Modeling and Experimental Biopharmaceutics, Chemical Bioactive Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.,Department of Engineering, Area of Pharmacy and Pharmaceutical Technology, Miguel Hernández University, 03550 Sant Juan d'Alacant, Alicante, Spain
| | - Nguyen-Hai Nam
- Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hanoi, Vietnam
| | - Juan A Castillo-Garit
- Unidad de Toxicologia Experimental, Universidad de Ciencias Medicas "Dr. Serafín Ruiz de Zarate Ruiz" de Villa Clara, Santa Clara, 50200, Villa Clara, Cuba
| | - Bakhtiyor Rasulev
- Department of Coatings and Polymer Materials, North Dakota State University, Fargo, ND, 58102, United States
| | - Huong Le-Thi-Thu
- School of Medicine and Pharmacy, Vietnam National University, 144 Xuan Thuy, Hanoi, Vietnam
| | - Gerardo M Casañola-Martin
- Department of Coatings and Polymer Materials, North Dakota State University, Fargo, ND, 58102, United States
| |
|
42
|
Speck-Planche A. Combining Ensemble Learning with a Fragment-Based Topological Approach To Generate New Molecular Diversity in Drug Discovery: In Silico Design of Hsp90 Inhibitors. ACS OMEGA 2018; 3:14704-14716. [PMID: 30555986 PMCID: PMC6289491 DOI: 10.1021/acsomega.8b02419] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 09/18/2018] [Accepted: 10/23/2018] [Indexed: 05/05/2023]
Abstract
Machine learning methods have revolutionized modern science, providing fast and accurate solutions to multiple problems. However, they are commonly treated as "black boxes". Therefore, in important scientific fields such as medicinal chemistry and drug discovery, machine learning methods are restricted almost exclusively to the task of performing predictions of large and heterogeneous data sets of chemicals. The lack of interpretability prevents the full exploitation of the machine learning models as generators of new chemical knowledge. This work focuses on the development of an ensemble learning model for the prediction and design of potent dual heat shock protein 90 (Hsp90) inhibitors. The model displays accuracy higher than 80% in both training and test sets. To use the ensemble model as a generator of new chemical knowledge, three steps were followed. First, a physicochemical and/or structural interpretation was provided for each molecular descriptor present in the ensemble learning model. Second, the term "pseudolinear equation" was introduced within the context of machine learning to calculate the relative quantitative contributions of different molecular fragments to the inhibitory activity against the two Hsp90 isoforms studied here. Finally, by assembling the fragments with positive contributions, new molecules were designed, being predicted as potent Hsp90 inhibitors. According to Lipinski's rule of five, the designed molecules were found to exhibit potentially good oral bioavailability, a primordial property that chemicals must have to pass early stages in drug discovery. The present approach based on the combination of ensemble learning and fragment-based topological design holds great promise in drug discovery, and it can be adapted and applied to many different scientific disciplines.
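The final screening step, checking designed molecules against Lipinski's rule of five, can be sketched as follows. The sketch assumes the four properties have already been computed by some descriptor tool; the `Properties` container and both function names are hypothetical. The thresholds are the standard rule-of-five limits, and allowing at most one violation is the common convention rather than something stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed physicochemical properties of one molecule."""
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient (log P)
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(p: Properties) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_donors > 5,
        p.h_acceptors > 10,
    ])


def passes_rule_of_five(p: Properties, max_violations: int = 1) -> bool:
    """True when the molecule exceeds no more than `max_violations` limits."""
    return lipinski_violations(p) <= max_violations
```

Setting `max_violations=0` gives the strict reading in which every limit must hold.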
|
43
|
BET bromodomain inhibitors: fragment-based in silico design using multi-target QSAR models. Mol Divers 2018; 23:555-572. [PMID: 30421269 DOI: 10.1007/s11030-018-9890-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 08/31/2018] [Accepted: 10/30/2018] [Indexed: 12/17/2022]
Abstract
Epigenetics has become a focus of interest in drug discovery. In this sense, bromodomain-containing proteins have emerged as potential epigenetic targets in cancer research and other therapeutic areas. Several computational approaches have been applied to the prediction of bromodomain inhibitors. Nevertheless, such approaches have several drawbacks such as the fact that they predict activity against only one bromodomain-containing protein, using structurally related compounds. Also, there are no reports focused on meaningfully analyzing the physicochemical/structural features that are necessary for the design of a bromodomain inhibitor. This work describes the development of two different multi-target models based on quantitative structure-activity relationships (mt-QSAR) for the prediction and in silico design of multi-target bromodomain inhibitors against the proteins BRD2, BRD3, and BRD4. The first model relied on linear discriminant analysis (LDA) while the second focused on artificial neural networks. Both models exhibited accuracies higher than 85% in the dataset. Several molecular fragments were extracted, and their contributions to the inhibitory activity against the three BET proteins were calculated by the LDA model. Six molecules were designed by assembling the fragments with positive contributions, and they were predicted as multi-target BET bromodomain inhibitors by the two mt-QSAR models. Molecular docking calculations converged with the predictions performed by the mt-QSAR models, suggesting that the designed molecules can exhibit potent activity against the three BET proteins. These molecules complied with the Lipinski's rule of five.
|
44
|
García-Jacas CR, Cabrera-Leyva L, Marrero-Ponce Y, Suárez-Lezcano J, Cortés-Guzmán F, Pupo-Meriño M, Vivas-Reyes R. Choquet integral-based fuzzy molecular characterizations: when global definitions are computed from the dependency among atom/bond contributions (LOVIs/LOEIs). J Cheminform 2018; 10:51. [PMID: 30362050 PMCID: PMC6755596 DOI: 10.1186/s13321-018-0306-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 06/28/2018] [Accepted: 10/15/2018] [Indexed: 01/22/2023]
Abstract
BACKGROUND Several topological (2D) and geometric (3D) molecular descriptors (MDs) are calculated from local vertex/edge invariants (LOVIs/LOEIs) by performing an aggregation process. To this end, norm-, mean- and statistic-based (non-fuzzy) operators are used, under the assumption that LOVIs/LOEIs are independent (orthogonal) values of one another. These operators are based on additive and/or linear measures and, consequently, they cannot be used to encode information from interrelated criteria. Thus, as LOVIs/LOEIs are not orthogonal values, then non-additive (fuzzy) measures can be used to encode the interrelation among them. RESULTS General approaches to compute fuzzy 2D/3D-MDs from the contribution of each atom (LOVIs) or covalent bond (LOEIs) within a molecule are proposed, by using the Choquet integral as fuzzy aggregation operator. The Choquet integral-based operator is rather different from the other operators often used for the 2D/3D-MDs calculation. It performs a reordering step to fuse the LOVIs/LOEIs according to their magnitudes and, in addition, it considers the interrelation among them through a fuzzy measure. With this operator, fuzzy definitions can be derived from traditional or recent MDs; for instance, fuzzy Randic-like connectivity indices, fuzzy Balaban-like indices, fuzzy Kier-Hall connectivity indices, among others. To demonstrate the feasibility of using this operator, the QuBiLS-MIDAS 3D-MDs were used as study case and, as a result, a module was built into the corresponding software to compute them ( http://tomocomd.com/qubils-midas ). Thus, it is the only software reported in the literature that can be employed to determine Choquet integral-based fuzzy MDs. Moreover, regression models were created on eight chemical datasets. In this way, a comparison between the results achieved by the models based on the non-fuzzy QuBiLS-MIDAS 3D-MDs with regard to the ones achieved by the models based on the fuzzy QuBiLS-MIDAS 3D-MDs was made. 
As a result, the models built with the fuzzy QuBiLS-MIDAS 3D-MDs achieved the best performance, which was statistically corroborated through the Wilcoxon signed-rank test. CONCLUSIONS All in all, it can be concluded that the Choquet integral constitutes a prominent alternative to compute fuzzy 2D/3D-MDs from LOVIs/LOEIs. In this way, better characterizations of the compounds can be obtained, which will be ultimately useful in enhancing the modelling ability of existing traditional 2D/3D-MDs.
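The aggregation described in this abstract (reorder the atom-level contributions by magnitude, then fuse them through a fuzzy measure) corresponds to the textbook discrete Choquet integral, sketched below. This is not the QuBiLS-MIDAS implementation: the function names are hypothetical, the values are assumed non-negative, and the cardinality-based measure is only one simple example of a monotone fuzzy measure.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative `values`.

    `measure` maps a frozenset of indices to a number in [0, 1]; it is
    assumed monotone, with measure(empty set) = 0 and measure(all) = 1.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])  # ascending
    total, previous = 0.0, 0.0
    for rank, idx in enumerate(order):
        # Coalition of all indices whose value is at least values[idx].
        coalition = frozenset(order[rank:])
        total += (values[idx] - previous) * measure(coalition)
        previous = values[idx]
    return total


def cardinality_measure(n, q=1.0):
    """Symmetric fuzzy measure mu(A) = (|A| / n) ** q (illustrative choice)."""
    return lambda subset: (len(subset) / n) ** q
```

With `q = 1` the measure is additive and the integral reduces to the arithmetic mean; other values of `q` make the measure non-additive, which is what lets the operator encode interaction among the contributions.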
Affiliation(s)
- César R García-Jacas
- Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México.
| | - Lisset Cabrera-Leyva
- Grupo de Investigación de Inteligencia Artificial (AIRES), Facultad de Informática, Universidad de Camagüey, Camagüey, Cuba
| | - Yovani Marrero-Ponce
- Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Universidad San Francisco de Quito (USFQ), Quito, Pichincha, Ecuador.,Grupo de Investigación Ambiental (GIA), Programas Ambientales, Facultad de Ingenierías, Fundacion Universitaria Tecnologico Comfenalco - Cartagena, Cr 44 DN 30 A, 91, Cartagena, Bolívar, Colombia
| | - José Suárez-Lezcano
- Pontificia Universidad Católica del Ecuador Sede Esmeraldas (PUCESE), Esmeraldas, Ecuador
| | - Fernando Cortés-Guzmán
- Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
| | - Mario Pupo-Meriño
- Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
| | - Ricardo Vivas-Reyes
- Grupo de Química Cuántica y Teórica, Facultad de Ciencias Exactas y Naturales, Programa de Química, Universidad de Cartagena, Campus de San Pablo, Cartagena, Colombia.,Grupo CipTec, Facultad de Ingenierias, Fundacion Universitaria Tecnologico Comfenalco - Cartagena, Cr 44 DN 30 A, 91, Cartagena, Bolívar, Colombia
| |
|
45
|
Zhang H, Ren JX, Ma JX, Ding L. Development of an in silico prediction model for chemical-induced urinary tract toxicity by using naïve Bayes classifier. Mol Divers 2018; 23:381-392. [DOI: 10.1007/s11030-018-9882-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/17/2018] [Accepted: 09/25/2018] [Indexed: 12/16/2022]
|
46
|
García-Jacas CR, Cabrera-Leyva L, Marrero-Ponce Y, Suárez-Lezcano J, Cortés-Guzmán F, García-González LA. GOWAWA Aggregation Operator-based Global Molecular Characterizations: Weighting Atom/bond Contributions (LOVIs/LOEIs) According to their Influence in the Molecular Encoding. Mol Inform 2018; 37:e1800039. [PMID: 30070434 DOI: 10.1002/minf.201800039] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/30/2018] [Accepted: 07/13/2018] [Indexed: 11/11/2022]
Abstract
A different perspective to compute global weighted definitions of molecular descriptors from the contributions of each atom (LOVIs) or covalent bond (LOEIs) within a molecule is presented, using the generalized ordered weighted averaging - weighted averaging (GOWAWA) aggregation operator. This operator is rather different from the other norm-, mean- and statistic-based operators used up to date for the descriptors calculation from LOVIs/LOEIs. GOWAWA unifies the generalized ordered weighted averaging (GOWA) and the weighted generalized mean (WGM) functions and, in addition, it uses a smoothing parameter to assign different importance values to both functions depending on the problem under study. With the GOWAWA operator, diversity of novel global aggregations of molecular descriptors can be determined, where the influence that each atom (or covalent bond) has on the molecular characterization is taken into account. Therefore, this approach is completely different from the ones reported in the literature, where the values of LOVIs/LOEIs are considered equally important. To demonstrate the feasibility of using this operator, the QuBiLS-MIDAS descriptors (http://tomocomd.com/qubils-midas) were used and, as a result, a module was built into the corresponding software to compute them, being thus the only software reported in the literature that can be employed to determine weighted descriptors. Moreover, several modeling studies were performed on eight chemical datasets, which demonstrated that, with the GOWAWA aggregation operator, weighted QuBiLS-MIDAS descriptors that contribute to develop models with greater predictive power can be computed, if compared to the models based on the non-weighted descriptors calculated from the other operators used up to date. A non-parametric statistical assessment confirmed that the GOWAWA-based predictions are significantly superior to the others obtained. 
Therefore, all in all, it can be concluded that, from the results achieved, the GOWAWA operator constitutes a prominent alternative to codify relevant chemical information of the molecules, ultimately useful in improving the modeling ability of several old and recent descriptors whose definition is based on the LOVIs/LOEIs calculation.
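The abstract states that GOWAWA unifies the GOWA and WGM functions and balances them with a smoothing parameter, but it does not give the formula. The sketch below shows one plausible reading, assumed here by analogy with the OWAWA family of operators: a convex combination of the two functions controlled by `beta`. All names and parameters are hypothetical, the values are assumed positive, and the exponents are assumed non-zero.

```python
def gowa(values, weights, lam=1.0):
    """Generalized OWA: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * b ** lam for w, b in zip(weights, ordered)) ** (1.0 / lam)


def wgm(values, weights, delta=1.0):
    """Weighted generalized mean: weights apply to values in given order."""
    return sum(p * a ** delta for p, a in zip(weights, values)) ** (1.0 / delta)


def gowawa(values, order_weights, atom_weights, beta=0.5, lam=1.0, delta=1.0):
    """Assumed form: beta * GOWA + (1 - beta) * WGM, with beta in [0, 1]."""
    return (beta * gowa(values, order_weights, lam)
            + (1.0 - beta) * wgm(values, atom_weights, delta))
```

In this reading `order_weights` reward or penalize contributions by rank, while `atom_weights` express the importance of each individual atom or bond; `beta = 1` recovers pure GOWA and `beta = 0` pure WGM.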
Affiliation(s)
- César R García-Jacas
- Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
| | - Lisset Cabrera-Leyva
- Grupo de Investigación de Inteligencia Artificial (AIRES), Facultad de Informática, Universidad de Camagüey, Camagüey, Cuba
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador.,Grupo de Investigación Ambiental (GIA), Programas Ambientales, Facultad de Ingenierías, Fundación Universitaria Tecnológico de Comfenalco (COMFENALCO), Cartagena de Indias, Bolívar, Colombia
| | - José Suárez-Lezcano
- Pontificia Universidad Católica del Ecuador Sede Esmeraldas (PUCESE), Esmeraldas, Ecuador
| | - Fernando Cortés-Guzmán
- Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
| | - Luis A García-González
- Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
| |
|
47
|
Drug repositioning for novel antitrichomonas from known antiprotozoan drugs using hierarchical screening. Future Med Chem 2018; 10:863-878. [DOI: 10.4155/fmc-2016-0211] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/15/2023]
Abstract
Aim: Metronidazole is the most widely used drug in trichomoniasis therapy. However, the emergence of metronidazole-resistant Trichomonas vaginalis isolates calls for the search for new drugs to counter the pathogenicity of these parasites. Results: Classification models for predicting the antitrichomonas activity of molecules were built. These models were employed to screen antiprotozoal drugs, from which 20 were classified as active. The in vitro experiments showed moderate to high activity for 19 of the molecules at 10 μg/ml, while 3 compounds yielded higher activity than the reference at 1 μg/ml. The 11 most active chemicals were evaluated in vivo using Naval Medical Research Institute (NMRI) mice. Conclusion: Benznidazole showed similar results as metronidazole, and can thus be considered as a potential candidate in antitrichomonas therapy.
|
48
|
Dong J, Yao ZJ, Zhang L, Luo F, Lin Q, Lu AP, Chen AF, Cao DS. PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions. J Cheminform 2018; 10:16. [PMID: 29556758 PMCID: PMC5861255 DOI: 10.1186/s13321-018-0270-2] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Received: 09/06/2017] [Accepted: 03/12/2018] [Indexed: 11/15/2022]
Abstract
Background
With the increasing development of biotechnology and informatics technology, publicly available data in chemistry and biology are undergoing explosive growth. Such wealthy information in these data needs to be extracted and transformed to useful knowledge by various data mining methods. Considering the amazing rate at which data are accumulated in chemistry and biology fields, new tools that process and interpret large and complex interaction data are increasingly important. So far, there are no suitable toolkits that can effectively link the chemical and biological space in view of molecular representation. To further explore these complex data, an integrated toolkit for various molecular representation is urgently needed which could be easily integrated with data mining algorithms to start a full data analysis pipeline. Results Herein, the python library PyBioMed is presented, which comprises functionalities for online download for various molecular objects by providing different IDs, the pretreatment of molecular structures, the computation of various molecular descriptors for chemicals, proteins, DNAs and their interactions. PyBioMed is a feature-rich and highly customized python library used for the characterization of various complex chemical and biological molecules and interaction samples. The current version of PyBioMed could calculate 775 chemical descriptors and 19 kinds of chemical fingerprints, 9920 protein descriptors based on protein sequences, more than 6000 DNA descriptors from nucleotide sequences, and interaction descriptors from pairwise samples using three different combining strategies. Several examples and five real-life applications were provided to clearly guide the users how to use PyBioMed as an integral part of data analysis projects. By using PyBioMed, users are able to start a full pipelining from getting molecular data, pretreating molecules, molecular representation to constructing machine learning models conveniently. 
Conclusion PyBioMed provides various user-friendly and highly customized APIs to calculate various features of biological molecules and complex interaction samples conveniently, which aims at building integrated analysis pipelines from data acquisition, data checking, and descriptor calculation to modeling. PyBioMed is freely available at http://projects.scbdd.com/pybiomed.html.
Affiliation(s)
- Jie Dong
- Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China.,College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
| | - Zhi-Jiang Yao
- Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China
| | - Lin Zhang
- College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
| | - Feijun Luo
- College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
| | - Qinlu Lin
- College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
| | - Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
| | - Alex F Chen
- Center for Vascular Disease and Translational Medicine, Third Xiangya Hospital, Central South University, Changsha, People's Republic of China
| | - Dong-Sheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China.,Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China.,Center for Vascular Disease and Translational Medicine, Third Xiangya Hospital, Central South University, Changsha, People's Republic of China.
| |
|
49
|
Phanus-umporn C, Shoombuatong W, Prachayasittikul V, Anuwongcharoen N, Nantasenamat C. Privileged substructures for anti-sickling activity via cheminformatic analysis. RSC Adv 2018; 8:5920-5935. [PMID: 35539618 PMCID: PMC9078244 DOI: 10.1039/c7ra12079f] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/02/2017] [Revised: 02/21/2018] [Accepted: 01/12/2018] [Indexed: 11/21/2022]
Abstract
Sickle cell disease (SCD), an autosomal recessive genetic disorder, has been recognized by the World Health Organization (WHO) as a major public health problem as it affects 300 000 individuals worldwide. Complications arising from SCD include anemia, microvascular occlusion, severe pain, strokes, renal dysfunction and infections. A lucrative therapeutic strategy is to employ anti-sickling agents that can disrupt the formation of the HbS polymer. This study therefore employed cheminformatic approaches, encompassing classification structure–activity relationship (CSAR) modeling, to deduce the privileged substructures giving rise to the anti-sickling activity of an investigated set of 115 compounds, followed by substructure analysis. Briefly, the compiled compounds were described by fingerprint descriptors and used in the construction of CSAR models via several machine learning algorithms. The modelability of the data set, as exemplified by the MODI index, was determined to be in the range of 0.70–0.84. The predictive performance was assessed by the accuracy, sensitivity, specificity and Matthews correlation coefficient, and was found to be statistically robust: the former three parameters afforded values in excess of 0.7 while the latter provided a value greater than 0.5. An analysis of the top 20 important substructure descriptors for anti-sickling activity revealed that 10 important features were significant in the differentiation of actives from inactives, as illustrated by aromaticity/conjugation (e.g. SubFPC287, SubFPC171 and SubFPC5), carbonyl groups (e.g. SubFPC137, SubFPC139, SubFPC49 and SubFPC135) and miscellaneous groups (e.g. SubFPC303, SubFPC302 and SubFPC275). Furthermore, an analysis of the structure–activity relationship revealed that the length of alkyl chains, choice of functional moiety and position of substitution on the benzene ring may affect the anti-sickling activity of these compounds.
Thus, this knowledge is anticipated to be useful for guiding the design of robust compounds against the gelling activity of HbS, as preliminarily demonstrated in the data-driven compound design presented herein. Cheminformatic approaches (classification structure–activity relationship models based on 12 fingerprint classes) were employed for deducing privileged substructures giving rise to the anti-sickling activity of an investigated set of 115 compounds.
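The four performance measures named in this abstract follow directly from the confusion-matrix counts of a binary classifier. A minimal sketch is given below; the function name and the example counts in the usage note are hypothetical and do not come from the study.

```python
from math import sqrt


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and MCC from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # true-positive rate (actives recovered)
    specificity = tn / (tn + fp)   # true-negative rate (inactives recovered)
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

For instance, `classification_metrics(tp=40, tn=45, fp=5, fn=10)` gives an accuracy of 0.85, a sensitivity of 0.8 and a specificity of 0.9, which would satisfy the abstract's criteria of three values above 0.7 provided the MCC also exceeds 0.5.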
Affiliation(s)
- Chuleeporn Phanus-umporn
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Watshara Shoombuatong
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Veda Prachayasittikul
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Nuttapat Anuwongcharoen
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| | - Chanin Nantasenamat
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
| |
|