1
Valdés-Martiní JR, Marrero-Ponce Y, García-Jacas CR, Martinez-Mayorga K, Barigye SJ, Vaz d'Almeida YS, Pham-The H, Pérez-Giménez F, Morell CA. QuBiLS-MAS, open source multi-platform software for atom- and bond-based topological (2D) and chiral (2.5D) algebraic molecular descriptors computations. J Cheminform 2017; 9:35. [PMID: 29086120 PMCID: PMC5462671 DOI: 10.1186/s13321-017-0211-5] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 11/08/2016] [Accepted: 04/07/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND In previous reports, Marrero-Ponce et al. proposed algebraic formalisms for characterizing topological (2D) and chiral (2.5D) molecular features through atom- and bond-based ToMoCoMD-CARDD (acronym for Topological Molecular Computational Design-Computer Aided Rational Drug Design) molecular descriptors (MDs). These MDs codify molecular information based on the bilinear, quadratic and linear algebraic forms and on the graph-theoretical electronic-density and edge-adjacency matrices, which account for atom- and bond-based relations, respectively. These MDs have been successfully applied in the screening of chemical compounds for different therapeutic applications, including antimalarials, antibacterials and tyrosinase inhibitors. To compute these MDs, a computational program with the same name was initially developed. However, this in-house software offered few of the functionalities required in contemporary molecular modeling tasks and had inherent limitations that made it impractical to use. Therefore, the present manuscript introduces the QuBiLS-MAS (acronym for Quadratic, Bilinear and N-Linear mapS based on graph-theoretic electronic-density Matrices and Atomic weightingS) software, designed to compute topological (0-2.5D) molecular descriptors based on bilinear, quadratic and linear algebraic forms for atom- and bond-based relations.
RESULTS The QuBiLS-MAS module was designed as standalone software, in which extensions and generalizations of the former ToMoCoMD-CARDD 2D-algebraic indices are implemented, considering the following aspects: (a) two new matrix normalization approaches based on double-stochastic and mutual probability formalisms; (b) topological constraints (cut-offs) to take into account particular inter-atomic relations; (c) six additional atomic properties to be used as weighting schemes in the calculation of the molecular vectors; (d) four new local-fragments to consider molecular regions of interest; (e) number of lone-pair electrons in chemical structure defined by diagonal coefficients in matrix representations; and (f) several aggregation operators (invariants) applied over atom/bond-level descriptors in order to compute global indices. This software permits the parallel computation of the indices, contains a batch processing module and data curation functionalities. This program was developed in Java v1.7 using the Chemistry Development Kit library (version 1.4.19). The QuBiLS-MAS software consists of two components: a desktop interface (GUI) and an API library allowing for the easy integration of the latter in chemoinformatics applications. The relevance of the novel extensions and generalizations implemented in this software is demonstrated through three studies. Firstly, a comparative Shannon's entropy based variability study for the proposed QuBiLS-MAS and the DRAGON indices demonstrates superior performance for the former. A principal component analysis reveals that the QuBiLS-MAS approach captures chemical information orthogonal to that codified by the DRAGON descriptors. Lastly, a QSAR study for the binding affinity to the corticosteroid-binding globulin using Cramer's steroid dataset is carried out. 
CONCLUSIONS From these analyses, it is revealed that the QuBiLS-MAS approach for atom-pair relations yields similar-to-superior performance with regard to other QSAR methodologies reported in the literature. Therefore, the QuBiLS-MAS approach constitutes a useful tool for the diversity analysis of chemical compound datasets and high-throughput screening of structure-activity data.
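The three algebraic forms named in this abstract can be illustrated with a minimal sketch. It is not the QuBiLS-MAS implementation: a plain adjacency matrix stands in for the graph-theoretical electronic-density matrix, the molecule (the heavy-atom graph of ethanol) and the two weighting schemes are chosen only for illustration, and all function names are hypothetical.

```python
# Illustrative sketch only. A plain adjacency matrix stands in for the
# graph-theoretical electronic-density matrix used by the software.

def mat_vec(m, v):
    """Return the matrix-vector product M v for nested lists."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def bilinear_form(x, m, y):
    """b(x, y) = x^T M y, relating two atomic-property vectors."""
    return sum(x_i * my_i for x_i, my_i in zip(x, mat_vec(m, y)))

def quadratic_form(x, m):
    """q(x) = x^T M x, the bilinear form evaluated with y = x."""
    return bilinear_form(x, m, x)

def linear_form(x, m):
    """Atom-level linear map f = M x, giving one value per atom."""
    return mat_vec(m, x)

# Heavy-atom graph of ethanol (C-C-O); hydrogens are omitted.
ADJ = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
ELECTRONEG = [2.55, 2.55, 3.44]   # Pauling electronegativity of C, C, O
MASS = [12.011, 12.011, 15.999]   # standard atomic masses of C, C, O

if __name__ == "__main__":
    atom_level = linear_form(ELECTRONEG, ADJ)
    print("atom-level linear values:", atom_level)
    # Summation is used here as one possible aggregation operator.
    print("global linear index:", sum(atom_level))
    print("quadratic index:", quadratic_form(ELECTRONEG, ADJ))
    print("bilinear index:", bilinear_form(ELECTRONEG, ADJ, MASS))
```

Summing the atom-level values is only one choice of aggregation operator; the abstract states that several such operators are available for turning atom/bond-level descriptors into global indices.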
Affiliation(s)
- José R Valdés-Martiní
  - StreelBridge Laboratories, SteelBridge Consulting Technology Solutions, Miami, FL, USA
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Ecuador
  - Universidad San Francisco de Quito (USFQ), Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, 170157, Quito, Pichincha, Ecuador
  - Computer-Aided Molecular "Biosilico" Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Cumbayá, Quito, Ecuador
  - Grupo de Investigación Ambiental (GIA), Fundación Universitaria Tecnológico de Comfenalco, Facultad de Ingenierías, Programa de Ingeniería de Procesos, Cartagena de Indias, Bolívar, Colombia
  - Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- César R García-Jacas
  - Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
  - Escuela de Sistemas y Computación, Pontificia Universidad Católica del Ecuador Sede Esmeraldas (PUCESE), Esmeraldas, Ecuador
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), Havana, Cuba
- Karina Martinez-Mayorga
  - Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
- Stephen J Barigye
  - Facultad de Medicina, Universidad de Las Américas, Quito, Pichincha, Ecuador
- Hai Pham-The
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi, Vietnam
- Facundo Pérez-Giménez
  - Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- Carlos A Morell
  - Laboratorio de Inteligencia Artificial, Centro de Estudios de Informática (CEI), Facultad de Matemática, Física y Computación, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara, Cuba
2
García-Jacas CR, Marrero-Ponce Y, Hernández-Ortega T, Martinez-Mayorga K, Cabrera-Leyva L, Ledesma-Romero JC, Aguilera-Fernández I, Rodríguez-León AR. Tensor algebra-based geometric methodology to codify central chirality on organic molecules. SAR QSAR Environ Res 2017; 28:541-556. [PMID: 28705027 DOI: 10.1080/1062936x.2017.1344729] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/04/2017] [Accepted: 06/16/2017] [Indexed: 06/07/2023]
Abstract
A novel mathematical procedure to codify chiral features of organic molecules in the QuBiLS-MIDAS framework is introduced. This procedure generalizes the one commonly used to date, in which the values 1 and -1 (correction factor) are employed to weight the molecular vectors when each atom is labelled as R (rectus) or S (sinister) according to the Cahn-Ingold-Prelog rules. In the generalization, values in the range [Formula: see text] with steps equal to 0.25 may be accounted for. The atoms labelled R or S can have negative and positive values assigned (e.g. -3 for an R atom and 1 for an S atom, or vice versa), opposed values (e.g. -3 for an R atom and 3 for an S atom, or vice versa), positive values (e.g. 3 for an R atom and 1 for an S atom) or negative values (e.g. -3 for an R atom and -1 for an S atom). The proposed Chiral QuBiLS-MIDAS 3D-MDs are real numbers, are non-symmetric and reduce to the 'classical' (non-chiral) QuBiLS-MIDAS 3D-MDs when symmetry is not codified (correction factor equal to zero). In this report, only the factors with opposed values were considered, with the purpose of demonstrating the feasibility of this proposal. QSAR modelling carried out on four chemical datasets (Cramer's steroids, fenoterol stereoisomer derivatives, N-alkylated 3-(3-hydroxyphenyl)-piperidines, and perindoprilat stereoisomers) demonstrated that the use of several correction factors contributes to the building of models with greater robustness and predictive ability than those reported in the literature, and than the models developed exclusively with QuBiLS-MIDAS 3D-MDs based on the factor 1 | -1. In conclusion, it can be stated that this novel strategy constitutes a suitable alternative to computed chirality-based descriptors, contributing to the development of good models to predict properties depending on symmetry.
Affiliation(s)
- C R García-Jacas
  - Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
  - Escuela de Sistemas y Computación, Pontificia Universidad Católica del Ecuador Sede Esmeraldas (PUCESE), Esmeraldas, Ecuador
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
- Y Marrero-Ponce
  - Computer-Aided Molecular "Biosilico" Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Quito, Ecuador
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Quito, Pichincha, Ecuador
  - Universidad San Francisco de Quito (USFQ), Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador
  - Grupo de Investigación Ambiental (GIA), Programas Ambientales, Facultad de Ingenierías, Fundación Universitaria Tecnológico de Comfenalco (COMFENALCO), Cartagena de Indias, Bolívar, Colombia
- T Hernández-Ortega
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
- K Martinez-Mayorga
  - Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
- L Cabrera-Leyva
  - Grupo de Investigación de Inteligencia Artificial (AIRES), Facultad de Informática, Universidad de Camagüey, Camagüey, Cuba
- J C Ledesma-Romero
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
- I Aguilera-Fernández
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
- A R Rodríguez-León
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), La Habana, Cuba
3
Marrero-Ponce Y, Castañeda YG, Vivas-Reyes R, Vergara FM, Arán VJ, Castillo-Garit JA, Pérez-Giménez F, Torrens F, Le-Thi-Thu H, Pham-The H, Montenegro YV, Ibarra-Velarde F. Dry selection and wet evaluation for the rational discovery of new anthelmintics. Mol Phys 2017. [DOI: 10.1080/00268976.2017.1296194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/19/2022]
Affiliation(s)
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Ecuador
  - Universidad San Francisco de Quito (USFQ), Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito, Ecuador
  - Computer-Aided Molecular “Biosilico” Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Quito, Ecuador
  - GIA (Grupo de Investigación Ambiental), Fundación Universitaria Tecnológico de Comfenalco, Facultad de Ingenierías, Programa de Ingeniería de Procesos, Cartagena de Indias, Bolívar, Colombia
- Yeniel González Castañeda
  - Computer-Aided Molecular “Biosilico” Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Quito, Ecuador
- Ricardo Vivas-Reyes
  - Grupo de Química Cuántica y Teórica, Facultad de Ciencias, Universidad de Cartagena, Cartagena de Indias, Bolívar, Colombia
  - Grupo CipTec, Fundación Universitaria Tecnológico de Comfenalco, Facultad de Ingenierías, Programa de Ingeniería Industrial, Cartagena de Indias, Bolívar, Colombia
- Fredy Máximo Vergara
  - Grupo de Química Cuántica y Teórica, Facultad de Ciencias, Universidad de Cartagena, Cartagena de Indias, Bolívar, Colombia
  - Grupo CipTec, Fundación Universitaria Tecnológico de Comfenalco, Facultad de Ingenierías, Programa de Ingeniería Industrial, Cartagena de Indias, Bolívar, Colombia
- Juan A. Castillo-Garit
  - Computer-Aided Molecular “Biosilico” Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Quito, Ecuador
  - Unidad de Toxicología Experimental, Universidad de Ciencias Medicas de Villa Clara, Santa Clara, 50200, Cuba
- Facundo Pérez-Giménez
  - Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, València, Spain
- Francisco Torrens
  - Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, València, Spain
- Huong Le-Thi-Thu
  - School of Medicine and Pharmacy, Vietnam National University, Hanoi, Vietnam
- Hai Pham-The
  - Pharmacy Department, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi, Vietnam
- Yolanda Vera Montenegro
  - Department of Parasitology, Faculty of Veterinarian Medicinal and Zootecnic, UNAM, Mexico, Mexico
- Froylán Ibarra-Velarde
  - Department of Parasitology, Faculty of Veterinarian Medicinal and Zootecnic, UNAM, Mexico, Mexico
4
Grenier PA, Brun L, Villemin D. Chemoinformatics and stereoisomerism: A stereo graph kernel together with three new extensions. Pattern Recognit Lett 2017. [DOI: 10.1016/j.patrec.2016.06.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
5
Gaohua L, Wedagedera J, Small BG, Almond L, Romero K, Hermann D, Hanna D, Jamei M, Gardner I. Development of a Multicompartment Permeability-Limited Lung PBPK Model and Its Application in Predicting Pulmonary Pharmacokinetics of Antituberculosis Drugs. CPT Pharmacometrics Syst Pharmacol 2015; 4:605-13. [PMID: 26535161 PMCID: PMC4625865 DOI: 10.1002/psp4.12034] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 06/17/2015] [Accepted: 08/18/2015] [Indexed: 12/20/2022]
Abstract
Achieving sufficient concentrations of antituberculosis (TB) drugs in pulmonary tissue at the optimum time is still a challenge in developing therapeutic regimens for TB. A physiologically based pharmacokinetic model incorporating a multicompartment permeability-limited lung model was developed and used to simulate plasma and pulmonary concentrations of seven drugs. Passive permeability of drugs within the lung was predicted using an in vitro-in vivo extrapolation approach. Simulated epithelial lining fluid (ELF):plasma concentration ratios showed reasonable agreement with observed clinical data for rifampicin, isoniazid, ethambutol, and erythromycin. For clarithromycin, itraconazole and pyrazinamide the observed ELF:plasma ratios were significantly underpredicted. Sensitivity analyses showed that changing ELF pH or introducing efflux transporter activity between lung tissue and ELF can alter the ELF:plasma concentration ratios. The described model has shown utility in predicting the lung pharmacokinetics of anti-TB drugs and provides a framework for predicting pulmonary concentrations of novel anti-TB drugs.
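The sensitivity of the ELF:plasma ratio to ELF pH can be illustrated with the classical pH-partition (Henderson-Hasselbalch) relationship. This is only an equilibrium sketch under strong assumptions (passive diffusion of the unionized species alone, no protein binding, no transporters), not the multicompartment permeability-limited model described in the abstract. The function names, the pKa of 8.8 and the ELF pH of 6.6 are hypothetical values chosen for illustration.

```python
def total_to_unionized(pka, ph, is_base):
    """Ratio of total to unionized drug concentration at a given pH
    for a monoprotic base or acid (Henderson-Hasselbalch)."""
    exponent = (pka - ph) if is_base else (ph - pka)
    return 1.0 + 10.0 ** exponent

def elf_plasma_ratio(pka, ph_elf, ph_plasma=7.4, is_base=True):
    """Equilibrium ELF:plasma ratio of total drug, assuming that the
    unionized concentration is equal on both sides of the barrier."""
    return (total_to_unionized(pka, ph_elf, is_base)
            / total_to_unionized(pka, ph_plasma, is_base))

if __name__ == "__main__":
    # Hypothetical weak base (pKa 8.8) at several assumed ELF pH values.
    for ph_elf in (6.6, 7.0, 7.4):
        ratio = elf_plasma_ratio(8.8, ph_elf)
        print(f"ELF pH {ph_elf}: ELF:plasma ratio = {ratio:.2f}")
```

Under these assumptions a weak base accumulates in a more acidic ELF and a weak acid does the opposite, which is consistent in direction with the abstract's statement that changing ELF pH alters the predicted ratio.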
Affiliation(s)
- L Gaohua
  - Simcyp Limited (a Certara company) Sheffield, United Kingdom
- J Wedagedera
  - Simcyp Limited (a Certara company) Sheffield, United Kingdom
- B G Small
  - Simcyp Limited (a Certara company) Sheffield, United Kingdom
- L Almond
  - Simcyp Limited (a Certara company) Sheffield, United Kingdom
- K Romero
  - Critical Path Institute Tucson, Arizona, USA
- D Hermann
  - Certara USA, Inc. Princeton, New Jersey, USA
- D Hanna
  - Critical Path Institute Tucson, Arizona, USA
- M Jamei
  - Simcyp Limited (a Certara company) Sheffield, United Kingdom
- I Gardner
  - Simcyp Limited (a Certara company) Sheffield, United Kingdom
6
Castillo-Garit JA, del Toro-Cortés O, Vega MC, Rolón M, Rojas de Arias A, Casañola-Martin GM, Escario JA, Gómez-Barrio A, Marrero-Ponce Y, Torrens F, Abad C. Bond-based bilinear indices for computational discovery of novel trypanosomicidal drug-like compounds through virtual screening. Eur J Med Chem 2015; 96:238-44. [PMID: 25884114 DOI: 10.1016/j.ejmech.2015.03.063] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/28/2014] [Revised: 02/27/2015] [Accepted: 03/27/2015] [Indexed: 11/25/2022]
Abstract
Two-dimensional bond-based bilinear indices and linear discriminant analysis are used in this report to perform a quantitative structure-activity relationship study to identify new trypanosomicidal compounds. A data set of 440 organic chemicals, 143 with antitrypanosomal activity and 297 having other clinical uses, is used to develop the theoretical models. Two discriminant models, computed using bond-based bilinear indices, are developed and both show accuracies higher than 86% for training and test sets. The stochastic model correctly identifies nine out of eleven compounds of a set of organic chemicals obtained from our synthetic collaborators. The in vitro antitrypanosomal activity of this set against epimastigote forms of Trypanosoma cruzi is assayed. Both models show good agreement between theoretical predictions and experimental results. Three compounds showed IC50 values for epimastigote elimination (AE) lower than 50 μM, while benznidazole, used as the reference compound, showed IC50 = 54.7 μM. The IC50 value for cytotoxicity of these compounds is at least 5 times greater than their IC50 value for AE. Finally, the present algorithm constitutes a step forward in the search for efficient ways of discovering new antitrypanosomal compounds.
Affiliation(s)
- Juan Alberto Castillo-Garit
  - Centro de Estudio de Química Aplicada, Facultad de Química-Farmacia, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
  - Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100, Burjassot, Spain
  - Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, P.O. Box 22085, E-46071, València, Spain
- Oremia del Toro-Cortés
  - Centro de Estudio de Química Aplicada, Facultad de Química-Farmacia, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
- Maria C Vega
  - Centro para el Desarrollo de la Investigacion Cientifica (CEDIC) and Fundación Moisés Bertoni/Laboratorios Díaz Gill, Pai Perez 265 casi Mariscal Estigarribia, Asuncion, Paraguay
- Miriam Rolón
  - Centro para el Desarrollo de la Investigacion Cientifica (CEDIC) and Fundación Moisés Bertoni/Laboratorios Díaz Gill, Pai Perez 265 casi Mariscal Estigarribia, Asuncion, Paraguay
- Antonieta Rojas de Arias
  - Centro para el Desarrollo de la Investigacion Cientifica (CEDIC) and Fundación Moisés Bertoni/Laboratorios Díaz Gill, Pai Perez 265 casi Mariscal Estigarribia, Asuncion, Paraguay
- Gerardo M Casañola-Martin
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
  - Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100, Burjassot, Spain
  - Centro de Información y Gestión Tecnológica, Ministerio de Ciencia Tecnología y Medio Ambiente (CITMA), 65100, Ciego de Ávila, Cuba
- José A Escario
  - Departamento de Parasitología, Facultad de Farmacia, UCM, Pza. Ramón y Cajal s/n, 28040, Madrid, Spain
- Alicia Gómez-Barrio
  - Departamento de Parasitología, Facultad de Farmacia, UCM, Pza. Ramón y Cajal s/n, 28040, Madrid, Spain
- Yovani Marrero-Ponce
  - Environmental and Computational Chemistry Group, Facultad de Química Farmacéutica, Universidad de Cartagena, Cartagena de Indias, Bolivar, Colombia
- Francisco Torrens
  - Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, P.O. Box 22085, E-46071, València, Spain
- Concepción Abad
  - Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100, Burjassot, Spain
7
García-Jacas CR, Marrero-Ponce Y, Acevedo-Martínez L, Barigye SJ, Valdés-Martiní JR, Contreras-Torres E. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps. J Comput Chem 2014; 35:1395-409. [PMID: 24889018 DOI: 10.1002/jcc.23640] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/18/2014] [Revised: 04/22/2014] [Accepted: 04/23/2014] [Indexed: 11/12/2022]
Abstract
The present report introduces the QuBiLS-MIDAS software belonging to the ToMoCoMD-CARDD suite for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. It is thus unique software for computing these tensor-based indices. These descriptors establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemistry Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. This software is composed of a user-friendly desktop interface and an Application Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow its easy integration into other software for chemoinformatics applications. This program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. The complexity studies of the main algorithms demonstrate that these were implemented efficiently with respect to their trivial implementation. Lastly, the performance tests reveal that this software behaves suitably when the number of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of the molecular indices based on N-linear algebraic maps and it can be used freely to perform chemoinformatics studies.
Affiliation(s)
- César R García-Jacas
  - Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
8
Toropov AA, Toropova AP. Optimal descriptor as a translator of eclectic data into endpoint prediction: mutagenicity of fullerene as a mathematical function of conditions. Chemosphere 2014; 104:262-264. [PMID: 24246220 DOI: 10.1016/j.chemosphere.2013.10.079] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 09/12/2013] [Revised: 10/21/2013] [Accepted: 10/26/2013] [Indexed: 06/02/2023]
Abstract
The experimental data on the bacterial reverse mutation test on C60 nanoparticles (TA100) is examined as an endpoint. By means of the optimal descriptors calculated with the Monte Carlo method, a mathematical model of the endpoint has been built up. The model is the mathematical function of (i) dose (g/plate); (ii) metabolic activation (i.e. with S9 mix or without S9 mix); and (iii) illumination (i.e. dark or irradiation). The statistical quality of the model is the following: n = 10, r² = 0.7549, q² = 0.5709, s = 7.67, F = 25 (training set); n = 5, r² = 0.8987, s = 18.4 (calibration set); and n = 5, r² = 0.6968, s = 10.9 (validation set).
Affiliation(s)
- Andrey A Toropov
  - IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, Milano 20156, Italy
- Alla P Toropova
  - IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, Milano 20156, Italy
9
Toropov AA, Toropova AP, Puzyn T, Benfenati E, Gini G, Leszczynska D, Leszczynski J. QSAR as a random event: modeling of nanoparticles uptake in PaCa2 cancer cells. Chemosphere 2013; 92:31-37. [PMID: 23566368 DOI: 10.1016/j.chemosphere.2013.03.012] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Received: 01/10/2013] [Revised: 02/21/2013] [Accepted: 03/06/2013] [Indexed: 05/28/2023]
Abstract
Quantitative structure-property/activity relationships (QSPRs/QSARs) are a tool to predict various endpoints for various substances. The "classic" QSPR/QSAR analysis is based on the representation of the molecular structure by the molecular graph. However, the simplified molecular input-line entry system (SMILES) is gradually becoming the most popular representation of the molecular structure in the databases available on the Internet. Under such circumstances, the development of molecular descriptors calculated directly from SMILES becomes an attractive alternative to "classic" descriptors. The CORAL software (http://www.insilico.eu/coral) provides SMILES-based optimal molecular descriptors that are intended to correlate with various endpoints. We analyzed a data set on nanoparticle uptake in PaCa2 pancreatic cancer cells. The data set includes 109 nanoparticles with the same core but different surface modifiers (small organic molecules). The concept of a QSAR as a random event is suggested in opposition to "classic" QSARs, which are based on only one distribution of the available data into the training and validation sets. In other words, five random splits into the "visible" training set and the "invisible" validation set were examined. The SMILES-based optimal descriptors (obtained by the Monte Carlo technique) for these splits are calculated with the CORAL software. The statistical quality of all these models is good.
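The idea of examining several random splits rather than a single one can be sketched as follows. This is a generic illustration, not the CORAL procedure: the 80/20 proportion, the seed and the function name `random_splits` are assumptions, since the abstract gives only the number of nanoparticles (109) and the number of splits (five).

```python
import random

def random_splits(items, n_splits=5, train_fraction=0.8, seed=0):
    """Return n_splits independent (training, validation) partitions.

    Each partition is produced by shuffling a copy of the items and
    cutting it at a fixed position, so every item lands in exactly one
    of the two sets within a given split.
    """
    rng = random.Random(seed)  # fixed seed keeps the splits reproducible
    n_train = round(len(items) * train_fraction)
    splits = []
    for _ in range(n_splits):
        shuffled = list(items)
        rng.shuffle(shuffled)
        splits.append((shuffled[:n_train], shuffled[n_train:]))
    return splits

if __name__ == "__main__":
    nanoparticle_ids = list(range(109))  # placeholder identifiers
    for i, (train, valid) in enumerate(random_splits(nanoparticle_ids), 1):
        print(f"split {i}: {len(train)} training, {len(valid)} validation")
```

A model would then be built and assessed separately on each split, so that the reported quality reflects the spread across splits instead of one possibly lucky partition.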
Affiliation(s)
- Andrey A Toropov
  - Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via La Masa 19, 20156 Milano, Italy
10
Abstract
The CORAL software (http://www.insilico.eu/coral/) has been examined as a tool for modeling anti-HIV-1 activity by quantitative structure-activity relationships (QSAR) for three different sets: (i) TIBO derivatives (n=82), (ii) anti-HIV-1 activity of 2-amino-6-arylsulfonylbenzonitriles and their congeners (n=64), and (iii) the measured binding affinity for fullerene-based HIV-1 PR inhibitors (n=48). A new global invariant of the molecular structure, ATOMPAIR, which can be calculated from the simplified molecular input line entry system (SMILES), was studied. The ATOMPAIR is an indicator of the joint presence of pairs of chemical elements (F, Cl, Br, N, O, S, and P) and three types of bonds (double covalent bond, triple covalent bond, and stereochemical bond). Six random splits into sub-training, calibration, and test sets were examined for each set. For the three aforementioned sets, the use of ATOMPAIR in the modeling process improves the predictive potential of the models for the six random splits.
11
Brito-Sánchez Y, Castillo-Garit JA, Le-Thi-Thu H, González-Madariaga Y, Torrens F, Marrero-Ponce Y, Rodríguez-Borges JE. Comparative study to predict toxic modes of action of phenols from molecular structures. SAR QSAR Environ Res 2013; 24:235-251. [PMID: 23437773 DOI: 10.1080/1062936x.2013.766260] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 06/01/2023]
Abstract
Quantitative structure-activity relationship models for the prediction of mode of toxic action (MOA) of 221 phenols to the ciliated protozoan Tetrahymena pyriformis using atom-based quadratic indices are reported. The phenols represent a variety of MOAs including polar narcotics, weak acid respiratory uncouplers, pro-electrophiles and soft electrophiles. Linear discriminant analysis (LDA), and four machine learning techniques (ML), namely k-nearest neighbours (k-NN), support vector machine (SVM), classification trees (CTs) and artificial neural networks (ANNs), have been used to develop several models with higher accuracies and predictive capabilities for distinguishing between four MOAs. Most of them showed global accuracy of over 90%, and false alarm rate values were below 2.9% for the training set. Cross-validation, complementary subsets and external test set were performed, with good behaviour in all cases. Our models compare favourably with other previously published models, and in general the models obtained with ML techniques show better results than those developed with linear techniques. We developed unsupervised and supervised consensus, and these results were better than our ML models, the results of rule-based approach and other ensemble models previously published. This investigation highlights the merits of ML-based techniques as an alternative to other more traditional methods for modelling MOA.
Affiliation(s)
- Y Brito-Sánchez
  - Unit of Computer-Aided Molecular Biosilico Discovery and Bioinformatic Research, Faculty of Chemistry-Pharmacy, Universidad Central Marta Abreu de Las Villas, Santa Clara, Cuba
12
Castillo-Garit JA, del Toro-Cortés O, Kouznetsov VV, Puentes CO, Romero Bohórquez AR, Vega MC, Rolón M, Escario JA, Gómez-Barrio A, Marrero-Ponce Y, Torrens F, Abad C. Identification In Silico and In Vitro of Novel Trypanosomicidal Drug-Like Compounds. Chem Biol Drug Des 2012; 80:38-45. [DOI: 10.1111/j.1747-0285.2012.01378.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022]
13
Ligand-based discovery of novel trypanosomicidal drug-like compounds: In silico identification and experimental support. Eur J Med Chem 2011; 46:3324-30. [DOI: 10.1016/j.ejmech.2011.04.057] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 11/15/2010] [Revised: 04/26/2011] [Accepted: 04/26/2011] [Indexed: 01/08/2023]
14
|
Ortega-Broche SE, Marrero-Ponce Y, Díaz YE, Torrens F, Pérez-Giménez F. tomocomd-camps and protein bilinear indices - novel bio-macromolecular descriptors for protein research: I. Predicting protein stability effects of a complete set of alanine substitutions in the Arc repressor. FEBS J 2010; 277:3118-46. [DOI: 10.1111/j.1742-4658.2010.07711.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
|
15
|
Multi-target spectral moment QSAR versus ANN for antiparasitic drugs against different parasite species. Bioorg Med Chem 2010; 18:2225-2231. [PMID: 20185316 DOI: 10.1016/j.bmc.2010.01.068] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2009] [Revised: 01/22/2010] [Accepted: 01/29/2010] [Indexed: 11/23/2022]
Abstract
Many pathogenic parasite species show different susceptibility profiles to antiparasitic drugs. Unfortunately, almost all QSAR models predict the biological activity of drugs against only one parasite species. Consequently, predicting the probability with which a drug is active against different species with a single unified model is a goal of major importance. To this end, we use Markov chain theory to calculate new multi-target spectral moments and fit, for the first time, an mt-QSAR model for 500 drugs tested in the literature against 16 parasite species and another 207 drugs not tested in the literature. The data were processed by linear discriminant analysis (LDA), classifying drugs as active or non-active against the different parasite species tested. The model correctly classifies 311 out of 358 active compounds (86.9%) and 2328 out of 2577 non-active compounds (90.3%) in the training series. Overall training performance was 89.9%. Validation of the model was carried out by means of external prediction series. In these series the model correctly classified 157 out of 190 antiparasitic compounds (82.6%) and 1151 out of 1277 non-active compounds (90.1%). Overall predictability performance was 89.2%. In addition, we developed four types of non-linear artificial neural networks (ANNs) and compared them with the mt-QSAR model. The improved ANN model had an overall training performance of 87%. The present work reports the first attempt to calculate, within a unified framework, the probabilities of antiparasitic action of drugs against different parasite species based on spectral moment analysis.
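The percentages quoted in this abstract can be re-derived from the reported counts. The following is a minimal sketch of that arithmetic only; the helper names `rate` and `overall` are illustrative and do not come from the paper.

```python
def rate(correct, total):
    """Percentage of correctly classified cases."""
    return 100.0 * correct / total


def overall(*groups):
    """Pooled percentage over several (correct, total) groups."""
    correct = sum(c for c, _ in groups)
    total = sum(t for _, t in groups)
    return rate(correct, total)


# (correct, total) counts as quoted in the abstract.
train_active = (311, 358)
train_inactive = (2328, 2577)
ext_active = (157, 190)
ext_inactive = (1151, 1277)

train_overall = overall(train_active, train_inactive)
ext_overall = overall(ext_active, ext_inactive)

print(round(rate(*train_active), 1), round(rate(*train_inactive), 1), round(train_overall, 1))
print(round(rate(*ext_active), 1), round(rate(*ext_inactive), 1), round(ext_overall, 1))
```

The pooled figures reproduce the overall performances stated in the abstract (89.9% for training, 89.2% for the external series).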
|
16
|
Castillo-Garit J, Marrero-Ponce Y, Torrens F, García-Domenech R, Rodríguez-Borges J. Applications of Bond-Based 3D-Chiral Quadratic Indices in QSAR Studies Related to Central Chirality Codification. QSAR Comb Sci 2009. [DOI: 10.1002/qsar.200960085] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
17
|
Castillo-Garit JA, Vega MC, Rolon M, Marrero-Ponce Y, Kouznetsov VV, Torres DFA, Gómez-Barrio A, Bello AA, Montero A, Torrens F, Pérez-Giménez F. Computational discovery of novel trypanosomicidal drug-like chemicals by using bond-based non-stochastic and stochastic quadratic maps and linear discriminant analysis. Eur J Pharm Sci 2009; 39:30-6. [PMID: 19854271 DOI: 10.1016/j.ejps.2009.10.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2009] [Revised: 10/01/2009] [Accepted: 10/13/2009] [Indexed: 11/28/2022]
Abstract
Herein we present the results of a quantitative structure-activity relationship (QSAR) study to classify and rationally design new antitrypanosomal compounds by using non-stochastic and stochastic bond-based quadratic indices. A data set of 440 organic chemicals, 143 with antitrypanosomal activity and 297 having other clinical uses, is used to develop QSAR models based on linear discriminant analysis (LDA). The non-stochastic model correctly classifies more than 93% and 95% of chemicals in the training and external prediction groups, respectively. The stochastic model, on the other hand, shows an accuracy of about 87% for both series. As an experiment in virtual lead generation, the present approach is finally applied, with satisfactory results, to the virtual evaluation of 9 compounds already synthesized in house. The in vitro antitrypanosomal activity of this series against epimastigote forms of Trypanosoma cruzi is assayed. The model correctly predicts the behaviour of the majority of these compounds. Four compounds (FER16, FER32, FER33 and FER132) showed more than 70% epimastigote inhibition at a concentration of 100 μg/mL (86.74%, 78.12%, 88.85% and 72.10%, respectively) and two of these chemicals, FER16 (78.22% of AE) and FER33 (81.31% of AE), also showed good activity at a concentration of 10 μg/mL. At the same concentration, compound FER16 showed a low cytotoxicity value (15.44%), and compound FER33 showed a very low value of 1.37%. Taking all these results into account, we can say that these three compounds can be optimized in forthcoming works, but we consider compound FER33 to be the best candidate. Even though none of them proved more active than Nifurtimox, the current results constitute a step forward in the search for efficient ways to discover new lead antitrypanosomals.
Affiliation(s)
- Juan Alberto Castillo-Garit
- Applied Chemistry Research Center, Faculty of Chemistry-Pharmacy, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
18
|
Multi-target spectral moment: QSAR for antiviral drugs vs. different viral species. Anal Chim Acta 2009; 651:159-64. [PMID: 19782806 DOI: 10.1016/j.aca.2009.08.022] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2009] [Revised: 08/05/2009] [Accepted: 08/18/2009] [Indexed: 11/23/2022]
Abstract
Current antiviral QSAR models have an important limitation: they predict the biological activity of drugs against only one viral species. This is because most of the molecular descriptors reported to date encode only information about the molecular structure. As a result, predicting the probability with which a drug is active against different viral species with a single unifying model is a goal of major importance. In this work, we use Markov chain theory to calculate new multi-target spectral moments to fit a QSAR model for drugs active against 40 viral species. The model is based on 500 drugs (including active and non-active compounds) tested as antiviral agents in the recent literature; not all drugs were predicted against all viruses, but only those with experimental values. The database also contains 207 well-known compounds (not as recent as the previous ones) reported in the Merck Index with other activities that do not include antiviral action against any virus species. We used Linear Discriminant Analysis (LDA) to classify all these drugs into two classes, active or non-active, against the different viral species tested. The model correctly classifies 5129 out of 5594 non-active compounds (91.69%) and 412 out of 422 active compounds (97.63%). Overall training predictability was 92.34%. The model was validated by means of external prediction series, in which it correctly classified 2568 out of 2779 non-active compounds and 224 out of 229 active compounds. Overall predictability in the external series was 92.82%. The present work reports the first attempt to calculate, within a unified framework, the probabilities of antiviral drugs against different virus species based on a spectral moment analysis.
|
19
|
Nucleotide's bilinear indices: novel bio-macromolecular descriptors for bioinformatics studies of nucleic acids. I. Prediction of paromomycin's affinity constant with HIV-1 Psi-RNA packaging region. J Theor Biol 2009; 259:229-41. [PMID: 19272394 DOI: 10.1016/j.jtbi.2009.02.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2008] [Revised: 02/24/2009] [Accepted: 02/25/2009] [Indexed: 02/03/2023]
Abstract
A new set of nucleotide-based bio-macromolecular descriptors is presented. This novel approach to bio-macromolecular design from a linear algebra point of view is relevant to nucleic acid quantitative structure-activity relationship (QSAR) studies. These bio-macromolecular indices are based on the calculus of bilinear maps on $\mathbb{R}^n$ [$b_{mk}(x_m, y_m): \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$] in the canonical basis. Nucleic acid bilinear indices are calculated from the kth power of the non-stochastic and stochastic nucleotide graph-theoretic electronic-contact matrices, $M_m^k$ and ${}^{s}M_m^k$, respectively. That is to say, the kth non-stochastic and stochastic nucleic acid bilinear indices are calculated using $M_m^k$ and ${}^{s}M_m^k$ as matrix operators of bilinear transformations. Moreover, biochemical information is codified by using different pair combinations of nucleotide-base properties as weightings: the experimental molar absorption coefficient $\epsilon_{260}$ at 260 nm and pH = 7.0, the first ($\Delta E_1$) and second ($\Delta E_2$) single excitation energies in eV, and the first ($f_1$) and second ($f_2$) oscillator strength values (of the first singlet excitation energies) of the nucleotide DNA-RNA bases. As an example of this approach, an interaction study of the antibiotic paromomycin with the packaging region of the HIV-1 Psi-RNA has been performed, and several linear models have been obtained to predict the interaction strength. The best linear model obtained by using non-stochastic bilinear indices explains about 91% of the variance of the experimental Log K ($R = 0.95$ and $s = 0.08 \times 10^{-4}\,\mathrm{M}^{-1}$), while the best stochastic bilinear indices-based equation accounts for 93% of the Log K variance ($R = 0.97$ and $s = 0.07 \times 10^{-4}\,\mathrm{M}^{-1}$). The leave-one-out (LOO) press statistics evidenced the high predictive ability of both models ($q^2 = 0.86$ and $s_{cv} = 0.09 \times 10^{-4}\,\mathrm{M}^{-1}$ for non-stochastic and $q^2 = 0.91$ and $s_{cv} = 0.08 \times 10^{-4}\,\mathrm{M}^{-1}$ for stochastic bilinear indices).
The nucleic acid bilinear indices-based models compared favorably with other nucleic acid indices-based approaches reported to date. These models also permit the interpretation of the driving forces of the interaction process. In this sense, the developed equations involve short-reaching ($k \le 3$), middle-reaching ($4 < k < 9$), and far-reaching ($k \ge 10$) nucleotide bilinear indices. This points to electronic and topologic nucleotide backbone interactions controlling the stability profile of paromomycin-RNA complexes. Consequently, the present approach represents a novel and rather promising approach to theoretical-biology studies.
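The bilinear form described in this abstract, a pair of property vectors combined through the kth power of a contact matrix, can be sketched as follows. This is only an illustration under stated assumptions: the 3-node path graph, the weight vectors `x` and `y`, and the choice to obtain the stochastic matrix by row-normalizing the kth power are hypothetical and are not taken from the paper.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, k):
    """k-th power of a square matrix (k = 0 gives the identity)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result


def row_stochastic(m):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    out = []
    for row in m:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out


def bilinear_index(m, x, y, k, stochastic=False):
    """Evaluate x^T M^k y, optionally with the row-normalized M^k (assumed scheme)."""
    mk = mat_pow(m, k)
    if stochastic:
        mk = row_stochastic(mk)
    n = len(x)
    return sum(x[i] * mk[i][j] * y[j] for i in range(n) for j in range(n))


# Hypothetical toy input: contact matrix of a 3-node path graph and
# made-up property weights for the two vectors.
M = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
x = [1.0, 2.0, 3.0]
y = [1.0, 1.0, 1.0]

for k in range(4):
    print(k, bilinear_index(M, x, y, k), bilinear_index(M, x, y, k, stochastic=True))
```

Small k values only mix information between near neighbours, while larger k values propagate it over longer paths, which is the sense in which the abstract speaks of short-, middle- and far-reaching indices.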
|
20
|
Del Rio A. Exploring enantioselective molecular recognition mechanisms with chemoinformatic techniques. J Sep Sci 2009; 32:1566-84. [DOI: 10.1002/jssc.200800693] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
21
|
García I, Munteanu CR, Fall Y, Gómez G, Uriarte E, González-Díaz H. QSAR and complex network study of the chiral HMGR inhibitor structural diversity. Bioorg Med Chem 2009; 17:165-75. [DOI: 10.1016/j.bmc.2008.11.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2008] [Revised: 10/31/2008] [Accepted: 11/06/2008] [Indexed: 10/21/2022]
|
22
|
Prado-Prado FJ, Martinez de la Vega O, Uriarte E, Ubeira FM, Chou KC, González-Díaz H. Unified QSAR approach to antimicrobials. 4. Multi-target QSAR modeling and comparative multi-distance study of the giant components of antiviral drug-drug complex networks. Bioorg Med Chem 2008; 17:569-75. [PMID: 19112024 DOI: 10.1016/j.bmc.2008.11.075] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2008] [Revised: 11/24/2008] [Accepted: 11/28/2008] [Indexed: 11/18/2022]
Abstract
One limitation of almost all antiviral Quantitative Structure-Activity Relationship (QSAR) models is that they predict the biological activity of drugs against only one species of virus. Consequently, the development of multi-tasking QSAR models (mt-QSAR) to predict drug activity against different species of virus is vitally important. These mt-QSARs also offer a good opportunity to construct drug-drug Complex Networks (CNs) that can be used to explore large and complex drug-viral species databases. It is known that in very large CNs we can use the Giant Component (GC) as a representative subset of nodes (drugs), but the drug-drug similarity function selected may strongly determine the final network obtained. In the three previous works of the present series we reported mt-QSAR models to predict the antimicrobial activity against different fungi [Gonzalez-Diaz, H.; Prado-Prado, F. J.; Santana, L.; Uriarte, E. Bioorg.Med.Chem.2006, 14, 5973], bacteria [Prado-Prado, F. J.; Gonzalez-Diaz, H.; Santana, L.; Uriarte E. Bioorg.Med.Chem.2007, 15, 897] or parasite species [Prado-Prado, F.J.; González-Díaz, H.; Martinez de la Vega, O.; Ubeira, F.M.; Chou K.C. Bioorg.Med.Chem.2008, 16, 5871]. However, including these works, we have not found any report of mt-QSAR models for antiviral drugs, or a comparative study of the different GCs extracted from drug-drug CNs based on different similarity functions. In this work, we used Linear Discriminant Analysis (LDA) to fit an mt-QSAR model that classifies 600 drugs as active or non-active against the 41 different species of virus tested. The model correctly classifies 143 of 169 active compounds (specificity=84.62%) and 119 of 139 non-active compounds (sensitivity=85.61%) and presents an overall training accuracy of 85.1% (262 of 308 cases). Validation of the model was carried out by means of external prediction series, in which the model correctly classified 466 of 514 compounds (90.7%).
In order to illustrate the performance of the model in practice, we carried out a virtual screening in which the model recognized as active 102 of 110 antiviral compounds (92.7%). These compounds were never used in the training or prediction series. Next, we obtained and compared the topology of the CNs and their respective GCs based on Euclidean, Manhattan, Chebyshev, Pearson and other similarity measures. The GC of the Manhattan network showed the most interesting features for drug-drug similarity search. We also give the procedure for the construction of Back-Projection Maps for the contribution of each drug sub-structure to the antiviral activity against different species.
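The abstract notes that the similarity function chosen can strongly determine the network, and hence the giant component, that is obtained. A minimal sketch of that effect follows, assuming a simple rule in which two drugs are linked when their descriptor distance is at or below a cutoff. The descriptor vectors, the cutoff of 1.5 and the function names are hypothetical and are not taken from the paper.

```python
import math
from collections import deque


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def chebyshev(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


def giant_component(points, dist, cutoff):
    """Indices of the largest connected component of the thresholded network."""
    n = len(points)
    adj = {i: [] for i in range(n)}
    # Link two nodes when their distance does not exceed the cutoff.
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    seen, best = set(), []
    # Breadth-first search from every node not yet assigned to a component.
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue, comp = deque([start]), []
        while queue:
            node = queue.popleft()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if len(comp) > len(best):
            best = comp
    return sorted(best)


# Hypothetical two-descriptor vectors for five compounds.
descriptors = [(0, 0), (1, 1), (2, 2), (10, 10), (11, 10)]

for name, dist in (("Euclidean", euclidean), ("Manhattan", manhattan), ("Chebyshev", chebyshev)):
    print(name, giant_component(descriptors, dist, 1.5))
```

With the same data and cutoff, the Manhattan network yields a different giant component from the Euclidean and Chebyshev networks, which illustrates why the abstract compares several similarity measures.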
Affiliation(s)
- Francisco J Prado-Prado
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
|
23
|
Castillo-Garit JA, Marrero-Ponce Y, Torrens F, García-Domenech R, Romero-Zaldivar V. Bond-based 3D-chiral linear indices: Theory and QSAR applications to central chirality codification. J Comput Chem 2008; 29:2500-12. [DOI: 10.1002/jcc.20964] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
|
24
|
Castillo-Garit JA, Martinez-Santiago O, Marrero-Ponce Y, Casañola-Martín GM, Torrens F. Atom-based non-stochastic and stochastic bilinear indices: Application to QSPR/QSAR studies of organic compounds. Chem Phys Lett 2008. [DOI: 10.1016/j.cplett.2008.08.094] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
|
25
|
Castillo-Garit JA, Marrero-Ponce Y, Escobar J, Torrens F, Rotondo R. A novel approach to predict aquatic toxicity from molecular structure. CHEMOSPHERE 2008; 73:415-427. [PMID: 18597811 DOI: 10.1016/j.chemosphere.2008.05.024] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/01/2008] [Revised: 04/29/2008] [Accepted: 05/07/2008] [Indexed: 05/26/2023]
Abstract
The main aim of the study was to develop quantitative structure-activity relationship (QSAR) models for the prediction of aquatic toxicity using atom-based non-stochastic and stochastic linear indices. The dataset used consists of 392 benzene derivatives, separated into training and test sets, for which toxicity data to the ciliate Tetrahymena pyriformis were available. Using multiple linear regression, two statistically significant QSAR models were obtained with non-stochastic (R2=0.791 and s=0.344) and stochastic (R2=0.799 and s=0.343) linear indices. A leave-one-out (LOO) cross-validation procedure was carried out, achieving values of q2=0.781 (scv=0.348) and q2=0.786 (scv=0.350), respectively. In addition, a validation through an external test set was performed, which yielded significant Rpred2 values of 0.762 and 0.797. A brief study of the influence of statistical outliers on QSAR model development was also carried out. Finally, our method was compared with other approaches implemented in the Dragon software, achieving better results. The non-stochastic and stochastic linear indices appear to provide an interesting alternative to costly and time-consuming experiments for determining toxicity.
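The relation between the fitted R2 and the leave-one-out q2 reported in this abstract can be illustrated with a small sketch. It assumes a single descriptor and ordinary least squares, whereas the paper uses multiple linear regression, and it takes q2 = 1 - PRESS/SS_tot with the total sum of squares computed about the full-sample mean. The data and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Coefficient of determination of the model fitted on all points."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out q2: refit without each point, then predict that point."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and toxicity responses.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]

r2 = r_squared(xs, ys)
q2 = q_squared_loo(xs, ys)
print(round(r2, 3), round(q2, 3))
```

Because each left-out point is predicted by a model that never saw it, q2 cannot exceed R2 under this definition, which matches the pattern in the abstract (q2 slightly below R2 for both models).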
Affiliation(s)
- Juan A Castillo-Garit
- Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
26
|
Dea-Ayuela MA, Pérez-Castillo Y, Meneses-Marcel A, Ubeira FM, Bolas-Fernández F, Chou KC, González-Díaz H. HP-Lattice QSAR for dynein proteins: experimental proteomics (2D-electrophoresis, mass spectrometry) and theoretic study of a Leishmania infantum sequence. Bioorg Med Chem 2008; 16:7770-6. [PMID: 18662882 DOI: 10.1016/j.bmc.2008.07.023] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2008] [Revised: 06/23/2008] [Accepted: 07/02/2008] [Indexed: 10/21/2022]
Abstract
The toxicity and inefficacy of current organic drugs against Leishmaniosis justify research projects to find new molecular targets in Leishmania species including Leishmania infantum (L. infantum) and Leishmania major (L. major), both important pathogens. In this sense, quantitative structure-activity relationship (QSAR) methods, which are very useful in Bioorganic and Medicinal Chemistry to discover small-sized drugs, may help to identify not only new drugs but also new drug targets, if we apply them to proteins. Dyneins are important proteins of these parasites governing fundamental processes such as cilia and flagella motion, nuclear migration, organization of the mitotic spindle, and chromosome separation during mitosis. However, despite the interest in them as potential drug targets, so far there has been no report whatsoever on dyneins with QSAR techniques. To the best of our knowledge, we report here the first QSAR for dynein proteins. We used as input the Spectral Moments of a Markov matrix associated with the HP-Lattice Network of the protein sequence. The data contain 411 protein sequences of different species selected by ClustalX to develop a QSAR that correctly discriminates on average between 92.75% and 92.51% of dyneins and other proteins in four different training and cross-validation datasets. We also report a combined experimental and theoretic study of a new dynein sequence in order to illustrate, with a practical example, the utility of the model in searching for potential drug targets. First, we carried out a 2D-electrophoresis analysis of L. infantum biological samples. Next, we excised from 2D-E gels one spot of interest belonging to an unknown protein or protein fragment in the region M<20,200 and pI<4. We used the MASCOT search engine to find proteins in the L. major database with the highest similarity score to the MS of the protein isolated from L. infantum.
We used the QSAR model to predict the new sequence as dynein with a probability of 99.99% without relying upon alignment. In order to confirm the previous function annotation we predicted the sequences as dynein with BLAST and the omniBLAST tools (96% alignment similarity to dyneins of other species). Using this combined strategy, we have successfully identified an L. infantum protein containing dynein heavy chain, and illustrated the potential use of the QSAR model as a complement to alignment tools.
|
27
|
Prado-Prado FJ, González-Díaz H, de la Vega OM, Ubeira FM, Chou KC. Unified QSAR approach to antimicrobials. Part 3: first multi-tasking QSAR model for input-coded prediction, structural back-projection, and complex networks clustering of antiprotozoal compounds. Bioorg Med Chem 2008; 16:5871-80. [PMID: 18485714 DOI: 10.1016/j.bmc.2008.04.068] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2008] [Revised: 04/22/2008] [Accepted: 04/25/2008] [Indexed: 10/22/2022]
Abstract
Several pathogenic parasite species show different susceptibilities to different antiparasite drugs. Unfortunately, almost all structure-based methods are one-task or one-target Quantitative Structure-Activity Relationships (ot-QSAR) that predict the biological activity of drugs against only one parasite species. Consequently, multi-tasking learning to predict drug activity against different species by a single model (mt-QSAR) is vitally important. In the two previous works of the present series we reported two single mt-QSAR models in order to predict the antimicrobial activity against different fungal (Bioorg. Med. Chem.2006, 14, 5973-5980) or bacterial species (Bioorg. Med. Chem.2007, 15, 897-902). These mt-QSARs offer a good opportunity (impractical with ot-QSAR) to construct drug-drug similarity Complex Networks and to map the contribution of sub-structures to function for multiple species. These possibilities were not explored in our previous works. In the present work, we continue this series in another important direction of chemotherapy (antiparasite drugs) with the development of an mt-QSAR for more than 500 drugs tested in the literature against different parasites. The data were processed by Linear Discriminant Analysis (LDA), classifying drugs as active or non-active against the different parasite species tested. The model correctly classifies 212 out of 244 (87.0%) cases in the training series and 207 out of 243 compounds (85.4%) in the external validation series. In order to illustrate the performance of the QSAR for the selection of active drugs we carried out an additional virtual screening of antiparasite compounds not used in the training or prediction series; the model recognized 97 out of 114 (85.1%) of them. We also give the procedures to construct back-projection maps and to calculate the contribution of sub-structures to the biological activity.
Finally, we used the outputs of the QSAR to construct, for the first time, a multi-species Complex Network of antiparasite drugs. The predicted network has 380 nodes (compounds) and 634 edges (pairs of compounds with similar activity). This network allows us to cluster different compounds and identify on average three known compounds similar to a new query compound according to their profile of biological activity. This is the first attempt to calculate probabilities of antiparasitic action of drugs against different parasites.
|