1
González-Castañeda Y, Marrero-Ponce Y, Guerra JO, Echevarría-Díaz Y, Pérez N, Pérez-Giménez F, Simonet AM, Macías FA, Nogueiras CM, Olazabal E, Serrano H. Computational discovery of novel anthelmintic natural compounds from Agave brittoniana Trel. spp. Brachypus. BIONATURA 2022. [DOI: 10.21931/rb/2022.07.04.53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Helminth infections remain a medical problem worldwide. This report used bond-based 2D quadratic indices, a bond-level family of QuBiLS-MAS molecular descriptors, and Linear Discriminant Analysis (LDA) to obtain a quantitative linear model that discriminates between anthelmintic and non-anthelmintic drug-like organic compounds. The model correctly classified 87.46% and 81.82% of the training and external data sets, respectively. It was then used in a virtual screening to predict the biological activity of all 19 chemicals previously obtained and chemically characterized by some authors of this report from Agave brittoniana Trel. spp. Brachypus. The model identified 12 metabolites as possible anthelmintics, and a group of 5 novel natural products was tested in an in vitro assay against Fasciola hepatica (100% efficacy at 500 µg/mL). Finally, the two best hits were evaluated in vivo in BALB/c mice against the same helminth parasite using a 25 mg/kg dose. Compound 8 (Karatavinoside A) showed an efficacy of 92.2% in vivo. Notably, this natural compound exhibits activity similar or superior to that of triclabendazole, the best human fasciolicide available on the market against Fasciola hepatica, making it a novel lead scaffold with anthelmintic activity.
Keywords: TOMOCOMD-CARDD software; QuBiLS-MAS; nonstochastic and stochastic bond-based quadratic indices; LDA-based QSAR model; computational screening; anthelmintic agent; Agave brittoniana Trel. spp. Brachypus; Fasciola hepatica.
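The abstract above describes a linear discriminant model that labels compounds as active or inactive and reports the percentage classified correctly. A minimal sketch of that idea follows. The coefficients, intercept, descriptor values and the zero decision threshold are all invented for illustration and are not the published model.

```python
# Hypothetical illustration only: every number below is made up and is
# NOT taken from the model in the report.

def discriminant_score(descriptors, coefficients, intercept):
    """Linear discriminant function: intercept + sum(c_i * x_i)."""
    return intercept + sum(c * x for c, x in zip(coefficients, descriptors))


def classify(descriptors, coefficients, intercept):
    """Label a compound active (1) when its score is positive, else inactive (0)."""
    return 1 if discriminant_score(descriptors, coefficients, intercept) > 0 else 0


def percent_correct(dataset, coefficients, intercept):
    """Percentage of (descriptors, known_label) pairs classified correctly."""
    hits = sum(classify(x, coefficients, intercept) == y for x, y in dataset)
    return 100.0 * hits / len(dataset)


# Assumed coefficients for two assumed descriptor values per compound.
COEFFS = [0.8, -0.5]
INTERCEPT = -0.2

# Assumed external set: (descriptor vector, experimentally known activity).
EXTERNAL_SET = [
    ([1.0, 0.2], 1),
    ([0.1, 0.9], 0),
    ([0.9, 0.4], 1),
    ([0.3, 0.1], 1),  # scores slightly below zero, so it is misclassified
]

print(percent_correct(EXTERNAL_SET, COEFFS, INTERCEPT))
```

In a virtual screening setting, the same `classify` call would be applied to descriptor vectors of compounds whose activity is unknown, and the positives would be prioritized for assays.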
Affiliation(s)
- Yeniel González-Castañeda
  - Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA)
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- Jose O. Guerra
  - Chemistry Department, Faculty of Chemistry-Pharmacy, Universidad Central “Marta Abreu” de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
- Yunaimy Echevarría-Díaz
  - Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE)
- Noel Pérez
  - Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
- Facundo Pérez-Giménez
  - Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- Ana M. Simonet
  - Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
- Francisco A. Macías
  - Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
- Clara M. Nogueiras
  - Departamento de Química Orgánica, Facultad de Química, Universidad de La Habana
- Ervelio Olazabal
  - Chemical Bioactive Center, Universidad Central “Marta Abreu” de Las Villas, Santa Clara
- Hector Serrano
  - Chemical Bioactive Center, Universidad Central “Marta Abreu” de Las Villas, Santa Clara
2
Mora JR, Marrero-Ponce Y, García-Jacas CR, Suarez Causado A. Ensemble Models Based on QuBiLS-MAS Features and Shallow Learning for the Prediction of Drug-Induced Liver Toxicity: Improving Deep Learning and Traditional Approaches. Chem Res Toxicol 2020; 33:1855-1873. [PMID: 32406679 DOI: 10.1021/acs.chemrestox.0c00030] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/16/2022]
Abstract
Drug-induced liver injury (DILI) is a key safety issue in the drug discovery pipeline and a regulatory concern. Thus, many in silico tools have been proposed to improve the hepatotoxicity prediction of organic-type chemicals. Here, classifiers for the prediction of DILI were developed by using QuBiLS-MAS 0-2.5D molecular descriptors and shallow machine learning techniques, on a training set composed of 1075 molecules. The best ensemble model built, E13, was obtained with good statistical parameters for the learning series, namely: accuracy = 0.840, sensitivity = 0.890, specificity = 0.761, Matthews correlation coefficient = 0.660, and area under the ROC curve = 0.904. The model was also satisfactorily evaluated with a Y-scrambling test, repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external validation was carried out by using two test sets and five external test sets, with an average accuracy equal to 0.854 (±0.062) and a coverage equal to 98.4% according to its applicability domain. A statistical comparison of the performance of the E13 model against results and tools reported in the literature (e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead) was also performed. In general, E13 presented the best global performance in all experiments. The sum of the ranking differences procedure provided a grouping pattern very similar to that of the M-ANOVA statistical analysis, where E13 was identified as the best model for DILI predictions. Noncommercial and fully cross-platform software for DILI prediction was also developed, which is freely available at http://tomocomd.com/apps/ptoxra. This software was used for the screening of seven data sets, containing natural products, leads, toxic materials, and FDA-approved drugs, to assess the usefulness of the QSAR models in the DILI labeling of organic substances; it was found that 50-92% of the evaluated molecules are positive-DILI compounds. All in all, it can be stated that the E13 model is a relevant method for the prediction of DILI risk in humans, as it shows the best results among all of the methods analyzed.
Affiliation(s)
- Jose R Mora
  - Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, Universidad San Francisco de Quito (USFQ), Quito 17-1200-841, Ecuador; Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y Vía Interoceánica, Quito 17-1200-841, Ecuador
- Yovani Marrero-Ponce
  - Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y Vía Interoceánica, Quito 17-1200-841, Ecuador; Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, and Instituto de Simulación Computacional (ISC-USFQ), Universidad San Francisco de Quito (USFQ), Diego de Robles y vía Interoceánica, Quito, Pichincha 170157, Ecuador
- César R García-Jacas
  - Cátedras Conacyt-Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California 22860, México
- Amileth Suarez Causado
  - Grupo de Investigación Prometeus & Biomedicina Aplicada a las Ciencias Clínicas, Área de Bioquímica, Campus de Zaragocilla, Facultad de Medicina, Universidad de Cartagena, Cartagena de Indias 130001, Colombia
3
García-Jacas CR, Marrero-Ponce Y, Cortés-Guzmán F, Suárez-Lezcano J, Martinez-Rios FO, García-González LA, Pupo-Meriño M, Martinez-Mayorga K. Enhancing Acute Oral Toxicity Predictions by using Consensus Modeling and Algebraic Form-Based 0D-to-2D Molecular Encodes. Chem Res Toxicol 2019; 32:1178-1192. [PMID: 31066547 DOI: 10.1021/acs.chemrestox.9b00011] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Quantitative structure-activity relationships (QSAR) are introduced to predict acute oral toxicity (AOT), by using the QuBiLS-MAS (acronym for quadratic, bilinear and N-Linear maps based on graph-theoretic electronic-density matrices and atomic weightings) framework for the molecular encoding. Three training sets were employed to build the models: EPA training set (5931 compounds), EPA-full training set (7413 compounds), and Zhu training set (10,152 compounds). Additionally, the EPA test set (1482 compounds) was used for the validation of the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were employed for the assessment of all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R_EPA-training ≥ 0.75 (R = correlation coefficient) and MAE_EPA-training ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator, denoted M19 and M22, and a consensus model based on the weighted average operator, denoted M24, were selected as the best ones for each training set considered. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE_test-AD = 0.4044, MAE_ProTox-AD = 0.4067 and MAE_T3DB-AD = 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively. On the two external sets, model M22 (built on the EPA-full set) presents MAE_ProTox-AD = 0.3992 and MAE_T3DB-AD = 0.2286, and model M24 (built on the Zhu set) presents MAE_ProTox-AD = 0.3773 and MAE_T3DB-AD = 0.2471. These outcomes were compared and statistically validated with respect to 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance. In addition, a retrospective study on 261 drugs withdrawn due to their toxic/side effects was performed, to assess the usefulness of prospectively using the proposed QSAR models in the labeling of chemicals. A comparison with the methods from the literature was also made. As a result, model M22 has the best ability to label a compound as toxic according to the globally harmonized system of classification and labeling of chemicals. Therefore, it can be concluded that the models proposed, especially model M22, constitute prominent tools for studying AOT, as they provide the best results among all the methods examined. Freely available software was also developed to be used in virtual screening tasks (http://tomocomd.com/apps/ptoxra).
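The consensus step described in this abstract combines the predictions of several base models with a minimum operator or a weighted average, and then scores the result by mean absolute error. A minimal sketch of those two operators follows. The predictions, weights and observed values are invented, and how the published models chose their weights is not stated in the abstract.

```python
def consensus_minimum(predictions):
    """Per-compound minimum over the base models.

    predictions[m][i] is the value predicted by base model m for compound i.
    """
    return [min(column) for column in zip(*predictions)]


def consensus_weighted_average(predictions, weights):
    """Per-compound weighted average over the base models."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


def mean_absolute_error(predicted, observed):
    """MAE = mean of |predicted_i - observed_i|."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Assumed predictions of three hypothetical base models for two compounds.
BASE_PREDICTIONS = [
    [2.0, 3.0],
    [2.5, 2.5],
    [1.5, 3.5],
]
WEIGHTS = [0.5, 0.25, 0.25]   # assumed weights, e.g. favoring the first model
OBSERVED = [2.0, 2.5]         # assumed experimental values

minimum = consensus_minimum(BASE_PREDICTIONS)
average = consensus_weighted_average(BASE_PREDICTIONS, WEIGHTS)
print(minimum, mean_absolute_error(minimum, OBSERVED))
print(average, mean_absolute_error(average, OBSERVED))
```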
Affiliation(s)
- César R García-Jacas
  - Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada, Ensenada, Baja California, México
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional, Colegio de Ciencias de la Salud, Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador; Grupo de Investigación Ambiental, Programas Ambientales, Facultad de Ingenierías, Fundacion Universitaria Tecnologico Comfenalco-Cartagena, Cr44 DN 30 A, 91, Cartagena, Bolívar, Colombia
- Fernando Cortés-Guzmán
  - Instituto de Química, Universidad Nacional Autónoma de México, Ciudad de México, México
- José Suárez-Lezcano
  - Pontificia Universidad Católica del Ecuador Sede Esmeraldas, Esmeraldas, Ecuador
- Luis A García-González
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas, La Habana, Cuba
- Mario Pupo-Meriño
  - Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas, La Habana, Cuba
4
Valdés-Martiní JR, Marrero-Ponce Y, García-Jacas CR, Martinez-Mayorga K, Barigye SJ, Vaz d'Almeida YS, Pham-The H, Pérez-Giménez F, Morell CA. QuBiLS-MAS, open source multi-platform software for atom- and bond-based topological (2D) and chiral (2.5D) algebraic molecular descriptors computations. J Cheminform 2017; 9:35. [PMID: 29086120 PMCID: PMC5462671 DOI: 10.1186/s13321-017-0211-5] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 11/08/2016] [Accepted: 04/07/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND In previous reports, Marrero-Ponce et al. proposed algebraic formalisms for characterizing topological (2D) and chiral (2.5D) molecular features through atom- and bond-based ToMoCoMD-CARDD (acronym for Topological Molecular Computational Design-Computer Aided Rational Drug Design) molecular descriptors (MDs). These MDs codify molecular information based on the bilinear, quadratic and linear algebraic forms and the graph-theoretical electronic-density and edge-adjacency matrices in order to consider atom- and bond-based relations, respectively. These MDs have been successfully applied in the screening of chemical compounds for different therapeutic applications, including antimalarials, antibacterials, and tyrosinase inhibitors. To compute these MDs, a computational program with the same name was initially developed. However, this in-house software barely offered the functionalities required in contemporary molecular modeling tasks, and its inherent limitations made it impractical to use. Therefore, the present manuscript introduces the QuBiLS-MAS (acronym for Quadratic, Bilinear and N-Linear mapS based on graph-theoretic electronic-density Matrices and Atomic weightingS) software, designed to compute topological (0-2.5D) molecular descriptors based on bilinear, quadratic and linear algebraic forms for atom- and bond-based relations.
RESULTS The QuBiLS-MAS module was designed as standalone software, in which extensions and generalizations of the former ToMoCoMD-CARDD 2D-algebraic indices are implemented, considering the following aspects: (a) two new matrix normalization approaches based on double-stochastic and mutual probability formalisms; (b) topological constraints (cut-offs) to take into account particular inter-atomic relations; (c) six additional atomic properties to be used as weighting schemes in the calculation of the molecular vectors; (d) four new local-fragments to consider molecular regions of interest; (e) number of lone-pair electrons in chemical structure defined by diagonal coefficients in matrix representations; and (f) several aggregation operators (invariants) applied over atom/bond-level descriptors in order to compute global indices. This software permits the parallel computation of the indices and contains a batch processing module and data curation functionalities. The program was developed in Java v1.7 using the Chemistry Development Kit library (version 1.4.19). The QuBiLS-MAS software consists of two components: a desktop interface (GUI) and an API library that allows easy integration into chemoinformatics applications. The relevance of the novel extensions and generalizations implemented in this software is demonstrated through three studies. Firstly, a comparative Shannon's entropy based variability study for the proposed QuBiLS-MAS and the DRAGON indices demonstrates superior performance for the former. Secondly, a principal component analysis reveals that the QuBiLS-MAS approach captures chemical information orthogonal to that codified by the DRAGON descriptors. Lastly, a QSAR study for the binding affinity to the corticosteroid-binding globulin using Cramer's steroid dataset is carried out.
CONCLUSIONS From these analyses, it is revealed that the QuBiLS-MAS approach for atom-pair relations yields similar-to-superior performance with regard to other QSAR methodologies reported in the literature. Therefore, the QuBiLS-MAS approach constitutes a useful tool for the diversity analysis of chemical compound datasets and high-throughput screening of structure-activity data.
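The variability study mentioned in the RESULTS compares descriptors by their Shannon entropy. One common way to do this, sketched below, is to discretize each descriptor's values over a data set into equal-width bins and compute the entropy of the bin occupancies. The binning scheme, the number of bins and the sample values here are assumptions for illustration; the abstract does not state the exact protocol used.

```python
import math
from collections import Counter


def shannon_entropy(values, bins):
    """Shannon entropy (in bits) of a descriptor discretized into equal-width bins.

    A descriptor that takes the same value for every compound carries no
    information and gets entropy 0; the maximum possible is log2(bins).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0
    width = (hi - lo) / bins
    # The largest value would land in bin index `bins`, so clamp it.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Assumed values of two hypothetical descriptors over four compounds.
SPREAD_DESCRIPTOR = [0.0, 1.0, 2.0, 3.0]      # evenly spread values
CLUSTERED_DESCRIPTOR = [5.0, 5.0, 5.0, 5.1]   # nearly constant values

print(shannon_entropy(SPREAD_DESCRIPTOR, bins=4))
print(shannon_entropy(CLUSTERED_DESCRIPTOR, bins=4))
```

Under this scheme a higher entropy means the descriptor spreads the compounds over more bins, which is the sense in which one descriptor family can be said to show greater variability than another.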
Affiliation(s)
- José R Valdés-Martiní
  - StreelBridge Laboratories, SteelBridge Consulting Technology Solutions, Miami, FL, USA
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Ecuador; Universidad San Francisco de Quito (USFQ), Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, 170157, Quito, Pichincha, Ecuador; Computer-Aided Molecular "Biosilico" Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Cumbayá, Quito, Ecuador; Grupo de Investigación Ambiental (GIA), Fundación Universitaria Tecnológico de Comfenalco, Facultad de Ingenierías, Programa de Ingeniería de Procesos, Cartagena de Indias, Bolívar, Colombia; Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- César R García-Jacas
  - Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México; Escuela de Sistemas y Computación, Pontificia Universidad Católica del Ecuador Sede Esmeraldas (PUCESE), Esmeraldas, Ecuador; Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), Havana, Cuba
- Karina Martinez-Mayorga
  - Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
- Stephen J Barigye
  - Facultad de Medicina, Universidad de Las Américas, Quito, Pichincha, Ecuador
- Hai Pham-The
  - Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi, Vietnam
- Facundo Pérez-Giménez
  - Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- Carlos A Morell
  - Laboratorio de Inteligencia Artificial, Centro de Estudios de Informática (CEI), Facultad de Matemática, Física y Computación, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara, Cuba
5
Marrero-Ponce Y, Contreras-Torres E, García-Jacas CR, Barigye SJ, Cubillán N, Alvarado YJ. Novel 3D bio-macromolecular bilinear descriptors for protein science: Predicting protein structural classes. J Theor Biol 2015; 374:125-37. [DOI: 10.1016/j.jtbi.2015.03.026] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/17/2014] [Revised: 02/23/2015] [Accepted: 03/20/2015] [Indexed: 12/11/2022]
6
Ruiz-Blanco YB, Paz W, Green J, Marrero-Ponce Y. ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins. BMC Bioinformatics 2015; 16:162. [PMID: 25982853 PMCID: PMC4432771 DOI: 10.1186/s12859-015-0586-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 01/15/2015] [Accepted: 04/22/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long-sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient and widely available solution.
RESULTS We here introduce ProtDCal, a new computational software suite capable of generating tens of thousands of features considering both sequence-based and 3D-structural descriptors. We demonstrate, by means of principal component analysis and Shannon entropy tests, how ProtDCal's sequence-based descriptors provide new and more relevant information not encoded by currently available servers for sequence-based protein feature generation. The wide diversity of the 3D-structure-based features generated by ProtDCal is shown to provide additional complementary information and effectively completes its general protein encoding capability. As a demonstration of the utility of ProtDCal's features, prediction models of N-linked glycosylation sites are trained and evaluated. Classification performance compares favourably with that of contemporary predictors of N-linked glycosylation sites, in spite of not using domain-specific features as input information.
CONCLUSIONS ProtDCal provides a friendly and cross-platform graphical user interface, developed in the Java programming language, and is freely available at: http://bioinf.sce.carleton.ca/ProtDCal/ . ProtDCal introduces local and group-based encoding, which enhances the diversity of the information captured by the computed features. Furthermore, we have shown that adding structure-based descriptors contributes non-redundant additional information to the features-based characterization of polypeptide systems. This software is intended to provide a useful tool for general-purpose encoding of protein sequences and structures for applications in protein classification, similarity analyses and function prediction.
Affiliation(s)
- Yasser B Ruiz-Blanco
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba; Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
- Waldo Paz
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba; Centre of Informatics Studies (CEI), Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba
- James Green
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
- Yovani Marrero-Ponce
  - Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba; Grupo de Investigación Microbiología y Ambiente (GIMA), Programa de Bacteriología, Facultad Ciencias de la Salud, Universidad de San Buenaventura, Calle Real de Ternera, Cartagena (Bolivar), Colombia
7
García-Jacas CR, Marrero-Ponce Y, Acevedo-Martínez L, Barigye SJ, Valdés-Martiní JR, Contreras-Torres E. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps. J Comput Chem 2014; 35:1395-409. [PMID: 24889018 DOI: 10.1002/jcc.23640] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/18/2014] [Revised: 04/22/2014] [Accepted: 04/23/2014] [Indexed: 11/12/2022]
Abstract
The present report introduces the QuBiLS-MIDAS software, part of the ToMoCoMD-CARDD suite, for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. Thus, it is a unique software package that computes these tensor-based indices. These descriptors establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemistry Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. This software is composed of a user-friendly desktop interface and an Application Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow its easy integration into other software for chemoinformatics applications. This program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. Complexity studies of the main algorithms demonstrate that they were implemented efficiently relative to their trivial implementation. Lastly, the performance tests reveal that this software behaves suitably as the number of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of molecular indices based on N-linear algebraic maps, and it can be used freely to perform chemoinformatics studies.
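The bilinear and three-linear algebraic forms named in this abstract can be illustrated with plain nested sums: a two-index matrix relates pairs of atoms and a three-index tensor relates triples. The sketch below shows only the generic algebra. The matrix, tensor and property vectors are invented, and the actual QuBiLS-MIDAS matrices, metrics and normalizations are not reproduced here.

```python
def bilinear_form(matrix, x, y):
    """b(x, y) = sum_i sum_j g_ij * x_i * y_j (relations between atom pairs)."""
    n = len(x)
    return sum(matrix[i][j] * x[i] * y[j] for i in range(n) for j in range(n))


def three_linear_form(tensor, x, y, z):
    """t(x, y, z) = sum_i sum_j sum_k g_ijk * x_i * y_j * z_k (atom triples)."""
    n = len(x)
    return sum(
        tensor[i][j][k] * x[i] * y[j] * z[k]
        for i in range(n)
        for j in range(n)
        for k in range(n)
    )


# Assumed two-atom toy system; all values are hypothetical.
PAIR_MATRIX = [[0, 1],
               [1, 0]]
# Assumed three-index tensor: 1 unless all three indices coincide.
TRIPLE_TENSOR = [
    [[0 if i == j == k else 1 for k in range(2)] for j in range(2)]
    for i in range(2)
]
X = [1.0, 2.0]   # e.g. one atomic property per atom
Y = [3.0, 4.0]   # a second atomic property
Z = [1.0, 1.0]   # a third atomic property

print(bilinear_form(PAIR_MATRIX, X, Y))
print(three_linear_form(TRIPLE_TENSOR, X, Y, Z))
```

The triple sum costs O(n³) and a four-linear form O(n⁴) when written this way, which gives some intuition for why the abstract stresses efficient algorithms and parallel computation.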
Affiliation(s)
- César R García-Jacas
  - Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba; Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
8
Martins Alho MA, Marrero-Ponce Y, Barigye SJ, Meneses-Marcel A, Machado Tugores Y, Montero-Torres A, Gómez-Barrio A, Nogal JJ, García-Sánchez RN, Vega MC, Rolón M, Martínez-Fernández AR, Escario JA, Pérez-Giménez F, Garcia-Domenech R, Rivera N, Mondragón R, Mondragón M, Ibarra-Velarde F, Lopez-Arencibia A, Martín-Navarro C, Lorenzo-Morales J, Cabrera-Serra MG, Piñero J, Tytgat J, Chicharro R, Arán VJ. Antiprotozoan lead discovery by aligning dry and wet screening: Prediction, synthesis, and biological assay of novel quinoxalinones. Bioorg Med Chem 2014; 22:1568-85. [DOI: 10.1016/j.bmc.2014.01.036] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/29/2012] [Revised: 01/13/2014] [Accepted: 01/21/2014] [Indexed: 12/20/2022]
9
Discovery of novel anti-inflammatory drug-like compounds by aligning in silico and in vivo screening: The nitroindazolinone chemotype. Eur J Med Chem 2011; 46:5736-53. [DOI: 10.1016/j.ejmech.2011.07.053] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 05/27/2011] [Revised: 07/28/2011] [Accepted: 07/29/2011] [Indexed: 11/15/2022]
10
Ligand-based discovery of novel trypanosomicidal drug-like compounds: In silico identification and experimental support. Eur J Med Chem 2011; 46:3324-30. [DOI: 10.1016/j.ejmech.2011.04.057] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 11/15/2010] [Revised: 04/26/2011] [Accepted: 04/26/2011] [Indexed: 01/08/2023]
11
Le-Thi-Thu H, Marrero-Ponce Y, Casañola-Martin GM, Cardoso GC, Chávez M, Garcia MM, Morell C, Torrens F, Abad C. A Comparative Study of Nonlinear Machine Learning for the “In Silico” Depiction of Tyrosinase Inhibitory Activity from Molecular Structure. Mol Inform 2011; 30:527-37. [DOI: 10.1002/minf.201100021] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/15/2010] [Accepted: 03/25/2010] [Indexed: 11/05/2022]
12
Casañola-Martin GM, Marrero-Ponce Y, Khan MTH, Khan SB, Torrens F, Pérez-Jiménez F, Rescigno A, Abad C. Bond-based 2D quadratic fingerprints in QSAR studies: virtual and in vitro tyrosinase inhibitory activity elucidation. Chem Biol Drug Des 2010; 76:538-45. [PMID: 20964806 DOI: 10.1111/j.1747-0285.2010.01032.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/30/2022]
Abstract
In this report, we show the results of quantitative structure-activity relationship (QSAR) studies of tyrosinase inhibitory activity, using bond-based quadratic indices as molecular descriptors (MDs) and linear discriminant analysis (LDA) to generate discriminant functions that predict anti-tyrosinase activity. The best two models [Eqs (6) and (12)] out of the 12 QSAR models developed here show accuracies of 93.51% and 91.21%, as well as high Matthews correlation coefficients (C) of 0.86 and 0.82, respectively, in the training set. The external validation series gives values of 90.00% and 89.44% for Eqs (6) and (12), respectively. Afterwards, a second external prediction data set is used to perform a virtual screening of compounds reported in the literature as active (tyrosinase inhibitors). In a final step, a series of lignans is analysed using the in silico-developed models, and in vitro corroboration of the activity is carried out. Notably, all compounds present greater inhibition values than kojic acid (standard tyrosinase inhibitor: IC₅₀ = 16.67 μM). The results obtained could be used as a framework to speed up the biosilico discovery of leads for the treatment of skin disorders.
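The bond-based quadratic indices used as descriptors here are quadratic forms over a matrix that records which bonds are adjacent. A minimal sketch of the general idea follows, assuming the k-th index is x^T M^k x for a bond property vector x and an edge-adjacency matrix M, and assuming the stochastic variant row-normalizes the k-th power of M. The toy matrix and property values are invented, and the exact published definitions (weightings, local indices, normalization) may differ.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, k):
    """k-th power of a square matrix (k >= 0); the 0-th power is the identity."""
    n = len(m)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result


def row_stochastic(m):
    """Divide each row by its sum so rows add up to 1 (all-zero rows are kept)."""
    return [[v / sum(row) if sum(row) else 0.0 for v in row] for row in m]


def quadratic_index(m, x, k=1, stochastic=False):
    """q_k(x) = sum_i sum_j mk_ij * x_i * x_j, with mk the k-th power of m."""
    mk = mat_power(m, k)
    if stochastic:
        mk = row_stochastic(mk)  # assumption: normalize the k-th power itself
    n = len(x)
    return sum(mk[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Assumed toy molecule with three bonds in a chain: bond 1 is adjacent to
# bonds 0 and 2. The bond property values are hypothetical.
EDGE_ADJACENCY = [[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]]
BOND_PROPERTIES = [1.0, 2.0, 1.0]

print(quadratic_index(EDGE_ADJACENCY, BOND_PROPERTIES, k=1))
print(quadratic_index(EDGE_ADJACENCY, BOND_PROPERTIES, k=2))
print(quadratic_index(EDGE_ADJACENCY, BOND_PROPERTIES, k=1, stochastic=True))
```

Computing the index for several values of k yields a vector of descriptors per molecule, which is the kind of input a linear discriminant analysis would then use.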
13
Marrero-Ponce Y, Martínez-Albelo ER, Casañola-Martín GM, Castillo-Garit JA, Echevería-Díaz Y, Zaldivar VR, Tygat J, Borges JER, García-Domenech R, Torrens F, Pérez-Giménez F. Bond-based linear indices of the non-stochastic and stochastic edge-adjacency matrix. 1. Theory and modeling of ChemPhys properties of organic molecules. Mol Divers 2010; 14:731-53. [DOI: 10.1007/s11030-009-9201-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 03/03/2009] [Accepted: 10/19/2009] [Indexed: 10/20/2022]
14
Castillo-Garit JA, Vega MC, Rolon M, Marrero-Ponce Y, Kouznetsov VV, Torres DFA, Gómez-Barrio A, Bello AA, Montero A, Torrens F, Pérez-Giménez F. Computational discovery of novel trypanosomicidal drug-like chemicals by using bond-based non-stochastic and stochastic quadratic maps and linear discriminant analysis. Eur J Pharm Sci 2009; 39:30-6. [PMID: 19854271 DOI: 10.1016/j.ejps.2009.10.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 03/16/2009] [Revised: 10/01/2009] [Accepted: 10/13/2009] [Indexed: 11/28/2022]
Abstract
Herein we present the results of quantitative structure-activity relationship (QSAR) studies to classify and design, in a rational way, new antitrypanosomal compounds by using non-stochastic and stochastic bond-based quadratic indices. A data set of 440 organic chemicals, 143 with antitrypanosomal activity and 297 having other clinical uses, is used to develop QSAR models based on linear discriminant analysis (LDA). The non-stochastic model correctly classifies more than 93% and 95% of the chemicals in the training and external prediction groups, respectively. The stochastic model, on the other hand, shows an accuracy of about 87% for both series. As an experiment in virtual lead generation, the present approach is then applied to the virtual evaluation of 9 in-house compounds that had already been synthesized. The in vitro antitrypanosomal activity of this series against epimastigote forms of Trypanosoma cruzi is assayed. The model correctly predicts the behaviour of the majority of these compounds. Four compounds (FER16, FER32, FER33 and FER 132) showed more than 70% epimastigote inhibition at a concentration of 100 μg/mL (86.74%, 78.12%, 88.85% and 72.10%, respectively), and two of these chemicals, FER16 (78.22% of AE) and FER33 (81.31% of AE), also showed good activity at a concentration of 10 μg/mL. At the same concentration, compound FER16 showed a lower cytotoxicity value (15.44%), and compound FER33 showed a very low value of 1.37%. Taking all these results into account, we can say that these three compounds can be optimized in forthcoming works, but we consider compound FER33 the best candidate. Even though none of them was more active than Nifurtimox, the current results constitute a step forward in the search for efficient ways to discover new lead antitrypanosomals.
Affiliation(s)
- Juan Alberto Castillo-Garit
- Applied Chemistry Research Center, Faculty of Chemistry-Pharmacy, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
15
|
Nucleotide's bilinear indices: novel bio-macromolecular descriptors for bioinformatics studies of nucleic acids. I. Prediction of paromomycin's affinity constant with HIV-1 Psi-RNA packaging region. J Theor Biol 2009; 259:229-41. [PMID: 19272394 DOI: 10.1016/j.jtbi.2009.02.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/20/2008] [Revised: 02/24/2009] [Accepted: 02/25/2009] [Indexed: 02/03/2023]
Abstract
A new set of nucleotide-based bio-macromolecular descriptors is presented. This novel approach to bio-macromolecular design from a linear algebra point of view is relevant to nucleic acids quantitative structure-activity relationship (QSAR) studies. These bio-macromolecular indices are based on the calculus of bilinear maps on ℝⁿ [bₘₖ(xₘ, yₘ): ℝⁿ × ℝⁿ → ℝ] in the canonical basis. Nucleic acid's bilinear indices are calculated from the kth power of the non-stochastic and stochastic nucleotide's graph-theoretic electronic-contact matrices, Mₘᵏ and ˢMₘᵏ, respectively. That is to say, the kth non-stochastic and stochastic nucleic acid's bilinear indices are calculated using Mₘᵏ and ˢMₘᵏ as matrix operators of bilinear transformations. Moreover, biochemical information is codified by using different pair combinations of nucleotide-base properties as weightings: the experimental molar absorption coefficient ε₂₆₀ at 260 nm and pH = 7.0, the first (ΔE₁) and second (ΔE₂) single excitation energies in eV, and the first (f₁) and second (f₂) oscillator strength values (of the first singlet excitation energies) of the nucleotide DNA-RNA bases. As an example of this approach, an interaction study of the antibiotic paromomycin with the packaging region of the HIV-1 Psi-RNA has been performed, and several linear models have been obtained to predict the interaction strength. The best linear model obtained by using non-stochastic bilinear indices explains about 91% of the variance of the experimental Log K (R = 0.95 and s = 0.08 × 10⁻⁴ M⁻¹), whereas the best stochastic bilinear indices-based equation accounts for 93% of the Log K variance (R = 0.97 and s = 0.07 × 10⁻⁴ M⁻¹). The leave-one-out (LOO) PRESS statistics evidenced the high predictive ability of both models (q² = 0.86 and s_cv = 0.09 × 10⁻⁴ M⁻¹ for non-stochastic and q² = 0.91 and s_cv = 0.08 × 10⁻⁴ M⁻¹ for stochastic bilinear indices).
The nucleic acid's bilinear indices-based models compared favorably with other nucleic acid's indices-based approaches reported to date. These models also permit the interpretation of the driving forces of the interaction process. In this sense, the developed equations involve short-reaching (k ≤ 3), middle-reaching (4 < k < 9), and far-reaching (k = 10 or greater) nucleotide's bilinear indices. This situation points to control of the stability profile of paromomycin-RNA complexes by electronic and topologic nucleotide's backbone interactions. Consequently, the present approach represents a novel and rather promising route for theoretical-biology studies.
|
16
|
Rivera-Borroto O, Marrero-Ponce Y, Meneses-Marcel A, Escario J, Gómez Barrio A, Arán V, Martins Alho M, Montero Pereira D, Nogal J, Torrens F, Ibarra-Velarde F, Montenegro Y, Huesca-Guillén A, Rivera N, Vogel C. Discovery of Novel Trichomonacidals Using LDA-Driven QSAR Models and Bond-Based Bilinear Indices as Molecular Descriptors. QSAR Comb Sci 2008. [DOI: 10.1002/qsar.200610165] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
|
17
|
Marrero-Ponce Y, Khan MTH, Casañola Martín GM, Ather A, Sultankhodzhaev MN, Torrens F, Rotondo R. Prediction of tyrosinase inhibition activity using atom-based bilinear indices. ChemMedChem 2008; 2:449-78. [PMID: 17366651 DOI: 10.1002/cmdc.200600186] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Indexed: 11/07/2022]
Abstract
A set of novel atom-based molecular fingerprints is proposed based on a bilinear map similar to that defined in linear algebra. These molecular descriptors (MDs) are proposed as a new means of molecular parametrization easily calculated from 2D molecular information. The nonstochastic and stochastic molecular indices match molecular structure provided by molecular topology by using the kth nonstochastic and stochastic graph-theoretical electronic-density matrices, M(k) and S(k), respectively. Thus, the kth nonstochastic and stochastic bilinear indices are calculated using M(k) and S(k) as matrix operators of bilinear transformations. Chemical information is coded by using different pair combinations of atomic weightings (mass, polarizability, vdW volume, and electronegativity). The results of QSAR studies of tyrosinase inhibitors using the new MDs and linear discriminant analysis (LDA) demonstrate the ability of the bilinear indices in testing biological properties. A database of 246 structurally diverse tyrosinase inhibitors was assembled. An inactive set of 412 drugs with other clinical uses was used; both active and inactive sets were processed by hierarchical and partitional cluster analyses to design training and predicting sets. Twelve LDA-based QSAR models were obtained, the first six using the nonstochastic total and local bilinear indices and the last six with the stochastic MDs. The discriminant models were applied; globally good classifications of 99.58 and 89.96 % were observed for the best nonstochastic and stochastic bilinear indices models in the training set along with high Matthews correlation coefficients (C) of 0.99 and 0.79, respectively, in the learning set. External prediction sets used to validate the models obtained were correctly classified, with accuracies of 100 and 87.78 %, respectively, yielding C values of 1.00 and 0.73. This subset contains 180 active and inactive compounds not considered to fit the models. 
A simulated virtual screen demonstrated this approach in searching for tyrosinase inhibitors among compounds never considered in either the training or predicting series. These fitted models permitted the selection of new cycloartane compounds isolated from herbal plants as new tyrosinase inhibitors. A good correspondence between theoretical and experimental inhibitory effects on tyrosinase was observed; compound CA6 (IC₅₀ = 1.32 μM) showed higher activity than the reference compounds kojic acid (IC₅₀ = 16.67 μM) and L-mimosine (IC₅₀ = 3.68 μM).
Affiliation(s)
- Yovani Marrero-Ponce
- Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, Poligon la Coma s/n (detras de Canal Nou) P.O. Box 22085, 46071 Valencia, Spain.
|
18
|
Marrero-Ponce Y, Meneses-Marcel A, Rivera-Borroto OM, García-Domenech R, De Julián-Ortiz JV, Montero A, Escario JA, Barrio AG, Pereira DM, Nogal JJ, Grau R, Torrens F, Vogel C, Arán VJ. Bond-based linear indices in QSAR: computational discovery of novel anti-trichomonal compounds. J Comput Aided Mol Des 2008; 22:523-40. [DOI: 10.1007/s10822-008-9171-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 11/06/2006] [Accepted: 01/05/2008] [Indexed: 10/22/2022]
|
19
|
Castillo-Garit JA, Marrero-Ponce Y, Torrens F, Rotondo R. Atom-based stochastic and non-stochastic 3D-chiral bilinear indices and their applications to central chirality codification. J Mol Graph Model 2007; 26:32-47. [PMID: 17110145 DOI: 10.1016/j.jmgm.2006.09.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Received: 05/31/2006] [Revised: 09/08/2006] [Accepted: 09/20/2006] [Indexed: 11/16/2022]
Abstract
Non-stochastic and stochastic 2D bilinear indices have been generalized to codify chemical structure information for chiral drugs, making use of a trigonometric 3D-chirality correction factor. In order to evaluate the effectiveness of this novel approach in drug design, we have modeled the angiotensin-converting enzyme inhibitory activity of perindoprilate's sigma-stereoisomers combinatorial library. Two linear discriminant analysis models, using non-stochastic and stochastic linear indices, were obtained. The models showed an accuracy of 95.65% for the training set and 100% for the external prediction set. Next, the prediction of the sigma-receptor antagonists of chiral 3-(3-hydroxyphenyl)piperidines by multiple linear regression analysis was carried out. Two statistically significant QSAR models were obtained when non-stochastic (R² = 0.953 and s = 0.238) and stochastic (R² = 0.961 and s = 0.219) 3D-chiral bilinear indices were used. These models showed adequate predictive power (assessed by the leave-one-out cross-validation experiment), yielding values of q² = 0.935 (s_cv = 0.259) and q² = 0.946 (s_cv = 0.235), respectively. Finally, the prediction of the corticosteroid-binding globulin binding affinity of a steroid set was performed. The obtained results are rather similar to those of most of the 3D-QSAR approaches reported so far. The validation of this method was achieved by comparison with previous reports applied to the same data set. The non-stochastic and stochastic 3D-chiral linear indices appear to provide a very interesting alternative to other more common 3D-QSAR descriptors.
Affiliation(s)
- Juan A Castillo-Garit
- Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830 Villa Clara, Cuba.
|
20
|
Ponce Y, Khan M, Martín G, Ather A, Sultankhodzhaev M, Torrens F, Rotondo R, Alvarado Y. Atom-Based 2D Quadratic Indices in Drug Discovery of Novel Tyrosinase Inhibitors: Results of In Silico Studies Supported by Experimental Results. QSAR Comb Sci 2007. [DOI: 10.1002/qsar.200610156] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
|
21
|
Marrero-Ponce Y, Khan MTH, Casañola-Martín GM, Ather A, Sultankhodzhaev MN, García-Domenech R, Torrens F, Rotondo R. Bond-based 2D TOMOCOMD-CARDD approach for drug discovery: aiding decision-making in 'in silico' selection of new lead tyrosinase inhibitors. J Comput Aided Mol Des 2007; 21:167-88. [PMID: 17333484 DOI: 10.1007/s10822-006-9094-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 08/23/2006] [Accepted: 12/02/2006] [Indexed: 11/25/2022]
Abstract
In this paper, we present a new set of bond-level TOMOCOMD-CARDD molecular descriptors (MDs), the bond-based bilinear indices, based on a bilinear map similar to those defined in linear algebra. These novel MDs are used here in Quantitative Structure-Activity Relationship (QSAR) studies of tyrosinase inhibitors, to find functions that discriminate between tyrosinase inhibitors and inactive compounds. In total, 14 models were obtained, and the best two discriminant functions (Eqs. 32 and 33) showed globally good classifications of 91.00% and 90.17%, respectively, in the training set. The test set had accuracies of 93.33% and 88.89% for models 32 and 33, respectively. A simulated virtual screening was also carried out to prove the quality of the determined models. In a final step, the fitted models were used in the biosilico identification of newly synthesized tetraketones, where good agreement was observed between the theoretical and experimental results. Four of the novel bioactive chemicals discovered as tyrosinase inhibitors, TK10 (IC₅₀ = 2.09 μM), TK11 (IC₅₀ = 2.61 μM), TK21 (IC₅₀ = 2.06 μM) and TK23 (IC₅₀ = 3.19 μM), showed more potent activity than L-mimosine (IC₅₀ = 3.68 μM). In addition, a heterogeneous database of tyrosinase inhibitors was collected for this study, which could be a useful tool for scientists working on tyrosinase research. The current report could provide some clues for the identification of new chemicals that inhibit the enzyme tyrosinase and may enter the drug discovery pipeline.
Affiliation(s)
- Yovani Marrero-Ponce
- Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, Poligon la Coma s/n (detras de Canal Nou), Valencia, Spain.
|
22
|
Alvarez-Ginarte YM, Marrero-Ponce Y, Ruiz-García JA, Montero-Cabrera LA, García de la Vega JM, Noheda Marin P, Crespo-Otero R, Zaragoza FT, García-Domenech R. Applying pattern recognition methods plus quantum and physico-chemical molecular descriptors to analyze the anabolic activity of structurally diverse steroids. J Comput Chem 2007; 29:317-33. [PMID: 17639502 DOI: 10.1002/jcc.20745] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022]
Abstract
The great cost associated with the development of new anabolic-androgenic steroids (AASs) makes it necessary to develop computational methods that shorten the drug discovery pipeline. Toward this end, quantum and physicochemical molecular descriptors, plus linear discriminant analysis (LDA), were used to analyze the anabolic/androgenic activity of structurally diverse steroids and to discover novel AASs, as well as to give a structural interpretation of their anabolic-androgenic ratio (AAR). The obtained models correctly classify 91.67% (86.27%) of the AASs in the training (test) sets. The results of predictions on the 10% full-out cross-validation test also evidence the robustness of the obtained model. Moreover, these classification functions are applied to an "in house" library of chemicals to find novel AASs. Two new AASs are synthesized and tested for in vivo activity. Although both AASs are less active than some commercial AASs, this result leaves the door open to a virtual variational study of the structure of the two compounds to improve their biological activity. The LDA-assisted QSAR models presented here could significantly reduce the number of synthesized and tested AASs and could increase the chance of finding new chemical entities with higher AAR.
|
23
|
Marrero-Ponce Y, Torrens F, Alvarado YJ, Rotondo R. Bond-based global and local (bond, group and bond-type) quadratic indices and their applications to computer-aided molecular design. 1. QSPR studies of diverse sets of organic chemicals. J Comput Aided Mol Des 2006; 20:685-701. [PMID: 17186417 DOI: 10.1007/s10822-006-9089-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 08/18/2006] [Accepted: 10/18/2006] [Indexed: 11/26/2022]
Abstract
The concept of atom-based quadratic indices is extended to a series of molecular descriptors (MDs) (both total and local) based on adjacency between edges. The kth edge-adjacency matrix (Eᵏ) denotes the matrix of bond-based quadratic indices (non-stochastic) with respect to the canonical basis set. The kth "stochastic" edge-adjacency matrix, ESᵏ, is here proposed as a new molecular representation easily calculated from Eᵏ. Then, the kth stochastic bond-based quadratic indices are calculated using ESᵏ as operators of quadratic transformations. The study of six representative physicochemical properties of octane isomers was used to compare the ability of both series of MDs to produce significant quantitative structure-property relationship (QSPR) models. Moreover, the general performance of the new MDs in this QSPR study has been evaluated with respect to other well-known 2D/3D sets of indices, and the obtained results showed quite satisfactory behavior of the present method. The novel bond-level MDs were also used for the description and prediction of the boiling point of 28 alkyl-alcohols and for the modeling of the specific rate constant (log k) of 34 derivatives of 2-furylethylenes. These models were statistically significant and showed very good stability to data variation in leave-one-out (LOO) cross-validation experiments. The comparison with other approaches (edge- and vertex-based connectivity indices, total and local spectral moments, and quantum chemical descriptors as well as E-state/biomolecular encounter parameters) shows the good behavior of our method in these QSPR studies. The approach described in this report appears to be a very promising structural invariant, useful for QSPR/QSAR studies, similarity/diversity analysis, and computer-aided "rational" molecular (drug) design.
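The construction described in this abstract can be illustrated with a minimal sketch. It assumes that the kth matrix is the kth power of the bond (edge) adjacency matrix, that the "stochastic" matrix is obtained by dividing each row by its sum, and that an index is the quadratic form w^T M w over a vector w of bond weights. The row-normalisation rule, the toy molecule, and the weight values are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch only: the row-normalisation rule, the bond ordering,
# and the bond weights below are assumptions, not taken from the paper.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(m, k):
    """Return the kth power of m (k = 0 gives the identity matrix)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result


def row_stochastic(m):
    """Divide each row by its sum; all-zero rows are left as zeros."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def quadratic_index(m, w):
    """Evaluate the quadratic form w^T m w."""
    n = len(w)
    return sum(w[i] * m[i][j] * w[j] for i in range(n) for j in range(n))


# n-Butane has three C-C bonds; two bonds are adjacent if they share an atom.
E = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
w = [1.0, 2.0, 3.0]  # hypothetical bond weights

non_stochastic = [quadratic_index(mat_power(E, k), w) for k in range(3)]
stochastic = [quadratic_index(row_stochastic(mat_power(E, k)), w)
              for k in range(3)]
```

Each value of k yields one descriptor, so a molecule is encoded as a short vector of indices that can then be fed to a regression or discriminant model.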
Affiliation(s)
- Yovani Marrero-Ponce
- Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Central University of Las Villas, Santa Clara, Villa Clara, 54830, Cuba.
|
24
|
Marrero-Ponce Y, Meneses-Marcel A, Castillo-Garit JA, Machado-Tugores Y, Escario JA, Barrio AG, Pereira DM, Nogal-Ruiz JJ, Arán VJ, Martínez-Fernández AR, Torrens F, Rotondo R, Ibarra-Velarde F, Alvarado YJ. Predicting antitrichomonal activity: A computational screening using atom-based bilinear indices and experimental proofs. Bioorg Med Chem 2006; 14:6502-24. [PMID: 16875830 DOI: 10.1016/j.bmc.2006.06.016] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Received: 01/10/2006] [Revised: 06/06/2006] [Accepted: 06/08/2006] [Indexed: 11/30/2022]
Abstract
Existing Trichomonas vaginalis therapies are out of reach for most people with trichomoniasis in developing countries and, where available, they are limited by their toxicity (mainly in pregnant women) and their cost. New antitrichomonal agents are needed to combat emerging metronidazole-resistant trichomoniasis and reduce the side effects associated with currently available drugs. Toward this end, atom-based bilinear indices, a new TOMOCOMD-CARDD molecular descriptor, and linear discriminant analysis (LDA) were used to discover novel, potent, and non-toxic lead trichomonacidal chemicals. Two discriminant functions were obtained with the use of non-stochastic and stochastic atom-type bilinear indices for heteroatoms and H-bonding of heteroatoms. These atomic-level molecular descriptors were calculated using a weighting scheme that includes four atomic labels, namely atomic masses, van der Waals volumes, atomic polarizabilities, and atomic electronegativities in Pauling scale. The obtained LDA-based QSAR models, using non-stochastic and stochastic indices, were able to classify correctly 94.51% (90.63%) and 93.41% (93.75%) of the chemicals in training (test) sets, respectively. They showed large Matthews' correlation coefficients (C); 0.89 (0.79) and 0.87 (0.85), for the training (test) sets, correspondingly. The result of predictions on the 15% full-out cross-validation test also evidenced the robustness and predictive power of the obtained models. In addition, canonical regression analyses corroborated the statistical quality of these models (R_can of 0.749 and of 0.845, correspondingly); they were also used to compute biological activity canonical scores for each compound. On the other hand, a close inspection of the molecular descriptors included in both equations showed that several of these molecular fingerprints are strongly interrelated with each other. Therefore, these models were orthogonalized using the Randić orthogonalization procedure. 
These classification functions were then applied to find new lead antitrichomonal agents, and six compounds were selected as possible active compounds by computational screening. The designed compounds were synthesized and tested for in vitro activity against T. vaginalis. Of the six compounds that were designed and synthesized, three molecules (chemicals VA5-5a, VA5-5c, and VA5-12b) showed high to moderate cytocidal activity at a concentration of 10 μg/mL, two other compounds (VA5-8pre and VA5-8) showed high cytocidal and cytostatic activity at concentrations of 100 μg/mL and 10 μg/mL, respectively, and the remaining chemical (compound VA5-5e) was inactive at the assayed concentrations. Nonetheless, these compounds possess structural features not seen in known trichomonacidal compounds and thus can serve as excellent leads for further optimization of antitrichomonal activity. The LDA-based QSAR models presented here can be considered a computer-assisted system that could significantly reduce the number of synthesized and tested compounds and increase the chance of finding new chemical entities with antitrichomonal activity.
Affiliation(s)
- Yovani Marrero-Ponce
- Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, Poligon la Coma s/n (detras de Canal Nou), PO Box 22085, E-46071 Valencia, Spain.
|
25
|
Castillo-Garit JA, Marrero-Ponce Y, Torrens F. Atom-based 3D-chiral quadratic indices. Part 2: Prediction of the corticosteroid-binding globulin-binding affinity of the 31 benchmark steroids data set. Bioorg Med Chem 2006; 14:2398-408. [PMID: 16325409 DOI: 10.1016/j.bmc.2005.11.024] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 10/14/2005] [Revised: 11/09/2005] [Accepted: 11/09/2005] [Indexed: 10/25/2022]
Abstract
A quantitative structure-activity relationship (QSAR) study to predict the relative affinities of the steroid 'benchmark' data set to the corticosteroid-binding globulin (CBG) is described. It is shown that the 3D-chiral quadratic indices closely correlate with the measured CBG affinity values for the 31 steroids. The calculated descriptors were correlated with biological data through multiple linear regressions. Two statistically significant models were obtained when non-stochastic (R = 0.924 and s = 0.46) as well as stochastic (R = 0.929 and s = 0.46) 3D-chiral quadratic indices were used. A leave-one-out (LOO) approach to model validation is used here; the best results obtained in the cross-validation procedure with non-stochastic (q² = 0.781) and stochastic (q² = 0.735) 3D-chiral quadratic indices are better than or similar to those of most of the 3D-QSAR approaches reported so far. These results support the idea that the 3D-chiral quadratic indices may be helpful in the prediction of the corticosteroid-binding affinity of new compounds.
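The leave-one-out statistic reported in this abstract can be sketched as follows. The example assumes the common definition q² = 1 − PRESS / SS_tot, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it and SS_tot is taken around the mean of all observed values. For brevity it fits a single-descriptor line rather than the multiple linear regression used in the study, and the descriptor and activity values are hypothetical.

```python
# Illustrative sketch only: one descriptor instead of a multiple regression,
# and the x/y values are hypothetical, not data from the paper.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical index values
activity = [2.1, 3.9, 6.2, 7.8, 10.1]    # hypothetical measured affinities
q2 = loo_q2(descriptor, activity)
```

Because every prediction comes from a model that never saw the compound being predicted, q² is normally lower than the fitted R², which is why the two are reported side by side.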
Affiliation(s)
- Juan A Castillo-Garit
- Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
26
|
Vega MC, Montero-Torres A, Marrero-Ponce Y, Rolón M, Gómez-Barrio A, Escario JA, Arán VJ, Nogal JJ, Meneses-Marcel A, Torrens F. New ligand-based approach for the discovery of antitrypanosomal compounds. Bioorg Med Chem Lett 2006; 16:1898-904. [PMID: 16455249 DOI: 10.1016/j.bmcl.2005.12.087] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Received: 11/15/2005] [Revised: 12/23/2005] [Accepted: 12/27/2005] [Indexed: 11/23/2022]
Abstract
The antitrypanosomal activity of 10 already synthesized compounds was predicted in silico and explored in vitro and in vivo against Trypanosoma cruzi. For the computational study, an approach based on non-stochastic linear fingerprints for the identification of potential antichagasic compounds is introduced. Molecular structures of 66 organic compounds, 28 with antitrypanosomal activity and 38 having other clinical uses, were parameterized by means of the TOMOCOMD-CARDD software. A linear classification function was derived allowing the discrimination between active and inactive compounds with a confidence of 95%. As predicted, seven compounds showed antitrypanosomal activity (%AE > 70) against epimastigote forms of T. cruzi at a concentration of 100 μg/mL. After an unspecific cytotoxic assay, three compounds were evaluated against amastigote forms of the parasite. An in vivo test was carried out for one of the studied compounds.
Affiliation(s)
- María Celeste Vega
- Department of Parasitology, Faculty of Pharmacy, Universidad Complutense de Madrid, 28040 Madrid, Spain
|
27
|
Marrero-Ponce Y, Marrero RM, Torrens F, Martinez Y, Bernal MG, Zaldivar VR, Castro EA, Abalo RG. Non-stochastic and stochastic linear indices of the molecular pseudograph’s atom-adjacency matrix: a novel approach for computational in silico screening and “rational” selection of new lead antibacterial agents. J Mol Model 2005; 12:255-71. [PMID: 16270182 DOI: 10.1007/s00894-005-0024-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Received: 09/16/2004] [Accepted: 06/20/2005] [Indexed: 11/25/2022]
Abstract
A novel approach (TOMOCOMD-CARDD) to computer-aided "rational" drug design is illustrated. This approach is based on the calculation of the non-stochastic and stochastic linear indices of the molecular pseudograph's atom-adjacency matrix representing molecular structures. These TOMOCOMD-CARDD descriptors are introduced for the computational (virtual) screening and "rational" selection of new lead antibacterial agents using linear discriminant analysis. The two structure-based antibacterial-activity classification models, including non-stochastic and stochastic indices, correctly classify 91.61% and 90.75%, respectively, of 1525 chemicals in training sets. These models show high Matthews correlation coefficients (MCC = 0.84 and 0.82). An external validation process was carried out to assess the robustness and predictive power of the models obtained. These QSAR models permit the correct classification of 91.49% and 89.31% of 505 compounds in an external test set, yielding MCCs of 0.84 and 0.79, respectively. The TOMOCOMD-CARDD approach compares satisfactorily with nine of the most useful models for antimicrobial selection reported to date. Finally, an in silico screening of 87 new chemicals reported in the anti-infective field with antibacterial activities is carried out, showing the ability of the TOMOCOMD-CARDD models to identify new lead antibacterial compounds.
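The two statistics quoted for these classifiers, overall accuracy and the Matthews correlation coefficient, both follow from a two-class confusion matrix. The sketch below uses the standard formulas; the counts are hypothetical and are not the confusion matrices of the models in the paper.

```python
import math

# Illustrative sketch only: the confusion-matrix counts are hypothetical.

def accuracy(tp, tn, fp, fn):
    """Fraction of compounds assigned to the correct class."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a two-class confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # usual convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# tp/fn: active compounds classified as active/inactive;
# tn/fp: inactive compounds classified as inactive/active.
tp, tn, fp, fn = 90, 85, 15, 10
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_cc(tp, tn, fp, fn)
```

Unlike accuracy, the coefficient stays near zero for a classifier that simply assigns every compound to the larger class, which is useful when active and inactive sets differ in size, as they do in several of the data sets listed here.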
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
28
|
Marrero-Ponce Y, Iyarreta-Veitía M, Montero-Torres A, Romero-Zaldivar C, Brandt CA, Avila PE, Kirchgatter K, Machado Y. Ligand-Based Virtual Screening and in Silico Design of New Antimalarial Compounds Using Nonstochastic and Stochastic Total and Atom-Type Quadratic Maps. J Chem Inf Model 2005; 45:1082-100. [PMID: 16045304 DOI: 10.1021/ci050085t] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Indexed: 11/30/2022]
Abstract
Malaria has been one of the most significant public health problems for centuries. It affects many tropical and subtropical regions of the world. The increasing resistance of Plasmodium spp. to existing therapies has heightened alarms about malaria in the international health community. Nowadays, there is a pressing need for identifying and developing new drug-based antimalarial therapies. In an effort to overcome this problem, the main purpose of this study is to develop simple linear discriminant-based quantitative structure-activity relationship (QSAR) models for the classification and prediction of antimalarial activity using some of the TOMOCOMD-CARDD (TOpological MOlecular COMputer Design-Computer Aided "Rational" Drug Design) fingerprints, so as to enable computational screening from virtual combinatorial datasets. In this sense, a database of 1562 organic chemicals having great structural variability, 597 of them antimalarial agents and 965 compounds having other clinical uses, was analyzed and presented as a helpful tool, not only for theoretical chemists but also for other researchers in this area. This series of compounds was processed by a k-means cluster analysis in order to design training and predicting sets. Afterward, two linear classification functions were derived in order to discriminate between antimalarial and nonantimalarial compounds. The models (including nonstochastic and stochastic indices) correctly classify more than 93% of the compound set, in both training and external prediction datasets. They showed high Matthews' correlation coefficients, 0.889 and 0.866 for the training set and 0.855 and 0.857 for the test one. The models' predictivity was also assessed and validated by the random removal of 10% of the compounds to form a new test set, for which predictions were made using the models. 
The overall means of correct classification for this process (leave-group-out cross-validation with 10% of the compounds removed) using the equations with nonstochastic and stochastic atom-based quadratic fingerprints were 93.93% and 92.77%, respectively. The quadratic maps-based TOMOCOMD-CARDD approach implemented in this work compared favorably with four of the most useful models for antimalarial selection reported to date. The developed models were then used in a simulation of a virtual search for Ras FTase (FTase = farnesyltransferase) inhibitors with antimalarial activity; 70% and 100% of the 10 inhibitors used in this virtual search were correctly classified, showing the ability of the models to identify new lead antimalarials. Finally, these two QSAR models were used in the identification of previously unknown antimalarials. In this sense, three synthetic intermediates of quinolinic compounds were evaluated as active or inactive using the developed models. The synthesis and biological evaluation of these chemicals against two malaria strains, using chloroquine as a reference, were performed. An accuracy of 100% with respect to the theoretical predictions was observed. Compound 3 showed antimalarial activity, being the first report of an arylaminomethylenemalonate having such behavior. This result opens a door to a virtual study considering a higher variability of the structural core already evaluated, as well as of other chemicals not included in this study. We conclude that the approach described here seems to be a promising QSAR tool for the molecular discovery of novel classes of antimalarial drugs, which may meet the dual challenges posed by drug-resistant parasites and the rapid progression of malaria illnesses.
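The abstract above reports model quality as a percentage of correct classification and as Matthews' correlation coefficient (MCC). As a minimal sketch of how both statistics follow from a binary confusion matrix, the Python below uses hypothetical counts; the function names and the numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from the four cells of a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is set to 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp, tn, fp, fn):
    """Fraction of compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts (NOT from the paper): actives are the positive class.
tp, tn, fp, fn = 90, 95, 5, 10
print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

Unlike plain accuracy, MCC uses all four cells, so it stays informative when the active and inactive classes differ in size, as they do in the 597 versus 965 compound database described above.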
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical Pharmacy and Department of Drug Design, Chemical Bioactive Center, Central University of Las Villas, Santa Clara, 54830 Villa Clara, Cuba.
29
|
Marrero-Ponce Y, Medina-Marrero R, Castillo-Garit JA, Romero-Zaldivar V, Torrens F, Castro EA. Protein linear indices of the ‘macromolecular pseudograph α-carbon atom adjacency matrix’ in bioinformatics. Part 1: Prediction of protein stability effects of a complete set of alanine substitutions in Arc repressor. Bioorg Med Chem 2005; 13:3003-15. [PMID: 15781410 DOI: 10.1016/j.bmc.2005.01.062] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Received: 12/22/2004] [Revised: 01/28/2005] [Accepted: 01/31/2005] [Indexed: 10/25/2022]
Abstract
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole protein) and local (one or more amino acids) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on $\mathbb{R}^n$ [$f_k(x_{mi}): \mathbb{R}^n \to \mathbb{R}^n$] in the canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on $\mathbb{R}^n$. That is, the kth total linear indices are linear maps from $\mathbb{R}^n$ to the scalars $\mathbb{R}$ [$f_k(x_m): \mathbb{R}^n \to \mathbb{R}$]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from the reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test set, respectively. It shows a high Matthews correlation coefficient (MCC = 0.952) for the training set and an MCC = 0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model ($R_{canc}$ = 0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with the linear regression one in predicting the melting temperature ($t_m$) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental $t_m$ (R = 0.90 and s = 4.29), and the leave-one-out (LOO) press statistics evidenced its predictive ability ($q^2$ = 0.72 and $s_{cv}$ = 4.79). 
Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R = 0.97) between protein backbone descriptors and $t_m$ values for alanine mutants of the Arc repressor. A break-point value of 51.87 °C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of this folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
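The abstract above states that local linear indices come from the kth power of the alpha-carbon adjacency matrix and that the kth total index is the sum of the local ones. A minimal Python sketch of that arithmetic follows, assuming the usual form of a linear map, f_k(x_i) = Σ_j a_ij^(k) x_j, where a_ij^(k) are the entries of the kth matrix power. The three-residue chain, the property values, and all function names are hypothetical illustrations, not the TOMOCOMD-CAMPS implementation.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, k):
    """kth power of a square matrix (k = 0 gives the identity)."""
    n = len(m)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, m)
    return result


def local_linear_indices(m, x, k):
    """One index per residue: row i of M^k applied to the property vector x."""
    mk = mat_pow(m, k)
    return [sum(mk[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]


def total_linear_index(m, x, k):
    """kth total index: sum of the local indices over all residues."""
    return sum(local_linear_indices(m, x, k))


# Hypothetical 3-residue chain (consecutive alpha-carbons are adjacent)
# with made-up amino-acid property values.
M = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
x = [1.0, 2.0, 3.0]

for k in range(3):
    print(k, local_linear_indices(M, x, k), total_linear_index(M, x, k))
```

Raising k lets each residue's index collect property values from residues that are more steps away along the graph, which is why a family of indices over several k values is used rather than a single one.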
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara, 54830 Villa Clara, Cuba.
30
|
Ponce YM, Marrero RM, Castro EA, Ramos de Armas R, Díaz HG, Zaldivar VR, Torrens F. Protein quadratic indices of the "macromolecular pseudograph's alpha-carbon atom adjacency matrix". 1. Prediction of Arc repressor alanine-mutant's stability. Molecules 2004; 9:1124-47. [PMID: 18007508 DOI: 10.3390/91201124] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 06/02/2004] [Revised: 12/12/2004] [Accepted: 12/13/2004] [Indexed: 11/16/2022] Open
Abstract
This report describes a new set of macromolecular descriptors of relevance to protein QSAR/QSPR studies: the protein's quadratic indices. These descriptors are calculated from the macromolecular pseudograph's alpha-carbon atom adjacency matrix. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. Quantitative Structure-Stability Relationship (QSSR) models allow discriminating between near wild-type stability and reduced-stability A-mutants. A linear discriminant function discriminates well between near wild-type stability and reduced-stability mutants, correctly classifying 85.4% (35/41) and 91.67% (11/12) of them in the training and test series, respectively. The model's overall predictability ranges from 80.49% to 82.93% when n varies from 2 to 10 in leave-n-out cross-validation procedures, and stabilizes around 80.49% when n > 6. Additionally, canonical regression analysis corroborates the statistical quality of the classification model ($R_{canc}$ = 0.72, p-level < 0.0001). This analysis was also used to compute biological stability canonical scores for each Arc A-mutant. On the other hand, the nonlinear piecewise regression model compares favorably with the linear regression one in predicting the melting temperature ($t_m$) of the Arc A-mutants. The linear model explains almost 72% of the variance of the experimental $t_m$ (R = 0.85 and s = 5.64), and the leave-one-out (LOO) press statistics evidenced its predictive ability ($q^2$ = 0.55 and $s_{cv}$ = 6.24). However, this linear regression model fails to resolve $t_m$ predictions of Arc A-mutants in the external prediction series. Therefore, the use of nonlinear piecewise models was required. The $t_m$ values of A-mutants in the training (R = 0.94) and test (R = 0.91) sets are calculated by the piecewise model with a high degree of precision. A break-point value of 51.32 °C characterizes two mutant clusters and coincides perfectly with the experimental scale. 
For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permit the interpretation of the driving forces of such a folding process. The models include protein quadratic indices accounting for hydrophobic ($z_1$), bulk-steric ($z_2$), and electronic ($z_3$) features of the studied molecules. The preponderance of $z_1$ and $z_3$ over $z_2$ indicates the higher importance of the hydrophobic and electronic side-chain terms in the folding of the Arc dimer. In this sense, the developed equations involve short-reaching ($k \le 3$), middle-reaching ($3 < k \le 7$), and far-reaching ($k \ge 8$) $z_{1,2,3}$ protein quadratic indices. This situation indicates that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its A-mutants. Consequently, the present approach represents a novel and very promising avenue for mathematical research in the biological sciences.
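The abstract above derives quadratic indices from the alpha-carbon adjacency matrix and groups them by reach according to k. As a minimal sketch, the Python below assumes the standard quadratic-form reading, q_k(x) = Σ_i Σ_j a_ij^(k) x_i x_j, with a_ij^(k) the entries of the kth matrix power. That formula is an assumption not spelled out in the abstract; the three-residue chain, the property values, and the function names are hypothetical. Only the k ranges in `reach_class` are taken from the abstract.

```python
def mat_vec(m, v):
    """Matrix-vector product for a square matrix given as a list of rows."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def quadratic_index(m, x, k):
    """q_k(x) = x^T M^k x, computed by applying M to x a total of k times."""
    v = list(x)
    for _ in range(k):
        v = mat_vec(m, v)
    return sum(xi * vi for xi, vi in zip(x, v))


def reach_class(k):
    """Reach label for order k, following the ranges quoted in the abstract."""
    if k <= 3:
        return "short-reaching"
    if k <= 7:
        return "middle-reaching"
    return "far-reaching"


# Hypothetical 3-residue chain with made-up property values
# (for example a hydrophobicity-like z1 scale).
M = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
z1 = [1.0, 2.0, 3.0]

for k in (0, 1, 2, 8):
    print(k, reach_class(k), quadratic_index(M, z1, k))
```

Computing one such family per property scale ($z_1$, $z_2$, $z_3$) would give the kind of descriptor set the abstract describes, from which a discriminant or regression model then selects a few terms.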
Affiliation(s)
- Yovani Marrero Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara 54830, Villa Clara, Cuba.