Reference Citation Analysis

For: Valdés-Martiní JR, Marrero-Ponce Y, García-Jacas CR, Martinez-Mayorga K, Barigye SJ, Vaz d'Almeida YS, Pham-The H, Pérez-Giménez F, Morell CA. QuBiLS-MAS, open source multi-platform software for atom- and bond-based topological (2D) and chiral (2.5D) algebraic molecular descriptors computations. J Cheminform 2017;9:35. [PMID: 29086120 PMCID: PMC5462671 DOI: 10.1186/s13321-017-0211-5] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 11/08/2016] [Accepted: 04/07/2017] [Indexed: 11/10/2022] Open

Cited by Other Article(s)

Obradović D, Stavrianidi A, Fedorova E, Bogojević A, Shpigun O, Buryak A, Lazović S. A comparative study of the predictive performance of different descriptor calculation tools: Molecular-based elution order modeling and interpretation of retention mechanism for isomeric compounds from METLIN database. J Chromatogr A 2024;1719:464731. [PMID: 38377661 DOI: 10.1016/j.chroma.2024.464731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 02/08/2024] [Accepted: 02/09/2024] [Indexed: 02/22/2024]

Abstract

In the pharmaceutical industry, the need for analytical standards is a bottleneck for comprehensive evaluation and quality control of intermediate and end products. These are complex mixtures containing structurally related molecules. In this regard, chromatographic peak annotation, especially for critical pairs of isomers and closest structural analogs, can be supported by using a Quantitative Structure Retention Relationship (QSRR) approach. In our study, we investigated the fundamental basis of the reversed-phase (RP) retention mechanism for 1141 isomeric compounds from the METLIN SMRT dataset. Nine different descriptor calculation tools combined with different feature selection methods (genetic algorithm (GA), stepwise, Boruta) and machine learning (ML) approaches (support vector machine (SVM), multiple linear regression (MLR), random forest (RF), XGBoost) were applied to provide a reliable molecular structure-based interpretation of RP retention behaviour of the isomeric compounds. Strict internal and external validation metrics were used to select models with the best predictive capabilities (rtest > 0.73, order of elution > 60 %). For the developed models, mean absolute errors were in the range of 60 to 110 s. Stepwise and GA showed the most suitable performance as descriptor selection methods, while SVM and XGBoost modeling gave satisfactory predictive characteristics in most cases. Validation performed on the published experimental data for structurally related pharmaceutical compounds confirmed the best accuracy of MLR modeling in combination with GA feature selection of general physico-chemical properties. The resulting models will be useful for the prediction of separation and identification of structurally related compounds in pharmaceutical analysis, providing a simultaneous understanding of the interaction mechanisms leading to their retention under RP conditions.

Ibrahim AE, El Gohary NA, Aboushady D, Samir L, Karim SEA, Herz M, Salman BI, Al-Harrasi A, Hanafi R, El Deeb S. Recent advances in chiral selectors immobilization and chiral mobile phase additives in liquid chromatographic enantio-separations: A review. J Chromatogr A 2023;1706:464214. [PMID: 37506464 DOI: 10.1016/j.chroma.2023.464214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Revised: 07/10/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023]

Emonts J, Buyel J. An overview of descriptors to capture protein properties - Tools and perspectives in the context of QSAR modeling. Comput Struct Biotechnol J 2023;21:3234-3247. [PMID: 38213891 PMCID: PMC10781719 DOI: 10.1016/j.csbj.2023.05.022] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/13/2023] [Revised: 05/23/2023] [Accepted: 05/23/2023] [Indexed: 01/13/2024] Open

Cuesta SA, Moreno M, López RA, Mora JR, Paz JL, Márquez EA. ElectroPredictor: An Application to Predict Mayr's Electrophilicity E through Implementation of an Ensemble Model Based on Machine Learning Algorithms. J Chem Inf Model 2023;63:507-521. [PMID: 36594600 DOI: 10.1021/acs.jcim.2c01367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023]

Metwally AA, Nayel AA, Hathout RM. In silico prediction of siRNA ionizable-lipid nanoparticles In vivo efficacy: Machine learning modeling based on formulation and molecular descriptors. Front Mol Biosci 2022;9:1042720. [PMID: 36619167 PMCID: PMC9811823 DOI: 10.3389/fmolb.2022.1042720] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/12/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open

González-Castañeda Y, Marrero-Ponce Y, Guerra JO, Echevarría-Díaz Y, Pérez N, Pérez-Giménez F, Simonet AM, Macías FA, Nogueiras CM, Olazabal E, Serrano H. Computational discovery of novel anthelmintic natural compounds from Agave Brittoniana trel. Spp. Brachypus. BIONATURA 2022. [DOI: 10.21931/rb/2022.07.04.53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022] Open

Affiliation(s)

Yeniel González-Castañeda: Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA)
Yovani Marrero-Ponce: Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA); Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
Jose O. Guerra: Chemistry Department, Faculty of Chemistry-Pharmacy, Universidad Central “Marta Abreu” de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
Yunaimy Echevarría-Díaz: Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA); Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE)
Noel Pérez: Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
Facundo Pérez-Giménez: Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
Ana M. Simonet: Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
Francisco A. Macías: Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
Clara M. Nogueiras: Departamento de Química Orgánica, Facultad de Química, Universidad de La Habana
Ervelio Olazabal: Chemical Bioactive Center, Universidad Central “Marta Abreu” de Las Villas, Santa Clara
Hector Serrano: Chemical Bioactive Center, Universidad Central “Marta Abreu” de Las Villas, Santa Clara

Searching glycolate oxidase inhibitors based on QSAR, molecular docking, and molecular dynamic simulation approaches. Sci Rep 2022;12:19969. [PMID: 36402831 PMCID: PMC9675741 DOI: 10.1038/s41598-022-24196-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Accepted: 11/11/2022] [Indexed: 11/21/2022] Open

Diéguez-Santana K, Casañola-Martin GM, Torres R, Rasulev B, Green JR, González-Díaz H. Machine Learning Study of Metabolic Networks vs ChEMBL Data of Antibacterial Compounds. Mol Pharm 2022;19:2151-2163. [PMID: 35671399 PMCID: PMC9986951 DOI: 10.1021/acs.molpharmaceut.2c00029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]

In Silico Antiprotozoal Evaluation of 1,4-Naphthoquinone Derivatives against Chagas and Leishmaniasis Diseases Using QSAR, Molecular Docking, and ADME Approaches. Pharmaceuticals (Basel) 2022;15:ph15060687. [PMID: 35745607 PMCID: PMC9228275 DOI: 10.3390/ph15060687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 05/24/2022] [Accepted: 05/27/2022] [Indexed: 12/04/2022] Open

Halder AK, Moura AS, Cordeiro MNDS. Moving Average-Based Multitasking In Silico Classification Modeling: Where Do We Stand and What Is Next? Int J Mol Sci 2022;23:ijms23094937. [PMID: 35563327 PMCID: PMC9099502 DOI: 10.3390/ijms23094937] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/28/2022] [Revised: 04/24/2022] [Accepted: 04/28/2022] [Indexed: 01/27/2023] Open

De Gauquier P, Vanommeslaeghe K, Heyden YV, Mangelings D. Modelling approaches for chiral chromatography on polysaccharide-based and macrocyclic antibiotic chiral selectors: A review. Anal Chim Acta 2022;1198:338861. [DOI: 10.1016/j.aca.2021.338861] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/22/2021] [Revised: 07/12/2021] [Accepted: 07/19/2021] [Indexed: 12/25/2022]

PTML Modeling for Pancreatic Cancer Research: In Silico Design of Simultaneous Multi-Protein and Multi-Cell Inhibitors. Biomedicines 2022;10:biomedicines10020491. [PMID: 35203699 PMCID: PMC8962338 DOI: 10.3390/biomedicines10020491] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/29/2022] [Revised: 02/10/2022] [Accepted: 02/15/2022] [Indexed: 02/07/2023] Open

Liu J, Guo W, Sakkiah S, Ji Z, Yavas G, Zou W, Chen M, Tong W, Patterson TA, Hong H. Machine Learning Models for Predicting Liver Toxicity. Methods Mol Biol 2022;2425:393-415. [PMID: 35188640 DOI: 10.1007/978-1-0716-1960-5_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]

Quevedo-Tumailli V, Ortega-Tenezaca B, González-Díaz H. IFPTML Mapping of Drug Graphs with Protein and Chromosome Structural Networks vs. Pre-Clinical Assay Information for Discovery of Antimalarial Compounds. Int J Mol Sci 2021;22:ijms222313066. [PMID: 34884870 PMCID: PMC8657696 DOI: 10.3390/ijms222313066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/18/2021] [Revised: 11/23/2021] [Accepted: 11/24/2021] [Indexed: 11/16/2022] Open

Abstract

The parasite species of genus Plasmodium causes Malaria, which remains a major global health problem due to parasite resistance to available Antimalarial drugs and increasing treatment costs. Consequently, computational prediction of new Antimalarial compounds with novel targets in the proteome of Plasmodium sp. is a very important goal for the pharmaceutical industry. We can expect that the success of the pre-clinical assay depends on the conditions of assay per se, the chemical structure of the drug, the structure of the target protein to be targeted, as well as on factors governing the expression of this protein in the proteome such as genes (Deoxyribonucleic acid, DNA) sequence and/or chromosomes structure. However, there are no reports of computational models that consider all these factors simultaneously. Some of the difficulties for this kind of analysis are the dispersion of data in different datasets, the high heterogeneity of data, etc. In this work, we analyzed three databases, ChEMBL (Chemical database of the European Molecular Biology Laboratory), UniProt (Universal Protein Resource), and NCBI-GDV (National Center for Biotechnology Information—Genome Data Viewer), to achieve this goal. The ChEMBL dataset contains outcomes for 17,758 unique assays of potential Antimalarial compounds including numeric descriptors (variables) for the structure of compounds as well as a huge amount of information about the conditions of assays. The NCBI-GDV and UniProt datasets include the sequence of genes, proteins, and their functions. In addition, we also created two partitions (c_assayj = c_aj and c_dataj = c_dj) of categorical variables from the ChEMBL dataset. These partitions contain variables that encode information about experimental conditions of preclinical assays (c_aj) or about the nature and quality of data (c_dj).
These categorical variables include information about 22 parameters of biological activity (c_a0), 28 target proteins (c_a1), and 9 organisms of assay (c_a2), etc. We also created another partition (c_protj = c_pj) including categorical variables with biological information about the target proteins, genes, and chromosomes. These variables cover 32 genes (c_p0), 10 chromosomes (c_p1), gene orientation (c_p2), and 31 protein functions (c_p3). We used a Perturbation-Theory Machine Learning Information Fusion (IFPTML) algorithm to map all this information (from three databases) and train a predictive model. Shannon’s entropy measure Sh_k (numerical variables) was used to quantify the information about the structure of drugs, protein sequences, gene sequences, and chromosomes in the same information scale. Perturbation Theory Operators (PTOs) with the form of Moving Average (MA) operators have been used to quantify perturbations (deviations) in the structural variables with respect to their expected values for different subsets (partitions) of categorical variables. We obtained three IFPTML models using General Discriminant Analysis (GDA), Classification Tree with Univariate Splits (CTUS), and Classification Tree with Linear Combinations (CTLC). The IFPTML-CTLC presented the best performance with Sensitivity Sn(%) = 83.6/85.1, and Specificity Sp(%) = 89.8/89.7 for training/validation sets, respectively. This model could become a useful tool for the optimization of preclinical assays of new Antimalarial compounds vs. different proteins in the proteome of Plasmodium.

Saavedra LM, Duchowicz PR. Predicting zebrafish (Danio rerio) embryo developmental toxicity through a non-conformational QSAR approach. THE SCIENCE OF THE TOTAL ENVIRONMENT 2021;796:148820. [PMID: 34328907 DOI: 10.1016/j.scitotenv.2021.148820] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/08/2021] [Revised: 06/11/2021] [Accepted: 06/29/2021] [Indexed: 06/13/2023]

Calle L, Marrero-Ponce Y, Mora JR. Molecular simulation of the (GPx)-like antioxidant activity of ebselen derivatives through machine learning techniques. MOLECULAR SIMULATION 2021. [DOI: 10.1080/08927022.2021.1975039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]

Computational Drug Repurposing for Antituberculosis Therapy: Discovery of Multi-Strain Inhibitors. Antibiotics (Basel) 2021;10:antibiotics10081005. [PMID: 34439055 PMCID: PMC8388932 DOI: 10.3390/antibiotics10081005] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/01/2021] [Revised: 08/15/2021] [Accepted: 08/17/2021] [Indexed: 12/13/2022] Open

Carracedo-Reboredo P, Liñares-Blanco J, Rodríguez-Fernández N, Cedrón F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C. A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 2021;19:4538-4558. [PMID: 34471498 PMCID: PMC8387781 DOI: 10.1016/j.csbj.2021.08.011] [Citation(s) in RCA: 103] [Impact Index Per Article: 34.3] [Received: 03/01/2021] [Revised: 08/06/2021] [Accepted: 08/06/2021] [Indexed: 12/30/2022] Open

Affiliation(s)

Paula Carracedo-Reboredo: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
Jose Liñares-Blanco: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain; CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain
Nereida Rodríguez-Fernández: CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain; Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
Francisco Cedrón: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
Francisco J. Novoa: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
Adrian Carballal: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain; CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain; Department of Computer Science and Information Technologies, Faculty of Communication Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain
Victor Maojo: Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, Madrid 28660, Spain
Alejandro Pazos: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain; CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain; Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain
Carlos Fernandez-Lozano: Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruna, Campus Elviña s/n, A Coruña 15071, Spain; CITIC-Research Center of Information and Communication Technologies, Universidade da Coruna, A Coruña 15071, Spain; Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR), Complexo Hospitalario Universitario de A Coruña (CHUAC), SERGAS, Universidade da Coruña, Instituto de Investigación Biomédica de A Coruña (INIBIC), A Coruña, Spain

Tripathi MK, Nath A, Singh TP, Ethayathulla AS, Kaur P. Evolving scenario of big data and Artificial Intelligence (AI) in drug discovery. Mol Divers 2021;25:1439-1460. [PMID: 34159484 PMCID: PMC8219515 DOI: 10.1007/s11030-021-10256-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/31/2021] [Accepted: 06/14/2021] [Indexed: 12/24/2022]

Duchowicz PR, Bennardi DO, Ortiz EV, Comelli NC. QSAR models for insecticidal properties of plant essential oils on the housefly (Musca domestica L.). SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2021;32:395-410. [PMID: 33870800 DOI: 10.1080/1062936x.2021.1905711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2020] [Accepted: 03/16/2021] [Indexed: 06/12/2023]

Halder AK, Dias Soeiro Cordeiro MN. QSAR-Co-X: an open source toolkit for multitarget QSAR modelling. J Cheminform 2021;13:29. [PMID: 33858509 PMCID: PMC8048082 DOI: 10.1186/s13321-021-00508-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/06/2020] [Accepted: 03/31/2021] [Indexed: 12/02/2022] Open

Structure Driven Prediction of Chromatographic Retention Times: Applications to Pharmaceutical Analysis. Int J Mol Sci 2021;22:ijms22083848. [PMID: 33917733 PMCID: PMC8068189 DOI: 10.3390/ijms22083848] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/26/2021] [Revised: 04/04/2021] [Accepted: 04/06/2021] [Indexed: 11/17/2022] Open

Kleandrova VV, Scotti L, Bezerra Mendonça Junior FJ, Muratov E, Scotti MT, Speck-Planche A. QSAR Modeling for Multi-Target Drug Discovery: Designing Simultaneous Inhibitors of Proteins in Diverse Pathogenic Parasites. Front Chem 2021;9:634663. [PMID: 33777898 PMCID: PMC7987820 DOI: 10.3389/fchem.2021.634663] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/28/2020] [Accepted: 01/22/2021] [Indexed: 11/21/2022] Open

Cuesta SA, Mora JR, Márquez EA. In Silico Screening of the DrugBank Database to Search for Possible Drugs against SARS-CoV-2. Molecules 2021;26:1100. [PMID: 33669720 PMCID: PMC7923184 DOI: 10.3390/molecules26041100] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/13/2021] [Revised: 02/11/2021] [Accepted: 02/16/2021] [Indexed: 12/29/2022] Open

QSAR models for the fumigant activity prediction of essential oils. J Mol Graph Model 2020;101:107751. [DOI: 10.1016/j.jmgm.2020.107751] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/03/2020] [Revised: 08/20/2020] [Accepted: 09/04/2020] [Indexed: 12/23/2022]

Kleandrova VV, Scotti MT, Scotti L, Nayarisseri A, Speck-Planche A. Cell-based multi-target QSAR model for design of virtual versatile inhibitors of liver cancer cell lines. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2020;31:815-836. [PMID: 32967475 DOI: 10.1080/1062936x.2020.1818617] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 07/14/2020] [Accepted: 08/31/2020] [Indexed: 06/11/2023]

Barigye SJ, Gómez-Ganau S, Serrano-Candelas E, Gozalbes R. PeptiDesCalculator: Software for computation of peptide descriptors. Definition, implementation and case studies for 9 bioactivity endpoints. Proteins 2020;89:174-184. [PMID: 32881068 DOI: 10.1002/prot.26003] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/12/2020] [Revised: 08/05/2020] [Accepted: 08/27/2020] [Indexed: 11/09/2022]

Kleandrova VV, Speck-Planche A. PTML Modeling for Alzheimer’s Disease: Design and Prediction of Virtual Multi-Target Inhibitors of GSK3B, HDAC1, and HDAC6. Curr Top Med Chem 2020;20:1661-1676. [DOI: 10.2174/1568026620666200607190951] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/19/2019] [Revised: 12/12/2019] [Accepted: 01/05/2020] [Indexed: 01/23/2023]

Mora JR, Marrero-Ponce Y, García-Jacas CR, Suarez Causado A. Ensemble Models Based on QuBiLS-MAS Features and Shallow Learning for the Prediction of Drug-Induced Liver Toxicity: Improving Deep Learning and Traditional Approaches. Chem Res Toxicol 2020;33:1855-1873. [PMID: 32406679 DOI: 10.1021/acs.chemrestox.0c00030] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/16/2022]

Abstract

Drug-induced liver injury (DILI) is a key safety issue in the drug discovery pipeline and a regulatory concern. Thus, many in silico tools have been proposed to improve the hepatotoxicity prediction of organic-type chemicals. Here, classifiers for the prediction of DILI were developed by using QuBiLS-MAS 0-2.5D molecular descriptors and shallow machine learning techniques, on a training set composed of 1075 molecules. The best ensemble model built, E13, was obtained with good statistical parameters for the learning series, namely, the following: accuracy = 0.840, sensitivity = 0.890, specificity = 0.761, Matthew's correlation coefficient = 0.660, and area under the ROC curve = 0.904. The model was also satisfactorily evaluated with a Y-scrambling test, repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external validation was also carried out by using two test sets and five external test sets, with an average accuracy value equal to 0.854 (±0.062) and a coverage equal to 98.4% according to its applicability domain. A statistical comparison of the performance of the E13 model, with regard to results and tools (e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead) reported in the literature, was also performed. In general, E13 presented the best global performance in all experiments. The sum of the ranking differences procedure provided a very similar grouping pattern to that of the M-ANOVA statistical analysis, where E13 was identified as the best model for DILI predictions. A noncommercial and fully cross-platform software for the DILI prediction was also developed, which is freely available at http://tomocomd.com/apps/ptoxra.
This software was used for the screening of seven data sets, containing natural products, leads, toxic materials, and FDA approved drugs, to assess the usefulness of the QSAR models in the DILI labeling of organic substances; it was found that 50-92% of the evaluated molecules are positive-DILI compounds. All in all, it can be stated that the E13 model is a relevant method for the prediction of DILI risk in humans, as it shows the best results among all of the methods analyzed.

Toropova AP, Duchowicz PR, Saavedra LM, Castro EA, Toropov AA. The Use of the Index of Ideality of Correlation to Build Up Models for Bioconcentration Factor. Mol Inform 2020;39:e1900070. [DOI: 10.1002/minf.201900070] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/12/2019] [Accepted: 12/24/2019] [Indexed: 01/16/2023]

Fioressi SE, Bacelo DE, Aranda JF, Duchowicz PR. Prediction of the aqueous solubility of diverse compounds by 2D-QSPR. J Mol Liq 2020. [DOI: 10.1016/j.molliq.2020.112572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]

Contreras-Torres E, Marrero-Ponce Y, Terán JE, García-Jacas CR, Brizuela CA, Sánchez-Rodríguez JC. MuLiMs-MCoMPAs: A Novel Multiplatform Framework to Compute Tensor Algebra-Based Three-Dimensional Protein Descriptors. J Chem Inf Model 2020;60:1042-1059. [PMID: 31663741 DOI: 10.1021/acs.jcim.9b00629] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/08/2023]

Abstract

This report introduces the MuLiMs-MCoMPAs software (acronym for Multi-Linear Maps based on N-Metric and Contact Matrices of 3D Protein and Amino-acid weightings), designed to compute tensor-based 3D protein structural descriptors by applying two- and three-linear algebraic forms. Moreover, these descriptors contemplate generalizing components such as novel 3D protein structural representations, (dis)similarity metrics, and multimetrics to extract geometrical related information between two and three amino acids, weighting schemes based on amino acid properties, matrix normalization procedures that consider simple-stochastic and mutual probability transformations, topological and geometrical cutoffs, amino acid, and group-based MD calculations, and aggregation operators for merging amino acidic and group MDs. The MuLiMs-MCoMPAs software, which belongs to the ToMoCoMD-CAMPS suite, was developed in Java (version 1.8) using the Chemistry Development Kit (CDK) (version 1.4.19) and the Jmol libraries. This software implemented a divide-and-conquer strategy to parallelize the computation of the indices as well as modules for data preprocessing and batch computing functionalities. Furthermore, it consists of two components: (i) a desktop-graphical user interface (GUI) and (ii) an API library. The relevance of this novel approach is demonstrated through two analyses that considered Shannon's entropy-based variability and a principal component analysis. These studies showed that the MuLiMs-MCoMPAs' three-linear descriptor family contains higher informational entropy than several other descriptors generated with available computation tools. Moreover, the MuLiMs-MCoMPAs indices capture additional orthogonal information to the one codified by the available calculation approaches. As a result, two sets of suggested theoretical configurations that contain 13648 two-linear indices and 20263 three-linear indices are available for download at tomocomd.com.
Furthermore, as a demonstration of the applicability and easy integration of the MuLiMs library into a QSAR-based expert system, a software application (ProStAF) was generated to predict SCOP protein structural classes and folding rate. It can thus be anticipated that the MuLiMs-MCoMPAs framework will turn into a valuable contribution to the chem- and bioinformatics research fields.
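
The Shannon's entropy-based variability analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes an equal-width binning of the descriptor values; the function name `shannon_entropy` and the `n_bins` parameter are hypothetical and are not taken from the MuLiMs-MCoMPAs software.

```python
import math
from collections import Counter

def shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of one descriptor column.

    Assumption: values are discretized into n_bins equal-width bins
    spanning [min(values), max(values)].
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant descriptor carries no information.
        return 0.0
    width = (hi - lo) / n_bins
    # Map every value to a bin index; the maximum falls into the last bin.
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

Under this sketch, a descriptor whose values spread evenly over the bins scores close to log2(n_bins), while a near-constant descriptor scores close to zero.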

Affiliation(s)

Ernesto Contreras-Torres Computer-Aided Molecular "Biosilico" Discovery and Bioinformatics Research International Network (CAMD-BIR IN) , Cumbayá, Quito , Ecuador.,Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ) , Universidad San Francisco de Quito (USFQ) , Diego de Robles y vía Interoceánica , Quito 170157 , Pichincha , Ecuador
Yovani Marrero-Ponce Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ) , Universidad San Francisco de Quito (USFQ) , Diego de Robles y vía Interoceánica , Quito 170157 , Pichincha , Ecuador.,Grupo GINUMED, Facultad de Salud, Programa de Medicina , Corporacion Universitaria Rafal Nuñez , Cartagena , Colombia.,Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia , Universitat de València , 46010 Valéncia , Spain
Julio E Terán Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas; and Instituto de Simulación Computacional (ISC-USFQ) , Universidad San Francisco de Quito (USFQ) , Diego de Robles y vía Interoceánica , Quito 170157 , Pichincha , Ecuador.,Grupo de Química Computacional y Teórica, Departamento de Ingeniería Química , Universidad San Francisco de Quito (USFQ) , Diego de Robles y vía Interoceánica , Quito 170157 , Pichincha Ecuador
César R García-Jacas Cátedras Conacyt-Departamento de Ciencias de la Computación , Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE) , Ensenada , Baja California , México
Carlos A Brizuela Departamento de Ciencias de la Computación , Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE) , Ensenada , Baja California , México
Juan Carlos Sánchez-Rodríguez Dirección de Tecnología , Universidad de las Ciencias Informáticas (UCI) , La Habana , Cuba

Saavedra LM, Romanelli GP, Duchowicz PR. A non-conformational QSAR study for plant-derived larvicides against Zika Aedes aegypti L. vector. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2020;27:6205-6214. [PMID: 31865579 DOI: 10.1007/s11356-019-06630-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/23/2019] [Accepted: 09/25/2019] [Indexed: 06/10/2023]

Duchowicz PR, Aranda JF, Bacelo DE, Fioressi SE. QSPR study of the Henry’s law constant for heterogeneous compounds. Chem Eng Res Des 2020. [DOI: 10.1016/j.cherd.2019.12.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]

Marrero-Ponce Y, Teran JE, Contreras-Torres E, García-Jacas CR, Perez-Castillo Y, Cubillan N, Peréz-Giménez F, Valdés-Martini JR. LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs. J Theor Biol 2020;485:110039. [DOI: 10.1016/j.jtbi.2019.110039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2019] [Revised: 09/11/2019] [Accepted: 10/02/2019] [Indexed: 11/28/2022]

Halder AK, Giri AK, Cordeiro MNDS. Multi-Target Chemometric Modelling, Fragment Analysis and Virtual Screening with ERK Inhibitors as Potential Anticancer Agents. Molecules 2019;24:molecules24213909. [PMID: 31671605 PMCID: PMC6864583 DOI: 10.3390/molecules24213909] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2019] [Revised: 10/21/2019] [Accepted: 10/25/2019] [Indexed: 02/07/2023] Open

When global and local molecular descriptors are more than the sum of its parts: Simple, But Not Simpler? Mol Divers 2019;24:913-932. [PMID: 31659696 DOI: 10.1007/s11030-019-10002-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2019] [Accepted: 10/09/2019] [Indexed: 01/29/2023]

Abstract

In this report, we introduce a set of aggregation operators (AOs) to calculate global and local (group and atom type) molecular descriptors (MDs) as a generalization of the classical approach of molecular encoding using the sum of the atomic (or fragment) contributions. These AOs are implemented in new, free software named MD-LOVIs (http://tomocomd.com/md-lovis), which allows for the calculation of MDs from atomic weights vectors and LOVIs (local vertex invariants). This software was developed in the Java programming language and employed the Chemistry Development Kit (CDK) library for handling chemical structures and the calculation of atomic weights. An analysis of the complexities of the algorithms presented herein demonstrates that these aspects were efficiently implemented. The calculation speed experiments show that the MD-LOVIs software has satisfactory behavior when compared to software such as Padel, CDKDescriptor, DRAGON and Bluecal. Shannon's entropy (SE)-based variability studies demonstrate that MD-LOVIs yields indices with greater information content when compared to those of popular academic and commercial software. A principal component analysis reveals that our approach captures chemical information orthogonal to that codified by the DRAGON, Padel and Mold2 software, as a result of the several generalizations in MD-LOVIs not used in other programs. Lastly, three QSARs were built using multiple linear regression with genetic algorithms, and the statistical parameters of these models demonstrate that the MD-LOVIs indices obtained with AOs yield better performance than those obtained when the summation operator is used exclusively. Moreover, it is also revealed that the MD-LOVIs indices yield models with comparable to superior performance when compared to other QSAR methodologies reported in the literature, despite their simplicity.
The studies performed herein collectively demonstrated that MD-LOVIs software generates indices as simple as possible, but not simpler and that use of AOs enhances the diversity of the chemical information codified, which consequently improves the performance of traditional MDs.
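
The idea of replacing the classical sum of atomic contributions with other aggregation operators can be sketched as follows. The four operators and the function name `aggregate` are illustrative assumptions; the abstract does not list which operators MD-LOVIs implements.

```python
import math
import statistics

def aggregate(lovis, operator="sum"):
    """Collapse a vector of atom-level contributions (LOVIs) into one
    molecule-level descriptor using the chosen aggregation operator."""
    if operator == "sum":             # classical approach
        return sum(lovis)
    if operator == "mean":            # mean-based operator
        return statistics.fmean(lovis)
    if operator == "euclidean_norm":  # norm-based operator
        return math.sqrt(sum(x * x for x in lovis))
    if operator == "std":             # statistic-based operator (population SD)
        return statistics.pstdev(lovis)
    raise ValueError(f"unknown operator: {operator}")

# Hypothetical atomic contributions for a two-atom fragment.
lovis = [3.0, 4.0]
descriptors = {op: aggregate(lovis, op)
               for op in ("sum", "mean", "euclidean_norm", "std")}
```

Each operator yields a different global value from the same atomic vector, which is the sense in which AOs can diversify the encoded information.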

Terán JE, Marrero-Ponce Y, Contreras-Torres E, García-Jacas CR, Vivas-Reyes R, Terán E, Torres FJ. Tensor Algebra-based Geometrical (3D) Biomacro-Molecular Descriptors for Protein Research: Theory, Applications and Comparison with other Methods. Sci Rep 2019;9:11391. [PMID: 31388082 PMCID: PMC6684663 DOI: 10.1038/s41598-019-47858-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2019] [Accepted: 07/22/2019] [Indexed: 11/16/2022] Open

Affiliation(s)

Julio E Terán Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador.,Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, and Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador
Yovani Marrero-Ponce Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador. .,Universidad de San Buenaventura - Cartagena - Facultad de Ciencias de la Salud - Grupo de Investigación Microbiología & Ambiente (GIMA) - Calle Real de Ternera, Diagonal 32, No. 30-966, Cartagena, Código postal: 1300 10, Colombia.
Ernesto Contreras-Torres Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
César R García-Jacas Cátedras CONACYT - Departamento de Ciencia de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, Mexico
Ricardo Vivas-Reyes Grupo de Química Cuántica y Teórica de la Universidad de Cartagena-Facultad de Ciencias Exactas y Naturales. Programa de Química. Campus de San Pablo and Grupo GINUMED Corporacion Universitaria Rafal Nuñez. Facultad de Salud. Programa de Medicina., Cartagena, Colombia.,Grupo CipTec, Facultad de Ingenierias. Fundacion Universitaria Tecnologico Comfenalco - Cartagena, Cartagena, Bolívar, Colombia
Enrique Terán Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
F Javier Torres Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, and Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador

Serra A, Önlü S, Coretto P, Greco D. An integrated quantitative structure and mechanism of action-activity relationship model of human serum albumin binding. J Cheminform 2019;11:38. [PMID: 31172382 PMCID: PMC6551915 DOI: 10.1186/s13321-019-0359-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2019] [Accepted: 05/22/2019] [Indexed: 01/27/2023] Open

García-Jacas CR, Marrero-Ponce Y, Cortés-Guzmán F, Suárez-Lezcano J, Martinez-Rios FO, García-González LA, Pupo-Meriño M, Martinez-Mayorga K. Enhancing Acute Oral Toxicity Predictions by using Consensus Modeling and Algebraic Form-Based 0D-to-2D Molecular Encodes. Chem Res Toxicol 2019;32:1178-1192. [PMID: 31066547 DOI: 10.1021/acs.chemrestox.9b00011] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Abstract

Quantitative structure-activity relationships (QSAR) are introduced to predict acute oral toxicity (AOT), by using the QuBiLS-MAS (acronym for quadratic, bilinear and N-Linear maps based on graph-theoretic electronic-density matrices and atomic weightings) framework for the molecular encoding. Three training sets were employed to build the models: EPA training set (5931 compounds), EPA-full training set (7413 compounds), and Zhu training set (10 152 compounds). Additionally, the EPA test set (1482 compounds) was used for the validation of the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were employed for the assessment of all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R_EPA-training ≥ 0.75 ( R = correlation coefficient) and MAE_EPA-training ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator and denoted as M19 and M22, as well as a consensus model based on the weighted average operator and denoted as M24, were selected as the best ones for each training set considered. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE_test-AD = 0.4044, MAE_ProTox-AD = 0.4067 and MAE_T3DB-AD = 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively; whereas model M22 (built on the EPA-full set) and model M24 (built on the Zhu set) present MAE_ProTox-AD = 0.3992 and MAE_T3DB-AD = 0.2286, and MAE_ProTox-AD = 0.3773 and MAE_T3DB-AD = 0.2471 on the two external sets accounted for, respectively. These outcomes were compared and statistically validated with respect to 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance. 
In addition, a retrospective study on 261 withdrawn drugs due to their toxic/side effects was performed, to assess the usefulness of prospectively using the QSAR models proposed in the labeling of chemicals. A comparison with regard to the methods from the literature was also made. As a result, model M22 has the best ability of labeling a compound as toxic according to the globally harmonized system of classification and labeling of chemicals. Therefore, it can be concluded that the models proposed, especially model M22, constitute prominent tools for studying AOT, at providing the best results among all the methods examined. A freely available software was also developed to be used in virtual screening tasks ( http://tomocomd.com/apps/ptoxra ).
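
The consensus step described in this abstract (combining base-model predictions with a minimum or weighted-average operator and scoring them by mean absolute error) can be sketched as below. All function names and the toy numbers are hypothetical; this is not the code behind models M19, M22 or M24.

```python
def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def consensus_min(predictions):
    """Minimum operator: for each compound keep the lowest prediction.

    `predictions` is a list with one list of predictions per base model.
    """
    return [min(column) for column in zip(*predictions)]

def consensus_weighted_average(predictions, weights):
    """Weighted-average operator over the base-model predictions."""
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, column)) / total
            for column in zip(*predictions)]

# Two hypothetical base models predicting two compounds.
base_predictions = [[2.0, 3.0], [1.5, 3.5]]
min_consensus = consensus_min(base_predictions)
avg_consensus = consensus_weighted_average(base_predictions, [1.0, 3.0])
```

A base-model filter such as the abstract's R ≥ 0.75 and MAE ≤ 0.5 thresholds would be applied before this step, so that only retained models enter `predictions`.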

Pham-The H, Cabrera-Pérez MÁ, Nam NH, Castillo-Garit JA, Rasulev B, Le-Thi-Thu H, Casañola-Martin GM. In Silico Assessment of ADME Properties: Advances in Caco-2 Cell Monolayer Permeability Modeling. Curr Top Med Chem 2019;18:2209-2229. [PMID: 30499410 DOI: 10.2174/1568026619666181130140350] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2018] [Revised: 10/16/2018] [Accepted: 11/19/2018] [Indexed: 11/22/2022]

Speck-Planche A. Combining Ensemble Learning with a Fragment-Based Topological Approach To Generate New Molecular Diversity in Drug Discovery: In Silico Design of Hsp90 Inhibitors. ACS OMEGA 2018;3:14704-14716. [PMID: 30555986 PMCID: PMC6289491 DOI: 10.1021/acsomega.8b02419] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/18/2018] [Accepted: 10/23/2018] [Indexed: 05/05/2023]

Abstract

Machine learning methods have revolutionized modern science, providing fast and accurate solutions to multiple problems. However, they are commonly treated as "black boxes". Therefore, in important scientific fields such as medicinal chemistry and drug discovery, machine learning methods are restricted almost exclusively to the task of performing predictions of large and heterogeneous data sets of chemicals. The lack of interpretability prevents the full exploitation of the machine learning models as generators of new chemical knowledge. This work focuses on the development of an ensemble learning model for the prediction and design of potent dual heat shock protein 90 (Hsp90) inhibitors. The model displays accuracy higher than 80% in both training and test sets. To use the ensemble model as a generator of new chemical knowledge, three steps were followed. First, a physicochemical and/or structural interpretation was provided for each molecular descriptor present in the ensemble learning model. Second, the term "pseudolinear equation" was introduced within the context of machine learning to calculate the relative quantitative contributions of different molecular fragments to the inhibitory activity against the two Hsp90 isoforms studied here. Finally, by assembling the fragments with positive contributions, new molecules were designed, being predicted as potent Hsp90 inhibitors. According to Lipinski's rule of five, the designed molecules were found to exhibit potentially good oral bioavailability, a primordial property that chemicals must have to pass early stages in drug discovery. The present approach based on the combination of ensemble learning and fragment-based topological design holds great promise in drug discovery, and it can be adapted and applied to many different scientific disciplines.
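
The Lipinski rule-of-five check mentioned at the end of this abstract can be sketched with precomputed properties. The thresholds are the standard ones (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 acceptors); the function names, the tolerance of one violation, and the example values are assumptions.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        mol_weight > 500,   # molecular weight in Da
        logp > 5,           # octanol-water partition coefficient
        h_donors > 5,       # hydrogen-bond donors
        h_acceptors > 10,   # hydrogen-bond acceptors
    ]
    return sum(checks)

def passes_rule_of_five(mol_weight, logp, h_donors, h_acceptors,
                        max_violations=1):
    """Assumption: a molecule is accepted with at most one violation."""
    return lipinski_violations(
        mol_weight, logp, h_donors, h_acceptors) <= max_violations
```

In practice the four properties would come from a cheminformatics toolkit; they are passed in as plain numbers here to keep the sketch self-contained.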

BET bromodomain inhibitors: fragment-based in silico design using multi-target QSAR models. Mol Divers 2018;23:555-572. [PMID: 30421269 DOI: 10.1007/s11030-018-9890-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2018] [Accepted: 10/30/2018] [Indexed: 12/17/2022]

García-Jacas CR, Cabrera-Leyva L, Marrero-Ponce Y, Suárez-Lezcano J, Cortés-Guzmán F, Pupo-Meriño M, Vivas-Reyes R. Choquet integral-based fuzzy molecular characterizations: when global definitions are computed from the dependency among atom/bond contributions (LOVIs/LOEIs). J Cheminform 2018;10:51. [PMID: 30362050 PMCID: PMC6755596 DOI: 10.1186/s13321-018-0306-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2018] [Accepted: 10/15/2018] [Indexed: 01/22/2023] Open

Abstract

BACKGROUND

Several topological (2D) and geometric (3D) molecular descriptors (MDs) are calculated from local vertex/edge invariants (LOVIs/LOEIs) by performing an aggregation process. To this end, norm-, mean- and statistic-based (non-fuzzy) operators are used, under the assumption that LOVIs/LOEIs are independent (orthogonal) values of one another. These operators are based on additive and/or linear measures and, consequently, they cannot be used to encode information from interrelated criteria. Thus, as LOVIs/LOEIs are not orthogonal values, then non-additive (fuzzy) measures can be used to encode the interrelation among them.

RESULTS

General approaches to compute fuzzy 2D/3D-MDs from the contribution of each atom (LOVIs) or covalent bond (LOEIs) within a molecule are proposed, by using the Choquet integral as fuzzy aggregation operator. The Choquet integral-based operator is rather different from the other operators often used for the 2D/3D-MDs calculation. It performs a reordering step to fuse the LOVIs/LOEIs according to their magnitudes and, in addition, it considers the interrelation among them through a fuzzy measure. With this operator, fuzzy definitions can be derived from traditional or recent MDs; for instance, fuzzy Randic-like connectivity indices, fuzzy Balaban-like indices, fuzzy Kier-Hall connectivity indices, among others. To demonstrate the feasibility of using this operator, the QuBiLS-MIDAS 3D-MDs were used as study case and, as a result, a module was built into the corresponding software to compute them ( http://tomocomd.com/qubils-midas ). Thus, it is the only software reported in the literature that can be employed to determine Choquet integral-based fuzzy MDs. Moreover, regression models were created on eight chemical datasets. In this way, a comparison between the results achieved by the models based on the non-fuzzy QuBiLS-MIDAS 3D-MDs with regard to the ones achieved by the models based on the fuzzy QuBiLS-MIDAS 3D-MDs was made. As a result, the models built with the fuzzy QuBiLS-MIDAS 3D-MDs achieved the best performance, which was statistically corroborated through the Wilcoxon signed-rank test.

CONCLUSIONS

All in all, it can be concluded that the Choquet integral constitutes a prominent alternative to compute fuzzy 2D/3D-MDs from LOVIs/LOEIs. In this way, better characterizations of the compounds can be obtained, which will be ultimately useful in enhancing the modelling ability of existing traditional 2D/3D-MDs.
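
The reordering and fusion steps of the Choquet integral described in the RESULTS paragraph can be sketched for non-negative atom contributions. The cardinality-based fuzzy measure used here is an illustrative assumption, chosen because it is simple to state; it is not claimed to be the measure implemented in QuBiLS-MIDAS.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative values.

    `measure` maps a frozenset of indices (a coalition) to [0, 1],
    is monotone, and returns 1 for the full index set.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])  # reordering
    total, previous = 0.0, 0.0
    for rank, idx in enumerate(order):
        # Coalition of all elements whose value is >= the current one.
        coalition = frozenset(order[rank:])
        total += (values[idx] - previous) * measure(coalition)
        previous = values[idx]
    return total

def cardinality_measure(n, q=1.0):
    """Hypothetical fuzzy measure: mu(A) = (|A| / n) ** q."""
    return lambda coalition: (len(coalition) / n) ** q

lovis = [1.0, 2.0, 3.0]
additive = choquet_integral(lovis, cardinality_measure(3, q=1.0))
non_additive = choquet_integral(lovis, cardinality_measure(3, q=2.0))
```

With q = 1 the measure is additive and the integral reduces to the arithmetic mean; with q ≠ 1 the measure is non-additive, so the result depends on how the ranked contributions interact.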

Zhang H, Ren JX, Ma JX, Ding L. Development of an in silico prediction model for chemical-induced urinary tract toxicity by using naïve Bayes classifier. Mol Divers 2018;23:381-392. [DOI: 10.1007/s11030-018-9882-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2018] [Accepted: 09/25/2018] [Indexed: 12/16/2022]

García-Jacas CR, Cabrera-Leyva L, Marrero-Ponce Y, Suárez-Lezcano J, Cortés-Guzmán F, García-González LA. GOWAWA Aggregation Operator-based Global Molecular Characterizations: Weighting Atom/bond Contributions (LOVIs/LOEIs) According to their Influence in the Molecular Encoding. Mol Inform 2018;37:e1800039. [PMID: 30070434 DOI: 10.1002/minf.201800039] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/30/2018] [Accepted: 07/13/2018] [Indexed: 11/11/2022]

Abstract

A different perspective to compute global weighted definitions of molecular descriptors from the contributions of each atom (LOVIs) or covalent bond (LOEIs) within a molecule is presented, using the generalized ordered weighted averaging - weighted averaging (GOWAWA) aggregation operator. This operator is rather different from the other norm-, mean- and statistic-based operators used up to date for the descriptors calculation from LOVIs/LOEIs. GOWAWA unifies the generalized ordered weighted averaging (GOWA) and the weighted generalized mean (WGM) functions and, in addition, it uses a smoothing parameter to assign different importance values to both functions depending on the problem under study. With the GOWAWA operator, diversity of novel global aggregations of molecular descriptors can be determined, where the influence that each atom (or covalent bond) has on the molecular characterization is taken into account. Therefore, this approach is completely different from the ones reported in the literature, where the values of LOVIs/LOEIs are considered equally important. To demonstrate the feasibility of using this operator, the QuBiLS-MIDAS descriptors (http://tomocomd.com/qubils-midas) were used and, as a result, a module was built into the corresponding software to compute them, being thus the only software reported in the literature that can be employed to determine weighted descriptors. Moreover, several modeling studies were performed on eight chemical datasets, which demonstrated that, with the GOWAWA aggregation operator, weighted QuBiLS-MIDAS descriptors that contribute to develop models with greater predictive power can be computed, if compared to the models based on the non-weighted descriptors calculated from the other operators used up to date. A non-parametric statistical assessment confirmed that the GOWAWA-based predictions are significantly superior to the others obtained. 
Therefore, all in all, it can be concluded that, from the results achieved, the GOWAWA operator constitutes a prominent alternative to codify relevant chemical information of the molecules, ultimately useful in improving the modeling ability of several old and recent descriptors whose definition is based on the LOVIs/LOEIs calculation.
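
One way to read the GOWAWA description is as a convex combination of a GOWA term and a WGM term, controlled by a smoothing parameter. The sketch below assumes that form, with strictly positive inputs and non-zero exponents; the exact definition used in the cited paper is not given in the abstract, so the formula, parameter names and weights are all assumptions.

```python
def gowa(values, order_weights, lam=1.0):
    """Generalized OWA: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * b ** lam
               for w, b in zip(order_weights, ordered)) ** (1.0 / lam)

def wgm(values, weights, delta=1.0):
    """Weighted generalized mean: weights apply to values by position."""
    return sum(v * x ** delta
               for v, x in zip(weights, values)) ** (1.0 / delta)

def gowawa(values, order_weights, weights, beta=0.5, lam=1.0, delta=1.0):
    """Assumed form: beta * GOWA + (1 - beta) * WGM, with beta in [0, 1]."""
    return (beta * gowa(values, order_weights, lam)
            + (1.0 - beta) * wgm(values, weights, delta))

lovis = [1.0, 2.0, 3.0]
order_weights = [0.5, 0.3, 0.2]   # emphasize the largest contributions
atom_weights = [1 / 3, 1 / 3, 1 / 3]
value = gowawa(lovis, order_weights, atom_weights, beta=0.5)
```

Setting beta to 1 or 0 recovers the pure GOWA or pure WGM aggregation, which is how the smoothing parameter shifts importance between the two functions in this reading.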

Drug repositioning for novel antitrichomonas from known antiprotozoan drugs using hierarchical screening. Future Med Chem 2018;10:863-878. [DOI: 10.4155/fmc-2016-0211] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/15/2023] Open

Dong J, Yao ZJ, Zhang L, Luo F, Lin Q, Lu AP, Chen AF, Cao DS. PyBioMed: a python library for various molecular representations of chemicals, proteins and DNAs and their interactions. J Cheminform 2018;10:16. [PMID: 29556758 PMCID: PMC5861255 DOI: 10.1186/s13321-018-0270-2] [Citation(s) in RCA: 70] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/06/2017] [Accepted: 03/12/2018] [Indexed: 11/15/2022] Open

Abstract

Background

With the increasing development of biotechnology and informatics technology, publicly available data in chemistry and biology are undergoing explosive growth. Such wealthy information in these data needs to be extracted and transformed to useful knowledge by various data mining methods. Considering the amazing rate at which data are accumulated in chemistry and biology fields, new tools that process and interpret large and complex interaction data are increasingly important. So far, there are no suitable toolkits that can effectively link the chemical and biological space in view of molecular representation. To further explore these complex data, an integrated toolkit for various molecular representation is urgently needed which could be easily integrated with data mining algorithms to start a full data analysis pipeline.

Results

Herein, the python library PyBioMed is presented, which comprises functionalities for online download for various molecular objects by providing different IDs, the pretreatment of molecular structures, the computation of various molecular descriptors for chemicals, proteins, DNAs and their interactions. PyBioMed is a feature-rich and highly customized python library used for the characterization of various complex chemical and biological molecules and interaction samples. The current version of PyBioMed could calculate 775 chemical descriptors and 19 kinds of chemical fingerprints, 9920 protein descriptors based on protein sequences, more than 6000 DNA descriptors from nucleotide sequences, and interaction descriptors from pairwise samples using three different combining strategies. Several examples and five real-life applications were provided to clearly guide the users how to use PyBioMed as an integral part of data analysis projects. By using PyBioMed, users are able to start a full pipelining from getting molecular data, pretreating molecules, molecular representation to constructing machine learning models conveniently.

Conclusion

PyBioMed provides various user-friendly and highly customized APIs to calculate various features of biological molecules and complex interaction samples conveniently, which aims at building integrated analysis pipelines from data acquisition, data checking, and descriptor calculation to modeling. PyBioMed is freely available at http://projects.scbdd.com/pybiomed.html.

Affiliation(s)

Jie Dong Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China.,College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
Zhi-Jiang Yao Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China
Lin Zhang College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
Feijun Luo College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
Qinlu Lin College of Food Science and Engineering, National Engineering Laboratory for Deep Processing of Rice and Byproducts, Central South University of Forestry and Technology, Changsha, China
Ai-Ping Lu Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
Alex F Chen Center for Vascular Disease and Translational Medicine, Third Xiangya Hospital, Central South University, Changsha, People's Republic of China
Dong-Sheng Cao Xiangya School of Pharmaceutical Sciences, Central South University, No. 172, Tongzipo Road, Yuelu District, Changsha, People's Republic of China. .,Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China. .,Center for Vascular Disease and Translational Medicine, Third Xiangya Hospital, Central South University, Changsha, People's Republic of China.

Phanus-umporn C, Shoombuatong W, Prachayasittikul V, Anuwongcharoen N, Nantasenamat C. Privileged substructures for anti-sickling activity via cheminformatic analysis. RSC Adv 2018;8:5920-5935. [PMID: 35539618 PMCID: PMC9078244 DOI: 10.1039/c7ra12079f] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2017] [Revised: 02/21/2018] [Accepted: 01/12/2018] [Indexed: 11/21/2022] Open

Abstract

Sickle cell disease (SCD), an autosomal recessive genetic disorder, has been recognized by the World Health Organization (WHO) as a major public health problem as it affects 300 000 individuals worldwide. Complications arising from SCD include anemia, microvascular occlusion, severe pain, strokes, renal dysfunction and infections. A lucrative therapeutic strategy is to employ anti-sickling agents that can disrupt the formation of the HbS polymer. This study therefore employed cheminformatic approaches, encompassing classification structure–activity relationship (CSAR) modeling, to deduce the privileged substructures giving rise to the anti-sickling activity of an investigated set of 115 compounds, followed by substructure analysis. Briefly, the compiled compounds were described by fingerprint descriptors and used in the construction of CSAR models via several machine learning algorithms. The modelability of the data set, as exemplified by the MODI index, was determined to be in the range of 0.70–0.84. The predictive performance was deduced by the accuracy, sensitivity, specificity and Matthews correlation coefficient, which was found to be statistically robust, whereby the former three parameters afforded values in excess of 0.7 while the latter statistical parameter provided a value greater than 0.5. An analysis of the top 20 important substructure descriptors for anti-sickling activity revealed that 10 important features were significant in the differentiation of actives from inactives, as illustrated by aromaticity/conjugation (e.g. SubFPC287, SubFPC171 and SubFPC5), carbonyl groups (e.g. SubFPC137, SubFPC139, SubFPC49 and SubFPC135) and miscellaneous groups (e.g. SubFPC303, SubFPC302 and SubFPC275). Furthermore, an analysis of the structure–activity relationship revealed that the length of alkyl chains, choice of functional moiety and position of substitution on the benzene ring may affect the anti-sickling activity of these compounds.
Thus, this knowledge is anticipated to be useful for guiding the design of robust compounds against the gelling activity of HbS, as preliminarily demonstrated in the data-driven compound design presented herein.

Cheminformatic approaches (classification structure–activity relationship models based on 12 fingerprint classes) were employed for deducing privileged substructures giving rise to the anti-sickling activity of an investigated set of 115 compounds.
