1
Jimenes-Vargas K, Pazos A, Munteanu CR, Perez-Castillo Y, Tejera E. Prediction of compound-target interaction using several artificial intelligence algorithms and comparison with a consensus-based strategy. J Cheminform 2024; 16:27. [PMID: 38449058 PMCID: PMC10919000 DOI: 10.1186/s13321-024-00816-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 02/15/2024] [Indexed: 03/08/2024] Open
Abstract
Predicting the possible protein targets of a chemical compound is crucial for understanding its mechanism of action and its side effects, as well as for drug discovery. This study examines 15 developed target-centric models (TCM) employing different molecular descriptions and machine learning algorithms. They were contrasted with 17 third-party models implemented as web tools (WTCM). In both sets of models, consensus strategies were implemented as a potential improvement over individual predictions. The findings indicate that the TCM reach F1-score values greater than 0.8. Comparing both approaches, the best TCM achieves values of 0.75, 0.61, 0.25 and 0.38 for true positive/negative rates (TPR, TNR) and false negative/positive rates (FNR, FPR), outperforming the best WTCM. Moreover, the consensus strategy gives its most relevant results in the top 20% of target profiles. The TCM consensus reaches TPR and FNR values of 0.98 and 0, while the WTCM consensus reaches values of 0.75 and 0.24. The computational tool implementing the TCM and their consensus strategy is available at: https://bioquimio.udla.edu.ec/tidentification01/ . Scientific Contribution: We compare and discuss the performance of 17 public compound-target interaction prediction models and 15 newly constructed ones. We also explore a compound-target interaction prioritization strategy using a consensus approach, and we analyze the challenges involved in interaction modeling.
Affiliation(s)
- Karina Jimenes-Vargas
  - Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito, 170504, Ecuador.
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruña, Campus Elviña s/n, 15071, A Coruña, Spain.
- Alejandro Pazos
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruña, 15071, A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruna (CHUAC), 15006, A Coruna, Spain
- Cristian R Munteanu
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, Universidade da Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
  - CITIC-Research Center of Information and Communication Technologies, Universidade da Coruña, 15071, A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruna (CHUAC), 15006, A Coruna, Spain
- Eduardo Tejera
  - Bio-Cheminformatics Research Group, Universidad de Las Américas, Quito, 170504, Ecuador.
2
Carracedo-Reboredo P, Aranzamendi E, He S, Arrasate S, Munteanu CR, Fernandez-Lozano C, Sotomayor N, Lete E, González-Díaz H. MATEO: intermolecular α-amidoalkylation theoretical enantioselectivity optimization. Online tool for selection and design of chiral catalysts and products. J Cheminform 2024; 16:9. [PMID: 38254200 PMCID: PMC10804835 DOI: 10.1186/s13321-024-00802-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 01/11/2024] [Indexed: 01/24/2024] Open
Abstract
The enantioselective Brønsted acid-catalyzed α-amidoalkylation reaction is a useful procedure for the production of new drugs and natural products. In this context, Chiral Phosphoric Acid (CPA) catalysts are versatile catalysts for this type of reaction. The selection and design of new CPA catalysts for different enantioselective reactions has a dual interest because new CPA catalysts (tools) and chiral drugs or materials (products) can be obtained. However, this process is difficult and time consuming if approached from an experimental trial and error perspective. In this work, a Heuristic Perturbation-Theory and Machine Learning (HPTML) algorithm was used to seek a predictive model for CPA catalyst performance in terms of enantioselectivity in α-amidoalkylation reactions, with R² = 0.96 overall for training and validation series. It involved a Monte Carlo sampling of > 100,000 pairs of query and reference reactions. In addition, the computational and experimental investigation of a new set of intermolecular α-amidoalkylation reactions using BINOL-derived N-triflylphosphoramides as CPA catalysts is reported as a case study. The model was implemented in a web server called MATEO: InterMolecular Amidoalkylation Theoretical Enantioselectivity Optimization, available online at: https://cptmltool.rnasa-imedir.com/CPTMLTools-Web/mateo . This new user-friendly online computational tool would enable sustainable optimization of reaction conditions that could lead to the design of new CPA catalysts along with new organic synthesis products.
Affiliation(s)
- Paula Carracedo-Reboredo
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, CITIC-Research Center of Information and Communication Technologies, University of A Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
- Eider Aranzamendi
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
- Shan He
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
  - IKERDATA S.L., ZITEK, University of Basque Country UPVEHU, Rectorate Building, 48940, Leioa, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain
- Cristian R Munteanu
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, CITIC-Research Center of Information and Communication Technologies, University of A Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
- Carlos Fernandez-Lozano
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, CITIC-Research Center of Information and Communication Technologies, University of A Coruña, Campus Elviña s/n, 15071, A Coruña, Spain
- Nuria Sotomayor
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain.
- Esther Lete
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain.
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, Faculty of Science and Technology, University of The Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain.
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain.
3
López-Cortés A, Guevara-Ramírez P, Kyriakidis NC, Barba-Ostria C, León Cáceres Á, Guerrero S, Ortiz-Prado E, Munteanu CR, Tejera E, Cevallos-Robalino D, Gómez-Jaramillo AM, Simbaña-Rivera K, Granizo-Martínez A, Pérez-M G, Moreno S, García-Cárdenas JM, Zambrano AK, Pérez-Castillo Y, Cabrera-Andrade A, Puig San Andrés L, Proaño-Castro C, Bautista J, Quevedo A, Varela N, Quiñones LA, Paz-y-Miño C. In silico Analyses of Immune System Protein Interactome Network, Single-Cell RNA Sequencing of Human Tissues, and Artificial Neural Networks Reveal Potential Therapeutic Targets for Drug Repurposing Against COVID-19. Front Pharmacol 2021; 12:598925. [PMID: 33716737 PMCID: PMC7952300 DOI: 10.3389/fphar.2021.598925] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/27/2020] [Accepted: 01/11/2021] [Indexed: 12/15/2022] Open
Abstract
Background: There is a pressing need to identify therapeutic targets and drugs that allow COVID-19 patients to be treated effectively. Methods: We performed in silico analyses of the immune system protein interactome network, single-cell RNA sequencing of human tissues, and artificial neural networks to reveal potential therapeutic targets for drug repurposing against COVID-19. Results: We screened 1,584 high-confidence immune system proteins in ACE2 and TMPRSS2 co-expressing cells, finding 25 potential therapeutic targets significantly overexpressed in nasal goblet secretory cells, lung type II pneumocytes, and ileal absorptive enterocytes of patients with several immunopathologies. Then, we trained fully connected deep neural networks to find the best multitask classification model to predict the activity of 10,672 drugs, obtaining several approved drugs, compounds under investigation, and experimental compounds with the highest area under the receiver operating characteristic curve. Conclusion: After being effectively analyzed in clinical trials, these drugs can be considered for treatment of severe COVID-19 patients. Scripts can be downloaded at https://github.com/muntisa/immuno-drug-repurposing-COVID-19.
Affiliation(s)
- Andrés López-Cortés
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Latin American Network for the Implementation and Validation of Clinical Pharmacogenomics Guidelines (RELIVAF-CYTED), Madrid, Spain
- Patricia Guevara-Ramírez
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
- Nikolaos C. Kyriakidis
  - One Health Research Group, Faculty of Medicine, Universidad de Las Américas (UDLA), Quito, Ecuador
- Carlos Barba-Ostria
  - One Health Research Group, Faculty of Medicine, Universidad de Las Américas (UDLA), Quito, Ecuador
- Ángela León Cáceres
  - Heidelberg Institute of Global Health, Faculty of Medicine, Heidelberg University, Heidelberg, Germany
  - Instituto de Salud Pública, Facultad de Medicina, Pontificia Universidad Católica del Ecuador, Quito, Ecuador
  - Tropical Herping, Quito, Ecuador
- Santiago Guerrero
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
- Esteban Ortiz-Prado
  - One Health Research Group, Faculty of Medicine, Universidad de Las Américas (UDLA), Quito, Ecuador
- Cristian R. Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Biomedical Research Institute of A Coruna (INIBIC), University Hospital Complex of A Coruna (CHUAC), A Coruña, Spain
  - Centro de Información en Tecnologías de la Información y las Comunicaciones (CITIC), A Coruña, Spain
- Eduardo Tejera
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas (UDLA), Quito, Ecuador
- Katherine Simbaña-Rivera
  - One Health Research Group, Faculty of Medicine, Universidad de Las Américas (UDLA), Quito, Ecuador
- Adriana Granizo-Martínez
  - Carrera de Medicina, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
- Gabriela Pérez-M
  - Centro Clínico Quirúrgico Ambulatorio Hospital del Día El Batán, Instituto Ecuatoriano de Seguridad Social, Quito, Ecuador
- Silvana Moreno
  - Department of Plant Biology, Faculty of Natural Resources and Agricultural Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Jennyfer M. García-Cárdenas
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
- Ana Karina Zambrano
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
  - Biomedical Research Institute of A Coruna (INIBIC), University Hospital Complex of A Coruna (CHUAC), A Coruña, Spain
- Alejandro Cabrera-Andrade
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas (UDLA), Quito, Ecuador
- Lourdes Puig San Andrés
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
- Jhommara Bautista
  - Facultad de Ingeniería y Ciencias Aplicadas-Biotecnología, Universidad de Las Américas, Quito, Ecuador
- Andreina Quevedo
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
- Nelson Varela
  - Latin American Network for the Implementation and Validation of Clinical Pharmacogenomics Guidelines (RELIVAF-CYTED), Madrid, Spain
  - Laboratory of Chemical Carcinogenesis and Pharmacogenetics, Department of Basic-Clinical Oncology, Faculty of Medicine, University of Chile, Santiago, Chile
- Luis Abel Quiñones
  - Latin American Network for the Implementation and Validation of Clinical Pharmacogenomics Guidelines (RELIVAF-CYTED), Madrid, Spain
  - Laboratory of Chemical Carcinogenesis and Pharmacogenetics, Department of Basic-Clinical Oncology, Faculty of Medicine, University of Chile, Santiago, Chile
- César Paz-y-Miño
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito, Ecuador
4
Nocedo-Mena D, Arrasate S, Garza-González E, Rivas-Galindo VM, Romo-Mancillas A, Munteanu CR, Sotomayor N, Lete E, Barbolla I, Martín CA, Del Rayo Camacho-Corona M. Molecular docking, SAR analysis and biophysical approaches in the study of the antibacterial activity of ceramides isolated from Cissus incisa. Bioorg Chem 2021; 109:104745. [PMID: 33640629 DOI: 10.1016/j.bioorg.2021.104745] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/16/2020] [Revised: 01/09/2021] [Accepted: 02/10/2021] [Indexed: 12/11/2022]
Abstract
The development of antibacterial resistance is becoming a crisis. In this sense, natural products play a fundamental role in the discovery of antibacterial agents with diverse mechanisms of action. Phytochemical investigation of Cissus incisa leaves led to the isolation and characterization of the ceramide mixture (1): (8E)-2-(tritriacont-9-enoyl amino)-1,3,4-octadecanetriol-8-ene (1-I); (8E)-2-(2',3'-dihydroxyoctacosanoyl amino)-1,3,4-octadecanetriol-8-ene (1-II); (8E)-2-(2'-hydroxyheptacosanoyl amino)-1,3,4-octadecanetriol-8-ene (1-III); and (8E)-2-(-2'-hydroxynonacosanoyl amino)-1,3,4-octadecanetriol-8-ene (1-IV). To date, this is the first report of the ceramides (1-I), (1-II), and (1-IV). The structures were elucidated using NMR and mass spectrometry analyses. The antibacterial activity of the ceramides (1) and their acetylated derivatives (2) was evaluated against nine multidrug-resistant bacteria by the microdilution method. (1) showed the best results against Gram-negatives, mainly against carbapenem-resistant Acinetobacter baumannii with MIC = 50 μg/mL. Structure-activity analysis and molecular docking revealed interactions of the plant ceramides with membrane proteins and with enzymes associated with biological membranes of Gram-negative bacteria, through hydrogen bonding of functional groups. A vesicular contents release assay showed the capacity of (1) to disturb membrane permeability, detected as an increase in the fluorescence of the probe over time. The membrane disruption is not caused by a lytic action of the ceramides on cell membranes, according to in vitro hemolytic activity results. Combining SAR analysis, bioinformatics and biophysical techniques, and also experimental tests, it was possible to explain the antibacterial action of these natural ceramides.
Affiliation(s)
- Deyani Nocedo-Mena
  - Universidad Autónoma de Nuevo León, Facultad de Ciencias Químicas, Av. Universidad S/N, Ciudad Universitaria, 66451 San Nicolás de los Garza, Nuevo León, Mexico; University of the Basque Country UPV/EHU, Department of Organic Chemistry II, 48940 Leioa, Spain
- Sonia Arrasate
  - University of the Basque Country UPV/EHU, Department of Organic Chemistry II, 48940 Leioa, Spain
- Elvira Garza-González
  - Universidad Autónoma de Nuevo León, Servicio de Gastroenterología, Hospital Universitario "Dr. José Eleuterio González", Av. Gonzalitos and Madero S/N, Colonia Mitras Centro, 64460 Monterrey, Nuevo León, Mexico
- Verónica M Rivas-Galindo
  - Universidad Autónoma de Nuevo León, Facultad de Medicina, Av. Gonzalitos and Madero S/N, Colonia Mitras Centro, 64460 Monterrey, Nuevo León, Mexico
- Antonio Romo-Mancillas
  - Universidad Autónoma de Querétaro, Facultad de Ciencias Químicas, Centro Universitario, Cerro de las Campanas, 76010 Querétaro, Mexico
- Cristian R Munteanu
  - University of A Coruna, Computer Science Faculty, 15071 A Coruña, Spain; Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), 15006 A Coruña, Spain; Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
- Nuria Sotomayor
  - University of the Basque Country UPV/EHU, Department of Organic Chemistry II, 48940 Leioa, Spain
- Esther Lete
  - University of the Basque Country UPV/EHU, Department of Organic Chemistry II, 48940 Leioa, Spain
- Iratxe Barbolla
  - University of the Basque Country UPV/EHU, Department of Organic Chemistry II, 48940 Leioa, Spain
- César A Martín
  - Biofisika Institute (UPV/EHU, CSIC), 48940, Leioa, Spain; University of the Basque Country, UPV/EHU, Department of Biochemistry and Molecular Biology, Faculty of Science and Technology, 48940 Leioa, Spain.
- María Del Rayo Camacho-Corona
  - Universidad Autónoma de Nuevo León, Facultad de Ciencias Químicas, Av. Universidad S/N, Ciudad Universitaria, 66451 San Nicolás de los Garza, Nuevo León, Mexico.
5
Cabrera-Andrade A, López-Cortés A, Jaramillo-Koupermann G, González-Díaz H, Pazos A, Munteanu CR, Pérez-Castillo Y, Tejera E. A Multi-Objective Approach for Anti-Osteosarcoma Cancer Agents Discovery through Drug Repurposing. Pharmaceuticals (Basel) 2020; 13:ph13110409. [PMID: 33266378 PMCID: PMC7700154 DOI: 10.3390/ph13110409] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/02/2020] [Revised: 11/11/2020] [Accepted: 11/12/2020] [Indexed: 02/08/2023] Open
Abstract
Osteosarcoma is the most common type of primary malignant bone tumor. Although nowadays 5-year survival rates can reach up to 60–70%, acute complications and late effects of osteosarcoma therapy are two of the limiting factors in treatments. We developed a multi-objective algorithm for the repurposing of new anti-osteosarcoma drugs, based on the modeling of molecules with described activity for HOS, MG63, SAOS2, and U2OS cell lines in the ChEMBL database. Several predictive models were obtained for each cell line and those with accuracy greater than 0.8 were integrated into a desirability function for the final multi-objective model. An exhaustive exploration of model combinations was carried out to obtain the best multi-objective model in virtual screening. For the top 1% of the screened list, the final model showed a BEDROC = 0.562, EF = 27.6, and AUC = 0.653. The repositioning was performed on 2218 molecules described in DrugBank. Within the top-ranked drugs, we found: temsirolimus, paclitaxel, sirolimus, everolimus, and cabazitaxel, which are antineoplastic drugs described in clinical trials for cancer in general. Interestingly, we found several broad-spectrum antibiotics and antiretroviral agents. This powerful model predicts several drugs that should be studied in depth to find new chemotherapy regimens and to propose new strategies for osteosarcoma treatment.
Affiliation(s)
- Alejandro Cabrera-Andrade
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170125, Ecuador
  - Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Quito 170125, Ecuador
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071 A Coruña, Spain
  - Correspondence: (A.C.-A.); (E.T.)
- Andrés López-Cortés
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071 A Coruña, Spain
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito 170129, Ecuador
  - Latin American Network for Implementation and Validation of Clinical Pharmacogenomics Guidelines (RELIVAF-CYTED), 28029 Madrid, Spain
- Gabriela Jaramillo-Koupermann
  - Laboratorio de Biología Molecular, Subproceso de Anatomía Patológica, Hospital de Especialidades Eugenio Espejo, Quito 170403, Ecuador
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, and Basque Center for Biophysics CSIC-UPV/EHU, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
- Alejandro Pazos
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Cristian R. Munteanu
  - Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Yunierkis Pérez-Castillo
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170125, Ecuador
  - Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Quito 170125, Ecuador
- Eduardo Tejera
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170125, Ecuador
  - Facultad de Ingeniería y Ciencias Agropecuarias, Universidad de Las Américas, Quito 170125, Ecuador
  - Correspondence: (A.C.-A.); (E.T.)
6
Tejera E, Munteanu CR, López-Cortés A, Cabrera-Andrade A, Pérez-Castillo Y. Drugs Repurposing Using QSAR, Docking and Molecular Dynamics for Possible Inhibitors of the SARS-CoV-2 M pro Protease. Molecules 2020; 25:E5172. [PMID: 33172092 PMCID: PMC7664330 DOI: 10.3390/molecules25215172] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 09/26/2020] [Revised: 10/28/2020] [Accepted: 11/04/2020] [Indexed: 12/17/2022] Open
Abstract
Wuhan, China was the epicenter, in December 2019, of the first zoonotic transmission of the severe acute respiratory syndrome coronavirus clade 2 (SARS-CoV-2), the causative agent of the novel human coronavirus disease 2019 (COVID-19). Almost from the beginning of the COVID-19 outbreak, several attempts were made to predict possible drugs capable of inhibiting the virus replication. In the present work, a drug repurposing study is performed to identify potential SARS-CoV-2 protease inhibitors. We created a Quantitative Structure-Activity Relationship (QSAR) model based on a machine learning strategy using hundreds of inhibitor molecules of the main protease (Mpro) of the SARS-CoV coronavirus. The QSAR model was used for virtual screening of a large list of drugs from the DrugBank database. The best 20 candidates were then evaluated in silico against the Mpro of SARS-CoV-2 using docking and molecular dynamics analyses. Docking was done with the Gold software, and the free energies of binding were predicted with the MM-PBSA method as implemented in AMBER. Our results indicate that levothyroxine, amobarbital and ABP-700 are the best potential inhibitors of the SARS-CoV-2 virus through their binding to the Mpro enzyme. Five other compounds also showed a negative but small free energy of binding: nikethamide, nifurtimox, rebimastat, apomine and rebastinib.
Affiliation(s)
- Eduardo Tejera
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170513, Ecuador
  - Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, Quito 170513, Ecuador
- Cristian R. Munteanu
  - Faculty of Computer Science, Centre for Information and Communications Technology Research (CITIC), University of A Coruna, 15007 A Coruña, Spain
  - Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruna (CHUAC), 15006 A Coruña, Spain
- Andrés López-Cortés
  - Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito 170129, Ecuador
  - Latin American Network for Implementation and Validation of Clinical Pharmacogenomics Guidelines (RELIVAF-CYTED), 28029 Madrid, Spain
- Alejandro Cabrera-Andrade
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170513, Ecuador
  - Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Quito 170513, Ecuador
- Yunierkis Pérez-Castillo
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170513, Ecuador
  - Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Quito 170513, Ecuador
7
Ortega-Tenezaca B, Quevedo-Tumailli V, Bediaga H, Collados J, Arrasate S, Madariaga G, Munteanu CR, Cordeiro MND, González-Díaz H. PTML Multi-Label Algorithms: Models, Software, and Applications. Curr Top Med Chem 2020; 20:2326-2337. [DOI: 10.2174/1568026620666200916122616] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/18/2020] [Revised: 07/19/2020] [Accepted: 07/20/2020] [Indexed: 12/17/2022]
Abstract
By combining Machine Learning (ML) methods with Perturbation Theory (PT), it is possible to develop predictive models for a variety of response targets. Such a combination, often known as Perturbation Theory Machine Learning (PTML) modeling, comprises a set of techniques that can handle various physical and chemical properties of different organisms, complex biological or material systems under multiple input conditions. In so doing, these techniques effectively integrate a manifold of diverse chemical and biological data into a single computational framework that can then be applied for screening lead chemicals as well as to find clues for improving the targeted response(s). PTML models have thus been extremely helpful in drug or material design efforts and found to be predictive and applicable across a broad space of systems. After a brief outline of the applied methodology, this work reviews the different uses of PTML in Medicinal Chemistry, as well as in other applications. Finally, we cover the development of software available nowadays for setting up PTML models from large datasets.
Affiliation(s)
- Harbil Bediaga
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Jon Collados
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Gotzon Madariaga
  - Department of Condensed Matter Physics, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Cristian R Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruna, Spain
- M. Natália D.S. Cordeiro
  - LAQV@REQUIMTE, Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
- Humbert González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
8
Cabrera-Andrade A, López-Cortés A, Munteanu CR, Pazos A, Pérez-Castillo Y, Tejera E, Arrasate S, González-Díaz H. Perturbation-Theory Machine Learning (PTML) Multilabel Model of the ChEMBL Dataset of Preclinical Assays for Antisarcoma Compounds. ACS Omega 2020; 5:27211-27220. [PMID: 33134682 PMCID: PMC7594149 DOI: 10.1021/acsomega.0c03356] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/13/2020] [Accepted: 10/06/2020] [Indexed: 06/11/2023]
Abstract
Sarcomas are a group of malignant neoplasms of connective tissue with a different etiology than carcinomas. The efforts to discover new drugs with antisarcoma activity have generated large datasets of multiple preclinical assays with different experimental conditions. For instance, the ChEMBL database contains outcomes of 37,919 different antisarcoma assays with 34,955 different chemical compounds. Furthermore, the experimental conditions reported in this dataset include 157 types of biological activity parameters, 36 drug targets, 43 cell lines, and 17 assay organisms. Considering this information, we propose combining perturbation theory (PT) principles with machine learning (ML) to develop a PTML model to predict antisarcoma compounds. PTML models use one function of reference that measures the probability of a drug being active under certain conditions (protein, cell line, organism, etc.). In this paper, we used linear discriminant analysis and a neural network to train and compare PT and non-PT models. All the explored models have an accuracy of 89.19-95.25% for training and 89.22-95.46% for validation sets. PTML-based strategies have similar accuracy but generate simpler models. Therefore, they may become a versatile tool for predicting antisarcoma compounds.
Affiliation(s)
- Alejandro Cabrera-Andrade
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador
- Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruña, A Coruña 15071, Spain
| | - Andrés López-Cortés
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruña, A Coruña 15071, Spain
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito 170129, Ecuador
| | - Cristian R. Munteanu
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruña, A Coruña 15071, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña 15006, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, A Coruña 15071, Spain
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruña, A Coruña 15071, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña 15006, Spain
| | - Yunierkis Pérez-Castillo
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador
- Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador
| | - Eduardo Tejera
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador
- Facultad de Ingeniería y Ciencias Aplicadas, Universidad de Las Américas, de los Granados Avenue, Quito 170125, Ecuador
| | - Sonia Arrasate
- Department of Organic Chemistry II and Basque Center for Biophysics, University of Basque Country UPV/EHU, Leioa 48940, Biscay, Spain
| | - Humbert González-Díaz
- Department of Organic Chemistry II and Basque Center for Biophysics, University of Basque Country UPV/EHU, Leioa 48940, Biscay, Spain
- Ikerbasque, Basque Foundation for Science, Bilbao 48011, Biscay, Spain
| |
|
9
|
Álvarez-Coiradas E, Munteanu CR, Díaz-Sáez L, Pazos A, Huber KVM, Loza MI, Domínguez E. Discovery of novel immunopharmacological ligands targeting the IL-17 inflammatory pathway. Int Immunopharmacol 2020; 89:107026. [PMID: 33045560 DOI: 10.1016/j.intimp.2020.107026] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/04/2020] [Revised: 09/02/2020] [Accepted: 09/16/2020] [Indexed: 01/25/2023]
Abstract
Interleukin 17 (IL-17) is a proinflammatory cytokine that acts as an immune checkpoint for several autoimmune diseases. Therapeutic neutralizing antibodies that target this cytokine have demonstrated clinical efficacy in psoriasis. However, biologics have limitations such as their high cost and their lack of oral bioavailability. Thus, it is necessary to expand the therapeutic options for the IL-17A/IL-17RA pathway, applying novel drug discovery methods to find effective small molecules. In this work, we combined biophysical and cell-based assays with structure-based docking to find novel ligands that target this pathway. First, a virtual screening of our chemical library of 60,000 compounds was used to identify 67 potential ligands of IL-17A and IL-17RA. We developed a biophysical label-free binding assay to determine interactions with the extracellular domain of IL-17RA. Two molecules (CBG040591 and CBG060392) with quinazolinone and pyrrolidinedione chemical scaffolds, respectively, were confirmed as ligands of IL-17RA with micromolar affinity. The anti-inflammatory activity of these ligands as cytokine-release inhibitors was evaluated in human keratinocytes. Both ligands inhibited the release of chemokines mediated by IL-17A, with an IC50 of 20.9 ± 12.6 μM and 23.6 ± 11.8 μM for CCL20 and an IC50 of 26.7 ± 13.1 μM and 45.3 ± 13.0 μM for CXCL8. Hence, they blocked IL-17A proinflammatory activity, which is consistent with the inhibition of the signalling of the IL-17A receptor by ligand CBG060392. Therefore, we identified two novel immunopharmacological ligands targeting the IL-17A/IL-17RA pathway with anti-inflammatory efficacy that can be promising tools for a drug discovery program for psoriasis.
Affiliation(s)
- Elia Álvarez-Coiradas
- Biofarma Research Group, Center for Research in Molecular Medicine and Chronic Diseases (CiMUS), Universidade de Santiago de Compostela, Avenida de Barcelona s/n, 15782 Santiago de Compostela, Spain
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, CITIC, Universidade da Coruña, A Coruña, 15007, Spain; Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña 15006, Spain
| | - Laura Díaz-Sáez
- Structural Genomics Consortium & Target Discovery Institute, University of Oxford, Nuffield Department of Medicine, Old Road Campus, Oxford OX3 7DQ & OX3 7FZ, UK
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Science Faculty, CITIC, Universidade da Coruña, A Coruña, 15007, Spain; Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña 15006, Spain
| | - Kilian V M Huber
- Structural Genomics Consortium & Target Discovery Institute, University of Oxford, Nuffield Department of Medicine, Old Road Campus, Oxford OX3 7DQ & OX3 7FZ, UK
| | - María Isabel Loza
- Biofarma Research Group, Center for Research in Molecular Medicine and Chronic Diseases (CiMUS), Universidade de Santiago de Compostela, Avenida de Barcelona s/n, 15782 Santiago de Compostela, Spain.
| | - Eduardo Domínguez
- Biofarma Research Group, Center for Research in Molecular Medicine and Chronic Diseases (CiMUS), Universidade de Santiago de Compostela, Avenida de Barcelona s/n, 15782 Santiago de Compostela, Spain.
| |
|
10
|
Urista DV, Carrué DB, Otero I, Arrasate S, Quevedo-Tumailli VF, Gestal M, González-Díaz H, Munteanu CR. Prediction of Antimalarial Drug-Decorated Nanoparticle Delivery Systems with Random Forest Models. Biology (Basel) 2020; 9:biology9080198. [PMID: 32751710 PMCID: PMC7465777 DOI: 10.3390/biology9080198] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 06/24/2020] [Revised: 07/22/2020] [Accepted: 07/27/2020] [Indexed: 12/13/2022]
Abstract
Drug-decorated nanoparticles (DDNPs) have important medical applications. The current work combined Perturbation Theory with Machine Learning and Information Fusion (PTMLIF). Thus, PTMLIF models were proposed to predict the probability of nanoparticle–compound/drug complexes having antimalarial activity (against Plasmodium). The aim is to save experimental resources and time by using virtual screening for DDNPs. The raw data were obtained by fusing experimental data for nanoparticles with compound chemical assays from the ChEMBL database. The inputs for the eight Machine Learning classifiers were transformed features of drugs/compounds and nanoparticles, expressed as perturbations of molecular descriptors in specific experimental conditions (experiment-centered features). The resulting dataset contains 107 input features and 249,992 examples. The best classification model was provided by Random Forest, with 27 selected features of drugs/compounds and nanoparticles in all experimental conditions considered. The high performance of the model was demonstrated by the mean Area Under the Receiver Operating Characteristic curve (AUC) in a test subset, with a value of 0.9921 ± 0.000244 (10-fold cross-validation). The results demonstrated the power of information fusion of the experiment-centered features of drugs/compounds and nanoparticles for the prediction of nanoparticle–compound antimalarial activity. The scripts and dataset for this project are available in the open GitHub repository.
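The abstract describes experiment-centered features only in words. The following is a minimal sketch of one common way to build such a feature, assuming the perturbation is the deviation of a descriptor from its mean within the same experimental condition; the exact transformation used by the authors is not stated here, and the function name, field names, and toy values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def experiment_centered(records, descriptor, condition):
    """Return each record's descriptor minus the mean of its condition group.

    Assumption: the "perturbation" is a plain deviation from the group mean.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[condition]].append(rec[descriptor])
    group_mean = {key: mean(values) for key, values in groups.items()}
    return [rec[descriptor] - group_mean[rec[condition]] for rec in records]


# Hypothetical toy data: descriptor values and assay labels are invented.
records = [
    {"logp": 1.0, "assay": "IC50"},
    {"logp": 3.0, "assay": "IC50"},
    {"logp": 2.0, "assay": "EC50"},
    {"logp": 6.0, "assay": "EC50"},
]
deviations = experiment_centered(records, "logp", "assay")
print(deviations)  # [-1.0, 1.0, -2.0, 2.0]
```

Each value now expresses how far a compound sits from the typical compound tested under the same condition, which is the kind of input a classifier such as Random Forest could then consume.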
Affiliation(s)
- Diana V. Urista
- Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain; (D.V.U.); (S.A.); (H.G.-D.)
| | - Diego B. Carrué
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
| | - Iago Otero
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
| | - Sonia Arrasate
- Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain; (D.V.U.); (S.A.); (H.G.-D.)
| | - Viviana F. Quevedo-Tumailli
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Universidad Estatal Amazónica UEA, Km. 2 1/2 vía Puyo a Tena (paso lateral), Puyo 160150, Pastaza, Ecuador
| | - Marcos Gestal
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Biomedical Research Institute of A Coruña (INIBIC), Hospital Teresa Herrera, Xubias de Arriba 84, 15006 A Coruña, Spain
| | - Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain; (D.V.U.); (S.A.); (H.G.-D.)
- IKERBASQUE, Basque Foundation for Science, Alameda Urquijo 36, 48011 Bilbao, Spain
- Basque Centre for Biophysics CSIC-UPVEHU, University of Basque Country UPV/EHU, Barrio Sarriena, 48940 Leioa, Spain
| | - Cristian R. Munteanu
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Biomedical Research Institute of A Coruña (INIBIC), Hospital Teresa Herrera, Xubias de Arriba 84, 15006 A Coruña, Spain
| |
|
11
|
Liñares-Blanco J, Munteanu CR, Pazos A, Fernandez-Lozano C. Molecular docking and machine learning analysis of Abemaciclib in colon cancer. BMC Mol Cell Biol 2020; 21:52. [PMID: 32640984 PMCID: PMC7346626 DOI: 10.1186/s12860-020-00295-w] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/26/2020] [Accepted: 06/24/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The main challenge in cancer research is the identification of different omic variables that present a prognostic value and personalised diagnosis for each tumour. The fact that the diagnosis is personalised opens the doors to the design and discovery of new specific treatments for each patient. In this context, this work offers new ways to reuse existing databases and work to create added value in research. Three published signatures with significant prognostic value in Colon Adenocarcinoma (COAD) were identified. These signatures were combined in a new meta-signature and validated with the main Machine Learning (ML) and conventional statistical techniques. In addition, a drug repurposing experiment was carried out through Molecular Docking (MD) methodology in order to identify new potential treatments in COAD. RESULTS The prognostic potential of the signature was validated by means of ML algorithms and differential gene expression analysis. The results obtained supported the possibility that this meta-signature could harbor genes of interest for the prognosis and treatment of COAD. We studied drug repurposing following a molecular docking (MD) analysis, where the different protein data bank (PDB) structures of the genes of the meta-signature (in total 155) were confronted with 81 anti-cancer drugs approved by the FDA. We observed four interactions of interest: GLTP - Nilotinib, PTPRN - Venetoclax, VEGFA - Venetoclax and FABP6 - Abemaciclib. The FABP6 gene and its role within different metabolic pathways were studied in tumour and normal tissue, and we observed the capability of the FABP6 gene to be a therapeutic target. Our in silico results showed a significant specificity in the binding of the protein products of the FABP6 gene, as well as the known action of Abemaciclib as an inhibitor of the CDK4/6 protein and, therefore, of the cell cycle.
CONCLUSIONS The results of our ML and differential expression experiments have first shown the FABP6 gene as a possible new cancer biomarker due to its specificity in colonic tumour tissue and no expression in healthy adjacent tissue. Next, the MD analysis showed that the drug Abemaciclib has a characteristic affinity for the different protein structures of the FABP6 gene. Therefore, in silico experiments have shown a new opportunity that should be validated experimentally, thus helping to reduce the cost and time of drug screening. For these reasons, we propose the validation of the drug Abemaciclib for the treatment of colon cancer.
Affiliation(s)
- Jose Liñares-Blanco
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain
| | - Cristian R Munteanu
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain; Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Instituto de Investigación Biomédica de A Coruña (INIBIC). Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas. Universidade da Coruña (UDC), Xubias de arriba, 84, A Coruña, 15006, Spain
| | - Alejandro Pazos
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain; Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Instituto de Investigación Biomédica de A Coruña (INIBIC). Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas. Universidade da Coruña (UDC), Xubias de arriba, 84, A Coruña, 15006, Spain
| | - Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, A Coruña, 15071, Spain; Grupo de Redes de Neuronas Artificiales y Sistemas Adaptativos. Imagen Médica y Diagnóstico Radiológico (RNASA-IMEDIR). Instituto de Investigación Biomédica de A Coruña (INIBIC). Complexo Hospitalario Universitario de A Coruña (CHUAC), Sergas. Universidade da Coruña (UDC), Xubias de arriba, 84, A Coruña, 15006, Spain
| |
|
12
|
López-Cortés A, Cabrera-Andrade A, Vázquez-Naya JM, Pazos A, Gonzáles-Díaz H, Paz-Y-Miño C, Guerrero S, Pérez-Castillo Y, Tejera E, Munteanu CR. Prediction of breast cancer proteins involved in immunotherapy, metastasis, and RNA-binding using molecular descriptors and artificial neural networks. Sci Rep 2020; 10:8515. [PMID: 32444848 PMCID: PMC7244564 DOI: 10.1038/s41598-020-65584-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/02/2019] [Accepted: 04/28/2020] [Indexed: 12/12/2022] Open
Abstract
Breast cancer (BC) is a heterogeneous disease where genomic alterations, protein expression deregulation, signaling pathway alterations, hormone disruption, ethnicity and environmental determinants are involved. Due to the complexity of BC, the prediction of proteins involved in this disease is a trending topic in drug design. This work proposes an accurate classifier for predicting BC proteins using six sets of protein sequence descriptors and 13 machine-learning methods. After applying univariate feature selection to the mix of five descriptor families, the best classifier was obtained using the multilayer perceptron method (artificial neural network) and 300 features. The performance of the model is demonstrated by an area under the receiver operating characteristic curve (AUROC) of 0.980 ± 0.0037 and an accuracy of 0.936 ± 0.0056 (3-fold cross-validation). Regarding the prediction of 4,504 cancer-associated proteins using this model, the best ranked cancer immunotherapy proteins related to BC were RPS27, SUPT4H1, CLPSL2, POLR2K, RPL38, AKT3, CDK3, RPS20, RASL11A and UBTD1; the best ranked metastasis driver proteins related to BC were S100A9, DDA1, TXN, PRNP, RPS27, S100A14, S100A7, MAPK1, AGR3 and NDUFA13; and the best ranked RNA-binding proteins related to BC were S100A9, TXN, RPS27L, RPS27, RPS27A, RPL38, MRPL54, PPAN, RPS20 and CSRP1. This powerful model predicts several BC-related proteins that should be studied in depth to find new biomarkers and better therapeutic targets. Scripts can be downloaded at https://github.com/muntisa/neural-networks-for-breast-cancer-proteins.
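For readers unfamiliar with the AUROC metric reported above, here is a minimal sketch of how it can be computed from class labels and classifier scores, using the pairwise-ranking definition. It is an illustration only, not the authors' evaluation code; the function name and the toy labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions: 1 = BC-related protein, 0 = not related.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.1]
value = auroc(labels, scores)
print(round(value, 4))  # 0.8333
```

In a k-fold setting such as the 3-fold cross-validation mentioned in the abstract, this value would be computed once per held-out fold and then summarized as a mean and standard deviation.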
Affiliation(s)
- Andrés López-Cortés
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito, 170129, Ecuador.
- RNASA-IMEDIR, Computer Science Faculty, University of Coruna, Coruna, 15071, Spain.
- Red Latinoamericana de Implementación y Validación de Guías Clínicas Farmacogenómicas (RELIVAF-CYTED), Quito, Ecuador.
| | - Alejandro Cabrera-Andrade
- RNASA-IMEDIR, Computer Science Faculty, University of Coruna, Coruna, 15071, Spain
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
- Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
| | - José M Vázquez-Naya
- RNASA-IMEDIR, Computer Science Faculty, University of Coruna, Coruna, 15071, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n 15071, A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006, A Coruña, Spain
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Science Faculty, University of Coruna, Coruna, 15071, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n 15071, A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006, A Coruña, Spain
| | - Humberto Gonzáles-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, Leioa 48940, Biscay, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, 48011, Biscay, Spain
| | - César Paz-Y-Miño
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito, 170129, Ecuador
| | - Santiago Guerrero
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito, 170129, Ecuador
| | - Yunierkis Pérez-Castillo
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
- Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
| | - Eduardo Tejera
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
- Facultad de Ingeniería y Ciencias Agropecuarias, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of Coruna, Coruna, 15071, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n 15071, A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006, A Coruña, Spain
| |
|
13
|
López-Cortés A, Paz-Y-Miño C, Guerrero S, Cabrera-Andrade A, Barigye SJ, Munteanu CR, González-Díaz H, Pazos A, Pérez-Castillo Y, Tejera E. OncoOmics approaches to reveal essential genes in breast cancer: a panoramic view from pathogenesis to precision medicine. Sci Rep 2020; 10:5285. [PMID: 32210335 PMCID: PMC7093549 DOI: 10.1038/s41598-020-62279-2] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 07/15/2019] [Accepted: 03/02/2020] [Indexed: 02/06/2023] Open
Abstract
Breast cancer (BC) is the leading cause of cancer-related death among women and the most commonly diagnosed cancer worldwide. Although in recent years large-scale efforts have focused on identifying new therapeutic targets, a better understanding of BC molecular processes is required. Here we focused on elucidating the molecular hallmarks of BC heterogeneity and the oncogenic mutations involved in precision medicine, which remain poorly defined. To fill this gap, we established an OncoOmics strategy that consists of analyzing genomic alterations, signaling pathways, the protein-protein interactome network, protein expression, and dependency maps in cell lines and patient-derived xenografts in 230 previously prioritized genes to reveal essential genes in breast cancer. As a result, the OncoOmics BC essential genes were rationally filtered to 140. mRNA up-regulation was the most prevalent genomic alteration. The most altered signaling pathways were associated with basal-like and Her2-enriched molecular subtypes. RAC1, AKT1, CCND1, PIK3CA, ERBB2, CDH1, MAPK14, TP53, MAPK1, SRC, RAC3, BCL2, CTNNB1, EGFR, CDK2, GRB2, MED1 and GATA3 were essential genes in at least three OncoOmics approaches. Drugs with the highest number of clinical trials in phases 3 and 4 were paclitaxel, docetaxel, trastuzumab, tamoxifen and doxorubicin. Lastly, we collected ~3,500 somatic and germline oncogenic variants associated with 50 essential genes, which in turn had therapeutic connectivity with 73 drugs. In conclusion, the OncoOmics strategy reveals essential genes capable of accelerating the development of targeted therapies for precision oncology.
Affiliation(s)
- Andrés López-Cortés
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito, 170129, Ecuador.
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruna, 15071, Spain.
- Red Latinoamericana de Implementación y Validación de Guías Clínicas Farmacogenómicas (RELIVAF-CYTED), Quito, Ecuador.
| | - César Paz-Y-Miño
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito, 170129, Ecuador
| | - Santiago Guerrero
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, Quito, 170129, Ecuador
| | - Alejandro Cabrera-Andrade
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruna, 15071, Spain
- Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
| | - Stephen J Barigye
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, H3A 0B8, Canada
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruna, 15071, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruna (CHUAC), A Coruna, 15006, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, A Coruna, 15071, Spain
| | - Humberto González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, Leioa, 48940, Biscay, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, 48011, Biscay, Spain
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruna, 15071, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruna (CHUAC), A Coruna, 15006, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, A Coruna, 15071, Spain
| | - Yunierkis Pérez-Castillo
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
- Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador
| | - Eduardo Tejera
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador.
- Facultad de Ingeniería y Ciencias Agropecuarias, Universidad de Las Américas, Avenue de los Granados, Quito, 170125, Ecuador.
| |
|
14
|
Cabrera-Andrade A, López-Cortés A, Jaramillo-Koupermann G, Paz-y-Miño C, Pérez-Castillo Y, Munteanu CR, González-Díaz H, Pazos A, Tejera E. Gene Prioritization through Consensus Strategy, Enrichment Methodologies Analysis, and Networking for Osteosarcoma Pathogenesis. Int J Mol Sci 2020; 21:ijms21031053. [PMID: 32033398 PMCID: PMC7038221 DOI: 10.3390/ijms21031053] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/10/2019] [Revised: 01/30/2020] [Accepted: 01/30/2020] [Indexed: 12/12/2022] Open
Abstract
Osteosarcoma is the most common subtype of primary bone cancer, affecting mostly adolescents. In recent years, several studies have focused on elucidating the molecular mechanisms of this sarcoma; however, its molecular etiology has still not been determined with precision. Therefore, we applied a consensus strategy with the use of several bioinformatics tools to prioritize genes involved in its pathogenesis. Subsequently, we assessed the physical interactions of the previously selected genes and applied a communality analysis to this protein–protein interaction network. The consensus strategy prioritized a total list of 553 genes. Our enrichment analysis validates several studies that describe the signaling pathways PI3K/AKT and MAPK/ERK as pathogenic. The gene ontology described TP53 as a principal signal transducer that chiefly mediates processes associated with cell cycle and DNA damage response. It is interesting to note that the communality analysis clusters several members involved in metastasis events, such as MMP2 and MMP9, and genes associated with DNA repair complexes, like ATM, ATR, CHEK1, and RAD51. In this study, we have identified well-known pathogenic genes for osteosarcoma and prioritized genes that need to be further explored.
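The abstract does not detail how the outputs of the individual tools are merged. As a hedged illustration of the general idea of consensus prioritization, the sketch below averages normalized rank scores across several ranked gene lists; this scoring rule is an assumption, not the authors' formula, and the three tool outputs are hypothetical (only the gene symbols come from the abstract).

```python
def consensus_rank(rankings):
    """Combine ranked gene lists by averaging normalized rank scores.

    A gene at 0-based position i in a list of length n scores (n - i) / n.
    Genes missing from a list contribute 0 for that list.
    """
    genes = {gene for ranking in rankings for gene in ranking}
    scores = {}
    for gene in genes:
        total = 0.0
        for ranking in rankings:
            if gene in ranking:
                n = len(ranking)
                total += (n - ranking.index(gene)) / n
        scores[gene] = total / len(rankings)
    # Sort by descending score, then by name for a deterministic order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical outputs of three prioritization tools, best gene first.
tool_a = ["TP53", "ATM", "MMP2"]
tool_b = ["TP53", "MMP9", "ATM"]
tool_c = ["ATM", "TP53"]
ranked = consensus_rank([tool_a, tool_b, tool_c])
print([gene for gene, _ in ranked])  # ['TP53', 'ATM', 'MMP9', 'MMP2']
```

Genes supported by several tools rise to the top, while a gene reported by only one tool is kept but ranked lower.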
Affiliation(s)
- Alejandro Cabrera-Andrade
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170125, Ecuador;
- Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Quito 170125, Ecuador
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruna, 15071 A Coruña, Spain; (A.L.-C.); (C.R.M.); (A.P.)
- Correspondence: (A.C.-A.); (E.T.); Tel.: +593-2398-1000 (ext. 2717) (A.C.-A.); +593-2398-1000 (ext. 713) (E.T.)
| | - Andrés López-Cortés
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruna, 15071 A Coruña, Spain; (A.L.-C.); (C.R.M.); (A.P.)
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito 170129, Ecuador;
| | - Gabriela Jaramillo-Koupermann
- Laboratorio de Biología Molecular, Subproceso de Anatomía Patológica, Hospital de Especialidades Eugenio Espejo, Quito 170403, Ecuador;
| | - César Paz-y-Miño
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Quito 170129, Ecuador;
| | - Yunierkis Pérez-Castillo
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170125, Ecuador;
- Escuela de Ciencias Físicas y Matemáticas, Universidad de Las Américas, Quito 170125, Ecuador
| | - Cristian R. Munteanu
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruna, 15071 A Coruña, Spain; (A.L.-C.); (C.R.M.); (A.P.)
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
| | - Humbert González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain;
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Sciences Faculty, University of A Coruna, 15071 A Coruña, Spain; (A.L.-C.); (C.R.M.); (A.P.)
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
| | - Eduardo Tejera
- Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito 170125, Ecuador;
- Facultad de Ingeniería y Ciencias Agropecuarias, Universidad de Las Américas, Quito 170125, Ecuador
- Correspondence: (A.C.-A.); (E.T.); Tel.: +593-2398-1000 (ext. 2717) (A.C.-A.); +593-2398-1000 (ext. 713) (E.T.)
| |
|
15
|
Liu Y, Munteanu CR, Yan Q, Pedreira N, Kang J, Tang S, Zhou C, He Z, Tan Z. Machine learning classification models for fetal skeletal development performance prediction using maternal bone metabolic proteins in goats. PeerJ 2019; 7:e7840. [PMID: 31649832 PMCID: PMC6802673 DOI: 10.7717/peerj.7840] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/22/2019] [Accepted: 09/05/2019] [Indexed: 11/20/2022] Open
Abstract
Background In developing countries, maternal undernutrition is the major intrauterine environmental factor contributing to fetal development and adverse pregnancy outcomes. Maternal nutrition restriction (MNR) in gestation has proven to impact overall growth, bone development, and proliferation and metabolism of mesenchymal stem cells in offspring. However, the efficient method for elucidation of fetal bone development performance through maternal bone metabolic biochemical markers remains elusive. Methods We adapted goats to elucidate fetal bone development state with maternal serum bone metabolic proteins under malnutrition conditions in mid- and late-gestation stages. We used the experimental data to create 72 datasets by mixing different input features such as one-hot encoding of experimental conditions, metabolic original data, experimental-centered features and experimental condition probabilities. Seven Machine Learning methods have been used to predict six fetal bone parameters (weight, length, and diameter of femur/humerus). Results The results indicated that MNR influences fetal bone development (femur and humerus) and fetal bone metabolic protein levels (C-terminal telopeptides of collagen I, CTx, in middle-gestation and N-terminal telopeptides of collagen I, NTx, in late-gestation), and maternal bone metabolites (low bone alkaline phosphatase, BALP, in middle-gestation and high BALP in late-gestation). The results show the importance of experimental conditions (ECs) encoding by mixing the information with the serum metabolic data. The best classification models obtained for femur weight (Fw) and length (FI), and humerus weight (Hw) are Support Vector Machines classifiers with the leave-one-out cross-validation accuracy of 1. The rest of the accuracies are 0.98, 0.946 and 0.696 for the diameter of femur (Fd), diameter and length of humerus (Hd, Hl), respectively. 
With the feature importance analysis, the moving averages mixed with ECs are generally more important for the majority of the models. The moving average of parathyroid hormone (PTH) within nutritional conditions (MA-PTH-experim) is important for Fd, Hd and Hl prediction models, but its removal enhances the Fw, Fl and Hw model performance. Further, using one-feature models, it is possible to obtain even more accurate models compared with the feature importance analysis models. In conclusion, machine learning is an efficient method to confirm the important role of PTH and BALP mixed with nutritional conditions for fetal bone growth performance of goats. All the Python scripts, including results and comments, are available in an open repository at https://gitlab.com/muntisa/goat-bones-machine-learning.
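As a rough illustration of the leave-one-out cross-validation accuracy quoted in this abstract, the sketch below holds out one sample at a time, trains on the rest, and reports the fraction of held-out samples predicted correctly. It is a minimal sketch under stated assumptions: it uses a 1-nearest-neighbour rule as a stand-in for the Support Vector Machines used in the study, and the data points, labels, and function names are hypothetical toy values, not taken from the paper or its repository.

```python
def nearest_neighbour_predict(train_x, train_y, query):
    """Return the label of the training point closest to `query` (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    return train_y[best]


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the remainder, and return the hit rate."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_neighbour_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical toy data: two well-separated classes in two dimensions.
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.0), (1.2, 0.8)]
ys = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(xs, ys)
print(f"leave-one-out accuracy: {accuracy:.3f}")
```

With very small datasets, each held-out prediction carries a large share of the score, so a leave-one-out accuracy of 1 on toy data like this says little about generalisation.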
Affiliation(s)
- Yong Liu
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain.,Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña, Spain
| | - Qiongxian Yan
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
| | - Nieves Pedreira
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
| | - Jinhe Kang
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
| | - Shaoxun Tang
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
| | - Chuanshe Zhou
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
| | - Zhixiong He
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
| | - Zhiliang Tan
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, Hunan, China
16
Munteanu CR, Gestal M, Martínez-Acevedo YG, Pedreira N, Pazos A, Dorado J. Improvement of Epitope Prediction Using Peptide Sequence Descriptors and Machine Learning. Int J Mol Sci 2019; 20:ijms20184362. [PMID: 31491969 PMCID: PMC6770149 DOI: 10.3390/ijms20184362] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/31/2019] [Revised: 08/26/2019] [Accepted: 08/30/2019] [Indexed: 01/27/2023] Open
Abstract
In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequence recurrence networks and were mixed under experimental conditions. The new models were generated using 709,100 instances of pair descriptors for query and reference peptide sequences. Using perturbations of the initial descriptors under sequence or assay conditions, 10 transformed features were used as inputs for seven Machine Learning methods. The best model was obtained with random forest classifiers with an Area Under the Receiver Operating Characteristics (AUROC) of 0.981 ± 0.0005 for the external validation series (five-fold cross-validation). The database included information about 83,683 peptide sequences, 1448 epitope organisms, 323 host organisms, 15 types of in vivo processes, 28 experimental techniques, and 505 adjuvant additives. The current model could improve the in silico predictions of epitopes for vaccine design. The script and results are available as a free repository.
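For readers unfamiliar with the AUROC metric reported here, one way to read it is as the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition. The labels, scores, and function name are hypothetical illustrations and are unrelated to the paper's data or code.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs: 1 = epitope, 0 = non-epitope.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
value = auroc(labels, scores)
print(f"AUROC = {value:.3f}")
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations; rank-based formulas give the same value more cheaply on large datasets.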
Affiliation(s)
- Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
| | - Marcos Gestal
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain.
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain.
| | - Yunuen G Martínez-Acevedo
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Unidad Profesional Interdisciplinaria de Biotecnología, National Polytechnic Institute (IPN), Ticoman, 07340 Mexico City, Mexico
| | - Nieves Pedreira
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
| | - Julián Dorado
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
17
Tenorio-Borroto E, Castañedo N, García-Mera X, Rivadeneira K, Vázquez Chagoyán JC, Barbabosa Pliego A, Munteanu CR, González-Díaz H. Perturbation Theory Machine Learning Modeling of Immunotoxicity for Drugs Targeting Inflammatory Cytokines and Study of the Antimicrobial G1 Using Cytometric Bead Arrays. Chem Res Toxicol 2019; 32:1811-1823. [PMID: 31327231 DOI: 10.1021/acs.chemrestox.9b00154] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/18/2023]
Abstract
ChEMBL biological activities prediction for 1-5-bromofur-2-il-2-bromo-2-nitroethene (G1) is a difficult task for cytokine immunotoxicity. The current study presents experimental results for G1 interaction with mouse Th1/Th2 and pro-inflammatory cytokines using a cytometry bead array (CBA). In the in vitro test of CBA, the results show no significant differences between the mean values of the Th1/Th2 cytokines for the samples treated with G1 with respect to the negative control, but there are moderate differences for cytokine values between different periods (24/48 h). The experiments show no significant differences between the mean values of the pro-inflammatory cytokines for the samples treated with G1, regarding the negative control, except for the values of tumor necrosis factor (TNF) and Interleukin (IL6) between the group treated with G1 and the negative control at 48 h. Differences occur for these cytokines in the periods (24/48 h). The study confirmed that the antimicrobial G1 did not alter the Th1/Th2 cytokines concentration in vitro in different periods, but it can alter TNF and IL6. G1 promotes free radicals production and activates damage processes in macrophages culture. In order to predict all ChEMBL activities for drugs in other experimental conditions, a ChEMBL data set was constructed using 25 biological activities, 1366 assays, 2 assay types, 4 assay organisms, 2 organisms, and 12 cytokine targets. Molecular descriptors calculated with Rcpi and 15 machine learning methods were used to find the best model able to predict if a drug could be active or not against a specific cytokine, in specific experimental conditions. The best model is based on 120 selected molecular descriptors and a deep neural network with area under the curve of the receiver operating characteristic of 0.904 and accuracy of 0.832. This model predicted 1384 G1 biological activities against cytokines in all ChEMBL data set experimental conditions.
Affiliation(s)
- Esvieta Tenorio-Borroto
- Department of Organic Chemistry, Faculty of Pharmacy , University of Santiago de Compostela , 15782 Santiago de Compostela , Spain.,Center for Research and Advanced Studies in Animal Health, Faculty of Veterinary Medicines and Animal Husbandry , Autonomous University of Mexico State (UAEM) , 50200 Toluca , México
| | - Nilo Castañedo
- Chemical Bioactive Center (CBQ) , Central University of Las Villas (UCLV) , 50100 Santa Clara , Cuba
| | - Xerardo García-Mera
- Department of Organic Chemistry, Faculty of Pharmacy , University of Santiago de Compostela , 15782 Santiago de Compostela , Spain
| | - Kenneth Rivadeneira
- RNASA-IMEDIR, Computer Science Faculty , University of A Coruna (UDC) , 15071 A Coruña , Spain
| | - Juan Carlos Vázquez Chagoyán
- Center for Research and Advanced Studies in Animal Health, Faculty of Veterinary Medicines and Animal Husbandry , Autonomous University of Mexico State (UAEM) , 50200 Toluca , México
| | - Alberto Barbabosa Pliego
- Center for Research and Advanced Studies in Animal Health, Faculty of Veterinary Medicines and Animal Husbandry , Autonomous University of Mexico State (UAEM) , 50200 Toluca , México
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty , University of A Coruna (UDC) , 15071 A Coruña , Spain.,Biomedical Research Institute of A Coruña (INIBIC) , University Hospital Complex of A Coruña (CHUAC) , 15006 A Coruña , Spain
| | - Humbert González-Díaz
- Department of Organic Chemistry II , University of the Basque Country UPV/EHU , 48940 Leioa , Spain.,IKERBASQUE , Basque Foundation for Science , 48011 Bilbao , Spain
18
Concu R, D. S. Cordeiro MN, Munteanu CR, González-Díaz H. PTML Model of Enzyme Subclasses for Mining the Proteome of Biofuel Producing Microorganisms. J Proteome Res 2019; 18:2735-2746. [DOI: 10.1021/acs.jproteome.8b00949] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022]
Affiliation(s)
- Riccardo Concu
- LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - M. Natália. D. S. Cordeiro
- LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - Cristian R. Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruña, 15071 A Coruña, Spain
- INIBIC Biomedical Research Institute of Coruña, CHUAC University Hospital, 15006 A Coruña, Spain
| | - Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Biscay, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
19
Mato-Abad V, Labiano-Fontcuberta A, Rodríguez-Yáñez S, García-Vázquez R, Munteanu CR, Andrade-Garda J, Domingo-Santos A, Galán Sánchez-Seco V, Aladro Y, Martínez-Ginés ML, Ayuso L, Benito-León J. Classification of radiologically isolated syndrome and clinically isolated syndrome with machine-learning techniques. Eur J Neurol 2019; 26:1000-1005. [PMID: 30714276 DOI: 10.1111/ene.13923] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/23/2018] [Accepted: 01/28/2019] [Indexed: 11/29/2022]
Abstract
BACKGROUND AND PURPOSE The unanticipated detection by magnetic resonance imaging (MRI) in the brain of asymptomatic subjects of white matter lesions suggestive of multiple sclerosis (MS) has been named radiologically isolated syndrome (RIS). As the difference between early MS [i.e. clinically isolated syndrome (CIS)] and RIS is the occurrence of a clinical event, it is logical to improve detection of the subclinical form without interfering with MRI as there are radiological diagnostic criteria for that. Our objective was to use machine-learning classification methods to identify morphometric measures that help to discriminate patients with RIS from those with CIS. METHODS We used a multimodal 3-T MRI approach by combining MRI biomarkers (cortical thickness, cortical and subcortical grey matter volume, and white matter integrity) of a cohort of 17 patients with RIS and 17 patients with CIS for single-subject level classification. RESULTS The best proposed models to predict the diagnosis of CIS and RIS were based on the Naive Bayes, Bagging and Multilayer Perceptron classifiers using only three features: the left rostral middle frontal gyrus volume and the fractional anisotropy values in the right amygdala and right lingual gyrus. The Naive Bayes obtained the highest accuracy [overall classification, 0.765; area under the receiver operating characteristic (AUROC), 0.782]. CONCLUSIONS A machine-learning approach applied to multimodal MRI data may differentiate between the earliest clinical expressions of MS (CIS and RIS) with an accuracy of 78%.
Affiliation(s)
- V Mato-Abad
- ISLA, Computer Science Faculty, A Coruna University, A Coruña
| | | | | | | | - C R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, A Coruna University, A Coruña.,Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña
| | - J Andrade-Garda
- ISLA, Computer Science Faculty, A Coruna University, A Coruña
| | - A Domingo-Santos
- Department of Neurology, University Hospital '12 de Octubre', Madrid
| | | | - Y Aladro
- Department of Neurology, Getafe University Hospital, Getafe
| | | | - L Ayuso
- Department of Neurology, University Hospital 'Principe de Asturias', Alcalá de Henares
| | - J Benito-León
- Department of Neurology, University Hospital '12 de Octubre', Madrid.,Centro de Investigación Biomédica en Red sobre Enfermedades Neurodegenerativas (CIBERNED), Madrid.,Department of Medicine, Complutense University, Madrid, Spain
20
Liu Y, Munteanu CR, Kong Z, Ran T, Sahagún-Ruiz A, He Z, Zhou C, Tan Z. Identification of coenzyme-binding proteins with machine learning algorithms. Comput Biol Chem 2019; 79:185-192. [PMID: 30851647 DOI: 10.1016/j.compbiolchem.2019.01.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/02/2018] [Revised: 09/11/2018] [Accepted: 01/25/2019] [Indexed: 01/12/2023]
Abstract
Coenzyme-binding proteins play a vital role in cellular metabolic processes, such as fatty acid biosynthesis, enzyme and gene regulation, lipid synthesis, particular vesicular traffic, and β-oxidation donation of acyl-CoA esters. Based on the theory of Star Graph Topological Indices (SGTIs) of protein primary sequences, we proposed a method to develop a first classification model for predicting proteins with coenzyme-binding properties. To simulate the properties of coenzyme-binding proteins, we created a dataset containing 2897 proteins, of which 456 have coenzyme-binding activity. The SGTIs of the peptide sequences were calculated with the Sequence to Star Network (S2SNet) application. We used the SGTIs as inputs to several classification techniques in the machine learning software Weka. A Random Forest classifier based on 3 features of the embedded and non-embedded graphs was identified as the best predictive model for coenzyme-binding proteins. The developed model had a true positive (TP) rate of 91.7%, a false positive (FP) rate of 7.6%, and an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.971. The prediction of new coenzyme-binding proteins using this model could be useful for further drug development or enzyme metabolism research.
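The true positive and false positive rates quoted in this abstract are ratios over the confusion matrix of a binary classifier: TP rate = TP / (TP + FN) and FP rate = FP / (FP + TN). The sketch below computes both. The counts are illustrative placeholders chosen for easy arithmetic; they are not the paper's confusion matrix, and the function name is hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # share of actual positives that were detected
    fpr = fp / (fp + tn)  # share of actual negatives that were wrongly flagged
    return tpr, fpr


# Illustrative counts only, not results from the study.
tpr, fpr = rates(tp=90, fn=10, fp=20, tn=180)
print(f"TP rate = {tpr:.3f}, FP rate = {fpr:.3f}")
```

On an imbalanced dataset such as the one described (a minority of binding proteins), reporting both rates is more informative than accuracy alone, since a model that labels everything negative would still score a high accuracy.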
Affiliation(s)
- Yong Liu
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain; Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña, 15006, Spain
| | - Zhiwei Kong
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; University of the Chinese Academy of Sciences, Beijing, 100049, PR China
| | - Tao Ran
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, Alberta, T1J 4B1, Canada
| | - Alfredo Sahagún-Ruiz
- Department of Microbiology and Immunology, Faculty of Veterinary Medicine and Animal Science, National Autonomous University of Mexico, Universidad 3000, Copilco Coyoacán, CP 04510, México D.F., Mexico
| | - Zhixiong He
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China.
| | - Chuanshe Zhou
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China
| | - Zhiliang Tan
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China
21
Ferreira da Costa J, Silva D, Caamaño O, Brea JM, Loza MI, Munteanu CR, Pazos A, García-Mera X, González-Díaz H. Perturbation Theory/Machine Learning Model of ChEMBL Data for Dopamine Targets: Docking, Synthesis, and Assay of New l-Prolyl-l-leucyl-glycinamide Peptidomimetics. ACS Chem Neurosci 2018; 9:2572-2587. [PMID: 29791132 DOI: 10.1021/acschemneuro.8b00083] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 12/14/2022] Open
Abstract
Predicting drug-protein interactions (DPIs) for target proteins involved in dopamine pathways is a very important goal in medicinal chemistry. We can tackle this problem using Molecular Docking or Machine Learning (ML) models for one specific protein. Unfortunately, these models fail to account for large and complex big data sets of preclinical assays reported in public databases. This includes multiple conditions of assays, such as different experimental parameters, biological assays, target proteins, cell lines, organism of the target, or organism of assay. On the other hand, perturbation theory (PT) models allow us to predict the properties of a query compound or molecular system in experimental assays with multiple boundary conditions based on a previously known case of reference. In this work, we report the first PTML (PT + ML) study of a large ChEMBL data set of preclinical assays of compounds targeting dopamine pathway proteins. The best PTML model found predicts 50000 cases with accuracy of 70-91% in training and external validation series. We also compared the linear PTML model with alternative PTML models trained with multiple nonlinear methods (artificial neural network (ANN), Random Forest, Deep Learning, etc.). Some of the nonlinear methods outperform the linear model but at the cost of a notable increment of the complexity of the model. We illustrated the practical use of the new model with a proof-of-concept theoretical-experimental study. We reported for the first time the organic synthesis, chemical characterization, and pharmacological assay of a new series of l-prolyl-l-leucyl-glycinamide (PLG) peptidomimetic compounds. In addition, we performed a molecular docking study for some of these compounds with the software Vina AutoDock. The work ends with a PTML model predictive study of the outcomes of the new compounds in a large number of assays. 
Therefore, this study offers a new computational methodology for predicting the outcome for any compound in new assays. This PTML method focuses on the prediction with a simple linear model of multiple pharmacological parameters (IC50, EC50, Ki, etc.) for compounds in assays involving different cell lines used, organisms of the protein target, or organism of assay for proteins in the dopamine pathway.
Affiliation(s)
- Joana Ferreira da Costa
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - David Silva
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Olga Caamaño
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - José M. Brea
- CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Department of Pharmacology, Pharmacy and Pharmaceutical Technology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Maria Isabel Loza
- CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Department of Pharmacology, Pharmacy and Pharmaceutical Technology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Cristian R. Munteanu
- Instituto de Investigacion Biomedica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, 15006, Spain
| | - Alejandro Pazos
- Instituto de Investigacion Biomedica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, 15006, Spain
- Computer Science Department, Faculty of Computer Science, University of A Coruna, 15071 A Coruña, Spain
| | - Xerardo García-Mera
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
22
López-Cortés A, Paz-Y-Miño C, Cabrera-Andrade A, Barigye SJ, Munteanu CR, González-Díaz H, Pazos A, Pérez-Castillo Y, Tejera E. Gene prioritization, communality analysis, networking and metabolic integrated pathway to better understand breast cancer pathogenesis. Sci Rep 2018; 8:16679. [PMID: 30420728 PMCID: PMC6232116 DOI: 10.1038/s41598-018-35149-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 04/13/2018] [Accepted: 10/16/2018] [Indexed: 12/30/2022] Open
Abstract
The consensus strategy has proved to be highly efficient in the recognition of gene-disease associations. Therefore, the main objective of this study was to apply theoretical approaches to explore genes and communities directly involved in breast cancer (BC) pathogenesis. We evaluated the consensus between 8 prioritization strategies for the early recognition of pathogenic genes. A communality analysis in the protein-protein interaction (PPi) network of previously selected genes was enriched with gene ontology, metabolic pathways, as well as oncogenomics validation with the OncoPPi and DRIVE projects. The consensus genes were rationally filtered to 1842 genes. The communality analysis showed an enrichment of 14 communities specially connected with ERBB, PI3K-AKT, mTOR, FOXO, p53, HIF-1, VEGF, MAPK and prolactin signaling pathways. The genes with the highest ranking were TP53, ESR1, BRCA2, BRCA1 and ERBB2. The genes with the highest connectivity degree were TP53, AKT1, SRC, CREBBP and EP300. The connectivity degree allowed us to establish a significant correlation between the OncoPPi network and our BC integrated network, formed by 51 genes and 62 PPi. In addition, CCND1, RAD51, CDC42, YAP1 and RPA1 were functional genes with significant sensitivity scores in BC cell lines. In conclusion, the consensus strategy identifies both well-known pathogenic genes and prioritized genes that need to be further explored.
Affiliation(s)
- Andrés López-Cortés
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, 170129, Quito, Ecuador.
- RNASA-IMEDIR, Computer Sciences Faculty, University of Coruna, 15071, Coruna, Spain.
| | - César Paz-Y-Miño
- Centro de Investigación Genética y Genómica, Facultad de Ciencias de la Salud Eugenio Espejo, Universidad UTE, Mariscal Sucre Avenue, 170129, Quito, Ecuador
| | - Alejandro Cabrera-Andrade
- Carrera de Enfermería, Facultad de Ciencias de la Salud, Universidad de las Américas, Avenue de los Granados, 170125, Quito, Ecuador
- Grupo de Bio-Quimioinformática, Universidad de las Américas, Avenue de los Granados, 170125, Quito, Ecuador
| | - Stephen J Barigye
- Department of Chemistry, McGill University, 801 Sherbrooke Street West, Montreal, QC, H3A 0B8, Canada
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Sciences Faculty, University of Coruna, 15071, Coruna, Spain
- INIBIC, Institute of Biomedical Research, CHUAC, UDC, 15006, Coruna, Spain
| | - Humberto González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Biscay, Spain
- IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Biscay, Spain
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Sciences Faculty, University of Coruna, 15071, Coruna, Spain
- INIBIC, Institute of Biomedical Research, CHUAC, UDC, 15006, Coruna, Spain
| | - Yunierkis Pérez-Castillo
- Grupo de Bio-Quimioinformática, Universidad de las Américas, Avenue de los Granados, 170125, Quito, Ecuador
- Escuela de Ciencias Físicas y Matemáticas, Universidad de las Américas, Avenue de los Granados, 170125, Quito, Ecuador
| | - Eduardo Tejera
- Grupo de Bio-Quimioinformática, Universidad de las Américas, Avenue de los Granados, 170125, Quito, Ecuador.
- Facultad de Ingeniería y Ciencias Agropecuarias, Universidad de las Américas, Avenue de los Granados, 170125, Quito, Ecuador.
23
Barreiro E, Munteanu CR, Cruz-Monteagudo M, Pazos A, González-Díaz H. Net-Net Auto Machine Learning (AutoML) Prediction of Complex Ecosystems. Sci Rep 2018; 8:12340. [PMID: 30120369 PMCID: PMC6098100 DOI: 10.1038/s41598-018-30637-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/09/2018] [Accepted: 07/24/2018] [Indexed: 11/09/2022] Open
Abstract
Biological Ecosystem Networks (BENs) are webs of biological species (nodes) establishing trophic relationships (links). Experimental confirmation of all possible links is difficult and generates a huge volume of information. Consequently, computational prediction becomes an important goal. Artificial Neural Networks (ANNs) are Machine Learning (ML) algorithms that may be used to predict BENs, using as input Shannon entropy information measures (Shk) of known ecosystems to train them. However, it is difficult to select a priori which ANN topology will have a higher accuracy. Interestingly, Auto Machine Learning (AutoML) methods focus on the automatic selection of the more efficient ML algorithms for specific problems. In this work, a preliminary study of a new approach to AutoML selection of ANNs is proposed for the prediction of BENs. We call it the Net-Net AutoML approach, because it uses for the first time Shk values of both networks involving BENs (networks to be predicted) and ANN topologies (networks to be tested). Twelve types of classifiers have been tested for the Net-Net model including linear, Bayesian, trees-based methods, multilayer perceptrons and deep neuronal networks. The best Net-Net AutoML model for 338,050 outputs of 10 ANN topologies for links of 69 BENs was obtained with a deep fully connected neuronal network, characterized by a test accuracy of 0.866 and a test AUROC of 0.935. This work paves the way for the application of Net-Net AutoML to other systems or ML algorithms.
Affiliation(s)
- Enrique Barreiro
- Department of Computation, Computer Science Faculty, University of A Coruna (UDC), 15071, A Coruña, Spain.,Center for Computational Science (CCS), University of Miami (UM), Miami, 33136, FL, USA.,West Coast University, Miami Campus, 33178, FL, USA
| | - Cristian R Munteanu
- Department of Computation, Computer Science Faculty, University of A Coruna (UDC), 15071, A Coruña, Spain
| | - Maykel Cruz-Monteagudo
- Center for Computational Science (CCS), University of Miami (UM), Miami, 33136, FL, USA.,West Coast University, Miami Campus, 33178, FL, USA
| | - Alejandro Pazos
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña, 15006, Spain
| | - Humbert González-Díaz
- Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Biscay, Spain.,IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Biscay, Spain.
24
González-Durruthy M, Monserrat JM, Rasulev B, Casañola-Martín GM, Barreiro Sorrivas JM, Paraíso-Medina S, Maojo V, González-Díaz H, Pazos A, Munteanu CR. Carbon Nanotubes' Effect on Mitochondrial Oxygen Flux Dynamics: Polarography Experimental Study and Machine Learning Models using Star Graph Trace Invariants of Raman Spectra. Nanomaterials (Basel) 2017; 7:nano7110386. [PMID: 29137126 PMCID: PMC5707603 DOI: 10.3390/nano7110386] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/07/2017] [Revised: 11/06/2017] [Accepted: 11/08/2017] [Indexed: 11/16/2022]
Abstract
This study presents the impact of carbon nanotubes (CNTs) on mitochondrial oxygen mass flux (Jm) under three experimental conditions. New experimental results and a new methodology are reported for the first time, based on the star graph transform (spectral moments) of CNT Raman spectra and on perturbation theory. The experimental measures of Jm showed that no tested CNT family can inhibit the oxygen consumption profiles of mitochondria. The best model for the prediction of Jm for other CNTs was provided by random forest using eight features, obtaining a test R-squared (R2) of 0.863 and a test root-mean-square error (RMSE) of 0.0461. The results demonstrate the capability of encoding CNT information into spectral moments of the Raman star graph (SG) transform, with potential applicability as predictive tools in nanotechnology and material risk assessments.
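For readers unfamiliar with the two regression metrics quoted above, the following minimal sketch shows how R2 and RMSE are commonly computed from test-set predictions. It assumes the coefficient-of-determination form of R2 (some packages report the squared correlation instead), and the numbers are made up for illustration, not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted values for a small test subset.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(y_true, y_pred), 3), round(rmse(y_true, y_pred), 4))
```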
Affiliation(s)
- Michael González-Durruthy
- Institute of Biological Science (ICB), Federal University of Rio Grande, Rio Grande, RS 96270-900, Brazil.
| | - Jose M Monserrat
- Institute of Biological Science (ICB), Federal University of Rio Grande, Rio Grande, RS 96270-900, Brazil.
| | - Bakhtiyor Rasulev
- Department of Coatings and Polymeric Materials, North Dakota State University (NDSU), Fargo, ND 58102, USA.
| | | | - José María Barreiro Sorrivas
- Computer Science School (ETSIINF), Polytechnic University of Madrid (UPM), Calle de los Ciruelos, Boadilla del Monte, 28660 Madrid, Spain.
| | - Sergio Paraíso-Medina
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, 28660 Madrid, Spain.
| | - Víctor Maojo
- Biomedical Informatics Group, Artificial Intelligence Department, Polytechnic University of Madrid, Calle de los Ciruelos, Boadilla del Monte, 28660 Madrid, Spain.
| | - Humberto González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Biscay, Spain.
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain.
| | - Alejandro Pazos
- INIBIC Institute of Biomedical Research, CHUAC, UDC, 15006 Coruña, Spain.
- RNASA-IMEDIR, Computer Sciences Faculty, University of Coruña, 15071 Coruña, Spain.
| | - Cristian R Munteanu
- INIBIC Institute of Biomedical Research, CHUAC, UDC, 15006 Coruña, Spain.
- RNASA-IMEDIR, Computer Sciences Faculty, University of Coruña, 15071 Coruña, Spain.
|
25
|
González-Durruthy M, Werhli AV, Seus V, Machado KS, Pazos A, Munteanu CR, González-Díaz H, Monserrat JM. Decrypting Strong and Weak Single-Walled Carbon Nanotubes Interactions with Mitochondrial Voltage-Dependent Anion Channels Using Molecular Docking and Perturbation Theory. Sci Rep 2017; 7:13271. [PMID: 29038520 PMCID: PMC5643473 DOI: 10.1038/s41598-017-13691-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/02/2017] [Accepted: 09/25/2017] [Indexed: 01/30/2023] Open
Abstract
The current molecular docking study provided the Free Energy of Binding (FEB) for the interaction (nanotoxicity) between VDAC mitochondrial channels of three species (VDAC1-Mus musculus, VDAC1-Homo sapiens, VDAC2-Danio rerio) and SWCNT-H, SWCNT-OH, and SWCNT-COOH carbon nanotubes. The general results showed that the FEB values were statistically more negative (p < 0.05) in the following order: (SWCNT-VDAC2-Danio rerio) > (SWCNT-VDAC1-Mus musculus) > (SWCNT-VDAC1-Homo sapiens) > (ATP-VDAC). More negative FEB values for SWCNT-COOH and SWCNT-OH were found in VDAC2-Danio rerio when compared with VDAC1-Mus musculus and VDAC1-Homo sapiens (p < 0.05). In addition, a significant correlation (0.66 < r2 < 0.97) was observed between the n-Hamada index and VDAC nanotoxicity (or FEB) for the zigzag topologies of SWCNT-COOH and SWCNT-OH. Predictive Nanoparticles-Quantitative-Structure Binding-Relationship models (nano-QSBR) for strong and weak SWCNT-VDAC docking interactions were built using Perturbation Theory, regression and classification models. Thus, 405 SWCNT-VDAC interactions were predicted using a nano-PT-QSBR classification model with high accuracy, specificity, and sensitivity (73–98%) in training and validation series, and a maximum AUROC value of 0.978. In addition, the best regression model was obtained with Random Forest (R2 of 0.833, RMSE of 0.0844), suggesting an excellent potential to predict SWCNT-VDAC channel nanotoxicity. All study data are available at https://doi.org/10.6084/m9.figshare.4802320.v2.
Affiliation(s)
- Michael González-Durruthy
- Institute of Biological Sciences (ICB)- Federal University of Rio Grande - FURG, Postgraduate Program in Physiological Sciences, Cx. P. 474, CEP 96200-970, Rio Grande, RS, Brazil.
| | - Adriano V Werhli
- Center of Computational Sciences (C3)- Federal University of Rio Grande - FURG, Cx. P. 474, CEP 96200-970, Rio Grande, RS, Brazil
| | - Vinicius Seus
- Center of Computational Sciences (C3)- Federal University of Rio Grande - FURG, Cx. P. 474, CEP 96200-970, Rio Grande, RS, Brazil
| | - Karina S Machado
- Center of Computational Sciences (C3)- Federal University of Rio Grande - FURG, Cx. P. 474, CEP 96200-970, Rio Grande, RS, Brazil
| | - Alejandro Pazos
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña, 15006, Spain; RNASA-IMEDIR, Computer Science Faculty, University of A Coruña, Campus de Elviña s/n, 15071, A Coruña, Spain
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruña, Campus de Elviña s/n, 15071, A Coruña, Spain
| | - Humberto González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Spain; IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
| | - José M Monserrat
- Institute of Biological Sciences (ICB)- Federal University of Rio Grande - FURG, Postgraduate Program in Physiological Sciences, Cx. P. 474, CEP 96200-970, Rio Grande, RS, Brazil
|
26
|
Deng Y, Liu Y, Tang S, Zhou C, Han X, Xiao W, Pastur-Romay LA, Vazquez-Naya JM, Loureiro JP, Munteanu CR, Tan Z. General Machine Learning Model, Review, and Experimental-Theoretic Study of Magnolol Activity in Enterotoxigenic Induced Oxidative Stress. Curr Top Med Chem 2017; 17:2977-2988. [DOI: 10.2174/1568026617666170821130315] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/15/2017] [Revised: 05/15/2017] [Accepted: 06/15/2017] [Indexed: 11/22/2022]
Affiliation(s)
- Yanli Deng
- National Research Center of Engineering Technology for Utilization of Botanical Functional Ingredients, Hunan Agricultural University, Changsha, Hunan 410128, China
| | - Yong Liu
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan 410125, China
| | - Shaoxun Tang
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan 410125, China
| | - Chuanshe Zhou
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan 410125, China
| | - Xuefeng Han
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan 410125, China
| | - Wenjun Xiao
- National Research Center of Engineering Technology for Utilization of Botanical Functional Ingredients, Hunan Agricultural University, Changsha, Hunan 410128, China
| | - Lucas Anton Pastur-Romay
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, Campus de Elvina s/n, 15071, A Coruna, Spain
| | - Jose Manuel Vazquez-Naya
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, Campus de Elvina s/n, 15071, A Coruna, Spain
| | | | - Cristian R. Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, Campus de Elvina s/n, 15071, A Coruna, Spain
| | - Zhiliang Tan
- Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan 410125, China
|
27
|
Perez-Rey D, Alonso-Calvo R, Paraiso-Medina S, Munteanu CR, Garcia-Remesal M. SNOMED2HL7: A tool to normalize and bind SNOMED CT concepts to the HL7 Reference Information Model. Comput Methods Programs Biomed 2017; 149:1-9. [PMID: 28802325 DOI: 10.1016/j.cmpb.2017.06.020] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/21/2016] [Revised: 04/19/2017] [Accepted: 06/28/2017] [Indexed: 06/07/2023]
Abstract
BACKGROUND Current clinical research and practice require interoperability among systems in a complex and highly dynamic domain. There has been a significant effort in recent years to develop integrative common data models and domain terminologies. Such efforts have not completely solved the challenges associated with clinical data that are distributed among different and heterogeneous institutions with different systems to encode the information. Currently, when providing homogeneous interfaces to exploit clinical data, certain transformations still involve manual and time-consuming processes that could be automated. OBJECTIVES There is a lack of tools to support data experts adopting clinical standards. This absence is especially significant when links between data model and vocabulary are required. The objective of this work is to present SNOMED2HL7, a novel tool to automatically link biomedical concepts from widely used terminologies, and the corresponding clinical context, to the HL7 Reference Information Model (RIM). METHODS Based on the recommendations of the International Health Terminology Standards Development Organisation (IHTSDO), the SNOMED Normal Form has been implemented within SNOMED2HL7 to decompose concepts and to reduce the number of options for storing the same information. The binding of clinical terminologies to HL7 RIM components is the core of SNOMED2HL7, where terminology concepts have been annotated with the corresponding options within the interoperability standard. A web-based tool has been developed to automatically provide information from the normalization mechanisms and the terminology binding. RESULTS SNOMED2HL7 binding coverage includes the majority of the concepts used to annotate legacy systems. It follows HL7 recommendations to solve binding overlaps and provides the binding of the normalized version of the concepts. The first version of the tool, available at http://kandel.dia.fi.upm.es:8078, has been validated in EU-funded projects to integrate real-world data for clinical research, with an accuracy of 88.47%. CONCLUSIONS This paper presents the first initiative to automatically retrieve concept-centered information required to transform legacy data into widely adopted interoperability standards. Although additional functionality will extend capabilities to automate data transformations, SNOMED2HL7 already provides the functionality required for the clinical interoperability community.
Affiliation(s)
- D Perez-Rey
- Biomedical Informatics Group, School of Computer Science, Universidad Politecnica de Madrid. Campus de Montegancedo, s/n, 28660, Boadilla del Monte, Madrid, Spain.
| | - R Alonso-Calvo
- Biomedical Informatics Group, School of Computer Science, Universidad Politecnica de Madrid. Campus de Montegancedo, s/n, 28660, Boadilla del Monte, Madrid, Spain
| | - S Paraiso-Medina
- Biomedical Informatics Group, School of Computer Science, Universidad Politecnica de Madrid. Campus de Montegancedo, s/n, 28660, Boadilla del Monte, Madrid, Spain
| | - C R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain; Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, Spain
| | - M Garcia-Remesal
- Biomedical Informatics Group, School of Computer Science, Universidad Politecnica de Madrid. Campus de Montegancedo, s/n, 28660, Boadilla del Monte, Madrid, Spain
|
28
|
Liu Y, Munteanu CR, Fernandez-Lozano C, Pazos A, Ran T, Tan Z, Yu Y, Zhou C, Tang S, González-Díaz H. Experimental Study and ANN Dual-Time Scale Perturbation Model of Electrokinetic Properties of Microbiota. Front Microbiol 2017; 8:1216. [PMID: 28713345 PMCID: PMC5491601 DOI: 10.3389/fmicb.2017.01216] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 04/05/2017] [Accepted: 06/14/2017] [Indexed: 12/18/2022] Open
Abstract
The electrokinetic properties of the rumen microbiota are involved in cell surface adhesion and microbial metabolism. An in vitro study was carried out in batch culture to determine the effects of three levels of specific surface area (SSA) of biomaterials and four levels of surface tension (ST) of culture medium on electrokinetic properties (Zeta potential, ξ; electrokinetic mobility, μe), fermentation parameters (volatile fatty acids, VFAs), and ST over fermentation processes (ST-a, γ). The obtained results were combined with previously published data (digestibility, D; pH; concentration of ammonia nitrogen, c(NH3-N)) to establish a predictive artificial neural network (ANN) model. Concepts of dual-time series analysis, perturbation theory (PT), and Box-Jenkins Operators were applied for the first time to develop an ANN model to predict the variations of the electrokinetic properties of microbiota. The best dual-time series Radial Basis Functions (RBF) model for ξ of rumen microbiota predicted ξ for >30,000 cases with a correlation coefficient >0.8. This model provided insight into the correlations between the electrokinetic property (zeta potential) of rumen microbiota and the perturbations of physical factors (specific surface area and surface tension) of media, digestibility of substrate, and their metabolites (NH3-N, VFAs) in relation to environmental factors.
Affiliation(s)
- Yong Liu
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South-Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
| | | | - Carlos Fernandez-Lozano
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
- Instituto de Investigación Biomédica de A Coruña, Complexo Hospitalario Universitario de A Coruña, A Coruña, Spain
| | - Alejandro Pazos
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
- Instituto de Investigación Biomédica de A Coruña, Complexo Hospitalario Universitario de A Coruña, A Coruña, Spain
| | - Tao Ran
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South-Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
| | - Zhiliang Tan
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South-Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
- Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, China
| | - Yizun Yu
- Institute of Biological Resources, Jiangxi Academy of Sciences, Jiangxi, China
| | - Chuanshe Zhou
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South-Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
- Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, China
| | - Shaoxun Tang
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, Hunan Research Center of Livestock and Poultry Sciences, South-Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
- Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, China
| | - Humberto González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
|
29
|
González-Durruthy M, Alberici LC, Curti C, Naal Z, Atique-Sawazaki DT, Vázquez-Naya JM, González-Díaz H, Munteanu CR. Experimental-Computational Study of Carbon Nanotube Effects on Mitochondrial Respiration: In Silico Nano-QSPR Machine Learning Models Based on New Raman Spectra Transform with Markov-Shannon Entropy Invariants. J Chem Inf Model 2017; 57:1029-1044. [PMID: 28414908 DOI: 10.1021/acs.jcim.6b00458] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 12/19/2022]
Abstract
The study of selective toxicity of carbon nanotubes (CNTs) on mitochondria (CNT-mitotoxicity) is of major interest for future biomedical applications. In the current work, the mitochondrial oxygen consumption (E3) is measured under three experimental conditions by exposure to pristine and oxidized CNTs (hydroxylated and carboxylated). Respiratory functional assays showed that the information from CNT Raman spectroscopy could be useful to predict structural parameters of mitotoxicity induced by CNTs. The in vitro functional assays show that the mitochondrial oxidative phosphorylation by ATP-synthase (or state V3 of respiration) was not perturbed in isolated rat-liver mitochondria. For the first time, a star graph (SG) transform of the CNT Raman spectra is proposed in order to obtain the raw information for a nano-QSPR model. Box-Jenkins and perturbation theory operators are used for the SG Shannon entropies. A modified RRegrs methodology is employed to test four regression methods: multiple linear regression (LM), partial least squares regression (PLS), neural networks regression (NN), and random forest (RF). RF provides the best models to predict the mitochondrial oxygen consumption in the presence of specific CNTs, with R2 of 0.998-0.999 and RMSE of 0.0068-0.0133 (training and test subsets). This work aims to demonstrate that the SG transform of Raman spectra is useful to encode CNT information, similarly to the SG transform of the blood proteome spectra in cancer or electroencephalograms in epilepsy, and also as a prospective chemoinformatics tool for nanorisk assessment. All data files and R object models are available at https://dx.doi.org/10.6084/m9.figshare.3472349 .
Affiliation(s)
| | | | | | | | | | - José M Vázquez-Naya
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna , Campus de Elviña s/n, 15071 A Coruña, Spain
| | - Humberto González-Díaz
- Department of Organic Chemistry II, Faculty of Science and Technology, University of the Basque Country UPV/EHU, 48940, Leioa, Bizkaia, Spain; IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Bizkaia, Spain
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071 A Coruña, Spain; Instituto de Investigación Biomédica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, 15006, Spain
|
30
|
Fernandez-Lozano C, Gestal M, Munteanu CR, Dorado J, Pazos A. A methodology for the design of experiments in computational intelligence with multiple regression models. PeerJ 2016; 4:e2721. [PMID: 27920952 PMCID: PMC5136129 DOI: 10.7717/peerj.2721] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 09/02/2016] [Accepted: 10/25/2016] [Indexed: 01/23/2023] Open
Abstract
The design of experiments and the validation of the results achieved with them are vital in any research study. This paper focuses on the use of different Machine Learning approaches for regression tasks in the field of Computational Intelligence, and especially on a correct comparison between the results provided by different methods, as those techniques are complex systems that require further study to be fully understood. A methodology commonly accepted in Computational Intelligence is implemented in an R package called RRegrs. This package includes ten simple and complex regression models to carry out predictive modeling using Machine Learning and well-known regression algorithms. The framework for experimental design presented herein is evaluated and validated against RRegrs. Our results are different for three out of five state-of-the-art simple datasets, and it can be stated that the selection of the best model according to our proposal is statistically significant and relevant. It is important to use a statistical approach to indicate whether the differences are statistically significant when using this kind of algorithm. Furthermore, our results with three real complex datasets report different best models than the previously published methodology. Our final goal is to provide a complete methodology for the use of different steps in order to compare the results obtained in Computational Intelligence problems, as well as in other fields, such as bioinformatics, cheminformatics, etc., given that our proposal is open and modifiable.
Affiliation(s)
- Carlos Fernandez-Lozano
- Information and Communications Technologies Department, University of A Coruna , A Coruña , Spain
| | - Marcos Gestal
- Information and Communications Technologies Department, University of A Coruna , A Coruña , Spain
| | - Cristian R Munteanu
- Information and Communications Technologies Department, University of A Coruna , A Coruña , Spain
| | - Julian Dorado
- Information and Communications Technologies Department, University of A Coruna , A Coruña , Spain
| | - Alejandro Pazos
- Information and Communications Technologies Department, University of A Coruna, A Coruña, Spain; Complexo Hospitalario Universitario de A Coruña (CHUAC), Instituto de Investigacion Biomedica de A Coruña (INIBIC), A Coruña, Spain
|
31
|
Munteanu CR, Aguiar-Pulido V, Freire A, Martínez-Romero M, Porto-Pazos AB, Pereira J, Dorado J. Graph-Based Processing of Macromolecular Information. Curr Bioinform 2015. [DOI: 10.2174/1574893610666151008012438] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
|
32
|
Fernandez-Lozano C, Cuiñas RF, Seoane JA, Fernández-Blanco E, Dorado J, Munteanu CR. Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models. J Theor Biol 2015; 384:50-8. [DOI: 10.1016/j.jtbi.2015.07.038] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 03/23/2015] [Revised: 07/20/2015] [Accepted: 07/27/2015] [Indexed: 12/11/2022]
|
33
|
Tsiliki G, Munteanu CR, Seoane JA, Fernandez-Lozano C, Sarimveis H, Willighagen EL. RRegrs: an R package for computer-aided model selection with multiple regression models. J Cheminform 2015; 7:46. [PMID: 26379782 PMCID: PMC4570700 DOI: 10.1186/s13321-015-0094-2] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 04/07/2015] [Accepted: 08/24/2015] [Indexed: 11/25/2022] Open
Abstract
Background Predictive regression models can be created with many different modelling approaches. Choices need to be made for data set splitting, cross-validation methods, specific regression parameters and best model criteria, as they all affect the accuracy and efficiency of the produced predictive models, thereby raising model reproducibility and comparison issues. Cheminformatics and bioinformatics use predictive modelling extensively and exhibit a need for standardization of these methodologies in order to assist model selection and speed up the process of predictive model development. A tool accessible to all users, irrespective of their statistical knowledge, would be valuable if it tests several simple and complex regression models and validation schemes, produces unified reports, and offers the option to be integrated into more extensive studies. Additionally, such a methodology should be implemented as a free programming package, in order to be continuously adapted and redistributed by others. Results We propose an integrated framework for creating multiple regression models, called RRegrs. The tool offers the option of ten simple and complex regression methods combined with repeated 10-fold and leave-one-out cross-validation. Methods include Multiple Linear regression, Generalized Linear Model with Stepwise Feature Selection, Partial Least Squares regression, Lasso regression, and Support Vector Machines Recursive Feature Elimination. The new framework is an automated, fully validated procedure which produces standardized reports to quickly oversee the impact of choices in modelling algorithms and assess the model and cross-validation results. The methodology was implemented as an open source R package, available at https://www.github.com/enanomapper/RRegrs, by reusing and extending the caret package. Conclusion The universality of the new methodology is demonstrated using five standard data sets from different scientific fields. Its efficiency in cheminformatics and QSAR modelling is shown with three use cases: proteomics data for surface-modified gold nanoparticles, nano-metal oxides descriptor data, and molecular descriptors for acute aquatic toxicity data. The results show that for all data sets RRegrs reports models with equal or better performance for both training and test sets than those reported in the original publications. Its good performance as well as its adaptability in terms of parameter optimization could make RRegrs a popular framework to assist the initial exploration of predictive models, and with that, the design of more comprehensive in silico screening applications. RRegrs is a computer-aided model selection framework for R multiple regression models; this is a fully validated procedure with application to QSAR modelling.
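The two validation schemes named in the abstract, repeated 10-fold and leave-one-out cross-validation, can be sketched as index generators. This is an illustrative Python sketch only: RRegrs itself is an R package built on caret, and the function names, seeds, and fold assignment below are assumptions, not its implementation.

```python
import random


def k_fold_indices(n_samples, k, seed):
    """Yield (train, test) index lists for one shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def repeated_k_fold(n_samples, k=10, repeats=3, seed=0):
    """Repeat k-fold splitting with a different shuffle on each repeat."""
    for r in range(repeats):
        yield from k_fold_indices(n_samples, k, seed + r)


def leave_one_out(n_samples):
    """Yield (train, test) pairs where each sample is the test set once."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


# Hypothetical data set of 25 samples, 10 folds repeated twice.
splits = list(repeated_k_fold(25, k=10, repeats=2))
print(len(splits))
```

A model would be fitted on each `train` list and scored on the matching `test` list, with the scores averaged over all splits.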
Affiliation(s)
- Georgia Tsiliki
- School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechneiou Street, Zografou Campus, 15780 Athens, Greece
| | - Cristian R Munteanu
- Computer Science Faculty, University of A Coruna, Campus Elviña, s/n, 15071 A Coruña, Spain; Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, P.O. Box 616, UNS50 Box 19, 6200 MD Maastricht, The Netherlands
| | - Jose A Seoane
- Stanford Cancer Institute, Stanford University, C.J.Huang Building, 780 Welch Road, Palo Alto, CA 94304 USA
| | | | - Haralambos Sarimveis
- School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechneiou Street, Zografou Campus, 15780 Athens, Greece
| | - Egon L Willighagen
- Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, P.O. Box 616, UNS50 Box 19, 6200 MD Maastricht, The Netherlands
|
34
|
Liu Y, Munteanu CR, Fernández Blanco E, Tan Z, Santos Del Riego A, Pazos A. Prediction of Nucleotide Binding Peptides Using Star Graph Topological Indices. Mol Inform 2015; 34:736-41. [PMID: 27491034 DOI: 10.1002/minf.201500064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/29/2015] [Accepted: 07/06/2015] [Indexed: 01/14/2023]
Abstract
The nucleotide binding proteins are involved in many important cellular processes, such as transmission of genetic information or energy transfer and storage. Therefore, the screening of new peptides for this biological function is an important research topic. The current study proposes a mixed methodology to obtain the first classification model that is able to predict new nucleotide binding peptides, using only the amino acid sequence. Thus, the methodology uses a Star graph molecular descriptor of the peptide sequences and the Machine Learning technique for the best classifier. The best model represents a Random Forest classifier based on two features of the embedded and non-embedded graphs. The performance of the model is excellent, considering similar models in the field, with an Area Under the Receiver Operating Characteristic Curve (AUROC) value of 0.938 and true positive rate (TPR) of 0.886 (test subset). The prediction of new nucleotide binding peptides with this model could be useful for drug target studies in drug development.
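The two classification metrics quoted above can be computed as in the minimal sketch below. It assumes binary labels (1 = nucleotide binding, 0 = non-binding) and uses the pairwise-ranking definition of AUROC; the labels, scores, and the 0.5 decision threshold are hypothetical and not taken from the study.

```python
def true_positive_rate(y_true, y_pred):
    """TPR (sensitivity): fraction of actual positives predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical test-subset labels and classifier scores.
y_true = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.1, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
print(round(true_positive_rate(y_true, y_pred), 3), round(auroc(y_true, scores), 3))
```

The pairwise form is quadratic in the number of samples; it is chosen here for clarity rather than speed.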
Affiliation(s)
- Yong Liu
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160; Faculty of Veterinary Medicine and Animal Science, Autonomous University of the State of Mexico, Toluca, 50090, México; Key Laboratory of Subtropical Agro-ecological Engineering, Institute of Subtropical Agriculture, the Chinese Academy of Sciences, Changsha, Hunan, 410125, P. R. China
| | - Cristian R Munteanu
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160.
| | - Enrique Fernández Blanco
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160
| | - Zhiliang Tan
- Key Laboratory of Subtropical Agro-ecological Engineering, Institute of Subtropical Agriculture, the Chinese Academy of Sciences, Changsha, Hunan, 410125, P. R. China
| | - Antonino Santos Del Riego
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160
| | - Alejandro Pazos
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160
| |
Collapse
|
35
|
Munteanu CR, Pimenta AC, Fernandez-Lozano C, Melo A, Cordeiro MNDS, Moreira IS. Solvent accessible surface area-based hot-spot detection methods for protein-protein and protein-nucleic acid interfaces. J Chem Inf Model 2015; 55:1077-86. [PMID: 25845030 DOI: 10.1021/ci500760m] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 01/07/2023]
Abstract
Due to the importance of hot-spots (HS) detection and the efficiency of computational methodologies, several HS detecting approaches have been developed. The current paper presents new models to predict HS for protein-protein and protein-nucleic acid interactions with better statistics compared with the ones currently reported in literature. These models are based on solvent accessible surface area (SASA) and genetic conservation features subjected to simple Bayes networks (protein-protein systems) and a more complex multi-objective genetic algorithm-support vector machine algorithms (protein-nucleic acid systems). The best models for these interactions have been implemented in two free Web tools.
Affiliation(s)
- Cristian R Munteanu
  - †Information and Communication Technologies Department, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071 A Coruña, Spain
- António C Pimenta
  - ‡REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Carlos Fernandez-Lozano
  - †Information and Communication Technologies Department, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071 A Coruña, Spain
- André Melo
  - ‡REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Maria N D S Cordeiro
  - ‡REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- Irina S Moreira
  - ‡REQUIMTE/Departamento de Química e Bioquímica, Faculdade de Ciências da Universidade do Porto, Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
  - §CNC-Center for Neuroscience and Cell Biology, Universidade de Coimbra, Rua Larga, FMUC, Polo I, 1°andar, 3004-517 Coimbra, Portugal
|
36
|
Hastings J, Jeliazkova N, Owen G, Tsiliki G, Munteanu CR, Steinbeck C, Willighagen E. eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. J Biomed Semantics 2015; 6:10. [PMID: 25815161 PMCID: PMC4374589 DOI: 10.1186/s13326-015-0005-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 11/03/2014] [Accepted: 02/27/2015] [Indexed: 11/18/2022] Open
Abstract
Engineered nanomaterials (ENMs) are being developed to meet specific application needs in diverse domains across the engineering and biomedical sciences (e.g. drug delivery). However, accompanying the exciting proliferation of novel nanomaterials is a challenging race to understand and predict their possibly detrimental effects on human health and the environment. The eNanoMapper project (www.enanomapper.net) is creating a pan-European computational infrastructure for toxicological data management for ENMs, based on semantic web standards and ontologies. Here, we describe the development of the eNanoMapper ontology based on adopting and extending existing ontologies of relevance for the nanosafety domain. The resulting eNanoMapper ontology is available at http://purl.enanomapper.net/onto/enanomapper.owl. We aim to make the re-use of external ontology content seamless and thus we have developed a library to automate the extraction of subsets of ontology content and the assembly of the subsets into an integrated whole. The library is available (open source) at http://github.com/enanomapper/slimmer/. Finally, we give a comprehensive survey of the domain content and identify gap areas. ENM safety is at the boundary between engineering and the life sciences, and at the boundary between molecular granularity and bulk granularity. This creates challenges for the definition of key entities in the domain, which we also discuss.
Affiliation(s)
- Janna Hastings
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Gareth Owen
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Georgia Tsiliki
  - National Technical University of Athens (NTUA), Athens, Greece
- Cristian R Munteanu
  - Computer Science Faculty, University of A Coruña, A Coruña, Spain
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Christoph Steinbeck
  - European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Cambridge, United Kingdom
- Egon Willighagen
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
|
37
|
Liu Y, Buendía-Rodríguez G, Peñuelas-Rívas CG, Tan Z, Rívas-Guevara M, Tenorio-Borroto E, Munteanu CR, Pazos A, González-Díaz H. Experimental and computational studies of fatty acid distribution networks. Mol BioSyst 2015; 11:2964-77. [DOI: 10.1039/c5mb00325c] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/01/2023]
Abstract
A new PT-LFER model is useful for predicting a distribution network in terms of specific fatty acid distribution.
Affiliation(s)
- Yong Liu
  - Faculty of Veterinary Medicine and Animal Science, Autonomous University of the State of Mexico, Toluca, Mexico
  - Key Laboratory of Subtropical Agro-ecological Engineering
- Germán Buendía-Rodríguez
  - National Center for Disciplinary Research on Animal Physiology and Breeding, National Institute of Forestry, Agriculture and Livestock Research, Queretaro, Mexico
- Zhiliang Tan
  - Key Laboratory of Subtropical Agro-ecological Engineering, Institute of Subtropical Agriculture, the Chinese Academy of Sciences, Changsha, P. R. China
- María Rívas-Guevara
  - Ethnobiology and Biodiversity Research Center, Chapingo Autonomous University, Texcoco, Mexico
- Esvieta Tenorio-Borroto
  - Faculty of Veterinary Medicine and Animal Science, Autonomous University of the State of Mexico, Toluca, Mexico
- Humberto González-Díaz
  - Department of Organic Chemistry II, Faculty of Science and Technology, University of the Basque Country UPV/EHU, Leioa, Spain
|
38
|
Jeliazkova N, Chomenidis C, Doganis P, Fadeel B, Grafström R, Hardy B, Hastings J, Hegi M, Jeliazkov V, Kochev N, Kohonen P, Munteanu CR, Sarimveis H, Smeets B, Sopasakis P, Tsiliki G, Vorgrimmler D, Willighagen E. The eNanoMapper database for nanomaterial safety information. Beilstein J Nanotechnol 2015; 6:1609-34. [PMID: 26425413 PMCID: PMC4578352 DOI: 10.3762/bjnano.6.165] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/31/2015] [Accepted: 07/03/2015] [Indexed: 05/20/2023]
Abstract
BACKGROUND The NanoSafety Cluster, a cluster of projects funded by the European Commission, identified the need for a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs). Ontologies, open standards, and interoperable designs were envisioned to empower a harmonized approach to European research in nanotechnology. This setting provides a number of opportunities and challenges in the representation of nanomaterials data and the integration of ENM information originating from diverse systems. Within this cluster, eNanoMapper works towards supporting the collaborative safety assessment for ENMs by creating a modular and extensible infrastructure for data sharing, data analysis, and building computational toxicology models for ENMs.
RESULTS The eNanoMapper database solution builds on the previous experience of the consortium partners in supporting diverse data through flexible data storage, open source components and web services. We have recently described the design of the eNanoMapper prototype database along with a summary of challenges in the representation of ENM data and an extensive review of existing nano-related data models, databases, and nanomaterials-related entries in chemical and toxicogenomic databases. This paper continues with a focus on the database functionality exposed through its application programming interface (API), and its use in visualisation and modelling. Considering the preferred community practice of using spreadsheet templates, we developed a configurable spreadsheet parser facilitating user-friendly data preparation and data upload. We further present a web application able to retrieve the experimental data via the API and analyze it with multiple data preprocessing and machine learning algorithms.
CONCLUSION We demonstrate how the eNanoMapper database is used to import and publish online ENM and assay data from several data sources, how the "representational state transfer" (REST) API enables building user friendly interfaces and graphical summaries of the data, and how these resources facilitate the modelling of reproducible quantitative structure-activity relationships for nanomaterials (NanoQSAR).
Affiliation(s)
- Philip Doganis
  - National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- Barry Hardy
  - Douglas Connect GmbH, Zeiningen, Switzerland
- Janna Hastings
  - European Molecular Biology Laboratory – European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
- Markus Hegi
  - Douglas Connect GmbH, Zeiningen, Switzerland
- Nikolay Kochev
  - Ideaconsult Ltd., Sofia, Bulgaria
  - Department of Analytical Chemistry and Computer Chemistry, University of Plovdiv, Plovdiv, Bulgaria
- Cristian R Munteanu
  - Department of Bioinformatics, NUTRIM, Maastricht University, Maastricht, The Netherlands
  - Computer Science Faculty, University of A Coruna, A Coruña, Spain
- Haralambos Sarimveis
  - National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- Bart Smeets
  - Department of Bioinformatics, NUTRIM, Maastricht University, Maastricht, The Netherlands
- Pantelis Sopasakis
  - National Technical University of Athens, School of Chemical Engineering, Athens, Greece
  - IMT Institute for Advanced Studies Lucca, Lucca, Italy
- Georgia Tsiliki
  - National Technical University of Athens, School of Chemical Engineering, Athens, Greece
- Egon Willighagen
  - Department of Bioinformatics, NUTRIM, Maastricht University, Maastricht, The Netherlands
|
39
|
Munteanu CR, Pedreira N, Dorado J, Pazos A, Pérez-Montoto LG, Ubeira FM, González-Díaz H. LECTINPred: web Server that Uses Complex Networks of Protein Structure for Prediction of Lectins with Potential Use as Cancer Biomarkers or in Parasite Vaccine Design. Mol Inform 2014; 33:276-85. [DOI: 10.1002/minf.201300027] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 02/14/2013] [Accepted: 12/11/2014] [Indexed: 01/05/2023]
|
40
|
González-Díaz H, Herrera-Ibatá DM, Duardo-Sánchez A, Munteanu CR, Orbegozo-Medina RA, Pazos A. ANN Multiscale Model of Anti-HIV Drugs Activity vs AIDS Prevalence in the US at County Level Based on Information Indices of Molecular Graphs and Social Networks. J Chem Inf Model 2014; 54:744-55. [DOI: 10.1021/ci400716y] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Indexed: 02/06/2023]
Affiliation(s)
- Humberto González-Díaz
  - Department of Organic Chemistry II, Faculty of Science and Technology, University of the Basque Country UPV/EHU, 48940, Leioa, Vizcaya, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Vizcaya, Spain
- Diana María Herrera-Ibatá
  - Department of Information and Communication Technologies, University of A Coruña UDC, 15071, A Coruña, A Coruña, Spain
- Aliuska Duardo-Sánchez
  - Department of Information and Communication Technologies, University of A Coruña UDC, 15071, A Coruña, A Coruña, Spain
- Cristian R. Munteanu
  - Department of Information and Communication Technologies, University of A Coruña UDC, 15071, A Coruña, A Coruña, Spain
- Ricardo Alfredo Orbegozo-Medina
  - Department of Microbiology and Parasitology, University of Santiago de Compostela (USC), 15782, Santiago de Compostela, A Coruña, Spain
- Alejandro Pazos
  - Department of Information and Communication Technologies, University of A Coruña UDC, 15071, A Coruña, A Coruña, Spain
|
41
|
González-Díaz H, Arrasate S, Sotomayor N, Lete E, Munteanu CR, Pazos A, Besada-Porto L, Ruso JM. MIANN models in medicinal, physical and organic chemistry. Curr Top Med Chem 2014; 13:619-41. [PMID: 23548024 DOI: 10.2174/1568026611313050006] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/08/2013] [Revised: 02/27/2013] [Accepted: 03/14/2013] [Indexed: 11/22/2022]
Abstract
Reducing costs in terms of time, animal sacrifice, and material resources with computational methods has become a promising goal in Medicinal, Biological, Physical and Organic Chemistry. There are many computational techniques that can be used in this sense. In any case, almost all these methods focus on a few fundamental aspects, including: type (1) methods to quantify the molecular structure, type (2) methods to link the structure with the biological activity, and others. In particular, MARCH-INSIDE (MI), acronym for Markov Chain Invariants for Networks Simulation and Design, is a well-known method for QSAR analysis useful in step (1). In addition, the bio-inspired Artificial-Intelligence (AI) algorithms called Artificial Neural Networks (ANNs) are among the most powerful type (2) methods. We can combine MI with ANNs in order to seek QSAR models, a strategy which is called herein MIANN (MI & ANN models). One of the first applications of the MIANN strategy was in the development of new QSAR models for drug discovery. The MIANN strategy has been expanded to the QSAR study of proteins, protein-drug interactions, and protein-protein interaction networks. In this paper, we review for the first time many interesting aspects of the MIANN strategy including theoretical basis, implementation in web servers, and examples of applications in Medicinal and Biological chemistry. We also report new applications of the MIANN strategy in Medicinal chemistry and the first examples in Physical and Organic Chemistry, as well. In so doing, we developed new MIANN models for several self-assembly physicochemical properties of surfactants and large reaction networks in organic synthesis. In some of the new examples, we also present experimental results that have not been published to date.
|
42
|
Aguiar-Pulido V, Gestal M, Fernandez-Lozano C, Rivero D, Munteanu CR. Applied computational techniques on schizophrenia using genetic mutations. Curr Top Med Chem 2014; 13:675-84. [PMID: 23548028 DOI: 10.2174/1568026611313050010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/31/2013] [Revised: 02/26/2013] [Accepted: 03/09/2013] [Indexed: 11/22/2022]
Abstract
Schizophrenia is a complex disease, with both genetic and environmental influence. Machine learning techniques can be used to associate different genetic variations at different genes with a (schizophrenic or non-schizophrenic) phenotype. Several machine learning techniques were applied to schizophrenia data to obtain the results presented in this study. Considering these data, Quantitative Genotype - Disease Relationships (QDGRs) can be used for disease prediction. One of the best machine learning-based models obtained after this exhaustive comparative study was implemented online; this model is an artificial neural network (ANN). Thus, the tool offers the possibility to introduce Single Nucleotide Polymorphism (SNP) sequences in order to classify a patient with schizophrenia. Besides this comparative study, a method for variable selection, based on ANNs and evolutionary computation (EC), is also presented. This method uses half the number of variables as the original ANN and the variables obtained are among those found in other publications. In the future, QDGR models based on nucleic acid information could be expanded to other diseases.
Affiliation(s)
- Vanessa Aguiar-Pulido
  - Information and Communications Technologies Department, Computer Science Faculty, University of A Coruña, Campus de Elviña s/n, 15071 Spain
|
43
|
Fernandez-Lozano C, Fernández-Blanco E, Dave K, Pedreira N, Gestal M, Dorado J, Munteanu CR. Improving enzyme regulatory protein classification by means of SVM-RFE feature selection. Mol BioSyst 2014; 10:1063-71. [DOI: 10.1039/c3mb70489k] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/23/2023]
|
44
|
Duardo-Sánchez A, Munteanu CR, Riera-Fernández P, López-Díaz A, Pazos A, González-Díaz H. Modeling Complex Metabolic Reactions, Ecological Systems, and Financial and Legal Networks with MIANN Models Based on Markov-Wiener Node Descriptors. J Chem Inf Model 2013; 54:16-29. [DOI: 10.1021/ci400280n] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/28/2023]
Affiliation(s)
- Aliuska Duardo-Sánchez
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
  - Department of Special Public Law, Financial and Tributary Law Area, Faculty of Law, University of Santiago de Compostela (USC), 15782, Santiago de Compostela, A Coruña, Spain
- Cristian R. Munteanu
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
- Pablo Riera-Fernández
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
- Antonio López-Díaz
  - Department of Special Public Law, Financial and Tributary Law Area, Faculty of Law, University of Santiago de Compostela (USC), 15782, Santiago de Compostela, A Coruña, Spain
- Alejandro Pazos
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
- Humberto González-Díaz
  - Department of Organic Chemistry II, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Bizkaia, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Biscay, Spain
|
45
|
Abstract
Using the CCSD(T) model, we evaluated the intermolecular potential energy surfaces of the He-, Ne-, and Ar-phosgene complexes. We considered a representative number of intermolecular geometries for which we calculated the corresponding interaction energies with the augmented (He complex) and double augmented (Ne and Ar complexes) correlation-consistent polarized valence triple-ζ basis sets extended with a set of 3s3p2d1f1g midbond functions. These basis sets were selected after systematic basis set studies carried out at geometries close to those of the surface minima. The He-, Ne-, and Ar-phosgene surfaces were found to have absolute minima of -72.1, -140.4, and -326.6 cm⁻¹ at distances between the rare-gas atom and the phosgene center of mass of 3.184, 3.254, and 3.516 Å, respectively. The potentials were further used in the evaluation of rovibrational states and the rotational constants of the complexes, providing valuable results for future experimental investigations. Comparing our results to those previously available for other phosgene complexes, we suggest that the results for Cl₂-phosgene should be revised.
|
46
|
Munteanu CR, Dorado J, Pazos A. Editorial (Hot Topic: Artificial Intelligence Techniques in Medicinal Chemistry). Curr Top Med Chem 2013; 13:525. [DOI: 10.2174/1568026611313050001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
47
|
Seoane JA, Aguiar-Pulido V, Munteanu CR, Rivero D, Rabunal JR, Dorado J, Pazos A. Biomedical data integration in computational drug design and bioinformatics. Curr Comput Aided Drug Des 2013; 9:108-117. [PMID: 23294434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2012] [Revised: 06/01/2012] [Accepted: 10/02/2012] [Indexed: 06/01/2023]
Abstract
In recent years, in the post genomic era, more and more data is being generated by biological high throughput technologies, such as proteomics and transcriptomics. This omics data can be very useful, but the real challenge is to analyze all this data, as a whole, after integrating it. Biomedical data integration enables making queries to different, heterogeneous and distributed biomedical data sources. Data integration solutions can be very useful not only in the context of drug design, but also in biomedical information retrieval, clinical diagnosis, system biology, etc. In this review, we analyze the most common approaches to biomedical data integration, such as federated databases, data warehousing, multi-agent systems and semantic technology, as well as the solutions developed using these approaches in the past few years.
Affiliation(s)
- Jose A Seoane
  - Department of Information and Communication Technologies, Computer Science School, University of A Coruña, Spain
|
48
|
Seoane JA, Aguiar-Pulido V, Munteanu CR, Rivero D, Rabunal JR, Dorado J, Pazos A. Biomedical Data Integration in Computational Drug Design and Bioinformatics. Curr Comput Aided Drug Des 2013. [DOI: 10.2174/157340913804998757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]
|
49
|
Seoane JA, Aguiar-Pulido V, Munteanu CR, Rivero D, Rabunal JR, Dorado J, Pazos A. Biomedical Data Integration in Computational Drug Design and Bioinformatics. Curr Comput Aided Drug Des 2013. [DOI: 10.2174/15734099112089990011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
|
50
|
Seoane JA, Aguiar-Pulido V, Munteanu CR, Rivero D, Rabunal JR, Dorado J, Pazos A. Biomedical Data Integration in Computational Drug Design and Bioinformatics. Curr Comput Aided Drug Des 2013. [DOI: 10.2174/15734099112089990010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|