1
Diéguez-Santana K, Casañola-Martin GM, Green JR, Rasulev B, González-Díaz H. Predicting Metabolic Reaction Networks with Perturbation-Theory Machine Learning (PTML) Models. Curr Top Med Chem 2021; 21:819-827. [PMID: 33797370 DOI: 10.2174/1568026621666210331161144] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/22/2020] [Revised: 12/30/2020] [Accepted: 01/07/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Checking the connectivity (structure) of complex Metabolic Reaction Network (MRN) models proposed for new microorganisms with promising properties is an important goal for chemical biology. OBJECTIVE: In principle, this check can be performed by hand (manual curation). However, this is a challenging task due to the high number of combinations of pairs of nodes (possible metabolic reactions). METHODS: In this work, we used Combinatorial Perturbation Theory and Machine Learning techniques to seek a CPTML model for the MRNs of >40 organisms compiled by Barabási's group. First, we quantified the local structure of a very large set of nodes in each MRN using a new class of node index called Markov linear indices fk. Next, we calculated CPT operators for 150,000 combinations of query and reference nodes of MRNs. Last, we used these CPT operators as inputs of different ML algorithms. RESULTS: The CPTML linear model obtained using the LDA algorithm is able to discriminate nodes (metabolites) with the correct assignment of reactions from incorrect nodes, with values of accuracy, specificity, and sensitivity in the range of 85-100% in both training and external validation data series. Meanwhile, PTML models based on Bayesian network, J48 decision tree, and Random Forest algorithms were identified as the three best non-linear models, with accuracy greater than 97.5%. CONCLUSION: The present work opens the door to the study of MRNs of multiple organisms using PTML models.
Affiliation(s)
- Karel Diéguez-Santana
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, and Basque Center for Biophysics CSIC-UPV/EHU, Leioa 48940, Great Bilbao, Biscay, Basque Country, Spain
- James R Green
  - Department of Systems and Computer Engineering, Carleton University, K1S 5B6, Ottawa, ON, Canada
- Bakhtiyor Rasulev
  - Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, ND 58102, United States
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, and Basque Center for Biophysics CSIC-UPV/EHU, Leioa 48940, Great Bilbao, Biscay, Basque Country, Spain
2
Duardo-Sánchez A, Munteanu CR, Riera-Fernández P, López-Díaz A, Pazos A, González-Díaz H. Modeling Complex Metabolic Reactions, Ecological Systems, and Financial and Legal Networks with MIANN Models Based on Markov-Wiener Node Descriptors. J Chem Inf Model 2013; 54:16-29. [DOI: 10.1021/ci400280n] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/28/2023]
Affiliation(s)
- Aliuska Duardo-Sánchez
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
  - Department of Special Public Law, Financial and Tributary Law Area, Faculty of Law, University of Santiago de Compostela (USC), 15782, Santiago de Compostela, A Coruña, Spain
- Cristian R. Munteanu
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
- Pablo Riera-Fernández
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
- Antonio López-Díaz
  - Department of Special Public Law, Financial and Tributary Law Area, Faculty of Law, University of Santiago de Compostela (USC), 15782, Santiago de Compostela, A Coruña, Spain
- Alejandro Pazos
  - Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruña, Campus de Elviña, 15071, A Coruña, A Coruña, Spain
- Humberto González-Díaz
  - Department of Organic Chemistry II, Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Leioa, Bizkaia, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Biscay, Spain
3
Krivovichev SV. Which inorganic structures are the most complex? Angew Chem Int Ed Engl 2013; 53:654-61. [PMID: 24339343 DOI: 10.1002/anie.201304374] [Citation(s) in RCA: 90] [Impact Index Per Article: 8.2] [Received: 05/21/2013] [Revised: 07/06/2013] [Indexed: 11/09/2022]
Abstract
The discovery of the diffraction of X-rays by crystals opened up a new era in our understanding of nature, leading to a multitude of striking discoveries about the structures and functions of matter on the atomic and molecular scales. Over the last hundred years, about 150,000 inorganic crystal structures have been elucidated and visualized. The advent of new technologies, such as area detectors and synchrotron radiation, led to the solution of structures of unprecedented complexity. However, the very notion of structural complexity of crystals still lacks an unambiguous quantitative definition. In this Minireview we use information theory to characterize the complexity of inorganic structures in terms of their information content.
Affiliation(s)
- Sergey V Krivovichev
  - Department of Crystallography, St. Petersburg State University, University Emb. 7/9, 199034 St. Petersburg (Russia).
4
5
The Rücker–Markov invariants of complex Bio-Systems: Applications in Parasitology and Neuroinformatics. Biosystems 2013; 111:199-207. [DOI: 10.1016/j.biosystems.2013.02.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/09/2013] [Accepted: 02/11/2013] [Indexed: 11/23/2022]
6
Krivovichev SV. Information-based measures of structural complexity: application to fluorite-related structures. Struct Chem 2012. [DOI: 10.1007/s11224-012-0015-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
7
Krivovichev S. Topological complexity of crystal structures: quantitative approach. Acta Crystallogr A 2012; 68:393-8. [DOI: 10.1107/s0108767312012044] [Citation(s) in RCA: 117] [Impact Index Per Article: 9.8] [Received: 02/14/2012] [Accepted: 03/20/2012] [Indexed: 11/11/2022]
8
González-Díaz H, Prado-Prado F, Sobarzo-Sánchez E, Haddad M, Maurel Chevalley S, Valentin A, Quetin-Leclercq J, Dea-Ayuela MA, Teresa Gomez-Muños M, Munteanu CR, José Torres-Labandeira J, García-Mera X, Tapia RA, Ubeira FM. NL MIND-BEST: A web server for ligands and proteins discovery—Theoretic-experimental study of proteins of Giardia lamblia and new compounds active against Plasmodium falciparum. J Theor Biol 2011; 276:229-49. [DOI: 10.1016/j.jtbi.2011.01.010] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 10/04/2010] [Revised: 12/02/2010] [Accepted: 01/10/2011] [Indexed: 10/18/2022]
9
Rodriguez-Soca Y, Munteanu CR, Dorado J, Rabuñal J, Pazos A, González-Díaz H. Plasmod-PPI: A web-server predicting complex biopolymer targets in plasmodium with entropy measures of protein–protein interactions. Polymer 2010. [DOI: 10.1016/j.polymer.2009.11.029] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
10
Pérez-Montoto LG, Santana L, González-Díaz H. Scoring function for DNA-drug docking of anticancer and antiparasitic compounds based on spectral moments of 2D lattice graphs for molecular dynamics trajectories. Eur J Med Chem 2009; 44:4461-9. [PMID: 19604606 PMCID: PMC7127518 DOI: 10.1016/j.ejmech.2009.06.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/08/2009] [Revised: 06/04/2009] [Accepted: 06/05/2009] [Indexed: 02/02/2023]
Abstract
We introduce here a new class of invariants for MD trajectories based on the spectral moments πk(L) of the Markov matrix associated with lattice network-like (LN) graph representations of Molecular Dynamics (MD) trajectories. The procedure embeds the MD energy profiles in a 2D Cartesian coordinate system using simple heuristic rules. At the same time, we associate the LN with a Markov matrix that describes the probabilities of passing from one state to another in the new 2D space. We construct this type of LN for 422 MD trajectories obtained in DNA-drug docking experiments of 57 furocoumarins. The combined use of psoralens + ultraviolet light (UVA) radiation is known as PUVA therapy. PUVA is effective in the treatment of skin diseases such as psoriasis and mycosis fungoides. PUVA is also useful to treat human platelet (PTL) concentrates in order to eliminate Leishmania spp. and Trypanosoma cruzi. These parasites cause Leishmaniosis (a dangerous skin and visceral disease) and Chagas disease, respectively, and may circulate in blood products collected from infected donors. We included in this study both linear (psoralens) and angular (angelicins) furocoumarins. In the study, we grouped the LNs into two sets; set1: DNA-drug complex MD trajectories for active compounds, and set2: MD trajectories of non-active compounds or non-optimal MD trajectories of active compounds. We calculated the respective πk(L) values for all these LNs and used them as inputs to train a new classifier that discriminates set1 from set2 cases. In the training series the model correctly classifies 79 out of 80 (specificity = 98.75%) set1 and 226 out of 238 (sensitivity = 94.96%) set2 trajectories. In the independent validation series the model correctly classifies 26 out of 26 (specificity = 100%) set1 and 75 out of 78 (sensitivity = 96.15%) set2 trajectories. We propose this new model as a scoring function to guide DNA-docking studies in the drug design of new coumarins for anticancer or antiparasitic PUVA therapy.
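The percentages quoted in this abstract follow directly from the stated counts. As a quick arithmetic check, here is a minimal Python sketch; the helper name `rate` is hypothetical, and the grouping simply follows the abstract's own usage (specificity reported for set1, sensitivity for set2).

```python
def rate(correct: int, total: int) -> float:
    """Percentage of correctly classified cases, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

# Counts as reported in the abstract (correct, total) per set and series.
training = {"set1": (79, 80), "set2": (226, 238)}
validation = {"set1": (26, 26), "set2": (75, 78)}

for name, series in (("training", training), ("validation", validation)):
    for group, (correct, total) in series.items():
        print(f"{name} {group}: {rate(correct, total)}%")
```

Running it reproduces 98.75% and 94.96% for the training series and 100.0% and 96.15% for the validation series.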
Affiliation(s)
- Lázaro G. Pérez-Montoto
  - Department of Microbiology & Parasitology, and Department of Organic Chemistry
  - Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain
- Lourdes Santana
  - Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain
- Humberto González-Díaz
  - Department of Microbiology & Parasitology, and Department of Organic Chemistry
  - Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain
11
Viña D, Uriarte E, Orallo F, González-Díaz H. Alignment-free prediction of a drug-target complex network based on parameters of drug connectivity and protein sequence of receptors. Mol Pharm 2009; 6:825-35. [PMID: 19281186 DOI: 10.1021/mp800102c] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
Many drugs have been described with very different affinities for a large number of receptors. In this work, we selected drug-receptor pairs (DRPs) of affinity/nonaffinity drugs to similar/dissimilar receptors and represented them as a large network, which may be used to identify drugs that can act on a receptor. Computational chemistry prediction of biological activity based on quantitative structure-activity relationships (QSAR) substantially increases the potential of this kind of network, avoiding time- and resource-consuming experiments. Unfortunately, most QSAR models are unspecific or predict activity against only one receptor. To solve this problem, we developed here a multitarget QSAR (mt-QSAR) classification model. Overall model classification accuracy was 72.25% (1390/1924 compounds) in training and 72.28% (459/635) in cross-validation. Outputs of this mt-QSAR model were used as inputs to construct a network. The observed network has 1735 nodes (DRPs), 1754 edges or pairs of DRPs with similar drug-target affinity (sPDRPs), and a low coverage density d = 0.12%. The predicted network has 1735 DRPs, 1857 sPDRPs, and also a low coverage density d = 0.12%. After an edge-to-edge comparison (chi-square = 9420.3; p < 0.005), we have demonstrated that the predicted network is significantly similar to the one observed and that both have a distribution closer to exponential than to normal.
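The accuracy and coverage-density figures in this abstract can be reproduced from the reported counts. The sketch below assumes the usual density of an undirected graph without self-loops, d = 2E / (N(N - 1)); the abstract does not say which formula the authors used, so that definition is an assumption, and the function names are illustrative only.

```python
def accuracy_pct(correct: int, total: int) -> float:
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)

def density_pct(nodes: int, edges: int) -> float:
    """Edge density in percent (assumed: undirected graph, no self-loops)."""
    return round(100.0 * 2 * edges / (nodes * (nodes - 1)), 2)

print(accuracy_pct(1390, 1924))  # training accuracy -> 72.25
print(accuracy_pct(459, 635))    # cross-validation accuracy -> 72.28
print(density_pct(1735, 1754))   # observed network -> 0.12
print(density_pct(1735, 1857))   # predicted network -> 0.12
```

Under that assumed definition both networks round to the reported d = 0.12%, even though the predicted network has about 100 more edges.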
Affiliation(s)
- Dolores Viña
  - Department of Organic Chemistry, University of Santiago de Compostela, 15782, Spain
12
Pérez-Montoto LG, Dea-Ayuela MA, Prado-Prado FJ, Bolas-Fernández F, Ubeira FM, González-Díaz H. Study of peptide fingerprints of parasite proteins and drug-DNA interactions with Markov-Mean-Energy invariants of biopolymer molecular-dynamic lattice networks. Polymer 2009; 50:3857-3870. [PMID: 32287404 PMCID: PMC7111648 DOI: 10.1016/j.polymer.2009.05.055] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/10/2009] [Revised: 05/06/2009] [Accepted: 05/14/2009] [Indexed: 11/26/2022]
Abstract
Since the advent of Molecular Dynamics (MD) in biopolymer science with the study by Karplus et al. on protein dynamics, MD has become by far the foremost well-established computational technique to investigate the structure and function of biomolecules and their respective complexes and interactions. The analysis of MD trajectories (MDTs) remains, however, the greatest challenge and requires a great deal of insight, experience, and effort. Here, we introduce a new class of invariants for MDTs based on the spatial distribution of Mean-Energy values ξk(L) on a 2D Euclidean space representation of the MDTs. The procedure forces one MD trajectory to fold into a 2D Cartesian coordinate system using a step-by-step procedure driven by simple rules. The ξk(L) values are invariants of a Markov matrix (¹Π), which describes the probabilities of transition between two states in the new 2D space and is associated with a graph representation of MDTs similar to the lattice networks (LNs) of DNA and protein sequences. We also introduce a new algorithm to perform phylogenetic analysis of peptides based on MDTs instead of the sequence of the polypeptide. In a first experiment, we illustrate this algorithm for 35 peptides present in the Peptide Mass Fingerprint (PMF) of a new protein of Leishmania infantum studied in this work. We report, for the first time, 2D electrophoresis isolation, MALDI-TOF mass spectrometry characterization, and MASCOT search results for this PMF. In a second experiment, we construct the LNs for 422 MDTs obtained in DNA-drug docking simulations of the interaction of 57 anticancer furocoumarins with a DNA oligonucleotide. We calculated the respective ξk(L) values for all these LNs and used them as inputs to train a new classifier with accuracy = 85.44% and 84.91% in training and validation, respectively. The new model can be used as a scoring function to guide DNA-drug docking studies in the drug design of new coumarins for PUVA therapy. The new phylogenetic analysis algorithms encode information different from sequence similarity and may be used to analyze MDTs obtained in docking or modeling experiments for any class of biopolymers. The work opens new perspectives on the analysis and applications of MD in polymer sciences.
Affiliation(s)
- Lázaro Guillermo Pérez-Montoto
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
  - Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- María Auxiliadora Dea-Ayuela
  - Departamento de Atención Sanitaria, Salud Pública y Sanidad Animal, Facultad CC Experimentales y de La Salud, Universidad CEU Cardenal Herrera, 46113 Moncada (Valencia), Spain
- Francisco J Prado-Prado
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
  - Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Florencio M Ubeira
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Humberto González-Díaz
  - Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
13
Gündüz G, Dernaika M, Dikencik G, Fares M, Aras L. Graph theoretical approach to the mechanical strength of polymers. Molecular Simulation 2008. [DOI: 10.1080/08927020701868367] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
14
Abstract
A comparative analysis of the topological structure of molecules and molecular biology networks revealed both similarities and differences in the methods used, as well as in the essential features of the two types of systems. Molecular graphs are static and, due to the limitations in atomic valence, show neither a power-law distribution of vertex degrees nor "small-world" properties, which are typical for dynamic evolutionary networks. Areas of mutual benefit from an exchange of methods and ideas are outlined for the two fields. More specifically, chemical graph theory might make use of some new descriptors of network structure. Of interest for quantitative structure-property relationship/quantitative structure-activity relationship studies and drug design might be the conclusion that descriptors based on distributions of vertex degrees, distances, and subgraphs seem to be more relevant to biological information than single-number descriptors. The network concepts of centrality, clustering, and cliques provide a basis for similar studies in theoretical chemistry. The need for a dynamic theory of molecular topology is advocated.
Affiliation(s)
- Danail Bonchev
  - Center for the Study of Biological Chemistry, Virginia Commonwealth University, P. O. Box 842030, Richmond, Virginia 23284-2030, USA.