1
|
Agüero-Chapin G, Galpert-Cañizares D, Domínguez-Pérez D, Marrero-Ponce Y, Pérez-Machado G, Teijeira M, Antunes A. Emerging Computational Approaches for Antimicrobial Peptide Discovery. Antibiotics (Basel) 2022; 11:antibiotics11070936. [PMID: 35884190 PMCID: PMC9311958 DOI: 10.3390/antibiotics11070936] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/14/2022] [Revised: 07/01/2022] [Accepted: 07/08/2022] [Indexed: 02/05/2023]
Abstract
In the last two decades, many reports have addressed the application of artificial intelligence (AI) to the search for and design of antimicrobial peptides (AMPs). AI has been represented by machine learning (ML) algorithms that use sequence-based features for the discovery of new peptidic scaffolds with promising biological activity. From an AI perspective, evolutionary algorithms have also been applied to the rational generation of peptide libraries aimed at the optimization/design of AMPs. However, the literature has scarcely addressed other emerging non-conventional in silico approaches for the search/design of such bioactive peptides. Thus, the first motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models. Secondly, it is valuable to highlight emerging ML algorithms and alternative computational tools to predict/design AMPs as well as to explore their chemical space. Another point worthy of mention is the recent application of evolutionary algorithms that actually simulate sequence evolution to both the generation of diversity-oriented peptide libraries and the optimization of hit peptides. Last but not least, we include some new considerations in proteogenomic analyses currently incorporated into the computational workflow for unravelling AMPs in natural sources.
Affiliation(s)
- Guillermin Agüero-Chapin
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
  - Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
- Deborah Galpert-Cañizares
  - Departamento de Ciencia de la Computación, Universidad Central Marta Abreu de Las Villas (UCLV), Santa Clara 54830, Cuba
- Dany Domínguez-Pérez
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Proquinorte, Unipessoal, Lda, Avenida 5 de Outubro, 124, 7º Piso, Avenidas Novas, 1050-061 Lisboa, Portugal
- Yovani Marrero-Ponce
  - Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas and Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, Quito 170157, Ecuador
- Gisselle Pérez-Machado
  - EpiDisease S.L - Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Marta Teijeira
  - Departamento de Química Orgánica, Facultade de Química, Universidade de Vigo, 36310 Vigo, Spain
  - Instituto de Investigación Sanitaria Galicia Sur, Hospital Álvaro Cunqueiro, 36213 Vigo, Spain
- Agostinho Antunes
  - CIIMAR - Centro Interdisciplinar de Investigação Marinha e Ambiental, Universidade do Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal
  - Departamento de Biologia, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
  - Correspondence: (G.A.-C.); (A.A.); Tel.: +351-22-340-1813 (G.A.-C. & A.A.)
|
2
|
The Extremal Structures of r-Uniform Unicyclic Hypergraphs on the Signless Laplacian Estrada Index. MATHEMATICS 2022. [DOI: 10.3390/math10060941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
SLEE has various applications in a large variety of problems. The signless Laplacian Estrada index of a hypergraph H is defined as $SLEE(H)=\sum_{i=1}^{n} e^{\lambda_i(Q)}$, where $\lambda_1(Q), \lambda_2(Q), \ldots, \lambda_n(Q)$ are the eigenvalues of the signless Laplacian matrix of H. In this paper, we characterize the unique r-uniform unicyclic hypergraphs with maximum and minimum SLEE.
|
3
|
Computing the Energy and Estrada Index of Different Molecular Structures. J CHEM-NY 2022. [DOI: 10.1155/2022/6227093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Graph energy is an invariant derived from the spectrum of the adjacency matrix of a graph. The energy of a graph $G$ is the sum of the absolute values of the eigenvalues of its adjacency matrix, i.e., $E(G)=\sum_{i=1}^{n}|\lambda_i|$, and the Estrada index of $G$ is defined as $EE(G)=\sum_{i=1}^{n}e^{\lambda_i}$, where $\lambda_1,\lambda_2,\ldots,\lambda_n$ are the eigenvalues of the adjacency matrix of $G$. In this paper, the energy $E(G)$ and Estrada index $EE(G)$ of different molecular structures are obtained, and inequalities between the exact and estimated values of the energy and Estrada index are established for nanosheet and naphthalene.
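The two spectral quantities named in this abstract, graph energy (the sum of the absolute adjacency eigenvalues) and the Estrada index (the sum of the exponentials of the adjacency eigenvalues), can be sketched numerically as follows. This is a minimal illustration and not the paper's method: the function names and the choice of a 6-cycle (the carbon skeleton of benzene) as test graph are assumptions made here, and NumPy is assumed to be available.

```python
import numpy as np


def adjacency_spectrum(adj):
    """Eigenvalues of a symmetric adjacency matrix (undirected graph)."""
    a = np.asarray(adj, dtype=float)
    return np.linalg.eigvalsh(a)


def graph_energy(adj):
    """Graph energy: sum of absolute values of the adjacency eigenvalues."""
    return float(np.sum(np.abs(adjacency_spectrum(adj))))


def estrada_index(adj):
    """Estrada index: sum of exp(eigenvalue) over the adjacency spectrum."""
    return float(np.sum(np.exp(adjacency_spectrum(adj))))


def cycle_adjacency(n):
    """Adjacency matrix of the cycle C_n (hypothetical helper, n >= 3)."""
    a = np.zeros((n, n))
    for i in range(n):
        j = (i + 1) % n
        a[i, j] = a[j, i] = 1.0
    return a


# Illustrative graph only: C_6 has adjacency eigenvalues 2, 1, 1, -1, -1, -2.
c6 = cycle_adjacency(6)
print(graph_energy(c6), estrada_index(c6))
```

For the 6-cycle the energy is 2 + 1 + 1 + 1 + 1 + 2 = 8, which gives a quick sanity check on the implementation.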
|
4
|
Sladek V, Harada R, Shigeta Y. Residue Folding Degree-Relationship to Secondary Structure Categories and Use as Collective Variable. Int J Mol Sci 2021; 22:ijms222313042. [PMID: 34884847 PMCID: PMC8657879 DOI: 10.3390/ijms222313042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/28/2021] [Revised: 11/23/2021] [Accepted: 11/29/2021] [Indexed: 11/22/2022]
Abstract
Recently, we have shown that the residue folding degree, a network-based measure of folded content in proteins, is able to capture backbone conformational transitions related to the formation of secondary structures in molecular dynamics (MD) simulations. In this work, we focus primarily on developing a collective variable (CV) for MD based on this residue-bound parameter to be able to trace the evolution of secondary structure in segments of the protein. We show that this CV can do just that and that the related energy profiles (potentials of mean force, PMF) and transition barriers are comparable to those found by others for particular events in the folding process of the model mini protein Trp-cage. Hence, we conclude that the relative segment folding degree (the newly proposed CV) is a computationally viable option to gain insight into the formation of secondary structures in protein dynamics. We also show that this CV can be directly used as a measure of the amount of α-helical content in a selected segment.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry, Slovak Academy of Sciences, 845 38 Bratislava, Slovakia
  - Correspondence:
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan; (R.H.); (Y.S.)
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan; (R.H.); (Y.S.)
|
5
|
Sladek V, Yamamoto Y, Harada R, Shoji M, Shigeta Y, Sladek V. pyProGA-A PyMOL plugin for protein residue network analysis. PLoS One 2021; 16:e0255167. [PMID: 34329304 PMCID: PMC8323899 DOI: 10.1371/journal.pone.0255167] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/09/2021] [Accepted: 07/11/2021] [Indexed: 11/18/2022]
Abstract
The field of protein residue network (PRN) research has brought several useful methods and techniques for structural analysis of proteins and protein complexes. Many of these are ripe and ready to be used by the proteomics community outside of the PRN specialists. In this paper we present software which collects an ensemble of (network) methods tailored towards the analysis of protein-protein interactions (PPI) and/or interactions of proteins with ligands of other types, e.g. nucleic acids, oligosaccharides etc. In parallel, we propose the use of network differential analysis as a method to identify residues mediating key interactions between proteins. Using a model system, we show that, in combination with other already published methods also included in pyProGA, it can be used to make such predictions. Such an extended repertoire of methods also allows predictions to be cross-checked against other methods, as we show here. In addition, the possibility to construct PRN models from various kinds of input is so far a unique asset of our code. One can use structural data as defined in PDB files and/or data on residue pair interaction energies, either from force-field parameters or fragment molecular orbital (FMO) calculations. pyProGA is free open-source software available from https://gitlab.com/Vlado_S/pyproga.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry, Slovak Academy of Sciences, Bratislava, Slovakia
- Yuta Yamamoto
  - Department of Chemistry, Rikkyo University, Nishi-Ikebukuro, Tokyo, Japan
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Mitsuo Shoji
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki, Japan
- Vladimir Sladek
  - Institute of Construction and Architecture, Slovak Academy of Sciences, Bratislava, Slovakia
|
6
|
Quality testing of spectrum-based valency descriptors for polycyclic aromatic hydrocarbons with applications. J Mol Struct 2021. [DOI: 10.1016/j.molstruc.2020.129789] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 11/18/2022]
|
7
|
Abstract
The analysis of folding trajectories for proteins is an open challenge. One of the problems is how to describe the amount of folded secondary structure in a protein. We extend the use of Estrada's folding degree (Bioinformatics 2002, 18, 697) to the analysis of the evolution of the folding stage during molecular dynamics (MD) simulation. It is shown that the residue contribution to the total folding degree is a predominantly local property, well-defined by the backbone dihedral angles at the given residue, without significant contribution from the backbone conformation of other residues. Moreover, the magnitude of this residue contribution can be quite easily associated with characteristic motifs of secondary protein structures such as the α-helix, β-sheet (hairpin), and so on by means of a Ramachandran-like plot as a function of backbone dihedral angles φ,ψ. Additionally, the understanding of the free energy profile associated with the folding process becomes much simpler. Often a 1D profile is sufficient to locate global minima and the corresponding structure for short peptides.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry - Centre for Glycomics, Dubravska cesta 9, 84538 Bratislava, Slovakia
  - Agency for Medical Research and Development (AMED), Chiyoda-ku, Japan
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8577, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8577, Japan
|
8
|
Santiago Á, Razo-Hernández RS, Pastor N. Revealing the Structural Contributions to Thermal Adaptation of the TATA-Box Binding Protein: Molecular Dynamics and QSPR Analyses. J Chem Inf Model 2020; 60:866-879. [PMID: 31917925 DOI: 10.1021/acs.jcim.9b00824] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
The TATA-box binding protein (TBP) is an important element of the transcription machinery in archaea and eukaryotic organisms. TBP is expressed in organisms adapted to different temperatures, indicating a robust structure, and experimental studies have shown that the mid-unfolding temperature (Tm) of TBP is directly correlated with the optimal growth temperature (OGT) of the organism. To understand which are the relevant structural requirements for its stability, we present the first structural and dynamic computational study of TBPs, combining molecular dynamics (MD) simulations and a quantitative structure-property relationship (QSPR) over a set of TBPs of organisms adapted to different temperatures. We found that the main structural properties of TBP used to adapt to high temperatures are an increase in the ease of desolvation of charged residues at the surface, an increase in the local resiliency, the presence of Leu clusters in the protein core, and an increase in the loss of hydrophobic packing in the N-terminal subdomain. In view of our results, we consider that TBP is a good model to study thermal adaptation, and our analysis opens the possibility of performing protein engineering on TBPs to study transcription at high or low temperatures.
Affiliation(s)
- Ángel Santiago
  - Laboratorio de Dinámica de Proteínas, Centro de Investigación en Dinámica Celular, Instituto de Investigación en Ciencias Básicas y Aplicadas, Universidad Autónoma del Estado de Morelos, Av. Universidad 1001, Col. Chamilpa, Cuernavaca, Morelos 62209, México
- Rodrigo Said Razo-Hernández
  - Laboratorio de Dinámica de Proteínas, Centro de Investigación en Dinámica Celular, Instituto de Investigación en Ciencias Básicas y Aplicadas, Universidad Autónoma del Estado de Morelos, Av. Universidad 1001, Col. Chamilpa, Cuernavaca, Morelos 62209, México
- Nina Pastor
  - Laboratorio de Dinámica de Proteínas, Centro de Investigación en Dinámica Celular, Instituto de Investigación en Ciencias Básicas y Aplicadas, Universidad Autónoma del Estado de Morelos, Av. Universidad 1001, Col. Chamilpa, Cuernavaca, Morelos 62209, México
  - Departamento de Medicina Molecular y Bioprocesos, Instituto de Biotecnología, Universidad Nacional Autónoma de México, Av. Universidad 2001, Col. Chamilpa, Cuernavaca, Morelos 62210, México
|
9
|
Merging the Spectral Theories of Distance Estrada and Distance Signless Laplacian Estrada Indices of Graphs. MATHEMATICS 2019. [DOI: 10.3390/math7100995] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
Suppose that G is a simple undirected connected graph. Denote by $D(G)$ the distance matrix of G and by $Tr(G)$ the diagonal matrix of the vertex transmissions in G, and let $\alpha \in [0,1]$. The generalized distance matrix $D_\alpha(G)$ is defined as $D_\alpha(G) = \alpha Tr(G) + (1-\alpha) D(G)$. If $\partial_1 \ge \partial_2 \ge \ldots \ge \partial_n$ are the eigenvalues of $D_\alpha(G)$, we define the generalized distance Estrada index of the graph G as $D_\alpha E(G) = \sum_{i=1}^{n} e^{\partial_i - \frac{2\alpha W(G)}{n}}$, where $W(G)$ denotes the Wiener index of G. It is clear from the definition that $D_0E(G) = DEE(G)$ and $2D_{\frac{1}{2}}E(G) = D^{Q}EE(G)$, where $DEE(G)$ denotes the distance Estrada index of G and $D^{Q}EE(G)$ denotes the distance signless Laplacian Estrada index of G. This shows that the concept of the generalized distance Estrada index of a graph G merges the theories of the distance Estrada index and the distance signless Laplacian Estrada index. In this paper, we obtain some lower and upper bounds for the generalized distance Estrada index in terms of various graph parameters associated with the structure of the graph G, and characterize the extremal graphs attaining these bounds. We also highlight the relationship between the generalized distance Estrada index and other graph-spectrum-based invariants, including the generalized distance energy. Moreover, we have worked out some expressions for $D_\alpha E(G)$ of some special classes of graphs.
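The definition in this abstract can be evaluated directly for small graphs. The sketch below is illustrative only and assumes NumPy is available; the function names are hypothetical. It builds the distance matrix with Floyd-Warshall, forms the generalized distance matrix from the transmissions, and sums the exponentials of the eigenvalues shifted by 2αW(G)/n.

```python
import numpy as np


def distance_matrix(adj):
    """All-pairs shortest-path distances of a connected unweighted graph."""
    a = np.asarray(adj, dtype=float)
    n = a.shape[0]
    d = np.where(a > 0, 1.0, np.inf)
    np.fill_diagonal(d, 0.0)
    for k in range(n):  # Floyd-Warshall relaxation through vertex k
        d = np.minimum(d, d[:, [k]] + d[[k], :])
    if np.isinf(d).any():
        raise ValueError("graph must be connected")
    return d


def generalized_distance_estrada(adj, alpha):
    """Sum over i of exp(d_i - 2*alpha*W/n), with d_i the eigenvalues of
    alpha*Tr(G) + (1 - alpha)*D(G) and W the Wiener index."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    d = distance_matrix(adj)
    n = d.shape[0]
    transmissions = np.diag(d.sum(axis=1))  # Tr(G): row sums of D(G)
    d_alpha = alpha * transmissions + (1.0 - alpha) * d
    eigenvalues = np.linalg.eigvalsh(d_alpha)
    wiener = d.sum() / 2.0  # each unordered pair is counted twice in D
    return float(np.sum(np.exp(eigenvalues - 2.0 * alpha * wiener / n)))


# Illustrative graph only: the complete graph K_4.
k4 = np.ones((4, 4)) - np.eye(4)
print(generalized_distance_estrada(k4, 0.5))
```

For the complete graph on n vertices the shifted eigenvalues are (n-1)(1-α) once and α-1 with multiplicity n-1, which gives a closed form to check the code against.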
|
10
|
Abstract
For a simple undirected connected graph G of order n, let $D(G)$, $D^{L}(G)$, $D^{Q}(G)$ and $Tr(G)$ be, respectively, the distance matrix, the distance Laplacian matrix, the distance signless Laplacian matrix and the diagonal matrix of the vertex transmissions of G. The generalized distance matrix $D_\alpha(G)$ is defined as $D_\alpha(G) = \alpha Tr(G) + (1-\alpha) D(G)$, where $\alpha \in [0,1]$. Here, we propose a new kind of Estrada index based on the Gaussianization of the generalized distance matrix of a graph. Let $\partial_1, \partial_2, \ldots, \partial_n$ be the generalized distance eigenvalues of a graph G. We define the generalized distance Gaussian Estrada index $P_\alpha(G)$ as $P_\alpha(G) = \sum_{i=1}^{n} e^{-\partial_i^2}$. Since the characterization of $P_\alpha(G)$ is very appealing in quantum information theory, it is interesting to study the quantity $P_\alpha(G)$ and explore properties such as its bounds, its dependence on the topology of the graph G and its dependence on the parameter $\alpha$. In this paper, we establish some bounds for the generalized distance Gaussian Estrada index $P_\alpha(G)$ of a connected graph G, involving different graph parameters, including the order n, the Wiener index $W(G)$, the transmission degrees and the parameter $\alpha \in [0,1]$, and characterize the extremal graphs attaining these bounds.
|
11
|
Abstract
Suppose that G is a graph on n vertices. G has n eigenvalues (of the adjacency matrix), denoted by $\lambda_1, \lambda_2, \ldots, \lambda_n$. The Gaussian Estrada index, denoted by $H(G)$ (Estrada et al., Chaos 27 (2017) 023109), is defined as $H(G)=\sum_{i=1}^{n} e^{-\lambda_i^2}$. The Gaussian Estrada index emphasizes the eigenvalues close to zero, which play an important role in chemistry, for example in molecular stability and molecular magnetic properties. In a network of particles governed by quantum mechanics, this graph-theoretic index is known to account for the information encoded in the eigenvalues of the Hamiltonian near zero by folding the graph spectrum. In this paper, we establish some new lower bounds for $H(G)$ in terms of the number of vertices, the number of edges, as well as the first Zagreb index.
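To make the weighting of near-zero eigenvalues concrete, the following sketch evaluates the Gaussian Estrada index as the sum of exp(-λ²) over the adjacency spectrum. It is an illustration under assumptions: NumPy is assumed to be available, and the function name and the 4-cycle example graph are chosen here rather than taken from the paper.

```python
import numpy as np


def gaussian_estrada_index(adj):
    """Sum of exp(-eigenvalue**2) over the adjacency spectrum of a graph."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(adj, dtype=float))
    return float(np.sum(np.exp(-eigenvalues ** 2)))


# Illustrative graph only: the 4-cycle, whose adjacency eigenvalues are 2, 0, 0, -2.
c4 = np.array(
    [
        [0, 1, 0, 1],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 0],
    ],
    dtype=float,
)
print(gaussian_estrada_index(c4))
```

Here the two zero eigenvalues each contribute 1, while the eigenvalues ±2 contribute only exp(-4) each, so the index is dominated by the part of the spectrum near zero.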
|
12
|
|
13
|
On the maximum Estrada index of 3-uniform linear hypertrees. ScientificWorldJournal 2014; 2014:637865. [PMID: 25302329 PMCID: PMC4164850 DOI: 10.1155/2014/637865] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/03/2014] [Accepted: 08/12/2014] [Indexed: 11/24/2022]
Abstract
For a simple hypergraph H on n vertices, its Estrada index is defined as $EE(H)=\sum_{i=1}^{n} e^{\lambda_i}$, where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of its adjacency matrix. In this paper, we determine the unique 3-uniform linear hypertree with the maximum Estrada index.
|
14
|
Estrada E. Generalized walks-based centrality measures for complex biological networks. J Theor Biol 2010; 263:556-65. [PMID: 20085771 DOI: 10.1016/j.jtbi.2010.01.014] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Received: 10/01/2009] [Revised: 01/03/2010] [Accepted: 01/14/2010] [Indexed: 11/29/2022]
Abstract
A strategy for zooming in and out of the topological environment of a node in a complex network is developed. This approach is applied here to generalize the subgraph centrality of nodes in complex networks. In this case the zooming-in strategy is based on the use of some known matrix functions which allow focusing locally on the environment of a node. When a zooming-out strategy is applied, new matrix functions are introduced, which give a more global picture of the topological surroundings of a node. These indices permit a modulation of the scales at which the environment of a node influences its centrality. We apply them to the study of 10 protein-protein interaction (PPI) networks. We illustrate the similarities and differences between the generalized subgraph centrality indices as well as between them and some classical centrality measures. We show here that the use of centrality indices based on the zooming-in strategy identifies a larger number of essential proteins in the yeast PPI network than any of the other centrality measures studied.
Affiliation(s)
- Ernesto Estrada
- Department of Mathematics and Statistics, Department of Physics, Institute of Complex Systems, University of Strathclyde, Glasgow G1 1XQ, UK.
|
15
|
Liu L, Wang T. Comparison of TOPS strings based on LZ complexity. J Theor Biol 2008; 251:159-66. [PMID: 18166201 DOI: 10.1016/j.jtbi.2007.11.016] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/24/2007] [Revised: 11/13/2007] [Accepted: 11/13/2007] [Indexed: 10/22/2022]
|
16
|
|
17
|
García-Domenech R, Galvez J, de Julian-Ortiz JV, Pogliani L. Some new trends in chemical graph theory. Chem Rev 2008; 108:1127-69. [PMID: 18302420 DOI: 10.1021/cr0780006] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.1] [Indexed: 11/29/2022]
Affiliation(s)
- Ramón García-Domenech
- Unidad de Investigación de Diseño de Farmacos y Conectividad Molecular, Departamento de Química Fisica, Facultad de Farmacía, Universitat de València, 46100 Burjassot, València, Spain
|
18
|
Estrada E. Tight-binding "dihedral orbitals" approach to the degree of folding of macromolecular chains. J Phys Chem B 2007; 111:13611-8. [PMID: 17988111 DOI: 10.1021/jp074595x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
We develop a tight-binding molecular approach to quantify the degree of folding of a macromolecular chain. This approach is based on the linear combination of "dihedral" orbitals to give molecular orbitals (LCDO-MO). The dihedral orbitals are a set of orbitals situated in each dihedral angle of the chain. The LCDO-MO approach remains basically topological, and we display its direct relation to known graph theoretical concepts. Using this approach, we define the dihedral electronic energy and the dihedral electronic partition function of a linear macromolecular chain. We show that the partition function per dihedral angle quantifies the degree of folding of the dihedral graph. We analyze the empirical relationship between these two functions by using a series of 100 proteins. We also study the relation between these two functions and the percentages of secondary structure for these proteins. Finally, we illustrate the use of the dihedral energy and the partition function in structure-property studies of proteins by analyzing the binding of steroids to DB3 antibody.
Affiliation(s)
- Ernesto Estrada
- Complex Systems Research Group, X-rays Unit, RIAIDT, Edificio CACTUS, University of Santiago de Compostela, 15706 Santiago de Compostela, Spain.
|
19
|
Estrada E, Hatano N. Tight-binding ‘dihedral orbitals’ approach to electronic communicability in macromolecular chains. Chem Phys Lett 2007. [DOI: 10.1016/j.cplett.2007.10.028] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
|
20
|
|
21
|
Estrada E, Hatano N. Statistical-mechanical approach to subgraph centrality in complex networks. Chem Phys Lett 2007. [DOI: 10.1016/j.cplett.2007.03.098] [Citation(s) in RCA: 89] [Impact Index Per Article: 5.2] [Indexed: 10/23/2022]
|
22
|
González-Díaz H, Saíz-Urra L, Molina R, González-Díaz Y, Sánchez-González A. Computational chemistry approach to protein kinase recognition using 3D stochastic van der Waals spectral moments. J Comput Chem 2007; 28:1042-8. [PMID: 17269125 DOI: 10.1002/jcc.20649] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022]
Abstract
Three-dimensional (3D) protein structures now frequently lack functional annotations because of the increase in the rate at which chemical structures are solved with respect to experimental knowledge of biological activity. As a result, predicting structure-function relationships for proteins is an active research field in computational chemistry and has implications in medicinal chemistry, biochemistry and proteomics. In previous studies stochastic spectral moments were used to predict protein stability or function (González-Díaz, H. et al. Bioorg Med Chem 2005, 13, 323; Biopolymers 2005, 77, 296). Nevertheless, these moments take into consideration only electrostatic interactions and ignore other important factors such as van der Waals interactions. The present study introduces a new class of 3D structure molecular descriptors for folded proteins named the stochastic van der Waals spectral moments (${}^{o}\beta_k$). Among many possible applications, recognition of kinases was selected due to the fact that previous computational chemistry studies in this area have not been reported, despite the widespread distribution of kinases. The best linear model found was $K_{act} = -9.44\,{}^{o}\beta_0(c) + 10.94\,{}^{o}\beta_5(c) - 2.40\,{}^{o}\beta_0(i) + 2.45\,{}^{o}\beta_5(m) + 0.73$, where core (c), inner (i) and middle (m) refer to specific spatial protein regions. The model with a high Matthew's regression coefficient (0.79) correctly classified 206 out of 230 proteins (89.6%) including both training and predicting series. An area under the ROC curve of 0.94 differentiates our model from a random classifier. A subsequent principal components analysis of 152 heterogeneous proteins demonstrated that $\beta_k$ codifies information different to other descriptors used in protein computational chemistry studies. Finally, the model recognizes 110 out of 125 kinases (88.0%) in a virtual screening experiment and this can be considered as an additional validation study (these proteins were not used in training or predicting series).
Affiliation(s)
- Humberto González-Díaz
- Department of Organic Chemistry and Institute of Industrial Pharmacy, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain.
|
23
|
|
24
|
Estrada E. Point scattering: A new geometric invariant with applications from (Nano)clusters to biomolecules. J Comput Chem 2007; 28:767-77. [PMID: 17226832 DOI: 10.1002/jcc.20541] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
Abstract
A new geometric invariant is defined from "first principles" for a point ensemble, which can represent clusters, molecules, crystals, and biomolecules. The scattering of a point ensemble is defined in terms of the Euclidean distance matrix and a vector measuring the weighted departure of the points from the cluster centre. Using the Rayleigh-Ritz theorem this function is maximized obtaining the point scattering of the ensemble. The point scattering shows several properties which are useful for studying clusters, molecules, crystals, and biomolecules. We examined different natural clusters of hard spheres such as colloidal particles and fullerenes, as well as protein-peptide complexes and the effect of temperature on protein structure. In all cases point scattering differentiates point ensembles with different structures, which are not distinguished by other geometric invariants, such as the second moment of mass distribution, surface areas, and volumes. Point scattering also shows better correlation with thermodynamic parameters of binding and describes the interior cavities of hollowed ensembles better than the other geometric measures.
Affiliation(s)
- Ernesto Estrada
- Complex Systems Research Group, X-ray Unit, RIAIDT, Edificio CACTUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain.
|
25
|
Abstract
The Estrada index EE is a recently proposed molecular structure-descriptor, used in the modeling of certain features of the 3D structure of organic molecules, in particular of the degree of folding of proteins and other long-chain biopolymers. The Estrada index is computed from the spectrum of the molecular graph. Therefore, finding its relation with the spectral radius r (the greatest graph eigenvalue) is of interest, especially because the structure-dependency of r is relatively well understood. In this work, the basic characteristics of the relation between EE and r, which turned out to be much more complicated than initially anticipated, were determined.
|
26
|
Abstract
A quantitative measure of the degree of folding of azurins and pseudoazurins has been made. We have found that the reduction potential of azurins and pseudoazurins is a function of the contribution to the degree of folding of His117, a key amino acid in electron transfer which is directly bonded to copper in these proteins. The folding degree of His117 explains 95% of the variance in the experimental values of the reduction potential of azurins and pseudoazurins. The change in the folding degree of this amino acid influences several geometric parameters of the main backbones of these proteins. Among them, the angle N(His117)...Cu...S(Cys112), which plays an important role in electron transport, shows some non-linear correlation with the reduction potential of azurins and pseudoazurins, whereas the N(His117)...Cu distance does not. However, this angle explains less than 75% of the variance of the reduction potential of these proteins, compared with the 95% explained by the folding degree of His117.
Affiliation(s)
- Ernesto Estrada
- Complex Systems Research Group, X-Ray Unit, Edificio CACTUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain.
|
27
|
Krilov G, Randić M. Quantitative characterization of protein structure: application to a novel α/β fold. NEW J CHEM 2004. [DOI: 10.1039/b405153j] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
|