1. Beltrán HI, Alas-Guardado SJ, González-Pérez PP. Improving coarse-grained models of protein folding through weighting of polar-polar/hydrophobic-hydrophobic interactions into crowded spaces. J Mol Model 2022; 28:87. [PMID: 35262807 DOI: 10.1007/s00894-022-05071-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2021] [Accepted: 02/26/2022] [Indexed: 10/18/2022]
Abstract
Seven hydrophobic-polar sequences were tested in two types of 2D square lattice, homogeneous and correlated, the latter simulating molecular crowding as a geometric boundary restriction. The 2D structures were optimized with a variant of Dill's model, inspired by a convex function, that takes into account both hydrophobic (Dill's model) and polar interactions and so includes more structural information to reach better folding solutions. With correlated networks, the degrees of freedom available to the folding sequences were limited; as a result, in all cases more successful structural trials were found than in a homogeneous lattice. Most of the sequences employed were designed by our workgroup: two of them were folded with other approaches, another is a modified version of a previous sequence, and initial forms of the other two have been employed but without taking polar-polar contributions into account. Three of them are newly proposed and are intended to test the joint hydrophobic-hydrophobic and polar-polar contributions in crowded spaces. One sequence turned out to be the most difficult of the seven to fold, perhaps because of its intrinsic (i) degrees of freedom and (ii) motifs of the expected 2D HP structure. For two sequences, optimal folding was not achieved with either approach, yet the correlated-network approach not only produced better results than the homogeneous space, but its best values under crowding were also very close to the expected optimal fitness. In general, five sequences folded better with medium lattice units for correlated media, whereas the other two folded better with somewhat larger lattice units, revealing that, depending on its degrees of freedom and particular folding motifs, each sequence requires tuned crowding to achieve better folding.
Therefore, the main goal was to obtain a modified 2D HP lattice model that mimics the folding of proteins or secondary structures, such as β-sheets, taking into account both hydrophobic-hydrophobic and polar-polar interactions, and to fold them in a crowded environment. This simple but sufficient construction serves to determine the information needed to fold sequences in a minimal but complete heuristic model. Finally, we claim that all sequences folded in crowded spaces achieve better results than those folded in homogeneous ones.
Affiliation(s)
- Hiram Isaac Beltrán
- Departamento de Ciencias Básicas, Universidad Autónoma Metropolitana, Unidad Azcapotzalco, CDMX 02200, Mexico, Mexico
- Salomón J Alas-Guardado
- Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, CDMX 05300, Mexico, Mexico.
- Pedro Pablo González-Pérez
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana, Unidad Cuajimalpa, CDMX 05300, Mexico, Mexico.

2. Alas-Guardado SJ, González-Pérez PP, Beltrán HI. Contributions of topological polar-polar contacts to achieve better folding stability of 2D/3D HP lattice proteins: An in silico approach. AIMS Biophysics 2021. [DOI: 10.3934/biophy.2021023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022] Open
Abstract
Many simple hydrophobic-polar lattice models, such as Dill's model (called Model 1 herein), aim to fold structures through hydrophobic-hydrophobic interactions, mimicking the well-known hydrophobic collapse present in protein structures. In this work, we studied 11 designed hydrophobic-polar sequences: S1-S8 folded in a 2D square lattice and S9-S11 folded in a 3D cubic lattice. To fold these structures better, we developed Model 2, an approximation to a convex function that weights not only hydrophobic-hydrophobic but also polar-polar contacts, as an augmented version of Model 1. In this partitioned approach, the hydrophobic-hydrophobic weighting was tuned as α-1 and the polar-polar weighting as α. The model is centered on preserving the required hydrophobic substructure while including the otherwise absent polar-polar interactions, so as to reach a better folding score that also captures the polar-polar substructure. In all tested cases the folding trials were better with Model 2, using α values of 0.05, 0.1, 0.2 and 0.3 depending on sequence size, even finding optimal scores not reached with Model 1. An important result is that the better the folding score, the lower the α weighting required. When α values above 0.3 are employed, whatever the nature of the hydrophobic-polar sequence, hydrophobic-hydrophobic contacts begin to be excluded, yielding misfolded sequences. Therefore, the value of α that correctly folds structures results from a careful weighting of hydrophobic-hydrophobic against polar-polar contacts.
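The contact counting behind such a score can be illustrated with a short sketch. It counts topological H-H and P-P contacts of a conformation on a 2D square lattice and combines them with a weight α. The function names, the 4-bead example and, in particular, the combination rule `(1 - alpha) * HH + alpha * PP` treated as a fitness to maximize are assumptions made for illustration; the abstract only states the weights as α-1 and α and does not spell out the sign convention.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def count_contacts(sequence: str, coords: List[Coord]) -> Tuple[int, int]:
    """Count topological H-H and P-P contacts of a 2D square-lattice fold.

    A topological contact is a pair of beads on adjacent lattice sites
    that are not consecutive along the chain.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per bead is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        if abs(x0 - x1) + abs(y0 - y1) != 1:
            raise ValueError("consecutive beads must be lattice neighbours")

    site_to_index = {site: i for i, site in enumerate(coords)}
    hh = pp = 0
    for i, (x, y) in enumerate(coords):
        # Look only right and up so that each pair is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = site_to_index.get(neighbour)
            if j is None or abs(i - j) == 1:
                continue
            if sequence[i] == "H" and sequence[j] == "H":
                hh += 1
            elif sequence[i] == "P" and sequence[j] == "P":
                pp += 1
    return hh, pp


def weighted_fitness(hh: int, pp: int, alpha: float) -> float:
    """ASSUMED form: convex combination (1 - alpha) * HH + alpha * PP."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * hh + alpha * pp


# Hypothetical 4-bead chain folded into a unit square.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
hh, pp = count_contacts("HPPH", square)
fitness = weighted_fitness(hh, pp, alpha=0.1)
```

With `alpha = 0` the assumed fitness reduces to a plain H-H contact count, which is the quantity a Dill-type model rewards; small α values add a minor polar-polar term without outweighing the hydrophobic core.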

3. Alas SDJ, González-Pérez PP, Beltrán HI. In silico minimalist approach to study 2D HP protein folding into an inhomogeneous space mimicking osmolyte effect: First trial in the search of foldameric backbones. Biosystems 2019; 181:31-43. [PMID: 31029589 DOI: 10.1016/j.biosystems.2019.04.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 09/27/2018] [Revised: 04/01/2019] [Accepted: 04/08/2019] [Indexed: 12/22/2022]
Abstract
We have employed our bioinformatics workbench, named Evolution, a Multi-Agent System based architecture with lattice-bead models, evolutionary algorithms, and correlated networks as inhomogeneous spaces with different correlation lengths, mimicking the osmolyte effect (molecular crowding), to survey protein folding in silico. The study uses hydrophobic-polar (H-P) sequences in inhomogeneous 2D square lattices, since general biophysicochemical trends consider i) that the backbone is one of the major components responsible for protein folding and ii) that the osmolyte effect plays an important role in improving folding kinetics and reaching deeper optima. We have designed foldamers as square n × n (n = 3, 4, 5, 6) arrays of hydrophobic cores stabilized by H⋯H contacts, attached through short PP (P2) or long PPPP (P4) loops, giving rise to 8 sequences (S1 to S8) with known optimal scores. The designed sequences were folded in different inhomogeneous spaces, and crowded media indeed induced deeper optima. Crowding is necessary for the best fold, but the space should be constrained just enough to induce folding without preventing chain movement. The constrained space plays an important role in reaching the optimal structure, with an optimal correlation length that depends on the size of the designed foldamer sequence, implying that the medium affects the folding pathways, as happens in real systems. The designed structures were found; moreover, they also reach degenerate states, and both folding states could be surveyed by considering i) backbone information and ii) the osmolyte effect. In nature, proteins fold into different structures aiming to reach a global minimum, but a local minimum could be enough for the protein to be functional or dysfunctional.
Affiliation(s)
- Salomón de Jesús Alas
- Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, 05300, Mexico.
- Pedro Pablo González-Pérez
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, 05300, Ciudad de Mexico, Mexico
- Hiram Isaac Beltrán
- Departamento de Ciencias Básicas, Universidad Autónoma Metropolitana Unidad Azcapotzalco, Ciudad de México, 02200, Mexico.

4. González-Pérez PP, Orta DJ, Peña I, Flores EC, Ramírez JU, Beltrán HI, Alas SJ. A Computational Approach to Studying Protein Folding Problems Considering the Crucial Role of the Intracellular Environment. J Comput Biol 2017; 24:995-1013. [PMID: 28177752 DOI: 10.1089/cmb.2016.0115] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 12/28/2022] Open
Abstract
Intracellular protein folding (PF) is performed in a highly inhomogeneous, crowded, and correlated environment. Due to this inherent complexity, the study and understanding of PF phenomena is a fundamental issue in the field of computational systems biology. In particular, it is important to use a modeled medium that accurately reflects PF in natural systems. In the current study, we present a simulation wherein PF is carried out within an inhomogeneous modeled medium. Simulation resources included a two-dimensional hydrophobic-polar (HP) model, evolutionary algorithms, and the dual site-bond model. The dual site-bond model was used to develop an environment where HP beads could be folded. Our modeled medium included correlation lengths and fractal-like behavior, which were selected according to HP sequence lengths to induce folding in a crowded environment. Analysis of three benchmark HP sequences showed that the modeled inhomogeneous space played an important role in deeper energy folding and obtained better performance and convergence compared with homogeneous environments. Our computational approach also demonstrated that our correlated network provided a better space for PF. Thus, our approach represents a major advancement in PF simulations, not only for folding but also for understanding functional chemical structure and physicochemical properties of proteins in crowded molecular systems, which normally occur in nature.
Affiliation(s)
- Pedro P González-Pérez
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Daniel J Orta
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Irving Peña
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Eduardo C Flores
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- José U Ramírez
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Hiram I Beltrán
- Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México
- Salomón J Alas
- Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Ciudad de México, México

5. Purvine E, Monson K, Jurrus E, Star K, Baker NA. Energy Minimization of Discrete Protein Titration State Models Using Graph Theory. J Phys Chem B 2016; 120:8354-60. [PMID: 27089174 DOI: 10.1021/acs.jpcb.6b02059] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
There are several applications in computational biophysics that require the optimization of discrete interacting states, for example, amino acid titration states, ligand oxidation states, or discrete rotamer angles. Such optimization can be very time-consuming as it scales exponentially in the number of sites to be optimized. In this paper, we describe a new polynomial time algorithm for optimization of discrete states in macromolecular systems. This algorithm was adapted from image processing and uses techniques from discrete mathematics and graph theory to restate the optimization problem in terms of "maximum flow-minimum cut" graph analysis. The interaction energy graph, a graph in which vertices (amino acids) and edges (interactions) are weighted with their respective energies, is transformed into a flow network in which the value of the minimum cut in the network equals the minimum free energy of the protein and the cut itself encodes the state that achieves the minimum free energy. Because of its deterministic nature and polynomial time performance, this algorithm has the potential to allow for the ionization state of larger proteins to be discovered.
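The reduction of a discrete-state energy minimization to a minimum cut can be sketched for the simplest case. The code below is a generic textbook graph-cut construction of the kind used in image segmentation, not the authors' algorithm: it assumes two states per site, a cost per site and state (`unary`), and a non-negative penalty whenever two interacting sites take different states (`pairwise`). All names and numbers are hypothetical. The maximum flow is computed with a plain Edmonds-Karp search, and the sites still reachable from the source in the residual network are assigned state 0.

```python
from collections import deque
from itertools import product


def energy(unary, pairwise, labels):
    """Total energy of one assignment of binary states."""
    total = sum(costs[state] for costs, state in zip(unary, labels))
    total += sum(w for (i, j), w in pairwise.items() if labels[i] != labels[j])
    return total


def min_cut_states(unary, pairwise):
    """Minimize the energy via max-flow/min-cut; returns (labels, energy)."""
    n = len(unary)
    source, sink = n, n + 1
    cap = [dict() for _ in range(n + 2)]  # residual capacities

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)

    for i, (cost0, cost1) in enumerate(unary):
        base = min(cost0, cost1)  # constant shift keeps capacities >= 0
        add_edge(source, i, cost1 - base)  # cut if site i takes state 1
        add_edge(i, sink, cost0 - base)    # cut if site i takes state 0
    for (i, j), w in pairwise.items():
        if w < 0:
            raise ValueError("this sketch needs non-negative penalties")
        add_edge(i, j, w)
        add_edge(j, i, w)

    while True:
        # Breadth-first search for an augmenting path (Edmonds-Karp).
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break  # parent now holds the source side of a minimum cut
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]

    labels = [0 if i in parent else 1 for i in range(n)]
    return labels, energy(unary, pairwise, labels)


def brute_force_minimum(unary, pairwise):
    """Exhaustive check over all 2**n assignments (exponential cost)."""
    return min(energy(unary, pairwise, labels)
               for labels in product((0, 1), repeat=len(unary)))


# Hypothetical three-site example.
unary = [(0, 3), (2, 0), (1, 1)]
pairwise = {(0, 1): 1, (1, 2): 2}
labels, best = min_cut_states(unary, pairwise)
```

The brute-force helper is included only to make the contrast concrete: it enumerates all 2^n assignments, whereas the flow computation runs in time polynomial in the number of sites and interactions.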
Affiliation(s)
- Nathan A Baker
- Division of Applied Mathematics, Brown University, Providence, Rhode Island 02912, United States

6. Alas SJ, González-Pérez PP. Simulating the folding of HP-sequences with a minimalist model in an inhomogeneous medium. Biosystems 2016; 142-143:52-67. [PMID: 27020756 DOI: 10.1016/j.biosystems.2016.03.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 11/24/2015] [Revised: 03/16/2016] [Accepted: 03/24/2016] [Indexed: 11/24/2022]
Abstract
The phenomenon of protein folding is a fundamental issue in the field of computational molecular biology. Protein folding inside cells takes place in a highly inhomogeneous, tortuous, and correlated environment. Therefore, it is important for theoretical studies to include the medium in which protein folding develops. In this work we present a combination of three models to mimic protein folding inside an inhomogeneous medium. The models used here are the Hydrophobic-Polar (HP) model in a 2D square arrangement, Evolutionary Algorithms (EA), and the Dual Site Bond Model (DSBM). The DSBM is used to simulate the environment where the HP beads are folded; in this case the medium is correlated and fractal-like. The analysis of five benchmark HP sequences shows that an inhomogeneous space with a given correlation length and fractal dimension plays an important role in the correct folding of these sequences, which does not occur in a homogeneous space.
Affiliation(s)
- S J Alas
- Departamento de Ciencias Naturales, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Av. Vasco de Quiroga 4871, Distrito Federal 05300, Mexico
- P P González-Pérez
- Departamento de Matemáticas Aplicadas y Sistemas, Universidad Autónoma Metropolitana Unidad Cuajimalpa, Av. Vasco de Quiroga 4871, Distrito Federal 05300, Mexico.

7. Pérez-Montoto LG, Santana L, González-Díaz H. Scoring function for DNA-drug docking of anticancer and antiparasitic compounds based on spectral moments of 2D lattice graphs for molecular dynamics trajectories. Eur J Med Chem 2009; 44:4461-9. [PMID: 19604606 PMCID: PMC7127518 DOI: 10.1016/j.ejmech.2009.06.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/08/2009] [Revised: 06/04/2009] [Accepted: 06/05/2009] [Indexed: 02/02/2023]
Abstract
We introduce here a new class of invariants for MD trajectories based on the spectral moments πk(L) of the Markov matrix associated with lattice network-like (LN) graph representations of Molecular Dynamics (MD) trajectories. The procedure embeds the MD energy profiles in a 2D Cartesian coordinate system using simple heuristic rules. At the same time, we associate the LN with a Markov matrix that describes the probabilities of passing from one state to another in the new 2D space. We construct this type of LN for 422 MD trajectories obtained in DNA-drug docking experiments of 57 furocoumarins. The combined use of psoralens and ultraviolet light (UVA) radiation is known as PUVA therapy. PUVA is effective in the treatment of skin diseases such as psoriasis and mycosis fungoides. PUVA is also useful to treat human platelet (PTL) concentrates in order to eliminate Leishmania spp. and Trypanosoma cruzi, parasites that cause Leishmaniosis (a dangerous skin and visceral disease) and Chagas disease, respectively, and that may circulate in blood products collected from infected donors. We included in this study both linear (psoralens) and angular (angelicins) furocoumarins. We grouped the LNs into two sets; set1: DNA-drug complex MD trajectories for active compounds, and set2: MD trajectories of non-active compounds or non-optimal MD trajectories of active compounds. We calculated the respective πk(L) values for all these LNs and used them as inputs to train a new classifier that discriminates set1 from set2 cases. In the training series the model correctly classifies 79 out of 80 (specificity = 98.75%) set1 and 226 out of 238 (sensitivity = 94.96%) set2 trajectories. In the independent validation series the model correctly classifies 26 out of 26 (specificity = 100%) set1 and 75 out of 78 (sensitivity = 96.15%) set2 trajectories.
We propose this new model as a scoring function to guide DNA-docking studies in the drug design of new coumarins for anticancer or antiparasitic PUVA therapy.
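As a rough illustration of the descriptor named in this abstract, the sketch below computes spectral moments of a Markov matrix for a small graph. It assumes the common definition of the k-th spectral moment as the trace of the k-th matrix power and builds the Markov matrix by row-normalizing a plain adjacency matrix; the abstract does not give the weighting the authors apply to their lattice networks, so both choices, the function names and the 3-node path graph are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def markov_matrix(adjacency: Matrix) -> Matrix:
    """Row-normalize an adjacency matrix into transition probabilities."""
    rows = []
    for row in adjacency:
        total = sum(row)
        if total == 0:
            raise ValueError("every node needs at least one neighbour")
        rows.append([value / total for value in row])
    return rows


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product for small square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(matrix: Matrix, max_order: int) -> List[float]:
    """Return [Tr(M^1), Tr(M^2), ..., Tr(M^max_order)]."""
    n = len(matrix)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    moments = []
    for _ in range(max_order):
        power = matmul(power, matrix)
        moments.append(sum(power[i][i] for i in range(n)))
    return moments


# Hypothetical 3-node path graph standing in for a lattice network.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
moments = spectral_moments(markov_matrix(path_graph), max_order=3)
```

Each moment is a single number per graph, so a fixed-length vector of the first few moments can serve as input to a classifier regardless of trajectory length.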
Affiliation(s)
- Lázaro G. Pérez-Montoto
- Department of Microbiology & Parasitology, and Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain
- Lourdes Santana
- Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain
- Humberto González-Díaz
- Department of Microbiology & Parasitology, and Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782, Spain

8. Khodabakhshi AH, Manuch J, Rafiey A, Gupta A. Inverse protein folding in 3D hexagonal prism lattice under HPC model. J Comput Biol 2009; 16:769-802. [PMID: 19522663 DOI: 10.1089/cmb.2008.0202] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
Abstract
The inverse protein folding problem is that of designing an amino acid sequence which has a prescribed native protein fold. This problem arises in drug design where a particular structure is necessary to ensure proper protein-protein interactions. Previously, tubular structures for a three-dimensional (3D) hexagonal prism lattice were introduced and their stability was formally proved for simple instances under the hydrophobic-polar (HP) model of Dill. In this article, we generalize the design of tubular structures to allow for much larger variety of designable structures by allowing branching of tubes. Our generalized design could be used to roughly approximate given 3D shapes in the considered lattice. Although the generalized tubular structures are not stable under the HP model, we can prove that a simple instance of generalized tubular structures is structurally stable (all native folds have the designed shape) under a refined version of the HP model, called the HPC model. We conjecture that there is a way to choose which hydrophobic monomers are cysteines in all generalized tubular structures such that the designed proteins are structurally stable under the HPC model.

9. Pérez-Montoto LG, Dea-Ayuela MA, Prado-Prado FJ, Bolas-Fernández F, Ubeira FM, González-Díaz H. Study of peptide fingerprints of parasite proteins and drug-DNA interactions with Markov-Mean-Energy invariants of biopolymer molecular-dynamic lattice networks. Polymer 2009; 50:3857-3870. [PMID: 32287404 PMCID: PMC7111648 DOI: 10.1016/j.polymer.2009.05.055] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/10/2009] [Revised: 05/06/2009] [Accepted: 05/14/2009] [Indexed: 11/26/2022]
Abstract
Since the advent of Molecular Dynamics (MD) in biopolymer science with the study by Karplus et al. on protein dynamics, MD has become by far the foremost well-established computational technique to investigate the structure and function of biomolecules and their respective complexes and interactions. The analysis of MD trajectories (MDTs) remains, however, the greatest challenge and requires a great deal of insight, experience, and effort. Here, we introduce a new class of invariants for MDTs based on the spatial distribution of Mean-Energy values ξk(L) on a 2D Euclidean space representation of the MDTs. The procedure forces one MD trajectory to fold into a 2D Cartesian coordinate system using a step-by-step procedure driven by simple rules. The ξk(L) values are invariants of a Markov matrix (¹Π), which describes the probabilities of transition between two states in the new 2D space and is associated with a graph representation of MDTs similar to the lattice networks (LNs) of DNA and protein sequences. We also introduce a new algorithm to perform phylogenetic analysis of peptides based on MDTs instead of the sequence of the polypeptide. In a first experiment, we illustrate this algorithm for 35 peptides present in the Peptide Mass Fingerprint (PMF) of a new protein of Leishmania infantum studied in this work. We report, for the first time, 2D Electrophoresis isolation, MALDI TOF Mass Spectroscopy characterization, and MASCOT search results for this PMF. In a second experiment, we construct the LNs for 422 MDTs obtained in DNA-Drug Docking simulations of the interaction of 57 anticancer furocoumarins with a DNA oligonucleotide. We calculated the respective ξk(L) values for all these LNs and used them as inputs to train a new classifier with Accuracy = 85.44% and 84.91% in training and validation, respectively. The new model can be used as a scoring function to guide DNA-Drug Docking studies in the drug design of new coumarins for PUVA therapy.
The new phylogenetic analysis algorithms encode information different from sequence similarity and may be used to analyze MDTs obtained in Docking or modeling experiments for any class of biopolymers. The work opens a new perspective on the analysis and applications of MD in polymer sciences.
Affiliation(s)
- Lázaro Guillermo Pérez-Montoto
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- María Auxiliadora Dea-Ayuela
- Departamento de Atención Sanitaria, Salud Pública y Sanidad Animal, Facultad CC Experimentales y de La Salud, Universidad CEU Cardenal Herrera, 46113 Moncada (Valencia), Spain
- Francisco J Prado-Prado
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Florencio M Ubeira
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Humberto González-Díaz
- Department of Microbiology and Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain

10. Khodabakhshi AH, Maňuch J, Rafiey A, Gupta A. Stable Structure-Approximating Inverse Protein Folding in 2D Hydrophobic-Polar-Cysteine (HPC) Model. J Comput Biol 2009; 16:19-30. [DOI: 10.1089/cmb.2008.0096] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Affiliation(s)
- Ján Maňuch
- School of Computing Science, Simon Fraser University, Burnaby, Canada
- Arash Rafiey
- School of Computing Science, Simon Fraser University, Burnaby, Canada
- Arvind Gupta
- School of Computing Science, Simon Fraser University, Burnaby, Canada

11. Mann M, Maticzka D, Saunders R, Backofen R. Classifying proteinlike sequences in arbitrary lattice protein models using LatPack. HFSP Journal 2008; 2:396-404. [PMID: 19436498 DOI: 10.2976/1.3027681] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 08/18/2008] [Accepted: 10/23/2008] [Indexed: 01/06/2023]
Abstract
Knowledge of a protein's three-dimensional native structure is vital in determining its chemical properties and functionality. However, experimental methods to determine structure are very costly and time-consuming. Computational approaches such as folding simulations and structure prediction algorithms are quicker and cheaper but lack consistent accuracy. This currently restricts extensive computational studies to abstract protein models. It is thus essential that simplifications induced by the models do not negate scientific value. Key to this is the use of thoroughly defined proteinlike sequences. In such cases abstract models can allow for the investigation of important biological questions. Here, we present a procedure to generate and classify proteinlike sequence data sets. Our LatPack tools and the approach in general are applicable to arbitrary lattice protein models. Identification is based on thermodynamic kinetic features and incorporates the sequential assembly of proteins by addressing cotranslational folding. We demonstrate the approach in the widely used unrestricted 3D-cubic HP-model. The resulting sequence set is the first large data set for this model exhibiting the proteinlike properties required. Our data tools are freely available and can be used to investigate protein-related problems.

12. Dea-Ayuela MA, Pérez-Castillo Y, Meneses-Marcel A, Ubeira FM, Bolas-Fernández F, Chou KC, González-Díaz H. HP-Lattice QSAR for dynein proteins: experimental proteomics (2D-electrophoresis, mass spectrometry) and theoretic study of a Leishmania infantum sequence. Bioorg Med Chem 2008; 16:7770-6. [PMID: 18662882 DOI: 10.1016/j.bmc.2008.07.023] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Received: 01/25/2008] [Revised: 06/23/2008] [Accepted: 07/02/2008] [Indexed: 10/21/2022]
Abstract
The toxicity and inefficacy of current organic drugs against Leishmaniosis justify research projects to find new molecular targets in Leishmania species including Leishmania infantum (L. infantum) and Leishmania major (L. major), both important pathogens. In this sense, quantitative structure-activity relationship (QSAR) methods, which are very useful in Bioorganic and Medicinal Chemistry to discover small-sized drugs, may help to identify not only new drugs but also new drug targets, if we apply them to proteins. Dyneins are important proteins of these parasites, governing fundamental processes such as cilia and flagella motion, nuclear migration, organization of the mitotic spindle, and chromosome separation during mitosis. However, despite the interest in them as potential drug targets, so far there has been no report whatsoever on dyneins with QSAR techniques. To the best of our knowledge, we report here the first QSAR for dynein proteins. We used as input the Spectral Moments of a Markov matrix associated with the HP-Lattice Network of the protein sequence. The data contain 411 protein sequences of different species selected by ClustalX to develop a QSAR that correctly discriminates on average between 92.75% and 92.51% of dyneins and other proteins in four different train and cross-validation datasets. We also report a combined experimental and theoretic study of a new dynein sequence in order to illustrate, with a practical example, the utility of the model in the search for potential drug targets. First, we carried out a 2D-electrophoresis analysis of L. infantum biological samples. Next, we excised from 2D-E gels one spot of interest belonging to an unknown protein or protein fragment in the region M<20,200 and pI<4. We used the MASCOT search engine to find proteins in the L. major database with the highest similarity score to the MS of the protein isolated from L. infantum.
We used the QSAR model to predict the new sequence as dynein with a probability of 99.99% without relying upon alignment. In order to confirm the previous function annotation, we predicted the sequences as dynein with the BLAST and omniBLAST tools (96% alignment similarity to dyneins of other species). Using this combined strategy, we have successfully identified an L. infantum protein containing a dynein heavy chain, and illustrated the potential use of the QSAR model as a complement to alignment tools.

13. Mann M, Will S, Backofen R. CPSP-tools--exact and complete algorithms for high-throughput 3D lattice protein studies. BMC Bioinformatics 2008; 9:230. [PMID: 18462492 PMCID: PMC2396640 DOI: 10.1186/1471-2105-9-230] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 12/18/2007] [Accepted: 05/07/2008] [Indexed: 02/06/2023] Open
Abstract
Background: The principles of protein folding and evolution pose problems of very high inherent complexity. Often these problems are tackled using simplified protein models, e.g. lattice proteins. The CPSP-tools package provides programs to solve exactly and completely the problems typical of studies using 3D lattice protein models. Among the tasks addressed are the prediction of (all) globally optimal and/or suboptimal structures as well as sequence design and neutral network exploration. Results: In contrast to stochastic approaches, which are not capable of answering many fundamental questions, our methods are based on fast, non-heuristic techniques. The resulting tools are designed for high-throughput studies of 3D-lattice proteins utilising the Hydrophobic-Polar (HP) model. The source bundle is freely available [1]. Conclusion: The CPSP-tools package is the first set of exact and complete methods for extensive, high-throughput studies of non-restricted 3D-lattice protein models. In particular, our package deals with cubic and face centered cubic (FCC) lattices.
Affiliation(s)
- Martin Mann
- Bioinformatics Group, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.

14. González-Díaz H, González-Díaz Y, Santana L, Ubeira FM, Uriarte E. Proteomics, networks and connectivity indices. Proteomics 2008; 8:750-78. [DOI: 10.1002/pmic.200700638] [Citation(s) in RCA: 170] [Impact Index Per Article: 10.6] [Indexed: 12/19/2022]

15. Agüero-Chapín G, González-Díaz H, de la Riva G, Rodríguez E, Sánchez-Rodríguez A, Podda G, Vazquez-Padrón RI. MMM-QSAR Recognition of Ribonucleases without Alignment: Comparison with an HMM Model and Isolation from Schizosaccharomyces pombe, Prediction, and Experimental Assay of a New Sequence. J Chem Inf Model 2008; 48:434-48. [DOI: 10.1021/ci7003225] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Affiliation(s)
- Guillermín Agüero-Chapín, Humberto González-Díaz, Gustavo de la Riva, Edrey Rodríguez, Aminael Sánchez-Rodríguez, Gianni Podda, Roberto I. Vazquez-Padrón
- Dipartimento Farmaco Chimico Tecnologico, Universitá Degli Studi di Cagliari, Cagliari, 09124, Italy, CAP, Faculty of Chemistry and Pharmacy, IBP, and CBQ, UCLV, Santa Clara 54830, Cuba, Unit for Bioinformatics & Connectivity Analysis (UBICA), Institute of Industrial Pharmacy and Department of Organic Chemistry, Faculty of Pharmacy, USC, Santiago de Compostela 15782, Spain, CINVESTAV-LANGEBIO, Irapuato, Guanajuato 36821, México, Caribbean Vitroplants, Santo Domingo 1464, Dominican Republic, and Vascular
|
16
|
Fernández M, Fernández L, Abreu JI, Garriga M. Classification of voltage-gated K(+) ion channels from 3D pseudo-folding graph representation of protein sequences using genetic algorithm-optimized support vector machines. J Mol Graph Model 2008; 26:1306-14. [PMID: 18289899 DOI: 10.1016/j.jmgm.2008.01.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/31/2007] [Revised: 01/03/2008] [Accepted: 01/03/2008] [Indexed: 11/26/2022]
Abstract
Voltage-gated K(+) ion channels (VKCs) are membrane proteins that regulate the passage of potassium ions through membranes. This work reports a classification scheme for VKCs according to the signs of three electrophysiological variables: activation threshold voltage (V(t)), half-activation voltage (V(a50)) and half-inactivation voltage (V(h50)). The VKC sequences were encoded with a novel 3D pseudo-folding graph representation of protein sequences. Amino acid pseudo-folding 3D distances count (AAp3DC) descriptors, calculated from the Euclidean distance matrices (EDMs), were tested for building the classifiers. Genetic algorithm (GA)-optimized support vector machines (SVMs) with a radial basis function (RBF) kernel discriminated well between VKCs having negative and positive/zero V(t), V(a50) and V(h50) values, with overall accuracies of about 80, 90 and 86%, respectively, in cross-validation tests. We found that both the "pseudo-core" and the "pseudo-surface" of the 3D pseudo-folded proteins contribute to the discrimination between VKCs according to the three electrophysiological variables.
Affiliation(s)
- Michael Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba.
|
17
|
Thachuk C, Shmygelska A, Hoos HH. A replica exchange Monte Carlo algorithm for protein folding in the HP model. BMC Bioinformatics 2007; 8:342. [PMID: 17875212 PMCID: PMC2071922 DOI: 10.1186/1471-2105-8-342] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.3] [Received: 02/28/2007] [Accepted: 09/17/2007] [Indexed: 12/04/2022]
Abstract
Background The ab initio protein folding problem consists of predicting protein tertiary structure from a given amino acid sequence by minimizing an energy function; it is one of the most important and challenging problems in biochemistry, molecular biology and biophysics. The ab initio protein folding problem is computationally challenging and has been shown to be NP-hard even when conformations are restricted to a lattice. In this work, we implement and evaluate the replica exchange Monte Carlo (REMC) method, which has already been applied very successfully to more complex protein models and other optimization problems with complex energy landscapes, in combination with the highly effective pull move neighbourhood in two widely studied Hydrophobic Polar (HP) lattice models. Results We demonstrate that REMC is highly effective for solving instances of the square (2D) and cubic (3D) HP protein folding problem. When using the pull move neighbourhood, REMC outperforms current state-of-the-art algorithms for most benchmark instances. Additionally, we show that this new algorithm provides a larger ensemble of ground-state structures than the existing state-of-the-art methods. Furthermore, it scales well with sequence length, and it finds significantly better conformations on long biological sequences and sequences with a provably unique ground-state structure, which is believed to be a characteristic of real proteins. We also present evidence that our REMC algorithm can fold sequences which exhibit significant interaction between termini in the hydrophobic core relatively easily. Conclusion We demonstrate that REMC utilizing the pull move neighbourhood significantly outperforms current state-of-the-art methods for protein structure prediction in the HP model on 2D and 3D lattices.
This is particularly noteworthy, since so far, the state-of-the-art methods for 2D and 3D HP protein folding – in particular, the pruned-enriched Rosenbluth method (PERM) and, to some extent, Ant Colony Optimisation (ACO) – were based on chain growth mechanisms. To the best of our knowledge, this is the first application of REMC to HP protein folding on the cubic lattice, and the first extension of the pull move neighbourhood to a 3D lattice.
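The replica exchange step named in this abstract can be sketched as follows. The sketch uses the textbook Metropolis criterion for exchanging two replicas held at inverse temperatures β_i and β_j with energies E_i and E_j, which accepts the exchange with probability min(1, exp((β_i − β_j)(E_i − E_j))). The function names, the `(energy, conformation)` tuple layout and the alternating `offset` scheme are assumptions made for illustration; the authors' implementation, including the pull move neighbourhood used between exchanges, is not reproduced here.

```python
import math
import random


def swap_probability(beta_i, energy_i, beta_j, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative exponent means the exchange is always accepted.
    return 1.0 if exponent >= 0 else math.exp(exponent)


def attempt_swaps(betas, states, rng, offset=0):
    """Try to exchange neighbouring temperature slots (hypothetical helper).

    betas:  inverse temperature of each slot (stays fixed).
    states: list of (energy, conformation) tuples, one per slot.
    offset: 0 pairs slots (0,1), (2,3), ...; 1 pairs (1,2), (3,4), ...
    Returns a new list; the input list is not modified.
    """
    states = list(states)
    for i in range(offset, len(states) - 1, 2):
        p = swap_probability(betas[i], states[i][0],
                             betas[i + 1], states[i + 1][0])
        if rng.random() < p:
            states[i], states[i + 1] = states[i + 1], states[i]
    return states


# The colder slot (beta = 1.0) holds the higher-energy conformation,
# so the exchange is accepted with probability 1.
rng = random.Random(0)
print(attempt_swaps([1.0, 0.5], [(-2, "a"), (-5, "b")], rng))
```

Alternating `offset` between 0 and 1 on successive rounds is one common way to let a conformation travel across the whole temperature ladder.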
Affiliation(s)
- Chris Thachuk
- School of Computing Science, Simon Fraser University, Burnaby, B.C., V5A 1S6, Canada
- Alena Shmygelska
- Department of Structural Biology, Stanford University, Stanford, CA, 94305, USA
- Holger H Hoos
- Department of Computer Science, University of British Columbia, B.C., V6T 1Z4, Canada
|