1
da Silva LSA, Seman LO, Camponogara E, Mariani VC, Dos Santos Coelho L. Bilinear optimization of protein structure prediction: An exact approach via AB off-lattice model. Comput Biol Med 2024; 176:108558. [PMID: 38754216 DOI: 10.1016/j.compbiomed.2024.108558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 04/25/2024] [Accepted: 05/05/2024] [Indexed: 05/18/2024]
Abstract
Protein structure prediction (PSP) remains a central challenge in computational biology due to its inherent complexity and high dimensionality. While numerous heuristic approaches have appeared in the literature, their success varies. The AB off-lattice model, which characterizes proteins as sequences of A (hydrophobic) and B (hydrophilic) beads, presents a simplified perspective on PSP. This work presents a mathematical optimization-based methodology capitalizing on the off-lattice AB model. Dissecting the inherent non-linearities of the energy landscape of protein folding allowed for formulating the PSP as a bilinear optimization problem. This formulation was achieved by introducing auxiliary variables and constraints that encapsulate the nuanced relationship between the protein's conformational space and its energy landscape. The proposed bilinear model exhibited notable accuracy in pinpointing the global minimum energy conformations on a benchmark dataset presented by the Protein Data Bank (PDB). Compared to traditional heuristic-based methods, this bilinear approach yielded exact solutions, reducing the likelihood of local minima entrapment. This research highlights the potential of reframing the traditionally non-linear protein structure prediction problem into a bilinear optimization problem through the off-lattice AB model. Such a transformation offers a route toward methodologies that can determine the global solution, challenging current PSP paradigms. Exploration into hybrid models, merging bilinear optimization and heuristic components, might present an avenue for balancing accuracy with computational efficiency.
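The abstract does not spell out the energy function, so the following is only a minimal sketch of the AB off-lattice energy in the form commonly attributed to Stillinger and co-workers: a bending term plus a species-dependent Lennard-Jones term. The function name `ab_energy`, the unit bond lengths, and the coefficient rule (+1 for A-A, +1/2 for B-B, -1/2 for A-B) are assumptions taken from that standard formulation, not from the paper. The sketch evaluates the energy of a given conformation and does not reproduce the authors' bilinear reformulation.

```python
import math

def ab_energy(coords, sequence):
    """Energy of a 3D AB chain (assumed Stillinger-type form, unit bond lengths)."""
    n = len(sequence)
    xi = [1.0 if s == "A" else -1.0 for s in sequence]  # A -> +1, B -> -1

    # Bending term: (1/4) * (1 - cos(theta_i)) at every interior bead,
    # where theta_i is the angle between successive bond vectors.
    bend = 0.0
    for i in range(1, n - 1):
        u = [coords[i][k] - coords[i - 1][k] for k in range(3)]
        v = [coords[i + 1][k] - coords[i][k] for k in range(3)]
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        bend += 0.25 * (1.0 - dot / norm)

    # Non-bonded term: 4 * (r^-12 - C * r^-6) over pairs at least two beads apart.
    lj = 0.0
    for i in range(n - 2):
        for j in range(i + 2, n):
            r = math.dist(coords[i], coords[j])
            c = (1.0 + xi[i] + xi[j] + 5.0 * xi[i] * xi[j]) / 8.0
            lj += 4.0 * (r ** -12 - c * r ** -6)

    return bend + lj

# A straight three-bead chain has zero bending energy and one non-bonded pair.
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(ab_energy(straight, "AAA"))  # -0.0615234375
```

Minimising this function over bead coordinates, subject to unit bond lengths, is the non-convex problem that heuristic methods attack and that the cited work recasts in bilinear form.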
Affiliation(s)
- Luiza Scapinello Aquino da Silva
  - Electrical Engineering Graduate Program (PPGEE), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil
- Laio Oriel Seman
  - Department of Automation and Systems Engineering, Federal University of Santa Catarina (UFSC), Engenheiro Agronômico Andrei Cristian Ferreira, Florianópolis, 88040-900, Santa Catarina, Brazil
- Eduardo Camponogara
  - Department of Automation and Systems Engineering, Federal University of Santa Catarina (UFSC), Engenheiro Agronômico Andrei Cristian Ferreira, Florianópolis, 88040-900, Santa Catarina, Brazil
- Viviana Cocco Mariani
  - Electrical Engineering Graduate Program (PPGEE), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil; Mechanical Engineering Graduate Program (PGMec), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil
- Leandro Dos Santos Coelho
  - Electrical Engineering Graduate Program (PPGEE), Federal University of Parana (UFPR), Coronel Francisco Heraclito dos Santos, Curitiba, 81530-000, Paraná, Brazil
2
Osifová Z, Kalvoda T, Galgonek J, Culka M, Vondrášek J, Bouř P, Bednárová L, Andrushchenko V, Dračínský M, Rulíšek L. What are the minimal folding seeds in proteins? Experimental and theoretical assessment of secondary structure propensities of small peptide fragments. Chem Sci 2024; 15:594-608. [PMID: 38179543 PMCID: PMC10763034 DOI: 10.1039/d3sc04960d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 11/22/2023] [Indexed: 01/06/2024] Open
Abstract
Certain peptide sequences, some of them as short as amino acid triplets, are significantly overpopulated in specific secondary structure motifs in folded protein structures. For example, 74% of the EAM triplet is found in α-helices, and only 3% occurs in the extended parts of proteins (typically β-sheets). In contrast, other triplets (such as VIV and IYI) appear almost exclusively in extended parts (79% and 69%, respectively). In order to determine whether such preferences are structurally encoded in a particular peptide fragment or appear only at the level of a complex protein structure, NMR, VCD, and ECD experiments were carried out on selected tripeptides: EAM (denoted as pro-'α-helical' in proteins), KAM(α), ALA(α), DIC(α), EKF(α), IYI(pro-β-sheet or more generally, pro-extended), and VIV(β), and the reference α-helical CATWEAMEKCK undecapeptide. The experimental data were in very good agreement with extensive quantum mechanical conformational sampling. Altogether, we clearly showed that the pro-helical vs. pro-extended propensities start to emerge already at the level of tripeptides and can be fully developed at longer sequences. We postulate that certain short peptide sequences can be considered minimal "folding seeds". Admittedly, the inherent secondary structure propensity can be overruled by the large intramolecular interaction energies within the folded and compact protein structures. Still, the correlation of experimental and computational data presented herein suggests that the secondary structure propensity should be considered as one of the key factors that may lead to understanding the underlying physico-chemical principles of protein structure and folding from the first principles.
Affiliation(s)
- Zuzana Osifová
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
  - Department of Organic Chemistry, Faculty of Science, Charles University, Hlavova 2030, Prague 128 00, Czech Republic
- Tadeáš Kalvoda
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Jakub Galgonek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Martin Culka
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Jiří Vondrášek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Petr Bouř
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Lucie Bednárová
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Valery Andrushchenko
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Martin Dračínský
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
- Lubomír Rulíšek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 160 00 Praha 6, Czech Republic
|
3
Imidazole-amino acids. Conformational switch under tautomer and pH change. Amino Acids 2023; 55:33-49. [PMID: 36319875 PMCID: PMC9877100 DOI: 10.1007/s00726-022-03201-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 08/16/2022] [Indexed: 01/26/2023]
Abstract
Replacement of the main chain peptide bond by an imidazole ring seems to be a promising tool for peptide-based drug design, owing to its specific prototropic tautomeric and amphoteric properties. In this study, we show that both tautomer and pH changes can cause a conformational switch in the studied residues of alanine (1-4) and dehydroalanine (5-8) with the C-terminal peptide group replaced by imidazole. DFT methods are applied and an environment of increasing polarity is simulated. The conformational maps (Ramachandran diagrams) are presented and the stability of possible conformations is discussed. The neutral forms, tautomers τ (1) and π (2), adopt the conformations αRτ (φ, ψ = −75°, −114°) and C7eq (φ, ψ = −75°, 66°), respectively. Their torsion angles ψ differ by about 180°, which has a considerable impact on the peptide chain conformation. The cation form (3) adopts both these conformations, whereas the anion analogue (4) prefers the conformations C5 (φ, ψ = −165°, −178°) and β2 (φ, ψ ~ −165°, −3°). Dehydroamino acid analogues, the tautomers τ (5) and π (6) as well as the anion form (8), have a strong tendency toward the conformations β2 (φ, ψ = −179°, 0°) and C5 (φ, ψ = −180°, 180°). The preferences of the protonated imidazolium form (7) depend on the environment. The imidazole ring, acting as a donor or acceptor of the hydrogen bonds created within the studied residues, has a profound effect on the type of conformation.
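The "about 180°" difference between the ψ angles of the two neutral tautomers can be checked with a short calculation. Torsion angles are periodic, so the difference has to be wrapped into one turn. The helper name `wrapped_diff` is illustrative only and is not taken from the paper.

```python
def wrapped_diff(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

# psi of tautomer tau (1) versus psi of tautomer pi (2), as quoted in the abstract
print(abs(wrapped_diff(66.0, -114.0)))   # 180.0

# psi of the two C5 conformations quoted in the abstract (-178 and 180 degrees)
print(abs(wrapped_diff(-178.0, 180.0)))  # 2.0
```

The second call shows why wrapping matters: −178° and 180° look far apart numerically but are only 2° apart on the torsion circle.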
4
Dicks L, Wales DJ. Exploiting Sequence-Dependent Rotamer Information in Global Optimization of Proteins. J Phys Chem B 2022; 126:8381-8390. [PMID: 36257022 PMCID: PMC9623586 DOI: 10.1021/acs.jpcb.2c04647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Abstract
Rotamers, namely amino acid side chain conformations common to many different peptides, can be compiled into libraries. These rotamer libraries are used in protein modeling, where the limited conformational space occupied by amino acid side chains is exploited. Here, we construct a sequence-dependent rotamer library from simulations of all possible tripeptides, which provides rotameric states dependent on adjacent amino acids. We observe significant sensitivity of rotamer populations to sequence and find that the library is successful in locating side chain conformations present in crystal structures. The library is designed for applications with basin-hopping global optimization, where we use it to propose moves in conformational space. The addition of rotamer moves significantly increases the efficiency of protein structure prediction within this framework, and we determine parameters to optimize efficiency.
Affiliation(s)
- L. Dicks
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; IBM Research, The Hartree Centre STFC Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
- D. J. Wales
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
5
Kalvoda T, Culka M, Rulíšek L, Andris E. Exhaustive Mapping of the Conformational Space of Natural Dipeptides by the DFT-D3//COSMO-RS Method. J Phys Chem B 2022; 126:5949-5958. [PMID: 35930560 DOI: 10.1021/acs.jpcb.2c02861] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
We extensively mapped energy landscapes and conformations of 22 (including three His protonation states) proteinogenic α-amino acids in trans configuration and the corresponding 484 (22²) dipeptides. To mimic the environment in a protein chain, the N- and C-termini of the studied systems were capped with acetyl and N-methylamide groups, respectively. We systematically varied the main chain dihedral angles (ϕ, ψ) by 40° steps and all side chain angles by 90° or 120° steps. We optimized the molecular geometries with the GFN2-xTB semiempirical (SQM) method and performed single point density functional theory calculations at the BP86-D3/DGauss-DZVP//COSMO-RS level in water, 1-octanol, N,N-dimethylformamide, and n-hexane. For each restrained (nonequilibrium) structure, we also calculated energy gradients (in water) and natural atomic charges. The exhaustive and unprecedented QM-based sampling enabled us to construct Ramachandran plots of quantum mechanical (QM(BP86-D3)//COSMO-RS) energies calculated on SQM structures, for all 506 (484 dipeptides and 22 amino acids) studied systems. We showed how the character of an amino acid side chain influences the conformational space of single amino acids and dipeptides. With clustering techniques, we were able to identify unique minima of amino acids and dipeptides (i.e., minima on the GFN2-xTB potential energy surfaces) and analyze the distribution of their BP86-D3//COSMO-RS conformational energies in all four solvents. We also derived an empirical formula for the number of unique minima based on the overall number of rotatable bonds within each peptide. The final peptide conformer data set (PeptideCs) comprises over 400 million structures, all of them annotated with QM(BP86-D3)//COSMO-RS energies. Thanks to its completeness and unbiased nature, the PeptideCs can serve, inter alia, as a data set for the validation of new methods for predicting the energy landscapes of protein structures. This data set may also prove to be useful in the development and reparameterization of biomolecular force fields. The data set is deposited at Figshare (10.25452/figshare.plus.19607172) and can be accessed using a simple web interface at http://peptidecs.uochb.cas.cz.
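The system counts quoted in this abstract follow from simple combinatorics, sketched below. The last two lines rest on an assumption the abstract does not state: that each backbone angle covers the full 360° range with the periodic endpoint counted once.

```python
n_forms = 22                          # amino acid forms, including three His protonation states
n_dipeptides = n_forms ** 2           # every ordered pair of forms
n_systems = n_dipeptides + n_forms    # dipeptides plus single amino acids

# Assumption: phi and psi each span 360 degrees in 40-degree steps.
values_per_angle = 360 // 40
backbone_grid_points = values_per_angle ** 2

print(n_dipeptides, n_systems)                 # 484 506
print(values_per_angle, backbone_grid_points)  # 9 81
```

Side chain angles sampled in 90° or 120° steps multiply this backbone grid further, which is consistent with the data set reaching hundreds of millions of structures.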
Affiliation(s)
- Tadeáš Kalvoda
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
- Martin Culka
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
- Lubomír Rulíšek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
- Erik Andris
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha, Czech Republic
6
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Fast and Accurate Quantum Mechanical Modeling of Large Molecular Systems Using Small Basis Set Hartree-Fock Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2208-2232. [PMID: 35313106 DOI: 10.1021/acs.jctc.1c01128] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022]
Abstract
There has been significant interest in developing fast and accurate quantum mechanical methods for modeling large molecular systems. In this work, by utilizing a machine learning regression technique, we have developed new low-cost quantum mechanical approaches to model large molecular systems. The developed approaches rely on using one-electron Gaussian-type functions called atom-centered potentials (ACPs) to correct for the basis set incompleteness and the lack of correlation effects in the underlying minimal or small basis set Hartree-Fock (HF) methods. In particular, ACPs are proposed for ten elements common in organic and bioorganic chemistry (H, B, C, N, O, F, Si, P, S, and Cl) and four different base methods: two minimal basis sets (MINIs and MINIX) plus a double-ζ basis set (6-31G*) in combination with dispersion-corrected HF (HF-D3/MINIs, HF-D3/MINIX, HF-D3/6-31G*) and the HF-3c method. The new ACPs are trained on a very large set (73 832 data points) of noncovalent properties (interaction and conformational energies) and validated additionally on a set of 32 048 data points. All reference data are of complete basis set coupled-cluster quality, mostly CCSD(T)/CBS. The proposed ACP-corrected methods are shown to give errors in the tenths of a kcal/mol range for noncovalent interaction energies and up to 2 kcal/mol for molecular conformational energies. More importantly, the average errors are similar in the training and validation sets, confirming the robustness and applicability of these methods outside the boundaries of the training set. In addition, the performance of the new ACP-corrected methods is similar to complete basis set density functional theory (DFT) but at a cost that is orders of magnitude lower, and the proposed ACPs can be used in any computational chemistry program that supports effective-core potentials without modification. 
It is also shown that ACPs improve the description of covalent and noncovalent bond geometries of the underlying methods and that the improvement brought about by the application of the ACPs is directly related to the number of atoms to which they are applied, allowing the treatment of systems containing some atoms for which ACPs are not available. Overall, the ACP-corrected methods proposed in this work constitute an alternative accurate, economical, and reliable quantum mechanical approach to describe the geometries, interaction energies, and conformational energies of systems with hundreds to thousands of atoms.
Affiliation(s)
- Viki Kumar Prasad
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Alberto Otero-de-la-Roza
  - MALTA Consolider Team, Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, E-33006 Oviedo, Spain
- Gino A DiLabio
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
7
Sladek V, Harada R, Shigeta Y. Residue Folding Degree-Relationship to Secondary Structure Categories and Use as Collective Variable. Int J Mol Sci 2021; 22:13042. [PMID: 34884847 PMCID: PMC8657879 DOI: 10.3390/ijms222313042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/28/2021] [Revised: 11/23/2021] [Accepted: 11/29/2021] [Indexed: 11/22/2022] Open
Abstract
Recently, we have shown that the residue folding degree, a network-based measure of folded content in proteins, is able to capture backbone conformational transitions related to the formation of secondary structures in molecular dynamics (MD) simulations. In this work, we focus primarily on developing a collective variable (CV) for MD based on this residue-bound parameter to be able to trace the evolution of secondary structure in segments of the protein. We show that this CV can do just that and that the related energy profiles (potentials of mean force, PMF) and transition barriers are comparable to those found by others for particular events in the folding process of the model mini protein Trp-cage. Hence, we conclude that the relative segment folding degree (the newly proposed CV) is a computationally viable option to gain insight into the formation of secondary structures in protein dynamics. We also show that this CV can be directly used as a measure of the amount of α-helical content in a selected segment.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry, Slovak Academy of Sciences, 845 38 Bratislava, Slovakia
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tsukuba 305-8577, Ibaraki, Japan
8
Staś M, Broda MA, Siodłak D. Thiazole-amino acids: influence of thiazole ring on conformational properties of amino acid residues. Amino Acids 2021; 53:673-686. [PMID: 33837859 PMCID: PMC8128816 DOI: 10.1007/s00726-021-02974-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/05/2020] [Accepted: 03/29/2021] [Indexed: 12/29/2022]
Abstract
Post-translationally modified thiazole-amino acid (Xaa-Tzl) residues have been found in macrocyclic peptides (e.g., thiopeptides and cyanobactins), which mostly inhibit protein synthesis in Gram-positive bacteria. Conformational studies of a series of model compounds containing this structural motif with alanine, dehydroalanine, dehydrobutyrine and dehydrophenylalanine were performed using the DFT method in various environments. The solid-state crystal structure conformations of thiazole-amino acid residues retrieved from the Cambridge Structural Database were also analysed. The studied structural units tend to adopt the unique semi-extended β2 conformation, which is stabilised mainly by the N-H⋯NTzl hydrogen bond and, for dehydroamino acids, also by π-electron conjugation. The conformational preferences of amino acids with a thiazole ring were compared with those of oxazole analogues, and the role of the sulfur atom in stabilising the conformations of the studied peptides was discussed.
Affiliation(s)
- Monika Staś
  - Faculty of Chemistry, University of Opole, 45-052, Opole, Poland
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo Náměstí 2, 166 10, Praha 6, Czech Republic
- Dawid Siodłak
  - Faculty of Chemistry, University of Opole, 45-052, Opole, Poland
9
Ravikumar A, de Brevern AG, Srinivasan N. Conformational Strain Indicated by Ramachandran Angles for the Protein Backbone Is Only Weakly Related to the Flexibility. J Phys Chem B 2021; 125:2597-2606. [PMID: 33666418 DOI: 10.1021/acs.jpcb.1c00168] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Studies on energy associated with free dipeptides have shown that conformers with unfavorable (ϕ,ψ) torsion angles have higher energy compared to conformers with favorable (ϕ,ψ) angles. It is expected that higher energy confers higher dynamics and flexibility to that part of the protein. Here, we explore a potential relationship between conformational strain in a residue due to unfavorable (ϕ,ψ) angles and its flexibility and dynamics in the context of protein structures. We compared flexibility of strained and relaxed residues, which are recognized based on outlier/allowed and favorable (ϕ,ψ) angles respectively, using normal-mode analysis (NMA). We also performed in-depth analysis on flexibility and dynamics at catalytic residues in protein kinases, which exhibit different strain status in different kinase structures using NMA and molecular dynamics simulations. We underline that strain of a residue, as defined by backbone torsion angles, is almost unrelated to the flexibility and dynamics associated with it. Even the overall trend observed among all high-resolution structures in which relaxed residues tend to have slightly higher flexibility than strained residues is counterintuitive. Consequently, we propose that identifying strained residues based on (ϕ,ψ) values is not an effective way to recognize energetic strain in protein structures.
Affiliation(s)
- Ashraya Ravikumar
  - Molecular Biophysics Unit, Indian Institute of Science, Bengaluru, India, 560012
- Alexandre G de Brevern
  - INSERM, U 1134, DSIMB, Paris F-75739, France; University of Paris, Paris F-75739, France; Institut National de la Transfusion Sanguine (INTS), Paris F-75739, France; Laboratoire d'Excellence GR-Ex, Paris F-75739, France
10
Culka M, Kalvoda T, Gutten O, Rulíšek L. Mapping Conformational Space of All 8000 Tripeptides by Quantum Chemical Methods: What Strain Is Affordable within Folded Protein Chains? J Phys Chem B 2021; 125:58-69. [PMID: 33393778 DOI: 10.1021/acs.jpcb.0c09251] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/09/2023]
Abstract
To gain more insight into the physicochemical aspects of a protein structure from the first principles, conformational space of all 8000 "capped" tripeptides (i.e., N-Ac-X1X2X3-NH-CH3, where Xi is one of the 20 natural amino acids) was investigated computationally. An enormous dataset (denoted P-CONF_1.6M and containing close to 1 600 000 conformers in total) has been obtained by employing a composite protocol combining density functional theory, semiempirical quantum mechanics (SQM), and state-of-the-art solvation methods with 1000 K molecular dynamics (MD) used to generate initial structures (200 snapshots for each tripeptide). This allowed us to present the first rigorous QM-based glimpse at the vast conformational space spanned by small protein fragments. The same computational procedure was repeated for tripeptide fragments taken from the SCOPe database of three-dimensional protein folds, by restraining them to their geometry in a protein. Such complementary data allowed us to compare the distribution of conformational strain energies of unrestrained tripeptidic fragments "in solvent" with those in existing protein chains. Besides providing a rigorous (ab initio) proof of a few well-known concepts and hypotheses concerning protein structures, such as the distribution of (φ, ψ) angles in Ramachandran plots, we have made several observations that came as a certain surprise: (1) distribution of conformational energies does not significantly differ between the "unbiased/unrestrained" conformers obtained from MD sampling in solvent and the biased conformers, i.e., those of a given tripeptide obtained from protein structures; (2) conformational (strain) energy window up to ∼20 to 25 kcal·mol⁻¹ is readily available to tripeptide fragments within the context of a protein chain; (3) overpopulation in certain regions of Ramachandran plot was observed for the unbiased conformers. Last but not least, the massive dataset of accurate (DFT-D3//COSMO-RS) conformational (free) energies of ∼1.6 M peptide conformers, P-CONF_1.6M, obtained throughout this work may serve as an excellent dataset for calibrating and benchmarking of popular force fields.
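The figure of 8000 tripeptides and the approximate size of P-CONF_1.6M can be reproduced by enumerating sequences, as in this minimal sketch. The one-letter alphabet and all variable names are illustrative assumptions. The product below counts MD starting structures, which bounds the "close to 1 600 000" conformers reported rather than equalling them.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 natural amino acids

# All ordered triples X1X2X3 over the 20-letter alphabet.
tripeptides = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]
n_tripeptides = len(tripeptides)

snapshots_per_tripeptide = 200  # MD snapshots per tripeptide, as stated in the abstract
n_initial_structures = n_tripeptides * snapshots_per_tripeptide

print(n_tripeptides, n_initial_structures)  # 8000 1600000
```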
Affiliation(s)
- Martin Culka
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
- Tadeáš Kalvoda
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
- Ondrej Gutten
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
- Lubomír Rulíšek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
11
Culka M, Rulíšek L. Interplay between Conformational Strain and Intramolecular Interaction in Protein Structures: Which of Them Is Evolutionarily Conserved? J Phys Chem B 2020; 124:3252-3260. [PMID: 32237747 DOI: 10.1021/acs.jpcb.9b11784] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/15/2023]
Abstract
By computing strain energies of peptide fragments within protein structures and their intramolecular interaction energies, we attempt to reveal general biophysical trends behind the secondary structure formation in the context of protein evolution. Our "protein basis set" consisted of 1143 representatives of different folds obtained from curated SCOPe database, and for each member of the set, the strain and intramolecular energy was calculated on the "rolling tripeptide" basis, employing the DFT-D3/COSMO-RS method for the former and the QM-calibrated force field method (MM) for the latter. The calculated data, strain and interactions, were correlated with the conservation of amino acid residues in secondary structure elements and also with the level of the residue burial within the protein three-dimensional structure. It allowed us to formulate several observations concerning fundamental differences between two main secondary structure motifs: α-helices and β-strands. We have shown that a strong interaction is one of the determining characteristics of the β-sheet formation, at least at the level of tripeptides (and likely penta- or heptapeptides, too), and that the β-strand is a prevailing secondary structure in the strongly-interacting regions of the protein folds conserved by evolution. On the other hand, low strain was neither proven to be an important physicochemical property conserved by evolution nor does it correlate with the propensity for the α-helix and β-strand. Finally, it has been demonstrated that the strong interaction has a certain level of connection with residue burial; however, we demonstrate that these two characteristics should be rather regarded as two complementary factors. These findings represent an important contribution to understanding protein folding from first principles, which is a complementary approach to ongoing efforts to solve the protein folding problem by knowledge-based approaches and machine-learning.
Affiliation(s)
- Martin Culka
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
- Lubomír Rulíšek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
12
Abstract
The analysis of folding trajectories for proteins is an open challenge. One of the problems is how to describe the amount of folded secondary structure in a protein. We extend the use of Estrada's folding degree (Bioinformatics 2002, 18, 697) for the analysis of the evolution of the folding stage during molecular dynamics (MD) simulation. It is shown that residue contribution to the total folding degree is a predominantly local property, well-defined by the backbone dihedral angles at the given residue, without significant contribution from the backbone conformation of other residues. Moreover, the magnitude of this residue contribution can be quite easily associated with characteristic motifs of secondary protein structures such as the α-helix, β-sheet (hairpin), and so on by means of a Ramachandran-like plot as a function of backbone dihedral angles φ,ψ. Additionally, the understanding of the free energy profile associated with the folding process becomes much simpler. Often a 1D profile is sufficient to locate global minima and the corresponding structure for short peptides.
Affiliation(s)
- Vladimir Sladek
  - Institute of Chemistry - Centre for Glycomics, Dubravska cesta 9, 84538 Bratislava, Slovakia; Agency for Medical Research and Development (AMED), Chiyoda-ku, Japan
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8577, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, Tennodai 1-1-1, Tsukuba, Ibaraki 305-8577, Japan
13
Culka M, Rulíšek L. Factors Stabilizing β-Sheets in Protein Structures from a Quantum-Chemical Perspective. J Phys Chem B 2019; 123:6453-6461. [PMID: 31287693 DOI: 10.1021/acs.jpcb.9b04866] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 02/06/2023]
Abstract
Protein folds are determined by the interplay between various (de)stabilizing forces, which can be broadly divided into a local strain of the protein chain and intramolecular interactions. In contrast to the α-helix, the β-sheet secondary protein structure is significantly stabilized by long-range interactions between the individual β-strands. It has been observed that quite diverse amino acid sequences can form a very similar small β-sheet fold, such as in the three-β-strand WW domain. Employing "calibrated" quantum-chemical methods, we show herein on two sequentially diverse examples of the WW domain that the internal strain energy is higher in the β-strands and lower in the loops, while the interaction energy has an opposite trend. Low strain energy computed for peptide sequences in loop 1 correlates with its postulated early formation in the folding process. The relatively high strain energy within the β-strands (up to 8 kcal mol⁻¹ per amino acid residue) is compensated by even higher intramolecular interaction energy (up to 15 kcal mol⁻¹ per residue). It is shown in a quantitative way that the most conserved residues across the structural family of WW domains have the highest contributions to the intramolecular interaction energy. On the other hand, the residues in the regions with the lowest strain are not conserved. We conclude that the internal interaction energy is the physical quantity tuned by evolution to define the β-sheet protein fold.
Affiliation(s)
- Martin Culka
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic
- Lubomír Rulíšek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, 166 10 Praha 6, Czech Republic