1
Turina P, Fariselli P, Capriotti E. K-Pro: Kinetics Data on Proteins and Mutants. J Mol Biol 2023; 435:168245. [PMID: 37625584 DOI: 10.1016/j.jmb.2023.168245] [Received: 02/14/2023] [Revised: 08/16/2023] [Accepted: 08/17/2023] [Indexed: 08/27/2023]
Abstract
The study of protein folding plays a crucial role in improving our understanding of protein function and of the relationship between genetics and phenotypes. In particular, understanding the thermodynamics and kinetics of the folding process is important for uncovering the mechanisms behind human disorders caused by protein misfolding. To address this issue, it is essential to collect and curate experimental kinetic and thermodynamic data on protein folding. K-Pro is a new database designed for collecting and storing experimental kinetic data on monomeric proteins with a two-state folding mechanism. With 1,529 records from 62 proteins corresponding to 65 structures, K-Pro contains various kinetic parameters such as the logarithm of the folding and unfolding rates, Tanford's β and the ϕ values. When available, the database also includes thermodynamic parameters associated with the kinetic data. K-Pro features a user-friendly interface that allows browsing and downloading kinetic data of interest. The graphical interface provides a visual representation of the protein and mutants, and it is cross-linked to key databases such as PDB, UniProt, and PubMed. K-Pro is open and freely accessible through https://folding.biofold.org/k-pro and supports the latest versions of popular browsers.
Affiliation(s)
- Paola Turina
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
- Emidio Capriotti
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Via F. Selmi 3, 40126 Bologna, Italy
2
Yang Y, Chong Z, Vihinen M. PON-Fold: Prediction of Substitutions Affecting Protein Folding Rate. Int J Mol Sci 2023; 24:13023. [PMID: 37629203 PMCID: PMC10455311 DOI: 10.3390/ijms241613023] [Received: 06/14/2023] [Revised: 08/08/2023] [Accepted: 08/09/2023] [Indexed: 08/27/2023]
Abstract
Most proteins fold into characteristic three-dimensional structures. The rate of folding and unfolding varies widely and can be affected by variations in proteins. We developed a novel machine-learning-based method for the prediction of the folding rate effects of amino acid substitutions in two-state folding proteins. We collected a data set of experimentally defined folding rates for variants and used them to train a gradient boosting algorithm starting with 1161 features. Two predictors were designed. The three-class classifier had, in blind tests, specificity and sensitivity ranging from 0.324 to 0.419 and from 0.256 to 0.451, respectively. The other tool was a regression predictor that showed a Pearson correlation coefficient of 0.525. The error measures, mean absolute error and mean squared error, were 0.581 and 0.603, respectively. One of the previously presented tools could be used for comparison on the blind test data set; our method, called PON-Fold, showed superior performance on all measures used. The applicability of the tool was tested by predicting all possible substitutions in a protein domain. Predictions for different conformations of proteins, open and closed forms of a protein kinase, and apo and holo forms of an enzyme indicated that the choice of the structure had a large impact on the outcome. PON-Fold is freely available.
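The three regression measures named in this abstract (Pearson correlation coefficient, mean absolute error, mean squared error) can be computed as in the minimal sketch below. The function name `regression_metrics` and the toy values are illustrative assumptions; they are not code or data from PON-Fold.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (pearson_r, mae, mse) for two equal-length sequences of numbers.

    Hypothetical helper for illustration only; not part of PON-Fold.
    """
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n

    # Pearson r = covariance / (std_true * std_pred); the 1/n factors cancel.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("Pearson correlation is undefined for constant input")
    pearson_r = cov / math.sqrt(var_t * var_p)

    # Mean absolute error and mean squared error of the predictions.
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return pearson_r, mae, mse


# Toy values (assumed): predictions are perfectly correlated but off in scale.
r, mae, mse = regression_metrics([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The toy call illustrates why the abstract reports error measures alongside the correlation: predictions can correlate perfectly with the observations and still carry a large absolute error.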
Affiliation(s)
- Yang Yang
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
  - Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210000, China
- Zhang Chong
  - School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, BMC B13, SE-221 84 Lund, Sweden
3
Kaur U, Kihn KC, Ke H, Kuo W, Gierasch LM, Hebert DN, Wintrode PL, Deredge D, Gershenson A. The conformational landscape of a serpin N-terminal subdomain facilitates folding and in-cell quality control. bioRxiv 2023:2023.04.24.537978. [PMID: 37163105 PMCID: PMC10168285 DOI: 10.1101/2023.04.24.537978] [Indexed: 05/11/2023]
Abstract
Many multi-domain proteins including the serpin family of serine protease inhibitors contain non-sequential domains composed of regions that are far apart in sequence. Because proteins are translated vectorially from N- to C-terminus, such domains pose a particular challenge: how to balance the conformational lability necessary to form productive interactions between early and late translated regions while avoiding aggregation. This balance is mediated by the protein sequence properties and the interactions of the folding protein with the cellular quality control machinery. For serpins, particularly α1-antitrypsin (AAT), mutations often lead to polymer accumulation in cells and consequent disease suggesting that the lability/aggregation balance is especially precarious. Therefore, we investigated the properties of progressively longer AAT N-terminal fragments in solution and in cells. The N-terminal subdomain, residues 1-190 (AAT190), is monomeric in solution and efficiently degraded in cells. More β-rich fragments, 1-290 and 1-323, form small oligomers in solution, but are still efficiently degraded, and even the polymerization promoting Siiyama (S53F) mutation did not significantly affect fragment degradation. In vitro, the AAT190 region is among the last regions incorporated into the final structure. Hydrogen-deuterium exchange mass spectrometry and enhanced sampling molecular dynamics simulations show that AAT190 has a broad, dynamic conformational ensemble that helps protect one particularly aggregation prone β-strand from solvent. These AAT190 dynamics result in transient exposure of sequences that are buried in folded, full-length AAT, which may provide important recognition sites for the cellular quality control machinery and facilitate degradation and, under favorable conditions, reduce the likelihood of polymerization.
Affiliation(s)
- Upneet Kaur
  - Department of Biochemistry & Molecular Biology, University of Massachusetts, Amherst, MA 01003
- Kyle C. Kihn
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
- Haiping Ke
  - Department of Biochemistry & Molecular Biology, University of Massachusetts, Amherst, MA 01003
- Weiwei Kuo
  - Department of Biochemistry & Molecular Biology, University of Massachusetts, Amherst, MA 01003
- Lila M. Gierasch
  - Department of Biochemistry & Molecular Biology, University of Massachusetts, Amherst, MA 01003
  - Program in Molecular and Cellular Biology, University of Massachusetts, Amherst, MA 01003
  - Department of Chemistry, University of Massachusetts, Amherst, MA 01003
- Daniel N. Hebert
  - Department of Biochemistry & Molecular Biology, University of Massachusetts, Amherst, MA 01003
  - Program in Molecular and Cellular Biology, University of Massachusetts, Amherst, MA 01003
- Patrick L. Wintrode
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
- Daniel Deredge
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, MD 21201
- Anne Gershenson
  - Department of Biochemistry & Molecular Biology, University of Massachusetts, Amherst, MA 01003
  - Program in Molecular and Cellular Biology, University of Massachusetts, Amherst, MA 01003
4
Jaswal SS. Lessons from a quarter century of being human in protein science. Protein Sci 2022; 31:768-783. [PMID: 35048424 PMCID: PMC8927861 DOI: 10.1002/pro.4278] [Received: 12/31/2021] [Accepted: 01/11/2022] [Indexed: 11/05/2022]
Abstract
Over the past quarter century, my engagement with the Protein Society has allowed me to witness first-hand the evolution of our deepening understanding of the complexity of protein folding landscapes. During my own evolution as a protein scientist, my passion for protein folding has deepened into an obsession with mapping and decoding the thermodynamic and kinetic secrets of protein landscapes - especially those of rebel proteins, whose "non-traditional" behavior has challenged our paradigms and inspired the expansion of our models and methods. It is perhaps not surprising that I see parallels in the evolution of the landscape framework and in the development of our own trajectories as humans in STEM. Just as with proteins, however, we need to recognize that our individual human landscapes are not isolated from our local departmental and institutional communities, and are integrated into the larger networks of our STEM disciplines, academia, industry and/or government, not to mention society. My experience with hundreds of participants in the Being Human in STEM initiative that Amherst College undergraduates and I co-founded in 2016 has helped me find hope for STEM and humanity. If we commit to reconciling our identities as scientists with our responsibilities as human beings, together we can accelerate the evolution of individual, community and societal landscapes to contribute to addressing the dire challenges facing our planet.
Affiliation(s)
- Sheila S Jaswal
  - Department of Chemistry and Program in Biochemistry & Biophysics, Amherst College
5
McBride JM, Tlusty T. Slowest-first protein translation scheme: Structural asymmetry and co-translational folding. Biophys J 2021; 120:5466-5477. [PMID: 34813729 PMCID: PMC8715247 DOI: 10.1016/j.bpj.2021.11.024] [Received: 07/12/2021] [Revised: 09/30/2021] [Accepted: 11/17/2021] [Indexed: 11/19/2022]
Abstract
Proteins are translated from the N to the C terminus, raising the basic question of how this innate directionality affects their evolution. To explore this question, we analyze 16,200 structures from the Protein Data Bank (PDB). We find remarkable enrichment of α helices at the C terminus and β strands at the N terminus. Furthermore, this α-β asymmetry correlates with sequence length and contact order, both determinants of folding rate, hinting at possible links to co-translational folding (CTF). Hence, we propose the "slowest-first" scheme, whereby protein sequences evolved structural asymmetry to accelerate CTF: the slowest of the cooperatively folding segments are positioned near the N terminus so they have more time to fold during translation. A phenomenological model predicts that CTF can be accelerated by asymmetry in folding rate, up to double the rate, when folding time is commensurate with translation time; analysis of the PDB predicts that structural asymmetry is indeed maximal in this regime. This correspondence is greater in prokaryotes, which generally require faster protein production. Altogether, this indicates that accelerating CTF is a substantial evolutionary force whose interplay with stability and functionality is encoded in secondary structure asymmetry.
Affiliation(s)
- John M McBride
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan, South Korea
- Tsvi Tlusty
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan, South Korea
  - Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan, South Korea
6
Scalvini B, Sheikhhassani V, Mashaghi A. Topological principles of protein folding. Phys Chem Chem Phys 2021; 23:21316-21328. [PMID: 34545868 DOI: 10.1039/d1cp03390e] [Indexed: 11/21/2022]
Abstract
What is the topology of a protein and what governs protein folding to a specific topology? This is a fundamental question in biology. The protein folding reaction is a critically important cellular process, which is failing in many prevalent diseases. Understanding protein folding is also key to the design of new proteins for applications. However, our ability to predict the folding of a protein chain is quite limited and much is still unknown about the topological principles of folding. Current predictors of folding kinetics, including the contact order and size, present a limited predictive power, suggesting that these models are fundamentally incomplete. Here, we use a newly developed mathematical framework to define and extract the topology of a native protein conformation beyond knot theory, and investigate the relationship between native topology and folding kinetics in experimentally characterized proteins. We show that not only the folding rate, but also the mechanistic insight into folding mechanisms can be inferred from topological parameters. We identify basic topological features that speed up or slow down the folding process. The approach enabled the decomposition of protein 3D conformation into topologically independent elementary folding units, called circuits. The number of circuits correlates significantly with the folding rate, offering not only an efficient kinetic predictor, but also a tool for a deeper understanding of theoretical folding models. This study contributes to recent work that reveals the critical relevance of topology to protein folding with a new, contact-based, mathematically rigorous perspective. We show that topology can predict folding kinetics when geometry-based predictors like contact order and size fail.
Affiliation(s)
- Barbara Scalvini
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Vahid Sheikhhassani
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Alireza Mashaghi
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
7
Turina P, Fariselli P, Capriotti E. ThermoScan: Semi-automatic Identification of Protein Stability Data From PubMed. Front Mol Biosci 2021; 8:620475. [PMID: 33842537 PMCID: PMC8027235 DOI: 10.3389/fmolb.2021.620475] [Received: 10/23/2020] [Accepted: 02/18/2021] [Indexed: 11/13/2022]
Abstract
In recent years, the increasing number of DNA sequencing and protein mutagenesis studies has generated a large amount of variation data published in the biomedical literature. The collection of such data has been essential for the development and assessment of tools predicting the impact of protein variants at functional and structural levels. Nevertheless, the collection of manually curated data from literature is a highly time-consuming and costly process that requires domain experts. In particular, the development of methods for predicting the effect of amino acid variants on protein stability relies on the thermodynamic data extracted from literature. In the past, such data were deposited in the ProTherm database, which, however, has not been maintained since 2013. To facilitate the collection of protein thermodynamic data from literature, we developed the semi-automatic tool ThermoScan. ThermoScan is a text mining approach for the identification of relevant thermodynamic data on protein stability from full-text articles. The method relies on a regular expression searching for groups of words, including the most common conceptual words appearing in experimental studies on protein stability, several thermodynamic variables, and their units of measure. ThermoScan analyzes full-text articles from the PubMed Central Open Access subset and calculates an empirical score that allows the identification of manuscripts reporting thermodynamic data on protein stability. The method was optimized on a set of publications included in the ProTherm database, and tested on a new curated set of articles, manually selected for presence of thermodynamic data. The results show that ThermoScan returns accurate predictions and outperforms recently developed text-mining algorithms based on the analysis of publication abstracts. Availability: The ThermoScan server is freely accessible online at https://folding.biofold.org/thermoscan. The ThermoScan Python code and the Google Chrome extension for submitting visualized PMC web pages to the ThermoScan server are available at https://github.com/biofold/ThermoScan.
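The general idea described in this abstract, regular expressions over three word groups (conceptual words, thermodynamic variables, values with units) combined into a score, can be sketched as follows. This is only an illustration under stated assumptions: the word lists, the weights, and the function name `score_text` are hypothetical and are not ThermoScan's actual patterns or scoring formula.

```python
import re

# Hypothetical vocabularies; NOT the term lists used by ThermoScan.
CONCEPT_RE = re.compile(
    r"(?:unfolding|denaturation|thermal stability|melting temperature)",
    re.IGNORECASE,
)
# Thermodynamic variables; lookarounds prevent matches inside longer words.
VARIABLE_RE = re.compile(r"(?<!\w)(?:ΔΔG|ΔG|ΔH|ΔCp|Tm)(?!\w)")
# A number followed by a unit of measure, e.g. "3.1 kcal/mol" or "54.2 °C".
VALUE_RE = re.compile(r"-?\d+(?:\.\d+)?\s*(?:kcal/mol|kJ/mol|°C|K)(?!\w)")


def score_text(text):
    """Toy score: count hits per group, weighting numeric values double.

    The weights are an assumption made for this example only.
    """
    concept_hits = len(CONCEPT_RE.findall(text))
    variable_hits = len(VARIABLE_RE.findall(text))
    value_hits = len(VALUE_RE.findall(text))
    return concept_hits + variable_hits + 2 * value_hits


relevant = "Thermal denaturation gave a Tm of 54.2 °C and ΔG of 3.1 kcal/mol."
unrelated = "The protein was expressed in E. coli."
scores = (score_text(relevant), score_text(unrelated))
```

A text that reports measured stability values scores higher than one that does not, which is the property a threshold on such a score would exploit when ranking articles.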
Affiliation(s)
- Paola Turina
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Torino, Italy
- Emidio Capriotti
  - Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
8
PFDB: A standardized protein folding database with temperature correction. Sci Rep 2019; 9:1588. [PMID: 30733462 PMCID: PMC6367381 DOI: 10.1038/s41598-018-36992-y] [Received: 02/01/2018] [Accepted: 11/22/2018] [Indexed: 11/23/2022]
Abstract
We constructed a standardized protein folding kinetics database (PFDB) in which the logarithmic rate constants of all listed proteins are calculated at the standard temperature (25 °C). A temperature correction based on the Eyring–Kramers equation was introduced for proteins whose folding kinetics were originally measured at temperatures other than 25 °C. We verified the temperature correction by comparing the logarithmic rate constants predicted and experimentally observed at 25 °C for 14 different proteins, and the results demonstrated improvement of the quality of the database. PFDB comprises 141 single-domain globular proteins (89 two-state and 52 non-two-state), the largest number among the currently available databases of protein folding kinetics. PFDB is thus intended to be used as a standard for developing and testing future predictive and theoretical studies of protein folding. PFDB can be accessed from the following link: http://lee.kias.re.kr/~bala/PFDB.
9
Censoni L, Martínez L. Prediction of kinetics of protein folding with non-redundant contact information. Bioinformatics 2018; 34:4034-4038. [DOI: 10.1093/bioinformatics/bty478] [Received: 01/18/2018] [Accepted: 06/12/2018] [Indexed: 11/14/2022]
Affiliation(s)
- Luciano Censoni
  - Institute of Chemistry and Center for Computational Engineering and Science, University of Campinas, Campinas, SP, Brazil
- Leandro Martínez
  - Institute of Chemistry and Center for Computational Engineering and Science, University of Campinas, Campinas, SP, Brazil
10
Crane JM, Randall LL. The Sec System: Protein Export in Escherichia coli. EcoSal Plus 2017; 7:10.1128/ecosalplus.ESP-0002-2017. [PMID: 29165233 PMCID: PMC5807066 DOI: 10.1128/ecosalplus.esp-0002-2017] [Received: 03/22/2017] [Indexed: 11/20/2022]
Abstract
In Escherichia coli, proteins found in the periplasm or the outer membrane are exported from the cytoplasm by the general secretory, Sec, system before they acquire stably folded structure. This dynamic process involves intricate interactions among cytoplasmic and membrane proteins, both peripheral and integral, as well as lipids. In vivo, both ATP hydrolysis and proton motive force are required. Here, we review the Sec system from the inception of the field through early 2016, including biochemical, genetic, and structural data.
Affiliation(s)
- Jennine M. Crane
  - Department of Biochemistry, University of Missouri, Columbia, Missouri
- Linda L. Randall
  - Department of Biochemistry, University of Missouri, Columbia, Missouri
11
Endoh T, Sugimoto N. Conformational Dynamics of mRNA in Gene Expression as New Pharmaceutical Target. Chem Rec 2017; 17:817-832. [DOI: 10.1002/tcr.201700016] [Received: 05/04/2017] [Indexed: 11/05/2022]
Affiliation(s)
- Tamaki Endoh
  - Frontier Institute for Biomolecular Engineering Research (FIBER); Konan University; 7-1-20 Minatojima-minamimachi Chuo-ku, Kobe 650-0047 Japan
- Naoki Sugimoto
  - Frontier Institute for Biomolecular Engineering Research (FIBER); Konan University; 7-1-20 Minatojima-minamimachi Chuo-ku, Kobe 650-0047 Japan
  - Graduate School of Frontiers of Innovative Research in Science and Technology (FIRST); Konan University; 7-1-20 Minatojima-minamimachi Chuo-ku, Kobe 650-0047 Japan