1
Zhou Y, Myung Y, Rodrigues CM, Ascher D. DDMut-PPI: predicting effects of mutations on protein-protein interactions using graph-based deep learning. Nucleic Acids Res 2024; 52:W207-W214. [PMID: 38783112 PMCID: PMC11223791 DOI: 10.1093/nar/gkae412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/30/2024] [Accepted: 05/02/2024] [Indexed: 05/25/2024]
Abstract
Protein-protein interactions (PPIs) play a vital role in cellular functions and are essential for therapeutic development and understanding diseases. However, current predictive tools often struggle to balance efficiency and precision in predicting the effects of mutations on these complex interactions. To address this, we present DDMut-PPI, a deep learning model that efficiently and accurately predicts changes in PPI binding free energy upon single and multiple point mutations. Building on the robust Siamese network architecture with graph-based signatures from our prior work, DDMut, the DDMut-PPI model was enhanced with a graph convolutional network operating on the protein interaction interface. We used residue-specific embeddings from the ProtT5 protein language model as node features, and a variety of molecular interactions as edge features. By integrating evolutionary context with spatial information, this framework enables DDMut-PPI to achieve a robust Pearson correlation of up to 0.75 (root mean squared error: 1.33 kcal/mol) in our evaluations, outperforming most existing methods. Importantly, the model demonstrated consistent performance across mutations that increase or decrease binding affinity. DDMut-PPI offers a significant advancement in the field and will serve as a valuable tool for researchers probing the complexities of protein interactions. DDMut-PPI is freely available as a web server and an application programming interface at https://biosig.lab.uq.edu.au/ddmut_ppi.
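The two evaluation figures quoted in this abstract, the Pearson correlation and the root mean squared error, are both computed from paired experimental and predicted free energy changes. The following is a minimal sketch in plain Python; the function names and the sample values are hypothetical illustrations and are not taken from DDMut-PPI or its benchmark data.

```python
import math


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical free energy changes in kcal/mol, for illustration only.
experimental = [1.2, -0.5, 0.3, 2.1, -1.0]
model_output = [0.9, -0.2, 0.6, 1.5, -0.7]

print(f"Pearson r = {pearson_r(experimental, model_output):.2f}")
print(f"RMSE = {rmse(experimental, model_output):.2f} kcal/mol")
```

The two numbers answer different questions: the correlation measures whether predictions rank mutations in the right order, while the RMSE measures how far individual predictions are from the measured values in kcal/mol.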
Affiliation(s)
- Yunzhuo Zhou
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- YooChan Myung
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- Carlos H M Rodrigues
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
- David B Ascher
  - The Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland 4072, Australia
  - Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
2
Abstract
Membrane proteins, particularly those that are α-helical, such as transporters and G-protein-coupled receptors (GPCRs), have significant biological relevance. However, their expression and purification pose difficulties because of their poor water solubilities, which impedes progress in this field. The QTY method, a code-based protein-engineering approach, was recently developed to produce soluble transmembrane proteins. Here, we describe a comprehensive Web server built for QTY design and its relevance for in silico analyses. Typically, the simple design model is expected to require only 2 to 4 min of computer time, and the library design model requires 2 to 5 h, depending on the target protein size and the number of transmembrane helices. Detailed protocols for using the server with both the simple design and library design modules are provided. Methods for experiments following the QTY design are also included to facilitate the implementation of this approach. The design pipeline was further evaluated using microbial transmembrane proteins and structural alignment between the designed proteins and their origins by employing AlphaFold2. The results reveal that mutants generated by the developed pipeline were highly identical to their origins in terms of three-dimensional (3D) structures. In summary, the utilization of our Web server and associated protocols will enable QTY-based protein engineering to be implemented in a convenient, fast, accurate, and rational manner. The Protein Solubilizing Server (PSS) is publicly available at http://pss.sjtu.edu.cn.
IMPORTANCE Water-soluble expression and purification are of considerable importance for protein identification and characterization. However, there has been a lack of an effective method for water-soluble expression of membrane proteins, which has severely hampered their studies. Here, an enabling comprehensive Web server, PSS, was developed for designing water-soluble mutants of α-helical membrane proteins, based on QTY design, a code-based protein-engineering approach. With microbial transmembrane proteins and GPCRs as examples, we systematically evaluated the server and demonstrated its successful performance. PSS is readily available for worldwide users as a Web-based tool, rendering QTY-based protein engineering convenient, efficient, accurate, and rational.
3
Zaucha J, Heinzinger M, Kulandaisamy A, Kataka E, Salvádor ÓL, Popov P, Rost B, Gromiha MM, Zhorov BS, Frishman D. Mutations in transmembrane proteins: diseases, evolutionary insights, prediction and comparison with globular proteins. Brief Bioinform 2020; 22:5872174. [PMID: 32672331 DOI: 10.1093/bib/bbaa132] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/14/2020] [Revised: 05/26/2020] [Accepted: 05/28/2020] [Indexed: 12/18/2022]
Abstract
Membrane proteins are unique in that they interact with lipid bilayers, making them indispensable for transporting molecules and relaying signals between and across cells. Due to the significance of these proteins' functions, mutations often have profound effects on the fitness of the host. This is apparent both from experimental studies, which have implicated numerous missense variants in diseases, and from evolutionary signals that help elucidate the physicochemical constraints that intermembrane and aqueous environments bring. In this review, we report on the current state of knowledge acquired on missense variants (referred to as single amino acid variants) affecting membrane proteins, as well as the insights that can be extrapolated from data already available. This includes an overview of the annotations for membrane protein variants that have been collated within databases dedicated to the topic; bioinformatics approaches that leverage evolutionary information in order to shed light on previously uncharacterized membrane protein structures or interaction interfaces; tools for predicting the effects of mutations tailored specifically towards the characteristics of membrane proteins; and two clinically relevant case studies explaining the implications of mutated membrane proteins in cancer and cardiomyopathy.
Affiliation(s)
- Jan Zaucha
  - Department of Bioinformatics of the TUM School of Life Sciences Weihenstephan in Freising, Germany
- Michael Heinzinger
  - Department of Informatics, Bioinformatics and Computational Biology of the TUM Faculty of Informatics in Garching, Germany
- A Kulandaisamy
  - Department of Biotechnology of the IIT Bhupat and Jyoti Mehta School of BioSciences in Madras, India
- Evans Kataka
  - Department of Bioinformatics of the TUM School of Life Sciences Weihenstephan in Freising, Germany
- Óscar Llorian Salvádor
  - Department of Informatics, Bioinformatics and Computational Biology of the TUM Faculty of Informatics in Garching, Germany
- Petr Popov
  - Center for Computational and Data-Intensive Science and Engineering of the Skolkovo Institute of Science and Technology in Moscow, Russia
- Burkhard Rost
  - Department of Informatics, Bioinformatics and Computational Biology at the TUM Faculty of Informatics in Garching, Germany
- Boris S Zhorov
  - Department of Biochemistry and Biomedical Sciences, McMaster University in Hamilton, Canada
- Dmitrij Frishman
  - Department of Bioinformatics at the TUM School of Life Sciences Weihenstephan in Freising, Germany
4
Marabotti A, Scafuri B, Facchiano A. Predicting the stability of mutant proteins by computational approaches: an overview. Brief Bioinform 2020; 22:5850907. [PMID: 32496523 DOI: 10.1093/bib/bbaa074] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 02/12/2020] [Revised: 04/07/2020] [Accepted: 04/10/2020] [Indexed: 01/06/2023]
Abstract
A very large number of computational methods to predict the change in thermodynamic stability of proteins due to mutations have been developed during the last 30 years, and many different web servers are currently available. Nevertheless, most of them suffer from severe drawbacks that decrease their general reliability and, consequently, their applicability to different goals such as protein engineering or the predictions of the effects of mutations in genetic diseases. In this review, we have summarized all the main approaches used to develop these tools, with a survey of the web servers currently available. Moreover, we have also reviewed the different assessments made during the years, in order to allow the reader to check directly the different performances of these tools, to select the one that best fits his/her needs, and to help naïve users in finding the best option for their needs.
5
Nutschel C, Fulton A, Zimmermann O, Schwaneberg U, Jaeger KE, Gohlke H. Systematically Scrutinizing the Impact of Substitution Sites on Thermostability and Detergent Tolerance for Bacillus subtilis Lipase A. J Chem Inf Model 2020; 60:1568-1584. [PMID: 31905288 DOI: 10.1021/acs.jcim.9b00954] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/17/2023]
Abstract
Improving an enzyme's (thermo-)stability or tolerance against solvents and detergents is highly relevant in protein engineering and biotechnology. Recent developments have tended toward data-driven approaches, where available knowledge about the protein is used to identify substitution sites with high potential to yield protein variants with improved stability, and subsequently, substitutions are engineered by site-directed or site-saturation (SSM) mutagenesis. However, the development and validation of algorithms for data-driven approaches have been hampered by the lack of availability of large-scale data measured in a uniform way and being unbiased with respect to substitution types and locations. Here, we extend our knowledge on guidelines for protein engineering following a data-driven approach by scrutinizing the impact of substitution sites on thermostability or/and detergent tolerance for Bacillus subtilis lipase A (BsLipA) at very large scale. We systematically analyze a complete experimental SSM library of BsLipA containing all 3439 possible single variants, which was evaluated as to thermostability and tolerances against four detergents under respectively uniform conditions. Our results provide systematic and unbiased reference data at unprecedented scale for a biotechnologically important protein, identify consistently defined hot spot types for evaluating the performance of data-driven protein-engineering approaches, and show that the rigidity theory and ensemble-based approach Constraint Network Analysis yields hot spot predictions with an up to ninefold gain in precision over random classification.
Affiliation(s)
- Christina Nutschel
  - John von Neumann Institute for Computing (NIC) and Institute for Complex Systems-Structural Biochemistry (ICS-6), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
  - Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Alexander Fulton
  - Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, 52425 Jülich, Germany
- Olav Zimmermann
  - Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Ulrich Schwaneberg
  - Institute of Biotechnology, RWTH Aachen University, 52074 Aachen, Germany
  - DWI-Leibniz-Institute for Interactive Materials, 52056 Aachen, Germany
- Karl-Erich Jaeger
  - Institute of Molecular Enzyme Technology, Heinrich Heine University Düsseldorf, 52425 Jülich, Germany
  - Institute of Bio- and Geosciences IBG-1: Biotechnology, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- Holger Gohlke
  - John von Neumann Institute for Computing (NIC) and Institute for Complex Systems-Structural Biochemistry (ICS-6), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
  - Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, 40225 Düsseldorf, Germany
6
Pandurangan AP, Blundell TL. Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning. Protein Sci 2020; 29:247-257. [PMID: 31693276 PMCID: PMC6933854 DOI: 10.1002/pro.3774] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 08/30/2019] [Revised: 10/31/2019] [Accepted: 10/31/2019] [Indexed: 02/02/2023]
Abstract
Next-generation sequencing methods have not only allowed an understanding of genome sequence variation during the evolution of organisms but have also provided invaluable information about genetic variants in inherited disease and the emergence of resistance to drugs in cancers and infectious disease. A challenge is to distinguish mutations that are drivers of disease or drug resistance, from passengers that are neutral or even selectively advantageous to the organism. This requires an understanding of impacts of missense mutations in gene expression and regulation, and on the disruption of protein function by modulating protein stability or disturbing interactions with proteins, nucleic acids, small molecule ligands, and other biological molecules. Experimental approaches to understanding differences between wild-type and mutant proteins are most accurate but are also time-consuming and costly. Computational tools used to predict the impacts of mutations can provide useful information more quickly. Here, we focus on two widely used structure-based approaches, originally developed in the Blundell lab: site-directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph-based signatures to represent the wild-type structural environment and machine learning to predict the effect of mutations on protein stability. Here, we describe DUET that uses machine learning to combine the two approaches. We discuss briefly the development of mCSM for understanding the impacts of mutations on interfaces with other proteins, nucleic acids, and ligands, and we exemplify the wide application of these approaches to understand human genetic disorders and drug resistance mutations relevant to cancer and mycobacterial infections. 
STATEMENT FOR A BROADER AUDIENCE: Genetic or somatic changes in genes can lead to mutations in human proteins, which give rise to genetic disorders or cancer, or to genes of pathogens leading to drug resistance. Computer software described here, using statistical approaches or machine learning, uses the information from genome sequencing of humans and pathogens, together with experimental or modeled 3D structures of gene products, the proteins, to predict impacts of mutations in genetic disease, cancer and drug resistance.
Affiliation(s)
- Arun Prasad Pandurangan
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
  - MRC Laboratory of Molecular Biology, Cambridge, UK
- Tom L. Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
7
Robust Prediction of Single and Multiple Point Protein Mutations Stability Changes. Biomolecules 2019; 10:biom10010067. [PMID: 31906171 PMCID: PMC7023245 DOI: 10.3390/biom10010067] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/09/2019] [Revised: 12/19/2019] [Accepted: 12/20/2019] [Indexed: 11/16/2022]
Abstract
Accurate prediction of protein stability changes resulting from amino acid substitutions is of utmost importance in medicine to better understand which mutations are deleterious, leading to diseases, and which are neutral. Since conducting wet lab experiments to get a better understanding of protein mutations is costly and time consuming, and because of the huge number of possible mutations, computational methods that can accurately predict the effects of amino acid mutations are greatly needed. In this research, we present a robust methodology to predict the energy changes of a protein upon mutation. The proposed prediction scheme is based on a two-step algorithm: a Holdout Random Sampler followed by a neural network model for regression. The Holdout Random Sampler is used to analyze the energy change and the corresponding uncertainty, and to obtain a set of admissible energy changes, expressed as a cumulative distribution function. These values are further used to train a simple neural network model that can predict the energy changes. Results were blindly tested (validated) against experimental energy changes, giving Pearson correlation coefficients of 0.66 for Single Point Mutations and 0.77 for Multiple Point Mutations. These results confirm the success of our method, since it outperforms the majority of previous studies in this field.
8
Savojardo C, Martelli PL, Casadio R, Fariselli P. On the critical review of five machine learning-based algorithms for predicting protein stability changes upon mutation. Brief Bioinform 2019; 22:601-603. [PMID: 31885042 DOI: 10.1093/bib/bbz168] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/25/2019] [Revised: 11/26/2019] [Accepted: 12/05/2019] [Indexed: 01/17/2023]
Abstract
A review, recently published in this journal by Fang (2019), showed that methods trained for the prediction of protein stability changes upon mutation have a very critical bias: they neglect that a protein variation (A -> B) and its reverse (B -> A) must have opposite values of the free energy difference (ΔΔG_AB = -ΔΔG_BA). In this letter, we complement Fang's paper by presenting a more general view of the problem. In particular, a machine learning-based method published in 2015 (INPS) addressed the bias issue directly. We include the analysis of the missing method, showing that INPS is nearly insensitive to the addressed problem.
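The antisymmetry requirement stated in this abstract can be checked numerically for any predictor by scoring each mutation together with its hypothetical reverse. The following is a minimal sketch in plain Python; the two summary statistics (the correlation between forward and reverse predictions, and the mean of their half-sums as a bias term) are a common way of quantifying this property and are an assumption here, not a procedure taken from the letter. The function name and sample values are hypothetical.

```python
import math


def antisymmetry_check(forward, reverse):
    """Return (correlation, bias) for paired forward and reverse predictions.

    forward[i] is the predicted free energy change for a mutation A -> B and
    reverse[i] is the prediction for B -> A. A perfectly antisymmetric
    predictor gives a correlation of -1 and a bias of 0.
    """
    n = len(forward)
    mean_f = sum(forward) / n
    mean_r = sum(reverse) / n
    cov = sum((f - mean_f) * (r - mean_r) for f, r in zip(forward, reverse))
    var_f = sum((f - mean_f) ** 2 for f in forward)
    var_r = sum((r - mean_r) ** 2 for r in reverse)
    correlation = cov / math.sqrt(var_f * var_r)
    # For an antisymmetric predictor every forward + reverse pair sums to zero.
    bias = sum(f + r for f, r in zip(forward, reverse)) / (2 * n)
    return correlation, bias


# Hypothetical predictions in kcal/mol, for illustration only.
forward_pred = [1.0, -0.5, 2.0]
consistent_reverse = [-1.0, 0.5, -2.0]  # sign flips with direction
direction_blind_reverse = [1.0, -0.5, 2.0]  # ignores direction entirely

print(antisymmetry_check(forward_pred, consistent_reverse))
print(antisymmetry_check(forward_pred, direction_blind_reverse))
```

A predictor that returns the same value regardless of direction shows up as a positive correlation and a non-zero bias, which is the kind of failure the reverse-mutation test is designed to expose.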
Affiliation(s)
- Castrense Savojardo
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Rita Casadio
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Torino, Italy
9
Fang J. A critical review of five machine learning-based algorithms for predicting protein stability changes upon mutation. Brief Bioinform 2019; 21:1285-1292. [PMID: 31273374 DOI: 10.1093/bib/bbz071] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 04/12/2019] [Revised: 05/14/2019] [Accepted: 05/16/2019] [Indexed: 01/02/2023]
Abstract
A number of machine learning (ML)-based algorithms have been proposed for predicting mutation-induced stability changes in proteins. In this critical review, we used hypothetical reverse mutations to evaluate the performance of five representative algorithms and found all of them suffer from the problem of overfitting. This approach is based on the fact that if a wild-type protein is more stable than a mutant protein, then the same mutant is less stable than the wild-type protein. We analyzed the underlying issues and suggest that the main causes of the overfitting problem include that the numbers of training cases were too small, and the features used in the models were not sufficiently informative for the task. We make recommendations on how to avoid overfitting in this important research area and improve the reliability and robustness of ML-based algorithms in general.
Affiliation(s)
- Jianwen Fang
  - Computational & Systems Biology Branch, Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 9609 Medical Center Drive, Rockville, MD 20850, USA
10
Pandurangan AP, Ochoa-Montaño B, Ascher DB, Blundell TL. SDM: a server for predicting effects of mutations on protein stability. Nucleic Acids Res 2019; 45:W229-W235. [PMID: 28525590 PMCID: PMC5793720 DOI: 10.1093/nar/gkx439] [Citation(s) in RCA: 332] [Impact Index Per Article: 66.4] [Received: 02/10/2017] [Accepted: 05/15/2017] [Indexed: 02/02/2023]
Abstract
Here, we report a webserver for the improved SDM, used for predicting the effects of mutations on protein stability. As a pioneering knowledge-based approach, SDM has been highlighted as the most appropriate method to use in combination with many other approaches. We have updated the environment-specific amino-acid substitution tables based on the current expanded PDB (a 5-fold increase in information), and introduced new residue-conformation and interaction parameters, including packing density and residue depth. The updated server has been extensively tested using a benchmark containing 2690 point mutations from 132 different protein structures. The revised method correlates well against the hypothetical reverse mutations, better than comparable methods built using machine-learning approaches, highlighting the strength of our knowledge-based approach for identifying stabilising mutations. Given a PDB file (a Protein Data Bank file format containing the 3D coordinates of the protein atoms), and a point mutation, the server calculates the stability difference score between the wildtype and mutant protein. The server is available at http://structure.bioc.cam.ac.uk/sdm2
Affiliation(s)
- David B Ascher
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
  - Department of Biochemistry and Molecular Biology, University of Melbourne, Australia
- Tom L Blundell
  - Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
11
Chakravorty D, Patra S. RankProt: A multi criteria-ranking platform to attain protein thermostabilizing mutations and its in vitro applications - Attribute based prediction method on the principles of Analytical Hierarchical Process. PLoS One 2018; 13:e0203036. [PMID: 30286107 PMCID: PMC6171822 DOI: 10.1371/journal.pone.0203036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/14/2018] [Accepted: 08/14/2018] [Indexed: 01/15/2023]
Abstract
Attaining recombinant thermostable proteins is still a challenge for protein engineering. The difficulty lies in the length of time and the enormous effort required to achieve the desired results. The present work proposes a novel and economical strategy for attaining protein thermostability by predicting site-specific mutations in the shortest possible time. The success of the approach can be attributed to the Analytical Hierarchical Process, and the outcome was a rationalized tool for predicting thermostable mutations, RankProt. Briefly, the method involved ranking 17 biophysical protein features as class predictors, derived from 127 pairs of thermostable and mesostable proteins. Among the 17 predictors, ionic interactions and main-chain to main-chain hydrogen bonds were the highest ranked features, with an eigenvalue of 0.091. The success of the tool was judged by multi-fold in silico validation tests, and it achieved a prediction accuracy of 91% with an AUC of 0.927. Further, in vitro validation was carried out by predicting thermostabilizing mutations for mesostable Bacillus subtilis lipase and introducing the predicted mutations by multi-site directed mutagenesis. The rationalized method succeeded in rendering the lipase thermostable, with optimum temperature stability and Tm increased by 20°C and 7°C, respectively. In conclusion, this was the minimum number of mutations in comparison to the number of mutations incorporated to render Bacillus subtilis lipase thermostable by directed evolution techniques. The present work shows that protein stabilizing mutations can be rationally designed by balancing the biophysical pleiotropy of proteins, in accordance with the selection pressure.
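In the Analytical Hierarchical Process mentioned in this abstract, criteria are ranked by the normalized principal eigenvector of a reciprocal pairwise comparison matrix. The following is a minimal sketch of that step in plain Python using power iteration; the three-criterion matrix, the function name and the iteration count are hypothetical illustrations and do not reproduce the 17 features or the weights used by RankProt.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the normalized principal eigenvector by power iteration.

    pairwise[i][j] states how strongly criterion i is preferred over
    criterion j, with pairwise[j][i] == 1 / pairwise[i][j].
    """
    n = len(pairwise)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        product = [
            sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)
        ]
        total = sum(product)
        # Rescale so that the weights always sum to 1.
        weights = [value / total for value in product]
    return weights


# Hypothetical, perfectly consistent comparison of three criteria.
comparisons = [
    [1.0, 2.0, 6.0],
    [1.0 / 2.0, 1.0, 3.0],
    [1.0 / 6.0, 1.0 / 3.0, 1.0],
]

for rank, weight in enumerate(ahp_weights(comparisons), start=1):
    print(f"criterion {rank}: weight {weight:.3f}")
```

Because the weights are normalized to sum to 1, each weight necessarily becomes small when many criteria are compared at once.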
Affiliation(s)
- Debamitra Chakravorty
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
- Sanjukta Patra
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
12
Saarman NP, Kober KM, Simison WB, Pogson GH. Sequence-Based Analysis of Thermal Adaptation and Protein Energy Landscapes in an Invasive Blue Mussel (Mytilus galloprovincialis). Genome Biol Evol 2018; 9:2739-2751. [PMID: 28985307 PMCID: PMC5647807 DOI: 10.1093/gbe/evx190] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Accepted: 09/13/2017] [Indexed: 12/12/2022]
Abstract
Adaptive responses to thermal stress in poikilotherms play an important role in determining competitive ability and species distributions. Amino acid substitutions that affect protein stability and modify the thermal optima of orthologous proteins may be particularly important in this context. Here, we examine a set of 2,770 protein-coding genes to determine if proteins in a highly invasive heat tolerant blue mussel (Mytilus galloprovincialis) contain signals of adaptive increases in protein stability relative to orthologs in a more cold tolerant M. trossulus. Such thermal adaptations might help to explain, mechanistically, the success with which the invasive marine mussel M. galloprovincialis has displaced native species in contact zones in the eastern (California) and western (Japan) Pacific. We tested for stabilizing amino acid substitutions in warm tolerant M. galloprovincialis relative to cold tolerant M. trossulus with a generalized linear model that compares in silico estimates of recent changes in protein stability among closely related congeners. Fixed substitutions in M. galloprovincialis were 3,180.0 calories per mol per substitution more stabilizing at genes with both elevated dN/dS ratios and transcriptional responses to heat stress, and 705.8 calories per mol per substitution more stabilizing across all 2,770 loci investigated. Amino acid substitutions concentrated in a small number of genes were more stabilizing in M. galloprovincialis compared with cold tolerant M. trossulus. We also tested for, but did not find, enrichment of a priori GO terms in genes with elevated dN/dS ratios in M. galloprovincialis. This might indicate that selection for thermodynamic stability is generic across all lineages, and suggests that the high change in estimated protein stability that we observed in M. galloprovincialis is driven by selection for extra stabilizing substitutions, rather than by higher incidence of selection in a greater number of genes in this lineage. Nonetheless, our finding of more stabilizing amino acid changes in the warm adapted lineage is important because it suggests that adaptation for thermal stability has contributed to M. galloprovincialis' superior tolerance to heat stress, and that pairing tests for positive selection and tests for transcriptional response to heat stress can identify candidates of protein stability adaptation.
Affiliation(s)
- Norah P Saarman
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz
  - Department of Ecology and Evolutionary Biology, Yale University
- Kord M Kober
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz
  - Department of Physiological Nursing, University of California, San Francisco
  - Institute for Computational Health Sciences, University of California, San Francisco
- W Brian Simison
  - Center for Comparative Genomics, California Academy of Sciences, San Francisco, California
- Grant H Pogson
  - Department of Ecology and Evolutionary Biology, University of California, Santa Cruz
13
Steinbrecher T, Zhu C, Wang L, Abel R, Negron C, Pearlman D, Feyfant E, Duan J, Sherman W. Predicting the Effect of Amino Acid Single-Point Mutations on Protein Stability—Large-Scale Validation of MD-Based Relative Free Energy Calculations. J Mol Biol 2017; 429:948-963. [DOI: 10.1016/j.jmb.2016.12.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 08/09/2016] [Revised: 12/02/2016] [Accepted: 12/02/2016] [Indexed: 12/22/2022]
14
Wahome N, Sully E, Singer C, Thomas JC, Hu L, Joshi SB, Volkin DB, Fang J, Karanicolas J, Jacobs DJ, Mantis NJ, Middaugh CR. Novel Ricin Subunit Antigens With Enhanced Capacity to Elicit Toxin-Neutralizing Antibody Responses in Mice. J Pharm Sci 2016; 105:1603-1613. [PMID: 26987947 PMCID: PMC4846473 DOI: 10.1016/j.xphs.2016.02.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/03/2015] [Revised: 01/26/2016] [Accepted: 02/09/2016] [Indexed: 02/07/2023]
Abstract
RiVax is a candidate ricin toxin subunit vaccine antigen that has proven to be safe in human phase I clinical trials. In this study, we introduced double and triple cavity-filling point mutations into the RiVax antigen with the expectation that stability-enhancing modifications would have a beneficial effect on overall immunogenicity of the recombinant proteins. We demonstrate that 2 RiVax triple mutant derivatives, RB (V81L/C171L/V204I) and RC (V81I/C171L/V204I), when adsorbed to aluminum salts adjuvant and tested in a mouse prime-boost-boost regimen were 5- to 10-fold more effective than RiVax at eliciting toxin-neutralizing serum IgG antibody titers. Increased toxin neutralizing antibody values and seroconversion rates were evident at different antigen dosages and within 7 days after the first booster. Quantitative stability/flexibility relationships analysis revealed that the RB and RC mutations affect rigidification of regions spanning residues 98-103, which constitutes a known immunodominant neutralizing B-cell epitope. A more detailed understanding of the immunogenic nature of RB and RC may provide insight into the fundamental relationship between local protein stability and antibody reactivity.
Affiliation(s)
- Newton Wahome
- Department of Pharmaceutical Chemistry, Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66047
- Erin Sully
- Division of Infectious Disease, Wadsworth Center, New York State Department of Health, Albany, New York 12208
- Christopher Singer
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223
- Justin C Thomas
- Department of Pharmaceutical Chemistry, Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66047
- Lei Hu
- Department of Pharmaceutical Chemistry, Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66047
- Sangeeta B Joshi
- Department of Pharmaceutical Chemistry, Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66047
- David B Volkin
- Department of Pharmaceutical Chemistry, Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66047
- Jianwen Fang
- Applied Bioinformatics Laboratory, Department of Medicinal Chemistry, University of Kansas, Lawrence, Kansas 66047
- John Karanicolas
- Department of Molecular Biosciences, Center for Computational Biology, University of Kansas, Lawrence, Kansas 66045
- Donald J Jacobs
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, North Carolina 28223
- Nicholas J Mantis
- Division of Infectious Disease, Wadsworth Center, New York State Department of Health, Albany, New York 12208; Department of Biomedical Sciences, School of Public Health, University at Albany, Albany, New York 12201
- C Russell Middaugh
- Department of Pharmaceutical Chemistry, Macromolecule and Vaccine Stabilization Center, University of Kansas, Lawrence, Kansas 66047
|
15
|
Gromiha MM, Anoosha P, Huang LT. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants. Methods Mol Biol 2016; 1415:71-89. [PMID: 27115628 DOI: 10.1007/978-1-4939-3572-7_4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 05/20/2023]
Abstract
Protein stability is the free energy difference between the unfolded and folded states of a protein, which lies in the range of 5-25 kcal/mol. Experimentally, protein stability is measured with circular dichroism, differential scanning calorimetry, and fluorescence spectroscopy using thermal and denaturant denaturation methods. These experimental data have been accumulated in the form of a database, ProTherm, a thermodynamic database for proteins and mutants. It also contains sequence and structure information of a protein, experimental methods and conditions, and literature information. Different features such as search, display, and sorting options and visualization tools have been incorporated in the database. ProTherm is a valuable resource for understanding/predicting the stability of proteins and it can be accessed at http://www.abren.net/protherm/. ProTherm has been effectively used to examine the relationship among thermodynamics, structure, and function of proteins. We describe the recent progress on the development of methods for understanding/predicting protein stability, such as (1) general trends on mutational effects on stability, (2) relationship between the stability of protein mutants and amino acid properties, (3) applications of protein three-dimensional structures for predicting their stability upon point mutations, (4) prediction of protein stability upon single mutations from amino acid sequence, and (5) prediction methods for addressing double mutants. A list of online resources for predicting protein stability has also been provided.
Affiliation(s)
- M Michael Gromiha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- P Anoosha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- Liang-Tsung Huang
- Department of Medical Informatics, Tzu Chi University, Hualien, 970, Taiwan
|
16
|
Structure Based Thermostability Prediction Models for Protein Single Point Mutations with Machine Learning Tools. PLoS One 2015; 10:e0138022. [PMID: 26361227 PMCID: PMC4567301 DOI: 10.1371/journal.pone.0138022] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 05/07/2015] [Accepted: 08/24/2015] [Indexed: 11/19/2022] Open
Abstract
Thermostability issues arising from protein point mutations are a common occurrence in protein engineering. An application that predicts the thermostability of mutants can help guide the decision-making process in protein design via mutagenesis. An in silico point mutation scanning method is frequently used to find “hot spots” in proteins for focused mutagenesis. ProTherm (http://gibk26.bio.kyutech.ac.jp/jouhou/Protherm/protherm.html) is a public database that contains experimentally measured thermostability data for thousands of protein mutants. Two data sets based on two differently measured thermostability properties of protein single point mutations, namely the unfolding free energy change (ddG) and the melting temperature change (dTm), were obtained from this database. Folding free energy changes calculated with Rosetta, structural information on the point mutations, and amino acid physical properties were used to build thermostability prediction models with informatics modeling tools. Five supervised machine learning methods (support vector machine, random forests, artificial neural network, naïve Bayes classifier, K nearest neighbor) and partial least squares regression were used for building the prediction models. Binary and ternary classification models as well as regression models were built and evaluated. Data set redundancy and balancing, the reverse mutations technique, feature selection, and comparison to other published methods are discussed. The Rosetta-calculated folding free energy change ranked as the most influential feature in all prediction models. Other descriptors also made significant contributions to increasing the accuracy of the prediction models.
|
17
|
Support vector machine with a Pearson VII function kernel for discriminating halophilic and non-halophilic proteins. Comput Biol Chem 2013; 46:16-22. [DOI: 10.1016/j.compbiolchem.2013.05.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 04/12/2013] [Revised: 04/24/2013] [Accepted: 05/03/2013] [Indexed: 01/15/2023]
|
18
|
Thomas JC, O'Hara JM, Hu L, Gao FP, Joshi SB, Volkin DB, Brey RN, Fang J, Karanicolas J, Mantis NJ, Middaugh CR. Effect of single-point mutations on the stability and immunogenicity of a recombinant ricin A chain subunit vaccine antigen. Hum Vaccin Immunother 2013; 9:744-52. [PMID: 23563512 DOI: 10.4161/hv.22998] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/21/2022] Open
Abstract
There is great interest in the design and development of highly thermostable and immunogenic protein subunit vaccines for biodefense. In this study, we used two orthogonal and complementary computational protein design approaches to generate a series of single-point mutants of RiVax, an attenuated recombinant ricin A chain (RTA) protein subunit vaccine antigen. As assessed by differential scanning calorimetry, the conformational stabilities of the designed mutants ranged from 4°C less stable to 4.5°C more stable than RiVax, depending on solution pH. Two more thermostable (V18P, C171L) and two less thermostable (T13V, S89T) mutants that displayed native-like secondary and tertiary structures (as determined by circular dichroism and fluorescence spectral analysis, respectively) were tested for their capacity to elicit RTA-specific antibodies and toxin-neutralizing activity. Following a prime-boost regimen, we found qualitative differences with respect to specific antibody titers and toxin neutralizing antibody levels induced by the different mutants. Upon a second boost with the more thermostable mutant C171L, a statistically significant increase in RTA-specific antibody titers was observed when compared with RiVax-immunized mice. Notably, the results indicate that single residue changes can be made to the RiVax antigen that increase its thermal stability without adversely impacting the efficacy of the vaccine.
Affiliation(s)
- Justin C Thomas
- Macromolecule and Vaccine Stabilization Center; Department of Pharmaceutical Chemistry; University of Kansas; Lawrence, KS USA
|
19
|
Thiltgen G, Goldstein RA. Assessing predictors of changes in protein stability upon mutation using self-consistency. PLoS One 2012; 7:e46084. [PMID: 23144695 PMCID: PMC3483175 DOI: 10.1371/journal.pone.0046084] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 03/22/2012] [Accepted: 08/29/2012] [Indexed: 11/18/2022] Open
Abstract
The ability to predict the effect of mutations on protein stability is important for a wide range of tasks, from protein engineering to assessing the impact of SNPs to understanding basic protein biophysics. A number of methods have been developed that make these predictions, but assessing the accuracy of these tools is difficult given the limitations and inconsistencies of the experimental data. We evaluate four different methods based on the ability of these methods to generate consistent results for forward and back mutations, and examine how this ability varies with the nature and location of the mutation. We find that, while one method seems to outperform the others, the ability of these methods to make accurate predictions is limited.
Affiliation(s)
- Richard A. Goldstein
- Department of Mathematical Biology, National Institute for Medical Research, Mill Hill, London, United Kingdom
|
20
|
Li Y, Fang J. PROTS-RF: a robust model for predicting mutation-induced protein stability changes. PLoS One 2012; 7:e47247. [PMID: 23077576 PMCID: PMC3471942 DOI: 10.1371/journal.pone.0047247] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 04/30/2012] [Accepted: 09/11/2012] [Indexed: 11/19/2022] Open
Abstract
The ability to improve protein thermostability via protein engineering is of great scientific interest and also has significant practical value. In this report we present PROTS-RF, a robust model based on the Random Forest algorithm capable of predicting thermostability changes induced by not only single-, but also double- or multiple-point mutations. The model is built using 41 features including evolutionary information, secondary structure, solvent accessibility and a set of fragment-based features. It achieves accuracies of 0.799, 0.782 and 0.787, and areas under receiver operating characteristic (ROC) curves of 0.873, 0.868 and 0.862 for single-, double- and multiple-point mutation datasets, respectively. Contrary to previous suggestions, our results clearly demonstrate that a robust predictive model trained to predict thermostability changes induced by single point mutations can also predict the effects of double and multiple point mutations. It also shows high levels of robustness in the tests using hypothetical reverse mutations. We demonstrate that testing datasets created based on physical principles can be highly useful for testing the robustness of predictive models.
Affiliation(s)
- Yunqi Li
- Applied Bioinformatics Laboratory, The University of Kansas, Lawrence, Kansas, United States of America
- Jianwen Fang
- Applied Bioinformatics Laboratory, The University of Kansas, Lawrence, Kansas, United States of America
|
21
|
Structure-based mutant stability predictions on proteins of unknown structure. J Biotechnol 2012; 161:287-93. [DOI: 10.1016/j.jbiotec.2012.06.020] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 03/21/2012] [Revised: 06/19/2012] [Accepted: 06/22/2012] [Indexed: 11/23/2022]
|