1
Hong X, Tong X, Xie J, Liu P, Liu X, Song Q, Liu S, Liu S. An updated dataset and a structure-based prediction model for protein-RNA binding affinity. Proteins 2023; 91:1245-1253. [PMID: 37186412 DOI: 10.1002/prot.26503] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/30/2022] [Revised: 03/08/2023] [Accepted: 04/12/2023] [Indexed: 05/17/2023]
Abstract
Understanding the process of protein-RNA interaction is essential for structural biology. The thermodynamic process is an important part of uncovering the protein-RNA interaction mechanism. The regulatory networks between protein and RNA in organisms are dominated by binding and dissociation in cells. Therefore, determining the binding affinity of protein-RNA complexes can help us understand the regulation mechanism of protein-RNA interaction. Since it is time-consuming and labor-intensive to determine the binding affinity of protein-RNA complexes by experimental methods, it is necessary and urgent to develop computational methods to predict it. To develop a binding affinity prediction model, we first update the protein-RNA binding affinity benchmark dataset (PRBAB), which now includes 145 complexes. Second, we extract structural features from the complex structures, and then analyze and select representative structural features to train the regression model. Third, we randomly select a subset of PRBAB2.0 to fit the experimentally determined protein-RNA binding affinities. Finally, we tested our model on the nonredundant PDBbind dataset, and the results showed a Pearson correlation coefficient r = 0.57 and RMSE = 2.51 kcal/mol. The Pearson correlation coefficient reaches 0.7 after removing 5 complex structures with modified residues/nucleotides and metal ions. When testing on ProNAB, 71.60% of the predictions achieve a Pearson correlation coefficient r = 0.61 and RMSE = 1.56 kcal/mol against experimental values.
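The two evaluation metrics quoted in this abstract, the Pearson correlation coefficient r and the RMSE, can be computed from paired predicted and experimental affinities. The following is a minimal sketch and not the authors' code; the function names and the sample values are illustrative assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical binding free energies in kcal/mol (not data from the paper).
predicted = [-9.1, -10.4, -8.2, -11.0]
experimental = [-8.7, -11.2, -7.9, -10.1]

print(f"r = {pearson_r(predicted, experimental):.2f}")
print(f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
```

Reporting both numbers is useful because r measures only how well the ranking and trend agree, while RMSE captures the absolute error in energy units.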
Affiliation(s)
- Xu Hong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xiaoxue Tong
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Juan Xie
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Pinyu Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Xudong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Qi Song
- Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan, China
- Sen Liu
- Key Laboratory of Fermentation Engineering (Ministry of Education), Hubei University of Technology, Wuhan, China
- Shiyong Liu
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
2
Park T, Woo H, Baek M, Yang J, Seok C. Structure prediction of biological assemblies using GALAXY in CAPRI rounds 38-45. Proteins 2019; 88:1009-1017. [PMID: 31774573 DOI: 10.1002/prot.25859] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/19/2019] [Revised: 11/11/2019] [Accepted: 11/23/2019] [Indexed: 12/12/2022]
Abstract
We participated in CAPRI rounds 38-45 both as a server predictor and a human predictor. These CAPRI rounds provided excellent opportunities for testing prediction methods for three classes of protein interactions, that is, protein-protein, protein-peptide, and protein-oligosaccharide interactions. Both template-based methods (GalaxyTBM for monomer protein, GalaxyHomomer for homo-oligomer protein, GalaxyPepDock for protein-peptide complex) and ab initio docking methods (GalaxyTongDock and GalaxyPPDock for protein oligomer, GalaxyPepDock-ab-initio for protein-peptide complex, GalaxyDock2 and Galaxy7TM for protein-oligosaccharide complex) have been tested. Template-based methods depend heavily on the availability of proper templates and template-target similarity, and template-target difference is responsible for inaccuracy of template-based models. Inaccurate template-based models could be improved by our structure refinement and loop modeling methods based on physics-based energy optimization (GalaxyRefineComplex and GalaxyLoop) for several CAPRI targets. Current ab initio docking methods require accurate protein structures as input. Small conformational changes from input structure could be accounted for by our docking methods, producing one of the best models for several CAPRI targets. However, predicting large conformational changes involving protein backbone is still challenging, and full exploration of physics-based methods for such problems is still to come.
Affiliation(s)
- Taeyong Park
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Hyeonuk Woo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Minkyung Baek
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Jinsol Yang
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
3
Tyunina EY, Badelin VG. Isotherms of the Molar Viscosity of Liquids and Fluids over a Wide Range of Pressures. Russian Journal of Physical Chemistry A 2018. [DOI: 10.1134/s0036024418100357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
4
Jain T, Boland T, Lilov A, Burnina I, Brown M, Xu Y, Vásquez M. Prediction of delayed retention of antibodies in hydrophobic interaction chromatography from sequence using machine learning. Bioinformatics 2017; 33:3758-3766. [DOI: 10.1093/bioinformatics/btx519] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 04/10/2017] [Accepted: 08/11/2017] [Indexed: 12/16/2022]
Affiliation(s)
- Tushar Jain
- Computational Biology, Adimab, Palo Alto, CA, USA
- Todd Boland
- Computational Biology, Adimab, Palo Alto, CA, USA
- Yingda Xu
- Protein Analytics, Adimab, Lebanon, NH, USA
5
Apgar JR, Mader M, Agostinelli R, Benard S, Bialek P, Johnson M, Gao Y, Krebs M, Owens J, Parris K, St. Andre M, Svenson K, Morris C, Tchistiakova L. Beyond CDR-grafting: Structure-guided humanization of framework and CDR regions of an anti-myostatin antibody. MAbs 2016; 8:1302-1318. [PMID: 27625211 PMCID: PMC5058614 DOI: 10.1080/19420862.2016.1215786] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 12/23/2015] [Revised: 06/23/2016] [Accepted: 07/18/2016] [Indexed: 01/29/2023]
Abstract
Antibodies are an important class of biotherapeutics that offer specificity to their antigen, long half-life, effector function interaction and good manufacturability. The immunogenicity of non-human-derived antibodies, which can be a major limitation to development, has been partially overcome by humanization through complementarity-determining region (CDR) grafting onto human acceptor frameworks. The retention of foreign content in the CDR regions, however, is still a potential immunogenic liability. Here, we describe the humanization of an anti-myostatin antibody utilizing a 2-step process of traditional CDR-grafting onto a human acceptor framework, followed by a structure-guided approach to further reduce the murine content of CDR-grafted antibodies. To accomplish this, we solved the co-crystal structures of myostatin with the chimeric (Protein Databank (PDB) id 5F3B) and CDR-grafted anti-myostatin antibody (PDB id 5F3H), allowing us to computationally predict the structurally important CDR residues as well as those making significant contacts with the antigen. Structure-based rational design enabled further germlining of the CDR-grafted antibody, reducing the murine content of the antibody without affecting antigen binding. The overall "humanness" was increased for both the light and heavy chain variable regions.
Affiliation(s)
- Susan Benard
- Biomedicine Design, Pfizer Inc., Cambridge, MA, USA
- Peter Bialek
- Rare Disease Research Unit, Pfizer Inc., Cambridge, MA, USA
- Mark Johnson
- Rare Disease Research Unit, Pfizer Inc., Cambridge, MA, USA
- Yijie Gao
- Biomedicine Design, Pfizer Inc., Cambridge, MA, USA
- Mark Krebs
- Biomedicine Design, Pfizer Inc., Cambridge, MA, USA
- Jane Owens
- Rare Disease Research Unit, Pfizer Inc., Cambridge, MA, USA
- Kevin Parris
- Biomedicine Design, Pfizer Inc., Cambridge, MA, USA
- Kris Svenson
- Biomedicine Design, Pfizer Inc., Cambridge, MA, USA
- Carl Morris
- Rare Disease Research Unit, Pfizer Inc., Cambridge, MA, USA
6
Gromiha MM, Anoosha P, Huang LT. Applications of Protein Thermodynamic Database for Understanding Protein Mutant Stability and Designing Stable Mutants. Methods Mol Biol 2016; 1415:71-89. [PMID: 27115628 DOI: 10.1007/978-1-4939-3572-7_4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 05/20/2023]
Abstract
Protein stability is the free energy difference between unfolded and folded states of a protein, which lies in the range of 5-25 kcal/mol. Experimentally, protein stability is measured with circular dichroism, differential scanning calorimetry, and fluorescence spectroscopy using thermal and denaturant denaturation methods. These experimental data have been accumulated in the form of a database, ProTherm, thermodynamic database for proteins and mutants. It also contains sequence and structure information of a protein, experimental methods and conditions, and literature information. Different features such as search, display, and sorting options and visualization tools have been incorporated in the database. ProTherm is a valuable resource for understanding/predicting the stability of proteins and it can be accessed at http://www.abren.net/protherm/ . ProTherm has been effectively used to examine the relationship among thermodynamics, structure, and function of proteins. We describe the recent progress on the development of methods for understanding/predicting protein stability, such as (1) general trends on mutational effects on stability, (2) relationship between the stability of protein mutants and amino acid properties, (3) applications of protein three-dimensional structures for predicting their stability upon point mutations, (4) prediction of protein stability upon single mutations from amino acid sequence, and (5) prediction methods for addressing double mutants. A list of online resources for predicting has also been provided.
Affiliation(s)
- M Michael Gromiha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- P Anoosha
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- Liang-Tsung Huang
- Department of Medical Informatics, Tzu Chi University, Hualien, 970, Taiwan
7
Li L, Huang Y, Xiao Y. How to use not-always-reliable binding site information in protein-protein docking prediction. PLoS One 2013; 8:e75936. [PMID: 24124522 PMCID: PMC3790831 DOI: 10.1371/journal.pone.0075936] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/28/2013] [Accepted: 08/22/2013] [Indexed: 11/19/2022]
Abstract
In many protein-protein docking algorithms, binding site information is used to help predict protein complex structures. Using correct and accurate binding site information can increase the protein-protein docking success rate significantly. On the other hand, using wrong binding site information should lead to a failed prediction or, at least, decrease the success rate. Recently, various successful theoretical methods have been proposed to predict the binding sites of proteins. However, the predicted binding site information is not always reliable; sometimes wrong binding site information is given. Hence there is a high risk in using predicted binding site information in current docking algorithms. In this paper, a softly restricting method (SRM) is developed to solve this problem. By utilizing predicted binding site information in a proper way, the SRM algorithm is sensitive to correct binding site information but insensitive to wrong information, which decreases the risk of using predicted binding site information. The SRM is tested on benchmark 3.0 using purely predicted binding site information. The results show that when the predicted information is correct, SRM increases the success rate significantly; however, even if the predicted information is completely wrong, SRM decreases the success rate only slightly, which indicates that the SRM is suitable for utilizing predicted binding site information.
Affiliation(s)
- Lin Li
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, South Carolina, United States of America
- Yanzhao Huang
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- * E-mail: (YH); (YX)
- Yi Xiao
- Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan, Hubei, China
- * E-mail: (YH); (YX)
8
Moal IH, Fernandez-Recio J. Intermolecular Contact Potentials for Protein-Protein Interactions Extracted from Binding Free Energy Changes upon Mutation. J Chem Theory Comput 2013; 9:3715-27. [PMID: 26584123 DOI: 10.1021/ct400295z] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023]
Abstract
Understanding and predicting the energetics of protein-protein interactions is fundamental to the structural modeling of protein complexes. Binding free energy can be approximated as a sum of pairwise atomic or residue contact energies, which are commonly inferred from contact frequencies observed in experimental protein structures. However, such statistically inferred potentials require certain assumptions and approximations. Here, we explore the possibility of deriving atomic and residue contact potentials directly from experimental binding free energy changes following mutation and present a number of such potentials. The first set of potentials is obtained by unweighted least-squares fitting and bootstrap aggregating. The second set is calculated using a weighting scheme optimized against absolute binding affinity data, so as to account for the over-representation of certain complexes, residues, and families of interactions. The congruence of the potentials with known physical chemistry is investigated. The potentials are further validated by ranking and clustering protein-protein docking poses.
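The fitting idea in this abstract can be written compactly. The following is only a sketch under stated assumptions and not the authors' exact formulation: $\Delta n_{ik}$ (the change in the number of contacts of type $k$ caused by mutation $i$), $\varepsilon_k$ (the contact energy of type $k$), and $B$ (the number of bootstrap resamples) are notation introduced here for illustration.

```latex
% Additive contact model for the binding free energy change of mutation i
\Delta\Delta G_i \approx \sum_{k} \Delta n_{ik}\,\varepsilon_k

% Unweighted least-squares estimate of the contact energies
\hat{\varepsilon} = \arg\min_{\varepsilon} \sum_{i}
  \Bigl( \Delta\Delta G_i^{\mathrm{exp}} - \sum_{k} \Delta n_{ik}\,\varepsilon_k \Bigr)^{2}

% Bootstrap aggregating: average the fits over B resampled training sets
\bar{\varepsilon} = \frac{1}{B} \sum_{b=1}^{B} \hat{\varepsilon}^{(b)}
```

Averaging over resampled fits is the usual motivation for bootstrap aggregating: it reduces the variance of the fitted parameters when some contact types are supported by few mutations.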
Affiliation(s)
- Iain H Moal
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
- Juan Fernandez-Recio
- Joint BSC-IRB Research Program in Computational Biology, Life Science Department, Barcelona Supercomputing Center, C/Jordi Girona 29, 08034 Barcelona, Spain
9
Abstract
We examine the relationship between binding affinity and interface size for reversible protein-protein interactions (PPIs), using cytokines from the tumor necrosis factor (TNF) superfamily and their receptors as a test case. Using surface plasmon resonance, we measured single-site binding affinities for binding of the large receptor TNFR1 to its ligands TNFα (K_D = 1.4 ± 0.4 nM) and lymphotoxin-α (K_D = 50 ± 10 nM), and also for binding of the small receptor Fn14 to TWEAK (K_D = 70 ± 10 nM). We additionally assembled data for all other TNF-TNFR family complexes for which reliable single-site binding affinities have been reported. We used these values to calculate the binding efficiencies, defined as binding energy per square angstrom of surface area buried at the contact interface, for nine of these complexes for which cocrystal structures are available, and compared the results to those for a set of 144 protein-protein complexes with published affinities. The results show that the most efficient PPI complexes generate ~20 cal mol⁻¹ Å⁻² of binding energy. A minimal contact area of ~500 Å² is required for a stable complex, in order to generate sufficient interaction energy to pay the entropic cost of colocalizing two proteins from 1 M solution. The most compact and efficient TNF-TNFR complex was the BAFF-BR3 complex, which achieved ~80% of the maximal achievable binding efficiency. Other small receptors also gave high binding efficiencies, while the larger receptors generated only 44-49% of this limit despite interacting primarily through just a single small domain. The results provide new insight into how much binding energy can be generated by a PPI interface of a given size, and establish a quantitative method for predicting how large a natural or engineered contact interface must be to achieve a given level of binding affinity.
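The binding efficiency defined in this abstract can be illustrated with a short calculation. The sketch below converts a dissociation constant into a binding free energy using the standard relation ΔG = RT ln(K_D / 1 M) and divides by the buried surface area. The temperature of 298.15 K, the function names, and the buried area in the usage example are assumptions for illustration; they are not taken from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Return dG = RT ln(K_D / 1 M) in kcal/mol (negative when K_D < 1 M)."""
    return R_KCAL * temperature_k * math.log(kd_molar)

def binding_efficiency(delta_g_kcal, buried_area_a2):
    """Return binding energy per unit buried area in cal mol^-1 A^-2."""
    return -delta_g_kcal * 1000.0 / buried_area_a2

# K_D quoted in the abstract for TNFR1 binding to TNF-alpha.
dg = binding_free_energy(1.4e-9)

# Hypothetical buried surface area in square angstroms (assumed value).
area = 1000.0

print(f"dG = {dg:.2f} kcal/mol")
print(f"efficiency = {binding_efficiency(dg, area):.1f} cal/mol/A^2")
```

With these definitions, an interface of 500 Å² operating at 20 cal mol⁻¹ Å⁻² would correspond to 10 kcal/mol of binding energy, which is how the two approximate figures in the abstract relate to each other.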
Affiliation(s)
- Eric S Day
- Biogen Idec, 14 Cambridge Center, Cambridge, Massachusetts 02142, United States
10
Feld GK, Brown MJ, Krantz BA. Ratcheting up protein translocation with anthrax toxin. Protein Sci 2012; 21:606-24. [PMID: 22374876 DOI: 10.1002/pro.2052] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.8] [Received: 01/24/2012] [Revised: 02/21/2012] [Accepted: 02/22/2012] [Indexed: 01/09/2023]
Abstract
Energy-consuming nanomachines catalyze the directed movement of biopolymers in the cell. They are found both dissolved in the aqueous cytosol as well as embedded in lipid bilayers. Inquiries into the molecular mechanism of nanomachine-catalyzed biopolymer transport have revealed that these machines are equipped with molecular parts, including adjustable clamps, levers, and adaptors, which interact favorably with substrate polypeptides. Biological nanomachines that catalyze protein transport, known as translocases, often require that their substrate proteins unfold before translocation. An unstructured protein chain is likely entropically challenging to bind, push, or pull in a directional manner, especially in a way that produces an unfolding force. A number of ingenious solutions to this problem are now evident in the anthrax toxin system, a model used to study protein translocation. Here we highlight molecular ratchets and current research on anthrax toxin translocation. A picture is emerging of proton-gradient-driven anthrax toxin translocation, and its associated ratchet mechanism likely applies broadly to other systems. We suggest a cyclical thermodynamic order-to-disorder mechanism (akin to a heat-engine cycle) is central to underlying protein translocation: peptide substrates nonspecifically bind to molecular clamps, which possess adjustable affinities; polypeptide substrates compress into helical structures; these clamps undergo proton-gated switching; and the substrate subsequently expands regaining its unfolded state conformational entropy upon translocation.
Affiliation(s)
- Geoffrey K Feld
- Department of Chemistry, University of California, Berkeley, California 94720, USA
11
Moal IH, Agius R, Bates PA. Protein-protein binding affinity prediction on a diverse set of structures. Bioinformatics 2011; 27:3002-9. [PMID: 21903632 DOI: 10.1093/bioinformatics/btr513] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.7] [Indexed: 02/11/2024]
Abstract
MOTIVATION Accurate binding free energy functions for protein-protein interactions are imperative for a wide range of purposes. Their construction is predicated upon ascertaining the factors that influence binding and their relative importance. A recent benchmark of binding affinities has allowed, for the first time, the evaluation and construction of binding free energy models using a diverse set of complexes, and a systematic assessment of our ability to model the energetics of conformational changes. RESULTS We construct a large set of molecular descriptors using commonly available tools, introducing the use of energetic factors associated with conformational changes and disorder to order transitions, as well as features calculated on structural ensembles. The descriptors are used to train and test a binding free energy model using a consensus of four machine learning algorithms, whose performance constitutes a significant improvement over the other state of the art empirical free energy functions tested. The internal workings of the learners show how the descriptors are used, illuminating the determinants of protein-protein binding. AVAILABILITY The molecular descriptor set and descriptor values for all complexes are available in the Supplementary Material. A web server for the learners and coordinates for the bound and unbound structures can be accessed from the website: http://bmm.cancerresearchuk.org/~Affinity. CONTACT paul.bates@cancer.org.uk. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Iain H Moal
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London WC2A 3LY, UK
12
Li L, Guo D, Huang Y, Liu S, Xiao Y. ASPDock: protein-protein docking algorithm using atomic solvation parameters model. BMC Bioinformatics 2011; 12:36. [PMID: 21269517 PMCID: PMC3039575 DOI: 10.1186/1471-2105-12-36] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 07/22/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022]
Abstract
Background The Atomic Solvation Parameters (ASP) model has been proven to be a very successful method of calculating the binding free energy of protein complexes. This suggests that incorporating it into docking algorithms should improve the accuracy of prediction. In this paper we propose an FFT-based algorithm to calculate ASP scores of protein complexes and develop an ASP-based protein-protein docking method (ASPDock). Results ASPDock is first tested on the 21 complexes whose binding free energies have been determined experimentally. The results show that the calculated ASP scores have a stronger correlation (r ≈ 0.69) with the binding free energies than the pure shape complementarity scores (r ≈ 0.48). ASPDock is further tested on a large dataset, the benchmark 3.0, which contains 124 complexes, and also shows better performance than the pure shape complementarity method in docking prediction. Comparisons with other state-of-the-art docking algorithms showed that the ASP score indeed gives a higher success rate than the pure shape complementarity score of FTDock but a lower success rate than Zdock3.0. We also developed a softly restricting method to add the information of predicted binding sites into our docking algorithm. The ASP-based docking method performed well in CAPRI rounds 18 and 19. Conclusions ASP may be more accurate and physical than pure shape complementarity in describing the features of protein docking.
Affiliation(s)
- Lin Li
- Biomolecular Physics and Modelling Group, Department of Physics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, PR China
13
Rowling PJE, Cook R, Itzhaki LS. Toward classification of BRCA1 missense variants using a biophysical approach. J Biol Chem 2010; 285:20080-7. [PMID: 20378548 PMCID: PMC2888420 DOI: 10.1074/jbc.m109.088922] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 11/26/2009] [Revised: 04/08/2010] [Indexed: 11/29/2022]
Abstract
Carriers of germ line mutations in breast cancer susceptibility gene BRCA1 have an increased risk of developing breast and ovarian cancers; missense mutations have, however, been difficult to assess for disease association. Here we have used a biophysical approach to classify these variants. We established an assay for measuring the thermodynamic stability of the BRCA1 BRCT domains and investigated the effects of 36 missense mutations. The mutations show a range of effects. Some do not change the stability, whereas others destabilize the protein by as much as 6 kcal mol⁻¹; one-third of the mutants could not be expressed in soluble form in Escherichia coli, and we conclude that these destabilize the protein by an even greater amount. We tested several computer algorithms for their ability to predict the mutant effects and found that by grouping them into two classes (destabilizing by less than or more than 2.2 kcal mol⁻¹), the algorithms could predict the stability changes. Importantly, with the exception of the few mutants located in the binding site, none showed a significant reduction in affinity for phosphorylated substrate. These results indicate that despite very large losses in stability, the integrity of the structure is not compromised by the mutations. Thus, the majority of mutations cause loss of function by reducing the proportion of BRCA1 molecules that are in the folded state and increasing the proportion of molecules that are unfolded. Consequently, small molecule stabilization of the structure could be a generally applicable preventative therapeutic strategy for rescuing many BRCA1 mutations.
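The link between lost stability and the folded population can be illustrated with a simple two-state model. This is a sketch under assumptions, not the authors' analysis: it assumes a two-state folded/unfolded equilibrium at 298.15 K, the wild-type stability in the usage example is hypothetical, and treating a change of exactly 2.2 kcal/mol as "less destabilizing" is an arbitrary choice because the abstract does not specify the boundary case.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def fraction_folded(dg_unfold_kcal, temperature_k=298.15):
    """Two-state folded fraction: 1 / (1 + exp(-dG_unfold / RT))."""
    rt = R_KCAL * temperature_k
    return 1.0 / (1.0 + math.exp(-dg_unfold_kcal / rt))

def stability_class(ddg_kcal, cutoff=2.2):
    """Two-class label using the 2.2 kcal/mol cutoff quoted in the abstract."""
    return "more destabilizing" if ddg_kcal > cutoff else "less destabilizing"

# Hypothetical wild-type unfolding free energy (assumed, in kcal/mol).
wild_type_dg = 5.0

for ddg in (0.0, 2.2, 6.0):  # destabilization caused by a mutation
    folded = fraction_folded(wild_type_dg - ddg)
    print(f"ddG = {ddg:.1f}: {stability_class(ddg)}, folded fraction = {folded:.3f}")
```

In this model a mutation that removes more stability than the wild type has to spare shifts most molecules to the unfolded state, which matches the loss-of-function mechanism the abstract proposes.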
Affiliation(s)
- Pamela J. E. Rowling
- From the Medical Research Council (MRC) Cancer Cell Unit, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 0XZ, United Kingdom
- Rebecca Cook
- From the Medical Research Council (MRC) Cancer Cell Unit, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 0XZ, United Kingdom
- Laura S. Itzhaki
- From the Medical Research Council (MRC) Cancer Cell Unit, Hutchison/MRC Research Centre, Hills Road, Cambridge CB2 0XZ, United Kingdom
14
Su Y, Zhou A, Xia X, Li W, Sun Z. Quantitative prediction of protein-protein binding affinity with a potential of mean force considering volume correction. Protein Sci 2010; 18:2550-8. [PMID: 19798743 DOI: 10.1002/pro.257] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 11/07/2022]
Abstract
Quantitative prediction of protein-protein binding affinity is essential for understanding protein-protein interactions. In this article, an atomic level potential of mean force (PMF) considering volume correction is presented for the prediction of protein-protein binding affinity. The potential is obtained by statistically analyzing X-ray structures of protein-protein complexes in the Protein Data Bank. This approach circumvents the complicated steps of the volume correction process and is very easy to implement in practice. It can obtain more reasonable pair potential compared with traditional PMF and shows a classic picture of nonbonded atom pair interaction as Lennard-Jones potential. To evaluate the prediction ability for protein-protein binding affinity, six test sets are examined. Sets 1-5 were used as test set in five published studies, respectively, and set 6 was the union set of sets 1-5, with a total of 86 protein-protein complexes. The correlation coefficient (R) and standard deviation (SD) of fitting predicted affinity to experimental data were calculated to compare the performance of ours with that in literature. Our predictions on sets 1-5 were as good as the best prediction reported in the published studies, and for union set 6, R = 0.76, SD = 2.24 kcal/mol. Furthermore, we found that the volume correction can significantly improve the prediction ability. This approach can also promote the research on docking and protein structure prediction.
Affiliation(s)
- Yu Su
- MOE Key Laboratory of Bioinformatics, State Key Laboratory of Biomembrane and Membrane Biotechnology, Department of Biological Sciences and Biotechnology, Tsinghua University, Beijing 100084, China
15
Binding site on the transferrin receptor for the parvovirus capsid and effects of altered affinity on cell uptake and infection. J Virol 2010; 84:4969-78. [PMID: 20200243 DOI: 10.1128/jvi.02623-09] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]
Abstract
Canine parvovirus (CPV) and its relative feline panleukopenia virus (FPV) bind the transferrin receptor type 1 (TfR) to infect their host cells but show differences in the interactions with the feline and canine TfRs that determine viral host range and tissue tropism. We changed apical and protease-like domain residues by introducing point mutations and adding or removing glycosylation signals, and we then examined the interactions of those mutant TfRs with the capsids. Most substitutions had little effect on virus binding and uptake. However, mutations of several sites in the apical domain of the receptor either prevented binding to the capsids or reduced the affinity of receptor binding to various degrees. Glycans within the virus binding face of the apical domain also controlled capsid binding. CPV, but not the related feline parvovirus, could use receptors containing a canine TfR-specific glycosylation to mediate efficient infection, while addition of other N-linked glycosylation sites into the virus binding face of the feline apical domain reduced or eliminated both binding and infection. Replacement of critical feline TfR residue 221 with every amino acid had effects on binding and infection which were significantly associated with the biochemical properties of the residue replaced. Receptors with reduced affinities mostly showed proportional changes in their ability to mediate infection. Testing feline TfR variants for their binding and uptake patterns in cells showed that low-affinity versions bound fewer capsids and also differed in attachment to the cell surface and filopodia, but transport to the perinuclear endosome was similar.
16
Abstract
We have developed a thermodynamic database for proteins and mutants, ProTherm, which is a collection of a large number of thermodynamic data on protein stability along with the sequence and structure information, experimental methods and conditions, and literature information. This is a valuable resource for understanding/predicting the stability of proteins, and it can be accessed at http://www.gibk26.bse.kyutech.ac.jp/jouhou/Protherm/protherm.html . ProTherm has several features including various search, display, and sorting options and visualization tools. We have analyzed the data in ProTherm to examine the relationship among thermodynamics, structure, and function of proteins. We describe the progress on the development of methods for understanding/predicting protein stability, such as (i) relationship between the stability of protein mutants and amino acid properties, (ii) average assignment method, (iii) empirical energy functions, (iv) torsion, distance, and contact potentials, and (v) machine learning techniques. A list of online resources for predicting protein stability has also been provided.
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
|
17
|
Statistical theory of neutral protein evolution by random site mutations. J CHEM SCI 2009. [DOI: 10.1007/s12039-009-0105-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
18
|
Tan Y, Luo R. Structural and functional implications of p53 missense cancer mutations. PMC BIOPHYSICS 2009; 2:5. [PMID: 19558684 PMCID: PMC2709103 DOI: 10.1186/1757-5036-2-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 04/23/2009] [Accepted: 06/26/2009] [Indexed: 11/16/2022]
Abstract
Most human cancers contain mutations in the transcription factor p53, and the majority of these are missense mutations located in the DNA-binding core domain. In this study, the stabilities of all core domain missense mutations are predicted and used to infer their likely inactivation mechanisms. Overall, 47.0% of non-PRO/GLY mutants are stable (ΔΔG < 1.0 kT), 36.3% are unstable (ΔΔG > 3.0 kT), and 12.2% fall in between (1.0 kT < ΔΔG < 3.0 kT). Only 4.5% of mutants have no conclusive prediction. Certain types of either stable or unstable mutations are found not to depend on their local structures. Y, I, C, V, F and W (W, R and F) are the most common residues before (after) mutation in unstable mutants. Q, N, K, D, A, S and T (I, T, L and V) are the most common residues before (after) mutation in stable mutants. The stability correlations with sequence, structure, and molecular contacts are also analyzed. No direct correlation between secondary structure and stability is apparent, but a strong correlation between solvent exposure and stability is noticeable. Our correlation analysis shows that loss of protein-protein contacts may be an alternative cause for p53 inactivation. Correlation with clinical data shows that loss of stability and loss of DNA contacts are the two main inactivation mechanisms. Finally, correlation with functional data shows that most mutations that retain function are stable, and most mutations that gain function are unstable, indicating that destabilized and deformed p53 proteins are more likely to find new binding partners.
PACS codes: 87.14.E-
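The stability classes quoted in this abstract can be illustrated with a minimal sketch. Only the two thresholds (1.0 kT and 3.0 kT) come from the abstract; the function name, the use of `None` for an inconclusive prediction, and the choice to place values exactly at a threshold in the intermediate class are assumptions made here for illustration, not the authors' implementation.

```python
from typing import Optional

# Thresholds quoted in the abstract, in units of kT.
STABLE_MAX_KT = 1.0    # stable if predicted ddG < 1.0 kT
UNSTABLE_MIN_KT = 3.0  # unstable if predicted ddG > 3.0 kT


def classify_mutant(ddg_kt: Optional[float]) -> str:
    """Assign a stability class to a predicted ddG (in kT).

    None stands for "no conclusive prediction" (hypothetical convention).
    Values exactly at a threshold are treated as intermediate, which is
    an assumption; the abstract uses strict inequalities throughout.
    """
    if ddg_kt is None:
        return "inconclusive"
    if ddg_kt < STABLE_MAX_KT:
        return "stable"
    if ddg_kt > UNSTABLE_MIN_KT:
        return "unstable"
    return "intermediate"


if __name__ == "__main__":
    for value in (0.4, 2.0, 5.1, None):
        print(value, classify_mutant(value))
```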
Affiliation(s)
- Yuhong Tan
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA 92697-3900, USA
- Ray Luo
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA 92697-3900, USA
|
19
|
Dynerman D, Butzlaff E, Mitchell JC. CUSA and CUDE: GPU-accelerated methods for estimating solvent accessible surface area and desolvation. J Comput Biol 2009; 16:523-37. [PMID: 19361325 DOI: 10.1089/cmb.2008.0157] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022] Open
Abstract
It is well-established that a linear correlation exists between accessible surface areas and experimentally measured solvation energies. Combining this knowledge with an analytic formula for calculation of solvent accessible surfaces, we derive a simple model of desolvation energy as a differentiable function of atomic positions. Additionally, we find that this algorithm is particularly well suited for hardware acceleration on graphics processing units (GPUs), outperforming the CPU by up to two orders of magnitude. We explore the scaling of this desolvation algorithm and provide implementation details applicable to general pairwise algorithms.
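The linear relationship mentioned in this abstract is often written as a sum of per-atom surface areas weighted by atom-type coefficients, with desolvation taken as the difference between the complex and its separated partners. The sketch below illustrates that idea only. It takes per-atom solvent accessible surface areas as given inputs and does not reproduce the paper's analytic surface formula or its GPU implementation. The coefficient values and all names are hypothetical placeholders.

```python
from typing import Dict, List, Tuple

# One atom: (atom type, solvent accessible surface area in square angstroms).
Atom = Tuple[str, float]

# Hypothetical per-type coefficients (energy per square angstrom).
# Placeholders for illustration only; not taken from the paper.
SIGMA: Dict[str, float] = {"C": 0.012, "N": -0.060, "O": -0.060}


def solvation_energy(atoms: List[Atom], sigma: Dict[str, float] = SIGMA) -> float:
    """Linear model: sum over atoms of coefficient times accessible area."""
    return sum(sigma[atom_type] * area for atom_type, area in atoms)


def desolvation_energy(
    complex_atoms: List[Atom],
    partner_a: List[Atom],
    partner_b: List[Atom],
    sigma: Dict[str, float] = SIGMA,
) -> float:
    """Solvation energy of the complex minus that of the free partners."""
    return (
        solvation_energy(complex_atoms, sigma)
        - solvation_energy(partner_a, sigma)
        - solvation_energy(partner_b, sigma)
    )


if __name__ == "__main__":
    free_a = [("C", 20.0), ("O", 15.0)]
    free_b = [("C", 18.0), ("N", 12.0)]
    # In the complex, part of each surface is buried, so areas shrink.
    bound = [("C", 12.0), ("O", 9.0), ("C", 10.0), ("N", 5.0)]
    print(desolvation_energy(bound, free_a, free_b))
```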
Affiliation(s)
- David Dynerman
- Department of Mathematics, University of Wisconsin, Madison, Wisconsin 53706, USA
|
20
|
Zhou P, Tian F, Shang Z. 2D depiction of nonbonding interactions for protein complexes. J Comput Chem 2009; 30:940-51. [PMID: 18942722 DOI: 10.1002/jcc.21109] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/09/2022]
Abstract
A program called 2D-GraLab is described for automatically generating schematic representations of nonbonding interactions across protein binding interfaces. The program takes input in the standard PDB format, and the outputs are two-dimensional PostScript diagrams giving an intuitive and informative description of the protein-protein interactions and their energetic properties, including hydrogen bonds, salt bridges, van der Waals interactions, hydrophobic contacts, pi-pi stacking, disulfide bonds, desolvation effects, and loss of conformational entropy. To ensure that this interaction information is determined accurately and reliably, the methods and standalone programs employed in 2D-GraLab are all widely used in the chemistry and biology community. The generated diagrams allow intuitive visualization of the interaction mode and binding specificity between two subunits in protein complexes, and by providing information on nonbonding energetics and geometric characteristics, the program offers the possibility of comparing different protein binding profiles in a detailed, objective, and quantitative manner. We expect that this 2D molecular graphics tool could be useful for experimentalists and theoreticians interested in protein structure and protein engineering.
Affiliation(s)
- Peng Zhou
- Institute of Molecular Design & Molecular Thermodynamics, Department of Chemistry, Zhejiang University, Hangzhou 310027, China
|
21
|
Bhattacherjee A, Biswas P. Statistical Theory of Protein Sequence Design by Random Mutation. J Phys Chem B 2009; 113:5520-7. [DOI: 10.1021/jp810515s] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Affiliation(s)
- Parbati Biswas
- Department of Chemistry, University of Delhi, Delhi-110007
|
22
|
The Thermodynamics of Protein–Ligand Interaction and Solvation: Insights for Ligand Design. J Mol Biol 2008; 384:1002-17. [DOI: 10.1016/j.jmb.2008.09.073] [Citation(s) in RCA: 249] [Impact Index Per Article: 15.6] [Received: 08/18/2008] [Revised: 09/26/2008] [Accepted: 09/26/2008] [Indexed: 11/21/2022]
|
23
|
Fernández M, Fernández L, Sánchez P, Caballero J, Abreu JI. Proteometric modelling of protein conformational stability using amino acid sequence autocorrelation vectors and genetic algorithm-optimised support vector machines. MOLECULAR SIMULATION 2008. [DOI: 10.1080/08927020802301920] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/21/2022]
Affiliation(s)
- Michael Fernández
- a Faculty of Agronomy, Center for Biotechnological Studies, University of Matanzas, Molecular Modeling Group, Matanzas, Cuba
- b Kyushu Institute of Technology (KIT), Department of Bioscience and Bioinformatics, Iizuka, Fukuoka, Japan
- Leyden Fernández
- a Faculty of Agronomy, Center for Biotechnological Studies, University of Matanzas, Molecular Modeling Group, Matanzas, Cuba
- Pedro Sánchez
- a Faculty of Agronomy, Center for Biotechnological Studies, University of Matanzas, Molecular Modeling Group, Matanzas, Cuba
- c Faculty of Informatics, University of Matanzas, Artificial Intelligence Lab, Matanzas, Cuba
- Julio Caballero
- d Centro de Bioinformática y Simulación Molecular, Universidad de Talca, Talca, Chile
- Jose Ignacio Abreu
- a Faculty of Agronomy, Center for Biotechnological Studies, University of Matanzas, Molecular Modeling Group, Matanzas, Cuba
- c Faculty of Informatics, University of Matanzas, Artificial Intelligence Lab, Matanzas, Cuba
|
24
|
Dea-Ayuela MA, Pérez-Castillo Y, Meneses-Marcel A, Ubeira FM, Bolas-Fernández F, Chou KC, González-Díaz H. HP-Lattice QSAR for dynein proteins: experimental proteomics (2D-electrophoresis, mass spectrometry) and theoretic study of a Leishmania infantum sequence. Bioorg Med Chem 2008; 16:7770-6. [PMID: 18662882 DOI: 10.1016/j.bmc.2008.07.023] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Received: 01/25/2008] [Revised: 06/23/2008] [Accepted: 07/02/2008] [Indexed: 10/21/2022]
Abstract
The toxicity and inefficacy of current organic drugs against leishmaniasis justify research projects to find new molecular targets in Leishmania species, including Leishmania infantum (L. infantum) and Leishmania major (L. major), both important pathogens. In this sense, quantitative structure-activity relationship (QSAR) methods, which are very useful in bioorganic and medicinal chemistry for discovering small-sized drugs, may help to identify not only new drugs but also new drug targets if we apply them to proteins. Dyneins are important proteins of these parasites, governing fundamental processes such as cilia and flagella motion, nuclear migration, organization of the mitotic spindle, and chromosome separation during mitosis. However, despite the interest in them as potential drug targets, so far there has been no report whatsoever on dyneins with QSAR techniques. To the best of our knowledge, we report here the first QSAR for dynein proteins. We used as input the spectral moments of a Markov matrix associated with the HP-Lattice Network of the protein sequence. The data contain 411 protein sequences of different species selected by ClustalX to develop a QSAR that correctly discriminates on average between 92.75% and 92.51% of dyneins and other proteins in four different train and cross-validation datasets. We also report a combined experimental and theoretical study of a new dynein sequence in order to illustrate, with a practical example, the utility of the model in searching for potential drug targets. First, we carried out a 2D-electrophoresis analysis of L. infantum biological samples. Next, we excised from 2D-E gels one spot of interest belonging to an unknown protein or protein fragment in the region M<20,200 and pI<4. We used the MASCOT search engine to find proteins in the L. major database with the highest similarity score to the MS of the protein isolated from L. infantum. We used the QSAR model to predict the new sequence as dynein with a probability of 99.99% without relying upon alignment. In order to confirm the previous function annotation, we predicted the sequences as dynein with BLAST and the omniBLAST tools (96% alignment similarity to dyneins of other species). Using this combined strategy, we have successfully identified an L. infantum protein containing dynein heavy chain and illustrated the potential use of the QSAR model as a complement to alignment tools.
|
25
|
Abstract
Prediction of protein stability upon amino acid substitution is a challenging problem and it will be helpful for designing stable mutants. We have developed a thermodynamic database for proteins and mutants (ProTherm), which has more than 20000 thermodynamic data along with sequence and structure information, experimental conditions and literature information. It is freely accessible at http://gibk26.bse.kyutech.ac.jp/jouhou/protherm/protherm.html. Utilizing the database, we have analysed the relationship between amino acid properties and protein stability and developed different methods, such as average assignment method, distance and torsion potentials and decision tree models to discriminate the stabilizing and destabilizing mutants, and to predict the stability change upon mutation. Our method could distinguish the stabilizing and destabilizing mutants with an accuracy of 82 and 85% respectively from amino acid sequence and protein three-dimensional structure. We obtained the correlation of 0.70 and 0.87, between the experimental and predicted stability changes upon mutations, from sequence and structure respectively. Furthermore, we have developed different web servers for discrimination and prediction and they are freely accessible at http://bioinformatics.myweb.hinet.net/iptree.htm and http://cupsat.tu-bs.de/.
|
26
|
am Busch MS, Lopes A, Amara N, Bathelt C, Simonson T. Testing the Coulomb/Accessible Surface Area solvent model for protein stability, ligand binding, and protein design. BMC Bioinformatics 2008; 9:148. [PMID: 18366628 PMCID: PMC2292695 DOI: 10.1186/1471-2105-9-148] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 10/09/2007] [Accepted: 03/13/2008] [Indexed: 11/10/2022] Open
Abstract
Background: Protein structure prediction and computational protein design require efficient yet sufficiently accurate descriptions of aqueous solvent. We continue to evaluate the performance of the Coulomb/Accessible Surface Area (CASA) implicit solvent model, in combination with the Charmm19 molecular mechanics force field. We test a set of model parameters optimized earlier, and we also carry out a new optimization in this work, using as a target a set of experimental stability changes for single point mutations of various proteins and peptides. The optimization procedure is general, and could be used with other force fields. The computation of stability changes requires a model for the unfolded state of the protein. In our approach, this state is represented by tripeptide structures of the sequence Ala-X-Ala for each amino acid type X. We followed an iterative optimization scheme which, at each cycle, optimizes the solvation parameters and a set of tripeptide structures for the unfolded state. This protocol uses a set of 140 experimental stability mutations and a large set of tripeptide conformations to find the best tripeptide structures and solvation parameters.
Results: Using the optimized parameters, we obtain a mean unsigned error of 2.28 kcal/mol for the stability mutations. The performance of the CASA model is assessed by two further applications: (i) calculation of protein-ligand binding affinities and (ii) computational protein design. For these two applications, the previous parameters and the ones optimized here give a similar performance. For ligand binding, we obtain reasonable agreement with a set of 55 experimental mutation data, with a mean unsigned error of 1.76 kcal/mol with the new parameters and 1.47 kcal/mol with the earlier ones. We show that the optimized CASA model is not inferior to the Generalized Born/Surface Area (GB/SA) model for the prediction of these binding affinities. Likewise, the new parameters perform well for the design of 8 SH3 domain proteins, where an average of 32.8% sequence identity relative to the native sequences was achieved. Further, it was shown that the computed sequences have the character of naturally occurring homologues of the native sequences.
Conclusion: Overall, the two CASA variants explored here perform very well for a wide variety of applications. Both variants provide an efficient solvent treatment for the computational engineering of ligands and proteins.
Affiliation(s)
- Marcel Schmidt am Busch
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France.
|
27
|
Fernández M, Caballero J, Fernández L, Abreu JI, Garriga M. Protein radial distribution function (P-RDF) and Bayesian-Regularized Genetic Neural Networks for modeling protein conformational stability: Chymotrypsin inhibitor 2 mutants. J Mol Graph Model 2007; 26:748-59. [PMID: 17569565 DOI: 10.1016/j.jmgm.2007.04.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 01/14/2007] [Revised: 04/03/2007] [Accepted: 04/28/2007] [Indexed: 11/30/2022]
Abstract
Development of novel computational approaches for modeling protein properties is a main goal in applied proteomics. In this work, we report the extension of the radial distribution function (RDF) scores formalism to proteins for encoding 3D structural information for modeling purposes. Protein-RDF (P-RDF) scores measure spherical distributions, on the protein 3D structure, of 48 amino acid/residue properties selected from the AAindex database. P-RDF scores were tested for building predictive models of the change in thermal unfolding Gibbs free energy (ΔΔG) of chymotrypsin inhibitor 2 upon mutation. In this sense, an ensemble of Bayesian-Regularized Genetic Neural Networks (BRGNNs) yielded an optimum nonlinear model for the conformational stability. The ensemble predictor explained about 84% and 70% of the variance of the data in the training and test sets, respectively.
Affiliation(s)
- Michael Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba.
|
28
|
Fernández M, Abreu JI, Caballero J, Garriga M, Fernández L. Comparative modeling of the conformational stability of chymotrypsin inhibitor 2 protein mutants using amino acid sequence autocorrelation (AASA) and amino acid 3D autocorrelation (AA3DA) vectors and ensembles of Bayesian-regularized genetic neural networks. MOLECULAR SIMULATION 2007. [DOI: 10.1080/08927020701564479] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
|
29
|
Stumpff-Kane AW, Maksimiak K, Lee MS, Feig M. Sampling of near-native protein conformations during protein structure refinement using a coarse-grained model, normal modes, and molecular dynamics simulations. Proteins 2007; 70:1345-56. [PMID: 17876825 DOI: 10.1002/prot.21674] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 11/05/2022]
Abstract
Protein structure refinement from comparative models with the goal of predicting structures at near-experimental accuracy remains an unsolved problem. Structure refinement might be achieved with an iterative protocol where the most native-like structure from a set of decoys generated from an initial model in one cycle is used as the starting structure for the next cycle. Conformational sampling based on the coarse-grained SICHO model, atomic-level molecular dynamics simulations, and normal-mode analysis is compared in the context of such a protocol. All of the sampling methods can achieve significant refinement close to experimental structures, although the distribution of structures and the ability to reach native-like structures differ greatly. Implications for the practical application of such sampling methods and the requirements for scoring functions in an iterative refinement protocol are analyzed in the context of theoretical predictions for the distribution of protein-like conformations with a random sampling protocol.
Affiliation(s)
- Andrew W Stumpff-Kane
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, USA
|
30
|
Fernández M, Caballero J, Fernández L, Abreu JI, Acosta G. Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines. MOLECULAR SIMULATION 2007. [DOI: 10.1080/08927020701377070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
|
31
|
Staritzbichler R, Gu W, Helms V. Are solvation free energies of homogeneous helical peptides additive? J Phys Chem B 2007; 109:19000-7. [PMID: 16853446 DOI: 10.1021/jp052403x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
Abstract
We investigated the additivity of the solvation free energy of amino acids in homogeneous helices of different lengths in water and in chloroform. Solvation free energies were computed by multiconfiguration thermodynamic integration involving extended molecular dynamics simulations and by applying the generalized Born surface area solvation model to static helix geometries. The investigation focused on homogeneous peptides composed of uncharged amino acids, where the backbone atoms are kept fixed in an ideal helical conformation. We found nonlinearity, especially for short peptides, which does not allow a simple treatment of the interaction of amino acids with their surroundings. For homogeneous peptides longer than five residues, the results from both methods are in quite good agreement, and solvation energies are to a good extent additive.
|
32
|
Fernández M, Caballero J, Fernández L, Abreu JI, Acosta G. Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines. Proteins 2007; 70:167-75. [PMID: 17654549 DOI: 10.1002/prot.21524] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/06/2022]
Abstract
This work reports a novel 3D pseudo-folding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single mutants of 64 proteins, were tested for building a classifier for the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) upon single mutations. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants well, with accuracies over 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence-based predictor with more than 1000 mutants. Furthermore, the model adequately classified mutations associated with diseases of human prion protein and human transthyretin.
Affiliation(s)
- Michael Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba.
|
33
|
Bueno M, Camacho CJ, Sancho J. SIMPLE estimate of the free energy change due to aliphatic mutations: Superior predictions based on first principles. Proteins 2007; 68:850-62. [PMID: 17523191 DOI: 10.1002/prot.21453] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
The bioinformatics revolution of the last decade has been instrumental in the development of empirical potentials to quantitatively estimate protein interactions for modeling and design. Although computationally efficient, these potentials hide most of the relevant thermodynamics in 5 to 40 parameters that are fitted against a large experimental database. Here, we revisit this longstanding problem and show that a careful consideration of the change in hydrophobicity, electrostatics, and configurational entropy between the folded and unfolded state of aliphatic point mutations predicts 20-30% fewer false positives and yields more accurate predictions than any published empirical energy function. This significant improvement is achieved with essentially no free parameters, validating past theoretical and experimental efforts to understand the thermodynamics of protein folding. Our first-principles analysis strongly suggests that both the solute-solute van der Waals interactions in the folded state and the electrostatic free energy change of exposed aliphatic mutations are almost completely compensated by similar interactions operating in the unfolded ensemble. Not surprisingly, the problem of properly accounting for the solvent contribution to the free energy of polar and charged group mutations, as well as of mutations that disrupt the protein backbone, remains open.
Affiliation(s)
- Marta Bueno
- Department of Computational Biology, University of Pittsburgh, Pennsylvania, USA
|
34
|
Fernández L, Caballero J, Abreu JI, Fernández M. Amino acid sequence autocorrelation vectors and bayesian-regularized genetic neural networks for modeling protein conformational stability: Gene V protein mutants. Proteins 2007; 67:834-52. [PMID: 17377990 DOI: 10.1002/prot.21349] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Indexed: 11/10/2022]
Abstract
Development of novel computational approaches for modeling protein properties from their primary structure is the main goal in applied proteomics. In this work, we report the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino acid sequence autocorrelation (AASA) vectors were calculated by measuring the autocorrelations, at sequence lags ranging from 1 to 15 on the protein primary structure, of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the change in thermal unfolding Gibbs free energy (ΔΔG) of gene V protein upon mutation. In this sense, ensembles of Bayesian-regularized genetic neural networks (BRGNNs) were used for obtaining an optimum nonlinear model for the conformational stability. The ensemble predictor explained about 88% and 66% of the variance of the data in the training and test sets, respectively. Furthermore, the optimum AASA vector subset not only helped to successfully model unfolding stability but also distributed wild-type and mutant gene V proteins well on a stability self-organized map (SOM) when used for unsupervised training of competitive neurons.
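The descriptor count in this abstract follows from 48 properties times 15 lags, giving 720 values per sequence. A minimal sketch of a sequence autocorrelation descriptor is given below. It assumes a Moreau-Broto-style average of property products over residue pairs separated by a given lag; the paper's exact normalization is not stated in the abstract, so that choice is an assumption. The single property shown is the Kyte-Doolittle hydropathy scale, used here as a stand-in for the 48 AAindex properties, which the abstract does not list. Function names are hypothetical.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy values, used as one example residue property.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocorrelation(sequence: str, scale: Dict[str, float], max_lag: int = 15) -> List[float]:
    """Average product of property values for residues `lag` positions apart.

    Returns one value per lag from 1 to max_lag. Lags longer than the
    sequence allows contribute 0.0 (an assumption made for this sketch).
    """
    values = [scale[residue] for residue in sequence]
    result = []
    for lag in range(1, max_lag + 1):
        pairs = len(values) - lag
        if pairs <= 0:
            result.append(0.0)
            continue
        total = sum(values[i] * values[i + lag] for i in range(pairs))
        result.append(total / pairs)
    return result


def aasa_vector(sequence: str, scales: Dict[str, Dict[str, float]], max_lag: int = 15) -> List[float]:
    """Concatenate autocorrelations over all property scales, in sorted name order."""
    descriptors: List[float] = []
    for name in sorted(scales):
        descriptors.extend(autocorrelation(sequence, scales[name], max_lag))
    return descriptors


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary example sequence
    print(autocorrelation(seq, KYTE_DOOLITTLE, max_lag=3))
```

With 48 property scales and the default 15 lags, `aasa_vector` returns 720 values, matching the descriptor count quoted in the abstract.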
Affiliation(s)
- Leyden Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba
|
35
|
Huang LT, Saraboji K, Ho SY, Hwang SF, Ponnuswamy MN, Gromiha MM. Prediction of protein mutant stability using classification and regression tool. Biophys Chem 2007; 125:462-70. [PMID: 17113702 DOI: 10.1016/j.bpc.2006.10.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 09/01/2006] [Revised: 10/19/2006] [Accepted: 10/23/2006] [Indexed: 11/18/2022]
Abstract
Prediction of protein stability upon amino acid substitutions is an important problem in molecular biology, and solving it would help in designing stable mutants. In this work, we have analyzed the stability of protein mutants using two different datasets of 1396 and 2204 mutants obtained from the ProTherm database, for the free energy change due to thermal denaturation (ΔΔG) and denaturant denaturation (ΔΔG(H2O)), respectively. We have used a set of 48 physical, chemical, energetic, and conformational properties of amino acid residues and computed the difference in amino acid properties for each mutant in both sets of data. These differences in amino acid properties have been related to protein stability (ΔΔG and ΔΔG(H2O)) and used to train a classification and regression tool for predicting the stability of protein mutants. Further, we have tested the method with 4-fold, 5-fold, and 10-fold cross-validation procedures. We found that the physical properties, shape and flexibility, are important determinants of protein stability. The classification of mutants based on secondary structure (helix, strand, turn, and coil) and solvent accessibility (buried, partially buried, partially exposed, and exposed) distinguished the stabilizing/destabilizing mutants at an average accuracy of 81% and 80% for ΔΔG and ΔΔG(H2O), respectively. The correlation between the experimental and predicted stability change is 0.61 for ΔΔG and 0.44 for ΔΔG(H2O). Further, the free energy change due to the replacement of an amino acid residue has been predicted within an average error of 1.08 kcal/mol and 1.37 kcal/mol for thermal and chemical denaturation, respectively. The relative importance of secondary structure and solvent accessibility, and the influence of the dataset on the prediction of protein mutant stability, are discussed.
Affiliation(s)
- Liang-Tsung Huang
- Institute of Information Engineering and Computer Science, Feng-Chia University, Taichung, 407, Taiwan
|
36
|
González-Díaz H, Pérez-Castillo Y, Podda G, Uriarte E. Computational chemistry comparison of stable/nonstable protein mutants classification models based on 3D and topological indices. J Comput Chem 2007; 28:1990-5. [PMID: 17450569 DOI: 10.1002/jcc.20700] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Indexed: 11/11/2022]
Abstract
In principle, there are different protein structural parameters that can be used in computational chemistry studies to classify protein mutants according to thermal stability, including sequence, connectivity, and 3D descriptors. Connectivity parameters (called topological indices, TIs) are simpler than 3D parameters and are therefore less computationally expensive. However, TIs ignore important aspects of protein structure and hence are expected to be inaccurate. In any case, a comparison of 3D descriptors and TIs with respect to their power to discriminate proteins according to stability has not been reported. In this study, we compare both classes of indices in this sense for the first time. The best model found, based on 3D spectral moments, correctly classified 507 out of 525 (96.6%) proteins, while the TIs model correctly classified 404 out of 525 (77.0%) proteins. We have shown that 3D descriptor models do give more accurate results than TIs but, interestingly, TIs give acceptable results in a timely way in spite of their simplicity.
Affiliation(s)
- Humberto González-Díaz
- Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela 15782, Spain.
|
37
|
González-Díaz H, Uriarte E. Biopolymer stochastic moments. I. Modeling human rhinovirus cellular recognition with protein surface electrostatic moments. Biopolymers 2006; 77:296-303. [PMID: 15648087 DOI: 10.1002/bip.20234] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
Stochastic moments may be applied as molecular descriptors in quantitative structure-activity relationship (QSAR) studies for small molecules (H. González-Díaz et al., Journal of Molecular Modeling, 2002, Vol. 8, pp. 237-245; 2003, Vol. 9, pp. 395-407). However, applications in the field of biopolymers are less known. Recently, the MARCH-INSIDE approach has been generalized to encode structural features of proteins and other biopolymers (H. González-Díaz et al., Bioinformatics, 2003, Vol. 19, pp. 2079-2087; Bioorganic & Medicinal Chemistry Letters, 2004, Vol. 14, pp. 4691-4695; Polymers, 2004, Vol. 45, pp. 3845-3853; Bioorganic & Medicinal Chemistry, 2005, Vol. 13, pp. 323-331). The present article attempts to extend this research by introducing, for the first time, stochastic moments for a surface road map of viral proteins. These moments are afterward used to seek a model that predicts the cellular receptor for human rhinoviruses. The model correctly classified 100% of 10 viruses binding to the low-density lipoprotein receptor (LDLR) and 88.9% of 9 viruses binding to the intercellular adhesion molecule (ICAM) receptors in training. The same results have been obtained in four cross-validation experiments using a resubstitution technique. The present model compares favorably, in terms of complexity, with others previously reported that are based on entropy considerations, and offers a quantitative basis for the visual rule previously reported by Vlasak et al.
|
38
|
Parthiban V, Gromiha MM, Hoppe C, Schomburg D. Structural analysis and prediction of protein mutant stability using distance and torsion potentials: Role of secondary structure and solvent accessibility. Proteins 2006; 66:41-52. [PMID: 17068801 DOI: 10.1002/prot.21115] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Indexed: 11/07/2022]
Abstract
Analyzing the factors behind protein stability is a key research topic in molecular biology, and has direct implications for protein structure prediction and protein-protein interactions. We have analyzed protein stability upon point mutations using a distance-dependent pair potential representing mainly through-space interactions, and a torsion angle potential representing mainly neighboring effects, as a basic statistical mechanical setup for the analysis. The synergetic effect of accessible surface area and secondary structure preferences was used as a classifier for the potentials. In addition, short-, medium-, and long-range interactions of the protein environment were also analyzed. Two datasets of point mutations were taken for the comparison of theoretically predicted stabilizing energy values with experimental ΔΔG and ΔΔG(H2O) from thermal and chemical denaturation experiments. These include 1538 and 1603 mutations, respectively, and contain 101 proteins that share a wide range of sequence identity. The resulting force fields were carefully evaluated with different statistical tests. Results show a maximum correlation of 0.87 with a standard error of 0.71 kcal/mol between predicted and measured ΔΔG values and a prediction accuracy of 85.3% (stabilizing or destabilizing) for all mutations together. A correlation of 0.77 (more than 80% prediction accuracy with a standard error of 0.95 kcal/mol) was obtained for the test dataset of both split-sample validation and fivefold cross-validation, and a correlation of 0.70 (77.4% prediction accuracy with a standard error of 1.17 kcal/mol) was shown by the jackknife test. The same model was implemented, and the results were analyzed, for mutations with ΔΔG(H2O). A correlation of 0.78 (standard error 0.96 kcal/mol) was observed with a prediction efficiency of 84.65%. This model can be used for the future prediction of protein structural stability together with various experimental techniques.
Affiliation(s)
- Vijaya Parthiban
- Cologne University Bioinformatics Center, International Max Planck Research School, Cologne, Germany
|
39
|
Gudiksen KL, Gitlin I, Moustakas DT, Whitesides GM. Increasing the net charge and decreasing the hydrophobicity of bovine carbonic anhydrase decreases the rate of denaturation with sodium dodecyl sulfate. Biophys J 2006; 91:298-310. [PMID: 16617087 PMCID: PMC1479075 DOI: 10.1529/biophysj.106.081547] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/19/2006] [Accepted: 03/22/2006] [Indexed: 11/18/2022] Open
Abstract
This study compares the rate of denaturation with sodium dodecyl sulfate (SDS) of the individual rungs of protein charge ladders generated by acylation of the lysine ε-NH3+ groups of bovine carbonic anhydrase II (BCA). Each acylation decreases the number of positively charged groups, increases the net negative charge, and increases the hydrophobic surface area of BCA. This study reports the kinetics of denaturation in solutions containing SDS of the protein charge ladders generated with acetic and hexanoic anhydrides; plotting these rates of denaturation as a function of the number of modifications yields a U-shaped curve. The proteins with an intermediate number of modifications are the most stable to denaturation by SDS. There are four competing interactions (two resulting from the change in electrostatics and two resulting from the change in exposed hydrophobic surface area) that determine how a modification affects the stability of a rung of a charge ladder of BCA to denaturation with SDS. A model based on assumptions about how these interactions affect the folded and transition states has been developed and fits the experimental results. Modeling indicates that for each additional acylation, the magnitude of the change in the activation energy of denaturation (ΔΔG‡) due to changes in the electrostatics is much larger than the change in ΔΔG‡ due to changes in the hydrophobicity, but the intermolecular and intramolecular electrostatic effects are opposite in sign. At high numbers of acylations, hydrophobic interactions cause the hexanoyl-modified BCA to denature nearly three orders of magnitude more rapidly than the acetyl-modified BCA.
Affiliation(s)
- Katherine L Gudiksen
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, USA
|
40
|
Saraboji K, Gromiha MM, Ponnuswamy MN. Average assignment method for predicting the stability of protein mutants. Biopolymers 2006; 82:80-92. [PMID: 16453276 DOI: 10.1002/bip.20462] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022]
Abstract
Prediction of protein stability upon amino acid substitutions is an important problem in molecular biology and is helpful for designing stable mutants. In this work, we have analyzed the stability of protein mutants using three different data sets of 1791, 1396, and 2204 mutants, respectively, for thermal stability (ΔTm), free energy change due to thermal denaturation (ΔΔG), and free energy change due to denaturant denaturation (ΔΔG(H2O)), obtained from the ProTherm database. We have classified the mutants into 380 possible substitutions and assigned the stability of each mutant using the information obtained from similar types of mutations. We observed that this assignment could distinguish the stabilizing and destabilizing mutants with an accuracy of 70-80% for different measures of stability. Further, we have classified the mutants based on secondary structure and solvent accessibility (ASA) and observed that the classification significantly improved the accuracy of prediction. The classification of mutants based on helix, strand, and coil distinguished the stabilizing/destabilizing mutants at an average accuracy of 82%, with a correlation of 0.56; information about the location of residues at the interior, partially buried, and surface regions of a protein correctly identified the stabilizing/destabilizing residues at an average accuracy of 81%, with a correlation of 0.59. The nine subclassifications based on three secondary structures and solvent accessibilities improved the accuracy of assigning stabilizing/destabilizing mutants to 84-89% for the three data sets. Further, the present method is able to predict the free energy change (ΔΔG) upon mutation within a deviation of 0.64 kcal/mol. We suggest that this method could be used for predicting the stability of protein mutants.
Affiliation(s)
- K Saraboji
- Department of Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai-600 025, India
|
41
|
Pei J, Wang Q, Zhou J, Lai L. Estimating protein-ligand binding free energy: atomic solvation parameters for partition coefficient and solvation free energy calculation. Proteins 2006; 57:651-64. [PMID: 15390269 DOI: 10.1002/prot.20198] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Indexed: 11/05/2022]
Abstract
Solvation energy calculation is one of the main difficulties for the estimation of protein-ligand binding free energy and the correct scoring in docking studies. We have developed a new solvation energy estimation method for protein-ligand binding based on atomic solvation parameter (ASP), which has been shown to improve the power of protein-ligand binding free energy predictions. The ASP set, designed to handle both proteins and organic compounds and derived from experimental n-octanol/water partition coefficient (log P) data, contains 100 atom types (united model that treats hydrogen atoms implicitly) or 119 atom types (all-atom model that treats hydrogen atoms explicitly). By using this unified ASP set, an algorithm was developed for solvation energy calculation and was further integrated into a score function for predicting protein-ligand binding affinity. The score function reproduced the absolute binding free energies of a test set of 50 protein-ligand complexes with a standard error of 8.31 kJ/mol. As a byproduct, a conformation-dependent log P calculation algorithm named ASPLOGP was also implemented. The predictive results of ASPLOGP for a test set of 138 compounds were r = 0.968, s = 0.344 for the all-atom model and r = 0.962, s = 0.367 for the united model, which were better than previous conformation-dependent approaches and comparable to fragmental and atom-based methods. ASPLOGP also gave good predictive results for small peptides. The score function based on the ASP model can be applied widely in protein-ligand interaction studies and structure-based drug design.
Affiliation(s)
- Jianfeng Pei
- State Key Laboratory for Structural Chemistry of Stable and Unstable Species, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
|
42
|
Ulmschneider MB, Sansom MSP, Di Nola A. Properties of integral membrane protein structures: derivation of an implicit membrane potential. Proteins 2006; 59:252-65. [PMID: 15723347 DOI: 10.1002/prot.20334] [Citation(s) in RCA: 163] [Impact Index Per Article: 9.1] [Indexed: 11/10/2022]
Abstract
Distributions of each amino acid in the trans-membrane domain were calculated as a function of the membrane normal using all currently available alpha-helical membrane protein structures with resolutions better than 4 Å. The results were compared with previous sequence- and structure-based analyses. Calculation of the average hydrophobicity along the membrane normal demonstrated that the protein surface in the membrane domain is in fact much more hydrophobic than the protein core. While hydrophobic residues dominate the membrane domain, the interfacial regions of membrane proteins were found to be abundant in the small residues glycine, alanine, and serine, consistent with previous studies on membrane protein packing. Charged residues displayed nonsymmetric distributions with a preference for the intracellular interface. This effect was more prominent for Arg and Lys, resulting in a direct confirmation of the positive-inside rule. Potentials of mean force along the membrane normal were derived for each amino acid by fitting Gaussian functions to the residue distributions. The individual potentials agree well with experimental and theoretical considerations. The resulting implicit membrane potential was tested on various membrane proteins as well as single trans-membrane alpha-helices. All membrane proteins were found to be at an energy minimum when correctly inserted into the membrane. For alpha-helices, both interfacial (i.e., surface-bound) and inserted configurations were found to correspond to energy minima. The results demonstrate that the use of trans-membrane amino acid distributions to derive an implicit membrane representation yields meaningful residue potentials.
|
43
|
Feig M, Chocholoušová J, Tanizaki S. Extending the horizon: towards the efficient modeling of large biomolecular complexes in atomic detail. Theor Chem Acc 2005. [DOI: 10.1007/s00214-005-0062-4] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.5] [Indexed: 01/22/2023]
|
44
|
González-Díaz H, Uriarte E. Proteins QSAR with Markov average electrostatic potentials. Bioorg Med Chem Lett 2005; 15:5088-94. [PMID: 16169216 DOI: 10.1016/j.bmcl.2005.07.056] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 03/10/2005] [Revised: 06/28/2005] [Accepted: 07/05/2005] [Indexed: 11/30/2022]
Abstract
Classic physicochemical and topological indices have been widely used in small-molecule QSAR but less so in protein QSAR. In this study, a Markov model is used to calculate, for the first time, average electrostatic potentials ξk for an indirect interaction between amino acids placed at topological distances k within a given protein backbone. The short-term average stochastic potential ξ1 for 53 Arc repressor mutants was used to model the effect of alanine scanning on thermal stability. The Arc repressor is a model protein of relevance for biochemical studies in bioorganic and medicinal chemistry. The linear discriminant analysis model developed correctly classified 43 out of 53 (81.1%) proteins according to their thermal stability. More specifically, the model classified 20/28 (71.4%) proteins with near wild-type stability and 23/25 (92.0%) proteins with reduced stability. Moreover, predictability in cross-validation procedures was 81.0%. Expansion of the electrostatic potential in the series ξ0, ξ1, ξ2, and ξ3 justified the use of the abrupt truncation approach, the overall accuracy being >70.0% for ξ0 but equal for ξ1, ξ2, and ξ3. The ξ1 model compared favorably with others based on D-Fire potential, surface area, volume, partition coefficient, and molar refractivity, which had less than 77.0% accuracy [Ramos de Armas, R.; González-Díaz, H.; Molina, R.; Uriarte, E. Protein Struct. Func. Bioinf. 2004, 56, 715]. The ξ1 model also has a more tractable interpretation than others based on Markovian negentropies and stochastic moments. Finally, the model is notably simpler than the two models based on quadratic and linear indices. Both models, reported by Marrero-Ponce et al., use four to five times more descriptors. The introduction of average stochastic potentials may be useful for QSAR applications, as ξk is amenable to physical interpretation and very effective.
Affiliation(s)
- Humberto González-Díaz
- Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela 15782, Spain.
|
45
|
Stochastic molecular descriptors for polymers. 3. Markov electrostatic moments as polymer 2D-folding descriptors: RNA–QSAR for mycobacterial promoters. Polymer 2005. [DOI: 10.1016/j.polymer.2005.04.104] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
|
46
|
González-Díaz H, Molina R, Uriarte E. Recognition of stable protein mutants with 3D stochastic average electrostatic potentials. FEBS Lett 2005; 579:4297-301. [PMID: 16081074 DOI: 10.1016/j.febslet.2005.06.065] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 11/03/2004] [Revised: 06/07/2005] [Accepted: 06/23/2005] [Indexed: 11/15/2022]
Abstract
As more and more proteins are applied to biochemical research, there is increasing interest in studying their stability. In this study, a Markov model has been used to calculate molecular descriptors of the protein structure, called the average electrostatic potentials (ξk). These descriptors were intended to encode indirect electrostatic pair-wise interactions between amino acids located at Euclidean distance k within a given 3D protein backbone. The different ξk values could be calculated for the protein as a whole or for specific protein regions (orbits), which include amino acids that lie within a given range of distances from the center of charge of the protein. In this work we calculated the ξk values for 657 mutants of different proteins. A Linear Discriminant Analysis model correctly classified a subset of 435 out of 493 proteins according to their thermal stability, a level of predictability of 88.2%. This experiment was repeated with three additional subsets of proteins selected at random from the initial series of 657. More specifically, the model predicted 314/356 (88.2%) of mutants with higher stability than the corresponding wild-type protein and 264/301 (86.7%) of proteins with near wild-type stability. These results illustrate the possibilities for the average stochastic potentials ξk in the study of 3D-structure/property relationships for biochemically relevant proteins.
Affiliation(s)
- Humberto González-Díaz
- Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela 15782, Spain.
|
47
|
Feig M, Brooks CL. Recent advances in the development and application of implicit solvent models in biomolecule simulations. Curr Opin Struct Biol 2005; 14:217-24. [PMID: 15093837 DOI: 10.1016/j.sbi.2004.03.009] [Citation(s) in RCA: 403] [Impact Index Per Article: 21.2] [Indexed: 10/26/2022]
Abstract
Advances have recently been made in the development of implicit solvent methodologies and their application to the modeling of biomolecules, particularly with regard to generalized Born approaches, dielectric screening function formulations and models based on solvent-accessible surface areas. Interesting new developments include more refined non-polar solvation energy estimators, and implicit methods for modeling low-dielectric and heterogeneous environments such as membrane systems. These have been successfully applied to molecular dynamics simulations, the scoring of protein conformations, and the calculation of binding affinities and folding free energy landscapes.
Affiliation(s)
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, USA
|
48
|
González-Díaz H, Uriarte E, Ramos de Armas R. Predicting stability of Arc repressor mutants with protein stochastic moments. Bioorg Med Chem 2005; 13:323-31. [PMID: 15598555 DOI: 10.1016/j.bmc.2004.10.024] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Received: 09/15/2004] [Revised: 10/08/2004] [Accepted: 10/09/2004] [Indexed: 11/18/2022]
Abstract
As more and more protein structures are determined and applied to drug manufacture, there is increasing interest in studying their stability. In this study, the stochastic moments ((SR)π(k)) of 53 Arc repressor mutants were introduced as molecular descriptors for modeling protein stability. The Linear Discriminant Analysis model developed correctly classified 43 out of 53 (81.13%) proteins according to their thermal stability. More specifically, the model classified 20/28 (71.4%) proteins with near wild-type stability and 23/25 (92%) proteins with reduced stability. Moreover, validation of the model was carried out by re-substitution procedures (81.0%). In addition, the stochastic moments-based model compared favorably with others based on physicochemical and geometric parameters such as D-Fire potential, surface area, volume, partition coefficient, and molar refractivity, which showed less than 77% accuracy. This result illustrates the possibilities of the stochastic moments method for the study of proteins relevant to bioorganic and medicinal chemistry.
Affiliation(s)
- Humberto González-Díaz
- Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela 15706, Spain.
|
49
|
Ponce YM, Marrero RM, Castro EA, Ramos de Armas R, Díaz HG, Zaldivar VR, Torrens F. Protein quadratic indices of the "macromolecular pseudograph's alpha-carbon atom adjacency matrix". 1. Prediction of Arc repressor alanine-mutant's stability. Molecules 2004; 9:1124-47. [PMID: 18007508 DOI: 10.3390/91201124] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 06/02/2004] [Revised: 12/12/2004] [Accepted: 12/13/2004] [Indexed: 11/16/2022] Open
Abstract
This report describes a new set of macromolecular descriptors of relevance to protein QSAR/QSPR studies: protein quadratic indices. These descriptors are calculated from the macromolecular pseudograph's alpha-carbon atom adjacency matrix. A study of the protein stability effects for a complete set of alanine substitutions in Arc repressor illustrates this approach. Quantitative Structure-Stability Relationship (QSSR) models allow discrimination between near wild-type stability and reduced-stability A-mutants. A linear discriminant function correctly discriminates 85.4% (35/41) and 91.67% (11/12) of near wild-type stability/reduced-stability mutants in the training and test series, respectively. The model's overall predictability ranges from 80.49% to 82.93% when n varies from 2 to 10 in leave-n-out cross-validation procedures. This value stabilizes around 80.49% when n is > 6. Additionally, canonical regression analysis corroborates the statistical quality of the classification model (Rcanc = 0.72, p-level < 0.0001). This analysis was also used to compute biological stability canonical scores for each Arc A-mutant. On the other hand, a nonlinear piecewise regression model compares favorably with a linear regression one in predicting the melting temperature (tm) of the Arc A-mutants. The linear model explains almost 72% of the variance of the experimental tm (R = 0.85 and s = 5.64), and LOO press statistics evidenced its predictive ability (q2 = 0.55 and scv = 6.24). However, this linear regression model fails to resolve tm predictions of Arc A-mutants in external prediction series. Therefore, the use of nonlinear piecewise models was required. The tm values of A-mutants in the training (R = 0.94) and test (R = 0.91) sets are calculated by the piecewise model with a high degree of precision. A break-point value of 51.32 °C characterizes two mutant clusters and coincides perfectly with the experimental scale.
For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permit the interpretation of the driving forces of such a folding process. The models include protein quadratic indices accounting for hydrophobic (z1), bulk-steric (z2), and electronic (z3) features of the studied molecules. The preponderance of z1 and z3 over z2 indicates the higher importance of the hydrophobic and electronic side-chain terms in the folding of the Arc dimer. In this sense, the developed equations involve short-reaching (k ≤ 3), middle-reaching (3 < k ≤ 7), and far-reaching (k ≥ 8) z1, z2, and z3 protein quadratic indices. This situation points to control of the stability profile of wild-type Arc and its A-mutants by topologic/topographic interactions of the protein backbone. Consequently, the present approach represents a novel and very promising route for mathematical research in the biological sciences.
Affiliation(s)
- Yovani Marrero Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara 54830, Villa Clara, Cuba.
|
50
|
Miao J, Klein-Seetharaman J, Meirovitch H. The Optimal Fraction of Hydrophobic Residues Required to Ensure Protein Collapse. J Mol Biol 2004; 344:797-811. [PMID: 15533446 DOI: 10.1016/j.jmb.2004.09.061] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 07/30/2004] [Revised: 09/14/2004] [Accepted: 09/21/2004] [Indexed: 11/30/2022]
Abstract
The hydrophobic interaction is the main driving force for protein folding. Here, we ask what optimal fraction, f, of hydrophobic (H) residues is required to ensure protein collapse. For very small f (say f < 0.1), the protein chain is expected to behave as a random coil, where the H residues are "wrapped" locally by polar (P) residues. However, for large enough f this local coverage cannot be achieved, and the thermodynamic alternative to avoid contact with water is burying the H residues in the interior of a compact chain structure. The interior also contains P residues that are known to be clustered to optimize their electrostatic interactions. This means that the H residues are clustered as well, i.e., they effectively attract each other like the H-monomers in Dill's HP lattice model. Previously, we asked the question: assuming that the H monomers in the HP model are distributed randomly along the chain, what fraction of them is required to ensure a compact ground state? We claimed there that f ≈ p(c), where p(c) is the site percolation threshold of the lattice (in a percolation experiment, each site of an initially empty lattice is visited and a particle is placed there with a probability p; the interest is in the critical (minimal) value, p(c), for which percolation occurs, i.e., a cluster connecting the opposite sides of the lattice is created). Due to the above correspondence between the HP model and real proteins (and assuming that the H residues are distributed at random), we suggest that the experimental f should lead to percolating clusters of H residues over the highly dense protein core, i.e., clusters of the core size. To check this theory, we treat a simplified model consisting of H and P residues represented by their alpha-carbon atoms only. The structure is defined by the Cα-Cα virtual bond lengths, angles, and dihedral angles, and the X-ray structure is best-fitted onto a face-centered cubic lattice.
Percolation experiments are carried out for 103 single-chain proteins using six different hydrophobic sets of residues. Indeed, on average, percolating clusters are generated, which supports our theory; however, some sets lead to a better core coverage than others. We also calculate the largest actual hydrophobic cluster of each protein and show that, on average, these clusters span the core, again in accord with our theory. We discuss the effect of protein size, deviations from the average picture, and implications of this study for defining reliable simplified models of proteins.
Affiliation(s)
- Jiangbo Miao
- Carnegie Mellon University School of Computer Science, Language Technologies Institute, Pittsburgh, PA 15213, USA
|