1
Sapozhnikov Y, Patel JS, Ytreberg FM, Miller CR. Statistical modeling to quantify the uncertainty of FoldX-predicted protein folding and binding stability. BMC Bioinformatics 2023; 24:426. [PMID: 37953256 PMCID: PMC10642056 DOI: 10.1186/s12859-023-05537-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 10/17/2023] [Indexed: 11/14/2023] Open
Abstract
BACKGROUND Computational methods of predicting protein stability changes upon missense mutations are invaluable tools in high-throughput studies involving a large number of protein variants. However, they are limited by a wide variation in accuracy and difficulty of assessing prediction uncertainty. Using a popular computational tool, FoldX, we develop a statistical framework that quantifies the uncertainty of predicted changes in protein stability. RESULTS We show that multiple linear regression models can be used to quantify the uncertainty associated with FoldX prediction for individual mutations. Comparing the performance among models with varying degrees of complexity, we find that the model precision improves significantly when we utilize molecular dynamics simulation as part of the FoldX workflow. Based on the model that incorporates information from molecular dynamics, biochemical properties, as well as FoldX energy terms, we can generally expect upper bounds on the uncertainty of folding stability predictions of ± 2.9 kcal/mol and ± 3.5 kcal/mol for binding stability predictions. The uncertainty for individual mutations varies; our model estimates it using FoldX energy terms, biochemical properties of the mutated residue, as well as the variability among snapshots from molecular dynamics simulation. CONCLUSIONS Using a linear regression framework, we construct models to predict the uncertainty associated with FoldX prediction of stability changes upon mutation. This technique is straightforward and can be extended to other computational methods as well.
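As a rough illustration of the regression idea in this abstract, the sketch below fits an ordinary least-squares model and reports the half-width of an approximate 95% prediction interval for one new mutation. The abstract does not give the model specification, so everything here is an assumption: the two generic predictors, the synthetic data, the function names `fit_ols` and `prediction_half_width`, and the normal-approximation multiplier `z = 1.96`. It is not the authors' implementation.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept (hypothetical helper)."""
    X1 = np.column_stack([np.ones(len(X)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares coefficients
    resid = y - X1 @ beta
    dof = X1.shape[0] - X1.shape[1]                 # residual degrees of freedom
    sigma2 = float(resid @ resid) / dof             # residual variance estimate
    xtx_inv = np.linalg.inv(X1.T @ X1)
    return beta, sigma2, xtx_inv

def prediction_half_width(x_new, sigma2, xtx_inv, z=1.96):
    """Half-width of an approximate 95% prediction interval at x_new."""
    x1 = np.concatenate(([1.0], np.atleast_1d(x_new).astype(float)))
    return float(z * np.sqrt(sigma2 * (1.0 + x1 @ xtx_inv @ x1)))

# Synthetic stand-in data (assumption): two predictors, e.g. an energy term
# and a snapshot-to-snapshot spread, with a response in kcal/mol.
rng = np.random.default_rng(0)
n = 200
features = rng.normal(size=(n, 2))
response = (0.5 + 1.2 * features[:, 0] - 0.3 * features[:, 1]
            + rng.normal(scale=1.0, size=n))

beta, sigma2, xtx_inv = fit_ols(features, response)
hw_typical = prediction_half_width([0.0, 0.0], sigma2, xtx_inv)
hw_extreme = prediction_half_width([4.0, -4.0], sigma2, xtx_inv)
print(f"half-width near the data centre: {hw_typical:.2f} kcal/mol")
print(f"half-width far from the data:    {hw_extreme:.2f} kcal/mol")
```

The interval widens for predictor values far from the training data, which is one way a per-mutation uncertainty can differ from a single global error bound.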
Affiliation(s)
- Yesol Sapozhnikov
  - Program in Bioinformatics and Computational Biology, University of Idaho, Moscow, ID, 83844, USA
- Jagdish Suresh Patel
  - Department of Chemical and Biological Engineering, University of Idaho, Moscow, ID, 83844, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, 83844, USA
- F Marty Ytreberg
  - Department of Physics, University of Idaho, Moscow, ID, 83844, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, 83844, USA
- Craig R Miller
  - Department of Biological Sciences, University of Idaho, Moscow, ID, 83844, USA
  - Institute for Modeling Collaboration and Innovation, University of Idaho, Moscow, ID, 83844, USA

2
Biswas G, Mukherjee D, Dutta N, Ghosh P, Basu S. EnCPdock: a web-interface for direct conjoint comparative analyses of complementarity and binding energetics in inter-protein associations. J Mol Model 2023; 29:239. [PMID: 37423912 DOI: 10.1007/s00894-023-05626-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 06/20/2023] [Indexed: 07/11/2023]
Abstract
CONTEXT Protein-protein interaction (PPI) is a key component linked to virtually all cellular processes. Be it an enzyme catalysis ('classic type functions' of proteins) or a signal transduction ('non-classic'), proteins generally function involving stable or quasi-stable multi-protein associations. The physical basis for such associations is inherent in the combined effect of shape and electrostatic complementarities (Sc, EC) of the interacting protein partners at their interface, which provides indirect probabilistic estimates of the stability and affinity of the interaction. While Sc is a necessary criterion for inter-protein associations, EC can be favorable as well as disfavored (e.g., in transient interactions). Estimating equilibrium thermodynamic parameters (∆Gbinding, Kd) by experimental means is costly and time consuming, thereby opening windows for computational structural interventions. Attempts to empirically probe ∆Gbinding from coarse-grain structural descriptors (primarily, surface area based terms) have lately been overtaken by physics-based, knowledge-based and their hybrid approaches (MM/PBSA, FoldX, etc.) that directly compute ∆Gbinding without involving intermediate structural descriptors. METHODS Here, we present EnCPdock ( https://www.scinetmol.in/EnCPdock/ ), a user-friendly web-interface for the direct conjoint comparative analyses of complementarity and binding energetics in proteins. EnCPdock returns an AI-predicted ∆Gbinding computed by combining complementarity (Sc, EC) and other high-level structural descriptors (input feature vectors), and renders a prediction accuracy comparable to the state-of-the-art. EnCPdock further locates a PPI complex in terms of its {Sc, EC} values (taken as an ordered pair) in the two-dimensional complementarity plot (CP). In addition, it also generates mobile molecular graphics of the interfacial atomic contact network for further analyses. 
EnCPdock also furnishes individual feature trends along with the relative probability estimates (Prfmax) of the obtained feature-scores with respect to the events of their highest observed frequencies. Together, these functionalities are of real practical use for structural tinkering and intervention as might be relevant in the design of targeted protein-interfaces. Combining all its features and applications, EnCPdock presents a unique online tool that should be beneficial to structural biologists and researchers across related fraternities.
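The two thermodynamic quantities named in this abstract, ∆Gbinding and Kd, are linked by the standard relation ∆G = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The short sketch below converts between them; the function names and the default temperature of 298.15 K are illustrative assumptions and are not part of EnCPdock.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)

def kd_from_delta_g(delta_g_kcal, temperature_k=298.15):
    """Dissociation constant (mol/L) from a binding free energy in kcal/mol."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))

# A 1 nM binder corresponds to roughly -12.3 kcal/mol at 298.15 K.
dg_nanomolar = delta_g_from_kd(1e-9)
print(f"Kd = 1 nM  ->  dG = {dg_nanomolar:.2f} kcal/mol")
```

Because the relation is logarithmic, an error of about 1.4 kcal/mol in ∆Gbinding at room temperature corresponds to roughly a tenfold error in Kd.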
Affiliation(s)
- Gargi Biswas
  - Department of Chemistry and Structural Biology, Weizmann Institute of Science, 7610001, Rehovot, Israel
- Debasish Mukherjee
  - Institute of Molecular Biology gGmbH (IMB), Ackermannweg 4, 55128, Mainz, Germany
- Nalok Dutta
  - Dept of Biochemical Engineering, Faculty of Engineering Science, University College London, London, WC1E 6BT, UK
- Prithwi Ghosh
  - Department of Botany, Narajole Raj College, Vidyasagar University, Midnapore, 721211, India
- Sankar Basu
  - Department of Microbiology, Asutosh College (affiliated with University of Calcutta), 92, Shyama Prasad Mukherjee Rd, Bhowanipore, 700026, Kolkata, India

3
Elmaidomy AH, Mohamed EM, Aly HF, Younis EA, Shams SGE, Altemani FH, Alzubaidi MA, Almaghrabi M, Harbi AA, Alsenani F, Sayed AM, Abdelmohsen UR. Anti-Inflammatory and Antioxidant Properties of Malapterurus electricus Skin Fish Methanolic Extract in Arthritic Rats: Therapeutic and Protective Effects. Mar Drugs 2022; 20:639. [PMID: 36286462 PMCID: PMC9604635 DOI: 10.3390/md20100639] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 10/05/2022] [Accepted: 10/12/2022] [Indexed: 11/04/2022] Open
Abstract
The protective and therapeutic anti-inflammatory and antioxidant potency of Malapterurus electricus (F. Malapteruridae) skin fish methanolic extract (FE) (300 mg/kg b.wt/day for 7 days, orally) was tested in monosodium urate (MSU)-induced arthritic Wistar albino male rats' joints. Serum uric acid, TNF-α, IL-1β, NF-κB, MDA, GSH, catalase, SOD, and glutathione reductase levels were all measured. According to the findings, FE significantly reduced uric acid levels and ankle swelling in both protective and therapeutic groups. Furthermore, it has anti-inflammatory effects by downregulating inflammatory cytokines, primarily through decreased oxidative stress and increased antioxidant status. All the aforementioned lesions were significantly improved in protected and treated rats with FE, according to histopathological findings. iNOS immunostaining revealed that protected and treated arthritic rats with FE had weak positive immune-reactive cells. Phytochemical analysis revealed that FE was high in fatty and amino acids. The most abundant compounds were vaccenic (24.52%), 9-octadecenoic (11.66%), palmitic (34.66%), stearic acids (14.63%), glycine (0.813 mg/100 mg), and alanine (1.645 mg/100 mg). Extensive molecular modelling and dynamics simulation experiments revealed that compound 4 has the potential to target and inhibit COX isoforms with a higher affinity for COX-2. As a result, we contend that FE could be a promising protective and therapeutic option for arthritis, aiding in the prevention and progression of this chronic inflammatory disease.
Affiliation(s)
- Abeer H. Elmaidomy
  - Department of Pharmacognosy, Faculty of Pharmacy, Beni-Suef University, Beni-Suef 62511, Egypt
- Esraa M. Mohamed
  - Department of Pharmacognosy, Faculty of Pharmacy, MUST, Giza 12566, Egypt
- Hanan F. Aly
  - Department of Therapeutic Chemistry, Pharmaceutical and Drug Industries Research Institute, National Research Centre, El Bouhouth St., Dokki, Giza 12622, Egypt
- Eman A. Younis
  - Department of Therapeutic Chemistry, Pharmaceutical and Drug Industries Research Institute, National Research Centre, El Bouhouth St., Dokki, Giza 12622, Egypt
- Shams Gamal Eldin Shams
  - Department of Therapeutic Chemistry, Pharmaceutical and Drug Industries Research Institute, National Research Centre, El Bouhouth St., Dokki, Giza 12622, Egypt
- Faisal H. Altemani
  - Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, University of Tabuk, Tabuk 71491, Saudi Arabia
- Mubarak A. Alzubaidi
  - Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Mohammed Almaghrabi
  - Pharmacognosy and Pharmaceutical Chemistry Department, Faculty of Pharmacy, Taibah University, Al Madinah Al Munawarah 42353, Saudi Arabia
- Adnan Al Harbi
  - Clinical Pharmacy Department, College of Pharmacy, Umm Al-Qura University, Makkah 21955, Saudi Arabia
- Faisal Alsenani
  - Department of Pharmacognosy, College of Pharmacy, Umm Al-Qura University, Makkah 21955, Saudi Arabia
- Ahmed M. Sayed
  - Department of Pharmacognosy, Faculty of Pharmacy, Nahda University, Beni-Suef 62513, Egypt
- Usama Ramadan Abdelmohsen
  - Department of Pharmacognosy, Faculty of Pharmacy, Minia University, Minia 61519, Egypt
  - Department of Pharmacognosy, Faculty of Pharmacy, Deraya University, 7 Universities Zone, New Minia 61111, Egypt

4
Bheemireddy S, Srinivasan N. Computational Study on the Dynamics of Mycobacterium Tuberculosis RNA Polymerase Assembly. Methods Mol Biol 2022; 2516:61-79. [PMID: 35922622 DOI: 10.1007/978-1-0716-2413-5_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Gene regulation is an intricate phenomenon involving precise function of many macromolecular complexes. Molecular basis of this phenomenon is highly complex and cannot be fully understood using a single technique. Computational approaches can play a crucial role in overall understanding of functional and mechanistic features of a protein or an assembly. Large amounts of structural data pertaining to these complexes are publicly available. In this project, we took advantage of the availability of the structural information to unravel functional intricacies of Mycobacterium tuberculosis RNA polymerase upon interaction with RbpA. In this article, we discuss how the knowledge on protein structure and dynamics can be exploited to study function using various computational tools and resources. Overall, this article provides an overview of various computational methods which can be efficiently used to understand the role of any protein. We hope especially the nonexperts in the field could benefit from our article.
Affiliation(s)
- Sneha Bheemireddy
  - Molecular Biophysics Unit, Indian Institute of Science, Bengaluru, Karnataka, India

5
Planas-Iglesias J, Marques SM, Pinto GP, Musil M, Stourac J, Damborsky J, Bednar D. Computational design of enzymes for biotechnological applications. Biotechnol Adv 2021; 47:107696. [PMID: 33513434 DOI: 10.1016/j.biotechadv.2021.107696] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 10/07/2020] [Revised: 01/12/2021] [Accepted: 01/13/2021] [Indexed: 12/14/2022]
Abstract
Enzymes are the natural catalysts that execute biochemical reactions upholding life. Their natural effectiveness has been fine-tuned as a result of millions of years of natural evolution. Such catalytic effectiveness has prompted the use of biocatalysts from multiple sources on different applications, including the industrial production of goods (food and beverages, detergents, textile, and pharmaceutics), environmental protection, and biomedical applications. Natural enzymes often need to be improved by protein engineering to optimize their function in non-native environments. Recent technological advances have greatly facilitated this process by providing the experimental approaches of directed evolution or by enabling computer-assisted applications. Directed evolution mimics the natural selection process in a highly accelerated fashion at the expense of arduous laboratory work and economic resources. Theoretical methods provide predictions and represent an attractive complement to such experiments by waiving their inherent costs. Computational techniques can be used to engineer enzymatic reactivity, substrate specificity and ligand binding, access pathways and ligand transport, and global properties like protein stability, solubility, and flexibility. Theoretical approaches can also identify hotspots on the protein sequence for mutagenesis and predict suitable alternatives for selected positions with expected outcomes. This review covers the latest advances in computational methods for enzyme engineering and presents many successful case studies.
Affiliation(s)
- Joan Planas-Iglesias
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Sérgio M Marques
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Gaspar P Pinto
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Milos Musil
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic; IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, 61266 Brno, Czech Republic
- Jan Stourac
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- David Bednar
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5/A13, 625 00 Brno, Czech Republic

6
Fattahi M, Bushehri A, Alavi A, Asghariazar V, Nozari A, Ghasemi Firouzabadi S, Motamedian Dehkordi P, Javid M, Farajzadeh Valiliou S, Karimian J, Behjati F. Bi-allelic Mutations in ALDH5A1 is associated with succinic semialdehyde dehydrogenase deficiency and severe intellectual disability. Gene 2020:144918. [PMID: 32621952 DOI: 10.1016/j.gene.2020.144918] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/28/2020] [Accepted: 06/20/2020] [Indexed: 11/24/2022]
Abstract
Homozygous mutations of ALDH5A1 have been reportedly associated with Succinic semialdehyde dehydrogenase deficiency (SSADHD) that affects gamma-aminobutyric acid (GABA) catabolism and evinces a wide range of clinical phenotype from mild intellectual disability to severe neurodegenerative disorders. We report clinical and molecular data of a Lor family with 2 affected members presenting with severe intellectual disability, developmental delay, and generalized tonic-clonic seizures. A comprehensive genetic study that included whole-exome sequencing identified a homozygous missense substitution (NM_001080:c.G1321A:p.G441R) in ALDH5A1 (Aldehyde Dehydrogenase 5 Family Member A1) gene, consistent with clinical phenotype in the patients and co-segregating with the disease in the family. The non-synonymous mutation, p.G441R, affects a highly conserved amino acid residue, which is expected to cause a severe destabilization of the enzyme. Protein modeling demonstrated an impairment of the succinic semialdehyde (SSA) binding tunnel accessibility, and the anticipation of the protein folding stability and dynamics was a decrease in the free energy by 4.02 kcal/mol. Consistent with these in silico findings, excessive γ-hydroxybutyrate (GHB) could be detected in patients' urine as the byproduct of the GABA pathway.
Abbreviations: SSADHD, Succinic semialdehyde dehydrogenase deficiency; GABA, gamma-aminobutyric acid; ALDH5A1, Aldehyde Dehydrogenase 5 Family Member A1; GHB, γ-hydroxybutyrate; SSA, succinic semialdehyde; WISC, Wechsler Intelligence Scale for Children; CNS, central nervous system; EEG, electroencephalography; EEEF, empirical effective energy functions; ASD, autism spectrum disorder; ADHD, attention deficit hyperactivity disorder; IQ, intelligence quotient; EMG, electromyography; NCV, nerve conduction velocity; CP, cerebral palsy.
Affiliation(s)
- Mahshid Fattahi
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Ata Bushehri
  - Department of Medical Genetics, Ilam University of Medical Sciences, Pajuhesh street, Ilam, Iran
- Afagh Alavi
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Vahid Asghariazar
  - Department of Medical Genetics, Faculty of Medicine, Tabriz University of Medical Sciences, Tabriz, Iran
- Ahoura Nozari
  - Medical Genetics Lab, Infertility Clinic, Shahrekord University of Medical Sciences, Shahrekord, Iran
- Marzieh Javid
  - Department of Genetics, Faculty of Advanced Sciences & Technology, Pharmaceutical Sciences Branch, Islamic Azad University, Tehran Iran IAUPS
- Javad Karimian
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran
- Farkhondeh Behjati
  - Genetics Research Center, University of Social Welfare and Rehabilitation Sciences, Tehran, Iran

7
Musil M, Konegger H, Hon J, Bednar D, Damborsky J. Computational Design of Stable and Soluble Biocatalysts. ACS Catal 2018. [DOI: 10.1021/acscatal.8b03613] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Indexed: 01/04/2023]
Affiliation(s)
- Milos Musil
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, 612 66 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Hannes Konegger
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Jiri Hon
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - IT4Innovations Centre of Excellence, Faculty of Information Technology, Brno University of Technology, 612 66 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- David Bednar
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Centre for Toxic Compounds in the Environment (RECETOX), and Department of Experimental Biology, Faculty of Science, Masaryk University, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital, Pekarska 53, 656 91 Brno, Czech Republic

8
Menduti G, Biamino E, Vittorini R, Vesco S, Puccinelli MP, Porta F, Capo C, Leo S, Ciminelli BM, Iacovelli F, Spada M, Falconi M, Malaspina P, Rossi L. Succinic semialdehyde dehydrogenase deficiency: The combination of a novel ALDH5A1 gene mutation and a missense SNP strongly affects SSADH enzyme activity and stability. Mol Genet Metab 2018; 124:210-215. [PMID: 29895405 DOI: 10.1016/j.ymgme.2018.05.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/05/2018] [Revised: 05/28/2018] [Accepted: 05/30/2018] [Indexed: 02/06/2023]
Abstract
Succinic semialdehyde dehydrogenase deficiency (SSADHD) is a rare autosomal recessive metabolic disorder of GABA catabolism. SSADH is a mitochondrial homotetrameric enzyme encoded by ALDH5A1 gene. We report the molecular characterization of ALDH5A1 gene in an Italian SSADHD patient, showing heterozygosity for four missense mutations: c.526G>A (p.G176R), c.538C>T (p.H180Y), c.709G>T (p.A237S) and c.1267A>T (p.T423S), the latter never described so far. The patient inherited c.526A in cis with c.538T from the mother and c.709T in cis with c.1267T from the father. To explore the effects of the two allelic arrangements on SSADH activity and protein level, wild type, single or double mutated cDNA constructs were expressed in a cell system. The p.G176R change, alone or in combination with p.H180Y, causes the abolishment of enzyme activity. Western blot analysis showed a strongly reduced amount of the p.176R-p.180Y double mutant protein, suggesting increased degradation. Indeed, in silico analyses confirmed high instability of this mutant homotetramer. Enzyme activity relative to the other p.423S-p.237S double mutant is around 30% of wt. Further in silico analyses on all the possible combinations of mutant monomers suggest the lowest stability for the tetramer constituted by p.176R-p.180Y monomers and the highest stability for that constituted by p.237S-p.423S monomers. The present study shows that when a common SNP, associated with a slight reduction of SSADH activity, is inherited in cis with a mutation showing no consequences on the enzyme function, the activity is strongly affected. In conclusion, the peculiar arrangement of four missense mutations occurring in this patient is responsible for the SSADHD phenotype.
Affiliation(s)
- Elisa Biamino
  - Department of Pediatrics, University of Turin, Italy
- Roberta Vittorini
  - Department of Pediatric Neurology, Regina Margherita Children Hospital, University of Turin, Italy
- Serena Vesco
  - Department of Pediatric Neurology, Regina Margherita Children Hospital, University of Turin, Italy
- Maria Paola Puccinelli
  - Department of Laboratory Medicine, Azienda Ospedaliera Città della Salute e della Scienza, Turin, Italy
- Concetta Capo
  - Department of Biology, University of Rome Tor Vergata, Italy
- Sara Leo
  - Department of Biology, University of Rome Tor Vergata, Italy
- Marco Spada
  - Department of Pediatrics, University of Turin, Italy
- Mattia Falconi
  - Department of Biology, University of Rome Tor Vergata, Italy
- Luisa Rossi
  - Department of Biology, University of Rome Tor Vergata, Italy

9
Leo S, Capo C, Ciminelli BM, Iacovelli F, Menduti G, Funghini S, Donati MA, Falconi M, Rossi L, Malaspina P. SSADH deficiency in an Italian family: a novel ALDH5A1 gene mutation affecting the succinic semialdehyde substrate binding site. Metab Brain Dis 2017; 32:1383-1388. [PMID: 28664505 DOI: 10.1007/s11011-017-0058-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/11/2017] [Accepted: 06/20/2017] [Indexed: 12/20/2022]
Abstract
SSADH deficiency (SSADHD) is a rare autosomal recessively inherited metabolic disorder. It is associated with mutations of ALDH5A1 gene, coding for the homotetrameric enzyme SSADH. This enzyme is involved in γ-aminobutyric acid (GABA) catabolism, since it oxidizes succinic semialdehyde (SSA) to succinate. Mutations in ALDH5A1 gene result in the abnormal accumulation of γ-hydroxybutyrate (GHB), which is pathognomonic of SSADHD. In the present report, diagnosis of SSADHD in a three-month-old female was achieved by detection of high levels of GHB in urine. Sequence analysis of ALDH5A1 gene showed that the patient was a compound heterozygote for c.1226G>A (p.G409D) and the novel missense mutation, c.1498G>C (p.V500L). By ALDH5A1 gene expression in transiently transfected HEK293 cells and enzyme activity assays, we demonstrate that the p.V500L mutation, despite being conservative, produces complete loss of enzyme activity. In silico protein modelling analysis and evaluation of tetramer destabilizing energies suggest that structural impairment and partial occlusion of the access channel to the active site affect enzyme activity. These findings add further knowledge on the missense mutations associated with SSADHD and the molecular mechanisms underlying the loss of the enzyme activity.
Affiliation(s)
- Sara Leo
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Concetta Capo
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Bianca Maria Ciminelli
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Federico Iacovelli
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Giovanna Menduti
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Silvia Funghini
  - Newborn Screening Biochemistry and Pharmacology Laboratory, A. Meyer Children's Hospital, Florence, Italy
- Mattia Falconi
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Luisa Rossi
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy
- Patrizia Malaspina
  - Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica, snc, 00133, Rome, Italy

10
Gaillard T, Simonson T. Full Protein Sequence Redesign with an MMGBSA Energy Function. J Chem Theory Comput 2017; 13:4932-4943. [DOI: 10.1021/acs.jctc.7b00202] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Affiliation(s)
- Thomas Gaillard
  - Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France
- Thomas Simonson
  - Laboratoire de Biochimie (CNRS UMR7654), Department of Biology, Ecole Polytechnique, 91128 Palaiseau, France

11
Molecular interactions of CPC, CPB, CTAB, and EPC biosurfactants in aqueous olive oil mixtures analyzed with physicochemical data and SEM micrographs. ARAB J CHEM 2014. [DOI: 10.1016/j.arabjc.2010.12.034] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022] Open

12
Mutation induced structural variation in membrane proteins. Chem Res Chin Univ 2013. [DOI: 10.1007/s40242-013-2427-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]

13
Steinkruger JD, Bartlett GJ, Hadley EB, Fay L, Woolfson DN, Gellman SH. The d'--d--d' vertical triad is less discriminating than the a'--a--a' vertical triad in the antiparallel coiled-coil dimer motif. J Am Chem Soc 2012; 134:2626-33. [PMID: 22296518 DOI: 10.1021/ja208855x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022]
Abstract
Elucidating relationships between the amino-acid sequences of proteins and their three-dimensional structures, and uncovering non-covalent interactions that underlie polypeptide folding, are major goals in protein science. One approach toward these goals is to study interactions between selected residues, or among constellations of residues, in small folding motifs. The α-helical coiled coil has served as a platform for such studies because this folding unit is relatively simple in terms of both sequence and structure. Amino acid side chains at the helix-helix interface of a coiled coil participate in so-called "knobs-into-holes" (KIH) packing whereby a side chain (the knob) on one helix inserts into a space (the hole) generated by four side chains on a partner helix. The vast majority of sequence-stability studies on coiled-coil dimers have focused on lateral interactions within these KIH arrangements, for example, between an a position on one helix and an a' position of the partner in a parallel coiled-coil dimer, or between a--d' pairs in an antiparallel dimer. More recently, it has been shown that vertical triads (specifically, a'--a--a' triads) in antiparallel dimers exert a significant impact on pairing preferences. This observation provides impetus for analysis of other complex networks of side-chain interactions at the helix-helix interface. Here, we describe a combination of experimental and bioinformatics studies that show that d'--d--d' triads have much less impact on pairing preference than do a'--a--a' triads in a small, designed antiparallel coiled-coil dimer. However, the influence of the d'--d--d' triad depends on the lateral a'--d interaction. Taken together, these results strengthen the emerging understanding that simple pairwise interactions are not sufficient to describe side-chain interactions and overall stability in antiparallel coiled-coil dimers; higher-order interactions must be considered as well.
Affiliation(s)
- Jay D Steinkruger
- Department of Chemistry, University of Wisconsin, Madison, Wisconsin 53706, USA
14
Marino SM, Gladyshev VN. Redox biology: computational approaches to the investigation of functional cysteine residues. Antioxid Redox Signal 2011; 15:135-46. [PMID: 20812876 PMCID: PMC3110093 DOI: 10.1089/ars.2010.3561] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 08/07/2010] [Revised: 08/19/2010] [Accepted: 09/02/2010] [Indexed: 12/18/2022]
Abstract
Cysteine (Cys) residues serve many functions, such as catalysis, stabilization of protein structure through disulfides, metal binding, and regulation of protein function. Cys residues are also subject to numerous post-translational modifications. In recent years, various computational tools aiming at classifying and predicting different functional categories of Cys have been developed, particularly for structural and catalytic Cys. On the other hand, given the complexity of the subject, bioinformatics approaches have been less successful for the investigation of regulatory Cys sites. In this review, we introduce different functional categories of Cys residues. For each category, an overview of state-of-the-art bioinformatics methods and tools is provided, along with examples of successful applications and potential limitations associated with each approach. Finally, we discuss Cys-based redox switches, which modify the view of distinct functional categories of Cys in proteins.
Affiliation(s)
- Stefano M Marino
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
15
Johnston MA, Søndergaard CR, Nielsen JE. Integrated prediction of the effect of mutations on multiple protein characteristics. Proteins 2011; 79:165-78. [PMID: 21058401 DOI: 10.1002/prot.22870] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/10/2022]
Abstract
Site-directed mutagenesis is routinely used in modern biology to elucidate the functional or biophysical roles of protein residues, and plays an important role in the field of rational protein design. Over the past decade, a number of computational tools have been developed that can predict the effect of point mutations on a protein's biophysical characteristics. However, these programs usually provide predictions for only a single characteristic. Furthermore, online versions of these tools are often impractical to use for examination of large and diverse sets of mutants. We have created a new web application, PEAT-SA (http://enzyme.ucd.ie/PEAT_SA), that can simultaneously predict the effect of mutations on stability, ligand affinity and pKa values. PEAT-SA also provides an expanded feature set with respect to other online tools, which includes the ability to obtain predictions for multiple mutants in one submission. As a result, researchers who use site-directed mutagenesis can access state-of-the-art protein design methods with a fraction of the effort previously required. The results of benchmarking PEAT-SA on standard test sets demonstrate that its accuracy for all three prediction types compares well to currently available tools. We illustrate PEAT-SA's potential by using it to investigate the influence of mutations on the activity of Subtilisin BPN'. This example demonstrates how the ability to obtain a wide range of information from one source, which can be combined to obtain deeper insight into the influence of mutations, makes PEAT-SA a valuable service to both experimental and computational biologists.
Affiliation(s)
- Michael A Johnston
- School of Biomolecular and Biomedical Science, Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Belfield, Dublin 4, Ireland
16
Ackerman SH, Gatti DL. The contribution of coevolving residues to the stability of KDO8P synthase. PLoS One 2011; 6:e17459. [PMID: 21408011 PMCID: PMC3052366 DOI: 10.1371/journal.pone.0017459] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 10/06/2010] [Accepted: 02/03/2011] [Indexed: 12/03/2022] Open
Abstract
Background The evolutionary tree of 3-deoxy-D-manno-octulosonate 8-phosphate (KDO8P) synthase (KDO8PS), a bacterial enzyme that catalyzes a key step in the biosynthesis of bacterial endotoxin, is evenly divided between metal and non-metal forms, both having similar structures, but diverging in various degrees in amino acid sequence. Mutagenesis, crystallographic and computational studies have established that only a few residues determine whether or not KDO8PS requires a metal for function. The remaining divergence in the amino acid sequence of KDO8PSs is apparently unrelated to the underlying catalytic mechanism. Methodology/Principal Findings The multiple alignment of all known KDO8PS sequences reveals that several residue pairs coevolved, an indication of their possible linkage to a structural constraint. In this study we investigated by computational means the contribution of coevolving residues to the stability of KDO8PS. We found that about 1/4 of all strongly coevolving pairs probably originated from cycles of mutation (decreasing stability) and suppression (restoring it), while the remaining pairs are best explained by a succession of neutral or nearly neutral covarions. Conclusions/Significance Both sequence conservation and coevolution are involved in the preservation of the core structure of KDO8PS, but the contribution of coevolving residues is, in proportion, smaller. This is because small stability gains or losses associated with selection of certain residues in some regions of the stability landscape of KDO8PS are easily offset by a large number of possible changes in other regions. While this effect increases the tolerance of KDO8PS to deleterious mutations, it also decreases the probability that specific pairs of residues could have a strong contribution to the thermodynamic stability of the protein.
Affiliation(s)
- Sharon H. Ackerman
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan, United States of America
- Domenico L. Gatti
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan, United States of America
- Cardiovascular Research Institute, Wayne State University School of Medicine, Detroit, Michigan, United States of America
17
van der Sloot AM, Quax WJ. Computational design of TNF ligand-based protein therapeutics. Adv Exp Med Biol 2011; 691:521-34. [PMID: 21153357 DOI: 10.1007/978-1-4419-6612-4_54] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/28/2023]
Affiliation(s)
- Almer M van der Sloot
- EMBL-CRG Systems Biology Program, Design of Biological Systems, Centre de Regulació Genòmica, Dr Aiguader 88, 08003, Barcelona, Spain
18
Experimental library screening demonstrates the successful application of computational protein design to large structural ensembles. Proc Natl Acad Sci U S A 2010; 107:19838-43. [PMID: 21045132 DOI: 10.1073/pnas.1012985107] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.9] [Indexed: 11/18/2022] Open
Abstract
The stability, activity, and solubility of a protein sequence are determined by a delicate balance of molecular interactions in a variety of conformational states. Even so, most computational protein design methods model sequences in the context of a single native conformation. Simulations that model the native state as an ensemble have been mostly neglected due to the lack of sufficiently powerful optimization algorithms for multistate design. Here, we have applied our multistate design algorithm to study the potential utility of various forms of input structural data for design. To facilitate a more thorough analysis, we developed new methods for the design and high-throughput stability determination of combinatorial mutation libraries based on protein design calculations. The application of these methods to the core design of a small model system produced many variants with improved thermodynamic stability and showed that multistate design methods can be readily applied to large structural ensembles. We found that exhaustive screening of our designed libraries helped to clarify several sources of simulation error that would have otherwise been difficult to ascertain. Interestingly, the lack of correlation between our simulated and experimentally measured stability values shows clearly that a design procedure need not reproduce experimental data exactly to achieve success. This surprising result suggests potentially fruitful directions for the improvement of computational protein design technology.
19
Lassila JK. Conformational diversity and computational enzyme design. Curr Opin Chem Biol 2010; 14:676-82. [PMID: 20829099 PMCID: PMC2953567 DOI: 10.1016/j.cbpa.2010.08.010] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 06/04/2010] [Revised: 08/06/2010] [Accepted: 08/06/2010] [Indexed: 11/22/2022]
Abstract
The application of computational protein design methods to the design of enzyme active sites offers potential routes to new catalysts and new reaction specificities. Computational design methods have typically treated the protein backbone as a rigid structure for the sake of computational tractability. However, this fixed-backbone approximation introduces its own special challenges for enzyme design and it contrasts with an emerging picture of natural enzymes as dynamic ensembles with multiple conformations and motions throughout a reaction cycle. This review considers the impact of conformational variation and dynamics on computational enzyme design and it highlights new approaches to addressing protein conformational diversity in enzyme design including recent advances in multi-state design, backbone flexibility, and computational library design.
Affiliation(s)
- Jonathan K Lassila
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA.
20
Roy L, Case MA. Protein Core Packing by Dynamic Combinatorial Chemistry. J Am Chem Soc 2010; 132:8894-6. [DOI: 10.1021/ja1029717] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- Liton Roy
- Department of Chemistry, The University of Vermont, Burlington, Vermont 05405
- Martin A. Case
- Department of Chemistry, The University of Vermont, Burlington, Vermont 05405
21
Yamashiro K, Yokobori SI, Koikeda S, Yamagishi A. Improvement of Bacillus circulans beta-amylase activity attained using the ancestral mutation method. Protein Eng Des Sel 2010; 23:519-28. [PMID: 20406825 DOI: 10.1093/protein/gzq021] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open
Abstract
Thermostabilization of enzymes is one of the greatest challenges of protein engineering. The ancestral mutation method, which introduces ancestral residues into a target enzyme, has previously been developed and used to improve the thermostabilities of thermophilic enzymes. Herein, we report a study that used the ancestral mutation method to improve the thermostability of Bacillus circulans beta-amylase, a mesophilic enzyme. A bacterial, common-ancestral beta-amylase sequence was inferred using a phylogenetic tree composed of higher plant and bacterial amylase sequences. Eighteen mutants containing ancestral residues were designed, expressed in Escherichia coli and purified. Several of these mutants were more thermostable than the wild-type amylase. Notably, one mutant had both greater activity and greater thermostability. The relationship between the extent to which the amino acid residues within 5 Å of the mutation site were evolutionarily conserved and the extent to which thermostability was improved was examined. Apparently, it is necessary to conserve the residues surrounding an ancestral residue if thermostability is to be improved by the ancestral mutation method.
Affiliation(s)
- Kan Yamashiro
- Department of Frontier Research, Amano Enzyme Inc., 1-6, Technoplaza, Kakamigahara-Shi, Gifu 509-0109, Japan
22
Bonnard C, Kleinman CL, Rodrigue N, Lartillot N. Fast optimization of statistical potentials for structurally constrained phylogenetic models. BMC Evol Biol 2009; 9:227. [PMID: 19740424 PMCID: PMC2754480 DOI: 10.1186/1471-2148-9-227] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/09/2009] [Accepted: 09/09/2009] [Indexed: 11/16/2022] Open
Abstract
Background Statistical approaches for protein design are relevant in the field of molecular evolutionary studies. In recent years, new, so-called structurally constrained (SC) models of protein-coding sequence evolution have been proposed, which use statistical potentials to assess sequence-structure compatibility. In a previous work, we defined a statistical framework for optimizing knowledge-based potentials especially suited to SC models. Our method used the maximum likelihood principle and provided what we call the joint potentials. However, the method required numerical estimations by the use of computationally heavy Markov Chain Monte Carlo sampling algorithms. Results Here, we develop an alternative optimization procedure, based on a leave-one-out argument coupled to fast gradient descent algorithms. We assess that the leave-one-out potential yields very similar results to the joint approach developed previously, both in terms of the resulting potential parameters, and by Bayes factor evaluation in a phylogenetic context. On the other hand, the leave-one-out approach results in a considerable computational benefit (up to a 1,000 fold decrease in computational time for the optimization procedure). Conclusion Due to its computational speed, the optimization method we propose offers an attractive alternative for the design and empirical evaluation of alternative forms of potentials, using large data sets and high-dimensional parameterizations.
Affiliation(s)
- Cécile Bonnard
- Département d'Informatique, LIRMM, 161 rue Ada, 34392 Montpellier Cedex 5, France.
23
Suárez M, Jaramillo A. Challenges in the computational design of proteins. J R Soc Interface 2009; 6 Suppl 4:S477-91. [PMID: 19324680 PMCID: PMC2843960 DOI: 10.1098/rsif.2008.0508.focus] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 11/30/2008] [Accepted: 02/04/2009] [Indexed: 11/12/2022] Open
Abstract
Protein design has many applications not only in biotechnology but also in basic science. It uses our current knowledge in structural biology to predict, by computer simulations, an amino acid sequence that would produce a protein with targeted properties. As in other examples of synthetic biology, this approach allows the testing of many hypotheses in biology. The recent development of automated computational methods to design proteins has enabled proteins to be designed that are very different from any known ones. Moreover, some of those methods mostly rely on a physical description of atomic interactions, which allows the designed sequences not to be biased towards known proteins. In this paper, we will describe the use of energy functions in computational protein design, the use of atomic models to evaluate the free energy in the unfolded and folded states, the exploration and optimization of amino acid sequences, the problem of negative design and the design of biomolecular function. We will also consider its use together with the experimental techniques such as directed evolution. We will end by discussing the challenges ahead in computational protein design and some of their future applications.
Affiliation(s)
- María Suárez
- Laboratoire de Biochimie, Ecole Polytechnique, CNRS, 91128 Palaiseau Cedex, France
- Epigenomics Project, Genopole, Université d'Evry Val d'Essonne-Genopole-CNRS, Tour Evry2, Etage 10, Terrasses de l'Agora, 91034 Evry Cedex, France
- Alfonso Jaramillo
- Laboratoire de Biochimie, Ecole Polytechnique, CNRS, 91128 Palaiseau Cedex, France
- Epigenomics Project, Genopole, Université d'Evry Val d'Essonne-Genopole-CNRS, Tour Evry2, Etage 10, Terrasses de l'Agora, 91034 Evry Cedex, France
24
Cohen M, Potapov V, Schreiber G. Four distances between pairs of amino acids provide a precise description of their interaction. PLoS Comput Biol 2009; 5:e1000470. [PMID: 19680437 PMCID: PMC2715887 DOI: 10.1371/journal.pcbi.1000470] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 05/06/2009] [Accepted: 07/15/2009] [Indexed: 11/18/2022] Open
Abstract
The three-dimensional structures of proteins are stabilized by the interactions between amino acid residues. Here we report a method where four distances are calculated between any two side chains to provide an exact spatial definition of their bonds. The data were binned into a four-dimensional grid and compared to a random model, from which the preference for specific four-distances was calculated. A clear relation between the quality of the experimental data and the tightness of the distance distribution was observed, with crystal structure data providing far tighter distance distributions than NMR data. Since the four-distance data have higher information content than classical bond descriptions, we were able to identify many unique inter-residue features not found previously in proteins. For example, we found that the side chains of Arg, Glu, Val and Leu are not symmetrical with respect to the interactions of their head groups. The described method may be developed into a function that accurately models protein structures computationally.
Affiliation(s)
- Mati Cohen
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Vladimir Potapov
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Gideon Schreiber
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
25
Backbone flexibility in computational protein design. Curr Opin Biotechnol 2009; 20:420-8. [DOI: 10.1016/j.copbio.2009.07.006] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.8] [Received: 05/14/2009] [Revised: 07/17/2009] [Accepted: 07/25/2009] [Indexed: 11/22/2022]
26
Van der Sloot AM, Kiel C, Serrano L, Stricher F. Protein design in biological networks: from manipulating the input to modifying the output. Protein Eng Des Sel 2009; 22:537-42. [DOI: 10.1093/protein/gzp032] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/13/2022] Open
27
Leemhuis H, Nightingale KP, Hollfelder F. Directed evolution of a histone acetyltransferase--enhancing thermostability, whilst maintaining catalytic activity and substrate specificity. FEBS J 2008; 275:5635-47. [PMID: 18959749 DOI: 10.1111/j.1742-4658.2008.06689.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023]
Abstract
Histone acetylation plays an integral role in the epigenetic regulation of gene expression. Transcriptional activity reflects the recruitment of opposing classes of enzymes to promoter elements: histone acetyltransferases (EC 2.3.1.48) that deposit acetyl marks at a subset of histone residues and histone deacetylases that remove them. Many histone acetyltransferases are difficult to study in solution because of their limited stability once purified. We have developed a directed evolution protocol that allows the screening of hundreds of histone acetyltransferase mutants for histone acetylating activity, and used this to enhance the thermostability of the human P/CAF histone acetyltransferase. Two rounds of directed evolution significantly stabilized the enzyme without lowering its catalytic efficiency and substrate specificity. Twenty-four variants with higher thermostability were identified. Detailed analysis revealed twelve single amino acid mutants that were found to possess a higher thermostability. The residues affected are scattered over the entire protein structure, and are different from mutations predicted by sequence alignment approaches, suggesting that sequence comparison and directed evolution methods are complementary strategies in engineering increased protein thermostability. The stabilizing mutations are predominantly located at the surface of the enzyme, suggesting that the protein's surface is important for stability. The directed evolution approach described in the present study is easily adapted to other histone modifying enzymes, requiring only appropriate peptide substrates and antibodies, which are available from commercial suppliers.
Affiliation(s)
- Hans Leemhuis
- Department of Biochemistry, University of Cambridge, UK
28
Kang S, Chen G, Xiao G. Robust prediction of mutation-induced protein stability change by property encoding of amino acids. Protein Eng Des Sel 2008; 22:75-83. [PMID: 19054789 DOI: 10.1093/protein/gzn063] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
Current methods of predicting mutation-induced protein stability change are imprecise. Machine learning methods have been introduced for this prediction recently; however, the available experimental data used for training these predictors are biased. Abundant data are available for several frequently occurring amino acid substitutions, whereas only limited data have been accumulated for some other mutation types. Generally, current statistical models do not account for this bias toward the commoner amino acids during the encoding process and are thus less effective in making predictions on less frequently occurring mutations. In this paper, we propose a method based on support vector machines and property encoding of amino acids. The predictor we constructed outperforms other methods on the same data sets and is more robust with poor training data. The prediction accuracy for mutations with no training data exceeded 80%. This advantage is critical for practical application, where the prediction could be applied to any type of mutation. Further analysis demonstrates that our model relies on biologically significant features to make predictions. To overcome the drawbacks of classifying mutations into 'stabilizing' and 'destabilizing' ones, a three-class classification of mutations was also discussed, where our method obtained an overall accuracy of 79.1%.
Affiliation(s)
- Shuli Kang
- State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan, Hubei 430072, China
29
Fretwell JF, K. Ismail SM, Cummings JM, Selby TL. Characterization of a randomized FRET library for protease specificity determination. Mol Biosyst 2008; 4:862-70. [DOI: 10.1039/b709290c] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]
30
Fernández M, Caballero J, Fernández L, Abreu JI, Garriga M. Protein radial distribution function (P-RDF) and Bayesian-Regularized Genetic Neural Networks for modeling protein conformational stability: Chymotrypsin inhibitor 2 mutants. J Mol Graph Model 2007; 26:748-59. [PMID: 17569565 DOI: 10.1016/j.jmgm.2007.04.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 01/14/2007] [Revised: 04/03/2007] [Accepted: 04/28/2007] [Indexed: 11/30/2022]
Abstract
Development of novel computational approaches for modeling protein properties is a main goal in applied proteomics. In this work, we report the extension of the radial distribution function (RDF) scores formalism to proteins for encoding 3D structural information for modeling purposes. Protein-RDF (P-RDF) scores measure spherical distributions on the protein 3D structure of 48 amino acid/residue properties selected from the AAindex database. P-RDF scores were tested for building predictive models of the change in thermal unfolding Gibbs free energy (ΔΔG) of chymotrypsin inhibitor 2 upon mutation. An ensemble of Bayesian-Regularized Genetic Neural Networks (BRGNNs) yielded an optimum nonlinear model for the conformational stability. The ensemble predictor described about 84% and 70% of the variance of the data in the training and test sets, respectively.
Affiliation(s)
- Michael Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba.
31
Fernández M, Abreu JI, Caballero J, Garriga M, Fernández L. Comparative modeling of the conformational stability of chymotrypsin inhibitor 2 protein mutants using amino acid sequence autocorrelation (AASA) and amino acid 3D autocorrelation (AA3DA) vectors and ensembles of Bayesian-regularized genetic neural networks. Mol Simul 2007. [DOI: 10.1080/08927020701564479] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
32
Fernández M, Caballero J, Fernández L, Abreu JI, Acosta G. Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines. Mol Simul 2007. [DOI: 10.1080/08927020701377070] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
33
Fernández M, Caballero J, Fernández L, Abreu JI, Acosta G. Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines. Proteins 2007; 70:167-75. [PMID: 17654549 DOI: 10.1002/prot.21524] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/06/2022]
Abstract
This work reports a novel 3D pseudo-folding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single protein mutants of 64 proteins, were tested for building a classifier for the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) upon single mutations. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants with accuracies over 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence-based predictor with more than 1000 mutants. Furthermore, the model adequately classified mutations associated with diseases of human prion protein and human transthyretin.
Affiliation(s)
- Michael Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba.
34
Dallüge R, Oschmann J, Birkenmeier O, Lücke C, Lilie H, Rudolph R, Lange C. A tetrapeptide fragment-based design method results in highly stable artificial proteins. Proteins 2007; 68:839-49. [PMID: 17557327 DOI: 10.1002/prot.21493] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 01/10/2023]
Abstract
Computational protein design has progressed rapidly over the last years. A number of design methods have been proposed and tested. In this paper, we report the successful application of a fragment-based method for protein design. The method uses statistical information on tetrapeptide backbone conformations. The previously published artificial fold of TOP 7 (Kuhlman et al., Science, 2003; 302:1364-1368) was chosen as template. A series of polypeptide sequences were created that were predicted to fold into this target structure. Two of the designed proteins, M5 and M7, were expressed and characterized by fluorescence spectroscopy, circular dichroism and NMR. They showed the hallmarks of well-ordered tertiary structure as well as cooperative folding/unfolding transitions. Furthermore, the two novel proteins were found to be highly stable against temperature and denaturant-induced unfolding.
Affiliation(s)
- Roman Dallüge
- Institut für Biotechnologie, Martin-Luther-Universität Halle-Wittenberg, 06099 Halle, Saale, Germany
35
Bueno M, Camacho CJ, Sancho J. SIMPLE estimate of the free energy change due to aliphatic mutations: Superior predictions based on first principles. Proteins 2007; 68:850-62. [PMID: 17523191 DOI: 10.1002/prot.21453] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
The bioinformatics revolution of the last decade has been instrumental in the development of empirical potentials to quantitatively estimate protein interactions for modeling and design. Although computationally efficient, these potentials hide most of the relevant thermodynamics in 5 to 40 parameters that are fitted against a large experimental database. Here, we revisit this longstanding problem and show that a careful consideration of the change in hydrophobicity, electrostatics, and configurational entropy between the folded and unfolded state of aliphatic point mutations predicts 20-30% fewer false positives and yields more accurate predictions than any published empirical energy function. This significant improvement is achieved with essentially no free parameters, validating past theoretical and experimental efforts to understand the thermodynamics of protein folding. Our first-principles analysis strongly suggests that both the solute-solute van der Waals interactions in the folded state and the electrostatic free energy change of exposed aliphatic mutations are almost completely compensated by similar interactions operating in the unfolded ensemble. Not surprisingly, the problem of properly accounting for the solvent contribution to the free energy of polar and charged group mutations, as well as of mutations that disrupt the protein backbone, remains open.
Collapse
Affiliation(s)
- Marta Bueno
- Department of Computational Biology, University of Pittsburgh, Pennsylvania, USA
|
36
|
Huang LT, Gromiha MM, Ho SY. Sequence analysis and rule development of predicting protein stability change upon mutation using decision tree model. J Mol Model 2007; 13:879-90. [PMID: 17394029 DOI: 10.1007/s00894-007-0197-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 11/25/2006] [Accepted: 03/01/2007] [Indexed: 11/26/2022]
Abstract
Understanding the mechanism of protein stability change is one of the most challenging tasks. Recently, the prediction of protein stability change caused by single point mutations has become an interesting topic in molecular biology. However, it is desirable to further acquire knowledge from large databases to provide new insights into their nature. This paper presents an interpretable prediction tree method (named iPTREE-2) that can accurately predict changes of protein stability upon mutations from sequence-based information and analyze sequence characteristics from the viewpoint of composition and order. Because iPTREE-2 is based on a regression tree algorithm, it can find important factors and develop rules for the purpose of data mining. On a dataset of 1859 different single point mutations from the thermodynamic database ProTherm, iPTREE-2 yields a correlation coefficient of 0.70 between predicted and experimental values. In the task of data mining, detailed analysis of sequences reveals the possibility of compositional specificity of residues in different ranges of stability change and implies the existence of certain patterns. In building rules, we found that the mutated residues in the wild-type and mutant proteins play an important role. The present study demonstrates that iPTREE-2 can serve the purpose of predicting protein stability change, especially when one requires more understandable knowledge.
Affiliation(s)
- Liang-Tsung Huang
- Institute of Information Engineering and Computer Science, Feng-Chia University, Taichung, Taiwan
|
37
|
Fernández L, Caballero J, Abreu JI, Fernández M. Amino acid sequence autocorrelation vectors and bayesian-regularized genetic neural networks for modeling protein conformational stability: Gene V protein mutants. Proteins 2007; 67:834-52. [PMID: 17377990 DOI: 10.1002/prot.21349] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Indexed: 11/10/2022]
Abstract
Development of novel computational approaches for modeling protein properties from their primary structure is the main goal in applied proteomics. In this work, we reported the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino acid sequence autocorrelation (AASA) vectors were calculated by measuring the autocorrelations at sequence lags ranging from 1 to 15 on the protein primary structure of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the change in thermal unfolding Gibbs free energy (ΔΔG) of gene V protein upon mutation. To this end, ensembles of Bayesian-regularized genetic neural networks (BRGNNs) were used for obtaining an optimum nonlinear model for the conformational stability. The ensemble predictor described about 88% and 66% of the variance of the data in the training and test sets, respectively. Furthermore, the optimum AASA vector subset not only helped to successfully model unfolding stability but also distributed wild-type and gene V protein mutants well on a stability self-organized map (SOM) when used for unsupervised training of competitive neurons.
Affiliation(s)
- Leyden Fernández
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba
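The descriptor count in the abstract above follows from 48 properties times 15 lags. A minimal Python sketch of that construction, assuming a simple mean-product autocorrelation (the abstract does not give the exact formula) and random placeholder property scales rather than real AAindex entries; the names `autocorrelation` and `aasa_vector` are hypothetical:

```python
import random


def autocorrelation(values, lag):
    """Mean product of property values at positions i and i + lag.

    Assumption: this mean-product form is only one common definition;
    the source abstract does not state which formula was used.
    """
    n = len(values) - lag
    if n <= 0:
        raise ValueError("sequence must be longer than the lag")
    return sum(values[i] * values[i + lag] for i in range(n)) / n


def aasa_vector(sequence, property_tables, max_lag=15):
    """Concatenate autocorrelations for every property and every lag 1..max_lag."""
    descriptors = []
    for table in property_tables:
        values = [table[aa] for aa in sequence]
        for lag in range(1, max_lag + 1):
            descriptors.append(autocorrelation(values, lag))
    return descriptors


AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
rng = random.Random(0)

# 48 placeholder property scales (random numbers, NOT real AAindex data).
toy_properties = [
    {aa: rng.uniform(-1.0, 1.0) for aa in AMINO_ACIDS} for _ in range(48)
]
# A random toy sequence, not a real protein.
toy_sequence = "".join(rng.choice(AMINO_ACIDS) for _ in range(60))

vector = aasa_vector(toy_sequence, toy_properties)
print(len(vector))  # 48 properties x 15 lags = 720
```

In a real workflow the resulting 720-element vector would be the input from which a feature-selection step picks a subset; that step is not sketched here.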
|
38
|
Lassila JK, Privett HK, Allen BD, Mayo SL. Combinatorial methods for small-molecule placement in computational enzyme design. Proc Natl Acad Sci U S A 2006; 103:16710-5. [PMID: 17075051 PMCID: PMC1636520 DOI: 10.1073/pnas.0607691103] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Indexed: 11/18/2022] Open
Abstract
The incorporation of small-molecule transition state structures into protein design calculations poses special challenges because of the need to represent the added translational, rotational, and conformational freedoms within an already difficult optimization problem. Successful approaches to computational enzyme design have focused on catalytic side-chain contacts to guide placement of small molecules in active sites. We describe a process for modeling small molecules in enzyme design calculations that extends previously described methods, allowing favorable small-molecule positions and conformations to be explored simultaneously with sequence optimization. Because all current computational enzyme design methods rely heavily on sampling of possible active site geometries from discrete conformational states, we tested the effects of discretization parameters on calculation results. Rotational and translational step sizes as well as side-chain library types were varied in a series of computational tests designed to identify native-like binding contacts in three natural systems. We find that conformational parameters, especially the type of rotamer library used, significantly affect the ability of design calculations to recover native binding-site geometries. We describe the construction and use of a crystallographic conformer library and find that it more reliably captures active-site geometries than traditional rotamer libraries in the systems tested.
Affiliation(s)
- Stephen L. Mayo
- Division of Chemistry and Chemical Engineering, and
- Division of Biology and Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125
- To whom correspondence should be addressed. E-mail:
|
39
|
Sen TZ, Cheng H, Kloczkowski A, Jernigan RL. A Consensus Data Mining secondary structure prediction by combining GOR V and Fragment Database Mining. Protein Sci 2006; 15:2499-506. [PMID: 17001039 PMCID: PMC2242411 DOI: 10.1110/ps.062125306] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 10/24/2022]
Abstract
The major aim of tertiary structure prediction is to obtain protein models with the highest possible accuracy. Fold recognition, homology modeling, and de novo prediction methods typically use predicted secondary structures as input, and all of these methods may significantly benefit from more accurate secondary structure predictions. Although there are many different secondary structure prediction methods available in the literature, their cross-validated prediction accuracy is generally <80%. In order to increase the prediction accuracy, we developed a novel hybrid algorithm called Consensus Data Mining (CDM) that combines our two previous successful methods: (1) Fragment Database Mining (FDM), which exploits the Protein Data Bank structures, and (2) GOR V, which is based on information theory, Bayesian statistics, and multiple sequence alignments (MSA). In CDM, the target sequence is dissected into smaller fragments that are compared with fragments obtained from related sequences in the PDB. For fragments with a sequence identity above a certain sequence identity threshold, the FDM method is applied for the prediction. The remainder of the fragments are predicted by GOR V. The results of the CDM are provided as a function of the upper sequence identities of aligned fragments and the sequence identity threshold. We observe that the value 50% is the optimum sequence identity threshold, and that the accuracy of the CDM method measured by Q(3) ranges from 67.5% to 93.2%, depending on the availability of known structural fragments with sufficiently high sequence identity. As the Protein Data Bank grows, it is anticipated that this consensus method will improve because it will rely more upon the structural fragments.
Affiliation(s)
- Taner Z Sen
- Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, Iowa 50011-3020, USA.
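The consensus rule described in the abstract above (FDM for fragments above the identity threshold, GOR V for the rest) can be sketched in a few lines of Python. This is only an illustration: `toy_fdm` and `toy_gor` are hypothetical stand-ins that return constant labels, not the real predictors, and the fragment identities are invented inputs.

```python
def consensus_predict(fragments, fdm_predict, gor_predict, threshold=0.50):
    """For each (fragment, identity) pair, use the FDM predictor when the
    identity is above the threshold and the GOR V predictor otherwise."""
    parts = []
    for fragment, identity in fragments:
        predictor = fdm_predict if identity > threshold else gor_predict
        parts.append(predictor(fragment))
    return "".join(parts)


def q3(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is correct."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings differ in length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical stand-ins for the two real methods.
def toy_fdm(fragment):
    return "H" * len(fragment)


def toy_gor(fragment):
    return "C" * len(fragment)


# Invented fragments with invented best sequence identities.
fragments = [("ALKEL", 0.80), ("GSPNG", 0.30)]
prediction = consensus_predict(fragments, toy_fdm, toy_gor)
print(prediction)                    # HHHHHCCCCC
print(q3(prediction, "HHHHECCCCC"))  # 90.0
```

The default threshold of 0.50 mirrors the 50% optimum reported in the abstract; everything else in the example is assumed for illustration.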
|
40
|
Ogata K, Soejima K, Higo J. A Monte Carlo sampling method of amino acid sequences adaptable to given main-chain atoms in the proteins. J Biochem 2006; 140:543-52. [PMID: 16945938 DOI: 10.1093/jb/mvj184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
We have developed a computational method of protein design to detect amino acid sequences that are adaptable to given main-chain coordinates of a protein. In this method, the selection of amino acid types employs a Metropolis Monte Carlo method with a scoring function in conjunction with the approximation of free energies computed from 3D structures. To compute the scoring function, a side-chain prediction using another Metropolis Monte Carlo method was performed to select structurally suitable side-chain conformations from a side-chain library. In total, two layers of Monte Carlo procedures were performed, first to select amino acid types (1st layer Monte Carlo) and then to predict side-chain conformations (2nd layer Monte Carlo). We applied this method to sequence design for the entire sequence on the SH3 domain, Protein G, and BPTI. The predicted sequences were similar to those of the wild-type proteins. We compared the results of the predictions with and without the 2nd layer Monte Carlo method. The results revealed that the two-layer Monte Carlo method produced better sequence similarity to the wild-type proteins than the one-layer method. Finally, we applied this method to neuraminidase of influenza virus. The results were consistent with the sequences identified from the isolated viruses.
Affiliation(s)
- Koji Ogata
- Centre for Computational Biology, The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada
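The nested sampling scheme in the abstract above (an outer Metropolis search over amino acid types, each scored by an inner Metropolis search over side-chain conformations) might look roughly like the following Python sketch. The four-letter alphabet, the three-rotamer library, and `toy_energy` are all hypothetical placeholders; the paper's actual scoring function is not reproduced here.

```python
import math
import random

AMINO_ACIDS = "AVLK"   # toy alphabet (assumption, not the real 20 types)
ROTAMERS = (0, 1, 2)   # toy side-chain library (assumption)


def toy_energy(position, aa, rotamer):
    """Hypothetical score for one residue; lower is better."""
    preferred = AMINO_ACIDS[position % len(AMINO_ACIDS)]
    mismatch = 0.0 if aa == preferred else 1.0
    return mismatch + 0.5 * abs(rotamer - position % 3)


def metropolis_accept(delta, temperature, rng):
    """Standard Metropolis criterion for an energy change delta."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)


def pack_side_chains(sequence, rng, steps=100, temperature=0.5):
    """2nd layer: sample rotamers for a fixed sequence, return the best score."""
    rotamers = [rng.choice(ROTAMERS) for _ in sequence]

    def total():
        return sum(
            toy_energy(i, aa, rot)
            for i, (aa, rot) in enumerate(zip(sequence, rotamers))
        )

    current = best = total()
    for _ in range(steps):
        i = rng.randrange(len(sequence))
        old = rotamers[i]
        rotamers[i] = rng.choice(ROTAMERS)
        new = total()
        if metropolis_accept(new - current, temperature, rng):
            current = new
            best = min(best, current)
        else:
            rotamers[i] = old  # reject the move
    return best


def design_sequence(length, rng, steps=200, temperature=0.5):
    """1st layer: sample amino acid types, scoring each trial by packing."""
    sequence = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current = pack_side_chains(sequence, rng)
    for _ in range(steps):
        i = rng.randrange(length)
        old = sequence[i]
        sequence[i] = rng.choice(AMINO_ACIDS)
        new = pack_side_chains(sequence, rng)
        if metropolis_accept(new - current, temperature, rng):
            current = new
        else:
            sequence[i] = old  # reject the move
    return "".join(sequence), current


designed, score = design_sequence(6, random.Random(0))
print(designed, score)
```

Because the inner packing run is itself stochastic, the score attached to a sequence is noisy; a one-layer variant would replace `pack_side_chains` with a single fixed rotamer assignment.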
|
41
|
Tran HT, Pappu RV. Toward an accurate theoretical framework for describing ensembles for proteins under strongly denaturing conditions. Biophys J 2006; 91:1868-86. [PMID: 16766618 PMCID: PMC1544316 DOI: 10.1529/biophysj.106.086264] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.9] [Received: 03/31/2006] [Accepted: 05/31/2006] [Indexed: 11/18/2022] Open
Abstract
Our focus is on an appropriate theoretical framework for describing highly denatured proteins. In high concentrations of denaturants, proteins behave like polymers in a good solvent and ensembles for denatured proteins can be modeled by ignoring all interactions except excluded volume (EV) effects. To assay conformational preferences of highly denatured proteins, we quantify a variety of properties for EV-limit ensembles of 23 two-state proteins. We find that modeled denatured proteins can be best described as follows. Average shapes are consistent with prolate ellipsoids. Ensembles are characterized by large correlated fluctuations. Sequence-specific conformational preferences are restricted to local length scales that span five to nine residues. Beyond local length scales, chain properties follow well-defined power laws that are expected for generic polymers in the EV limit. The average available volume is filled inefficiently, and cavities of all sizes are found within the interiors of denatured proteins. All properties characterized from simulated ensembles match predictions from rigorous field theories. We use our results to resolve between conflicting proposals for structure in ensembles for highly denatured states.
Affiliation(s)
- Hoang T Tran
- Department of Biomedical Engineering and Center for Computational Biology, Washington University in St. Louis, St. Louis, Missouri 63130-4899, USA
|
42
|
Abstract
Over the past 10 years there has been tremendous success in the area of computational protein design. Protein design software has been used to stabilize proteins, solubilize membrane proteins, design intermolecular interactions, and design new protein structures. A key motivation for these studies is that they test our understanding of protein energetics and structure. De novo design of novel structures is a particularly rigorous test because the protein backbone must be designed in addition to the amino acid side chains. A priori it is not guaranteed that the target backbone is even designable. To address this issue, researchers have developed a variety of methods for generating protein-like scaffolds and for optimizing the protein backbone in conjunction with the amino acid sequence. These protocols have been used to design proteins from scratch and to explore sequence space for naturally occurring protein folds.
Affiliation(s)
- Glenn L Butterfoss
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7260, USA.
|
43
|
Caballero J, Fernández L, Abreu JI, Fernández M. Amino Acid Sequence Autocorrelation Vectors and Ensembles of Bayesian-Regularized Genetic Neural Networks for Prediction of Conformational Stability of Human Lysozyme Mutants. J Chem Inf Model 2006; 46:1255-68. [PMID: 16711745 DOI: 10.1021/ci050507z] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
Abstract
Development of novel computational approaches for modeling protein properties from their primary structure is a main goal in applied proteomics. In this work, we reported the extension of the autocorrelation vector formalism to amino acid sequences for encoding protein structural information for modeling purposes. Amino Acid Sequence Autocorrelation (AASA) vectors were calculated by measuring the autocorrelations at sequence lags ranging from 1 to 15 on the protein primary structure of 48 amino acid/residue properties selected from the AAindex database. A total of 720 AASA descriptors were tested for building predictive models of the thermal unfolding Gibbs free energy change of human lysozyme mutants. To this end, ensembles of Bayesian-Regularized Genetic Neural Networks (BRGNNs) were used for obtaining an optimum nonlinear model for the conformational stability. The ensemble predictor described about 88% and 68% of the variance of the data in the training and test sets, respectively. Furthermore, the optimum AASA vector subset was shown not only to successfully model unfolding thermal stability but also to distribute wild-type and mutant lysozymes on a stability Self-organized Map (SOM) when used for unsupervised training of competitive neurons.
Affiliation(s)
- Julio Caballero
- Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, and Artificial Intelligence Lab, Faculty of Informatics, University of Matanzas, 44740 Matanzas, Cuba
|
44
|
Cheng J, Randall A, Baldi P. Prediction of protein stability changes for single-site mutations using support vector machines. Proteins 2006; 62:1125-32. [PMID: 16372356 DOI: 10.1002/prot.20810] [Citation(s) in RCA: 670] [Impact Index Per Article: 37.2] [Indexed: 12/19/2022]
Abstract
Accurate prediction of protein stability changes resulting from single amino acid mutations is important for understanding protein structures and designing new proteins. We use support vector machines to predict protein stability changes for single amino acid mutations leveraging both sequence and structural information. We evaluate our approach using cross-validation methods on a large dataset of single amino acid mutations. When only the sign of the stability changes is considered, the predictive method achieves 84% accuracy, a significant improvement over previously published results. Moreover, the experimental results show that the prediction accuracy obtained using sequence alone is close to the accuracy obtained using tertiary structure information. Because our method can accurately predict protein stability changes using primary sequence information only, it is applicable to many situations where the tertiary structure is unknown, overcoming a major limitation of previous methods which require tertiary information. The web server for predictions of protein stability changes upon mutations (MUpro), software, and datasets are available at http://www.igb.uci.edu/servers/servers.html.
Affiliation(s)
- Jianlin Cheng
- Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California, Irvine, California 92697-3425, USA
|
45
|
Grigoryan G, Keating AE. Structure-based Prediction of bZIP Partnering Specificity. J Mol Biol 2006; 355:1125-42. [PMID: 16359704 DOI: 10.1016/j.jmb.2005.11.036] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Received: 09/13/2005] [Revised: 11/10/2005] [Accepted: 11/11/2005] [Indexed: 10/25/2022]
Abstract
Predicting protein interaction specificity from sequence is an important goal in computational biology. We present a model for predicting the interaction preferences of coiled-coil peptides derived from bZIP transcription factors that performs very well when tested against experimental protein microarray data. We used only sequence information to build atomic-resolution structures for 1711 dimeric complexes, and evaluated these with a variety of functions based on physics, learned empirical weights or experimental coupling energies. A purely physical model, similar to those used for protein design studies, gave reasonable performance. The results were improved significantly when helix propensities were used in place of a structurally explicit model to represent the unfolded reference state. Further improvement resulted upon accounting for residue-residue interactions in competing states in a generic way. Purely physical structure-based methods had difficulty capturing core interactions accurately, especially those involving polar residues such as asparagine. When these terms were replaced with weights from a machine-learning approach, the resulting model was able to correctly order the stabilities of over 6000 pairs of complexes with greater than 90% accuracy. The final model is physically interpretable, and suggests specific pairs of residues that are important for bZIP interaction specificity. Our results illustrate the power and potential of structural modeling as a method for predicting protein interactions and highlight obstacles that must be overcome to reach quantitative accuracy using a de novo approach. Our method shows unprecedented performance in predicting protein-protein interaction specificity accurately using structural modeling and suggests that predicting coiled-coil interactions generally may be within reach.
|
46
|
Koder RL, Dutton PL. Intelligent design: the de novo engineering of proteins with specified functions. Dalton Trans 2006:3045-51. [PMID: 16786062 DOI: 10.1039/b514972j] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 11/21/2022]
Abstract
One of the principal successes of de novo protein design has been the creation of small, robust protein-cofactor complexes which can serve as simplified models, or maquettes, of more complicated multicofactor protein complexes commonly found in nature. Different maquettes, generated by us and others, recreate a variety of aspects, or functional elements, recognized as parts of natural enzyme function. The current challenge is to both expand the palette of functional elements and combine and/or integrate them in recreating familiar enzyme activities or generating novel catalysis in the simplest protein scaffolds.
Affiliation(s)
- Ronald L Koder
- Johnson Research Foundation and Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA 19104, USA
|
47
|
Vizcarra CL, Mayo SL. Electrostatics in computational protein design. Curr Opin Chem Biol 2005; 9:622-6. [PMID: 16257567 DOI: 10.1016/j.cbpa.2005.10.014] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Received: 09/02/2005] [Accepted: 10/11/2005] [Indexed: 11/18/2022]
Abstract
Catalytic activity and protein-protein recognition have proven to be significant challenges for computational protein design. Electrostatic interactions are crucial for these and other protein functions, and therefore accurate modeling of electrostatics is necessary for successfully advancing protein design into the realm of protein function. This review focuses on recent progress in modeling electrostatic interactions in computational protein design, with particular emphasis on continuum models.
Affiliation(s)
- Christina L Vizcarra
- Division of Chemistry and Chemical Engineering, Division of Biology and Howard Hughes Medical Institute, California Institute of Technology, Pasadena, California 91125, USA
|
48
|
Zhou F, Grigoryan G, Lustig SR, Keating AE, Ceder G, Morgan D. Coarse-graining protein energetics in sequence variables. Phys Rev Lett 2005; 95:148103. [PMID: 16241695 DOI: 10.1103/physrevlett.95.148103] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 03/17/2005] [Indexed: 05/05/2023]
Abstract
We show that cluster expansions (CE), previously used to model solid-state materials with binary or ternary configurational disorder, can be extended to the protein design problem. We present a generalized CE framework, in which properties such as energy can be unambiguously expanded in the amino-acid sequence space. The CE coarse grains over nonsequence degrees of freedom (e.g., side-chain conformations) and thereby simplifies the problem of designing proteins, or predicting the compatibility of a sequence with a given structure, by many orders of magnitude. The CE is physically transparent, and can be evaluated through linear regression on the energies of training sequences. We show, as example, that good prediction accuracy is obtained with up to pairwise interactions for a coiled-coil backbone, and that triplet interactions are important in the energetics of a more globular zinc-finger backbone.
Affiliation(s)
- Fei Zhou
- Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
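The abstract above notes that a cluster expansion can be evaluated through linear regression on the energies of training sequences. A toy Python sketch of that idea follows, under strong simplifying assumptions: each site takes only two states encoded as -1 or +1 (real amino-acid sites have 20), the expansion is truncated at pairs, and the "energies" are generated from made-up weights so the fit can be checked. All function names are hypothetical.

```python
from itertools import combinations, product


def features(spins):
    """Constant, point, and pair cluster functions for a +/-1 'sequence'."""
    pairs = [spins[i] * spins[j] for i, j in combinations(range(len(spins)), 2)]
    return [1.0] + [float(s) for s in spins] + [float(p) for p in pairs]


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * t for r, t in zip(rows, targets)) for a in range(k)]
    return solve(xtx, xty)


# Made-up interaction weights: constant, 3 point terms, 3 pair terms.
true_weights = [-2.0, 0.5, -0.3, 0.1, 0.8, 0.0, -0.4]

# Training set: every +/-1 assignment on three sites.
training = list(product((-1, 1), repeat=3))
rows = [features(s) for s in training]
energies = [sum(w * f for w, f in zip(true_weights, row)) for row in rows]

fitted = fit_least_squares(rows, energies)
print([round(w, 6) for w in fitted])
```

With noise-free toy energies the regression recovers the generating weights; on real data the energies would come from a structure-based calculation and the truncation level (pairs, triplets) would be a modeling choice.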
|
49
|
Schymkowitz J, Borg J, Stricher F, Nys R, Rousseau F, Serrano L. The FoldX web server: an online force field. Nucleic Acids Res 2005; 33:W382-8. [PMID: 15980494 PMCID: PMC1160148 DOI: 10.1093/nar/gki387] [Citation(s) in RCA: 1780] [Impact Index Per Article: 93.7] [Indexed: 12/18/2022] Open
Abstract
FoldX is an empirical force field that was developed for the rapid evaluation of the effect of mutations on the stability, folding and dynamics of proteins and nucleic acids. The core functionality of FoldX, namely the calculation of the free energy of a macromolecule based on its high-resolution 3D structure, is now publicly available through a web server at . The current release allows the calculation of the stability of a protein, calculation of the positions of the protons and the prediction of water bridges, prediction of metal binding sites and the analysis of the free energy of complex formation. Alanine scanning, the systematic truncation of side chains to alanine, is also included. In addition, some reporting functions have been added, and it is now possible to print both the atomic interaction networks that constitute the protein, print the structural and energetic details of the interactions per atom or per residue, as well as generate a general quality report of the pdb structure. This core functionality will be further extended as more FoldX applications are developed.
Affiliation(s)
- Joost Schymkowitz
- Switch Laboratory, Flanders Interuniversity Institute for Biotechnology (VIB), Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel, Belgium.
|
50
|
Schymkowitz JWH, Rousseau F, Martins IC, Ferkinghoff-Borg J, Stricher F, Serrano L. Prediction of water and metal binding sites and their affinities by using the Fold-X force field. Proc Natl Acad Sci U S A 2005; 102:10147-52. [PMID: 16006526 PMCID: PMC1177371 DOI: 10.1073/pnas.0501980102] [Citation(s) in RCA: 280] [Impact Index Per Article: 14.7] [Indexed: 11/18/2022] Open
Abstract
The empirical force field Fold-X was developed previously to allow rapid free energy calculations in proteins. Here, we present an enhanced version of the force field allowing prediction of the position of structural water molecules and metal ions, together called single atom ligands. Fold-X picks up 76% of water molecules found to interact with two or more polar atoms of proteins in high-resolution crystal structures and predicts their position to within 0.8 Å on average. The prediction of metal ion-binding sites has success rates between 90% and 97% depending on the metal, with an overall standard deviation on the position of binding of 0.3-0.6 Å. The following metals were included in the force field: Mg²⁺, Ca²⁺, Zn²⁺, Mn²⁺, and Cu²⁺. As a result, the current version of Fold-X can accurately decorate a protein structure with biologically important ions and water molecules. Additionally, the free energy of binding of Ca²⁺ and Zn²⁺ (i.e., the natural logarithm of the dissociation constant) and its dependence on ionic strength correlate reasonably well with the experimental data available in the literature, allowing one to discriminate between high- and low-affinity binding sites. Importantly, the accuracy of the energy prediction presented here is sufficient to efficiently discriminate between Mg²⁺, Ca²⁺, and Zn²⁺ binding.
Affiliation(s)
- Joost W H Schymkowitz
- European Molecular Biology Laboratory, Meyerhofstrasse 1, Heidelberg D-69117, Germany
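The abstract above ties the binding free energy to the logarithm of the dissociation constant. As a small worked illustration in Python, assuming the standard relation ΔG = RT ln(Kd) with Kd expressed relative to a 1 M standard state and a temperature of 298.15 K (neither value is stated in the abstract; the function name is hypothetical):

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kcal/mol from a dissociation
    constant in mol/L, using dG = R * T * ln(Kd / 1 M)."""
    if kd_molar <= 0:
        raise ValueError("dissociation constant must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


# Invented example affinities, not values from the paper.
for kd in (1e-3, 1e-6, 1e-9):
    print(f"Kd = {kd:.0e} M -> dG = {binding_free_energy(kd):.2f} kcal/mol")
```

Each thousand-fold drop in Kd lowers ΔG by roughly 4.1 kcal/mol at this temperature, which gives a sense of the energy resolution needed to separate high- and low-affinity sites.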
|