1
|
Nand KN, Jordan TB, Yuan X, Basore DA, Zagorevski D, Clarke C, Werner G, Hwang JY, Wang H, Chung JJ, McKenna A, Jarvis MD, Singh G, Bystroff C. Bacterial production of recombinant contraceptive vaccine antigen from CatSper displayed on a human papilloma virus-like particle. Vaccine 2023; 41:6791-6801. [PMID: 37833124 DOI: 10.1016/j.vaccine.2023.09.044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 09/19/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023]
Abstract
CatSper is a voltage-dependent calcium ion channel present in the principal piece of the sperm tail. It plays a crucial role in sperm hyperactivated motility and therefore in fertilization. Extracellular loops of mouse sperm CatSper were used to develop a vaccine to achieve protection from pregnancy. These loops were inserted at one of the three hypervariable regions of the Human Papilloma Virus (HPV) capsid protein (L1). Recombinant vaccines were expressed in E. coli as inclusion bodies (IB), purified, refolded and assembled into virus-like particles (VLP) in vitro, and adsorbed on alum. Four vaccine candidates were tested in BALB/c mice. All the constructs proved immunogenic, and one showed contraceptive efficacy. This recombinant contraceptive vaccine is a non-hormonal intervention and is expected to give long-acting protection from undesired pregnancies.
Affiliation(s)
- K N Nand
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- T B Jordan
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- X Yuan
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- D A Basore
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
  - Department of Health and Natural Science, Mercy University, Dobbs Ferry, NY, United States
- D Zagorevski
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- C Clarke
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- G Werner
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
- J Y Hwang
  - Dept of Cellular and Molecular Physiology, Yale University School of Medicine, New Haven, CT, United States
- H Wang
  - Dept of Cellular and Molecular Physiology, Yale University School of Medicine, New Haven, CT, United States
- J-J Chung
  - Dept of Cellular and Molecular Physiology, Yale University School of Medicine, New Haven, CT, United States
  - Department of Gynecology and Obstetrics, Yale University School of Medicine, New Haven, CT, United States
- A McKenna
  - Bioresearch Core, Rensselaer Polytechnic Institute, Troy, NY, United States
- M D Jarvis
  - Bioresearch Core, Rensselaer Polytechnic Institute, Troy, NY, United States
- G Singh
  - Bioresearch Core, Rensselaer Polytechnic Institute, Troy, NY, United States
- C Bystroff
  - Dept of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY, United States
|
2
|
Rothfuss MT, Becht DC, Zeng B, McClelland LJ, Yates-Hansen C, Bowler BE. High-Accuracy Prediction of Stabilizing Surface Mutations to the Three-Helix Bundle, UBA(1), with EmCAST. J Am Chem Soc 2023; 145:22979-22992. [PMID: 37815921 PMCID: PMC10626973 DOI: 10.1021/jacs.3c04966] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/12/2023]
Abstract
The accurate modeling of energetic contributions to protein structure is a fundamental challenge in computational approaches to protein analysis and design. We describe a general computational method, EmCAST (empirical Cα stabilization), to score and optimize the sequence to the structure in proteins. The method relies on an empirical potential derived from the database of the Cα dihedral angle preferences for all possible four-residue sequences, using the data available in the Protein Data Bank. Our method produces stability predictions that naturally correlate one-to-one with the experimental results for solvent-exposed mutation sites. EmCAST predicted four mutations that increased the stability of a three-helix bundle, UBA(1), from 2.4 to 4.8 kcal/mol by optimizing residues in both helices and turns. For a set of eight variants, the predicted and experimental stabilizations correlate very well (R² = 0.97) with a slope near 1 and with a 0.16 kcal/mol standard error for EmCAST predictions. Tests against literature data for the stability effects of surface-exposed mutations show that EmCAST outperforms the existing stability prediction methods. UBA(1) variants were crystallized to verify and analyze their structures at an atomic resolution. Thermodynamic and kinetic folding experiments were performed to determine the magnitude and mechanism of stabilization. Our method has the potential to enable the rapid, rational optimization of natural proteins, expand the analysis of the sequence/structure relationship, and supplement the existing protein design strategies.
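For orientation, the geometric quantity this abstract refers to can be sketched as follows. This is a minimal illustration and not EmCAST itself: it only computes the virtual dihedral angle defined by four consecutive Cα positions, which is the kind of angle such a potential would tabulate per four-residue sequence. The function names (`dihedral_deg`, `ca_dihedrals`) and the coordinate format (plain 3-tuples) are assumptions introduced here.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees (-180..180) defined by four points.

    Assumes p1 and p2 are distinct and no three consecutive points are
    collinear; degenerate geometry is not handled in this sketch.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def ca_dihedrals(ca_coords):
    """One dihedral per window of four consecutive C-alpha positions."""
    return [dihedral_deg(*ca_coords[i:i + 4])
            for i in range(len(ca_coords) - 3)]
```

A chain of N residues yields N − 3 such angles, one per overlapping four-residue window; how those angles are binned and converted into an energy is specific to the published method and is not reproduced here.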
Affiliation(s)
- Michael T. Rothfuss
  - Department of Chemistry and Biochemistry, University of Montana, Missoula, MT 59812, United States
- Dustin C. Becht
  - Department of Chemistry and Biochemistry, University of Montana, Missoula, MT 59812, United States
- Baisen Zeng
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
- Levi J. McClelland
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
  - Division of Biological Sciences, University of Montana, Missoula, MT 59812, United States
- Cindee Yates-Hansen
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
- Bruce E. Bowler
  - Department of Chemistry and Biochemistry, University of Montana, Missoula, MT 59812, United States
  - Center for Biomolecular Structure and Dynamics, University of Montana, Missoula, MT 59812, United States
|
3
|
Sahu S, Banerjee R, Pal D. Intrinsic proclivity of left-handed conformation in large Nest motif peptides inferred from molecular dynamics. J Biomol Struct Dyn 2023:1-10. [PMID: 37464873 DOI: 10.1080/07391102.2023.2236710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 07/10/2023] [Indexed: 07/20/2023]
Abstract
The 'Nest' motif plays a functional role in proteins owing to its ligand-binding potential aided by geometric concavity. The presence of the less favored left-handed conformation (L-state) in its structure makes this concavity possible and helps shape a native chemical environment amenable to stable binding interactions. To understand the persistent appearance of L-state torsion in the Nest motif, we analyzed 0.5 μs Molecular Dynamics (MD) simulation trajectories of 35 six-residue peptides (out of a total of 50 large Nest sequences of ≥6 residues) identified in our previous study. Analysis of the MD trajectories of the individual peptides reveals that the initial L-state in 60% of the peptides persists for >40% of the trajectory. Further, Nests with different sequences appear to adopt a specific conformational state driven by the neighboring L-state residues. The sequences also possess short secondary structures and amino acid repeats, suggesting evolutionary conservation and a specific role of amino acids in locally predisposing the torsion angle to the L-state. These findings help us to understand how L-state conformation is an essential prerequisite in stabilizing the Nest motif and shed light on the sequence-structure-function paradigm in the rational design of peptides and peptidomimetics for therapeutics. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Subhankar Sahu
  - Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, Haringhata, West Bengal, India
- Raja Banerjee
  - Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, Haringhata, West Bengal, India
- Debnath Pal
  - Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, Karnataka, India
|
4
|
Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:biom12091246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022]
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein–ligand binding, including allosteric effects, protein–protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
|
5
|
Modeling of protein conformational changes with Rosetta guided by limited experimental data. Structure 2022; 30:1157-1168.e3. [PMID: 35597243 PMCID: PMC9357069 DOI: 10.1016/j.str.2022.04.013] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 02/17/2022] [Revised: 04/08/2022] [Accepted: 04/25/2022] [Indexed: 11/24/2022]
Abstract
Conformational changes are an essential component of functional cycles of many proteins, but their characterization often requires an integrative structural biology approach. Here, we introduce and benchmark ConfChangeMover (CCM), a new method built into the widely used macromolecular modeling suite Rosetta that is tailored to model conformational changes in proteins using sparse experimental data. CCM can rotate and translate secondary structural elements and modify their backbone dihedral angles in regions of interest. We benchmarked CCM on soluble and membrane proteins with simulated Cα-Cα distance restraints and sparse experimental double electron-electron resonance (DEER) restraints, respectively. In both benchmarks, CCM outperformed state-of-the-art Rosetta methods, showing that it can model a diverse array of conformational changes. In addition, the Rosetta framework allows a wide variety of experimental data to be integrated with CCM, thus extending its capability beyond DEER restraints. This method will contribute to the biophysical characterization of protein dynamics.
|
6
|
Zhong W, Gu F. Predicting Local Protein 3D Structures Using Clustering Deep Recurrent Neural Network. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:593-604. [PMID: 32750880 DOI: 10.1109/tcbb.2020.3005972] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/11/2023]
Abstract
Since protein 3D structure prediction is very important for biochemical study and drug design, researchers have developed many machine learning algorithms to predict protein 3D structures using sequence information only. Understanding the sequence-to-structure relationship is key to successful structure prediction. Previous approaches, including the single shallow learning model, the single deep learning model and clustering algorithms, all have shortcomings in capturing the precise sequence-to-structure relationship. In order to further improve the performance of local protein structure prediction, a novel deep learning model called Clustering Recurrent Neural Network (CRNN) is proposed. In this model, the whole protein dataset is divided into multiple cluster subtrees. An RNN is trained for each cluster in the subtrees so that each RNN can be used to learn the computationally simpler local sequence-to-structure relationship instead of attempting to capture the global sequence-to-structure relationship. After learning the local sequence-to-structure relationship using RNNs, CRNN is designed to predict distance matrices, torsion angles and secondary structures for backbone α-carbon atoms of protein sequence segments. Our experimental analysis indicates that 3D structure prediction accuracy is comparable to or better than that of other state-of-the-art approaches.
|
7
|
Konagurthu AS, Subramanian R, Allison L, Abramson D, Stuckey PJ, Garcia de la Banda M, Lesk AM. Universal Architectural Concepts Underlying Protein Folding Patterns. Front Mol Biosci 2021; 7:612920. [PMID: 33996891 PMCID: PMC8120156 DOI: 10.3389/fmolb.2020.612920] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/01/2020] [Accepted: 12/16/2020] [Indexed: 11/17/2022]
Abstract
What is the architectural “basis set” of the observed universe of protein structures? Using information-theoretic inference, we answer this question with a dictionary of 1,493 substructures, called concepts, typically at a subdomain level, based on an unbiased subset of known protein structures. Each concept represents a topologically conserved assembly of helices and strands that make contact. Any protein structure can be dissected into instances of concepts from this dictionary. We dissected the Protein Data Bank and completely inventoried all the concept instances. This yields many insights, including correlations between concepts and catalytic activities or binding sites, useful for rational drug design; local amino-acid sequence–structure correlations, useful for ab initio structure prediction methods; and information supporting the recognition and exploration of evolutionary relationships, useful for structural studies. An interactive site, Proçodic, at http://lcb.infotech.monash.edu.au/prosodic, provides access to and navigation of the entire dictionary of concepts and their usages, and all associated information. This report is part of a continuing programme with the goal of elucidating fundamental principles of protein architecture, in the spirit of the work of Cyrus Chothia.
Affiliation(s)
- Arun S Konagurthu
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Ramanan Subramanian
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Lloyd Allison
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- David Abramson
  - Research Computing Center, University of Queensland, Brisbane, QLD, Australia
- Peter J Stuckey
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
  - School of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia
- Maria Garcia de la Banda
  - Department of Data Science and Artificial Intelligence, Faculty of Information Technology, Monash University, Clayton, VIC, Australia
- Arthur M Lesk
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States
  - MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
|
8
|
Lapenta F, Jerala R. Design of novel protein building modules and modular architectures. Curr Opin Struct Biol 2020; 63:90-96. [PMID: 32505942 DOI: 10.1016/j.sbi.2020.04.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/19/2019] [Revised: 04/14/2020] [Accepted: 04/15/2020] [Indexed: 12/31/2022]
Abstract
Nature uses only a limited number of protein topologies and while several folds have evolved independently over time, there are clearly many possible topologies that have not been explored by evolution. With recent advances of protein design concepts, computational modeling tools, high resolution and high-throughput experimental methods it is now possible to design new protein architectures. The collection of building blocks and design principles widened both in size and complexity, offering an expanded toolset for building new modular folds and functional protein structures. Here we review and discuss recent achievements of protein design, focusing in particular on the use and prospects of modular approaches for assembling new protein folds.
Affiliation(s)
- Fabio Lapenta
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Slovenia
- Roman Jerala
  - Department of Synthetic Biology and Immunology, National Institute of Chemistry, Ljubljana, Slovenia
  - EN-FIST Centre of Excellence, Ljubljana, Slovenia
|
9
|
Torshin IY, Batyanovskii AV, Uroshlev LA, Tumanyan VG, Volotovskii ID, Esipova NG. The Conformational Stability/Lability of Peptide Fragments in the Sequence Context of Amino Acids. Biophysics (Nagoya-shi) 2019. [DOI: 10.1134/s0006350919020180] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
10
|
Physics-Based Modeling of Side Chain–Side Chain Interactions in the UNRES Force Field. SPRINGER SERIES ON BIO- AND NEUROSYSTEMS 2019. [DOI: 10.1007/978-3-319-95843-9_4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
|
11
|
Affiliation(s)
- Ngoc Hieu Tran
  - David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
- Xianglilan Zhang
  - David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing, P.R. China
- Ming Li
  - David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
|
12
|
Mackenzie CO, Grigoryan G. Protein structural motifs in prediction and design. Curr Opin Struct Biol 2017; 44:161-167. [PMID: 28460216 PMCID: PMC5513761 DOI: 10.1016/j.sbi.2017.03.012] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 11/08/2016] [Revised: 03/18/2017] [Accepted: 03/28/2017] [Indexed: 01/11/2023]
Abstract
The Protein Data Bank (PDB) has been an integral resource for shaping our fundamental understanding of protein structure and for the advancement of such applications as protein design and structure prediction. Over the years, information from the PDB has been used to generate models ranging from specific structural mechanisms to general statistical potentials. With accumulating structural data, it has become possible to mine for more complete and complex structural observations, deducing more accurate generalizations. Motif libraries, which capture recurring structural features along with their sequence preferences, have exposed modularity in the structural universe and found successful application in various problems of structural biology. Here we summarize recent achievements in this arena, focusing on subdomain level structural patterns and their applications to protein design and structure prediction, and suggest promising future directions as the structural database continues to grow.
Affiliation(s)
- Craig O Mackenzie
  - Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH 03755, United States
- Gevorg Grigoryan
  - Institute for Quantitative Biomedical Sciences, Dartmouth College, Hanover, NH 03755, United States
  - Department of Computer Science, Dartmouth College, Hanover, NH 03755, United States
|
13
|
Gong H, Zhang H, Zhu J, Wang C, Sun S, Zheng WM, Bu D. Improving prediction of burial state of residues by exploiting correlation among residues. BMC Bioinformatics 2017; 18:70. [PMID: 28361691 PMCID: PMC5374591 DOI: 10.1186/s12859-017-1475-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/27/2022]
Abstract
Background: Residues in a protein might be buried inside or exposed to the solvent surrounding the protein. The buried residues usually form hydrophobic cores to maintain the structural integrity of proteins while the exposed residues are tightly related to protein functions. Thus, the accurate prediction of solvent accessibility of residues will greatly facilitate our understanding of both structure and functionalities of proteins. Most of the state-of-the-art prediction approaches consider the burial state of each residue independently, thus neglecting the correlations among residues. Results: In this study, we present a high-order conditional random field model that considers burial states of all residues in a protein simultaneously. Our approach exploits not only the correlation among adjacent residues but also the correlation among long-range residues. Experimental results showed that by exploiting the correlation among residues, our approach outperformed the state-of-the-art approaches in prediction accuracy. In-depth case studies also showed that by using the high-order statistical model, the errors committed by the bidirectional recurrent neural network and chain conditional random field models were successfully corrected. Conclusions: Our methods enable the accurate prediction of residue burial states, which should greatly facilitate protein structure prediction and evaluation.
Affiliation(s)
- Hai'e Gong
  - Key Lab of Intelligent Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Haicang Zhang
  - Key Lab of Intelligent Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Jianwei Zhu
  - Key Lab of Intelligent Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Chao Wang
  - Key Lab of Intelligent Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
  - School of Computer Science, University of Chinese Academy of Sciences, Beijing, China
- Shiwei Sun
  - Key Lab of Intelligent Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
- Wei-Mou Zheng
  - Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, 100190, China
- Dongbo Bu
  - Key Lab of Intelligent Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
|
14
|
Abstract
More than two decades of research have enabled dihedral angle predictions at an accuracy that makes them an interesting alternative or supplement to secondary structure prediction, one that provides detailed local structure information for every residue of a protein. The evolution of dihedral angle prediction methods is closely linked to advancements in machine learning and other relevant technologies. Consequently, recent improvements in large-scale training of deep neural networks have led to the best method currently available, which achieves a mean absolute error of 19° for phi and 30° for psi. This performance opens interesting perspectives for the application of dihedral angle prediction in the comparison, prediction, and design of protein structures.
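Because phi and psi are periodic, a mean absolute error of this kind is usually measured on the circle, so that a prediction of 179° against a true value of −179° counts as a 2° error rather than 358°. The abstract does not spell out the convention used, so the sketch below is an assumption about how such a figure could be computed; the function names are hypothetical.

```python
def angular_error(pred_deg, true_deg):
    """Smallest absolute difference between two angles, in degrees (0..180)."""
    diff = (pred_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)

def mean_absolute_angular_error(pred, true):
    """Mean of the wrapped absolute errors over paired angle lists."""
    if len(pred) != len(true) or not pred:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(angular_error(p, t) for p, t in zip(pred, true)) / len(pred)
```

Applied separately to the phi and psi predictions of a test set, this yields one number per angle type, comparable in form to the 19° and 30° quoted in the abstract.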
Affiliation(s)
- Olav Zimmermann
  - Jülich Supercomputing Centre (JSC), Institute for Advanced Simulation (IAS), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany
|
15
|
Zou J, Song B, Simmerling C, Raleigh D. Experimental and Computational Analysis of Protein Stabilization by Gly-to-d-Ala Substitution: A Convolution of Native State and Unfolded State Effects. J Am Chem Soc 2016; 138:15682-15689. [PMID: 27934019 PMCID: PMC5442443 DOI: 10.1021/jacs.6b09511] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 12/14/2022]
Abstract
The rational and predictable enhancement of protein stability is an important goal in protein design. Most efforts target the folded state; however, stability is the free energy difference between the folded and unfolded states, so both are suitable targets. Strategies directed at the unfolded state usually seek to decrease chain entropy by introducing cross-links or by replacing glycines. Cross-linking has led to mixed results. Replacement of glycine with an l-amino acid, while reducing the entropy of the unfolded state, can introduce unfavorable steric interactions in the folded state, since glycine is often found in conformations that require a positive φ angle such as helical C-capping motifs or type I' and II″ β-turns. l-Amino acids are strongly disfavored in these conformations, but d-amino acids are not. However, there are few reported examples and conflicting results have been obtained when glycines are replaced with d-Ala. We critically examine the effect of Gly-to-d-Ala substitutions on protein stability using experimental approaches together with molecular dynamics simulations and free energy calculations. The data, together with a survey of high resolution structures, show that the vast majority of proteins can be stabilized by substitution of C-capping glycines with d-Ala. Sites suitable for substitutions can be identified via sequence alignment with a high degree of success. Steric clashes in the native state due to the new side chain are rarely observed, but are likely responsible for the destabilizing or null effect observed for the small subset of Gly-to-d-Ala substitutions which are not stabilizing. Changes in backbone solvation play less of a role. Favorable candidates for d-Ala substitution can be identified using a rapid algorithm based on molecular mechanics.
Affiliation(s)
- Junjie Zou
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794-3400
- Benben Song
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794-3400
- Carlos Simmerling
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794-3400
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794-3400
- Daniel Raleigh
  - Department of Chemistry, Stony Brook University, Stony Brook, New York 11794-3400
|
16
|
Abstract
Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, more rare geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence, a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure.
|
18
|
Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. CURRENT PROTOCOLS IN BIOINFORMATICS 2016; 54:5.6.1-5.6.37. [PMID: 27322406 PMCID: PMC5031415 DOI: 10.1002/cpbi.3] [Citation(s) in RCA: 1827] [Impact Index Per Article: 228.4] [Indexed: 12/22/2022]
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
- Andrej Sali
- University of California at San Francisco, San Francisco, California
19
A topological and conformational stability alphabet for multipass membrane proteins. Nat Chem Biol 2016; 12:167-73. [PMID: 26780406 DOI: 10.1038/nchembio.2001] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 06/11/2015] [Accepted: 11/13/2015] [Indexed: 12/27/2022]
Abstract
Multipass membrane proteins perform critical signal transduction and transport across membranes. How transmembrane helix (TMH) sequences encode the topology and conformational flexibility regulating these functions remains poorly understood. Here we describe a comprehensive analysis of the sequence-structure relationships at multiple interacting TMHs from all membrane proteins with structures in the Protein Data Bank (PDB). We found that membrane proteins can be deconstructed in interacting TMH trimer units, which mostly fold into six distinct structural classes of topologies and conformations. Each class is enriched in recurrent sequence motifs from functionally unrelated proteins, revealing unforeseen consensus and evolutionary conserved networks of stabilizing interhelical contacts. Interacting TMHs' topology and local protein conformational flexibility were remarkably well predicted in a blinded fashion from the identified binding-hotspot motifs. Our results reveal universal sequence-structure principles governing the complex anatomy and plasticity of multipass membrane proteins that may guide de novo structure prediction, design, and studies of folding and dynamics.
20
Shirke AN, Basore D, Butterfoss GL, Bonneau R, Bystroff C, Gross RA. Toward rational thermostabilization of Aspergillus oryzae cutinase: Insights into catalytic and structural stability. Proteins 2015; 84:60-72. [PMID: 26522152 DOI: 10.1002/prot.24955] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 08/26/2015] [Revised: 10/11/2015] [Accepted: 10/12/2015] [Indexed: 11/10/2022]
Abstract
Cutinases are powerful hydrolases that can cleave ester bonds of polyesters such as poly(ethylene terephthalate) (PET), opening up new options for enzymatic routes for polymer recycling and surface modification reactions. Cutinase from Aspergillus oryzae (AoC) is promising owing to the presence of an extended groove near the catalytic triad which is important for the orientation of polymeric chains. However, the catalytic efficiency of AoC on rigid polymers like PET is limited by its low thermostability; as it is essential to work at or over the glass transition temperature (Tg) of PET, that is, 70 °C. Consequently, in this study we worked toward the thermostabilization of AoC. Use of Rosetta computational protein design software in conjunction with rational design led to a 6 °C improvement in the thermal unfolding temperature (Tm) and a 10-fold increase in the half-life of the enzyme activity at 60 °C. Surprisingly, thermostabilization did not improve the rate or temperature optimum of enzyme activity. Three notable findings are presented as steps toward designing more thermophilic cutinase: (a) surface salt bridge optimization produced enthalpic stabilization, (b) mutations to proline reduced the entropy loss upon folding, and (c) the lack of a correlative increase in the temperature optimum of catalytic activity with thermodynamic stability suggests that the active site is locally denatured at a temperature below the Tm of the global structure.
Affiliation(s)
- Abhijit N Shirke
- Department of Chemistry and Chemical Biology, Rensselaer Polytechnic Institute, Troy, New York; Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York
- Danielle Basore
- Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York; Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, New York
- Glenn L Butterfoss
- Center for Genomics and Systems Biology, New York University Abu Dhabi, Abu Dhabi, UAE
- Richard Bonneau
- Center for Genomics and Systems Biology, New York University, New York
- Christopher Bystroff
- Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York; Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, New York; Department of Computer Science, Rensselaer Polytechnic Institute, Troy, New York
- Richard A Gross
- Department of Chemistry and Chemical Biology, Rensselaer Polytechnic Institute, Troy, New York; Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York
21
Vallat B, Madrid-Aliste C, Fiser A. Modularity of Protein Folds as a Tool for Template-Free Modeling of Structures. PLoS Comput Biol 2015; 11:e1004419. [PMID: 26252221 PMCID: PMC4529212 DOI: 10.1371/journal.pcbi.1004419] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 03/10/2015] [Accepted: 06/30/2015] [Indexed: 12/25/2022]
Abstract
Predicting the three-dimensional structure of proteins from their amino acid sequences remains a challenging problem in molecular biology. While the current structural coverage of proteins is almost exclusively provided by template-based techniques, the modeling of the rest of the protein sequences increasingly requires template-free methods. However, template-free modeling methods are much less reliable and are usually applicable for smaller proteins, leaving much space for improvement. We present here a novel computational method that uses a library of supersecondary structure fragments, known as Smotifs, to model protein structures. The library of Smotifs has saturated over time, providing a theoretical foundation for efficient modeling. The method relies on weak sequence signals from remotely related protein structures to create a library of Smotif fragments specific to the target protein sequence. This Smotif library is exploited in a fragment assembly protocol to sample decoys, which are assessed by a composite scoring function. Since the Smotif fragments are larger in size compared to the ones used in other fragment-based methods, the proposed modeling algorithm, SmotifTF, can employ an exhaustive sampling during decoy assembly. SmotifTF successfully predicts the overall fold of the target proteins in about 50% of the test cases and performs competitively when compared to other state of the art prediction methods, especially when sequence signal to remote homologs is diminishing. Smotif-based modeling is complementary to current prediction methods and provides a promising direction in addressing the structure prediction problem, especially when targeting larger proteins for modeling.
Each protein folds into a unique three-dimensional structure that enables it to carry out its biological function. Knowledge of the atomic details of protein structures is therefore a key to understanding their function. Advances in high throughput experimental technologies have led to an exponential increase in the availability of known protein sequences. Although strong progress has been made in experimental protein structure determination, it remains a fact that more than 99% of structural information is provided by computational modeling methods. We describe here a novel structure prediction method, SmotifTF, which uses a unique library of known protein fragments to assemble the three-dimensional structure of a sequence. The fragment library has saturated over time and therefore provides a complete set of building blocks required for model building. The method performs competitively compared to existing methods of structure prediction.
Affiliation(s)
- Brinda Vallat
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, New York, United States of America
- Carlos Madrid-Aliste
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, New York, United States of America
- Andras Fiser
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, New York, United States of America
22
A study of the influence of charged residues on β-hairpin formation by nuclear magnetic resonance and molecular dynamics. Protein J 2014; 33:525-35. [PMID: 25316116 PMCID: PMC4239826 DOI: 10.1007/s10930-014-9585-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/25/2022]
Abstract
Chain reversals are often nucleation sites in protein folding. The β-hairpins of FBP28 WW domain and IgG are stable and have been proved to initiate the folding and are, therefore, suitable for studying the influence of charged residues on β-hairpin conformation. In this paper, we carried out NMR examination of the conformations in solution of two fragments from the FBP28 protein (PDB code: 1E0L) (N-terminal part), namely KTADGKT-NH2 (1E0L 12–18, D7) and YKTADGKTY-NH2 (1E0L 11–19, D9), one from the B3 domain of the protein G (PDB code: 1IGD), namely DDATKT-NH2 (1IGD 51–56) (Dag1), and three variants of the Dag1 peptide: DVATKT-NH2 (Dag2), OVATKT-NH2 (Dag3) and KVATKT-NH2 (Dag4), respectively, in which the original charged residues were replaced with non-polar residues or modified charged residues. It was found that both the D7 and D9 peptides form a large fraction of bent conformations. However, no hydrophobic contacts between the terminal Tyr residues of D9 occur, which suggests that the presence of a pair of like-charged residues stabilizes chain reversal. Conversely, only the Dag1 and Dag2 peptides exhibit some chain reversal; replacing the second aspartic-acid residue with a valine and the first one with a basic residue results in a nearly extended conformation. These results suggest that basic residues farther away in sequence can result in stabilization of chain reversal owing to screening of the non-polar core. Conversely, a smaller distance in sequence prohibits this screening, while the presence of oppositely charged residues can stabilize a turn because of salt-bridge formation.
23
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
24
Carrascoza F, Zaric S, Silaghi-Dumitrescu R. Computational study of protein secondary structure elements: Ramachandran plots revisited. J Mol Graph Model 2014; 50:125-33. [DOI: 10.1016/j.jmgm.2014.04.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 10/11/2013] [Revised: 04/01/2014] [Accepted: 04/02/2014] [Indexed: 11/28/2022]
25
Joseph AP, de Brevern AG. From local structure to a global framework: recognition of protein folds. J R Soc Interface 2014; 11:20131147. [PMID: 24740960 DOI: 10.1098/rsif.2013.1147] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Abstract
Protein folding has been a major area of research for many years. Nonetheless, the mechanisms leading to the formation of an active biological fold are still not fully apprehended. The huge amount of available sequence and structural information provides hints to identify the putative fold for a given sequence. Indeed, protein structures prefer a limited number of local backbone conformations, some being characterized by preferences for certain amino acids. These preferences largely depend on the local structural environment. The prediction of local backbone conformations has become an important factor to correctly identifying the global protein fold. Here, we review the developments in the field of local structure prediction and especially their implication in protein fold recognition.
Affiliation(s)
- Agnel Praveen Joseph
- Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
26
Schneider B, Černý J, Svozil D, Čech P, Gelly JC, de Brevern AG. Bioinformatic analysis of the protein/DNA interface. Nucleic Acids Res 2014; 42:3381-94. [PMID: 24335080 PMCID: PMC3950675 DOI: 10.1093/nar/gkt1273] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 04/30/2013] [Revised: 11/14/2013] [Accepted: 11/14/2013] [Indexed: 01/04/2023]
Abstract
To investigate the principles driving recognition between proteins and DNA, we analyzed more than a thousand crystal structures of protein/DNA complexes. We classified protein and DNA conformations by structural alphabets, protein blocks [de Brevern, Etchebest and Hazout (2000) (Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks. Prots. Struct. Funct. Genet., 41:271-287)] and dinucleotide conformers [Svozil, Kalina, Omelka and Schneider (2008) (DNA conformations and their sequence preferences. Nucleic Acids Res., 36:3690-3706)], respectively. Assembling the mutually interacting protein blocks and dinucleotide conformers into 'interaction matrices' revealed their correlations and conformer preferences at the interface relative to their occurrence outside the interface. The analyzed data demonstrated important differences between complexes of various types of proteins such as transcription factors and nucleases, distinct interaction patterns for the DNA minor groove relative to the major groove and phosphate, and the importance of water-mediated contacts. Water molecules mediate proportionally the largest number of contacts in the minor groove and form the largest proportion of contacts in complexes of transcription factors. The generally known induction of A-DNA forms by complexation was more accurately attributed to A-like and intermediate A/B conformers rare in naked DNA molecules.
Affiliation(s)
- Bohdan Schneider
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Jiří Černý
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Daniel Svozil
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Petr Čech
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Jean-Christophe Gelly
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Alexandre G. de Brevern
- Institute of Biotechnology AS CR, Videnska 1083, CZ-142 20 Prague, Czech Republic, Laboratory of Informatics and Chemistry, Faculty of Chemical Technology, Institute of Chemical Technology Prague, Technická 5, CZ-166 28 Prague, Czech Republic, INSERM, U665, DSIMB, F-75739 Paris, France, University of Paris Diderot, Sorbonne Paris Cité, UMR_S 665, F-75739 Paris, France, Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France and Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
27
Rosenman DJ, Huang YM, Xia K, Fraser K, Jones VE, Lamberson CM, Van Roey P, Colón W, Bystroff C. Green-lighting green fluorescent protein: faster and more efficient folding by eliminating a cis-trans peptide isomerization event. Protein Sci 2014; 23:400-10. [PMID: 24408076 DOI: 10.1002/pro.2421] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/26/2013] [Revised: 01/02/2014] [Accepted: 01/06/2014] [Indexed: 11/06/2022]
Abstract
Wild-type green fluorescent protein (GFP) folds on a time scale of minutes. The slow step in folding is a cis-trans peptide bond isomerization. The only conserved cis-peptide bond in the native GFP structure, at P89, was remodeled by the insertion of two residues, followed by iterative energy minimization and side chain design. The engineered GFP was synthesized and found to fold faster and more efficiently than its template protein, recovering 50% more of its fluorescence upon refolding. The slow phase of folding is faster and smaller in amplitude, and hysteresis in refolding has been eliminated. The elimination of a previously reported kinetically trapped state in refolding suggests that X-P89 is trans in the trapped state. A 2.55 Å resolution crystal structure revealed that the new variant contains only trans-peptide bonds, as designed. This is the first instance of a computationally remodeled fluorescent protein that folds faster and more efficiently than wild type.
Affiliation(s)
- David J Rosenman
- Rensselaer Polytechnic Institute, Biological Sciences, 110 8th St., Troy, New York, 12180
28
Ma J, Wang S. Algorithms, Applications, and Challenges of Protein Structure Alignment. Advances in Protein Chemistry and Structural Biology 2014; 94:121-75. [DOI: 10.1016/b978-0-12-800168-4.00005-6] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 12/29/2022]
29
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. Reference Module in Chemistry, Molecular Sciences and Chemical Engineering 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
30
Shen Y, Picord G, Guyon F, Tuffery P. Detecting protein candidate fragments using a structural alphabet profile comparison approach. PLoS One 2013; 8:e80493. [PMID: 24303019 PMCID: PMC3841190 DOI: 10.1371/journal.pone.0080493] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/17/2013] [Accepted: 10/03/2013] [Indexed: 01/28/2023]
Abstract
Predicting accurate fragments from sequence has recently become a critical step for protein structure modeling, as protein fragment assembly techniques are presently among the most efficient approaches for de novo prediction. A key step in these approaches is, given the sequence of a protein to model, the identification of relevant fragments - candidate fragments - from a collection of the available 3D structures. These fragments can then be assembled to produce a model of the complete structure of the protein of interest. The search for candidate fragments is classically achieved by considering local sequence similarity using profile comparison, or threading approaches. In the present study, we introduce a new profile comparison approach that, instead of using amino acid profiles, is based on the use of predicted structural alphabet profiles, where structural alphabet profiles contain information related to the 3D local shapes associated with the sequences. We show that structural alphabet profile-profile comparison can be used efficiently to retrieve accurate structural fragments, and we introduce a fully new protocol for the detection of candidate fragments. It identifies fragments specific to each position of the sequence, with sizes varying between 6 and 27 amino acids. We find that it outperforms present state-of-the-art approaches in terms of (i) the accuracy of the fragments identified and (ii) the rate of true positives identified, while having a high coverage score. We illustrate the relevance of the approach on complete target sets of the two previous Critical Assessment of Techniques for Protein Structure Prediction (CASP) rounds 9 and 10. A web server for the approach is freely available at http://bioserv.rpbs.univ-paris-diderot.fr/SAFrag.
Affiliation(s)
- Yimin Shen
- INSERM, U973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, Paris, France
- Géraldine Picord
- INSERM, U973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, Paris, France
- Frédéric Guyon
- INSERM, U973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, Paris, France
- Pierre Tuffery
- INSERM, U973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, Paris, France
- RPBS, Paris, France
31
Kalev I, Habeck M. Confidence-guided local structure prediction with HHfrag. PLoS One 2013; 8:e76512. [PMID: 24146881 PMCID: PMC3797814 DOI: 10.1371/journal.pone.0076512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/04/2013] [Accepted: 08/28/2013] [Indexed: 12/04/2022]
Abstract
We present a method to assess the reliability of local structure prediction from sequence. We introduce a greedy algorithm for filtering and enrichment of dynamic fragment libraries, compiled with remote-homology detection methods such as HHfrag. After filtering false hits at each target position, we reduce the fragment library to a minimal set of representative fragments, which are guaranteed to have correct local structure in regions of detectable conservation. We demonstrate that the location of conserved motifs in a protein sequence can be predicted by examining the recurrence and structural homogeneity of detected fragments. The resulting confidence score correlates with the local RMSD of the representative fragments and allows us to predict torsion angles from sequence with better accuracy compared to existing machine learning methods.
Affiliation(s)
- Ivan Kalev
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Michael Habeck
- Institute for Mathematical Stochastics, Georg-August-University of Göttingen, Göttingen, Germany
32
Soong TT, Hwang MJ, Chen CM. Discovery of Recurrent Structural Motifs for Approximating Three-Dimensional Protein Structures. J Chin Chem Soc-Taip 2013. [DOI: 10.1002/jccs.200400164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
33
Mishra S, Saxena A, Sangwan RS. Fundamentals of Homology Modeling Steps and Comparison among Important Bioinformatics Tools: An Overview. ACTA ACUST UNITED AC 2013. [DOI: 10.17311/sciintl.2013.237.252] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/03/2022]
34
Johansson MU, Zoete V, Guex N. Recurrent structural motifs in non-homologous protein structures. Int J Mol Sci 2013; 14:7795-814. [PMID: 23574940 PMCID: PMC3645717 DOI: 10.3390/ijms14047795] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/08/2013] [Revised: 03/27/2013] [Accepted: 04/01/2013] [Indexed: 11/18/2022]
Abstract
We have extracted an extensive collection of recurrent structural motifs (RSMs), which consist of sequentially non-contiguous structural motifs (4–6 residues), each of which appears with very similar conformation in three or more mutually unrelated protein structures. We find that the proteins in our set are covered to a substantial extent by the recurrent non-contiguous structural motifs, especially the helix and strand regions. Computational alanine scanning calculations indicate that the average folding free energy changes upon alanine mutation for most types of non-alanine residues are higher for amino acids that are present in recurrent structural motifs than for amino acids that are not. The non-alanine amino acids that are most common in the recurrent structural motifs, i.e., phenylalanine, isoleucine, leucine, valine and tyrosine and the less abundant methionine and tryptophan, have the largest folding free energy changes. This indicates that the recurrent structural motifs, as we define them, describe recurrent structural patterns that are important for protein stability. In view of their properties, such structural motifs are potentially useful for inter-residue contact prediction and protein structure refinement.
Affiliation(s)
- Maria U. Johansson
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Authors to whom correspondence should be addressed; Tel.: +41-21-692-40-86 (M.U.J.); +41-21-692-40-37 (N.G.); Fax: +41-21-692-40-65 (M.U.J. & N.G.)
- Vincent Zoete
- Molecular Modelling Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Nicolas Guex
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, CH-1015 Lausanne, Switzerland
- Authors to whom correspondence should be addressed; Tel.: +41-21-692-40-86 (M.U.J.); +41-21-692-40-37 (N.G.); Fax: +41-21-692-40-65 (M.U.J. & N.G.)
35
Rybka K, Toal SE, Verbaro DJ, Mathieu D, Schwalbe H, Schweitzer-Stenner R. Disorder and order in unfolded and disordered peptides and proteins: a view derived from tripeptide conformational analysis. II. Tripeptides with short side chains populating asx and β-type like turn conformations. Proteins 2013; 81:968-83. [PMID: 23229867 DOI: 10.1002/prot.24226] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 10/01/2012] [Revised: 11/07/2012] [Accepted: 11/21/2012] [Indexed: 11/08/2022]
Abstract
In the preceding paper, we found that ensembles of tripeptides with long or bulky chains can include up to 20% of various turns. Here, we determine the structural and thermodynamic characteristics of GxG peptides with short polar and/or ionizable central residues (D, N, C), whose conformational distributions exhibit higher than average percentage (>20%) of turn conformations. To probe the side-chain conformations of these peptides, we determined the ³J(Hα,Hβ) coupling constants and derived the population of three rotamers with χ1 angles of -60°, 180° and 60°, which were correlated with residue propensities by DFT calculations. For protonated GDG, the rotamer distribution provides additional evidence for asx-turns. A comparison of vibrational spectra and NMR coupling constants of protonated GDG, ionized GDG, and the protonated aspartic acid dipeptide revealed that side chain protonation increases the pPII content at the expense of turn populations. The charged terminal groups, however, have negligible influence on the conformational properties of the central residue. Like protonated GDG, cationic GCG samples asx-turns to a significant extent. The temperature dependence of the UVCD spectra and ³J(HN,Hα) constants suggest that the turn populations of GDG and GNG are practically temperature-independent, indicating enthalpic and entropic stabilization. The temperature-independent J-coupling and UVCD spectra of GNG require a three-state model. Our results indicate that short side chains with hydrogen bonding capability in GxG segments of proteins may serve as hinge regions for establishing compact structures of unfolded proteins and peptides.
Collapse
Affiliation(s)
- Karin Rybka
- Center for Biomolecular Magnetic Resonance, Institute of Organic Chemistry and Chemical Biology, Goethe-University Frankfurt, Frankfurt/Main, Germany
|
36
|
Røgen P, Koehl P. Extracting knowledge from protein structure geometry. Proteins 2013; 81:841-51. [PMID: 23280479 DOI: 10.1002/prot.24242] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/29/2012] [Revised: 11/28/2012] [Accepted: 12/08/2012] [Indexed: 11/06/2022]
Abstract
Protein structure prediction techniques proceed in two steps, namely the generation of many structural models for the protein of interest, followed by an evaluation of all these models to identify those that are native-like. In theory, the second step is easy, as native structures correspond to minima of their free energy surfaces. It is well known, however, that the situation is more complicated, as the current force fields used for molecular simulations fail to distinguish native states from misfolded structures. In an attempt to solve this problem, we follow an alternate approach and derive a new potential from geometric knowledge extracted from native and misfolded conformers of protein structures. This new potential, Metric Protein Potential (MPP), has two main features that are key to its success. Firstly, it is composite in that it includes local and nonlocal geometric information on proteins. At the short range level, it captures and quantifies the mapping between the sequences and structures of short (7-mer) fragments of protein backbones through the introduction of a new local energy term. The local energy term is then augmented with a nonlocal residue-based pairwise potential, and a solvent potential. Secondly, it is optimized to yield a maximized correlation between the energy of a structural model and its root mean square (RMS) deviation from the native structure of the corresponding protein. We have shown that MPP yields high correlation values between RMS and energy and that it is able to retrieve the native structure of a protein from a set of high-resolution decoys.
Affiliation(s)
- Peter Røgen
- Department of Mathematics, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.
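The abstract above describes optimizing a potential so that model energy correlates with RMS to the native structure. The quantity being maximized can be illustrated with a plain Pearson correlation over a decoy set. This is only a sketch of that quantity, not the authors' optimization procedure; the function name `pearson` and the decoy values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical decoy set: RMS to native (in angstroms) and model energy
# (arbitrary units). These numbers are illustrative, not from the paper.
decoy_rms = [0.8, 1.5, 2.9, 4.2, 6.0]
decoy_energy = [-120.0, -111.0, -95.0, -90.0, -70.0]
print(pearson(decoy_rms, decoy_energy))
```

A potential that ranks decoys well would give a correlation close to +1 here, meaning that lower energies go with lower RMS.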
|
37
|
Principles for designing ideal protein structures. Nature 2013; 491:222-7. [PMID: 23135467 DOI: 10.1038/nature11600] [Citation(s) in RCA: 406] [Impact Index Per Article: 36.9] [Received: 05/17/2012] [Accepted: 09/19/2012] [Indexed: 02/03/2023]
Abstract
Unlike random heteropolymers, natural proteins fold into unique ordered structures. Understanding how these are encoded in amino-acid sequences is complicated by energetically unfavourable non-ideal features (for example kinked α-helices, bulged β-strands, strained loops and buried polar groups) that arise in proteins from evolutionary selection for biological function or from neutral drift. Here we describe an approach to designing ideal protein structures stabilized by completely consistent local and non-local interactions. The approach is based on a set of rules relating secondary structure patterns to protein tertiary motifs, which make possible the design of funnel-shaped protein folding energy landscapes leading into the target folded state. Guided by these rules, we designed sequences predicted to fold into ideal protein structures consisting of α-helices, β-strands and minimal loops. Designs for five different topologies were found to be monomeric and very stable and to adopt structures in solution nearly identical to the computational models. These results illuminate how the folding funnels of natural proteins arise and provide the foundation for engineering a new generation of functional proteins free from natural evolution.
|
38
|
Zaki MJ, Jin S, Bystroff C. Mining residue contacts in proteins using local structure predictions. IEEE Trans Syst Man Cybern B Cybern 2012; 33:789-801. [PMID: 18238232 DOI: 10.1109/tsmcb.2003.816916] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
Abstract
In this paper we develop data mining techniques to predict 3D contact potentials among protein residues (or amino acids) based on the hierarchical nucleation-propagation model of protein folding. We apply a hybrid approach, using a hidden Markov model to extract folding initiation sites, and then apply association mining to discover contact potentials. The new hybrid approach achieves accuracy results better than those reported previously.
Affiliation(s)
- M J Zaki
- Comput. Sci. Dept., Rensselaer Polytech. Inst., Troy, NY, USA
|
39
|
Duitch L, Toal S, Measey TJ, Schweitzer-Stenner R. Triaspartate: A Model System for Conformationally Flexible DDD Motifs in Proteins. J Phys Chem B 2012; 116:5160-71. [DOI: 10.1021/jp2121565] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/30/2022]
Affiliation(s)
- Laura Duitch
- Department of Chemistry, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
- Siobhan Toal
- Department of Chemistry, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
- Thomas J. Measey
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Reinhard Schweitzer-Stenner
- Department of Chemistry, Drexel University, 3141 Chestnut Street, Philadelphia, Pennsylvania 19104, United States
|
40
|
Joo H, Chavan AG, Phan J, Day R, Tsai J. An amino acid packing code for α-helical structure and protein design. J Mol Biol 2012; 419:234-54. [PMID: 22426125 DOI: 10.1016/j.jmb.2012.03.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 12/29/2011] [Revised: 02/22/2012] [Accepted: 03/07/2012] [Indexed: 11/19/2022]
Abstract
This work demonstrates that all packing in α-helices can be simplified to repetitive patterns of a single motif: the knob-socket. Using the precision of Voronoi Polyhedra/Delaunay Tessellations to identify contacts, the knob-socket is a four-residue tetrahedral motif: a knob residue on one α-helix packs into the three-residue socket on another α-helix. The principle of the knob-socket model relates the packing between levels of protein structure: the intra-helical packing arrangements within secondary structure that permit inter-helix tertiary packing interactions. Within an α-helix, the three-residue sockets arrange residues into a uniform packing lattice. Inter-helix packing results from a definable pattern of interdigitated knob-socket motifs between two α-helices. Furthermore, the knob-socket model classifies three types of sockets: (1) free, favoring only intra-helical packing; (2) filled, favoring inter-helical interactions; and (3) non, disfavoring α-helical structure. The amino acid propensities in these three socket classes essentially represent an amino acid code for structure in α-helical packing. Using this code, we used a novel yet straightforward approach for the design of α-helical structure to validate the knob-socket model. Unique sequences for three peptides were created to produce a predicted amount of α-helical structure: mostly helical, some helical, and no helix. These three peptides were synthesized, and helical content was assessed using CD spectroscopy. The measured α-helicity of each peptide was consistent with the expected predictions. These results and analysis demonstrate that the knob-socket motif functions as the basic unit of packing and presents an intuitive tool to decipher the rules governing packing in protein structure.
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, CA 95211, USA
|
41
|
Shen Y, Bax A. Identification of helix capping and β-turn motifs from NMR chemical shifts. J Biomol NMR 2012; 52:211-32. [PMID: 22314702 PMCID: PMC3357447 DOI: 10.1007/s10858-012-9602-0] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Received: 11/20/2011] [Accepted: 01/02/2012] [Indexed: 05/11/2023]
Abstract
We present an empirical method for identification of distinct structural motifs in proteins on the basis of experimentally determined backbone and ¹³Cβ chemical shifts. Elements identified include the N-terminal and C-terminal helix capping motifs and five types of β-turns: I, II, I', II' and VIII. Using a database of proteins of known structure, the NMR chemical shifts, together with the PDB-extracted amino acid preference of the helix capping and β-turn motifs, are used as input data for training an artificial neural network algorithm, which outputs the statistical probability of finding each motif at any given position in the protein. The trained neural networks, contained in the MICS (motif identification from chemical shifts) program, also provide a confidence level for each of their predictions, and values ranging from ca. 0.7-0.9 for the Matthews correlation coefficient of its predictions far exceed those attainable by sequence analysis. MICS is anticipated to be useful both in the conventional NMR structure determination process and for enhancing on-going efforts to determine protein structures solely on the basis of chemical shift information, where it can aid in identifying protein database fragments suitable for use in building such structures.
Affiliation(s)
- Yang Shen
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA
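The abstract above reports prediction quality as a Matthews correlation coefficient (MCC). For readers unfamiliar with the metric, the standard binary-classification formula can be sketched as follows. The confusion-matrix counts are hypothetical and not taken from the paper, and returning 0 for an empty marginal is a common convention adopted here as an assumption.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention (assumed here): report 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for one motif class, e.g. "residue is in a type I β-turn".
mcc = matthews_corrcoef(tp=80, tn=900, fp=15, fn=20)
print(round(mcc, 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), and it stays informative when the motif class is rare, which is why it suits per-residue motif detection.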
|
42
|
Tomii K, Sawada Y, Honda S. Convergent evolution in structural elements of proteins investigated using cross profile analysis. BMC Bioinformatics 2012; 13:11. [PMID: 22244085 PMCID: PMC3398312 DOI: 10.1186/1471-2105-13-11] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 06/20/2011] [Accepted: 01/16/2012] [Indexed: 11/10/2022]
Abstract
Background: Evolutionary relations of similar segments shared by different protein folds remain controversial, even though many examples of such segments have been found. To date, several methods such as those based on the results of structure comparisons, sequence-based classifications, and sequence-based profile-profile comparisons have been applied to identify such protein segments that possess local similarities in both sequence and structure across protein folds. However, to capture more precise sequence-structure relations, no method reported to date combines structure-based profiles, and sequence-based profiles based on evolutionary information. The former are generally regarded as representing the amino acid preferences at each position of a specific conformation of protein segment. They might reflect the nature of ancient short peptide ancestors, using the results of structural classifications of protein segments. Results: This report describes the development and use of "Cross Profile Analysis" to compare sequence-based profiles and structure-based profiles based on amino acid occurrences at each position within a protein segment cluster. Using systematic cross profile analysis, we found structural clusters of 9-residue and 15-residue segments showing remarkably strong correlation with particular sequence profiles. These correlations reflect structural similarities among constituent segments of both sequence-based and structure-based profiles. We also report previously undetectable sequence-structure patterns that transcend protein family and fold boundaries, and present results of the conformational analysis of the deduced peptide of a segment cluster. These results suggest the existence of ancient short-peptide ancestors. Conclusions: Cross profile analysis reveals the polyphyletic and convergent evolution of β-hairpin-like structures, which were verified both experimentally and computationally. The results presented here give us new insights into the evolution of short protein segments.
|
43
|
Ramaraj T, Angel T, Dratz EA, Jesaitis AJ, Mumey B. Antigen-antibody interface properties: composition, residue interactions, and features of 53 non-redundant structures. Biochim Biophys Acta Proteins Proteom 2012; 1824:520-32. [PMID: 22246133 DOI: 10.1016/j.bbapap.2011.12.007] [Citation(s) in RCA: 115] [Impact Index Per Article: 9.6] [Received: 01/30/2011] [Revised: 12/22/2011] [Accepted: 12/23/2011] [Indexed: 11/17/2022]
Abstract
The structures of protein antigen-antibody (Ag-Ab) interfaces contain information about how Ab recognize Ag as well as how Ag are folded to present surfaces for Ag recognition. As such, the Ab surface holds information about Ag folding that resides with the Ab-Ag interface residues and how they interact. In order to gain insight into the nature of such interactions, a data set comprised of 53 non-redundant 3D structures of Ag-Ab complexes was analyzed. We assessed the physical and biochemical features of the Ag-Ab interfaces and the degree to which favored interactions exist between amino acid residues on the corresponding interface surfaces. Amino acid compositional analysis of the interfaces confirmed the dominance of TYR in the Ab paratope-containing surface (PCS), with almost twofold greater abundance than any other residue. Additionally, TYR had a much higher than expected presence in the PCS compared to the surface of the whole antibody (defined as the occurrence propensity), along with aromatics PHE, TRP, and to a lesser degree HIS and ILE. In the Ag epitope-containing surface (ECS), there were slightly increased occurrence propensities of TRP and TYR relative to the whole Ag surface, implying an increased significance over the compositionally most abundant LYS>ASN>GLU>ASP>ARG. This examination encompasses a large, diverse set of unique Ag-Ab crystal structures that help explain the biological range and specificity of Ag-Ab interactions. This analysis may also provide a measure of the significance of individual amino acid residues in phage display analysis of Ag binding.
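The abstract above defines occurrence propensity only informally, as a residue's presence in the interface compared with the whole protein surface. One plausible reading is a ratio of relative frequencies, sketched below. The exact formula used in the paper is not given here, so the ratio form, the function name `occurrence_propensity`, and the residue lists are all assumptions for illustration.

```python
from collections import Counter


def occurrence_propensity(interface_residues, surface_residues):
    """Ratio of each residue type's frequency in the interface to its
    frequency on the whole surface (assumed definition; values above 1
    indicate enrichment in the interface)."""
    if not interface_residues or not surface_residues:
        raise ValueError("both residue lists must be non-empty")
    interface_counts = Counter(interface_residues)
    surface_counts = Counter(surface_residues)
    n_interface = len(interface_residues)
    n_surface = len(surface_residues)
    propensities = {}
    for residue, count in interface_counts.items():
        if surface_counts[residue] == 0:
            continue  # undefined if the residue never occurs on the surface
        interface_freq = count / n_interface
        surface_freq = surface_counts[residue] / n_surface
        propensities[residue] = interface_freq / surface_freq
    return propensities


# Hypothetical residue lists (three-letter codes), not data from the paper.
interface = ["TYR", "TYR", "SER", "LYS"]
surface = ["TYR", "SER", "SER", "LYS", "LYS", "LYS", "GLU", "ASP"]
for residue, value in sorted(occurrence_propensity(interface, surface).items()):
    print(residue, round(value, 2))
```

Under this reading, a residue such as TYR can be both the most abundant in the paratope and strongly enriched, whereas an abundant surface residue such as LYS can have a propensity below 1.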
|
44
|
Hassan R, Othman RM, Saad P, Kasim S. A compact hybrid feature vector for an accurate secondary structure prediction. Inf Sci (N Y) 2011. [DOI: 10.1016/j.ins.2011.07.019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/30/2022]
|
45
|
Ku SY, Hu YJ. Structural alphabet motif discovery and a structural motif database. Comput Biol Med 2011; 42:93-105. [PMID: 22099701 DOI: 10.1016/j.compbiomed.2011.10.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/01/2010] [Revised: 09/28/2011] [Accepted: 10/27/2011] [Indexed: 10/15/2022]
Abstract
This study proposes a general framework for structural motif discovery. The framework is based on a modular design in which the system components can be modified or replaced independently to increase its applicability to various studies. It is a two-stage approach that first converts protein 3D structures into structural alphabet sequences, and then applies a sequence motif-finding tool to these sequences to detect conserved motifs. We named the structural motif database we built the SA-Motifbase, which provides the structural information conserved at different hierarchical levels in SCOP. For each motif, SA-Motifbase presents its 3D view; alphabet letter preference; alphabet letter frequency distribution; and the significance. SA-Motifbase is available at http://bioinfo.cis.nctu.edu.tw/samotifbase/.
Affiliation(s)
- Shih-Yen Ku
- Department of Computer Science, National Chiao Tung University, 1001 Tashuei Rd., Hsinchu, Taiwan
|
46
|
Handl J, Knowles J, Vernon R, Baker D, Lovell SC. The dual role of fragments in fragment-assembly methods for de novo protein structure prediction. Proteins 2011; 80:490-504. [PMID: 22095594 DOI: 10.1002/prot.23215] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 07/05/2011] [Revised: 08/17/2011] [Accepted: 09/14/2011] [Indexed: 11/07/2022]
Abstract
In fragment-assembly techniques for protein structure prediction, models of protein structure are assembled from fragments of known protein structures. This process is typically guided by a knowledge-based energy function and uses a heuristic optimization method. The fragments play two important roles in this process: they define the set of structural parameters available, and they also assume the role of the main variation operators that are used by the optimiser. Previous analysis has typically focused on the first of these roles. In particular, the relationship between local amino acid sequence and local protein structure has been studied by a range of authors. The correlation between the two has been shown to vary with the window length considered, and the results of these analyses have informed directly the choice of fragment length in state-of-the-art prediction techniques. Here, we focus on the second role of fragments and aim to determine the effect of fragment length from an optimization perspective. We use theoretical analyses to reveal how the size and structure of the search space changes as a function of insertion length. Furthermore, empirical analyses are used to explore additional ways in which the size of the fragment insertion influences the search both in a simulation model and for the fragment-assembly technique, Rosetta.
Affiliation(s)
- Julia Handl
- Manchester Business School, The University of Manchester, United Kingdom.
|
47
|
Abstract
Motivation: Over the last decade, both static and dynamic fragment libraries for protein structure prediction have been introduced. The former are built from clusters in either sequence or structure space and aim to extract a universal structural alphabet. The latter are tailored for a particular query protein sequence and aim to provide local structural templates that need to be assembled in order to build the full-length structure. Results: Here, we introduce HHfrag, a dynamic HMM-based fragment search method built on the profile-profile comparison tool HHpred. We show that HHfrag provides advantages over existing fragment assignment methods in that it: (i) improves the precision of the fragments at the expense of a minor loss in sequence coverage; (ii) detects fragments of variable length (6-21 amino acid residues); (iii) allows for gapped fragments; and (iv) does not assign fragments to regions where there is no clear sequence conservation. We illustrate the usefulness of fragments detected by HHfrag on targets from the most recent CASP.
Affiliation(s)
- Ivan Kalev
- Department of Protein Evolution and Department of Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, Germany
|
48
|
Gamliel R, Kedem K, Kolodny R, Keasar C. A library of protein surface patches discriminates between native structures and decoys generated by structure prediction servers. BMC Struct Biol 2011; 11:20. [PMID: 21542935 PMCID: PMC3114701 DOI: 10.1186/1472-6807-11-20] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/03/2010] [Accepted: 05/04/2011] [Indexed: 11/10/2022]
Abstract
Background: Protein surfaces serve as an interface with the molecular environment and are thus tightly bound to protein function. On the surface, geometric and chemical complementarity to other molecules provides interaction specificity for ligand binding, docking of bio-macromolecules, and enzymatic catalysis. As of today, there is no accepted general scheme to represent protein surfaces. Furthermore, most of the research on protein surface focuses on regions of specific interest such as interaction, ligand binding, and docking sites. We present a first step toward a general purpose representation of protein surfaces: a novel surface patch library that represents most surface patches (~98%) in a data set regardless of their functional roles. Results: Surface patches, in this work, are small fractions of the protein surface. Using a measure of inter-patch distance, we clustered patches extracted from a data set of high quality, non-redundant, proteins. The surface patch library is the collection of all the cluster centroids; thus, each of the data set patches is close to one of the elements in the library. We demonstrate the biological significance of our method through the ability of the library to capture surface characteristics of native protein structures as opposed to those of decoy sets generated by state-of-the-art protein structure prediction methods. The patches of the decoys are significantly less compatible with the library than their corresponding native structures, allowing us to reliably distinguish native models from models generated by servers. This trend, however, does not extend to the decoys themselves, as their similarity to the native structures does not correlate with compatibility with the library. Conclusions: We expect that this high-quality, generic surface patch library will add a new perspective to the description of protein structures and improve our ability to predict them. In particular, we expect that it will help improve the prediction of surface features that are apparently neglected by current techniques. The surface patch libraries are publicly available at http://www.cs.bgu.ac.il/~keasar/patchLibrary.
Affiliation(s)
- Roi Gamliel
- Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
|
49
|
Zhou Y, Duan Y, Yang Y, Faraggi E, Lei H. Trends in template/fragment-free protein structure prediction. Theor Chem Acc 2011; 128:3-16. [PMID: 21423322 PMCID: PMC3030773 DOI: 10.1007/s00214-010-0799-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.7] [Received: 05/17/2010] [Accepted: 08/15/2010] [Indexed: 12/13/2022]
Abstract
Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.
Affiliation(s)
- Yaoqi Zhou
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
- Yong Duan
- UC Davis Genome Center and Department of Applied Science, University of California, One Shields Avenue, Davis, CA USA
- College of Physics, Huazhong University of Science and Technology, 1037 Luoyu Road, 430074 Wuhan, China
- Yuedong Yang
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
- Eshel Faraggi
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
- Hongxing Lei
- UC Davis Genome Center and Department of Applied Science, University of California, One Shields Avenue, Davis, CA USA
- Beijing Institute of Genomics, Chinese Academy of Sciences, 100029 Beijing, China
|
50
|
Reeder PJ, Huang YM, Dordick JS, Bystroff C. A rewired green fluorescent protein: folding and function in a nonsequential, noncircular GFP permutant. Biochemistry 2010; 49:10773-9. [PMID: 21090791 DOI: 10.1021/bi100975z] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
The sequential order of secondary structural elements in proteins affects the folding and activity to an unknown extent. To test the dependence on sequential connectivity, we reconnected secondary structural elements by their solvent-exposed ends, permuting their sequential order, called "rewiring". This new protein design strategy changes the topology of the backbone without changing the core side chain packing arrangement. While circular and noncircular permutations have been observed in protein structures that are not related by sequence homology, to date no one has attempted to rationally design and construct a protein with a sequence that is noncircularly permuted while conserving three-dimensional structure. Herein, we show that green fluorescent protein can be rewired, still functionally fold, and exhibit wild-type fluorescence excitation and emission spectra.
Affiliation(s)
- Philippa J Reeder
- Department of Chemical and Biological Engineering, University of Colorado, Boulder, Colorado 80309, United States
|