101. Krawczyk K, Baker T, Shi J, Deane CM. Antibody i-Patch prediction of the antibody binding site improves rigid local antibody-antigen docking. Protein Eng Des Sel 2013; 26:621-9. [PMID: 24006373 DOI: 10.1093/protein/gzt043] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022]
Abstract
Antibodies are a class of proteins indispensable for the vertebrate immune system. The general architecture of all antibodies is very similar, but they contain a hypervariable region which allows millions of antibody variants to exist, each of which can bind to different molecules. This binding malleability means that antibodies are an increasingly important category of biopharmaceuticals and biomarkers. We present Antibody i-Patch, a method that annotates the most likely antibody residues to be in contact with the antigen. We show that our predictions correlate with energetic importance and thus we argue that they may be useful in guiding mutations in the artificial affinity maturation process. Using our predictions as constraints for a rigid-body docking algorithm, we are able to obtain high-quality results in minutes. Our annotation method and re-scoring system for docking achieve their predictive power by using antibody-specific statistics. Antibody i-Patch is available from http://www.stats.ox.ac.uk/research/proteins/resources.
Affiliation(s)
- Konrad Krawczyk
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
102. MacDonald JT, Kelley LA, Freemont PS. Validating a Coarse-Grained Potential Energy Function through Protein Loop Modelling. PLoS One 2013; 8:e65770. [PMID: 23824634 PMCID: PMC3688807 DOI: 10.1371/journal.pone.0065770] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 03/06/2013] [Accepted: 04/26/2013] [Indexed: 12/02/2022]
Abstract
Coarse-grained (CG) methods for sampling protein conformational space have the potential to increase computational efficiency by reducing the degrees of freedom. The gain in computational efficiency of CG methods often comes at the cost of non-protein-like local conformational features. This could cause problems when transitioning to full atom models in a hierarchical framework. Here, a CG potential energy function was validated by applying it to the problem of loop prediction. A novel method to sample the conformational space of backbone atoms was benchmarked using a standard test set consisting of 351 distinct loops. This method used a sequence-independent CG potential energy function representing the protein using α-carbon positions only and sampling conformations with a Monte Carlo simulated annealing based protocol. Backbone atoms were added using a method previously described and then gradient minimised in the Rosetta force field. Despite the CG potential energy function being sequence-independent, the method performed similarly to methods that explicitly use either fragments of known protein backbones with similar sequences or residue-specific φ/ψ-maps to restrict the search space. The method was also able to predict with sub-Angstrom accuracy two out of seven loops from recently solved crystal structures of proteins with low sequence and structure similarity to previously deposited structures in the PDB. The ability to sample realistic loop conformations directly from a potential energy function enables the incorporation of additional geometric restraints and the use of more advanced sampling methods in a way that is not easily possible with fragment replacement methods, and also enables multi-scale simulations for protein design and protein structure prediction. These restraints could be derived from experimental data or could be design restraints in the case of computational protein design.
C++ source code is available for download from http://www.sbg.bio.ic.ac.uk/phyre2/PD2/.
Affiliation(s)
- James T. MacDonald
- Division of Molecular Biosciences, Imperial College London, London, United Kingdom
- Lawrence A. Kelley
- Division of Molecular Biosciences, Imperial College London, London, United Kingdom
- Paul S. Freemont
- Division of Molecular Biosciences, Imperial College London, London, United Kingdom
103. Ebejer JP, Hill JR, Kelm S, Shi J, Deane CM. Memoir: template-based structure prediction for membrane proteins. Nucleic Acids Res 2013; 41:W379-83. [PMID: 23640332 PMCID: PMC3692111 DOI: 10.1093/nar/gkt331] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022]
Abstract
Membrane proteins are estimated to be the targets of 50% of drugs that are currently in development, yet we have few membrane protein crystal structures. As a result, for a membrane protein of interest, the much-needed structural information usually comes from a homology model. Current homology modelling software is optimized for globular proteins, and ignores the constraints that the membrane is known to place on protein structure. Our Memoir server produces homology models using alignment and coordinate generation software that has been designed specifically for transmembrane proteins. Memoir is easy to use, with the only inputs being a structural template and the sequence that is to be modelled. We provide a video tutorial and a guide to assessing model quality. Supporting data aid manual refinement of the models. These data include a set of alternative conformations for each modelled loop, and a multiple sequence alignment that incorporates the query and template. Memoir works with both α-helical and β-barrel types of membrane proteins and is freely available at http://opig.stats.ox.ac.uk/webapps/memoir.
Affiliation(s)
- Jean-Paul Ebejer
- Department of Statistics, Oxford University, Oxford, OX1 3TG, UK
104. Epa VC, Dolezal O, Doughty L, Xiao X, Jost C, Plückthun A, Adams TE. Structural model for the interaction of a designed Ankyrin Repeat Protein with the human epidermal growth factor receptor 2. PLoS One 2013; 8:e59163. [PMID: 23527120 PMCID: PMC3602593 DOI: 10.1371/journal.pone.0059163] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 01/01/2013] [Accepted: 02/12/2013] [Indexed: 02/02/2023]
Abstract
Designed Ankyrin Repeat Proteins (DARPins) are a class of novel binding proteins that can be selected and evolved to bind to targets with high affinity and specificity. We are interested in the DARPin H10-2-G3, which has been evolved to bind with very high affinity to the human epidermal growth factor receptor 2 (HER2). HER2 is found to be over-expressed in 30% of breast cancers, and is the target for the FDA-approved therapeutic monoclonal antibodies trastuzumab and pertuzumab and small molecule tyrosine kinase inhibitors. Here, we use computational macromolecular docking, coupled with several interface metrics such as shape complementarity, interaction energy, and electrostatic complementarity, to model the structure of the complex between the DARPin H10-2-G3 and HER2. We analyzed the interface between the two proteins and then validated the structural model by showing that selected HER2 point mutations at the putative interface with H10-2-G3 reduce binding affinity by up to 100-fold without affecting the binding of trastuzumab. Comparisons made with a subsequently solved X-ray crystal structure of the complex yielded a backbone atom root mean square deviation of 0.84-1.14 Ångstroms. The study presented here demonstrates the capability of the computational techniques of structural bioinformatics in generating useful structural models of protein-protein interactions.
Affiliation(s)
- V Chandana Epa
- Commonwealth Scientific & Industrial Research Organization Materials Science & Engineering, Parkville, Victoria, Australia.
105. Felix J, Elegheert J, Gutsche I, Shkumatov AV, Wen Y, Bracke N, Pannecoucke E, Vandenberghe I, Devreese B, Svergun DI, Pauwels E, Vergauwen B, Savvides SN. Human IL-34 and CSF-1 establish structurally similar extracellular assemblies with their common hematopoietic receptor. Structure 2013; 21:528-39. [PMID: 23478061 DOI: 10.1016/j.str.2013.01.018] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Received: 11/22/2012] [Revised: 01/22/2013] [Accepted: 01/28/2013] [Indexed: 12/21/2022]
Abstract
The discovery that hematopoietic human colony stimulating factor-1 receptor (CSF-1R) can be activated by two distinct cognate cytokines, colony stimulating factor-1 (CSF-1) and interleukin-34 (IL-34), created puzzling scenarios for the two possible signaling complexes. We here employ a hybrid structural approach based on small-angle X-ray scattering (SAXS) and negative-stain EM to reveal that bivalent binding of human IL-34 to CSF-1R leads to an extracellular assembly hallmarked by striking similarities to the CSF-1:CSF-1R complex, including homotypic receptor-receptor interactions. Thus, IL-34 and CSF-1 have evolved to exploit the geometric requirements of CSF-1R activation. Our models include N-linked oligomannose glycans derived from a systematic approach resulting in the accurate fitting of glycosylated models to the SAXS data. We further show that the C-terminal region of IL-34 is heavily glycosylated and that it can be proteolytically cleaved from the IL-34:hCSF-1R complex, providing insights into its role in the functional nonredundancy of IL-34 and CSF-1.
Affiliation(s)
- Jan Felix
- Unit for Structural Biology, Laboratory for Protein Biochemistry and Biomolecular Engineering (L-ProBE), Ghent University, K.L. Ledeganckstraat 35, 9000 Ghent, Belgium
106.
Abstract
Loops are irregular structures which connect two secondary structure elements in proteins. They often play important roles in function, including enzyme reactions and ligand binding. Despite their importance, their structure remains difficult to predict. Most protein loop structure prediction methods sample local loop segments and score them. In particular, protein loop classifications and database search methods depend heavily on local properties of loops. Here we examine the distance between a loop's end points (span). We find that the distribution of loop span appears to be independent of the number of residues in the loop; in other words, the separation between the anchors of a loop does not increase with the number of loop residues. Loop span is also unaffected by the secondary structures at the end points, unless the two anchors are part of an anti-parallel beta sheet. As loop span appears to be independent of global properties of the protein, we suggest that its distribution can be described by a random fluctuation model based on the Maxwell-Boltzmann distribution. It is believed that the primary difficulty in protein loop structure prediction comes from the number of residues in the loop. Following the idea that loop span is an independent local property, we investigate its effect on protein loop structure prediction and show how normalised span (loop stretch) is related to the structural complexity of loops. Highly contracted loops are more difficult to predict than stretched loops.
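For orientation, the standard Maxwell-Boltzmann density that this abstract refers to can be written as below. Treating the loop span s as the random variable and a as a fitted scale parameter is an illustrative assumption here; the abstract does not give the authors' exact parameterisation.

```latex
f(s; a) = \sqrt{\frac{2}{\pi}} \, \frac{s^{2}}{a^{3}} \exp\!\left(-\frac{s^{2}}{2a^{2}}\right), \qquad s \ge 0, \; a > 0
```

Under this form the most probable span is s = \sqrt{2}\,a and the mean span is 2a\sqrt{2/\pi}. Both depend only on the scale a, which fits the abstract's observation that span does not grow with the number of loop residues.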
Affiliation(s)
- Yoonjoo Choi
- Department of Computer Science, Dartmouth College, Hanover, NH, USA
107. Fernandez-Fuentes N, Fiser A. A modular perspective of protein structures: application to fragment based loop modeling. Methods Mol Biol 2013; 932:141-58. [PMID: 22987351 PMCID: PMC3635063 DOI: 10.1007/978-1-62703-065-6_9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/30/2023]
Abstract
Proteins can be decomposed into supersecondary structure modules. We used a generic definition of supersecondary structure elements, so-called Smotifs, which are composed of two flanking regular secondary structures connected by a loop, to explore the evolution and current variety of structure building blocks. Here, we discuss recent observations about the saturation of Smotif geometries in protein structures and how it opens new avenues in protein structure modeling and design. As a first application of these observations, we describe our loop conformation modeling algorithm, ArchPred, which takes advantage of the Smotif classification. In this application, instead of focusing on specific loop properties, the method narrows down possible template conformations in other, often non-homologous structures by identifying the most likely supersecondary structure environment that cradles the loop. Beyond identifying the correct starting supersecondary structure geometry, it takes into account the fit of anchor residues, steric clashes, the match of predicted and observed dihedral angle preferences, and local sequence signal.
Affiliation(s)
- Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, Section of Experimental Therapeutics, University of Leeds, St. James's University Hospital, Leeds LS9 7TF, UK
- Andras Fiser
- Department of Systems and Computational Biology, Department of Biochemistry, Albert Einstein College of Medicine, 1301 Morris Park Ave, Bronx, NY 10461, USA
108. Hill JR, Deane CM. MP-T: improving membrane protein alignment for structure prediction. Bioinformatics 2012; 29:54-61. [DOI: 10.1093/bioinformatics/bts640] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 01/07/2023]
109. Kuroda D, Shirai H, Jacobson MP, Nakamura H. Computer-aided antibody design. Protein Eng Des Sel 2012; 25:507-21. [PMID: 22661385 PMCID: PMC3449398 DOI: 10.1093/protein/gzs024] [Citation(s) in RCA: 169] [Impact Index Per Article: 14.1] [Received: 04/14/2012] [Revised: 04/14/2012] [Accepted: 04/19/2012] [Indexed: 11/12/2022]
Abstract
Recent clinical trials using antibodies with low toxicity and high efficiency have raised expectations for the development of next-generation protein therapeutics. However, the process of obtaining therapeutic antibodies remains time-consuming and empirical. This review summarizes recent progress in the field of computer-aided antibody development, focusing mainly on antibody modeling, which is divided essentially into two parts: (i) modeling the antigen-binding site, also called the complementarity determining regions (CDRs), and (ii) predicting the relative orientations of the variable heavy (V(H)) and light (V(L)) chains. Among the six CDR loops, the greatest challenge is predicting the conformation of CDR-H3, which is the most important in antigen recognition. Further computational methods could be used in drug development based on crystal structures or homology models, including antibody-antigen docking and energy calculations with approximate potential functions. These methods should guide experimental studies to improve the affinities and physicochemical properties of antibodies. Finally, several successful examples of in silico structure-based antibody designs are reviewed. We also briefly review structure-based antigen or immunogen design, with application to rational vaccine development.
Affiliation(s)
- Daisuke Kuroda
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka, Japan.
110. Producing high-accuracy lattice models from protein atomic coordinates including side chains. Adv Bioinformatics 2012; 2012:148045. [PMID: 22934109 PMCID: PMC3426164 DOI: 10.1155/2012/148045] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 04/04/2012] [Accepted: 06/18/2012] [Indexed: 02/08/2023]
Abstract
Lattice models are a common abstraction used in the study of protein structure, folding, and refinement. They are advantageous because the discretisation of space can make extensive protein evaluations computationally feasible. Various approaches to the protein chain lattice fitting problem have been suggested but only a single backbone-only tool is available currently. We introduce LatFit, a new tool to produce high-accuracy lattice protein models. It generates both backbone-only and backbone-side-chain models in any user defined lattice. LatFit implements a new distance RMSD-optimisation fitting procedure in addition to the known coordinate RMSD method. We tested LatFit's accuracy and speed using a large nonredundant set of high resolution proteins (SCOP database) on three commonly used lattices: 3D cubic, face-centred cubic, and knight's walk. Fitting speed compared favourably to other methods and both backbone-only and backbone-side-chain models show low deviation from the original data (~1.5 Å RMSD in the FCC lattice). To our knowledge this represents the first comprehensive study of lattice quality for on-lattice protein models including side chains while LatFit is the only available tool for such models.
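The two fitting criteria named in this abstract, coordinate RMSD and distance RMSD, can be contrasted with a minimal sketch. The function names `crmsd` and `drmsd` are hypothetical, the definitions are the common textbook ones, and LatFit's own implementation may normalise or superimpose differently.

```python
import math
from itertools import combinations


def crmsd(a, b):
    """Coordinate RMSD between two equally long point lists.

    Assumes the two structures are already superimposed; no optimal
    rotation or translation is applied here.
    """
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(a, b))
    return math.sqrt(squared / len(a))


def drmsd(a, b):
    """Distance RMSD: compares all intra-structure pairwise distances.

    It is invariant to rigid-body motion, so no superposition is needed.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("structures must have equal length of at least 2")
    pairs = list(combinations(range(len(a)), 2))
    squared = sum(
        (math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs
    )
    return math.sqrt(squared / len(pairs))


# Illustrative coordinates only (not taken from any real protein).
original = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
shifted = [(x + 1.0, y + 1.0, z + 1.0) for x, y, z in original]
```

With these inputs, `crmsd(original, shifted)` reports the translation as an error while `drmsd(original, shifted)` is zero up to rounding. That is the practical difference between the two measures: only the coordinate variant depends on how the structures are placed relative to each other.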
111.
Abstract
The prediction of loop structures is considered one of the main challenges in the protein folding problem. Regardless of the dependence of the overall algorithm on the protein data bank, the flexibility of loop regions dictates the need for special attention to their structures. In this article, we present algorithms for loop structure prediction with fixed stem and flexible stem geometry. In the flexible stem geometry problem, only the secondary structure of three stem residues on either side of the loop is known. In the fixed stem geometry problem, the structure of the three stem residues on either side of the loop is also known. Initial loop structures are generated using a probability database for the flexible stem geometry problem, and using torsion angle dynamics for the fixed stem geometry problem. Three rotamer optimization algorithms are introduced to alleviate steric clashes between the generated backbone structures and the side chain rotamers. The structures are optimized by energy minimization using an all-atom force field. The optimized structures are clustered using a traveling salesman problem-based clustering algorithm. The structures in the densest clusters are then utilized to refine dihedral angle bounds on all amino acids in the loop. The entire procedure is carried out for a number of iterations, leading to improved structure prediction and refined dihedral angle bounds. The algorithms presented in this article have been tested on 3190 loops from the PDBSelect25 data set and on targets from the recently concluded CASP9 community-wide experiment.
Affiliation(s)
- A. Subramani
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- C. A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
112. St-Pierre JF, Mousseau N. Large loop conformation sampling using the activation relaxation technique, ART-nouveau method. Proteins 2012; 80:1883-94. [PMID: 22488731 DOI: 10.1002/prot.24085] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/29/2011] [Revised: 03/19/2011] [Accepted: 03/30/2012] [Indexed: 12/25/2022]
Abstract
We present an adaptation of the ART-nouveau energy surface sampling method to the problem of loop structure prediction. This method, previously used to study protein folding pathways and peptide aggregation, is well suited to the problem of sampling the conformation space of large loops by targeting probable folding pathways instead of exhaustively sampling that space. The number of sampled conformations needed by ART-nouveau to find the global energy minimum for a loop was found to scale linearly with the sequence length of the loop for loops between 8 and about 20 amino acids. Because the computational cost of sampling each new conformation also scales linearly with the loop sequence length, we estimate that the total computational cost of sampling larger loops scales quadratically, compared with the exponential scaling of exhaustive search methods.
Affiliation(s)
- Jean-François St-Pierre
- Département de Physique and Regroupement Québécois sur les Matériaux de Pointe, Université de Montréal, CP 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7
113. Liang S, Zhang C, Sarmiento J, Standley DM. Protein Loop Modeling with Optimized Backbone Potential Functions. J Chem Theory Comput 2012; 8:1820-7. [PMID: 26593673 DOI: 10.1021/ct300131p] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
We represented the protein backbone potential as a Fourier series. The parameters of the backbone dihedral potential were initialized to random values and optimized by Monte Carlo simulations so that generated native-like loop decoys had a lower energy than non-native decoys. The low energy regions of the optimized backbone potential were consistent with observed Ramachandran plots derived from crystal structures. The backbone potential was then used for the prediction of loop conformations (OSCAR-loop) in combination with the previously described OSCAR force field, which has been shown to be very accurate in side chain modeling. As a result, the accuracy of OSCAR-loop was improved by local energy minimization based on the complete force field. The average accuracies were 0.40, 0.70, 1.10, 2.08, and 3.58 Å for 4, 6, 8, 10, and 12-residue loops, respectively, with each size being represented by 325 to 2809 targets. The accuracy was better than that of other loop modeling algorithms for short loops (<10 residues). For longer loops, the prediction accuracy was improved by concurrently sampling with a fragment-based method, Spanner. OSCAR-loop is available for download at http://sysimm.ifrec.osaka-u.ac.jp/OSCAR/.
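As a rough illustration of what a Fourier-series dihedral potential looks like, the sketch below evaluates a generic truncated two-dimensional series over the backbone angles φ and ψ. The functional form, the name `fourier_potential`, and the coefficient values are all assumptions made for illustration; the abstract does not state the form or the parameters that OSCAR-loop actually uses.

```python
import math


def fourier_potential(phi, psi, coeffs):
    """Evaluate a truncated 2D Fourier series at (phi, psi), in radians.

    coeffs maps integer harmonics (m, n) to amplitude pairs (a, b), giving
    E = sum over (m, n) of a*cos(m*phi + n*psi) + b*sin(m*phi + n*psi).
    """
    return sum(
        a * math.cos(m * phi + n * psi) + b * math.sin(m * phi + n * psi)
        for (m, n), (a, b) in coeffs.items()
    )


# Made-up coefficients; in the described approach such parameters would be
# tuned so that native-like decoys score lower than non-native ones.
example_coeffs = {
    (0, 0): (1.0, 0.0),
    (1, 0): (0.5, 0.0),
    (0, 1): (0.0, 0.25),
}

energy_at_origin = fourier_potential(0.0, 0.0, example_coeffs)
```

Because every term uses integer harmonics, the potential is automatically periodic in both angles, which is one reason a Fourier series is a natural basis for a dihedral energy term.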
Affiliation(s)
- Shide Liang
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka, 565-0871, Japan
- Chi Zhang
- School of Biological Sciences, Center for Plant Science and Innovation, University of Nebraska, Lincoln, Nebraska 68588, United States
- Jamica Sarmiento
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka, 565-0871, Japan
- Daron M Standley
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka, 565-0871, Japan
114. Cowtan K. Completion of autobuilt protein models using a database of protein fragments. Acta Crystallographica Section D: Biological Crystallography 2012; 68:328-35. [PMID: 22505253 PMCID: PMC3322592 DOI: 10.1107/s0907444911039655] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Received: 06/02/2011] [Accepted: 09/27/2011] [Indexed: 12/05/2022]
Abstract
Two developments in the process of automated protein model building in the Buccaneer software are presented. A general-purpose library for protein fragments of arbitrary size is described, with a highly optimized search method allowing the use of a larger database than in previous work. The problem of assembling an autobuilt model into complete chains is discussed. This involves the assembly of disconnected chain fragments into complete molecules and the use of the database of protein fragments in improving the model completeness. Assembly of fragments into molecules is a standard step in existing model-building software, but the methods have not received detailed discussion in the literature.
Affiliation(s)
- Kevin Cowtan
- Department of Chemistry, University of York, Heslington, York YO10 5DD, England.
115. Adhikari AN, Peng J, Wilde M, Xu J, Freed KF, Sosnick TR. Modeling large regions in proteins: applications to loops, termini, and folding. Protein Sci 2012; 21:107-21. [PMID: 22095743 PMCID: PMC3323786 DOI: 10.1002/pro.767] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 08/20/2011] [Revised: 11/02/2011] [Accepted: 11/06/2011] [Indexed: 11/10/2022]
Abstract
Template-based methods for predicting protein structure provide models for a significant portion of the protein but often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limit involves predicting loops of ≤12 residues in crystal structures. However, InsEnds may contain as many as ~50 amino acids, and the template-based model of the protein itself may be imperfect. To address these challenges, we present a free modeling method for predicting the local structure of loops and large InsEnds in both crystal structures and template-based models. The approach uses single amino acid torsional angle "pivot" moves of the protein backbone with a C(β) level representation. Nevertheless, our accuracy for loops is comparable to existing methods. We also apply a more stringent test, the blind structure prediction and refinement categories of the CASP9 tournament, where we improve the quality of several homology based models by modeling InsEnds as long as 45 amino acids, sizes generally inaccessible to existing loop prediction methods. Our approach ranks as one of the best in the CASP9 refinement category that involves improving template-based models so that they can function as molecular replacement models to solve the phase problem for crystallographic structure determination.
Affiliation(s)
- Aashish N Adhikari
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637
- The James Franck Institute, The University of Chicago, Chicago, Illinois 60637
- Jian Peng
- Toyota Technological Institute at Chicago, Chicago, Illinois 60637
- Michael Wilde
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
- Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, Illinois 60637
- Karl F Freed
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637
- The James Franck Institute, The University of Chicago, Chicago, Illinois 60637
- Computation Institute, The University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637
- Tobin R Sosnick
- Computation Institute, The University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
- Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637
116. Choi Y, Deane CM. Predicting antibody complementarity determining region structures without classification. Molecular BioSystems 2011; 7:3327-34. [PMID: 22011953 DOI: 10.1039/c1mb05223c] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 01/12/2023]
Abstract
Antibodies are used extensively in medical and biological research. Their complementarity determining regions (CDRs) define the majority of their antigen binding functionality. CDR structures have been intensively studied and classified (canonical structures). Here we show that CDR structure prediction is no different from the standard loop structure prediction problem and predict them without classification. FREAD, a successful database loop prediction technique, is able to produce accurate predictions for all CDR loops (0.81, 0.42, 0.96, 0.98, 0.88 and 2.25 Å RMSD for CDR-L1 to CDR-H3). In order to overcome the relatively poor predictions of CDR-H3, we developed two variants of FREAD, one focused on sequence similarity (FREAD-S) and another which includes contact information (ConFREAD). Both of the methods improve accuracy for CDR-H3 to 1.34 Å and 1.23 Å respectively. The FREAD variants are also tested on homology models and compared to RosettaAntibody (CDR-H3 prediction on models: 1.98 and 2.62 Å for ConFREAD and RosettaAntibody respectively). CDRs are known to change their structural conformations upon binding the antigen. Traditional CDR classifications are based on sequence similarity and do not account for such environment changes. Using a set of antigen-free and antigen-bound structures, we compared our FREAD variants. ConFREAD which includes contact information successfully discriminates the bound and unbound CDR structures and achieves an accuracy of 1.35 Å for bound structures of CDR-H3.
Affiliation(s)
- Yoonjoo Choi
- Department of Statistics, Oxford University, 1 South Parks Road, Oxford OX1 3TG, UK
117. Joo H, Chavan AG, Day R, Lennox KP, Sukhanov P, Dahl DB, Vannucci M, Tsai J. Near-native protein loop sampling using nonparametric density estimation accommodating sparcity. PLoS Comput Biol 2011; 7:e1002234. [PMID: 22028638 PMCID: PMC3197639 DOI: 10.1371/journal.pcbi.1002234] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/03/2011] [Accepted: 09/01/2011] [Indexed: 11/29/2022]
Abstract
Unlike the core structural elements of a protein like regular secondary structure, template based modeling (TBM) has difficulty with loop regions due to their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and were enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For the canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD <2.0 Å), the DPM-HMM method performs as well or better than the best templates, demonstrating that our automated method recaptures these canonical loops without inclusion of any IgG specific terms or manual intervention. In cases with poor or few good templates (mean RMSD >7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct test of sampling to the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method in loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure. 
The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
A protein's structure consists of elements of regular secondary structure connected by less regular stretches of loop segments. The irregularity of the loop structure makes loop modeling quite challenging. More accurate sampling of these loop conformations has a direct impact on protein modeling, design, function classification, as well as protein interactions. A method has been developed that extends a more comprehensive knowledge-based approach to producing models of the loop regions of protein structure. Most physical models cannot adequately sample the large conformational space, while the more discrete knowledge-based libraries are conformationally limited. To address both of these problems, we introduce a novel statistical method that produces a continuous yet weighted estimation of loop conformational space from a discrete library of structures by using a Dirichlet process mixture of hidden Markov models (DPM-HMM). Applied to loop structure sampling, the results of a number of tests demonstrate that our approach quickly generates large numbers of candidates with near-native loop conformations. Most significantly, in the cases where the template sampling is sparse and/or far from native conformations, the DPM-HMM method samples close to the native space and produces a population of accurate loop structures.
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Archana G. Chavan
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Ryan Day
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Kristin P. Lennox
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
- Paul Sukhanov
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- David B. Dahl
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
- Marina Vannucci
- Department of Statistics, Rice University, Houston, Texas, United States of America
- Jerry Tsai
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
118
Verschueren E, Vanhee P, van der Sloot AM, Serrano L, Rousseau F, Schymkowitz J. Protein design with fragment databases. Curr Opin Struct Biol 2011; 21:452-9. [PMID: 21684149 DOI: 10.1016/j.sbi.2011.05.002] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 04/04/2011] [Accepted: 05/25/2011] [Indexed: 11/25/2022]
Abstract
Structure-based computational methods are popular tools for designing proteins and interactions between proteins because they provide the necessary insight and details required for rational engineering. Here, we first argue that large-scale databases of fragments contain a discrete but complete set of building blocks that can be used to design structures. We show that these structural alphabets can be saturated to provide conformational ensembles that sample the native structure space around energetic minima. Second, we show that catalogs of interaction patterns hold the key to overcome the lack of scaffolds when computationally designing protein interactions. Finally, we illustrate the power of database-driven computational protein design methods by recent successful applications and discuss what challenges remain to push this field forward.
Affiliation(s)
- Erik Verschueren
- EMBL/CRG Systems Biology Research Unit, Centre for Genomic Regulation (CRG) and UPF, Barcelona, Spain
119
Sircar A, Sanni KA, Shi J, Gray JJ. Analysis and modeling of the variable region of camelid single-domain antibodies. J Immunol 2011; 186:6357-67. [PMID: 21525384 DOI: 10.4049/jimmunol.1100116] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Indexed: 11/19/2022]
Abstract
Camelids have a special type of Ab, known as heavy chain Abs, which are devoid of classical Ab light chains. Relative to classical Abs, camelid heavy chain Abs (cAbs) have comparable immunogenicity, Ag recognition diversity and binding affinities, higher stability and solubility, and better manufacturability, making them promising candidates for alternate therapeutic scaffolds. Rational engineering of cAbs to improve therapeutic function requires knowledge of the differences of sequence and structural features between cAbs and classical Abs. In this study, amino acid sequences of 27 cAb variable regions (V(H)H) were aligned with the respective regions of 54 classical Abs to detect amino acid differences, enabling automatic identification of cAb V(H)H CDRs. CDR analysis revealed that the H1 often (and sometimes the H2) adopts diverse conformations not classifiable by established canonical rules. Also, although the cAb H3 is much longer than classical H3 loops, it often contains common structural motifs and sometimes a disulfide bond to the H1. Leveraging these observations, we created a Monte Carlo-based cAb V(H)H structural modeling tool, where the CDR H1 and H2 loops exhibited a median root-mean-square deviation to natives of 3.1 and 1.5 Å, respectively. The protocol generated 8-12, 14-16, and 16-24 residue H3 loops with a median root-mean-square deviation to natives of 5.7, 4.5, and 6.8 Å, respectively. The large deviation of the predicted loops underscores the challenge in modeling such long loops. cAb V(H)H homology models can provide structural insights into interaction mechanisms to enable development of novel Abs for therapeutic and biotechnological use.
Affiliation(s)
- Aroop Sircar
- Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, MD 21218, USA
120
Abstract
Loop modeling is crucial for high-quality homology model construction outside conserved secondary structure elements. Dozens of loop modeling protocols involving a range of database and ab initio search algorithms and a variety of scoring functions have been proposed. Knowledge-based loop modeling methods are very fast and some can successfully and reliably predict loops up to about eight residues long. Several recent ab initio loop simulation methods can be used to construct accurate models of loops up to 12-13 residues long, albeit at a substantial computational cost. Major current challenges are the simulations of loops longer than 12-13 residues, the modeling of multiple interacting flexible loops, and the sensitivity of the loop predictions to the accuracy of the loop environment.
121
Kelm S, Shi J, Deane CM. MEDELLER: homology-based coordinate generation for membrane proteins. Bioinformatics 2010; 26:2833-40. [PMID: 20926421 PMCID: PMC2971581 DOI: 10.1093/bioinformatics/btq554] [Citation(s) in RCA: 92] [Impact Index Per Article: 6.6] [Received: 07/01/2010] [Revised: 09/21/2010] [Accepted: 09/25/2010] [Indexed: 01/13/2023] Open
Abstract
MOTIVATION: Membrane proteins (MPs) are important drug targets but knowledge of their exact structure is limited to relatively few examples. Existing homology-based structure prediction methods are designed for globular, water-soluble proteins. However, we are now beginning to have enough MP structures to justify the development of a homology-based approach specifically for them.
RESULTS: We present a MP-specific homology-based coordinate generation method, MEDELLER, which is optimized to build highly reliable core models. The method outperforms the popular structure prediction programme Modeller on MPs. The comparison of the two methods was performed on 616 target-template pairs of MPs, which were classified into four test sets by their sequence identity. Across all targets, MEDELLER gave an average backbone root mean square deviation (RMSD) of 2.62 Å versus 3.16 Å for Modeller. On our 'easy' test set, MEDELLER achieves an average accuracy of 0.93 Å backbone RMSD versus 1.56 Å for Modeller.
AVAILABILITY AND IMPLEMENTATION: http://medeller.info; Implemented in Python, Bash and Perl CGI for use on Linux systems; Supplementary data are available at http://www.stats.ox.ac.uk/proteins/resources.
Affiliation(s)
- Sebastian Kelm
- Department of Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK.
122
Vanhee P, Verschueren E, Baeten L, Stricher F, Serrano L, Rousseau F, Schymkowitz J. BriX: a database of protein building blocks for structural analysis, modeling and design. Nucleic Acids Res 2010; 39:D435-42. [PMID: 20972210 PMCID: PMC3013806 DOI: 10.1093/nar/gkq972] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022] Open
Abstract
High-resolution structures of proteins remain the most valuable source for understanding their function in the cell and provide leads for drug design. Since the availability of sufficient protein structures to tackle complex problems such as modeling backbone moves or docking remains a problem, alternative approaches using small, recurrent protein fragments have been employed. Here we present two databases that provide a vast resource for implementing such fragment-based strategies. The BriX database contains fragments from over 7000 non-homologous proteins from the Astral collection, segmented in lengths from 4 to 14 residues and clustered according to structural similarity, summing up to a content of 2 million fragments per length. To overcome the lack of loops classified in BriX, we constructed the Loop BriX database of non-regular structure elements, clustered according to end-to-end distance between the regular residues flanking the loop. Both databases are available online (http://brix.crg.es) and can be accessed through a user-friendly web-interface. For high-throughput queries a web-based API is provided, as well as full database downloads. In addition, two exciting applications are provided as online services: (i) user-submitted structures can be covered on the fly with BriX classes, representing putative structural variation throughout the protein and (ii) gaps or low-confidence regions in these structures can be bridged with matching fragments.
Affiliation(s)
- Peter Vanhee
- VIB SWITCH Laboratory, Flanders Institute of Biotechnology, Free University of Brussels, Pleinlaan 2, 1050 Brussels, Belgium