451
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state of the art by a number of specific examples.
452
Zhu J, Xie L, Honig B. Structural refinement of protein segments containing secondary structure elements: Local sampling, knowledge-based potentials, and clustering. Proteins 2006; 65:463-79. [PMID: 16927337 DOI: 10.1002/prot.21085] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022]
Abstract
In this article, we present an iterative, modular optimization (IMO) protocol for the local structure refinement of protein segments containing secondary structure elements (SSEs). The protocol is based on three modules: a torsion-space local sampling algorithm, a knowledge-based potential, and a conformational clustering algorithm. Alternative methods are tested for each module in the protocol. For each segment, random initial conformations were constructed by perturbing the native dihedral angles of loops (and SSEs) of the segment to be refined while keeping the protein body fixed. Two refinement procedures based on molecular mechanics force fields - using either energy minimization or molecular dynamics - were also tested but were found to be less successful than the IMO protocol. We found that DFIRE is a particularly effective knowledge-based potential and that clustering algorithms that are biased by the DFIRE energies improve the overall results. Results were further improved by adding an energy minimization step to the conformations generated with the IMO procedure, suggesting that hybrid strategies that combine both knowledge-based and physical effective energy functions may prove to be particularly effective in future applications.
Affiliation(s)
- Jiang Zhu
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, Room 815, New York, New York 10032, USA
453
Zhang J, Liu JS. On side-chain conformational entropy of proteins. PLoS Comput Biol 2006; 2:e168. [PMID: 17154716 PMCID: PMC1676032 DOI: 10.1371/journal.pcbi.0020168] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.7] [Received: 07/31/2006] [Accepted: 10/26/2006] [Indexed: 11/19/2022]
Abstract
The role of side-chain entropy (SCE) in protein folding has long been speculated about but is still not fully understood. Utilizing a newly developed Monte Carlo method, we conducted a systematic investigation of how the SCE relates to the size of the protein and how it differs among a protein's X-ray, NMR, and decoy structures. We estimated the SCE for a set of 675 nonhomologous proteins, and observed that there is a significant SCE for both exposed and buried residues for all these proteins; the contribution of buried residues approaches 40% of the overall SCE. Furthermore, the SCE can be quite different for structures with similar compactness or even similar conformations. As a striking example, we found that proteins' X-ray structures appear to pack more "cleverly" than their NMR or decoy counterparts in the sense of retaining higher SCE while achieving comparable compactness, which suggests that the SCE plays an important role in favouring native protein structures. By including an SCE term in a simple free energy function, we can significantly improve the discrimination of native protein structures from decoys.
Affiliation(s)
- Jinfeng Zhang
- Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America
- Jun S Liu
- Department of Statistics, Harvard University, Cambridge, Massachusetts, United States of America
454
Blaney FE. Approaches to the molecular modeling of 7-transmembrane helical receptors. Curr Protoc Pharmacol 2006; Chapter 9:Unit9.8. [PMID: 22294180 DOI: 10.1002/0471141755.ph0908s35] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
7-Transmembrane helical receptors (7TMs) represent the single most important class of target for drug therapy; therefore, a great deal of effort has gone into computational studies of their structures. Historically, these were based on low resolution electron diffraction data, together with the use of computational methods such as multiple sequence alignments, distance geometry, and molecular mechanics calculations. In the year 2000 the situation changed when the first crystal structure of a 7TM was published. It was then possible to use homology modeling techniques to generate more accurate models of these proteins. This unit reviews the modeling of 7TMs and describes in detail how homology modeling can be used to build a structure of the 5-HT2a receptor. Special attention is given to the initial sequence alignment, the most important step in the process. Use of automatic alignment programs often produces incorrect results, and manual intervention is necessary before proceeding further.
Affiliation(s)
- Frank E Blaney
- GlaxoSmithKline, NFSP (North), Harlow, Essex, United Kingdom
455
Zhang J, Chen Y, Chen R, Liang J. Importance of chirality and reduced flexibility of protein side chains: a study with square and tetrahedral lattice models. J Chem Phys 2006; 121:592-603. [PMID: 15260581 DOI: 10.1063/1.1756573] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
Abstract
Side chains of amino acid residues are the determining factor that distinguishes proteins from other unstable chain polymers. In simple models they are often represented implicitly (e.g., by spin states) or simplified as one atom. Here we study side chain effects using two-dimensional square lattice and three-dimensional tetrahedral lattice models, with explicitly constructed side chains formed by two atoms of different chirality and flexibility. We distinguish effects due to chirality and effects due to side chain flexibilities, since residues in proteins are L residues, and their side chains adopt different rotameric states. For short chains, we enumerate exhaustively all possible conformations. For long chains, we sample effectively rare events such as compact conformations and obtain complete pictures of ensemble properties of conformations of these models across all compactness regions. This is made possible by using sequential Monte Carlo techniques based on the chain growth method. Our results show that both chirality and reduced side chain flexibility lower the folding entropy significantly for globally compact conformations, suggesting that they are important properties of residues to ensure fast folding and stable native structure. This corresponds well with our finding that natural amino acid residues have reduced effective flexibility, as evidenced by statistical analysis of rotamer libraries and side chain rotatable bonds. We further develop a method for calculating the exact side chain entropy for a given backbone structure. We show that simple rotamer counting underestimates side chain entropy significantly for both extended and near maximally compact conformations. We find that side chain entropy does not always correlate well with main chain packing. With explicit side chains, extended backbones do not have the largest side chain entropy. Among compact backbones with maximum side chain entropy, helical structures emerge as the dominating configurations. Our results suggest that side chain entropy may be an important factor contributing to the formation of alpha helices for compact conformations.
Affiliation(s)
- Jinfeng Zhang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60607, USA
456
Martin MG, Frischknecht AL. Using arbitrary trial distributions to improve intramolecular sampling in configurational-bias Monte Carlo. Mol Phys 2006. [DOI: 10.1080/00268970600751078] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Indexed: 10/24/2022]
457
Santana R, Larrañaga P, Lozano JA. Side chain placement using estimation of distribution algorithms. Artif Intell Med 2006; 39:49-63. [PMID: 16854574 DOI: 10.1016/j.artmed.2006.04.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Received: 12/21/2005] [Revised: 04/26/2006] [Accepted: 04/28/2006] [Indexed: 11/29/2022]
Abstract
OBJECTIVE: This paper presents an algorithm for the solution of the side chain placement problem. METHODS AND MATERIALS: The algorithm combines the application of the Goldstein elimination criterion with the univariate marginal distribution algorithm (UMDA), which stochastically searches the space of possible solutions. The suitability of the algorithm to address the problem is investigated using a set of 425 proteins. RESULTS: For a number of difficult instances where inference algorithms do not converge, it has been shown that UMDA is able to find better structures. CONCLUSIONS: The results obtained show that the algorithm can achieve better structures than those obtained with other state-of-the-art methods like inference-based techniques. Additionally, a theoretical and empirical analysis of the computational cost of the algorithm introduced has been presented.
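For readers unfamiliar with the Goldstein elimination criterion named in this abstract, it is commonly stated as follows: rotamer r at position i can be discarded if some alternative rotamer t at the same position satisfies E(i_r) - E(i_t) + sum over j of min over s of [E(i_r, j_s) - E(i_t, j_s)] > 0. The Python sketch below illustrates only that test on a two-position toy problem. It is not the authors' implementation; the energy tables, the names `SELF_E`, `PAIR_E`, `N_ROT`, and the helper functions are all hypothetical.

```python
# Toy energies for two positions with two rotamers each.
# Every name and number here is hypothetical, chosen only for illustration.
SELF_E = {(0, 0): 0.0, (0, 1): 5.0, (1, 0): 0.0, (1, 1): 1.0}
PAIR_E = {(0, 0, 1, 0): 0.0, (0, 0, 1, 1): 1.0,
          (0, 1, 1, 0): 2.0, (0, 1, 1, 1): 0.5}
N_ROT = [2, 2]  # number of rotamers available at each position


def pair_energy(i, r, j, s):
    """Look up a symmetric pair energy that is stored once with i < j."""
    return PAIR_E[(i, r, j, s)] if i < j else PAIR_E[(j, s, i, r)]


def goldstein_eliminates(i, r, t):
    """Return True if rotamer r at position i can be discarded in favour of t."""
    gap = SELF_E[(i, r)] - SELF_E[(i, t)]
    for j, n_rot in enumerate(N_ROT):
        if j == i:
            continue
        # Worst case for r relative to t over all rotamers s at position j.
        gap += min(pair_energy(i, r, j, s) - pair_energy(i, t, j, s)
                   for s in range(n_rot))
    return gap > 0.0
```

In this toy case `goldstein_eliminates(0, 1, 0)` holds, so rotamer 1 at position 0 could be pruned before any stochastic search such as UMDA runs over the remaining rotamers.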
Affiliation(s)
- Roberto Santana
- Department of Computer Science and Artificial Intelligence, University of the Basque Country, CP-20080, Donostia-San Sebastián, Spain.
458
Poole AM, Ranganathan R. Knowledge-based potentials in protein design. Curr Opin Struct Biol 2006; 16:508-13. [PMID: 16843652 DOI: 10.1016/j.sbi.2006.06.013] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.0] [Received: 06/07/2006] [Revised: 06/07/2006] [Accepted: 06/30/2006] [Indexed: 02/03/2023]
Abstract
Knowledge-based potentials are statistical parameters derived from databases of known protein properties that empirically capture aspects of the physical chemistry of protein structure and function. These potentials play a key role in protein design by improving the accuracy of physics-based models of interatomic interactions and enhancing the computational efficiency of the design process by limiting the complexity of searching sequence space. Recently, knowledge-based potentials (in isolation or in combination with physics-based potentials) have been applied to the modification of existing protein function, the redesign of natural protein folds and the complete design of a non-natural protein fold. In addition, knowledge-based potentials appear to be providing important information about the global topology of amino acid interactions in natural proteins. A detailed study of the methods and products of these protein design efforts promises to greatly expand our understanding of proteins and the evolutionary process that created them.
Affiliation(s)
- Alan M Poole
- Howard Hughes Medical Institute, Department of Pharmacology and the Green Comprehensive Center Division for Systems Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390-9050, USA
459
Grigoryan G, Zhou F, Lustig SR, Ceder G, Morgan D, Keating AE. Ultra-fast evaluation of protein energies directly from sequence. PLoS Comput Biol 2006; 2:e63. [PMID: 16789811 PMCID: PMC1479088 DOI: 10.1371/journal.pcbi.0020063] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.1] [Received: 01/12/2006] [Accepted: 04/24/2006] [Indexed: 11/22/2022]
Abstract
The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a CE expansion, the amount of time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10^7 compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1–4.7 kcal/mol, R^2 = 0.7–1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time. Given these advantages, CE is likely to find many uses in computational structural modeling.
Many applications in computational structural biology involve evaluating the energy of a protein adopting a specific structure. A variety of functions are used for this purpose. Statistical potentials are fast to evaluate but do not have a clear biophysical basis, whereas physics-based functions consist of well-defined terms that can be costly to compute. This paper describes how the theory of cluster expansion, originally developed to describe the energies of alloys, can be applied to generate a physical potential for proteins that is extremely fast to evaluate. Cluster expansion is a way of representing a property of a system as a discrete function of its degrees of freedom. In this paper, it is used for the problem of protein design, where the energy is determined by the identities and conformations of amino acids at different sites on a fixed protein backbone. Application of cluster expansion to three small protein folds (the α-helical coiled coil, the zinc finger, and the WW domain) shows that protein sequence can be mapped directly to energy using a surprisingly simple function that maintains high accuracy. Promising results on these small systems suggest that the theory may have utility for macromolecular modeling more generally.
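To make the idea of mapping sequence directly to energy concrete, the following Python sketch evaluates an expansion truncated at pair terms: a constant, plus one term per (site, residue), plus one term per pair of (site, residue) assignments. This is only an illustration of the functional form described in the abstract, which also notes that higher-order terms matter for globular folds. The function name `ce_energy`, the dictionary layout, and all coefficient values are hypothetical and are not taken from the paper.

```python
def ce_energy(seq, const, site_terms, pair_terms):
    """Evaluate a cluster expansion truncated at pair terms.

    seq        -- amino acid sequence as a string, one letter per site
    const      -- constant (zeroth-order) term
    site_terms -- dict mapping (site, residue) to a single-site coefficient
    pair_terms -- dict mapping (site_i, res_i, site_j, res_j), with
                  site_i < site_j, to a pair coefficient
    Coefficients missing from a dict are treated as zero.
    """
    energy = const
    for i, res in enumerate(seq):
        energy += site_terms.get((i, res), 0.0)
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            energy += pair_terms.get((i, seq[i], j, seq[j]), 0.0)
    return energy


# Hypothetical coefficients for a two-site toy "protein".
CONST = -10.0
SITE_TERMS = {(0, "L"): -1.5, (1, "E"): 0.5}
PAIR_TERMS = {(0, "L", 1, "E"): -0.25}
```

Once the coefficients are fitted, evaluating a sequence is a handful of table lookups, which is the source of the speed-up the abstract reports; fitting the coefficients against a full-atom energy function is the one-time cost and is not shown here.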
Affiliation(s)
- Gevorg Grigoryan
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Fei Zhou
- Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Steve R Lustig
- DuPont Central Research and Development, Experimental Station, Wilmington, Delaware, United States of America
- Gerbrand Ceder
- Department of Material Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Dane Morgan
- Department of Material Science and Engineering, University of Wisconsin, Madison, Wisconsin, United States of America
- Amy E Keating
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- * To whom correspondence should be addressed. E-mail:
460
Endres RG, Wingreen NS. Weight matrices for protein-DNA binding sites from a single co-crystal structure. Phys Rev E Stat Nonlin Soft Matter Phys 2006; 73:061921. [PMID: 16906878 DOI: 10.1103/physreve.73.061921] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 11/21/2005] [Revised: 01/31/2006] [Indexed: 05/11/2023]
Abstract
Transcription-factor proteins bind to specific DNA sequences to regulate gene expression in cells. DNA-binding sites are often identified using weight matrices calculated from multiple known binding sites. However, in many cases the number of examples is limited. Here, we report on an atomistic method that starts from an x-ray co-crystal structure of the protein bound to one particular DNA sequence, and infers other binding sites, which are used to construct a weight matrix. The emphasis of the paper is on using the Wang-Landau Monte Carlo algorithm to efficiently sample high-affinity binding sites, which demonstrates that sampling can produce accurate weight matrices in analogy to bioinformatics approaches. For cases of low complexity, we compare to the exhaustive (but slow) dead-end elimination algorithm. To recover crystal binding sites, it is important to include bound water in the protein-DNA interface. Our approach can, in principle, even be applied when no native protein-DNA co-crystal structure is available, only the structure of a closely related homologous protein whose amino-acid sequence is changed to the protein of interest.
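As background for the weight matrices mentioned in this abstract, the sketch below shows one common way to build a log-odds weight matrix from a set of aligned binding sites and to score a candidate site with it. It illustrates only the final matrix-construction step, not the Wang-Landau sampling that the paper uses to infer the sites. The function names, the pseudocount, and the uniform background frequency are assumptions made for this example.

```python
import math

BASES = "ACGT"


def weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix from equal-length aligned DNA sites.

    Returns a list with one dict per position, mapping base to log-odds.
    The pseudocount and uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1.0
        total = sum(counts.values())
        matrix.append({base: math.log2(counts[base] / total / background)
                       for base in BASES})
    return matrix


def score(matrix, site):
    """Sum the per-position log-odds for a candidate site."""
    return sum(column[base] for column, base in zip(matrix, site))


# Hypothetical aligned binding sites.
EXAMPLE_SITES = ["ACGT", "ACGA", "ACGT"]
EXAMPLE_MATRIX = weight_matrix(EXAMPLE_SITES)
```

A site resembling the inputs, such as `"ACGT"`, scores higher under `EXAMPLE_MATRIX` than an unrelated one such as `"TTTT"`.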
Affiliation(s)
- Robert G Endres
- NEC Laboratories America, Inc., Princeton, New Jersey 08540, USA.
461
de Bakker PIW, Furnham N, Blundell TL, DePristo MA. Conformer generation under restraints. Curr Opin Struct Biol 2006; 16:160-5. [PMID: 16483766 DOI: 10.1016/j.sbi.2006.02.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 11/16/2005] [Revised: 01/17/2006] [Accepted: 02/06/2006] [Indexed: 10/25/2022]
Abstract
Conformational sampling by direct optimization of an all-atom energy function is ineffective and inefficient because of the ruggedness of the energy landscape. Discrete sampling schemes represent an attractive alternative for generating ensembles of conformers consistent with spatial restraints derived from empirical data. Conformational sampling is becoming increasingly important for structure prediction as the bottleneck in accurate prediction shifts from energy functions to the methods used to find low-energy conformers. Experimental structure determination remains a perennial challenge as investigators tackle larger macromolecular systems, and begin to incorporate more complete descriptions of uncertainty, heterogeneity and dynamics into their models. Computational approaches that combine dense, discrete sampling with all-atom energy evaluation and refinement may help to overcome the remaining barriers to solving these problems.
Affiliation(s)
- Paul I W de Bakker
- Department of Molecular Biology and Center for Human Genetic Research, Massachusetts General Hospital, and Department of Genetics, Harvard Medical School, Boston, MA 02114-2790, USA
462
Reddy CS, Vijayasarathy K, Srinivas E, Sastry GM, Sastry GN. Homology modeling of membrane proteins: A critical assessment. Comput Biol Chem 2006; 30:120-6. [PMID: 16540373 DOI: 10.1016/j.compbiolchem.2005.12.002] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 08/27/2005] [Revised: 11/10/2005] [Accepted: 12/14/2005] [Indexed: 11/22/2022]
Abstract
Evaluation and validation of homology modeling protocols are indispensable for membrane proteins, as experimental determination of their three-dimensional structure is an arduous task. The predictive ability of Modeller, MOE, InsightII-Homology and Swiss-PdbViewer (SPV), combined with the sequence alignment methods CLUSTALW, BLAST and 3D-JIGSAW, has been assessed. The sequence identity of the target and template was chosen to be in the range of 25-35%. Validation protocols that assess the structure, fold and stereochemical quality are employed by comparing with experimental structures. Two different ranking schemes are suggested to evaluate the performance of each methodology based on the validation scores. While an unambiguous preference for any given procedure did not surface, statistically Modeller and the sequence alignment technique 3D-JIGSAW gave the best results amongst the chosen protocols. The present study helps in selecting the right protocols when modeling membrane proteins, which form a major class of drug targets.
Affiliation(s)
- Ch Surendhar Reddy
- Molecular Modelling Group, Organic Chemical Sciences, Indian Institute of Chemical Technology, Tarnaka, Hyderabad 500007, India
463
Saraf MC, Moore GL, Goodey NM, Cao VY, Benkovic SJ, Maranas CD. IPRO: an iterative computational protein library redesign and optimization procedure. Biophys J 2006; 90:4167-80. [PMID: 16513775 PMCID: PMC1459523 DOI: 10.1529/biophysj.105.079277] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022]
Abstract
A number of computational approaches have been developed to reengineer promising chimeric proteins one at a time through targeted point mutations. In this article, we introduce the computational procedure IPRO (iterative protein redesign and optimization procedure) for the redesign of an entire combinatorial protein library in one step using energy-based scoring functions. IPRO relies on identifying mutations in the parental sequences, which when propagated downstream in the combinatorial library, improve the average quality of the library (e.g., stability, binding affinity, specific activity, etc.). Residue and rotamer design choices are driven by a globally convergent mixed-integer linear programming formulation. Unlike many of the available computational approaches, the procedure allows for backbone movement as well as redocking of the associated ligands after a prespecified number of design iterations. IPRO can also be used, as a limiting case, for the redesign of a single or handful of individual sequences. The application of IPRO is highlighted through the redesign of a 16-member library of Escherichia coli/Bacillus subtilis dihydrofolate reductase hybrids, both individually and through upstream parental sequence redesign, for improving the average binding energy. Computational results demonstrate that it is indeed feasible to improve the overall library quality as exemplified by binding energy scores through targeted mutations in the parental sequences.
Affiliation(s)
- Manish C Saraf
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA
464
Krasley E, Cooper KF, Mallory MJ, Dunbrack R, Strich R. Regulation of the oxidative stress response through Slt2p-dependent destruction of cyclin C in Saccharomyces cerevisiae. Genetics 2005; 172:1477-86. [PMID: 16387872 PMCID: PMC1456298 DOI: 10.1534/genetics.105.052266] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022]
Abstract
The Saccharomyces cerevisiae C-type cyclin and its cyclin-dependent kinase (Cdk8p) repress the transcription of several stress response genes. To relieve this repression, cyclin C is destroyed in cells exposed to reactive oxygen species (ROS). This report describes the requirement of cyclin C destruction for the cellular response to ROS. Compared to wild type, deleting cyclin C makes cells more resistant to ROS while its stabilization reduces viability. The Slt2p MAP kinase cascade mediates cyclin C destruction in response to ROS treatment but not heat shock. This destruction pathway is important as deleting cyclin C suppresses the hypersensitivity of slt2 mutants to oxidative damage. The ROS hypersensitivity of an slt2 mutant correlates with elevated programmed cell death as determined by TUNEL assays. Consistent with the viability studies, the elevated TUNEL signal is reversed in cyclin C mutants. Finally, two results suggest that cyclin C regulates programmed cell death independently of its function as a transcriptional repressor. First, deleting its corepressor CDK8 does not suppress the slt2 hypersensitivity phenotype. Second, the human cyclin C, which does not repress transcription in yeast, does regulate ROS sensitivity. These findings demonstrate a new role for the Slt2p MAP kinase cascade in protecting the cell from programmed cell death through cyclin C destruction.
Affiliation(s)
- Elizabeth Krasley
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA
465
Abstract
In recent years, there has been significant progress in the ability to predict the three-dimensional structure of proteins from their amino acid sequence. Progress has been due to new methods to extract the growing amount of information in sequence and structure databases and improved computational descriptions of protein energetics. This review summarizes recent advances in these areas and describes a number of novel biological applications made possible by structure prediction. Despite remaining challenges, protein structure prediction is becoming an extremely useful tool in understanding phenomena in modern molecular and cell biology.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York 10032, USA
466
Lasa M, Jiménez AI, Zurbano MM, Cativiela C. Model dipeptides incorporating the trans cyclohexane analogues of phenylalanine: further evidence of the relationship between side-chain orientation and β-turn type. Tetrahedron Lett 2005. [DOI: 10.1016/j.tetlet.2005.09.155] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]
467
Riemann RN, Zacharias M. Refinement of protein cores and protein–peptide interfaces using a potential scaling approach. Protein Eng Des Sel 2005; 18:465-76. [PMID: 16155119 DOI: 10.1093/protein/gzi052] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
Refinement of side chain conformations in protein model structures and at the interface of predicted protein-protein or protein-peptide complexes is an important step during protein structural modelling and docking. A common approach for side chain prediction is to assume a rigid protein main chain for both docking partners and search for an optimal set of side chain rotamers to optimize the steric fit. However, depending on the target-template similarity in the case of comparative protein modelling and on the accuracy of an initially docked complex, the main chain template structure is only an approximation of a realistic target main chain. An inaccurate rigid main chain conformation can in turn interfere with the prediction of side chain conformations. In the present study, a potential scaling approach (PS-MD) during a molecular dynamics (MD) simulation that also allows the inclusion of explicit solvent has been used to predict side chain conformations on semi-flexible protein main chains. The PS-MD method converges much faster to realistic protein-peptide interface structures or protein core structures than standard MD simulations. Depending on the accuracy of the protein main chain, it also gives significantly better results compared with the standard rotamer search method.
Affiliation(s)
- Ralph Nico Riemann
- International University Bremen, School of Engineering and Science, D-28759 Bremen, Germany
468
Saha RP, Bahadur RP, Chakrabarti P. Interresidue Contacts in Proteins and Protein-Protein Interfaces and Their Use in Characterizing the Homodimeric Interface. J Proteome Res 2005; 4:1600-9. [PMID: 16212412 DOI: 10.1021/pr050118k] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
The environment of amino acid residues in protein tertiary structures and in three types of interfaces formed by protein-protein association (in complexes, homodimers, and crystal lattices of monomeric proteins) has been analyzed in terms of the propensity values of the 20 amino acid residues to be in contact with a given residue. On the basis of the similarity of the environment, the twenty residues can be divided into nine classes, which may correspond to a reduced amino acid alphabet. There is no appreciable change in the environment in going from the tertiary structure to the interface, those participating in the crystal contacts showing the maximum deviation. Contacts between identical residues are very prominent in homodimers and crystal dimers and arise due to 2-fold related association of residues lining the axis of rotation. These two types of interfaces, representing specific and nonspecific associations, are characterized by the types of residues that partake in "self-contacts", most notably Leu in the former and Glu in the latter. The relative preference of residues to be involved in "self-contacts" can be used to develop a scoring function to identify homodimeric proteins from crystal structures. Thirty-four percent of such residues are fully conserved among homologous proteins in the homodimer dataset, as opposed to only 20% in crystal dimers. Results point to Leu being the stickiest of all amino acid residues, hence its widespread use in motifs, such as leucine zippers.
Affiliation(s)
- Rudra Prasad Saha
- Department of Biochemistry, Bose Institute, P-1/12 CIT Scheme 7M, Calcutta 700-054, India
469
Abstract
PISCES is a database server for producing lists of sequences from the Protein Data Bank (PDB) using a number of entry- and chain-specific criteria and mutual sequence identity. Our goal in culling the PDB is to provide the longest list possible of the highest resolution structures that fulfill the sequence identity and structural quality cut-offs. The new PISCES server uses a combination of PSI-BLAST and structure-based alignments to determine sequence identities. Structure alignment produces more complete alignments and therefore more accurate sequence identities than PSI-BLAST. PISCES now allows a user to cull the PDB by-entry in addition to the standard culling by individual chains. In this scenario, a list will contain only entries that do not have a chain that has a sequence identity to any chain in any other entry in the list over the sequence identity cut-off. PISCES also provides fully annotated sequences including gene name and species. The server allows a user to cull an input list of entries or chains, so that other criteria, such as function, can be used. Results from a search on the re-engineered RCSB's site for the PDB can be entered into the PISCES server by a single click, combining the powerful searching abilities of the PDB with PISCES's utilities for sequence culling. The server's data are updated weekly. The server is available at .
Affiliation(s)
- Roland L. Dunbrack
- To whom correspondence should be addressed. Tel: +1 215 728 2434; Fax: +1 215 728 2412;
|
470
|
Abstract
Modeling a protein structure based on a homologous structure is a standard method in structural biology today. In this process an alignment of a target protein sequence onto the structure of a template(s) is used as input to a program that constructs a 3D model. It has been shown that the most important factor in this process is the correctness of the alignment and the choice of the best template structure(s), while it is generally believed that there are no major differences between the best modeling programs. Therefore, a large number of studies to benchmark the alignment qualities and the selection process have been performed. However, to our knowledge no large-scale benchmark has been performed to evaluate the programs used to transform the alignment to a 3D model. In this study, a benchmark of six different homology modeling programs (Modeller, SegMod/ENCAD, SWISS-MODEL, 3D-JIGSAW, nest, and Builder) is presented. The performance of these programs is evaluated using physicochemical correctness and structural similarity to the correct structure. From our analysis it can be concluded that no single modeling program outperforms the others in all tests. However, it is quite clear that three modeling programs, Modeller, nest, and SegMod/ENCAD, perform better than the others. Interestingly, the fastest and oldest modeling program, SegMod/ENCAD, performs very well, although it was written more than 10 years ago and has not undergone any development since. It can also be observed that none of the homology modeling programs builds side chains as well as a specialized program (SCWRL), and therefore there should be room for improvement.
Affiliation(s)
- Björn Wallner
- Stockholm Bioinformatics Center, Albanova University Center, Stockholm University, Stockholm, Sweden.
|
471
|
Rockey WM, Elcock AH. Rapid computational identification of the targets of protein kinase inhibitors. J Med Chem 2005; 48:4138-52. [PMID: 15943486 DOI: 10.1021/jm049461b] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/28/2022]
Abstract
We describe a method for rapidly computing the relative affinities of an inhibitor for all individual members of a family of homologous receptors. The approach, implemented in a new program, SCR, models inhibitor-receptor interactions in full atomic detail with an empirical energy function and includes an explicit account of flexibility in homology-modeled receptors through sampling of libraries of side chain rotamers. SCR's general utility was demonstrated by application to seven different protein kinase inhibitors: for each inhibitor, relative binding affinities with panels of approximately 20 protein kinases were computed and compared with experimental data. For five of the inhibitors (SB203580, purvalanol B, imatinib, H89, and hymenialdisine), SCR provided excellent reproduction of the experimental trends and, importantly, was capable of identifying the targets of inhibitors even when they belonged to different kinase families. The method's performance in a predictive setting was demonstrated by performing separate training and testing applications, and its key assumptions were tested by comparison with a number of alternative approaches employing the ligand-docking program AutoDock (Morris et al. J. Comput. Chem. 1998, 19, 1639-1662). These comparison tests included using AutoDock in nondocking and docking modes and performing energy minimizations of inhibitor-kinase complexes with the molecular mechanics code GROMACS (Berendsen et al. Comput. Phys. Commun. 1995, 91, 43-56). It was found that a surprisingly important aspect of SCR's approach is its assumption that the inhibitor be modeled in the same orientation for each kinase: although this assumption is in some respects unrealistic, calculations that used apparently more realistic approaches produced clearly inferior results. 
Finally, as a large-scale application of the method, SB203580, purvalanol B, and imatinib were screened against an almost full complement of 493 human protein kinases using SCR in order to identify potential new targets; the predicted targets of SB203580 were compared with those identified in recent proteomics-based experiments. These kinome-wide screens, performed within a day on a small cluster of PCs, indicate that explicit computation of inhibitor-receptor binding affinities has the potential to promote rapid discovery of new therapeutic targets for existing inhibitors.
Affiliation(s)
- William M Rockey
- Department of Biochemistry, University of Iowa, Iowa City, 52242-1109, USA
|
472
|
Endres RG, Schulthess TC, Wingreen NS. Toward an atomistic model for predicting transcription-factor binding sites. Proteins 2005; 57:262-8. [PMID: 15340913 DOI: 10.1002/prot.20199] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 11/11/2022]
Abstract
Identifying the specific DNA-binding sites of transcription-factor proteins is essential to understanding the regulation of gene expression in the cell. Bioinformatics approaches are fast compared to experiments, but require prior knowledge of multiple binding sites for each protein. Here, we present an atomistic force-field method to predict binding sites based only on the X-ray structure of a related bound complex. Specific flexible contacts between the protein and DNA are modeled by a library of amino acid side-chain rotamers. Using the example of the mouse transcription factor, Zif268, a well-studied zinc-finger protein, we show that the protein sequence alone, without the detailed experimental structure, gives a strong bias toward the consensus binding site.
Affiliation(s)
- Robert G Endres
- Center for Computational Sciences and Computer Science & Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6164, USA.
|
473
|
Yang AYC, Källblad P, Mancera RL. Molecular modelling prediction of ligand binding site flexibility. J Comput Aided Mol Des 2005; 18:235-50. [PMID: 15562988 DOI: 10.1023/b:jcam.0000046820.08222.83] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
We have investigated the efficacy of generating multiple sidechain conformations using a rotamer library in order to find the experimentally observed ligand binding site conformation of a protein in the presence of a bound ligand. We made use of a recently published algorithm that performs an exhaustive conformational search using a rotamer library to enumerate all possible sidechain conformations in a binding site. This approach was applied to a dataset of proteins whose structures were determined by X-ray and NMR methods. All chosen proteins had two or more structures, generally involving different bound ligands. By taking one of these structures as a reference, we were able in most cases to successfully reproduce the experimentally determined conformations of the other structures, as well as to suggest alternative low-energy conformations of the binding site. In those few cases where this procedure failed, we observed that the bound ligand had induced a high-energy conformation of the binding site. These results suggest that for most proteins that exhibit limited backbone motion, ligands tend to bind to low energy conformations of their binding sites. Our results also reveal that it is possible in most cases to use a rotamer search-based approach to predict alternative low-energy protein binding site conformations that can be used by different ligands. This opens the possibility of incorporating alternative binding site conformations to improve the efficacy of docking and structure-based drug design algorithms.
Affiliation(s)
- Ami Yi-Ching Yang
- Department of Pharmacology, University of Cambridge, Tennis Court Road, Cambridge CB2 1HQ, UK
|
474
|
Park S, Kono H, Wang W, Boder ET, Saven JG. Progress in the development and application of computational methods for probabilistic protein design. Comput Chem Eng 2005. [DOI: 10.1016/j.compchemeng.2004.07.037] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
|
475
|
Shacham S, Marantz Y, Bar-Haim S, Kalid O, Warshaviak D, Avisar N, Inbal B, Heifetz A, Fichman M, Topf M, Naor Z, Noiman S, Becker OM. PREDICT modeling and in-silico screening for G-protein coupled receptors. Proteins 2005; 57:51-86. [PMID: 15326594 DOI: 10.1002/prot.20195] [Citation(s) in RCA: 90] [Impact Index Per Article: 4.7] [Indexed: 11/11/2022]
Abstract
G-protein coupled receptors (GPCRs) are a major group of drug targets for which only one X-ray structure is known (the nondruggable rhodopsin), limiting the application of structure-based drug discovery to GPCRs. In this paper we present the details of PREDICT, a new algorithmic approach for modeling the 3D structure of GPCRs without relying on homology to rhodopsin. PREDICT, which focuses on the transmembrane domain of GPCRs, starts from the primary sequence of the receptor, simultaneously optimizing multiple 'decoy' conformations of the protein in order to find its most stable structure, culminating in a virtual receptor-ligand complex. In this paper we present a comprehensive analysis of three PREDICT models for the dopamine D2, neurokinin NK1, and neuropeptide Y Y1 receptors. A shorter discussion of the CCR3 receptor model is also included. All models were found to be in good agreement with a large body of experimental data. The quality of the PREDICT models, at least for drug discovery purposes, was evaluated by their successful utilization in in-silico screening. Virtual screening using all three PREDICT models yielded enrichment factors 9-fold to 44-fold better than random screening. That is, the PREDICT models can be used to identify active small-molecule ligands embedded in large compound libraries with an efficiency comparable to that obtained using crystal structures for non-GPCR targets.
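The enrichment factors quoted in this abstract are commonly computed as the hit rate in the selected subset divided by the hit rate expected from random selection. The sketch below uses that common definition; whether the paper used exactly this formula is an assumption, and the function name and the example numbers are hypothetical.

```python
def enrichment_factor(
    actives_in_top: int,
    n_top: int,
    total_actives: int,
    total_compounds: int,
) -> float:
    """Hit rate in the top-ranked subset relative to the library-wide hit rate."""
    if n_top <= 0 or total_actives <= 0 or total_compounds <= 0:
        raise ValueError("counts must be positive")
    # (actives_in_top / n_top) / (total_actives / total_compounds),
    # rearranged so the integer products are formed before the single division.
    return (actives_in_top * total_compounds) / (n_top * total_actives)


# Hypothetical screen: 10 known actives hidden in 10,000 compounds,
# 9 of them recovered in the 100 top-ranked molecules.
ef = enrichment_factor(actives_in_top=9, n_top=100, total_actives=10, total_compounds=10_000)
```

A value of 1.0 corresponds to random selection, so larger values indicate that the model ranks actives ahead of decoys.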
|
476
|
Saunders CT, Baker D. Recapitulation of protein family divergence using flexible backbone protein design. J Mol Biol 2005; 346:631-44. [PMID: 15670610 DOI: 10.1016/j.jmb.2004.11.062] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.3] [Received: 06/17/2004] [Revised: 11/18/2004] [Accepted: 11/22/2004] [Indexed: 11/30/2022]
Abstract
We use flexible backbone protein design to explore the sequence and structure neighborhoods of naturally occurring proteins. The method samples sequence and structure space in the vicinity of a known sequence and structure by alternately optimizing the sequence for a fixed protein backbone using rotamer-based sequence search, and optimizing the backbone for a fixed amino acid sequence using atomic-resolution structure prediction. We find that such a flexible backbone design method better recapitulates protein family sequence variation than sequence optimization on fixed backbones or randomly perturbed backbone ensembles for ten diverse protein structures. For the SH3 domain, the backbone structure variation in the family is also better recapitulated than in randomly perturbed backbones. The potential application of this method as a model of protein family evolution is highlighted by a concerted transition to the amino acid sequence in the structural core of one SH3 domain starting from the backbone coordinates of a homologous structure.
Affiliation(s)
- Christopher T Saunders
- Department of Genome Sciences, University of Washington, Box 357730, Seattle, WA 98195, USA
|
477
|
Leaver-Fay A, Kuhlman B, Snoeyink J. Rotamer-Pair Energy Calculations Using a Trie Data Structure. 2005. [DOI: 10.1007/11557067_32] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 03/07/2023]
|
478
|
Chapter 18 Computationally Assisted Protein Design. 2005. [DOI: 10.1016/s1574-1400(05)01018-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
|
479
|
Porotto M, Murrell M, Greengard O, Lawrence MC, McKimm-Breschkin JL, Moscona A. Inhibition of parainfluenza virus type 3 and Newcastle disease virus hemagglutinin-neuraminidase receptor binding: effect of receptor avidity and steric hindrance at the inhibitor binding sites. J Virol 2004; 78:13911-9. [PMID: 15564499 PMCID: PMC533954 DOI: 10.1128/jvi.78.24.13911-13919.2004] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.4] [Indexed: 11/20/2022]
Abstract
Zanamivir (4-guanidino-Neu5Ac2en [4-GU-DANA]) inhibits not only the neuraminidase activity but also the receptor interaction of the human parainfluenza virus type 3 (HPIV3) hemagglutinin-neuraminidase (HN), blocking receptor binding and subsequent fusion promotion. All activities of the HPIV3 variant ZM1 HN (T193I/I567V) are less sensitive to 4-GU-DANA's effects. The T193I mutation in HN confers both increased receptor binding and increased neuraminidase activity, as well as reduced sensitivities of both activities to 4-GU-DANA inhibition, consistent with a single site on the HN molecule carrying out both catalysis and binding. We now provide evidence that the HPIV3 variant's resistance to receptor-binding inhibition by 4-GU-DANA is related to a reduced affinity of the HN receptor-binding site for this compound as well as to an increase in the avidity of HN for the receptor. Newcastle disease virus (NDV) HN and HPIV3 HN respond differently to inhibition in ways that suggest a fundamental distinction between them. NDV HN-receptor binding is less sensitive than HPIV3 HN-receptor binding to 4-GU-DANA, while its neuraminidase activity is highly sensitive. Both HPIV3 and NDV HNs are sensitive to receptor-binding inhibition by the smaller molecule DANA. However, for NDV HN, some receptor binding cannot be inhibited. These data are consistent with the presence in NDV HN of a second receptor-binding site that is devoid of enzyme activity and has a negligible, if any, affinity for 4-GU-DANA. Avidity for the receptor contributes to resistance by allowing the receptor to compete effectively with inhibitors for interaction with HN, while the further determinant of resistance is the reduced binding of the inhibitor molecule to the binding pocket on HN. 
Based upon our data and recent three-dimensional structural information on the HPIV3 and NDV HNs, we propose mechanisms for the observed sensitivity and resistance of HN to receptor-binding inhibition and discuss the implications of these mechanisms for the distribution of HN functions.
Affiliation(s)
- Matteo Porotto
- Department of Pediatrics, Mount Sinai School of Medicine, 1 Gustave L. Levy Pl., New York, NY 10029, USA
|
480
|
Kilosanidze GT, Kutsenko AS, Esipova NG, Tumanyan VG. Analysis of forces that determine helix formation in alpha-proteins. Protein Sci 2004; 13:351-7. [PMID: 14739321 PMCID: PMC2286714 DOI: 10.1110/ps.03429104] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]
Abstract
A model for prediction of alpha-helical regions in amino acid sequences has been tested on the mainly-alpha protein structure class. The modeling represents the construction of a continuous hypothetical alpha-helical conformation for the whole protein chain, and was performed using molecular mechanics tools. The positive prediction of alpha-helical and non-alpha-helical pentapeptide fragments of the proteins is 79%. The model considers only local interactions in the polypeptide chain without the influence of the tertiary structure. It was shown that the local interaction defines the alpha-helical conformation for 85% of the native alpha-helical regions. The relative energy contributions to the energy of the model were analyzed with the finding that the van der Waals component determines the formation of alpha-helices. Hydrogen bonds remain at constant energy regardless of whether an alpha-helix or a non-alpha-helix occurs in the native protein, and do not determine the location of helical regions. In contrast to existing methods, this approach additionally permits the prediction of conformations of side chains. The model suggests the correct values for ~60% of all chi-angles of alpha-helical residues.
Affiliation(s)
- Gelena T Kilosanidze
- Microbiology and Tumor Biology Center, Karolinska Institute, Box 280, S-171 77 Stockholm, Sweden
|
481
|
Zoete V, Meuwly M, Karplus M. A Comparison of the Dynamic Behavior of Monomeric and Dimeric Insulin Shows Structural Rearrangements in the Active Monomer. J Mol Biol 2004; 342:913-29. [PMID: 15342246 DOI: 10.1016/j.jmb.2004.07.033] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.8] [Received: 01/07/2004] [Revised: 07/07/2004] [Accepted: 07/07/2004] [Indexed: 10/26/2022]
Abstract
Molecular dynamics (MD) simulations (5-10 ns in length) and normal mode analyses were performed for the monomer and dimer of native porcine insulin in aqueous solution; both starting structures were obtained from an insulin hexamer. Several simulations were done to confirm that the results obtained are meaningful. The insulin dimer is very stable during the simulation and remains very close to the starting X-ray structure; the RMS fluctuations calculated from the MD simulation agree with the experimental B-factors. Correlated motions were found within each of the two monomers; they can be explained by persistent non-bonded interactions and disulfide bridges. The correlated motions between residues B24 and B26 of the two monomers are due to non-bonded interactions between the side-chains and backbone atoms. For the isolated monomer in solution, the A chain and the helix of the B chain are found to be stable during 5 ns and 10 ns MD simulations. However, the N-terminal and the C-terminal parts of the B chain are very flexible. The C-terminal part of the B chain moves away from the X-ray conformation after 0.5-2.5 ns and exposes the N-terminal residues of the A chain that are thought to be important for the binding of insulin to its receptor. Our results thus support the hypothesis that, when monomeric insulin is released from the hexamer (or the dimer in our study), the C-terminal end of the monomer (residues B25-B30) is rearranged to allow binding to the insulin receptor. The greater flexibility of the C-terminal part of the beta chain in the B24 (Phe→Gly) mutant is in accord with the NMR results. The details of the backbone and side-chain motions are presented. The transition between the starting conformation and the more dynamic structure of the monomers is characterized by displacements of the backbone of Phe B25 and Tyr B26; of these, Phe B25 has been implicated in insulin activation.
Affiliation(s)
- Vincent Zoete
- Laboratoire de Chimie Biophysique, ISIS/Université Louis Pasteur, 8, allée Gaspard Monge, BP 70028, 67083 Strasbourg Cedex, France
|
482
|
Abstract
We measured the frequency of side-chain rotamers in 14 alpha-helical and 16 beta-barrel membrane protein structures and found that the membrane environment considerably perturbs the rotamer frequencies compared to soluble proteins. Although there are limited experimental data, we found statistically significant changes in rotamer preferences depending on the residue environment. Rotamer distributions were influenced by whether the residues were lipid or protein facing, and whether the residues were found near the N- or C-terminus. Hydrogen-bonding interactions with the helical backbone perturb the rotamer populations of Ser and His. Trp and Tyr favor side-chain conformations that allow their side chains to extend their polar atoms out of the membrane core, thereby aligning the side-chain polarity gradient with the polarity gradient of the membrane. Our results demonstrate how the membrane environment influences protein structures, providing information that will be useful in the structure prediction and design of transmembrane proteins.
Affiliation(s)
- Aaron K Chamberlain
- Department of Chemistry and Biochemistry, UCLA-DOE Center for Genomics and Proteomics, Molecular Biology Institute, University of California, Los Angeles, California 90095-1570, USA
|
483
|
Rajamani D, Thiel S, Vajda S, Camacho CJ. Anchor residues in protein-protein interactions. Proc Natl Acad Sci U S A 2004; 101:11287-92. [PMID: 15269345 PMCID: PMC509196 DOI: 10.1073/pnas.0401942101] [Citation(s) in RCA: 257] [Impact Index Per Article: 12.9] [Indexed: 11/18/2022]
Abstract
We show that the mechanism for molecular recognition requires one of the interacting proteins, usually the smaller of the two, to anchor a specific side chain in a structurally constrained binding groove of the other protein, providing a steric constraint that helps to stabilize a native-like bound intermediate. We identify the anchor residues in 39 protein-protein complexes and verify that, even in the absence of their interacting partners, the anchor side chains are found in conformations similar to those observed in the bound complex. These ready-made recognition motifs correspond to surface side chains that bury the largest solvent-accessible surface area after forming the complex (≥100 Å²). The existence of such anchors implies that binding pathways can avoid kinetically costly structural rearrangements at the core of the binding interface, allowing for a relatively smooth recognition process. Once anchors are docked, an induced fit process further contributes to forming the final high-affinity complex. This later stage involves flexible (solvent-exposed) side chains that latch to the encounter complex in the periphery of the binding pocket. Our results suggest that the evolutionary conservation of anchor side chains applies to the actual structure that these residues assume before the encounter complex and not just to their loci. Implications for protein docking are also discussed.
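The anchor criterion stated in this abstract (the side chain that buries the largest solvent-accessible surface area on complex formation, at least 100 Å²) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the per-residue surface areas are taken as precomputed inputs, and the function name, residue labels, and numbers are hypothetical.

```python
from typing import Dict, Optional, Tuple


def find_anchor(
    sasa_free: Dict[str, float],
    sasa_bound: Dict[str, float],
    min_buried: float = 100.0,  # threshold in square angstroms, from the abstract
) -> Optional[Tuple[str, float]]:
    """Return (residue, buried area) for the residue burying the most surface."""
    # Buried area = accessible area in the free protein minus accessible area
    # in the complex. Residues absent from sasa_bound count as fully buried.
    buried = {
        res: area - sasa_bound.get(res, 0.0) for res, area in sasa_free.items()
    }
    if not buried:
        return None
    residue, area = max(buried.items(), key=lambda item: item[1])
    return (residue, area) if area >= min_buried else None


# Hypothetical side-chain surface areas (square angstroms).
free = {"LYS15": 180.0, "ALA16": 60.0, "ARG17": 150.0}
bound = {"LYS15": 20.0, "ALA16": 40.0, "ARG17": 90.0}
anchor = find_anchor(free, bound)
```

In practice the surface areas would come from a solvent-accessibility program run on the free and bound structures; that step is outside this sketch.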
Affiliation(s)
- Deepa Rajamani
- Departments of Biology and Biomedical Engineering and Bioinformatics Program, Boston University, Boston, MA 02215, USA
|
484
|
Eyal E, Najmanovich R, McConkey BJ, Edelman M, Sobolev V. Importance of solvent accessibility and contact surfaces in modeling side-chain conformations in proteins. J Comput Chem 2004; 25:712-24. [PMID: 14978714 DOI: 10.1002/jcc.10420] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Indexed: 11/09/2022]
Abstract
Contact surface area and chemical properties of atoms are used to concurrently predict conformations of multiple amino acid side chains on a fixed protein backbone. The combination of surface complementarity and solvent-accessible surface accounts for van der Waals forces and solvation free energy. The scoring function is particularly suitable for modeling partially buried side chains. Both iterative and stochastic searching approaches are used. Our programs (Sccomp-I and Sccomp-S), with relatively fast execution times, correctly predict χ1 angles for 92-93% of buried residues and 82-84% for all residues, with an RMSD of approximately 1.7 Å for side chain heavy atoms. We find that the differential between the atomic solvation parameters and the contact surface parameters (including those between noncomplementary atoms) is positive; i.e., most protein atoms prefer surface contact with other protein atoms rather than with the solvent. This might correspond to the driving force for maximizing packing of the protein. The influence of the crystal packing, completeness of rotamer library and precise positioning of Cβ atoms on the accuracy of side-chain prediction are examined. The Sccomp-S and Sccomp-I programs can be accessed through the Web (http://sgedg.weizmann.ac.il/sccomp.html) and are available for several platforms.
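Accuracy figures such as the chi1 percentages above are usually computed by counting predicted angles that fall within a tolerance of the experimental value, with the difference wrapped around the 360° period. The sketch below assumes a 40° tolerance, a widely used convention that the abstract itself does not state; the function names and angle values are hypothetical.

```python
from typing import Sequence


def angle_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def chi1_accuracy(
    predicted: Sequence[float],
    native: Sequence[float],
    tolerance: float = 40.0,  # assumed cut-off in degrees
) -> float:
    """Fraction of residues whose predicted chi1 lies within the tolerance."""
    if len(predicted) != len(native) or not predicted:
        raise ValueError("need two non-empty angle lists of equal length")
    correct = sum(
        1 for p, n in zip(predicted, native) if angle_difference(p, n) <= tolerance
    )
    return correct / len(predicted)


# Hypothetical chi1 angles in degrees for three residues.
predicted = [-60.0, 175.0, 60.0]
native = [-65.0, -178.0, 180.0]
accuracy = chi1_accuracy(predicted, native)
```

The wrap-around matters for the second residue: 175° and -178° differ by only 7°, so the prediction counts as correct even though the raw values differ by 353°.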
Affiliation(s)
- Eran Eyal
- Department of Plant Sciences, Weizmann Institute of Science, 76100, Rehovot, Israel
|
485
|
Rainey JK, Goh MC. Statistically Based Reduced Representation of Amino Acid Side Chains. J Chem Inf Comput Sci 2004; 44:817-30. [PMID: 15154746 DOI: 10.1021/ci034177z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Preferred conformations of amino acid side chains have been well established through statistically obtained rotamer libraries. Typically, these provide bond torsion angles allowing a side chain to be traced atom by atom. In cases where it is desirable to reduce the complexity of a protein representation or prediction, fixing all side-chain atoms may prove unwieldy. Therefore, we introduce a general parametrization to allow positions of representative atoms (in the present study, these are terminal atoms) to be predicted directly given backbone atom coordinates. Using a large, culled data set of amino acid residues from high-resolution protein crystal structures, anywhere from 1 to 7 preferred conformations were observed for each terminal atom of the non-glycine residues. Side-chain length from the backbone Cα is one of the parameters determined for each conformation, which should itself be useful. Prediction of terminal atoms was then carried out for a second, nonredundant set of protein structures to validate the data set. Using four simple probabilistic approaches, the Monte Carlo style prediction of terminal atom locations given only backbone coordinates produced an average root mean-square deviation (RMSD) of approximately 3 Å from the experimentally determined terminal atom positions. With prediction using conditional probabilities based on the side-chain χ1 rotamer, this average RMSD was improved to 1.74 Å. The observed terminal atom conformations therefore provide reasonable and potentially highly accurate representations of side-chain conformation, offering a viable alternative to existing all-atom rotamers for any case where reduction in protein model complexity, or in the amount of data to be handled, is desired. One application of this representation with strong potential is the prediction of charge density in proteins. 
This would likely be especially valuable on protein surfaces, where side chains are much less likely to be fixed in single rotamers. Prediction of ensembles of structures provides a method to determine the probability density of charge and atom location; such a prediction is demonstrated graphically.
Affiliation(s)
- Jan K Rainey
- Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
|
486
|
Hirst WD, Abrahamsen B, Blaney FE, Calver AR, Aloj L, Price GW, Medhurst AD. Differences in the central nervous system distribution and pharmacology of the mouse 5-hydroxytryptamine-6 receptor compared with rat and human receptors investigated by radioligand binding, site-directed mutagenesis, and molecular modeling. Mol Pharmacol 2004; 64:1295-308. [PMID: 14645659 DOI: 10.1124/mol.64.6.1295] [Citation(s) in RCA: 180] [Impact Index Per Article: 9.0] [Indexed: 11/22/2022]
Abstract
There is increasing evidence for a role of 5-hydroxytryptamine-6 (5-HT6) receptors in cognitive function. In the rat and human brain, 5-HT6 receptors are widely expressed and highly enriched in the basal ganglia. However, in the mouse brain, only very low levels of 5-HT6 receptor mRNA and receptor protein, measured by TaqMan reverse transcriptase-polymerase chain reaction and selective radioligand binding, could be detected, with no evidence of enrichment in the basal ganglia. The mouse receptor was cloned and transiently expressed in human embryonic kidney 293 cells to characterize its pharmacological profile. Despite significant sequence homology between human, rat, and mouse 5-HT6 receptors, the pharmacological profile of the mouse receptor was significantly different from the rat and human receptors. Four amino acid residues, conserved in rat and human and divergent in mouse receptors, were identified, and various mutant receptors were generated and their pharmacologies studied. Residues 188 (tyrosine in mouse, phenylalanine in rat and human) in transmembrane region 5 and 290 (serine in mouse, asparagine in rat and human) in transmembrane region 6 were identified as key amino acids responsible for the different pharmacological profiles. Molecular modeling of the receptor and docking of selective and nonselective compounds was undertaken to elucidate the ligand receptor interactions. The binding pocket was predicted to be different in the mouse compared with rat and human 5-HT6 receptors, and the models were in excellent agreement with the observed mutation results and have been used extensively in the design of further selective 5-HT6 antagonists.
Affiliation(s)
- Warren D Hirst
- Neurology and GI Centre of Excellence for Drug Discovery, GlaxoSmithKline, New Frontiers Science Park, Third Avenue, Harlow, Essex, CM19 5AW, United Kingdom.
|
487
|
Calhoun JR, Kono H, Lahr S, Wang W, DeGrado WF, Saven JG. Computational design and characterization of a monomeric helical dinuclear metalloprotein. J Mol Biol 2004; 334:1101-15. [PMID: 14643669 DOI: 10.1016/j.jmb.2003.10.004] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.1] [Indexed: 11/19/2022]
Abstract
The de novo design of di-iron proteins is an important step towards understanding the diversity of function among this complex family of metalloenzymes. Previous designs of due ferro (DF) proteins have resulted in tetrameric and dimeric four-helix bundles having crystallographically well-defined structures and active-site geometries. Here, the design and characterization of DFsc, a 114 residue monomeric four-helix bundle, is presented. The backbone was modeled using previous oligomeric structures and appropriate inter-helical turns. The identities of 26 residues were predetermined, including the primary and secondary ligands in the active site, residues involved in active site accessibility, and the gamma beta gamma beta turn between helices 2 and 3. The remaining 88 amino acid residues were determined using statistical computer aided design, which is based upon a recent statistical theory of protein sequences. Rather than sampling sequences, the theory directly provides the site-specific amino acid probabilities, which are then used to guide sequence design. The resulting sequence (DFsc) expresses well in Escherichia coli and is highly soluble. Sedimentation studies confirm that the protein is monomeric in solution. Circular dichroism spectra are consistent with the helical content of the target structure. The protein is structured in both the apo and the holo forms, with the metal-bound form exhibiting increased stability. DFsc stoichiometrically binds a variety of divalent metal ions, including Zn(II), Co(II), Fe(II), and Mn(II), with micromolar affinities. 15N HSQC NMR spectra of both the apo and Zn(II) proteins reveal excellent dispersion with evidence of a significant structural change upon metal binding. DFsc is then a realization of complete de novo design, where backbone structure, activity, and sequence are specified in the design process.
Affiliation(s)
- Jennifer R Calhoun
- Department of Biochemistry and Molecular Biophysics, Johnson Foundation, School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
|
488
|
Moore GL, Maranas CD. Computational challenges in combinatorial library design for protein engineering. AIChE J 2004. [DOI: 10.1002/aic.10025] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]
|
489
|
Kaya H, Chan HS. Simple two-state protein folding kinetics requires near-Levinthal thermodynamic cooperativity. Proteins 2003; 52:510-23. [PMID: 12910451 DOI: 10.1002/prot.10506] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.9] [Indexed: 11/07/2022]
Abstract
Simple two-state folding kinetics of many small single-domain proteins are characterized by chevron plots with linear folding and unfolding arms consistent with an apparent two-state description of equilibrium thermodynamics. This phenomenon is hereby recognized as a nontrivial heteropolymer property capable of providing fundamental insight into protein energetics. Many current protein chain models, including common lattice and continuum Gō models with explicit native biases, fail to reproduce this generic protein property. Here we show that simple two-state kinetics is obtainable from models with a cooperative interplay between core burial and local conformational propensities or an extra strongly favorable energy for the native structure. These predictions suggest that intramolecular recognition in real two-state proteins is more specific than that envisioned by common Gō-like constructs with pairwise additive energies. The many-body interactions in the present kinetically two-state models lead to high thermodynamic cooperativity as measured by their van't Hoff to calorimetric enthalpy ratios, implying that the native and denatured conformational populations are well separated in enthalpy by a high free-energy barrier. It has been observed experimentally that deviations from Arrhenius behavior are often more severe for folding than for unfolding. This asymmetry may be rationalized by one of the present modeling scenarios if the effective many-body cooperative interactions stabilizing the native structure against unfolding are less dependent on temperature than the interactions that drive the folding kinetics.
Affiliation(s)
- Hüseyin Kaya
- Protein Engineering Network of Centres of Excellence, Department of Biochemistry, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
|
490
|
Abstract
The success of structural genomics initiatives requires the development and application of tools for structure analysis, prediction, and annotation. In this paper we review recent developments in these areas; specifically structure alignment, the detection of remote homologs and analogs, homology modeling and the use of structures to predict function. We also discuss various rationales for structural genomics initiatives. These include the structure-based clustering of sequence space and genome-wide function assignment. It is also argued that structural genomics can be integrated into more traditional biological research if specific biological questions are included in target selection strategies.
Affiliation(s)
- Sharon Goldsmith-Fischman
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
|
491
|
Kaya H, Chan HS. Contact order dependent protein folding rates: kinetic consequences of a cooperative interplay between favorable nonlocal interactions and local conformational preferences. Proteins 2003; 52:524-33. [PMID: 12910452 DOI: 10.1002/prot.10478] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.2] [Indexed: 11/07/2022]
Abstract
Physical mechanisms underlying the empirical correlation between relative contact order (CO) and folding rate among naturally occurring small single-domain proteins are investigated by evaluating postulated interaction schemes for a set of three-dimensional 27mer lattice protein models with 97 different CO values. Many-body interactions are constructed such that contact energies become more favorable when short chain segments sequentially adjacent to the contacting residues adopt native-like conformations. At a given interaction strength, this scheme leads to folding rates that are logarithmically well correlated with CO (correlation coefficient r = 0.914) and span more than 2.5 orders of magnitude, whereas folding rates of the corresponding Gō models with additive contact energies have much less logarithmic correlation with CO and span only approximately one order of magnitude. The present protein chain models also exhibit calorimetric cooperativity and linear chevron plots similar to those observed experimentally for proteins with apparent simple two-state folding/unfolding kinetics. Thus, our findings suggest that CO-dependent folding rates of real proteins may arise partly from a significant positive coupling between nonlocal contact favorabilities and local conformational preferences.
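For readers unfamiliar with the quantity, relative contact order is conventionally defined as the mean sequence separation of contacting residue pairs divided by the chain length. The following is a minimal Python sketch of that definition only. The function name, the representation of contacts as index pairs, and the example contact list are illustrative assumptions; the abstract does not state how contacts are enumerated on the lattice.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order: (1 / (L * N)) * sum(|i - j|) over N contacts.

    contacts     -- iterable of (i, j) residue index pairs in contact
                    (how a contact is defined is an assumption left to the caller)
    chain_length -- number of residues L in the chain
    """
    contacts = list(contacts)
    if not contacts:
        raise ValueError("at least one contact is required")
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * chain_length)


# Hypothetical contact list for a 27-residue chain (not data from the paper).
example_contacts = [(0, 3), (1, 4), (0, 5)]
example_co = relative_contact_order(example_contacts, 27)
```

Under this definition, chains whose contacts are mostly between residues close in sequence have a low CO, and chains dominated by long-range contacts have a high CO.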
Affiliation(s)
- Hüseyin Kaya
- Protein Engineering Network of Centres of Excellence, Department of Biochemistry, Faculty of Medicine, University of Toronto, Toronto, Ontario M5S 1A8, Canada
|
492
|
Canutescu AA, Shelenkov AA, Dunbrack RL. A graph-theory algorithm for rapid protein side-chain prediction. Protein Sci 2003; 12:2001-14. [PMID: 12930999 PMCID: PMC2323997 DOI: 10.1110/ps.03154503] [Citation(s) in RCA: 743] [Impact Index Per Article: 35.4] [Indexed: 10/27/2022]
Abstract
Fast and accurate side-chain conformation prediction is important for homology modeling, ab initio protein structure prediction, and protein design applications. Many methods have been presented, although only a few computer programs are publicly available. The SCWRL program is one such method and is widely used because of its speed, accuracy, and ease of use. A new algorithm for SCWRL is presented that uses results from graph theory to solve the combinatorial problem encountered in the side-chain prediction problem. In this method, side chains are represented as vertices in an undirected graph. Any two residues that have rotamers with nonzero interaction energies are considered to have an edge in the graph. The resulting graph can be partitioned into connected subgraphs with no edges between them. These subgraphs can in turn be broken into biconnected components, which are graphs that cannot be disconnected by removal of a single vertex. The combinatorial problem is reduced to finding the minimum energy of these small biconnected components and combining the results to identify the global minimum energy conformation. This algorithm is able to complete predictions on a set of 180 proteins with 34342 side chains in <7 min of computer time. The total χ1 and χ1+2 dihedral angle accuracies are 82.6% and 73.7% using a simple energy function based on the backbone-dependent rotamer library and a linear repulsive steric energy. The new algorithm will allow for use of SCWRL in more demanding applications such as sequence design and ab initio structure prediction, as well as the addition of a more complex energy function and conformational flexibility, leading to increased accuracy.
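The first decomposition step described above, splitting the residue interaction graph into connected subgraphs that can be minimized independently, can be sketched in Python as follows. This is not the SCWRL implementation: it stops at connected components, omits the further split into biconnected components, and uses exhaustive enumeration inside each component. All function names and the data layout (`self_e` as per-residue rotamer energies, `pair_e` as a dict of pairwise energy tables) are assumptions made for illustration.

```python
from itertools import product


def connected_components(n, edges):
    """Return the connected components of an undirected graph on vertices 0..n-1."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:  # iterative depth-first search
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(sorted(comp))
    return components


def total_energy(assign, self_e, pair_e):
    """Self energies plus pair energies for residue pairs that are both assigned."""
    energy = sum(self_e[r][k] for r, k in assign.items())
    for (i, j), table in pair_e.items():
        if i in assign and j in assign:
            energy += table[assign[i]][assign[j]]
    return energy


def minimize(residues, self_e, pair_e):
    """Exhaustively find the lowest-energy rotamer assignment for `residues`."""
    best = None
    for combo in product(*(range(len(self_e[r])) for r in residues)):
        assign = dict(zip(residues, combo))
        energy = total_energy(assign, self_e, pair_e)
        if best is None or energy < best[0]:
            best = (energy, assign)
    return best


def solve_by_components(self_e, pair_e):
    """Minimize each connected subgraph separately and combine the results."""
    # An edge exists only where some rotamer pair has a nonzero interaction energy.
    edges = [ij for ij, table in pair_e.items()
             if any(v != 0 for row in table for v in row)]
    total, assignment = 0.0, {}
    for comp in connected_components(len(self_e), edges):
        energy, assign = minimize(comp, self_e, pair_e)
        total += energy
        assignment.update(assign)
    return total, assignment
```

Because no edges join different components, the sum of the per-component minima equals the global minimum, while the search cost grows with the size of the largest component rather than with the whole protein.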
Affiliation(s)
- Adrian A Canutescu
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA
|
493
|
Zacharias M. Protein-protein docking with a reduced protein model accounting for side-chain flexibility. Protein Sci 2003; 12:1271-82. [PMID: 12761398 PMCID: PMC2323887 DOI: 10.1110/ps.0239303] [Citation(s) in RCA: 244] [Impact Index Per Article: 11.6] [Received: 11/15/2002] [Revised: 02/07/2003] [Accepted: 02/28/2003] [Indexed: 10/27/2022]
Abstract
A protein-protein docking approach has been developed based on a reduced protein representation with up to three pseudo atoms per amino acid residue. Docking is performed by energy minimization in rotational and translational degrees of freedom. The reduced protein representation allows an efficient search for docking minima on the protein surfaces within. During docking, an effective energy function between pseudo atoms has been used based on amino acid size and physico-chemical character. Energy minimization of protein test complexes in the reduced representation results in geometries close to experiment with backbone root mean square deviations (RMSDs) of approximately 1 to 3 Å for the mobile protein partner from the experimental geometry. For most test cases, the energy-minimized experimental structure scores among the top five energy minima in systematic docking studies when using both partners in their bound conformations. To account for side-chain conformational changes in case of using unbound protein conformations, a multicopy approach has been used to select the most favorable side-chain conformation during the docking process. The multicopy approach significantly improves the docking performance, using unbound (apo) binding partners without a significant increase in computer time. For most docking test systems using unbound partners, and without accounting for any information about the known binding geometry, a solution within approximately 2 to 3.5 Å RMSD of the full mobile partner from the experimental geometry was found among the 40 top-scoring complexes. The approach could be extended to include protein loop flexibility, and might also be useful for docking of modeled protein structures.
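The RMSD figures quoted above measure how far the docked mobile partner lies from its experimental position. A minimal Python sketch of such a measure is given below, assuming both coordinate sets are already expressed in the same reference frame (for example, with the fixed partner held in place) and list matching atoms in the same order. No superposition is performed, and the function name and coordinates are illustrative rather than taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) points.

    Assumes the two sets are in a common frame and atom i in coords_a
    corresponds to atom i in coords_b; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical backbone positions in Å: the docked copy is shifted 2 Å along x.
experimental = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
docked = [(2.0, 0.0, 0.0), (3.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
shift_rmsd = rmsd(experimental, docked)
```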
Affiliation(s)
- Martin Zacharias
- Computational Biology, School of Engineering and Science, International University Bremen, 28759 Bremen, Germany.
|
494
|
Abstract
Combinatorial protein libraries permit the examination of a wide range of sequences. Such methods are being used for de novo design and to investigate the determinants of protein folding. The exponentially large number of possible sequences, however, necessitates restrictions on the diversity of sequences in a combinatorial library. Recently, progress has been made in developing theoretical tools to bias and characterize the ensemble of sequences that fold into a given structure - tools that can be applied to the design and interpretation of combinatorial experiments.
Affiliation(s)
- Jeffery G Saven
- Department of Chemistry, University of Pennsylvania, 231 South 34 Street, Philadelphia 19104, USA.
|