1
|
Schweke H, Xu Q, Tauriello G, Pantolini L, Schwede T, Cazals F, Lhéritier A, Fernandez-Recio J, Rodríguez-Lumbreras LÁ, Schueler-Furman O, Varga JK, Jiménez-García B, Réau MF, Bonvin A, Savojardo C, Martelli PL, Casadio R, Tubiana J, Wolfson H, Oliva R, Barradas-Bautista D, Ricciardelli T, Cavallo L, Venclovas Č, Olechnovič K, Guerois R, Andreani J, Martin J, Wang X, Kihara D, Marchand A, Correia B, Zou X, Dey S, Dunbrack R, Levy E, Wodak S. Discriminating physiological from non-physiological interfaces in structures of protein complexes: A community-wide study. Proteomics 2023; 23:e2200323. [PMID: 37365936 PMCID: PMC10937251 DOI: 10.1002/pmic.202200323] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/04/2023] [Revised: 05/11/2023] [Accepted: 05/11/2023] [Indexed: 06/28/2023]
Abstract
Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.
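The evaluation described in this abstract can be illustrated with a minimal sketch. It assumes that each scoring function assigns higher values to physiological interfaces, that the ROC area is computed with the Mann-Whitney pair-counting formula, and that a consensus is formed by averaging min-max normalised scores. The abstract does not say how the consensus was actually built, so the `consensus` and `min_max` helpers and the toy numbers are hypothetical illustrations, not the authors' procedure.

```python
from typing import List, Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pair counting (ties count as 0.5).

    labels: 1 = physiological, 0 = non-physiological.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def min_max(scores: Sequence[float]) -> List[float]:
    """Rescale one scoring function to [0, 1] so functions are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def consensus(score_table: Sequence[Sequence[float]]) -> List[float]:
    """Average the normalised scores of several functions per complex.

    score_table[i][j] is the score of function i on complex j.
    This averaging rule is an assumption made for illustration.
    """
    normalised = [min_max(row) for row in score_table]
    return [sum(col) / len(normalised) for col in zip(*normalised)]


if __name__ == "__main__":
    # Toy data: six complexes, three hypothetical scoring functions.
    labels = [1, 1, 1, 0, 0, 0]
    table = [
        [0.9, 0.8, 0.3, 0.4, 0.2, 0.1],
        [5.0, 1.0, 4.0, 2.0, 3.0, 0.0],
        [0.2, 0.9, 0.8, 0.1, 0.3, 0.7],
    ]
    for row in table:
        print("single function AUC:", round(roc_auc(row, labels), 3))
    print("consensus AUC:", round(roc_auc(consensus(table), labels), 3))
```

On this toy data each individual function misorders some pairs while the averaged score does not, which mirrors, in miniature, the abstract's observation that combined scores outperform individual ones.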
Affiliation(s)
- Julia K. Varga
- Hebrew University of Jerusalem Institute for Medical Research Israel-Canada
- Jérôme Tubiana
- Tel Aviv University Blavatnik School of Computer Science
- Haim Wolfson
- Tel Aviv University Blavatnik School of Computer Science
- Xiaoqin Zou
- Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri
|
2
|
Ghersin N, Abadi S, Sabbag A, Lamash Y, Anderson RH, Wolfson H, Lessick J. The three-dimensional geometric relationship between the mitral valvar annulus and the coronary arteries as seen from the perspective of the cardiac surgeon using cardiac computed tomography. Eur J Cardiothorac Surg 2013; 44:1123-30. [DOI: 10.1093/ejcts/ezt152] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022]
|
3
|
Oron A, Wolfson H, Gunasekaran K, Nussinov R. Using DelPhi to compute electrostatic potentials and assess their contribution to interactions. Curr Protoc Bioinformatics 2008; Chapter 8:Unit 8.4. [PMID: 18428711 DOI: 10.1002/0471250953.bi0804s02] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
Abstract
There is general agreement that electrostatic interactions play a significant role in the structure and function of biological molecules. However, obtaining a quantitative estimate of the electrostatic energy requires computational models that capture the microscopic nature of the heterogeneous environment of macromolecules. This protocol elaborates on one of the common methods for calculating electrostatic energetic contributions using continuum electrostatics. The method involves solving the Poisson-Boltzmann (PB) equation numerically and treating the solute as having a homogeneous dielectric constant. To apply this method, a three-dimensional structure of the molecule derived from experimental data (crystallography, NMR) or modeling techniques is required. The protocol focuses on the DelPhi program (Accelrys Inc., San Diego), which is one of the most common programs used for estimating the electrostatic free energy contribution. A simple procedure of assigning criteria and parameters (charge distribution, solvent and solute dielectric constants, iterations, grid resolution, etc.) enables one to produce an electrostatic potential map and estimate the electrostatic free energy, although with limited accuracy.
|
4
|
Topf M, Lasker K, Webb B, Wolfson H, Chiu W, Sali A. Protein structure fitting and refinement guided by cryo-EM density. Structure 2008; 16:295-307. [PMID: 18275820 PMCID: PMC2409374 DOI: 10.1016/j.str.2007.11.016] [Citation(s) in RCA: 263] [Impact Index Per Article: 16.4] [Received: 07/26/2007] [Revised: 11/20/2007] [Accepted: 11/26/2007] [Indexed: 11/23/2022]
Abstract
For many macromolecular assemblies, both a cryo-electron microscopy map and atomic structures of its component proteins are available. Here we describe a method for fitting and refining a component structure within its map at intermediate resolution (<15 Å). The atomic positions are optimized with respect to a scoring function that includes the cross-correlation coefficient between the structure and the map as well as stereochemical and nonbonded interaction terms. A heuristic optimization that relies on a Monte Carlo search, a conjugate-gradients minimization, and simulated annealing molecular dynamics is applied to a series of subdivisions of the structure into progressively smaller rigid bodies. The method was tested on 15 proteins of known structure with 13 simulated maps and 3 experimentally determined maps. At approximately 10 Å resolution, the Cα RMSD between the initial and final structures was reduced on average by approximately 53%. The method is automated and can refine both experimental and predicted atomic structures.
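The cross-correlation term mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes both the experimental map and the density simulated from the model have already been sampled on the same grid and flattened to equal-length lists, and that the coefficient is the mean-subtracted (Pearson-style) form. The exact definition used in the paper is not given in the abstract, and the function name `cross_correlation` is hypothetical.

```python
import math
from typing import Sequence


def cross_correlation(map_a: Sequence[float], map_b: Sequence[float]) -> float:
    """Mean-subtracted cross-correlation coefficient of two density grids.

    Both inputs are voxel values on the same grid, flattened to 1D.
    Returns a value in [-1, 1]; 1 means the densities agree up to
    a positive scale factor and an offset.
    """
    if len(map_a) != len(map_b) or len(map_a) == 0:
        raise ValueError("maps must be non-empty and the same size")
    mean_a = sum(map_a) / len(map_a)
    mean_b = sum(map_b) / len(map_b)
    num = sum((a - mean_a) * (b - mean_b) for a, b in zip(map_a, map_b))
    var_a = sum((a - mean_a) ** 2 for a in map_a)
    var_b = sum((b - mean_b) ** 2 for b in map_b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a constant map has no defined correlation")
    return num / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    experimental = [0.0, 0.1, 0.8, 1.0, 0.7, 0.1]
    well_fitted = [0.0, 0.2, 0.9, 1.1, 0.6, 0.0]
    shifted = [0.9, 1.1, 0.6, 0.0, 0.0, 0.2]
    print("well fitted:", round(cross_correlation(experimental, well_fitted), 3))
    print("shifted:    ", round(cross_correlation(experimental, shifted), 3))
```

In a fitting loop of the kind the abstract describes, a score like this would be re-evaluated after each trial move of a rigid body, with higher values indicating a better fit to the map.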
Affiliation(s)
- Maya Topf
- School of Crystallography, Birkbeck College, University of London, Malet Street, London WC1E 7HX, United Kingdom.
|
5
|
Abstract
Currently there is increasing interest in nanostructures and their design. Nanostructure design involves the ability to predictably manipulate the properties of the self-assembly of autonomous units. Autonomous units have preferred conformational states. The units can be synthetic material science-based or derived from functional biological macromolecules. Autonomous biological building blocks with available structures provide an extremely rich and useful resource for design. For proteins, the structural databases contain large libraries of protein molecules and their building blocks with a range of shapes, surfaces, and chemical properties. The introduction of engineered synthetic residues or short peptides into these can expand the available chemical space and enhance the desired properties. Here we focus on the principles of nanostructure design with protein building blocks.
Affiliation(s)
- Chung-Jung Tsai
- Basic Research Program, SAIC-Frederick, Inc., Center for Cancer Research Nanobiology Program, NCI-Frederick, Frederick, Maryland 21702, USA
|
6
|
Abstract
MOTIVATION: A fast-growing number of non-coding RNAs have recently been discovered to play essential roles in many cellular processes. As with proteins, understanding the functions of these active RNAs requires methods for analyzing their tertiary structures. However, in contrast to the wide range of structure-based approaches available for proteins, there is still a lack of methods for studying RNA structures. RESULTS: We present a new computational method named ARTS (alignment of RNA tertiary structures). The method compares two nucleic acid structures (RNAs or DNAs) and detects a priori unknown common substructures. These substructures can be either large global folds containing hundreds and even thousands of nucleotides or small local tertiary motifs with at least two successive base pairs. To the best of our knowledge, this is the first method of this type. The method is highly efficient and was used to conduct an all-against-all comparison of all the RNA structures currently available in the Protein Data Bank. AVAILABILITY: The program, a web server and supplementary information are available at http://bioinfo3d.cs.tau.ac.il/ARTS
Affiliation(s)
- Oranit Dror
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, Israel.
|
7
|
Haspel N, Zanuy D, Zheng J, Aleman C, Wolfson H, Nussinov R. Changing the charge distribution of beta-helical-based nanostructures can provide the conditions for charge transfer. Biophys J 2007; 93:245-53. [PMID: 17416628 PMCID: PMC1914416 DOI: 10.1529/biophysj.106.100644] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
In this work we present a computational approach to the design of nanostructures made of structural motifs taken from left-handed beta-helical proteins. Previously, we suggested a structural model based on the self-assembly of motifs taken from Escherichia coli galactoside acetyltransferase (Protein Data Bank 1krr, chain A, residues 131-165, denoted krr1), which produced a very stable nanotube in molecular dynamics simulations. Here we modify this model by changing the charge distribution in the inner core of the system and testing the effect of this change on the structural arrangement of the construct. Our results demonstrate that it is possible to generate the proper conditions for charge transfer inside nanotubes based on assemblies of the krr1 segment. The electronic transfer would be achieved by introducing different histidine ionization states in selected positions of the internal core of the construct, in addition to specific mutations with charged amino acids that together will allow the formation of coherent networks of aromatic ring stacking, salt bridges, and hydrogen bonds.
Affiliation(s)
- Nurit Haspel
- School of Computer Science Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
8
|
|
9
|
Dror O, Lasker K, Nussinov R, Wolfson H. EMatch: an efficient method for aligning atomic resolution subunits into intermediate-resolution cryo-EM maps of large macromolecular assemblies. Acta Crystallogr D Biol Crystallogr 2007; 63:42-9. [PMID: 17164525 PMCID: PMC2483490 DOI: 10.1107/s0907444906041059] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 02/09/2006] [Accepted: 10/08/2006] [Indexed: 11/22/2022]
Abstract
Structural analysis of biological machines is essential for inferring their function and mechanism. Nevertheless, owing to their large size and instability, deciphering the atomic structure of macromolecular assemblies is still considered a challenging task that cannot keep up with the rapid advances in the protein-identification process. In contrast, structural data at lower resolution are becoming increasingly available owing to recent advances in cryo-electron microscopy (cryo-EM) techniques. Once a cryo-EM map is acquired, one of the basic questions asked is what the folds of the components in the assembly are and what their configuration is. Here, a novel knowledge-based computational method, named EMatch, for tackling this task for cryo-EM maps at 6-10 Å resolution is presented. The method recognizes and locates possible atomic resolution structural homologues of protein domains in the assembly. The strengths of EMatch are demonstrated on a cryo-EM map of native GroEL at 6 Å resolution.
Affiliation(s)
- Oranit Dror
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Keren Lasker
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Ruth Nussinov
- Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Basic Research Program, SAIC-Frederick, Center for Cancer Research Nanobiology Program, NCI-Frederick, Building 469, Room 151, Frederick, MD 21702 USA
- Haim Wolfson
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
10
|
Haspel N, Zanuy D, Alemán C, Wolfson H, Nussinov R. De Novo Tubular Nanostructure Design Based on Self-Assembly of β-Helical Protein Motifs. Structure 2006; 14:1137-48. [PMID: 16843895 DOI: 10.1016/j.str.2006.05.016] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/16/2006] [Revised: 04/06/2006] [Accepted: 05/01/2006] [Indexed: 12/01/2022]
Abstract
We present an approach for designing self-assembled nanostructures from naturally occurring building block segments obtained from native protein structures. We focus on structural motifs from left-handed beta-helical proteins. We selected 17 motifs. Copies of each of the motifs are stacked one atop the other. The obtained structures were simulated for long periods by using Molecular Dynamics to test their ability to retain their organization over time. We observed that a structural model based on the self-assembly of a motif from E. coli galactoside acetyltransferase produced a very stable tube. We studied the interactions that help maintain the conformational stability of the systems, focusing on the role of specific amino acids at specific positions. Analysis of these systems and a mutational study of selected candidates revealed that the presence of proline and glycine residues in the loops of beta-helical structures greatly enhances the structural stability of the systems.
Affiliation(s)
- Nurit Haspel
- School of Computer Science, Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
11
|
Abstract
Correlated mutations have been repeatedly exploited for intramolecular contact map prediction. Over the last decade these efforts yielded several methods for measuring correlated mutations. Nevertheless, the application of correlated mutations for the prediction of intermolecular interactions has not yet been explored. This gap is due to several obstacles, such as the availability of 3D complexes, paralog discrimination, and the availability of sequence pairs that are required for inter- but not intramolecular analyses. Here we selected for analysis fusion protein families that bypass some of these obstacles. We find that several correlated mutation measurements yield reasonable accuracy for intramolecular contact map prediction on the fusion dataset. However, the accuracy level drops sharply in intermolecular contact prediction. This drop in accuracy does not always occur. In the Cohesin-Dockerin family, reasonable accuracy is achieved in the prediction of both intra- and intermolecular contacts. The Cohesin-Dockerin family is well suited for correlated mutation analysis. However, because this family constitutes a special case (it has radical mutations, has domain repeats, within each species each Dockerin domain interacts with each Cohesin domain, see below), the successful prediction in this family does not point to a general potential in using correlated mutations for predicting intermolecular contacts. Overall, the results of our study indicate that current methodologies of correlated mutation analysis are not suitable for large-scale intermolecular contact prediction, and thus cannot assist in docking. With current measurements, sequence availability, sequence annotations, and underdeveloped sequence pairing methods, correlated mutations can yield reasonable accuracy only for a handful of families.
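To make the notion of a correlated mutation measurement concrete, here is a minimal sketch using mutual information between two columns of a multiple sequence alignment. Mutual information is one widely used measure of column covariation; the abstract does not name the specific measurements that were evaluated, so this choice, the function name `mutual_information`, and the toy columns are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(col_a: str, col_b: str) -> float:
    """Mutual information (in bits) between two alignment columns.

    col_a[k] and col_b[k] are the residues found at the two positions
    in the k-th sequence (or sequence pair) of the alignment.
    """
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Toy columns from four aligned sequences.
    covarying = mutual_information("AAVV", "KKEE")    # residues change together
    independent = mutual_information("AAVV", "KEKE")  # no co-variation
    print("covarying columns:  ", covarying)
    print("independent columns:", independent)
```

For intermolecular prediction, the two columns would come from different proteins, which is why the abstract stresses that correctly paired sequences are required: each index `k` must refer to the same interacting pair.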
Affiliation(s)
- Inbal Halperin
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
12
|
Alemán C, Zanuy D, Jiménez AI, Cativiela C, Haspel N, Zheng J, Casanovas J, Wolfson H, Nussinov R. Concepts and schemes for the re-engineering of physical protein modules: generating nanodevices via targeted replacements with constrained amino acids. Phys Biol 2006; 3:S54-62. [PMID: 16582465 DOI: 10.1088/1478-3975/3/1/s06] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Abstract
Physically building complex multi-molecular structures from naturally occurring biological macromolecules has aroused a great deal of interest. Here we focus on nanostructures composed of re-engineered, natural 'foldamer' building blocks. Our aim is to provide some of the underlying concepts and schemes for crafting structures utilizing such conformationally relatively stable molecular components. We describe how, via chemical biology strategies, it is further possible to chemically manipulate the foldamer building blocks toward specific shape-driven structures, which in turn could be used toward potential-designed functions. We outline the criteria in choosing candidate foldamers from the vast biological repertoire, and how to enhance their stability through selected targeted replacements by non-proteinogenic conformationally constrained amino acids. These approaches combine bioinformatics, high performance computations and mathematics with synthetic organic chemistry. The resulting artificially engineered self-organizing molecular scale structures take advantage of nature's nanobiology toolkit and at the same time improve on it, since their new targeted function differs from that optimized by evolution. The major challenge facing nanobiology is to be able to exercise fine control over the performance of these target-specific molecular machines.
Affiliation(s)
- Carlos Alemán
- Departament d'Enginyeria Química, ETS d'Enginyeria Industrial de Barcelona, Universitat Politècnica de Catalunya, Diagonal 647, Barcelona E-08028, Spain.
|
13
|
Benyamini H, Gunasekaran K, Wolfson H, Nussinov R. Fibril modelling by sequence and structure conservation analysis combined with protein docking techniques: beta(2)-microglobulin amyloidosis. Biochim Biophys Acta 2005; 1753:121-30. [PMID: 16107326 DOI: 10.1016/j.bbapap.2005.07.012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 06/03/2005] [Revised: 07/14/2005] [Accepted: 07/17/2005] [Indexed: 11/22/2022]
Abstract
Obtaining atomic resolution structural models of amyloid fibrils is currently impossible, yet crucial for our understanding of the amyloid mechanism. Different pathways in the transformation of a native globular domain to an amyloid fibril invariably involve domain destabilization. Hence, locating the unstable segments of a domain is important for understanding its amyloidogenic transformation and possibly controlling it. Since relative conservation is suggested to relate to local stability, we performed an extensive sequence and structure conservation analysis of the beta(2)-microglobulin (beta(2)-m) domain. Our dataset includes 51 high resolution structures belonging to the "C1 set domain" family and 132 clustered PSI-BLAST search results. Segments of the beta(2)-m domain corresponding to strands A (residues 12-18), D (45-55) and G (91-95) were found to be less conserved and stable, while the central strands B (residues 22-28), C (36-41), E (62-70) and F (78-83) were found to be conserved and stable. Our findings are supported by accumulating observations from various experimental methods, including urea denaturation, limited proteolysis, H/D exchange and structure determination by both NMR and X-ray crystallography. We used our conservation findings together with experimental literature information to suggest a structural model for the polymerized unit of beta(2)-m. Pairwise protein docking and subsequent monomer stacking in the same manner suggest a fibril model consistent with the cross-beta structure.
Affiliation(s)
- Hadar Benyamini
- Bioinformatics Unit, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
14
|
Haspel N, Zanuy D, Ma B, Wolfson H, Nussinov R. A Comparative Study of Amyloid Fibril Formation by Residues 15–19 of the Human Calcitonin Hormone: A Single β-Sheet Model with a Small Hydrophobic Core. J Mol Biol 2005; 345:1213-27. [PMID: 15644216 DOI: 10.1016/j.jmb.2004.11.002] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Received: 12/10/2003] [Revised: 05/19/2004] [Accepted: 11/01/2004] [Indexed: 11/30/2022]
Abstract
Experimentally, the human calcitonin hormone (hCT) can form highly stable amyloid protofibrils. Further, a peptide consisting of hCT residues 15-19, DFNKF, was shown to create highly ordered fibrils, similar to those formed by the entire hormone sequence. However, there are limited experimental data regarding the detailed 3D arrangement of either of these fibrils. We have modeled the DFNKF protofibril using molecular dynamics simulations. We tested the stabilities of single-sheet and of various multi-sheet models. Remarkably, our most ordered and stable model consists of a parallel-stranded, single beta-sheet with a relatively insignificant hydrophobic core. We investigate the chemical and physical interactions responsible for the high structural organization of this single beta-sheet amyloid fibril. We observe that the most important chemical interactions contributing to the stability of the DFNKF organization are electrostatic, specifically between the Lys and the C terminus, between the Asp and the N terminus, and a hydrogen bond network between the Asn side-chains of adjacent strands. Additionally, we observe hydrophobic and aromatic pi stacking interactions. We further simulated truncated filaments, FNKF and DFNK. Our tetra-peptide mutant simulations assume models similar to the penta-peptide. Experimentally, FNKF does not create fibrils while DFNK does, albeit shorter and less ordered than those of DFNKF. In the simulations, the FNKF system was less stable than DFNK and DFNKF. DFNK also lost many of its original interactions, becoming less organized; however, many contacts were maintained. Thus, our results emphasize the role played by specific amino acid interactions. To further study specific interactions, we have mutated the penta-peptide, simulating DANKF, DFNKA and EFNKF. Here we describe the model, its relationship to experiment and its implications for amyloid organization.
Affiliation(s)
- Nurit Haspel
- School of Computer Science, Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
|
15
|
Abstract
The possibility is addressed that protein folding and function may be related via regions that are critical for both folding and function. This approach is based on the building blocks folding model that describes protein folding as binding events of conformationally fluctuating building blocks. Within these, we identify building block fragments that are critical for achieving the native fold. A library of such critical building blocks (CBBs) is constructed. Then, it is asked whether the functionally important residues fall in these CBB fragments. We find that for over two-thirds of the proteins in our library with available functional information, the catalytic or binding site residues lie within the CBB regions. From the evolutionary standpoint, a folding-function relationship is advantageous, since the need to guard against mutations is limited to one region. Furthermore, conformationally similar CBBs are found in globally unrelated proteins with different functions. Hence, substituting CBBs may lead to designed proteins with altered functions. We further find that the CBBs in our library are conformationally unstable.
Affiliation(s)
- Adi Barzilai
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
|
16
|
Abstract
We present a novel method for multiple alignment of protein structures and detection of structural motifs. To date, only a few methods are available for addressing this task. Most of them are based on a series of pairwise comparisons. In contrast, MASS (Multiple Alignment by Secondary Structures) considers all the given structures at the same time. Exploiting the secondary structure representation aids in filtering out noisy results and in making the method highly efficient and robust. MASS disregards the sequence order of the secondary structure elements. Thus, it can find non-sequential and even non-topological structural motifs. An important novel feature of MASS is subset alignment detection: It does not require that all the input molecules be aligned. Rather, MASS is capable of detecting structural motifs shared only by a subset of the molecules. Given its high efficiency and capability of detecting subset alignments, MASS is suitable for a broad range of challenging applications: It can handle large-scale protein ensembles (on the order of tens) that may be heterogeneous, noisy, topologically unrelated and contain structures of low resolution.
Affiliation(s)
- O Dror
- School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel.
|
17
|
Halperin I, Wolfson H, Nussinov R. Protein-protein interactions; coupling of structurally conserved residues and of hot spots across interfaces. Implications for docking. Structure 2004; 12:1027-38. [PMID: 15274922 DOI: 10.1016/j.str.2004.04.009] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Received: 12/30/2003] [Revised: 04/01/2004] [Accepted: 04/01/2004] [Indexed: 11/30/2022]
Abstract
Hot spot residues contribute dominantly to protein-protein interactions. Statistically, conserved residues correlate with hot spots, and their occurrence can distinguish between binding sites and the remainder of the protein surface. The hot spot and conservation analyses have been carried out on one side of the interface. Here, we show that both experimental hot spots and conserved residues tend to couple across two-chain interfaces. Intriguingly, the local packing density around both hot spots and conserved residues is higher than expected. We further observe a correlation between local packing density and experimental ΔΔG. Favorable conserved pairs include Gly coupled with aromatics, charged and polar residues, as well as aromatic residue coupling. Remarkably, charged residue couples are underrepresented. Overall, protein-protein interactions appear to consist of regions of high and low packing density, with the hot spots organized in the former. The high local packing density in binding interfaces is reminiscent of protein cores.
Affiliation(s)
- Inbal Halperin
- Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel-Aviv University, Tel-Aviv 69978, Israel
|
18
|
Keskin O, Tsai CJ, Wolfson H, Nussinov R. A new, structurally nonredundant, diverse data set of protein-protein interfaces and its implications. Protein Sci 2004; 13:1043-55. [PMID: 15044734 PMCID: PMC2280042 DOI: 10.1110/ps.03484604] [Citation(s) in RCA: 144] [Impact Index Per Article: 7.2] [Received: 10/13/2003] [Revised: 12/23/2003] [Accepted: 01/09/2004] [Indexed: 10/26/2022]
Abstract
Here, we present a diverse, structurally nonredundant data set of two-chain protein-protein interfaces derived from the PDB. Using a sequence order-independent structural comparison algorithm and hierarchical clustering, 3799 interface clusters are obtained. These yield 103 clusters with at least five nonhomologous members. We divide the clusters into three types. In Type I clusters, the global structures of the chains from which the interfaces are derived are also similar. This cluster type is expected because, in general, related proteins associate in similar ways. In Type II, the interfaces are similar; however, remarkably, the overall structures and functions of the chains are different. The functional spectrum is broad, from enzymes/inhibitors to immunoglobulins and toxins. The fact that structurally different monomers associate in similar ways, suggests "good" binding architectures. This observation extends a paradigm in protein science: It has been well known that proteins with similar structures may have different functions. Here, we show that it extends to interfaces. In Type III clusters, only one side of the interface is similar across the cluster. This structurally nonredundant data set provides rich data for studies of protein-protein interactions and recognition, cellular networks and drug design. In particular, it may be useful in addressing the difficult question of what are the favorable ways for proteins to interact. (The data set is available at http://protein3d.ncifcrf.gov/~keskino/ and http://home.ku.edu.tr/~okeskin/INTERFACE/INTERFACES.html.)
Affiliation(s)
- Ozlem Keskin
- NCI-Frederick, Building 469, Room 151, Frederick, MD 21702, USA
|
19
|
Halperin I, Wolfson H, Nussinov R. SiteLight: binding-site prediction using phage display libraries. Protein Sci 2003; 12:1344-59. [PMID: 12824481 PMCID: PMC2323941 DOI: 10.1110/ps.0237103] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.7] [Received: 10/18/2002] [Revised: 04/03/2003] [Accepted: 04/17/2003] [Indexed: 10/27/2022]
Abstract
Phage display enables the presentation of a large number of peptides on the surface of phage particles. Such libraries can be tested for binding to target molecules of interest by means of affinity selection. Here we present SiteLight, a novel computational tool for binding site prediction using phage display libraries. SiteLight is an algorithm that maps the 1D peptide library onto a three-dimensional (3D) protein surface. It is applicable to complexes made up of a protein Template and any type of molecule termed Target. Given the three-dimensional structure of a Template and a collection of sequences derived from biopanning against the Target, the Template interaction site with the Target is predicted. We have created a large diverse data set for assessing the ability of SiteLight to correctly predict binding sites. SiteLight predictive mapping enables discrimination between the binding and nonbinding parts of the surface. This prediction can be used to effectively reduce the surface by 75% without excluding the binding site. In 63% of the cases we have tested, there is at least one binding site prediction that overlaps the interface by at least 50%. These results suggest the applicability of phage display libraries for automated binding site prediction on three-dimensional structures. For most effective binding site prediction we propose using a random phage display library twice, to scan both binding partners of a given complex. The derived peptides are mapped to the other binding partner (now used as a Template). Here, the surface of each partner is reduced by 75%, focusing their relative positions with respect to each other significantly. Such information can be utilized to improve docking algorithms and scoring functions.
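The two figures quoted in this abstract (a surface reduced by 75%, and a prediction that overlaps the interface by at least 50%) can be illustrated with a small sketch. The abstract does not define the overlap measure, so the definition used here (fraction of true interface residues covered by the predicted patch), the function names and the residue numbers are all assumptions for illustration, not SiteLight's own code.

```python
def surface_reduction(surface, predicted):
    """Fraction of the surface discarded when only the predicted patch is kept."""
    surface, predicted = set(surface), set(predicted)
    return 1.0 - len(predicted & surface) / len(surface)


def interface_overlap(predicted, interface):
    """Fraction of true interface residues covered by the prediction (assumed definition)."""
    predicted, interface = set(predicted), set(interface)
    return len(predicted & interface) / len(interface)


# Hypothetical residue ids, chosen only to make the arithmetic easy to follow.
surface = range(1, 101)      # 100 surface residues
interface = range(10, 30)    # 20 residues in the true binding site
predicted = range(15, 40)    # 25 residues in the predicted patch

reduction = surface_reduction(surface, predicted)
overlap = interface_overlap(predicted, interface)
print(f"surface reduced by {reduction:.0%}, interface overlap {overlap:.0%}")
```

Under this assumed definition, a prediction would count as a success in the abstract's sense when `overlap` is at least 0.5.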
Affiliation(s)
- Inbal Halperin
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine and
- Haim Wolfson
- School of Computer Science, Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel
- Ruth Nussinov
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine and
- Laboratory of Experimental and Computational Biology, Intramural Research Support Program, SAIC, Inc., NCI-Frederick, Frederick, Maryland 21702, USA
|
20
|
Benyamini H, Gunasekaran K, Wolfson H, Nussinov R. Beta2-microglobulin amyloidosis: insights from conservation analysis and fibril modelling by protein docking techniques. J Mol Biol 2003; 330:159-74. [PMID: 12818210 DOI: 10.1016/s0022-2836(03)00557-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 12/26/2022]
Abstract
Current data suggest that globular domains may form amyloids via different mechanisms. Nevertheless, there are indications that the initiation of the process takes place invariably in the less stable segments of a protein domain. We have studied the sequence and structural conservation of beta(2)-microglobulin, which deposits into fibrils in dialysis-related amyloidosis. The dataset includes 51 high-resolution non-redundant structures of the antibody constant domain-like proteins (C1) and 132 related sequences. We describe a set of 30 conserved residues. Among them, 23 are conserved structurally, 16 are conserved sequentially and nine are conserved both sequentially and structurally. Strands A (12-18), G (91-95) and D (45-55) are the less conserved and less stable segments of the domain, while strands B (22-28), C (36-41), E (62-70) and F (78-83) are the conserved and stable segments. We find that the conserved residues form a cluster with a network of interactions. The observed pattern of conservation is consistent with experimental data including H/D exchange, urea denaturation and limited proteolysis that suggest that strands A and G do not participate in the amyloid fibril. Additionally, the low conservation of strand D is consistent with the observation that this strand may acquire different conformations as seen in crystal structures of bound and isolated beta(2)-microglobulin. We used a docking technique to suggest a model for a fibril via stacking of beta(2)-microglobulin monomers. Our analysis suggests that the favored monomer building block for fibril elongation is the conformation of the isolated beta(2)-microglobulin, without the beta-bulge on strand D and without strands A and G participating in the fibril beta-sheet structure. This monomer retains all the conserved residues and their network of interactions, increasing the likelihood of its existence in solution. The inter-strand interaction between the two (monomer) building blocks forms a new continuous beta-sheet such that addition of monomers results in a fibril model that has the characteristic cross-beta structure.
Affiliation(s)
- Hadar Benyamini
- Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Sackler Institute of Molecular Medicine, Tel Aviv University, Israel
|
21
|
Haspel N, Tsai CJ, Wolfson H, Nussinov R. Reducing the computational complexity of protein folding via fragment folding and assembly. Protein Sci 2003; 12:1177-87. [PMID: 12761388 PMCID: PMC2323902 DOI: 10.1110/ps.0232903] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.0] [Received: 09/20/2002] [Revised: 12/23/2002] [Accepted: 02/23/2003] [Indexed: 10/27/2022]
Abstract
Understanding, and ultimately predicting, how a 1-D protein chain reaches its native 3-D fold has been one of the most challenging problems during the last few decades. Data increasingly indicate that protein folding is a hierarchical process. Hence, the question arises as to whether we can use the hierarchical concept to reduce the practically intractable computational times. For such a scheme to work, the first step is to cut the protein sequence into fragments that form local minima on the polypeptide chain. The conformations of such fragments in solution are likely to be similar to those when the fragments are embedded in the native fold, although alternate conformations may be favored during the mutual stabilization in the combinatorial assembly process. Two elements are needed for such cutting: (1) a library of (clustered) fragments derived from known protein structures and (2) an assignment algorithm that selects optimal combinations to "cover" the protein sequence. The next two steps in hierarchical folding schemes, not addressed here, are the combinatorial assembly of the fragments and finally, optimization of the obtained conformations. Here, we address the first step in a hierarchical protein-folding scheme. The input is a target protein sequence and a library of fragments created by clustering building blocks that were generated by cutting all protein structures. The output is a set of cutout fragments. We briefly outline a graph theoretic algorithm that automatically assigns building blocks to the target sequence, and we describe a sample of the results we have obtained.
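The assignment step described in this abstract, choosing a combination of candidate fragments that "covers" the sequence, can be sketched as a small optimization. The abstract only says the authors use a graph theoretic algorithm, so the version below substitutes a standard technique, weighted interval scheduling by dynamic programming; the fragment tuples and scores are hypothetical.

```python
from bisect import bisect_right


def best_cover(fragments):
    """Pick non-overlapping fragments with the highest total score.

    fragments: list of (start, end, score) with `end` exclusive.
    Returns (total_score, chosen_fragments).
    """
    frags = sorted(fragments, key=lambda f: f[1])
    ends = [f[1] for f in frags]
    # best[i] holds the optimum over the first i fragments (sorted by end).
    best = [(0.0, [])]
    for i, (start, end, score) in enumerate(frags):
        # Number of earlier fragments that end at or before this one starts.
        j = bisect_right(ends, start, 0, i)
        take = (best[j][0] + score, best[j][1] + [(start, end, score)])
        skip = best[i]
        best.append(take if take[0] > skip[0] else skip)
    return best[-1]


# Hypothetical candidate building blocks on a 25-residue sequence.
candidates = [(0, 10, 5.0), (5, 15, 6.0), (10, 20, 4.0), (15, 25, 3.0)]
total, chosen = best_cover(candidates)
print(total, chosen)
```

A real scheme would score fragments by predicted stability and might allow small overlaps or gaps; this sketch keeps only the combinatorial core.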
Affiliation(s)
- Nurit Haspel
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
|
22
|
Ma B, Elkayam T, Wolfson H, Nussinov R. Protein-protein interactions: structurally conserved residues distinguish between binding sites and exposed protein surfaces. Proc Natl Acad Sci U S A 2003; 100:5772-7. [PMID: 12730379 PMCID: PMC156276 DOI: 10.1073/pnas.1030237100] [Citation(s) in RCA: 419] [Impact Index Per Article: 20.0] [Indexed: 11/18/2022] Open
Abstract
Polar residue hot spots have been observed at protein-protein binding sites. Here we show that hot spots occur predominantly at the interfaces of macromolecular complexes, distinguishing binding sites from the remainder of the surface. Consequently, hot spots can be used to define binding epitopes. We further show a correspondence between energy hot spots and structurally conserved residues. The number of structurally conserved residues, particularly of high ranking energy hot spots, increases with the binding site contact size. This finding may suggest that effectively dispersing hot spots within a large contact area, rather than compactly clustering them, may be a strategy to sustain essential key interactions while still allowing certain protein flexibility at the interface. Thus, most conserved polar residues at the binding interfaces confer rigidity to minimize the entropic cost on binding, whereas surrounding residues form a flexible cushion. Furthermore, our finding that similar residue hot spots occur across different protein families suggests that affinity and specificity are not necessarily coupled: higher affinity does not directly imply greater specificity. Conservation of Trp on the protein surface indicates a highly likely binding site. To a lesser extent, conservation of Phe and Met also implies a binding site. For all three residues, there is a significant conservation in binding sites, whereas there is no conservation on the exposed surface. A hybrid strategy, mapping sequence alignment onto a single structure, illustrates the possibility of binding site identification around these three residues.
Affiliation(s)
- Buyong Ma
- Basic Research Program, SAIC-Frederick, Inc., Laboratory of Experimental and Computational Biology, National Cancer Institute, Frederick, MD 21702, USA
|
23
|
Abstract
We have previously presented a building block folding model. The model postulates that protein folding is a hierarchical top-down process. The basic unit from which a fold is constructed, referred to as a hydrophobic folding unit, is the outcome of combinatorial assembly of a set of "building blocks." Results obtained by the computational cutting procedure yield fragments that are in agreement with those obtained experimentally by limited proteolysis. Here we show that as expected, proteins from the same family give very similar building blocks. However, different proteins can also give building blocks that are similar in structure. In such cases the building blocks differ in sequence, stability, contacts with other building blocks, and in their 3D locations in the protein structure. This result, which we have repeatedly observed in many cases, leads us to conclude that while a building block is influenced by its environment, nevertheless, it can be viewed as a stand-alone unit. For small-sized building blocks existing in multiple conformations, interactions with sister building blocks in the protein will increase the population time of the native conformer. With this conclusion in hand, it is possible to develop an algorithm that predicts the building block assignment of a protein sequence whose structure is unknown. Toward this goal, we have created sequentially nonredundant databases of building block sequences. A protein sequence can be aligned against these, in order to be matched to a set of potential building blocks.
Affiliation(s)
- Nurit Haspel
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
24
|
Abstract
The mechanism through which globular proteins transform into amyloid fibrils is still not understood. Here we analyze the structure and sequence conservation to assess the differential stability of segments from two structurally related protein families: the amyloidogenic gelsolin-like and its structurally related cofilin-like. The two families belong to the actin depolymerizing proteins, with a central beta-sheet stacked between 2 and 4 alpha-helices. Although sequentially remote, the two families share regions of high and low conservation and stability. Our results show a highly conserved hydrophobic and aromatic cluster, located at a central buried beta-hairpin. The geometry of the aromatic residues with respect to each other is strictly conserved, suggesting involvement in strand registering and beta-sheet stabilization. Consistent with experiment, we find a region of weak conservation and stability at one of the exposed beta-strands (strand B in the gelsolin-like family). This region was recently found to be affected by a point mutation-mediated destabilization of the human gelsolin domain 2, which facilitates the first proteolytic event in the formation of the amyloidogenic fragment. Thus, both experimental and computational conservation analyses suggest that this unstable region may constitute a first step in amyloid formation. Our analysis uses a recently developed multiple-structure comparison algorithm in which molecules are aligned simultaneously.
Affiliation(s)
- Hadar Benyamini
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
25
|
Abstract
The docking field has come of age. The time is ripe to present the principles of docking, reviewing the current state of the field. Two reasons are largely responsible for the maturity of the computational docking area. First, the early optimism that the very presence of the "correct" native conformation within the list of predicted docked conformations signals a near solution to the docking problem, has been replaced by the stark realization of the extreme difficulty of the next scoring/ranking step. Second, in the last couple of years more realistic approaches to handling molecular flexibility in docking schemes have emerged. As in folding, these derive from concepts abstracted from statistical mechanics, namely, populations. Docking and folding are interrelated. From the purely physical standpoint, binding and folding are analogous processes, with similar underlying principles. Computationally, the tools developed for docking will be tremendously useful for folding. For large, multidomain proteins, domain docking is probably the only rational way, mimicking the hierarchical nature of protein folding. The complexity of the problem is huge. Here we divide the computational docking problem into its two separate components. As in folding, solving the docking problem involves efficient search (and matching) algorithms, which cover the relevant conformational space, and selective scoring functions, which are both efficient and effectively discriminate between native and non-native solutions. It is universally recognized that docking of drugs is immensely important. However, protein-protein docking is equally so, relating to recognition, cellular pathways, and macromolecular assemblies. Proteins function when they are bound to other molecules. Consequently, we present the review from both the computational and the biological points of view. 
Although large, it covers only partially the extensive body of literature, relating to small (drug) and to large protein-protein molecule docking, to rigid and to flexible. Unfortunately, when reviewing these, a major difficulty in assessing the results is the non-uniformity in the formats in which they are presented in the literature. Consequently, we further propose a way to rectify it here.
Affiliation(s)
- Inbal Halperin
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
26
|
Tsai CD, Ma B, Kumar S, Wolfson H, Nussinov R. Protein folding: binding of conformationally fluctuating building blocks via population selection. Crit Rev Biochem Mol Biol 2002; 36:399-433. [PMID: 11724155 DOI: 10.1080/20014091074228] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.5] [Indexed: 10/25/2022]
Abstract
Here we review different aspects of the protein folding literature. We present a broad range of observations, showing them to be consistent with a general hierarchical protein folding model. In such a model, local, relatively stable, conformationally fluctuating building blocks bind through population selection to yield the native state. The model includes several components: (1) the fluctuating building blocks that constitute local minima along the polypeptide chain, which even if unstable still possess higher population times than all alternate conformations; (2) the landscape around the bottom of the funnels; (3) the consideration that protein folding involves intramolecular recognition; (4) the observation that similar landscapes occur for folding and for binding; and (5) the recognition that the landscape is dynamic, changing with the conditions. The model considers protein folding to be guided by native interactions. The reviewed literature includes the effects of changing the conditions, intermediates and kinetic traps, mutations, similar topologies, fragment complementation experiments, fragments and pathways, focusing on one specific well-studied example, that of the dihydrofolate reductase, chaperones, and chaperonins, in vivo vs. in vitro folding, still using the dihydrofolate example, amyloid formation, and molecular "disorder". These are consistent with the view that binding and folding are similar events, with the differences stemming from different stabilities and hence population times.
Affiliation(s)
- C D Tsai
- Intramural Research Support Program--SAIC, Laboratory of Experimental and Computational Biology, NCI-Frederick, MD 21702, USA
|
27
|
Abstract
The toxic effects of lead have been known for centuries. Occupational exposure to this chemical hazard has also been well documented in relation to various industry groups, including construction, where workers are recognized as being significantly exposed during refurbishment work, in particular through inhalation and ingestion of lead fumes and dust. It is easy to see how so-called 'burners', 'cutters' and 'blasters'--workers directly involved in removing old lead paint--may become exposed; the influence of personal hygiene, smoking, eating/drinking and nail biting has also been documented in the literature. We now report on one group, the scaffolders, not previously considered to be at risk. Although not directly involved in the paint removal, anecdotal and personal experience of the authors indicate that these workers, who erect and later dismantle access structures during the renovation of previously lead-painted surfaces, may take up significant amounts of lead, mainly by ingestion, to raise their personal blood lead levels (and body burden) in line with recognized 'lead workers'. Exposures of this magnitude would also bring the scaffolders involved in such refurbishment work under the Control of Lead at Work Regulations 1998. The authors make various recommendations on measures to minimize and control exposure of scaffolders to lead.
Affiliation(s)
- D Sen
- Employment Medical Advisory Service, Health and Safety Executive (North West) Occupational Hygiene Specialist Group (North West), Grove House, Skerton Road, Trafford, Manchester M16 0RB, UK.
|
28
|
|
29
|
Hu Z, Ma B, Wolfson H, Nussinov R. Conservation of polar residues as hot spots at protein interfaces. Proteins 2000; 39:331-42. [PMID: 10813815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
A number of studies have addressed the question of which are the critical residues at protein-binding sites. These studies examined either a single or a few protein-protein interfaces. The most extensive study to date has been an analysis of alanine-scanning mutagenesis. However, although the total number of mutations was large, the number of protein interfaces was small, with some of the interfaces closely related. Here we show that although overall binding sites are hydrophobic, they are studded with specific, conserved polar residues at specific locations, possibly serving as energy "hot spots." Our results confirm and generalize the alanine-scanning data analysis, despite its limited size. Previously Trp, Arg, and Tyr were shown to constitute energetic hot spots. These were rationalized by their polar interactions and by their surrounding rings of hydrophobic residues. However, there was no compelling reason as to why specifically these residues were conserved. Here we show that other polar residues are similarly conserved. These conserved residues have been detected consistently in all interface families that we have examined. Our results are based on an extensive examination of residues which are in contact across protein interfaces. We utilize all clustered interface families with at least five members and with sequence similarity between the members in the range of 20-90%. There are 11 such clustered interface families, comprising a total of 97 crystal structures. Our three-dimensional superpositioning analysis of the occurrences of matched residues in each of the families identifies conserved residues at spatially similar environments. Additionally, in enzyme inhibitors, we observe that residues are more conserved at the interfaces than at other locations. 
On the other hand, antibody-protein interfaces have similar surface conservation as compared to their corresponding linear sequence alignment, consistent with the suggestion that evolution has optimized protein interfaces for function.
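The idea of identifying conserved residues at spatially equivalent positions within an interface family can be illustrated with a toy calculation. The sketch below assumes the structural superposition has already been done and that each family member is reduced to a string of one-letter residue codes at matched positions, with "-" where no residue was matched. That representation, the 0.75 threshold and the example family are hypothetical, not the authors' procedure.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of family members sharing the most common residue at each position.

    aligned: equal-length strings, one per family member; '-' marks an unmatched position.
    """
    scores = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)
            continue
        _, count = Counter(residues).most_common(1)[0]
        scores.append(count / len(column))
    return scores


# Hypothetical four-member family with four spatially matched interface positions.
family = ["WKDR", "WRDR", "WKE-", "WKDY"]
scores = column_conservation(family)
conserved = [i for i, s in enumerate(scores) if s >= 0.75]
print(scores, conserved)
```

Dividing by the number of family members rather than by the number of matched residues penalizes positions that are absent in some structures, which is one possible design choice among several.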
Affiliation(s)
- Z Hu
- NCI-FCRDC, Laboratory of Experimental and Computational Biology, Frederick, Maryland, USA
|
30
|
Abstract
In this article we focus on presenting a broad range of examples illustrating low-energy transitions via hinge-bending motions. The examples are divided according to the type of hinge-bending involved; namely, motions involving fragments of the protein chains, hinge-bending motions involving protein domains, and hinge-bending motions between the covalently unconnected subunits. We further make a distinction between allosterically and nonallosterically regulated proteins. These transitions are discussed within the general framework of folding and binding funnels. We propose that the conformers manifesting such swiveling motions are not the outcome of "induced fit" binding mechanism; instead, molecules exist in an ensemble of conformations that are in equilibrium in solution. These ensembles, which populate the bottoms of the funnels, a priori contain both the "open" and the "closed" conformational isomers. Furthermore, we argue that there are no fundamental differences among the physical principles behind the folding and binding funnels. Hence, there is no basic difference between funnels depicting ensembles of conformers of single molecules with fragment, or domain motions, as compared to subunits in multimeric quaternary structures, also showing such conformational transitions. The difference relates only to the size and complexity of the system. The larger the system, the more complex its corresponding fused funnel(s). In particular, funnels associated with allosterically regulated proteins are expected to be more complicated, because allostery is frequently involved with movements between subunits, and consequently is often observed in multichain and multimolecular complexes. This review centers on the critical role played by flexibility and conformational fluctuations in enzyme activity. Internal motions that extend over different time scales and with different amplitudes are known to be essential for the catalytic cycle. 
The conformational change observed in enzyme-substrate complexes as compared to the unbound enzyme state, and in particular the hinge-bending motions observed in enzymes with two domains, have a substantial effect on the enzymatic catalytic activity. The examples we review span the lipolytic enzymes that are particularly interesting, owing to their activation at the water-oil interface; an allosterically controlled dehydrogenase (lactate dehydrogenase); a DNA methyltransferase, with a covalently-bound intermediate; large-scale flexible loop motions in a glycolytic enzyme (TIM); domain motion in PGK, an enzyme which is essential in most cells, both for ATP generation in aerobes and for fermentation in anaerobes; adenylate kinase, showing large conformational changes, owing to their need to shield their catalytic centers from water; a calcium-binding protein (calmodulin), involved in a wide range of cellular calcium-dependent signaling; diphtheria toxin, whose large domain motion has been shown to yield "domain swapping;" the hexameric glutamate dehydrogenase, which has been studied both in a thermophile and in a mesophile; an allosteric enzyme, showing subunit motion between the R and the T states (aspartate transcarbamoylase), and the historically well-studied lac repressor. Nonallosteric subunit transitions are also addressed, with some examples (aspartate receptor and BamHI endonuclease). Hence, using this enzyme-catalysis-centered discussion, we address energy funnel landscapes of large-scale conformational transitions, rather than the faster, quasi-harmonic, thermal fluctuations.
Affiliation(s)
- S Kumar
- Intramural Research Support Program-SAIC, Laboratory of Experimental and Computational Biology, NCI-FCRDC, Frederick, MD, 21702, USA
|
31
|
Abstract
We present an efficient method for flexible comparison of protein structures, allowing swiveling motions. In all currently available methodologies developed and applied to the comparisons of protein structures, the molecules are considered to be rigid objects. The method described here extends and generalizes current approaches to searches for structural similarity between molecules by viewing proteins as objects consisting of rigid parts connected by rotary joints. During the matching, the rigid subparts are allowed to be rotated with respect to each other around swiveling points in one of the molecules. This technique straightforwardly detects structural motifs having hinge(s) between their domains. Whereas other existing methods detect hinge-bent motifs by initially finding the matching rigid parts and subsequently merging these together, our method automatically detects recurring substructures, allowing full three-dimensional rotations about their swiveling points. Yet the method is extremely fast, avoiding the time-consuming full conformational space search. Comparison of two protein structures, without a predefinition of the motif, takes only seconds to one minute on a workstation per hinge. Hence, the molecule can be scanned for many potential hinge sites, allowing practically all C(alpha) atoms to be tried as swiveling points. This algorithm provides a highly efficient, fully automated tool. Its complexity is only O(n^2), where n is the number of C(alpha) atoms in the compared molecules. As in our previous methodologies, the matching is independent of the order of the amino acids in the polypeptide chain. Here we illustrate the performance of this highly powerful tool on a large number of proteins exhibiting hinge-bending domain movements. Despite the motions, known hinge-bent domains/motifs which have been assembled and classified, are correctly identified. Additional matches are detected as well. 
This approach has been motivated by a technique for model based recognition of articulated objects originating in computer vision and robotics.
Affiliation(s)
- G Verbitsky
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
32
|
Abstract
Here we examine the reliability of surface comparisons in searches for active sites in proteins. Detection of a patch of surface on one protein which is similar to an active site in another, may suggest similarities in enzymatic mechanisms, in enzyme functions and implicate a potential target for ligand/inhibitor design. Specifically, we compare the efficacy of molecular surface comparisons with comparisons of surface atoms and of C(alpha) backbone atoms. We further investigate comparisons of specific atoms, belonging to a predefined pattern of catalytic residues versus comparisons of molecular surfaces and, separately, of surface atoms. This aspect is particularly relevant, as catalytic residues may be (partially) buried. We also explore active site comparisons versus comparisons in which the entire molecular surfaces are scanned. While here we focus on the geometrical aspect of the problem, we also investigate the effect of adding residue labels in these comparisons. Our extensive studies cover the serine proteases, containing the highly conserved triad motif, and the chorismate mutases. Since such active site comparisons entail comparisons between unconnected points in 3D space, an order-independent comparison technique is necessary. The geometric hashing algorithm is ideally suited to handling such a task. It can perform both global shape matching for the whole surfaces of large protein molecules and searching for local shape similarities for small surface motifs. Our results show that molecular surface comparisons work best when the similarity is high. As the similarity deteriorates, the number of potential solutions increases rapidly, making their ranking difficult, particularly when scanning entire molecular surfaces. Utilizing atomic coordinates directly appears more adequate under such circumstances.
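Geometric hashing, the order-independent comparison technique named in this abstract, can be sketched in a few lines. The version below is a simplified 2D variant with a two-point basis, so it is invariant to translation, rotation and uniform scaling; matching protein surfaces requires 3D bases and tolerance handling that are not shown. The point sets, the bin width `cell` and all function names are assumptions for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def to_basis(p, a, b):
    """Coordinates of p in the frame that sends a to (0, 0) and b to (1, 0)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    px, py = p[0] - a[0], p[1] - a[1]
    d2 = dx * dx + dy * dy
    return ((px * dx + py * dy) / d2, (py * dx - px * dy) / d2)


def bin_key(uv, cell):
    """Quantize basis coordinates into a hash-table key."""
    return (round(uv[0] / cell), round(uv[1] / cell))


def build_table(model, cell=0.1):
    """Preprocessing: hash every model point under every ordered basis pair."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model)), 2):
        for k, p in enumerate(model):
            if k != i and k != j:
                table[bin_key(to_basis(p, model[i], model[j]), cell)].append((i, j))
    return table


def vote(table, scene, i, j, cell=0.1):
    """Recognition: count votes per model basis, using scene points i, j as the basis."""
    votes = Counter()
    for k, p in enumerate(scene):
        if k != i and k != j:
            votes.update(table.get(bin_key(to_basis(p, scene[i], scene[j]), cell), ()))
    return votes


# Hypothetical model, and a scene that is the model rotated 90 degrees, scaled by 2, shifted.
model = [(0, 0), (4, 0), (1, 3), (5, 2), (2, 6)]
scene = [(-2 * y + 7, 2 * x - 3) for x, y in model]
votes = vote(build_table(model), scene, 0, 1)
print(votes[(0, 1)])
```

Because points are looked up by their basis-relative coordinates alone, the order of the points in either list does not matter, which is the property the abstract relies on for comparing unconnected points in space.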
Affiliation(s)
- M Rosen
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
33
|
Abstract
Here we address the following questions. How many structurally different entries are there in the Protein Data Bank (PDB)? How do the proteins populate the structural universe? To investigate these questions a structurally non-redundant set of representative entries was selected from the PDB. Construction of such a dataset is not trivial: (i) the considerable size of the PDB requires a large number of comparisons (there were more than 3250 structures of protein chains available in May 1994); (ii) the PDB is highly redundant, containing many structurally similar entries, not necessarily with significant sequence homology, and (iii) there is no clear-cut definition of structural similarity. The latter depends on the criteria and methods used. Here, we analyze structural similarity ignoring protein topology. To date, representative sets have been selected either by hand, by sequence comparison techniques which ignore the three-dimensional (3D) structures of the proteins or by using sequence comparisons followed by linear structural comparison (i.e. the topology, or the sequential order of the chains, is enforced in the structural comparison). Here we describe a 3D sequence-independent automated and efficient method to obtain a representative set of protein molecules from the PDB which contains all unique structures and which is structurally non-redundant. The method has two novel features. The first is the use of strictly structural criteria in the selection process without taking into account the sequence information. To this end we employ a fast structural comparison algorithm which requires on average approximately 2 s per pairwise comparison on a workstation. The second novel feature is the iterative application of a heuristic clustering algorithm that greatly reduces the number of comparisons required. We obtain a representative set of 220 chains with resolution better than 3.0 Å, or 268 chains including lower resolution entries, NMR entries and models.
The resulting set can serve as a basis for extensive structural classification and studies of 3D recurring motifs and of sequence-structure relationships. The clustering algorithm succeeds in classifying into the same structural family chains with no significant sequence homology, e.g. all the globins in one single group, all the trypsin-like serine proteases in another or all the immunoglobulin-like folds into a third. In addition, unexpected structural similarities of interest have been automatically detected between pairs of chains. A cluster analysis of the representative structures demonstrates the way the "structural universe" is populated.
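The abstract does not spell out the heuristic clustering step, so the following is only a minimal sketch of one common way such a scheme can cut down on comparisons: each new chain is compared against the current cluster representatives rather than against every other chain. The function name `select_representatives`, the `similar` callback and the toy numeric data are all illustrative assumptions, not the authors' method.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def select_representatives(
    items: Sequence[T],
    similar: Callable[[T, T], bool],
) -> Dict[int, List[int]]:
    """Greedy clustering: map representative index -> member indices.

    Each item is compared only against the current representatives, so the
    number of comparisons is at most len(items) * (number of clusters)
    instead of one comparison per pair of items.
    """
    clusters: Dict[int, List[int]] = {}
    for i, item in enumerate(items):
        for rep in clusters:
            if similar(items[rep], item):
                clusters[rep].append(i)
                break
        else:
            clusters[i] = [i]  # no representative matched: start a new cluster
    return clusters


# Toy stand-in for a structural comparison: numbers within 1.0 are "similar".
toy = [0.0, 0.4, 5.0, 5.3, 9.9]
clusters = select_representatives(toy, lambda a, b: abs(a - b) <= 1.0)
print(clusters)  # {0: [0, 1], 2: [2, 3], 4: [4]}
```

In practice `similar` would wrap a pairwise structural comparison, and the result of a greedy pass depends on the order in which chains are visited, which is one reason such a pass might be applied iteratively.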
Affiliation(s)
- D Fischer
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
34
|
|
35
|
Fischer D, Wolfson H, Lin SL, Nussinov R. Three-dimensional, sequence order-independent structural comparison of a serine protease against the crystallographic database reveals active site similarities: potential implications to evolution and to protein folding. Protein Sci 1994; 3:769-78. [PMID: 8061606 PMCID: PMC2142723 DOI: 10.1002/pro.5560030506] [Citation(s) in RCA: 81] [Impact Index Per Article: 2.7] [Indexed: 01/28/2023]
Abstract
We have recently developed a fast approach to comparisons of 3-dimensional structures. Our method is unique, treating protein structures as collections of unconnected points (atoms) in space. It is completely independent of the amino acid sequence order. It is unconstrained by insertions, deletions, and chain directionality. It matches single, isolated amino acids between 2 different structures strictly by their spatial positioning regardless of their relative sequential position in the amino acid chain. It automatically detects a recurring 3D motif in protein molecules. No predefinition of the motif is required. The motif can be either in the interior of the proteins or on their surfaces. In this work, we describe an enhancement over our previously developed technique, which considerably reduces the complexity of the algorithm. This results in an extremely fast technique. A typical pairwise comparison of 2 protein molecules requires less than 3 s on a workstation. We have scanned the structural database with dozens of probes, successfully detecting structures that are similar to the probe. To illustrate the power of this method, we compare the structure of a trypsin-like serine protease against the structural database. Besides detecting homologous trypsin-like proteases, we automatically obtain 3D, sequence order-independent, active-site similarities with subtilisin-like and sulfhydryl proteases. These similarities equivalence isolated residues, not conserving the linear order of the amino acids in the chains. The active-site similarities are well known and have been detected by manually inspecting the structures in a time-consuming, laborious procedure. This is the first time such equivalences are obtained automatically from the comparison of full structures. The far-reaching advantages and the implications of our novel algorithm to studies of protein folding, to evolution, and to searches for pharmacophoric patterns are discussed.
Affiliation(s)
- D Fischer
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
36
|
Abstract
We present a unique sequence-order independent approach which allows examination of three dimensional structures, searching for spatially similar substructural motifs. If the amino acids composing the motifs are contiguous in the primary chain, that is, they follow each other in the sequence, a common ancestor and a divergent evolutionary process may be implied. On the other hand, if the three-dimensional substructural motif consists of amino acids whose positions in the sequences vary between the different proteins, a convergent evolution might have taken place. Starting from different, ancient sequences, mutations may have occurred that brought about formation and conservation of a truly structural motif. Such a motif might be particularly suitable for fulfilling a specific function. Clearly, in order to be able to carry out such a task one needs a technique which allows comparisons of protein structures absolutely independent of their amino acid sequence-order. Our novel, efficient, computer vision based technique treats atoms (residues) as unconnected points in space, using strictly the atomic (either all atoms or only the C alpha atoms) coordinates. The order of the residues is completely disregarded. Detection, cataloging and analysis of "real" three-dimensional, sequence-order independent motifs in the crystallographic database is expected to be an invaluable tool for protein folding. Here we demonstrate the power of the technique by applying it to alpha/beta proteins. Our studies indicate that for some of the proteins, the "classical" structural alignments (conserving the amino acid order) are the optimal ones. Nevertheless, for others, truly spatial (out of sequential-order) amino acid equivalencing results in a better geometrical match.
Affiliation(s)
- D Fischer
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
37
|
Fischer D, Norel R, Wolfson H, Nussinov R. Surface motifs by a computer vision technique: searches, detection, and implications for protein-ligand recognition. Proteins 1993; 16:278-92. [PMID: 8394000 DOI: 10.1002/prot.340160306] [Citation(s) in RCA: 55] [Impact Index Per Article: 1.8] [Indexed: 01/30/2023]
Abstract
We describe the application of a method geared toward structural and surface comparison of proteins. The method is based on the Geometric Hashing Paradigm adapted from Computer Vision. It allows for comparison of any two sets of 3-D coordinates, such as protein backbones, protein core or protein surface motifs, and small molecules such as drugs. Here we apply our method to 4 types of comparisons between pairs of molecules: (1) comparison of the backbones of two protein domains; (2) search for a predefined 3-D C alpha motif within the full backbone of a domain; and in particular, (3) comparison of the surfaces of two receptor proteins; and (4) comparison of the surface of a receptor to the surface of a ligand. These aspects complement each other and can contribute toward a better understanding of protein structure and biomolecular recognition. Searches for 3-D surface motifs can be carried out on either receptors or on ligands. The latter may result in the detection of pharmacophoric patterns. If the surfaces of the binding sites of either the receptors or of the ligands are relatively similar, surface superpositioning may aid significantly in the docking problem. Currently, only distance invariants are used in the matching, although additional geometric surface invariants are considered. The speed of our Geometric Hashing algorithm is encouraging, with a typical surface comparison taking only seconds or minutes of CPU time on a SUN 4 SPARC workstation. The direct application of this method to the docking problem is also discussed. We demonstrate the success of this method in its application to two members of the globin family and to two dehydrogenases.
Affiliation(s)
- D Fischer
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
38
|
Ogden TL, Bartlett IW, Purnell CJ, Wells CJ, Armitage F, Wolfson H. Dust from cotton manufacture: changing from static to personal sampling. Ann Occup Hyg 1993; 37:271-85. [PMID: 8346875 DOI: 10.1093/annhyg/37.3.271] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 01/30/2023]
Abstract
Several designs of personal samplers were tested for use to collect cotton dust. The IOM personal inhalable-dust sampler was selected because: (1) collection of the whole inhalable fraction was preferred, since all inhaled sizes are under suspicion as contributing to respiratory symptoms in cotton exposure; (2) this sampler is well characterized; and (3) it was found to be practicable in the environments examined. Gauze shields to exclude 'fly' from the personal sampler were tried, but were rejected mainly because measurement of the whole inhalable fraction by a validated sampler was felt to be more appropriate. A range of processes at a representative selection of mills was assessed by a hygiene team, and classified as 'clean' or 'dirty' in terms of present standards of control. This classification agreed well with subsequent measurements using the present method, which uses a large static sampler. A personal sampling survey then showed that in about two-thirds of 'clean' processes personal exposure of at least 80% of those employed was less than about 2-2.5 mg m⁻³. Only one-tenth of 'dirty' processes met this standard. Personal exposure correlates poorly with the present static method, as expected, but comparison of the results suggested that a mean background level of 0.5 mg m⁻³ would correspond to a median personal exposure of about 2.2 mg m⁻³. Side-by-side measurements by the background method differed by less than 0.15 mg m⁻³ on about 95% of occasions. Niven et al. (to be published) have compared the IOM head used in this study with the Manchester University sampler previously used by Cinkotai et al. [Ann. occup. Hyg. 32, 103-113 (1988)] to derive a relationship between personal exposure and prevalence of byssinotic symptoms in spinners. According to Cinkotai et al.'s results the concentrations of 2-2.5 mg m⁻³ discussed would correspond to a prevalence of 3-5%. However, this prevalence probably reflects higher exposures in the past.
Affiliation(s)
- T L Ogden
- Health and Safety Executive, Research and Laboratory Services Division, London, U.K
|
39
|
Bachar O, Fischer D, Nussinov R, Wolfson H. A computer vision based technique for 3-D sequence-independent structural comparison of proteins. Protein Eng 1993; 6:279-88. [PMID: 8506262 DOI: 10.1093/protein/6.3.279] [Citation(s) in RCA: 114] [Impact Index Per Article: 3.7] [Indexed: 01/31/2023]
Abstract
A detailed description of an efficient approach to comparison of protein structures is presented. Given the 3-D coordinate data of the structures to be compared, the system automatically identifies every region of structural similarity between the structures without prior knowledge of an initial alignment. The method uses the geometric hashing technique which was originally developed for model-based object recognition problems in the area of computer vision. It exploits a rotationally and translationally invariant representation of rigid objects, resulting in a highly efficient, fully automated tool. The method is independent of the amino acid sequence and, thus, insensitive to insertions, deletions and displacements of equivalent substructures between the molecules being compared. The method described here is general, identifies 'real' 3-D substructures and is not constrained by the order imposed by the primary chain of the amino acids. Typical structure comparison problems are examined and the results of the new method are compared with the published results from previous methods. These results, obtained without using the sequence order of the chains, confirm published structural analogies that use sequence-dependent techniques. Our results also extend previous analogies by detecting geometrically equivalent out-of-sequential-order structural elements which cannot be obtained by current techniques.
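To make the idea of a rotationally and translationally invariant representation concrete, here is a minimal pure-Python sketch of textbook 3-D geometric hashing. It is not the authors' implementation: the bin size `BIN`, the helper names and the five-point toy "structure" are illustrative assumptions. Every ordered, non-collinear triple of points defines a local reference frame, the remaining points are stored in a hash table under their quantised coordinates in that frame, and a transformed, reordered copy is recognised by voting.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt

BIN = 0.5  # quantisation step for hash keys (assumed value)


def sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]


def cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def unit(p):
    n = sqrt(dot(p, p))
    return (p[0] / n, p[1] / n, p[2] / n)


def frame_keys(points, basis):
    """Quantised coordinates of all non-basis points in the frame of `basis`.

    Returns None when the three basis points are (nearly) collinear.
    """
    a, b, c = (points[i] for i in basis)
    normal = cross(sub(b, a), sub(c, a))
    if dot(normal, normal) < 1e-12:
        return None
    x = unit(sub(b, a))   # first axis: along a -> b
    z = unit(normal)      # third axis: normal to the basis plane
    y = cross(z, x)       # second axis completes a right-handed frame
    keys = []
    for i, p in enumerate(points):
        if i in basis:
            continue
        d = sub(p, a)
        keys.append(tuple(round(dot(d, axis) / BIN) for axis in (x, y, z)))
    return keys


def build_table(model):
    """Hash every model point under every ordered, non-collinear basis triple."""
    table = defaultdict(list)
    for basis in permutations(range(len(model)), 3):
        keys = frame_keys(model, basis)
        if keys is not None:
            for key in keys:
                table[key].append(basis)
    return table


def vote(table, scene, scene_basis):
    """Count, per model basis, how many scene points land in a matching bin."""
    votes = Counter()
    keys = frame_keys(scene, scene_basis)
    if keys is None:
        return votes
    for key in keys:
        for model_basis in table.get(key, ()):
            votes[model_basis] += 1
    return votes


model = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (1, 1, 5), (2, -3, 1)]
# Scene: the model rotated 90 degrees about z, translated, and listed in a
# different order, so the order of the points carries no information.
moved = [(-y + 10, x - 2, z + 7) for x, y, z in model]
scene = [moved[3], moved[0], moved[4], moved[2], moved[1]]

table = build_table(model)
# Scene points 1, 4, 3 are the images of model points 0, 1, 2.
votes = vote(table, scene, scene_basis=(1, 4, 3))
print(votes[(0, 1, 2)])  # 2: one vote from each of the two remaining points
```

Because only frame-relative coordinates are hashed, neither the rigid motion nor the reordering affects the votes. A practical tool would also need to tolerate coordinate noise (for example by voting in neighbouring bins) and to avoid enumerating every triple for large molecules.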
Affiliation(s)
- O Bachar
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
40
|
Fischer D, Bachar O, Nussinov R, Wolfson H. An efficient automated computer vision based technique for detection of three dimensional structural motifs in proteins. J Biomol Struct Dyn 1992; 9:769-89. [PMID: 1616630 DOI: 10.1080/07391102.1992.10507955] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.1] [Indexed: 12/27/2022]
Abstract
As the number of available three dimensional coordinates of proteins increases, it is now recognized that proteins from different families and topologies are constructed from independent motifs. Detection of specific structural motifs within proteins aids in understanding their role and the mechanism of their operation. To aid in identification and use of these motifs it has become necessary to develop efficient methods for systematic scanning of structural databases. To date, methods of structural protein comparison suffer from at least one of the following limitations: they (1) are not fully automated (require human intervention), (2) are limited to relatively similar structures, (3) are constrained to linear alignments of the structures, (4) are sensitive to insertions, deletions or gaps in the sequences or (5) are very time consuming. We present a method to overcome the above limitations. The method discovers and ranks every piece of structural similarity between the structures compared, thus allowing the simultaneous detection of real 3-D motifs in different domains, between domains, in active sites, surfaces etc. The method uses the Geometric Hashing Paradigm, which is an efficient technique originally developed for Computer Vision. The algorithm exploits the geometrical constraints of rigid objects; it is especially geared towards recognition of partial structures in rigid objects belonging to large databases and is straightforwardly parallelizable. Computer Vision techniques are for the first time applied to molecular structure comparison, resulting in an efficient, fully automated tool. The method has been tested in a number of cases, including comparisons of the haemoglobins, immunoglobulins, serine proteinases, calcium binding proteins, DNA binding proteins and others. In all examples our results were equivalent to the published results from previous methods and in some cases additional structural information was obtained by our method.
Affiliation(s)
- D Fischer
- Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel
|
41
|
|
42
|
|
43
|
Purnell CJ, Martin GL, Wolfson H. The determination of airborne methyl ethyl ketone peroxide in the glass reinforced plastics industry. Ann Occup Hyg 1979; 22:383-7. [PMID: 547797 DOI: 10.1093/annhyg/22.4.383] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 12/30/2022]
|
44
|
Patrick J, McMillan J, Wolfson H, O'Brien JC. Acetylcholine receptor metabolism in a nonfusing muscle cell line. J Biol Chem 1977; 252:2143-53. [PMID: 845167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
The development and turnover of acetylcholine receptors in a nonfusing muscle cell line has been investigated using iodinated alpha-bungarotoxin as a probe for acetylcholine receptor. Logarithmically growing cells do not bind toxin, while cells that have ceased cell division bind toxin at a site which has the pharmacological characteristics of an acetylcholine receptor. These binding sites are removed from the cell surface at a rate equal to 8.9 +/- 0.5% of the total surface binding sites/h and appear at a rate equal to 8.3 +/- 1.5% of the total surface binding sites/h. Appearance of new binding sites can occur for a period of 1 1/2 h in the presence of cycloheximide, during which time 15% of the surface receptors can be replaced. There is a hidden population of receptors which is not accessible to toxin without disrupting the cell. This population amounts to 35% of the Triton-extractable receptors in the cell and is composed of two classes. One class, termed a precursor receptor, appears to move from the hidden population to the cell surface, and composes about 40% of the total hidden receptor population. The second class of hidden receptors does not appear to function as a surface precursor and is neither depleted nor enriched by any of the procedures we employed. Surface receptors and hidden receptors are distinguishable on the basis of their sedimentation coefficients: that of hidden receptors is about 0.5 to 0.6 S lower than that of surface receptors. We were unable to distinguish between precursor and nonprecursor hidden receptors on the basis of their sedimentation coefficients.
|
45
|
|
46
|
Wolfson H, Hirakis SS. Letter: Distribution of physicians in Connecticut. Conn Med 1975; 39:227. [PMID: 1149445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
|