1
Garton M, MacKinnon SS, Malevanets A, Wodak SJ. Interplay of self-association and conformational flexibility in regulating protein function. Philos Trans R Soc Lond B Biol Sci 2019; 373:rstb.2017.0190. [PMID: 29735742 DOI: 10.1098/rstb.2017.0190] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Accepted: 03/14/2018] [Indexed: 12/18/2022]
Abstract
Many functional roles have been attributed to homodimers, the most common mode of protein self-association, notably in the regulation of enzymes, ion channels, transporters and transcription factors. Here we review findings that offer new insights into the different roles conformational flexibility plays in regulating homodimer function. Intertwined homodimers of two-domain proteins and their related family members display significant conformational flexibility, which translates into concerted motion between structural domains. This flexibility enables the corresponding proteins to regulate function across family members by modulating the spatial positions of key recognition surfaces of individual domains, to either maintain subunit interfaces, alter them or break them altogether, leading to a variety of functional consequences. Many proteins may exist as monomers but carry out their biological function as homodimers or higher-order oligomers. We present early evidence that in such systems homodimer formation primes the protein for its functional role. It does so by inducing elevated mobility in protein regions corresponding to the binding epitopes of functionally important ligands. In some systems this process acts as an allosteric response elicited by the self-association reaction itself. Our analysis furthermore suggests that the induced extra mobility likely facilitates ligand binding through the mechanism of conformational selection. This article is part of a discussion meeting issue 'Allostery and molecular machines'.
Affiliation(s)
- Michael Garton
  - Department of Molecular Genetics, University of Toronto, The Donnelly Centre, 160 College Street, Toronto, Ontario M5S 3E1, Canada
- Stephen S MacKinnon
  - Cyclica Inc., 18 King Street East, Suite 810, Toronto, Ontario M5C 1C4, Canada
- Anatoly Malevanets
  - Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8, Canada
- Shoshana J Wodak
  - VIB Structural Biology Research Centre, VUB, Building E Pleinlaan 2, 1050 Brussels, Belgium

2
Koike R, Amemiya T, Horii T, Ota M. Structural changes of homodimers in the PDB. J Struct Biol 2017; 202:42-50. [PMID: 29233747 DOI: 10.1016/j.jsb.2017.12.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/11/2017] [Revised: 11/30/2017] [Accepted: 12/08/2017] [Indexed: 01/25/2023]
Abstract
Protein complexes are involved in various biological phenomena. These complexes are intrinsically flexible, and structural changes are essential to their functions. To perform a large-scale automated analysis of the structural changes of complexes, we combined two original methods. An application, SCPC, compares two structures of protein complexes and decides the match of binding mode. Another application, Motion Tree, identifies rigid-body motions in various sizes and magnitude from the two structural complexes with the same binding mode. This approach was applied to all available homodimers in the Protein Data Bank (PDB). We defined two complex-specific motions: interface motion and subunit-spanning motion. In the former, each subunit of a complex constitutes a rigid body, and the relative movement between subunits occurs at the interface. In the latter, structural parts from distinct subunits constitute a rigid body, providing the relative movement spanning subunits. All structural changes were classified and examined. It was revealed that the complex-specific motions were common in the homodimers, detected in around 40% of families. The dimeric interfaces were likely to be small and flat for interface motion, while large and rugged for subunit-spanning motion. Interface motion was accompanied by a drastic change in contacts at the interface, while the change in the subunit-spanning motion was moderate. These results indicate that the interface properties of homodimers correlated with the type of complex-specific motion. The study demonstrates that the pipeline of SCPC and Motion Tree is useful for the massive analysis of structural change of protein complexes.
Affiliation(s)
- Ryotaro Koike
  - Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- Takayuki Amemiya
  - Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- Tatsuya Horii
  - Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan
- Motonori Ota
  - Graduate School of Informatics, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan

3
Bertoni M, Kiefer F, Biasini M, Bordoli L, Schwede T. Modeling protein quaternary structure of homo- and hetero-oligomers beyond binary interactions by homology. Sci Rep 2017; 7:10480. [PMID: 28874689 PMCID: PMC5585393 DOI: 10.1038/s41598-017-09654-8] [Citation(s) in RCA: 492] [Impact Index Per Article: 70.3] [Received: 03/09/2017] [Accepted: 07/28/2017] [Indexed: 01/01/2023]
Abstract
Cellular processes often depend on interactions between proteins and the formation of macromolecular complexes. The impairment of such interactions can lead to deregulation of pathways resulting in disease states, and it is hence crucial to gain insights into the nature of macromolecular assemblies. Detailed structural knowledge about complexes and protein-protein interactions is growing, but experimentally determined three-dimensional multimeric assemblies are outnumbered by complexes supported by non-structural experimental evidence. Here, we aim to fill this gap by modeling multimeric structures by homology, only using amino acid sequences to infer the stoichiometry and the overall structure of the assembly. We ask which properties of proteins within a family can assist in the prediction of correct quaternary structure. Specifically, we introduce a description of protein-protein interface conservation as a function of evolutionary distance to reduce the noise in deep multiple sequence alignments. We also define a distance measure to structurally compare homologous multimeric protein complexes. This allows us to hierarchically cluster protein structures and quantify the diversity of alternative biological assemblies known today. We find that a combination of conservation scores, structural clustering, and classical interface descriptors, can improve the selection of homologous protein templates leading to reliable models of protein complexes.
Affiliation(s)
- Martino Bertoni
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland
- Florian Kiefer
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland
- Marco Biasini
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland
- Lorenza Bordoli
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland
- Torsten Schwede
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland; Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland

4
Tanner JJ. Empirical power laws for the radii of gyration of protein oligomers. Acta Crystallogr D Struct Biol 2016; 72:1119-1129. [PMID: 27710933 PMCID: PMC5053138 DOI: 10.1107/s2059798316013218] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/16/2016] [Accepted: 08/16/2016] [Indexed: 11/10/2022]
Abstract
The radius of gyration is a fundamental structural parameter that is particularly useful for describing polymers. It has been known since Flory's seminal work in the mid-20th century that polymers show a power-law dependence, where the radius of gyration is proportional to the number of residues raised to a power. The power-law exponent has been measured experimentally for denatured proteins and derived empirically for folded monomeric proteins using crystal structures. Here, the biological assemblies in the Protein Data Bank are surveyed to derive the power-law parameters for protein oligomers having degrees of oligomerization of 2-6 and 8. The power-law exponents for oligomers span a narrow range of 0.38-0.41, which is close to the value of 0.40 obtained for monomers. This result shows that protein oligomers exhibit essentially the same power-law behavior as monomers. A simple power-law formula is provided for estimating the oligomeric state from an experimental measurement of the radius of gyration. Several proteins in the Protein Data Bank are found to deviate substantially from power-law behavior by having an atypically large radius of gyration. Some of the outliers have highly elongated structures, such as coiled coils. For coiled coils, the radius of gyration does not follow a power law and instead scales linearly with the number of residues in the oligomer. Other outliers are proteins whose oligomeric state or quaternary structure is incorrectly annotated in the Protein Data Bank. The power laws could be used to identify such errors and help prevent them in future depositions.
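The power-law relationship described in this abstract can be illustrated with a short sketch. It assumes the form Rg = a * N^b, fits a and b by least squares in log-log space, and inverts the formula to estimate an oligomeric state. The function names, the use of a single shared (a, b) pair for all oligomeric states, and the synthetic numbers are illustrative assumptions, not the parameters or the procedure reported in the paper.

```python
import math


def fit_power_law(n_residues, rg_values):
    """Fit Rg = a * N**b by linear least squares on log(Rg) vs log(N).

    Returns the tuple (a, b).
    """
    xs = [math.log(n) for n in n_residues]
    ys = [math.log(r) for r in rg_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = power-law exponent
    a = math.exp(mean_y - b * mean_x)  # intercept gives the prefactor
    return a, b


def estimate_oligomeric_state(rg, n_monomer, a, b):
    """Invert Rg = a * N**b for the total residue count N, then divide
    by the monomer length. Simplification: one (a, b) pair is used for
    every oligomeric state."""
    n_total = (rg / a) ** (1.0 / b)
    return round(n_total / n_monomer)


# Synthetic data generated from Rg = 2.0 * N**0.40 (hypothetical values).
sizes = [100, 200, 400, 800]
radii = [2.0 * n ** 0.40 for n in sizes]
a_fit, b_fit = fit_power_law(sizes, radii)
```

With noise-free synthetic data the fit recovers the generating exponent; with real structures the scatter around the line is what would flag elongated or mis-annotated assemblies as outliers.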
Affiliation(s)
- John J. Tanner
  - Departments of Biochemistry and Chemistry, University of Missouri-Columbia, Columbia, MO 65211, USA

5
Wodak SJ, Malevanets A, MacKinnon SS. The Landscape of Intertwined Associations in Homooligomeric Proteins. Biophys J 2015; 109:1087-100. [PMID: 26340815 DOI: 10.1016/j.bpj.2015.08.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 03/12/2015] [Revised: 06/06/2015] [Accepted: 08/03/2015] [Indexed: 01/22/2023]
Abstract
We present an overview of the full repertoire of intertwined associations in homooligomeric proteins. This overview summarizes recent findings on the different categories of intertwined associations in known protein structures, their assembly modes, the properties of their interfaces, and their structural plasticity. Furthermore, the current body of knowledge on the so-called three-dimensional domain-swapped systems is reexamined in the context of the wider landscape of intertwined homooligomers, with a particular focus on the mechanistic aspects that underpin intertwined self-association processes in proteins. Insights gained from this integrated overview into the physical and biological roles of intertwining are highlighted.
Affiliation(s)
- Shoshana J Wodak
  - Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; VIB Structural Biology Research Center, Brussels, Belgium
- Stephen S MacKinnon
  - Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada; Cyclica, Inc., Toronto, Ontario, Canada

6
Abstract
A key reason three-dimensional (3-D) protein structures are annotated with supporting or derived information is to understand the molecular basis of protein function. To this end, protein structure annotation databases curate key facts and observations, based on community-accepted standards, about the ~100,000 3-D experimental protein structures to date. This review will introduce the primary structure repositories, databases, and value-added structural annotation databases, as well as the range of information they provide. The different levels of annotation data (primary vs. derived vs. inferred) and how they should all be considered accordingly will also be described.
Affiliation(s)
- Margaret J. Gabanyi
  - Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Helen M. Berman
  - Center for Integrative Proteomics Research, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA

7
MacKinnon SS, Wodak SJ. Landscape of intertwined associations in multi-domain homo-oligomeric proteins. J Mol Biol 2014; 427:350-70. [PMID: 25451036 DOI: 10.1016/j.jmb.2014.11.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 08/27/2014] [Revised: 10/31/2014] [Accepted: 11/03/2014] [Indexed: 10/24/2022]
Abstract
This study charts the landscape of multi-domain protein structures that form intertwined homodimers by exchanging structural domains between subunits. A representative dataset of such homodimers was derived from the Protein Data Bank, and their structural and topological properties were compared to those of a representative set of non-intertwined homodimers. Most of the intertwined dimers form closed assemblies with head-to-tail arrangements, where the subunit interface involves contacts between dissimilar domains. In contrast, the non-intertwined dimers form preferentially head-to-head arrangements, where the subunit interface involves contacts between identical domains. Most of these contacts engage only one structural domain from each subunit, leaving the remaining domains free to form other associations. Remarkably, we find that multi-domain proteins closely related to the intertwined homodimers are significantly more likely than relatives of the non-intertwined versions to adopt alternative intramolecular domain arrangements. In ~40% of the intertwined dimers, the plasticity in domain arrangements among relatives affords maintenance of the head-to-head or head-to-tail topology and conservation of the corresponding subunit interface. This property seems to be exploited in several systems to regulate DNA binding. In ~58%, however, intramolecular domain re-arrangements are associated with changes in oligomeric states and poorly conserved interfaces among relatives. This time, the corresponding structural plasticity appears to be exploited by evolution to modulate function by switching between active and inactive states of the protein. Surprisingly, in total, only three systems were found to undergo the classical monomer to intertwined dimer conversion associated with three-dimensional domain swapping.
Affiliation(s)
- Stephen S MacKinnon
  - Molecular Structure and Function Program, Hospital for Sick Children, 555 University Avenue, Toronto, ON, Canada M5G 1X8; Department of Biochemistry, University of Toronto, 1 King's College Circle, Toronto, ON, Canada M5S 1A8
- Shoshana J Wodak
  - Molecular Structure and Function Program, Hospital for Sick Children, 555 University Avenue, Toronto, ON, Canada M5G 1X8; Department of Biochemistry, University of Toronto, 1 King's College Circle, Toronto, ON, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, 1 King's College Circle, Toronto, ON, Canada M5S 1A8

8

9
Goncearenco A, Shoemaker BA, Zhang D, Sarychev A, Panchenko AR. Coverage of protein domain families with structural protein-protein interactions: current progress and future trends. Prog Biophys Mol Biol 2014; 116:187-93. [PMID: 24931138 DOI: 10.1016/j.pbiomolbio.2014.05.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/17/2013] [Revised: 04/14/2014] [Accepted: 05/17/2014] [Indexed: 11/16/2022]
Abstract
Protein interactions have evolved into highly precise and regulated networks adding an immense layer of complexity to cellular systems. The most accurate atomistic description of protein binding sites can be obtained directly from structures of protein complexes. The availability of structurally characterized protein interfaces significantly improves our understanding of interactomes, and the progress in structural characterization of protein-protein interactions (PPIs) can be measured by calculating the structural coverage of protein domain families. We analyze the coverage of protein domain families (defined according to CDD and Pfam databases) by structures, structural protein-protein complexes and unique protein binding sites. Structural PPI coverage of currently available protein families is about 30% without any signs of saturation in coverage growth dynamics. Given the current growth rates of domain databases and structural PPI deposition, complete domain coverage with PPIs is not expected in the near future. As a result of this study we identify families without any protein-protein interaction evidence (listed on a supporting website http://www.ncbi.nlm.nih.gov/Structure/ibis/coverage/) and propose them as potential targets for structural studies with a focus on protein interactions.
Affiliation(s)
- Alexander Goncearenco
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Benjamin A Shoemaker
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Dachuan Zhang
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Alexey Sarychev
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States
- Anna R Panchenko
  - Computational Biology Branch of the National Center for Biotechnology Information in Bethesda, Maryland, United States

10
BioAssemblyModeler (BAM): user-friendly homology modeling of protein homo- and heterooligomers. PLoS One 2014; 9:e98309. [PMID: 24922057 PMCID: PMC4055448 DOI: 10.1371/journal.pone.0098309] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 12/16/2013] [Accepted: 04/30/2014] [Indexed: 01/11/2023]
Abstract
Many if not most proteins function in oligomeric assemblies of one or more protein sequences. The Protein Data Bank provides coordinates for biological assemblies for each entry, at least 60% of which are dimers or larger assemblies. BioAssemblyModeler (BAM) is a graphical user interface to the basic steps in homology modeling of protein homooligomers and heterooligomers from the biological assemblies provided in the PDB. BAM takes as input up to six different protein sequences and begins by assigning Pfam domains to the target sequences. The program utilizes a complete assignment of Pfam domains to sequences in the PDB, PDBfam (http://dunbrack2.fccc.edu/protcid/pdbfam), to obtain templates that contain any or all of the domains assigned to the target sequence(s). The contents of the biological assemblies of potential templates are provided, and alignments of the target sequences to the templates are produced with a profile-profile alignment algorithm. BAM provides for visual examination and mouse-editing of the alignments supported by target and template secondary structure information and a 3D viewer of the template biological assembly. Side-chain coordinates for a model of the biological assembly are built with the program SCWRL4. A built-in protocol navigation system guides the user through all stages of homology modeling from input sequences to a three-dimensional model of the target complex. Availability: http://dunbrack.fccc.edu/BAM.

11
Bhattacharjee K, Joshi SR. NEMiD: a web-based curated microbial diversity database with geo-based plotting. PLoS One 2014; 9:e94088. [PMID: 24714636 PMCID: PMC3979743 DOI: 10.1371/journal.pone.0094088] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 12/14/2013] [Accepted: 03/11/2014] [Indexed: 11/19/2022]
Abstract
The majority of the Earth's microbes remain unknown, and their potential utility cannot be exploited until they are discovered and characterized. They provide wide scope for the development of new strains as well as biotechnological uses. The documentation and bioprospection of microorganisms carry enormous significance considering their relevance to human welfare. This creates an urgent need to develop a database with emphasis on the microbial diversity of the largest untapped reservoirs in the biosphere. The data annotated in the North-East India Microbial database (NEMiD) were obtained by the isolation and characterization of microbes from different parts of the Eastern Himalayan region. The database was constructed as a relational database management system (RDBMS) for data storage in MySQL in the back-end on a Linux server and implemented in an Apache/PHP environment. This database provides a base for understanding the soil microbial diversity pattern in this megabiodiversity hotspot and indicates the distribution patterns of various organisms along with identification. The NEMiD database is freely available at www.mblabnehu.info/nemid/.
Affiliation(s)
- Kaushik Bhattacharjee
  - Microbiology Laboratory, Department of Biotechnology and Bioinformatics, North-Eastern Hill University, Shillong, Meghalaya, India
- Santa Ram Joshi
  - Microbiology Laboratory, Department of Biotechnology and Bioinformatics, North-Eastern Hill University, Shillong, Meghalaya, India

12
Encinar M, Kralicek AV, Martos A, Krupka M, Cid S, Alonso A, Rico AI, Jiménez M, Vélez M. Polymorphism of FtsZ filaments on lipid surfaces: role of monomer orientation. Langmuir 2013; 29:9436-9446. [PMID: 23837832 DOI: 10.1021/la401673z] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 06/02/2023]
Abstract
FtsZ is a bacterial cytoskeletal protein involved in cell division. It forms a ringlike structure that attaches to the membrane to complete bacterial division. It binds and hydrolyzes GTP, assembling into polymers in a GTP-dependent manner. To test how the orientation of the monomers affects the curvature of the filaments on a surface, we performed site-directed mutagenesis on the E. coli FtsZ protein to insert cysteine residues at lateral locations to orient FtsZ on planar lipid bilayers. The E93C and S255C mutants were overproduced, purified, and found to be functionally active in solution, as well as being capable of sustaining cell division in vivo in complementation assays. Atomic force microscopy was used to observe the shape of the filament fibers formed on the surface. The FtsZ mutants were covalently linked to the lipids and could be polymerized on the bilayer surface in the presence of GTP. Unexpectedly, both mutants assembled into straight structures. E93C formed a well-defined lattice with monomers interacting at 60° and 120° angles, whereas S255C formed a more open array of straight thicker filament aggregates. These results indicate that filament curvature and bending are not fixed and that they can be modulated by the orientation of the monomers with respect to the membrane surface. As filament curvature has been associated with the force generation mechanism, these results point to a possible role of filament membrane attachment in lateral association and curvature, elements currently identified as relevant for force generation.
Affiliation(s)
- Mario Encinar
  - Instituto de Catálisis y Petroleoquímica, CSIC, Marie Curie, 2, Cantoblanco, 28049 Madrid, Spain

13
Levy ED, Teichmann S. Structural, evolutionary, and assembly principles of protein oligomerization. Prog Mol Biol Transl Sci 2013; 117:25-51. [PMID: 23663964 DOI: 10.1016/b978-0-12-386931-9.00002-7] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Indexed: 12/02/2022]
Abstract
In the protein universe, 30-50% of proteins self-assemble to form symmetrical complexes consisting of multiple copies of themselves, called homomers. The prevalence of homomers motivates us to review many of their properties. In Section 1, we describe the methods and challenges associated with quaternary structure inference; these methods are indeed at the basis of any analysis on homomers. In Section 2, we describe the morphological properties of homomers, as well as the database 3DComplex, which provides a taxonomy for both homomeric and heteromeric protein complexes. In Section 3, we review interface properties of homomeric complexes. In Section 4, we then present recent findings on the evolution of homomer interfaces, which we link in Section 5 to the evolution of homomers as entire entities. In Section 6, we discuss mechanisms involved in their assembly and how these mechanisms can be linked to evolution.
Affiliation(s)
- Emmanuel D Levy
  - Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel

14
Biophysical and computational fragment-based approaches to targeting protein-protein interactions: applications in structure-guided drug discovery. Q Rev Biophys 2012; 45:383-426. [PMID: 22971516 DOI: 10.1017/s0033583512000108] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Indexed: 12/17/2022]
Abstract
Drug discovery has classically targeted the active sites of enzymes or ligand-binding sites of receptors and ion channels. In an attempt to improve selectivity of drug candidates, modulation of protein-protein interfaces (PPIs) of multiprotein complexes that mediate conformation or colocation of components of cell-regulatory pathways has become a focus of interest. However, PPIs in multiprotein systems continue to pose significant challenges, as they are generally large, flat and poor in distinguishing features, making the design of small molecule antagonists a difficult task. Nevertheless, encouragement has come from the recognition that a few amino acids - so-called hotspots - may contribute the majority of interaction-free energy. The challenges posed by protein-protein interactions have led to a wellspring of creative approaches, including proteomimetics, stapled α-helical peptides and a plethora of antibody inspired molecular designs. Here, we review a more generic approach: fragment-based drug discovery. Fragments allow novel areas of chemical space to be explored more efficiently, but the initial hits have low affinity. This means that they will not normally disrupt PPIs, unless they are tethered, an approach that has been pioneered by Wells and co-workers. An alternative fragment-based approach is to stabilise the uncomplexed components of the multiprotein system in solution and employ conventional fragment-based screening. Here, we describe the current knowledge of the structures and properties of protein-protein interactions and the small molecules that can modulate them. We then describe the use of sensitive biophysical methods - nuclear magnetic resonance, X-ray crystallography, surface plasmon resonance, differential scanning fluorimetry or isothermal calorimetry - to screen and validate fragment binding. Fragment hits can subsequently be evolved into larger molecules with higher affinity and potency. 
These may provide new leads for drug candidates that target protein-protein interactions and have therapeutic value.

15
Xu Q, Dunbrack RL. Assignment of protein sequences to existing domain and family classification systems: Pfam and the PDB. Bioinformatics 2012; 28:2763-72. [PMID: 22942020 DOI: 10.1093/bioinformatics/bts533] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. RESULTS: We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM-HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. AVAILABILITY: The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly.
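A greedy assignment that tolerates partial overlaps, as outlined in this abstract, might look roughly like the sketch below. The `Hit` fields, the `max_overlap` threshold of 10 residues, and the example hits are hypothetical; this is not the PDBfam implementation, which additionally layers sequence/HMM, HMM-HMM and structure alignments and joins alignments split by insertions.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    family: str    # domain family label (hypothetical identifiers below)
    start: int     # 1-based, inclusive residue position
    end: int       # 1-based, inclusive residue position
    evalue: float  # smaller means a stronger hit


def overlap(a: Hit, b: Hit) -> int:
    """Number of residues shared by two hits (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def greedy_assign(hits: List[Hit], max_overlap: int = 10) -> List[Hit]:
    """Accept hits from strongest to weakest E-value, skipping any hit
    that overlaps an already accepted hit by more than max_overlap
    residues. Returns accepted hits ordered along the sequence."""
    accepted: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.evalue):
        if all(overlap(hit, kept) <= max_overlap for kept in accepted):
            accepted.append(hit)
    return sorted(accepted, key=lambda h: h.start)


# Hypothetical hits on one chain.
example_hits = [
    Hit("FamA", 1, 100, 1e-30),
    Hit("FamB", 95, 200, 1e-20),   # overlaps FamA by 6 residues: kept
    Hit("FamC", 50, 150, 1e-10),   # overlaps FamA by 51 residues: dropped
]
assigned = greedy_assign(example_hits)
```

Allowing a small overlap keeps adjacent domains whose alignment boundaries are slightly fuzzy, while still rejecting a weaker hit that largely duplicates a stronger one.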
Affiliation(s)
- Qifang Xu
  - Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA

16
Nemoto W, Toh H. Functional region prediction with a set of appropriate homologous sequences--an index for sequence selection by integrating structure and sequence information with spatial statistics. BMC Struct Biol 2012; 12:11. [PMID: 22643026 PMCID: PMC3533907 DOI: 10.1186/1472-6807-12-11] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/03/2011] [Accepted: 04/19/2012] [Indexed: 11/17/2022]
Abstract
Background: The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results: We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions: Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems.
Affiliation(s)
- Wataru Nemoto
- Computational Biology Research Center (CBRC), Advanced Industrial Science and Technology (AIST), AIST Tokyo Waterfront Bio-IT Research Building, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
17
|
Franke V, Sikić M, Vlahoviček K. Prediction of interacting protein residues using sequence and structure data. Methods Mol Biol 2012; 819:233-251. [PMID: 22183541 DOI: 10.1007/978-1-61779-465-0_16] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
Identifying hotspots responsible for protein interactions with other macromolecules or drugs provides insight into functional aspects of the protein network, and is a pivotal task in systems biology and drug discovery. Here, we present the protocol for the application of a machine-learning method - Random Forest - to prediction of interacting residues in proteins, based on either the structural parameters or the primary sequence alone.
Affiliation(s)
- Vedran Franke
- Department of Molecular Biology, University of Zagreb, Zagreb, Croatia.
|
18
|
David A, Razali R, Wass MN, Sternberg MJE. Protein-protein interaction sites are hot spots for disease-associated nonsynonymous SNPs. Hum Mutat 2011; 33:359-63. [PMID: 22072597 DOI: 10.1002/humu.21656] [Citation(s) in RCA: 118] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2011] [Accepted: 10/31/2011] [Indexed: 11/08/2022]
Abstract
Many nonsynonymous single nucleotide polymorphisms (nsSNPs) are disease causing due to effects at protein-protein interfaces. We have integrated a database of the three-dimensional (3D) structures of human protein/protein complexes and the humsavar database of nsSNPs. We analyzed whether nsSNPs fall in the protein core, at protein-protein interfaces, or on the surface when not at an interface. Disease-causing nsSNPs that do not occur in the protein core are preferentially located at protein-protein interfaces rather than surface noninterface regions when compared to random segregation. The disruption of the protein-protein interaction can be explained by a range of structural effects including the loss of an electrostatic salt bridge, the destabilization due to reduction of the hydrophobic effect, the formation of a steric clash, and the introduction of a proline altering the main-chain conformation.
Affiliation(s)
- Alessia David
- Centre for Integrative Systems Biology and Bioinformatics, Division of Molecular Biosciences, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
|
19
|
Bickerton GR, Higueruelo AP, Blundell TL. Comprehensive, atomic-level characterization of structurally characterized protein-protein interactions: the PICCOLO database. BMC Bioinformatics 2011; 12:313. [PMID: 21801404 PMCID: PMC3161047 DOI: 10.1186/1471-2105-12-313] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2011] [Accepted: 07/29/2011] [Indexed: 12/04/2022] Open
Abstract
Background: Structural studies are increasingly providing huge amounts of information on multi-protein assemblies. Although a complete understanding of cellular processes will be dependent on an explicit characterization of the intermolecular interactions that underlie these assemblies and mediate molecular recognition, these are not well described by standard representations. Results: Here we present PICCOLO, a comprehensive relational database capturing the details of structurally characterized protein-protein interactions. Interactions are described at the level of interacting pairs of atoms, residues and polypeptide chains, with the physico-chemical nature of the interactions being characterized. Distance and angle terms are used to distinguish 12 different interaction types, including van der Waals contacts, hydrogen bonds and hydrophobic contacts. The explicit aim of PICCOLO is to underpin large-scale analyses of the properties of protein-protein interfaces. This is exemplified by an analysis of residue propensity and interface contact preferences derived from a much larger data set than previously reported. However, PICCOLO also supports detailed inspection of particular systems of interest. Conclusions: The current PICCOLO database comprises more than 260 million interacting atom pairs from 38,202 protein complexes. A web interface for the database is available at http://www-cryst.bioc.cam.ac.uk/piccolo.
Affiliation(s)
- George R Bickerton
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
|
20
|
Fernández‐Recio J. Prediction of protein binding sites and hot spots. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.45] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
21
|
Wei Q, Wang L, Wang Q, Kruger WD, Dunbrack RL. Testing computational prediction of missense mutation phenotypes: functional characterization of 204 mutations of human cystathionine beta synthase. Proteins 2010; 78:2058-74. [PMID: 20455263 DOI: 10.1002/prot.22722] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Predicting the phenotypes of missense mutations uncovered by large-scale sequencing projects is an important goal in computational biology. High-confidence predictions can be an aid in focusing experimental and association studies on those mutations most likely to be associated with causative relationships between mutation and disease. As an aid in developing these methods further, we have derived a set of random mutations of the enzymatic domains of human cystathionine beta synthase. This enzyme is a dimeric protein that catalyzes the condensation of serine and homocysteine to produce cystathionine. Yeast missing this enzyme cannot grow on medium lacking a source of cysteine, while transfection of functional human CBS into yeast strains missing endogenous enzyme can successfully complement for the missing gene. We used PCR mutagenesis with error-prone Taq polymerase to produce 948 colonies and compared cell growth in the presence or absence of a cysteine source as a measure of CBS function. We were able to infer the phenotypes of 204 single-site mutants, 79 of them deleterious and 125 neutral. This set was used to test the accuracy of six publicly available prediction methods for phenotype prediction of missense mutations: SIFT, PolyPhen, PMut, SNPs3D, PhD-SNP, and nsSNPAnalyzer. The top methods are PolyPhen, SIFT, and nsSNPAnalyzer, which have similar performance. Using kernel discriminant functions, we found that the difference in position-specific scoring matrix values is more predictive than the wild-type PSSM score alone, and that the relative surface area in the biologically relevant complex is more predictive than that of the monomeric proteins.
Affiliation(s)
- Qiong Wei
- Program in Molecular and Translational Medicine, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111, USA
|
22
|
Xu Q, Dunbrack RL. The protein common interface database (ProtCID)--a comprehensive database of interactions of homologous proteins in multiple crystal forms. Nucleic Acids Res 2010; 39:D761-70. [PMID: 21036862 PMCID: PMC3013667 DOI: 10.1093/nar/gkq1059] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
The protein common interface database (ProtCID) is a database that contains clusters of similar homodimeric and heterodimeric interfaces observed in multiple crystal forms (CFs). Such interfaces, especially of homologous but non-identical proteins, have been associated with biologically relevant interactions. In ProtCID, protein chains in the protein data bank (PDB) are grouped based on their PFAM domain architectures. For a single PFAM architecture, all the dimers present in each CF are constructed and compared with those in other CFs that contain the same domain architecture. Interfaces occurring in two or more CFs comprise an interface cluster in the database. The same process is used to compare heterodimers of chains with different domain architectures. By examining interfaces that are shared by many homologous proteins in different CFs, we find that the PDB and the Protein Interfaces, Surfaces, and Assemblies (PISA) are not always consistent in their annotations of biological assemblies in a homologous family. Our data therefore provide an independent check on publicly available annotations of the structures of biological interactions for PDB entries. Common interfaces may also be useful in studies of protein evolution. Coordinates for all interfaces in a cluster are downloadable for further analysis. ProtCID is available at http://dunbrack2.fccc.edu/protcid.
Affiliation(s)
- Qifang Xu
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA
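The clustering rule stated in the ProtCID abstract (an interface seen in two or more crystal forms forms a cluster) can be illustrated with a minimal sketch. It assumes that similar interfaces have already been given a shared label by some comparison step; the labels, the crystal form names and the function `common_interface_clusters` are hypothetical and are not part of ProtCID.

```python
from collections import defaultdict

def common_interface_clusters(observations, min_crystal_forms=2):
    """Group interface observations by label and keep those seen in
    at least `min_crystal_forms` distinct crystal forms.

    `observations` is an iterable of (crystal_form, interface_label)
    pairs; the label stands in for a prior similarity comparison
    (an assumption of this sketch).
    """
    forms_by_interface = defaultdict(set)
    for crystal_form, interface_label in observations:
        forms_by_interface[interface_label].add(crystal_form)
    return {
        label: sorted(forms)
        for label, forms in forms_by_interface.items()
        if len(forms) >= min_crystal_forms
    }

# Toy data: interface "A" recurs in three crystal forms, "B" only in one.
observations = [
    ("CF1", "A"), ("CF2", "A"), ("CF1", "B"), ("CF3", "A"), ("CF1", "B"),
]
clusters = common_interface_clusters(observations)
print(clusters)
```

In this toy input, interface "B" is observed twice but only within one crystal form, so it does not qualify as a cluster.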
|
23
|
Abstract
The quaternary structure (QS) of a protein is determined by measuring its molecular weight in solution. The data have to be extracted from the literature, and they may be missing even for proteins that have a crystal structure reported in the Protein Data Bank (PDB). The PDB and other databases derived from it report QS information that either was obtained from the depositors or is based on an analysis of the contacts between polypeptide chains in the crystal, and this frequently differs from the QS determined in solution. The QS of a protein can be predicted from its sequence using either homology or threading methods. However, a majority of the proteins with less than 30% sequence identity have different QSs. A model of the QS can also be derived by docking the subunits when their 3D structure is independently known, but the model is likely to be incorrect if large conformation changes take place when the oligomer assembles.
Affiliation(s)
- Anne Poupon
- Yeast Structural Genomics, IBBMC UMR 8619 CNRS, Université Paris-Sud, Orsay, France
|
24
|
Weitzner B, Meehan T, Xu Q, Dunbrack RL. An unusually small dimer interface is observed in all available crystal structures of cytosolic sulfotransferases. Proteins 2009; 75:289-95. [PMID: 19173308 DOI: 10.1002/prot.22347] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
Cytosolic sulfotransferases catalyze the sulfonation of hormones, metabolites, and xenobiotics. Many of these proteins have been shown to form homodimers and heterodimers. An unusually small dimer interface was previously identified by Petrotchenko et al. (FEBS Lett 2001;490:39-43) by cross-linking, protease digestion, and mass spectrometry and verified by site-directed mutagenesis. Analysis of the crystal packing interfaces in all 28 available crystal structures consisting of 17 crystal forms shows that this interface occurs in all of them. With a small number of exceptions, the publicly available databases of biological assemblies contain either monomers or incorrect dimers. Even crystal structures of mouse SULT1E1, which is a monomer in solution, contain the common dimeric interface, although distorted and missing two important salt bridges.
Affiliation(s)
- Brian Weitzner
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111, USA
|
25
|
Sharon A, Finkelstein A, Shlezinger N, Hatam I. Fungal apoptosis: function, genes and gene function. FEMS Microbiol Rev 2009; 33:833-54. [PMID: 19416362 DOI: 10.1111/j.1574-6976.2009.00180.x] [Citation(s) in RCA: 147] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023] Open
Abstract
Cells of all living organisms are programmed to self-destruct under certain conditions. The most well known form of programmed cell death is apoptosis, which is essential for proper development in higher eukaryotes. In fungi, apoptotic-like cell death occurs naturally during aging and reproduction, and can be induced by environmental stresses and exposure to toxic metabolites. The core apoptotic machinery in fungi is similar to that in mammals, but the apoptotic network is less complex and of more ancient origin. Only some of the mammalian apoptosis-regulating proteins have fungal homologs, and the number of protein families is drastically reduced. Expression in fungi of animal proteins that do not have fungal homologs often affects apoptosis, suggesting functional conservation of these components despite the absence of protein-sequence similarity. Functional analysis of Saccharomyces cerevisiae apoptotic genes, and more recently of those in some filamentous species, has revealed partial conservation, along with substantial differences in function and mode of action between fungal and human proteins. It has been suggested that apoptotic proteins might be suitable targets for novel antifungal treatments. However, implementation of this approach requires a better understanding of fungal apoptotic networks and identification of the key proteins regulating apoptotic-like cell death in fungi.
Affiliation(s)
- Amir Sharon
- Department of Plant Sciences, Tel Aviv University, Tel Aviv, Israel.
|
26
|
SCWRL and MolIDE: computer programs for side-chain conformation prediction and homology modeling. Nat Protoc 2009; 3:1832-47. [PMID: 18989261 DOI: 10.1038/nprot.2008.184] [Citation(s) in RCA: 154] [Impact Index Per Article: 10.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
SCWRL and MolIDE are software applications for prediction of protein structures. SCWRL is designed specifically for the task of prediction of side-chain conformations given a fixed backbone usually obtained from an experimental structure determined by X-ray crystallography or NMR. SCWRL is a command-line program that typically runs in a few seconds. MolIDE provides a graphical interface for basic comparative (homology) modeling using SCWRL and other programs. MolIDE takes an input target sequence and uses PSI-BLAST to identify and align templates for comparative modeling of the target. The sequence alignment to any template can be manually modified within a graphical window of the target-template alignment and visualization of the alignment on the template structure. MolIDE builds the model of the target structure on the basis of the template backbone, predicted side-chain conformations with SCWRL and a loop-modeling program for insertion-deletion regions with user-selected sequence segments. SCWRL and MolIDE can be obtained at (http://dunbrack.fccc.edu/Software.php).
|
27
|
Lee S, Brown A, Pitt WR, Higueruelo AP, Gong S, Bickerton GR, Schreyer A, Tanramluk D, Baylay A, Blundell TL. Structural interactomics: informatics approaches to aid the interpretation of genetic variation and the development of novel therapeutics. Molecular BioSystems 2009; 5:1456-72. [DOI: 10.1039/b906402h] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]
|
28
|
Devenish SRA, Gerrard JA. The role of quaternary structure in (β/α)8-barrel proteins: evolutionary happenstance or a higher level of structure-function relationships? Org Biomol Chem 2009; 7:833-9. [DOI: 10.1039/b818251p] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
29
|
Tsuchiya Y, Nakamura H, Kinoshita K. Discrimination between biological interfaces and crystal-packing contacts. Adv Appl Bioinform Chem 2008; 1:99-113. [PMID: 21918609 PMCID: PMC3169932 DOI: 10.2147/aabc.s4255] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open
Abstract
A method was constructed to discriminate between biologically relevant interfaces and artificial crystal-packing contacts in crystal structures. The method evaluates protein-protein interfaces in terms of complementarities for hydrophobicity, electrostatic potential and shape on the protein surfaces, and chooses the most probable biological interfaces among all possible contacts in the crystal. The method uses a discriminator named "COMP", which is a linear combination of the complementarities for the above three surface features and does not correlate with the contact area. The discrimination of homo-dimer interfaces from symmetry-related crystal-packing contacts based on the COMP value achieved a modest success rate. Subsequent detailed review of the discrimination results raised the success rate to about 88.8%. In addition, our discrimination method yielded some clues for understanding the interaction patterns in several examples in the PDB. Thus, the COMP discriminator can also be used as an indicator of the "biological-ness" of protein-protein interfaces.
Affiliation(s)
- Yuko Tsuchiya
- Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minatoku, Tokyo, 108-8639, Japan
|
30
|
Abstract
Protein–protein recognition plays an essential role in structure and function. Specific non-covalent interactions stabilize the structure of macromolecular assemblies, exemplified in this review by oligomeric proteins and the capsids of icosahedral viruses. They also allow proteins to form complexes that have a very wide range of stability and lifetimes and are involved in all cellular processes. We present some of the structure-based computational methods that have been developed to characterize the quaternary structure of oligomeric proteins and other molecular assemblies and analyze the properties of the interfaces between the subunits. We compare the size, the chemical and amino acid compositions and the atomic packing of the subunit interfaces of protein–protein complexes, oligomeric proteins, viral capsids and protein–nucleic acid complexes. These biologically significant interfaces are generally close-packed, whereas the non-specific interfaces between molecules in protein crystals are loosely packed, an observation that gives a structural basis to specific recognition. A distinction is made within each interface between a core that contains buried atoms and a solvent accessible rim. The core and the rim differ in their amino acid composition and their conservation in evolution, and the distinction helps correlating the structural data with the results of site-directed mutagenesis and in vitro studies of self-assembly.
|
31
|
Lu CH, Huang SW, Lai YL, Lin CP, Shih CH, Huang CC, Hsu WL, Hwang JK. On the relationship between the protein structure and protein dynamics. Proteins 2008; 72:625-34. [PMID: 18247347 DOI: 10.1002/prot.21954] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]
Abstract
Recently, we have developed a method (Shih et al., Proteins: Structure, Function, and Bioinformatics 2007;68: 34-38) to compute correlation of fluctuations of proteins. This method, referred to as the protein fixed-point (PFP) model, is based on the positional vectors of atoms issuing from the fixed point, which is the point of the least fluctuations in proteins. One corollary from this model is that atoms lying on the same shell centered at the fixed point will have the same thermal fluctuations. In practice, this model provides a convenient way to compute the average dynamical properties of proteins directly from the geometrical shapes of proteins without the need of any mechanical models, and hence no trajectory integration or sophisticated matrix operations are needed. As a result, it is more efficient than molecular dynamics simulation or normal mode analysis. Though in the previous study the PFP model has been successfully applied to a number of proteins of various folds, it is not clear to what extent this model can be applied. In this article, we have carried out a comprehensive analysis of the PFP model for a dataset comprising 972 high-resolution X-ray structures with pairwise sequence identity ≤25%. We found that in most cases the PFP model works well. However, in the case of proteins comprising multiple domains, each domain should be treated separately as an independent dynamical module with its own fixed point; and in the case of a protein complex comprising a number of subunits, if functioning as a biological unit, the whole complex should be considered as one single dynamical module with one fixed point. Under such considerations, the resultant correlation coefficient between the computed and the X-ray structural B-factors for the data set is 0.59, and 75% (727/972) of proteins have a correlation coefficient ≥0.5. Our result shows that the fixed-point model is indeed quite general and will be a useful tool for high throughput analysis of dynamical properties of proteins.
Affiliation(s)
- Chih-Hao Lu
- Institute of Bioinformatics, National Chiao Tung University, HsinChu 30050, Taiwan, Republic of China
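The shell corollary in the abstract above can be illustrated with a small sketch: atoms at the same distance from the fixed point receive the same predicted fluctuation, and the predicted profile is then compared with B-factors by a correlation coefficient. Two choices here are assumptions of the sketch and are not taken from the abstract: the fixed point is approximated by the coordinate centroid, and the fluctuation is taken to scale with the squared distance from it. The coordinates and B-factors are toy values.

```python
import math

def centroid(coords):
    """Mean position of the atoms (assumed stand-in for the fixed point)."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def pfp_profile(coords):
    """Squared distance of each atom from the assumed fixed point.

    Atoms on the same shell (equal distance) get identical values.
    """
    cx, cy, cz = centroid(coords)
    return [(x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 for x, y, z in coords]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Toy chain of five atoms on a line, with made-up B-factors.
coords = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
b_factors = [30.0, 15.0, 10.0, 15.0, 30.0]

profile = pfp_profile(coords)
print(profile)
print(pearson(profile, b_factors))
```

For a multi-domain protein, the abstract's recommendation would correspond to calling `pfp_profile` once per domain rather than once for the whole chain.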
|
32
|
Xu Q, Canutescu AA, Wang G, Shapovalov M, Obradovic Z, Dunbrack RL. Statistical analysis of interface similarity in crystals of homologous proteins. J Mol Biol 2008; 381:487-507. [PMID: 18599072 DOI: 10.1016/j.jmb.2008.06.002] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2008] [Revised: 05/30/2008] [Accepted: 06/02/2008] [Indexed: 11/27/2022]
Abstract
Many proteins function as homo-oligomers and are regulated via their oligomeric state. For some proteins, the stoichiometry of homo-oligomeric states under various conditions has been studied using gel filtration or analytical ultracentrifugation experiments. The interfaces involved in these assemblies may be identified using cross-linking and mass spectrometry, solution-state NMR, and other experiments. However, for most proteins, the actual interfaces that are involved in oligomerization are inferred from X-ray crystallographic structures using assumptions about interface surface areas and physical properties. Examination of interfaces across different Protein Data Bank (PDB) entries in a protein family reveals several important features. First, similarities in space group, asymmetric unit size, and cell dimensions and angles (within 1%) do not guarantee that two crystals are actually the same crystal form, containing similar relative orientations and interactions within the crystal. Conversely, two crystals in different space groups may be quite similar in terms of all the interfaces within each crystal. Second, NMR structures and an existing benchmark of PDB crystallographic entries consisting of 126 dimers as well as larger structures and 132 monomers were used to determine whether the existence or lack of common interfaces across multiple crystal forms can be used to predict whether a protein is an oligomer or not. Monomeric proteins tend to have common interfaces across only a minority of crystal forms, whereas higher-order structures exhibit common interfaces across a majority of available crystal forms. The data can be used to estimate the probability that an interface is biological if two or more crystal forms are available. 
Finally, the Protein Interfaces, Surfaces, and Assemblies (PISA) database available from the European Bioinformatics Institute is more consistent in identifying interfaces observed in many crystal forms compared with the PDB and the European Bioinformatics Institute's Protein Quaternary Server (PQS). The PDB, in particular, is missing highly likely biological interfaces in its biological unit files for about 10% of PDB entries.
Affiliation(s)
- Qifang Xu
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA
|
33
|
Lin CP, Huang SW, Lai YL, Yen SC, Shih CH, Lu CH, Huang CC, Hwang JK. Deriving protein dynamical properties from weighted protein contact number. Proteins 2008; 72:929-35. [DOI: 10.1002/prot.21983] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
34
|
Selwood T, Tang L, Lawrence SH, Anokhina Y, Jaffe EK. Kinetics and Thermodynamics of the Interchange of the Morpheein Forms of Human Porphobilinogen Synthase. Biochemistry 2008; 47:3245-57. [DOI: 10.1021/bi702113z] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Trevor Selwood
- Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111
- Lei Tang
- Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111
- Sarah H. Lawrence
- Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111
- Yana Anokhina
- Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111
- Eileen K. Jaffe
- Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, Pennsylvania 19111
|
35
|
Abstract
In a cell, it has been estimated that each protein on average interacts with roughly 10 others, resulting in tens of thousands of proteins known or suspected to have interaction partners; of these, only a tiny fraction have solved protein structures. To partially address this problem, we have developed M-TASSER, a hierarchical method to predict protein quaternary structure from sequence that involves template identification by multimeric threading, followed by multimer model assembly and refinement. The final models are selected by structure clustering. M-TASSER has been tested on a benchmark set comprising 241 dimers having templates with weak sequence similarity and 246 without multimeric templates in the dimer library. Of the total of 207 targets predicted to interact as dimers, 165 (80%) were correctly assigned as interacting with a true positive rate of 68% and a false positive rate of 17%. The initial best template structures have an average root mean-square deviation to native of 5.3, 6.7, and 7.4 Å for the monomer, interface, and dimer structures. The final model shows on average a root mean-square deviation improvement of 1.3, 1.3, and 1.5 Å over the initial template structure for the monomer, interface, and dimer structures, with refinement evident for 87% of the cases. Thus, we have developed a promising approach to predict full-length quaternary structure for proteins that have weak sequence similarity to proteins of solved quaternary structure.
Affiliation(s)
- Jeffrey Skolnick
- Address reprint requests to Jeffrey Skolnick, Tel.: 404-407-8975; Fax: 404-385-7478.
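The percentages quoted in the M-TASSER abstract can be reproduced from its counts with a short calculation. This sketch assumes that the 241 dimers with weakly similar templates are the positive class, that the 246 targets without multimeric templates are the negative class, and that every incorrect dimer prediction falls in the negative class; the abstract does not state this mapping explicitly. The function name `benchmark_rates` is illustrative.

```python
def benchmark_rates(positives, negatives, predicted, true_positives):
    """Return (precision, true positive rate, false positive rate)."""
    false_positives = predicted - true_positives
    precision = true_positives / predicted
    true_positive_rate = true_positives / positives
    false_positive_rate = false_positives / negatives
    return precision, true_positive_rate, false_positive_rate

# Counts as given in the abstract; the class assignment is assumed.
precision, tpr, fpr = benchmark_rates(
    positives=241, negatives=246, predicted=207, true_positives=165
)
print(f"precision: {precision:.1%}")
print(f"true positive rate: {tpr:.1%}")
print(f"false positive rate: {fpr:.1%}")
```

Under these assumptions the three values round to the 80%, 68% and 17% reported in the abstract, which suggests the assumed class assignment is the intended one.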
|
36
|
Günther S, May P, Hoppe A, Frömmel C, Preissner R. Docking without docking: ISEARCH-prediction of interactions using known interfaces. Proteins 2007; 69:839-44. [PMID: 17803236 DOI: 10.1002/prot.21746] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The increasing number of solved protein structures provides a solid number of interfaces, if protein-protein interactions, domain-domain contacts, and contacts between biological units are taken into account. An interface library gives us the opportunity to identify surface regions on a target molecule that are similar by local structure and residue composition. If both unbound components of a possible protein complex exhibit structural similarities to a known interface, the unbound structures can be superposed onto the known interfaces. The approach is accompanied by two mathematical problems. Protein surfaces have to be quickly screened by thousands of patches, and similarity has to be evaluated by a suitable scoring scheme. The used algorithm (NeedleHaystack) identifies similar patches within minutes. Structurally related sites are recognized even if only parts of the template patches are structurally related to the interface region. A successful prediction of the protein complex depends on a suitable template of the library. However, the performed tests indicate that interaction sites are identified even if the similarity is very low. The approach complements existing ab initio methods and provides valuable results on standard benchmark sets.
Affiliation(s)
- Stefan Günther
- Institute of Molecular Biology and Bioinformatics - Charité, 14195 Berlin, Germany.
|