51
Andreeva A, Murzin AG. Structural classification of proteins and structural genomics: new insights into protein folding and evolution. Acta Crystallogr Sect F Struct Biol Cryst Commun 2010; 66:1190-7. [PMID: 20944210 PMCID: PMC2954204 DOI: 10.1107/s1744309110007177] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 01/11/2010] [Accepted: 02/24/2010] [Indexed: 11/10/2022]
Abstract
During the past decade, the Protein Structure Initiative (PSI) centres have become major contributors of new families, superfamilies and folds to the Structural Classification of Proteins (SCOP) database. The PSI results have increased the diversity of protein structural space and accelerated our understanding of it. This review article surveys a selection of protein structures determined by the Joint Center for Structural Genomics (JCSG). It presents previously undescribed β-sheet architectures such as the double barrel and spiral β-roll and discusses new examples of unusual topologies and peculiar structural features observed in proteins characterized by the JCSG and other Structural Genomics centres.
Affiliation(s)
- Antonina Andreeva
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, England
- Alexey G. Murzin
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, England
52
Phages have adapted the same protein fold to fulfill multiple functions in virion assembly. Proc Natl Acad Sci U S A 2010; 107:14384-9. [PMID: 20660769 DOI: 10.1073/pnas.1005822107] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022]
Abstract
Evolutionary relationships may exist among very diverse groups of proteins even though they perform different functions and display little sequence similarity. The tailed bacteriophages present a uniquely amenable system for identifying such groups because of their huge diversity yet conserved genome structures. In this work, we used structural, functional, and genomic context comparisons to conclude that the head-tail connector protein and tail tube protein of bacteriophage lambda diverged from a common ancestral protein. Further comparisons of tertiary and quaternary structures indicate that the baseplate hub and tail terminator proteins of bacteriophage may also be part of this same family. We propose that all of these proteins evolved from a single ancestral tail tube protein fold, and that gene duplication followed by differentiation led to the specialized roles of these proteins seen in bacteriophages today. Although this type of evolutionary mechanism has been proposed for other systems, our work provides an evolutionary mechanism for a group of proteins with different functions that bear no sequence similarity. Our data also indicate that the addition of a structural element at the N terminus of the lambda head-tail connector protein endows it with a distinctive protein interaction capability compared with many of its putative homologues.
53
Bryan PN, Orban J. Proteins that switch folds. Curr Opin Struct Biol 2010; 20:482-8. [PMID: 20591649 DOI: 10.1016/j.sbi.2010.06.002] [Citation(s) in RCA: 144] [Impact Index Per Article: 10.3] [Received: 03/26/2010] [Accepted: 06/02/2010] [Indexed: 11/26/2022]
Abstract
An increasing number of proteins demonstrate the ability to switch between very different fold topologies, expanding their functional utility through new binding interactions. Recent examples of fold switching from naturally occurring and designed systems have a number of common features: (i) The structural transitions require states with diminished stability; (ii) Switching involves flexible regions in one conformer or the other; (iii) A new binding surface is revealed in the alternate fold that can lead to both stabilization of the alternative state and expansion of biological function. Fold switching not only provides insight into how new folds evolve, but also indicates that an amino acid sequence has more information content than previously thought. A polypeptide chain can encode a stable fold while simultaneously hiding latent propensities for alternative states with novel functions.
Affiliation(s)
- Philip N Bryan
- Institute for Bioscience and Biotechnology Research, Department of Bioengineering, University of Maryland, 9600 Gudelsky Drive, Rockville, MD 20850, USA.
54
Bouvignies G, Korzhnev DM, Neudecker P, Hansen DF, Cordes MHJ, Kay LE. A simple method for measuring signs of ¹Hᴺ chemical shift differences between ground and excited protein states. J Biomol NMR 2010; 47:135-41. [PMID: 20428928 PMCID: PMC3034452 DOI: 10.1007/s10858-010-9418-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 03/12/2010] [Accepted: 03/26/2010] [Indexed: 05/29/2023]
Abstract
NMR relaxation dispersion spectroscopy is a powerful method for studying protein conformational dynamics whereby visible, ground and invisible, excited conformers interconvert on the millisecond time-scale. In addition to providing kinetics and thermodynamics parameters of the exchange process, the CPMG dispersion experiment also allows extraction of the absolute values of the chemical shift differences between interconverting states, |Δω|, opening the way for structure determination of excited state conformers. Central to the goal of structural analysis is the availability of the chemical shifts of the excited state that can only be obtained once the signs of Δω are known. Herein we describe a very simple method for determining the signs of ¹Hᴺ Δω values based on a comparison of peak positions in the directly detected dimensions of a pair of ¹Hᴺ-¹⁵N correlation maps recorded at different static magnetic fields. The utility of the approach is demonstrated for three proteins that undergo millisecond time-scale conformational rearrangements. Although the method provides fewer signs than previously published techniques, it does have a number of strengths: (1) data sets needed for analysis are typically available from other experiments, such as those required for measuring signs of ¹⁵N Δω values, thus requiring no additional experimental time; (2) acquisition times in the critical detection dimension can be as long as necessary; and (3) the signs obtained can be used to cross-validate those from other approaches.
Affiliation(s)
- Guillaume Bouvignies
- Department of Molecular Genetics, The University of Toronto, Toronto, Ontario, M5S 1A8, Canada
55
Metamorphic proteins mediate evolutionary transitions of structure. Proc Natl Acad Sci U S A 2010; 107:7287-92. [PMID: 20368465 DOI: 10.1073/pnas.0912616107] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Indexed: 01/31/2023]
Abstract
The primary sequence of proteins usually dictates a single tertiary and quaternary structure. However, certain proteins undergo reversible backbone rearrangements. Such metamorphic proteins provide a means of facilitating the evolution of new folds and architectures. However, because natural folds emerged at the early stages of evolution, the potential role of metamorphic intermediates in mediating evolutionary transitions of structure remains largely unexplored. We evolved a set of new proteins based on fragments of approximately 100 amino acids derived from tachylectin-2, a monomeric, 236-amino-acid, five-bladed beta-propeller. Their structures reveal a unique pentameric assembly and novel beta-propeller structures. Although identical in sequence, the oligomeric subunits adopt two, or even three, different structures that together enable the pentameric assembly of two propellers connected via a small linker. Most of the subunits adopt a wild-type-like structure within individual five-bladed propellers. However, the bridging subunits exhibit domain swaps and asymmetric strand exchanges that allow them to complete the two propellers and connect them. Thus, the modular and metamorphic nature of these subunits enabled dramatic changes in tertiary and quaternary structure, while maintaining the lectin function. These oligomers therefore comprise putative intermediates via which beta-propellers can evolve from smaller elements. Our data also suggest that the ability of one sequence to equilibrate between different structures can be evolutionarily optimized, thus facilitating the emergence of new structures.
56
Kuhns MS, Girvin AT, Klein LO, Chen R, Jensen KD, Newell EW, Huppa JB, Lillemeier BF, Huse M, Chien YH, Garcia KC, Davis MM. Evidence for a functional sidedness to the αβTCR. Proc Natl Acad Sci U S A 2010; 107:5094-9. [PMID: 20202921 PMCID: PMC2841884 DOI: 10.1073/pnas.1000925107] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 12/21/2022]
Abstract
The T cell receptor (TCR) and associated CD3γε, δε, and ζζ signaling dimers allow T cells to discriminate between different antigens and respond accordingly, but our knowledge of how these parts fit and work together is incomplete. In this study, we provide additional evidence that the CD3 heterodimers congregate on one side of the TCR in both the αβ and γδTCR-CD3 complexes. We also report that the other side of the αβTCR mediates homotypic αβTCR interactions and signaling. Specifically, an erythropoietin receptor-based dimerization assay was used to show that, upon complex assembly, the CD3ε chains of two CD3 heterodimers are arranged side-by-side in both the αβ and γδTCR-CD3 complexes. This system was also used to show that αβTCRs can dimerize in the cell membrane and that mutating the unusual outer strands of the Cα domain impairs this dimerization. Finally, we present data showing that, for CD4 T cells, the mutations that impair αβTCR dimerization also alter ligand-induced calcium mobilization, TCR accumulation at the site of pMHC contact, and polarization toward the site of antigen contact. These data reveal a "functional-sidedness" to the αβTCR constant region, with dimerization occurring on the side of the TCR opposite from where the CD3 heterodimers are located.
MESH Headings
- Animals
- Antigen-Presenting Cells/cytology
- CD3 Complex/metabolism
- Calcium Signaling
- Cell Line
- Cell Membrane/metabolism
- Cell Polarity
- Humans
- Intracellular Space/metabolism
- Mice
- Models, Molecular
- Mutation/genetics
- Protein Multimerization
- Protein Structure, Secondary
- Protein Subunits/metabolism
- Receptors, Antigen, T-Cell, alpha-beta/chemistry
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/metabolism
- Receptors, Antigen, T-Cell, gamma-delta/metabolism
- T-Lymphocytes/cytology
Affiliation(s)
- Michael S. Kuhns
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Andrew T. Girvin
- Department of Microbiology and Immunology
- Graduate Program in Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Lawrence O. Klein
- Graduate Program in Biophysics
- Stanford University School of Medicine, Stanford, CA 94305
- Rebecca Chen
- CCIS/ITI Summer High School Research Program
- Stanford University School of Medicine, Stanford, CA 94305
- Kirk D.C. Jensen
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Evan W. Newell
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Johannes B. Huppa
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Björn F. Lillemeier
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Morgan Huse
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- Yueh-hsiu Chien
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
- K. Christopher Garcia
- Department of Molecular and Cellular Physiology
- Department of Structural Biology
- The Howard Hughes Medical Institute, Chevy Chase, MD 20815
- Stanford University School of Medicine, Stanford, CA 94305
- Mark M. Davis
- Department of Microbiology and Immunology
- The Howard Hughes Medical Institute, Chevy Chase, MD 20815
- Stanford University School of Medicine, Stanford, CA 94305
57
Cao B, Elber R. Computational exploration of the network of sequence flow between protein structures. Proteins 2010; 78:985-1003. [PMID: 19899165 DOI: 10.1002/prot.22622] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 10/20/2022]
Abstract
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein-design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense: on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and to have an alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from an all-alpha to an alpha + beta fold is examined with the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed.
Affiliation(s)
- Baoqiang Cao
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, Texas 78712, USA
58
Gramzow L, Ritz MS, Theissen G. On the origin of MADS-domain transcription factors. Trends Genet 2010; 26:149-53. [PMID: 20219261 DOI: 10.1016/j.tig.2010.01.004] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.9] [Received: 12/15/2009] [Revised: 01/06/2010] [Accepted: 01/06/2010] [Indexed: 11/30/2022]
Abstract
MADS-domain transcription factors are involved in signal transduction and developmental control in plants, animals and fungi. Because their diversification is linked to the origin of novelties in multicellular eukaryotes, the early evolution of MADS-domain proteins is of interest, but has remained enigmatic. Employing whole genome sequence information and remote homology detection methods, we demonstrate that the MADS domain originated from a region of topoisomerases IIA subunit A. Furthermore, we provide evidence that gene duplication occurred in the lineage that led to the MRCA of extant eukaryotes, giving rise to SRF-like and MEF2-like MADS-box genes.
Affiliation(s)
- Lydia Gramzow
- Department of Genetics, Friedrich Schiller University Jena, D-07743 Jena, Germany
59
Sadowski MI, Taylor WR. Protein structures, folds and fold spaces. J Phys Condens Matter 2010; 22:033103. [PMID: 21386276 DOI: 10.1088/0953-8984/22/3/033103] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 05/30/2023]
Abstract
There has been considerable progress towards the goal of understanding the space of possible tertiary structures adopted by proteins. Despite a greatly increased rate of structure determination and a deliberate strategy of sequencing proteins expected to be very different from those already known, it is now rare to see a genuinely new fold, leading to the conclusion that we have seen the majority of natural structural types. The increase in knowledge has also led to a critical examination of traditional fold-based classifications and their meaning for evolution and protein structures. We review these issues and discuss possible solutions.
Affiliation(s)
- Michael I Sadowski
- Division of Mathematical Biology, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
60
A minimal sequence code for switching protein structure and function. Proc Natl Acad Sci U S A 2009; 106:21149-54. [PMID: 19923431 DOI: 10.1073/pnas.0906408106] [Citation(s) in RCA: 184] [Impact Index Per Article: 12.3] [Indexed: 11/18/2022]
Abstract
We present here a structural and mechanistic description of how a protein changes its fold and function, mutation by mutation. Our approach was to create 2 proteins that (i) are stably folded into 2 different folds, (ii) have 2 different functions, and (iii) are very similar in sequence. In this simplified sequence space we explore the mutational path from one fold to another. We show that an IgG-binding, 4β+α fold can be transformed into an albumin-binding, 3-α fold via a mutational pathway in which neither function nor native structure is completely lost. The stabilities of all mutants along the pathway are evaluated, key high-resolution structures are determined by NMR, and an explanation of the switching mechanism is provided. We show that the conformational switch from 4β+α to 3-α structure can occur via a single amino acid substitution. On one side of the switch point, the 4β+α fold is >90% populated (pH 7.2, 20 °C). A single mutation switches the conformation to the 3-α fold, which is >90% populated (pH 7.2, 20 °C). We further show that a bifunctional protein exists at the switch point with affinity for both IgG and albumin.
61
Galvagnion C, Smith MTJ, Broom A, Vassall KA, Meglei G, Gaspar JA, Stathopulos PB, Cheyne B, Meiering EM. Folding and association of thermophilic dimeric and trimeric DsrEFH proteins: Tm0979 and Mth1491. Biochemistry 2009; 48:2891-906. [PMID: 19290646 DOI: 10.1021/bi801784d] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 01/15/2023]
Abstract
Although the majority of natural proteins exist as protein-protein complexes, the molecular basis for the formation and regulation of such interactions and the evolution of protein interfaces remain poorly understood. We have investigated these phenomena by characterizing the thermal and chemical denaturation of thermophilic DsrEFH proteins that have a common subunit fold but distinct quaternary structures: homodimeric Tm0979 and homotrimeric Mth1491. Tm0979 forms a moderate affinity dimer, and a monomeric intermediate is readily populated at equilibrium and during folding kinetics. In contrast, the Mth1491 trimer has extremely high stability, so that a monomeric form is not measurably populated at equilibrium, although it may be during folding kinetics. A common mechanism for evolution of quaternary structures may be facile formation of a relatively stable monomeric species, with stabilizing intermolecular interactions centering on alternative environments for a beta-strand at the edge of the monomer, augmented by malleable hydrophobic interactions. The exceptional trimer stability arises from a remarkably slow unfolding rate constant, 6.5 × 10⁻¹³ s⁻¹, which is a common characteristic of highly stable thermophilic and/or oligomeric proteins. The folding characteristics of Tm0979 and Mth1491 have interesting implications for assembly and regulation of homo- and heterooligomeric proteins in vivo.
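To give a sense of scale for the unfolding rate constant quoted in this abstract, a first-order rate constant k can be converted to a half-life with the textbook relation t½ = ln 2 / k. The sketch below applies only that relation; the function name and the 365.25-day year are illustrative assumptions, not part of the cited work, and the calculation treats unfolding as a simple first-order process.

```python
import math

# Assumption for illustration: a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def half_life_seconds(rate_constant_per_s: float) -> float:
    """Half-life of a first-order process: t_1/2 = ln(2) / k."""
    if rate_constant_per_s <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / rate_constant_per_s


k_unfold = 6.5e-13  # s^-1, the value quoted in the abstract
t_half_s = half_life_seconds(k_unfold)
t_half_years = t_half_s / SECONDS_PER_YEAR
print(f"half-life: {t_half_s:.3g} s (roughly {t_half_years:,.0f} years)")
```

Under these assumptions the half-life comes out on the order of tens of thousands of years, which is one way to read the abstract's phrase "remarkably slow".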
Affiliation(s)
- Céline Galvagnion
- Guelph-Waterloo Centre for Graduate Work in Chemistry and Biochemistry and Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
62
Structural relationships among proteins with different global topologies and their implications for function annotation strategies. Proc Natl Acad Sci U S A 2009; 106:17377-82. [PMID: 19805138 DOI: 10.1073/pnas.0907971106] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022]
Abstract
It has become increasingly apparent that geometric relationships often exist between regions of two proteins that have quite different global topologies or folds. In this article, we examine whether such relationships can be used to infer a functional connection between the two proteins in question. We find, by considering a number of examples involving metal and cation binding, sugar binding, and aromatic group binding, that geometrically similar protein fragments can share related functions, even if they have been classified as belonging to different folds and topologies. Thus, the use of classifications inevitably limits the number of functional inferences that can be obtained from the comparative analysis of protein structures. In contrast, the development of interactive computational tools that recognize the "continuous" nature of protein structure/function space, by increasing the number of potentially meaningful relationships that are considered, may offer a dramatic enhancement in the ability to extract information from protein structure databases. We introduce the MarkUs server, which embodies this strategy and is designed for a user interested in developing and validating specific functional hypotheses.
63
Horst J, Samudrala R. Diversity of protein structures and difficulties in fold recognition: the curious case of protein G. F1000 Biol Rep 2009; 1:69. [PMID: 20209018 PMCID: PMC2832337 DOI: 10.3410/b1-69] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
Abstract
We examine the ability of current state-of-the-art methods in protein structure prediction to discriminate topologically distant folds encoded by highly similar (>90% sequence identity) designed proteins in blind protein structure prediction experiments. We detail the corresponding prognosis for the protein fold recognition field and highlight the features of the methodologies that successfully deciphered this folding riddle.
Affiliation(s)
- Jeremy Horst
- Department of Oral Biology, University of Washington School of Dentistry, 1959 NE Pacific Street, Seattle, WA 98195-7132, USA
- Department of Microbiology, University of Washington School of Medicine, 1959 NE Pacific Street, Seattle, WA 98195-7132, USA
- Ram Samudrala
- Department of Oral Biology, University of Washington School of Dentistry, 1959 NE Pacific Street, Seattle, WA 98195-7132, USA
- Department of Microbiology, University of Washington School of Medicine, 1959 NE Pacific Street, Seattle, WA 98195-7132, USA
64
The continuity of protein structure space is an intrinsic property of proteins. Proc Natl Acad Sci U S A 2009; 106:15690-5. [PMID: 19805219 DOI: 10.1073/pnas.0907683106] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022]
Abstract
The classical view of the space of protein structures is that it is populated by a discrete set of protein folds. For proteins up to 200 residues long, by using structural alignments and building upon ideas of the completeness and continuity of structure space, we show that nearly any structure is significantly related to any other using a transitive set of no more than 7 intermediate structurally related proteins. This result holds for all structures in the Protein Data Bank, even when structural relationships between evolutionarily related proteins (as detected by threading or functional analyses) are excluded. A similar picture holds for an artificial library of compact, hydrogen-bonded, homopolypeptide structures. The 3 sets share the global connectivity features of random graphs, in which the local connectivity of each node (i.e., the number of neighboring structures per protein) is preserved. This high connectivity supports the continuous view of single-domain protein structure space. More importantly, these results do not depend on evolution, but only on the physics of protein structures. The fact that evolutionary divergence need not be invoked to explain the continuous nature of protein structure space has implications for how the universe of protein structures might have originated, and how function should be transferred between proteins of similar structure.
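The "transitive set of intermediates" idea in this abstract can be pictured as a shortest-path question on a graph whose nodes are structures and whose edges mark significant structural similarity. The following is a minimal sketch using breadth-first search on a made-up toy graph; the node labels, the adjacency data, and the function name are hypothetical, and this is not the authors' pipeline or similarity criterion.

```python
from collections import deque


def intermediates_between(graph, start, goal):
    """Count intermediate nodes on a shortest path from start to goal.

    `graph` maps each node to an iterable of neighbours (edges stand for
    "significantly similar structures"). Returns None if no path exists.
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])  # (node, number of edges walked so far)
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                # dist edges walked plus one final edge means dist intermediates
                return dist
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical toy network: a chain A-B-C-D plus an isolated structure E.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": [],
}

print(intermediates_between(toy_graph, "A", "D"))  # path A-B-C-D
print(intermediates_between(toy_graph, "A", "E"))  # no path
```

On a real data set, the abstract's claim would correspond to this count staying at or below 7 for nearly every pair of structures.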
65
Sippl MJ. Fold space unlimited. Curr Opin Struct Biol 2009; 19:312-20. [DOI: 10.1016/j.sbi.2009.03.010] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 01/07/2009] [Revised: 02/16/2009] [Accepted: 03/16/2009] [Indexed: 11/25/2022]
66
Petrey D, Honig B. Is protein classification necessary? Toward alternative approaches to function annotation. Curr Opin Struct Biol 2009; 19:363-8. [PMID: 19269161 PMCID: PMC2745633 DOI: 10.1016/j.sbi.2009.02.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 01/28/2009] [Accepted: 02/02/2009] [Indexed: 11/16/2022]
Abstract
The current nonredundant protein sequence database contains over seven million entries and the number of individual functional domains is significantly larger than this value. The vast quantity of data associated with these proteins poses enormous challenges to any attempt at function annotation. Classification of proteins into sequence and structural groups has been widely used as an approach to simplifying the problem. In this article we question such strategies. We describe how the multifunctionality and structural diversity of even closely related proteins confounds efforts to assign function on the basis of overall sequence or structural similarity. Rather, we suggest that strategies that avoid classification may offer a more robust approach to protein function annotation.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA
67
Pell LG, Liu A, Edmonds L, Donaldson LW, Howell PL, Davidson AR. The X-ray crystal structure of the phage lambda tail terminator protein reveals the biologically relevant hexameric ring structure and demonstrates a conserved mechanism of tail termination among diverse long-tailed phages. J Mol Biol 2009; 389:938-51. [PMID: 19426744 DOI: 10.1016/j.jmb.2009.04.072] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 01/27/2009] [Revised: 04/23/2009] [Accepted: 04/28/2009] [Indexed: 01/28/2023]
Abstract
The tail terminator protein (TrP) plays an essential role in phage tail assembly by capping the rapidly polymerizing tail once it has reached its requisite length and serving as the interaction surface for phage heads. Here, we present the 2.7-Å crystal structure of a hexameric ring of gpU, the TrP of phage lambda. Using sequence alignment analysis and site-directed mutagenesis, we have shown that this multimeric structure is biologically relevant and we have delineated its functional surfaces. Comparison of the hexameric crystal structure with the solution structure of gpU that we previously solved using NMR spectroscopy shows large structural changes occurring upon multimerization and suggests a mechanism that allows gpU to remain monomeric at high concentrations on its own, yet polymerize readily upon contact with an assembled tail tube. The gpU hexamer displays several flexible loops that play key roles in head and tail binding, implying a role for disorder-to-order transitions in controlling assembly as has been observed with other lambda morphogenetic proteins. Finally, we have found that the hexameric structure of gpU is very similar to the structure of a putative TrP from a contractile phage tail even though it displays no detectable sequence similarity. This finding coupled with further bioinformatic investigations has led us to conclude that the TrPs of non-contractile-tailed phages, such as lambda, are evolutionarily related to those of contractile-tailed phages, such as P2 and Mu, and that all long-tailed phages may utilize a conserved mechanism for tail termination.
Affiliation(s)
- Lisa G Pell
- Department of Biochemistry, University of Toronto, ON, Canada
68
Das M, Ganguly T, Bandhu A, Mondal R, Chanda PK, Jana B, Sau S. Moderately thermostable phage Φ11 Cro repressor has novel DNA-binding capacity and physicochemical properties. BMB Rep 2009; 42:160-5. [DOI: 10.5483/bmbrep.2009.42.3.160] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022]
69
Pascual-García A, Abia D, Ortiz ÁR, Bastolla U. Cross-over between discrete and continuous protein structure space: insights into automatic classification and networks of protein structures. PLoS Comput Biol 2009; 5:e1000331. [PMID: 19325884 PMCID: PMC2654728 DOI: 10.1371/journal.pcbi.1000331] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Received: 08/20/2008] [Accepted: 02/11/2009] [Indexed: 11/19/2022]
Abstract
Structural classifications of proteins assume the existence of the fold, which is an intrinsic equivalence class of protein domains. Here, we test in which conditions such an equivalence class is compatible with objective similarity measures. We base our analysis on the transitive property of the equivalence relationship, requiring that similarity of A with B and B with C implies that A and C are also similar. Divergent gene evolution leads us to expect that the transitive property should approximately hold. However, if protein domains are a combination of recurrent short polypeptide fragments, as proposed by several authors, then similarity of partial fragments may violate the transitive property, favouring the continuous view of the protein structure space. We propose a measure to quantify the violations of the transitive property when a clustering algorithm joins elements into clusters, and we find out that such violations present a well defined and detectable cross-over point, from an approximately transitive regime at high structure similarity to a regime with large transitivity violations and large differences in length at low similarity. We argue that protein structure space is discrete and hierarchic classification is justified up to this cross-over point, whereas at lower similarities the structure space is continuous and it should be represented as a network. We have tested the qualitative behaviour of this measure, varying all the choices involved in the automatic classification procedure, i.e., domain decomposition, alignment algorithm, similarity score, and clustering algorithm, and we have found out that this behaviour is quite robust. The final classification depends on the chosen algorithms. We used the values of the clustering coefficient and the transitivity violations to select the optimal choices among those that we tested. Interestingly, this criterion also favours the agreement between automatic and expert classifications. 
As a domain set, we have selected a consensus set of 2,890 domains decomposed very similarly in SCOP and CATH. As an alignment algorithm, we used a global version of MAMMOTH developed in our group, which is both rapid and accurate. As a similarity measure, we used the size-normalized contact overlap, and as a clustering algorithm, we used average linkage. The resulting automatic classification at the cross-over point was more consistent than expert ones with respect to the structure similarity measure, with 86% of the clusters corresponding to subsets of either SCOP or CATH superfamilies and fewer than 5% containing domains in distinct folds according to both SCOP and CATH. Almost 15% of SCOP superfamilies and 10% of CATH superfamilies were split, consistent with the notion of fold change in protein evolution. These results were qualitatively robust for all choices that we tested, although we did not try to use alignment algorithms developed by other groups. Folds defined in SCOP and CATH would be completely joined in the regime of large transitivity violations where clustering is more arbitrary. Consistently, the agreement between SCOP and CATH at fold level was lower than their agreement with the automatic classification obtained using as a clustering algorithm, respectively, average linkage (for SCOP) or single linkage (for CATH). The networks representing significant evolutionary and structural relationships between clusters beyond the cross-over point may allow us to perform evolutionary, structural, or functional analyses beyond the limits of classification schemes. These networks and the underlying clusters are available at http://ub.cbm.uam.es/research/ProtNet.php.
Making order of the fast-growing information on proteins is essential for gaining evolutionary and functional knowledge. The most successful approaches to this task are based on classifications of protein structures, such as SCOP and CATH, which assume a discrete view of the protein structure space as a collection of separated equivalence classes (folds). However, several authors proposed that protein domains should be regarded as assemblies of polypeptide fragments, which implies that the protein–structure space is continuous. Here, we assess these views of domain space through the concept of transitivity; i.e., we test whether structure similarity of A with B and B with C implies that A and C are similar, as required for consistent classification. We find that the domain space is approximately transitive and discrete at high similarity and continuous at low similarity, where transitivity is severely violated. Comparing our classification at the cross-over similarity with CATH and SCOP, we find that they join proteins at low similarity where classification is inconsistent. Part of this discrepancy is due to structural divergence of homologous domains, which are forced to be in a single cluster in CATH and SCOP. Structural and evolutionary relationships between consistent clusters are represented as a network in our approach, going beyond current protein classification schemes. We conjecture that our results are related to a change of evolutionary regime, from uniparental divergent evolution for highly related domains to assembly of large fragments for which the classical tree representation is unsuitable.
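The transitivity test described in this abstract can be illustrated with a small sketch. This is not the authors' measure, which is defined over the clustering process and is not specified here; it is a simplified stand-in that, for a hypothetical pairwise similarity matrix and a chosen threshold, counts triples in which two pairs are similar but the third pair is not. The function name, the matrix values and the thresholds are all assumptions made for illustration.

```python
from itertools import combinations


def transitivity_violations(sim, threshold):
    """Count violated triples in a symmetric similarity matrix.

    A triple (a, b, c) is "connected" when at least two of its three
    pairs reach the threshold. It is a violation when exactly two do,
    i.e. A~B and B~C hold but A~C does not (in some labelling).
    Returns (violations, connected_triples).
    """
    n = len(sim)
    violations = 0
    connected = 0
    for a, b, c in combinations(range(n), 3):
        links = sum(
            sim[i][j] >= threshold for i, j in ((a, b), (b, c), (a, c))
        )
        if links >= 2:
            connected += 1
            if links == 2:
                violations += 1
    return violations, connected


# Hypothetical similarities for three domains A, B, C:
# A~B and B~C are high, A~C is low (e.g. a shared fragment only).
example = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]

if __name__ == "__main__":
    for t in (0.1, 0.5, 0.85):
        print(t, transitivity_violations(example, t))
```

Scanning the threshold from high to low similarity and tracking the ratio of violations to connected triples gives a rough analogue of the cross-over behaviour the abstract reports, under the simplifying assumptions stated above.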
Affiliation(s)
| | - David Abia
- Centro de Biología Molecular ‘Severo Ochoa’ (CSIC-UAM), Cantoblanco, Madrid, Spain
| | - Ángel R. Ortiz
- Centro de Biología Molecular ‘Severo Ochoa’ (CSIC-UAM), Cantoblanco, Madrid, Spain
| | - Ugo Bastolla
- Centro de Biología Molecular ‘Severo Ochoa’ (CSIC-UAM), Cantoblanco, Madrid, Spain
|
70
|
Martínez-Cruz LA, Encinar JA, Kortazar D, Prieto J, Gómez J, Fernández-Millán P, Lucas M, Arribas EA, Fernández JA, Martínez-Chantar ML, Mato JM, Neira JL. The CBS Domain Protein MJ0729 of Methanocaldococcus jannaschii Is a Thermostable Protein with a pH-Dependent Self-Oligomerization. Biochemistry 2009; 48:2760-76. [DOI: 10.1021/bi801920r] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/26/2023]
Affiliation(s)
- Luis Alfonso Martínez-Cruz, José A. Encinar, Danel Kortazar, Jesús Prieto, Javier Gómez, Pablo Fernández-Millán, María Lucas, Egoitz Astigarraga Arribas, José Andrés Fernández, María Luz Martínez-Chantar, José M. Mato, José Luis Neira
- Unidad de Biología Estructural, CIC bioGUNE, Parque Tecnológico de Vizcaya, Ed. 800, 48160 Derio, Bizkaia, Spain, Instituto de Biología Molecular y Celular, Universidad Miguel Hernández, Avda. del Ferrocarril s/n, 03202 Elche (Alicante), Spain, Structural Biology and Biocomputing Programme, Centro Nacional de Investigaciones Oncológicas (CNIO), 28007 Madrid, Spain, Departamento de Química-Física, Universidad del País Vasco UPV-EHU, Lejona, Bizkaia, Spain, Unidad de Metabolómica, CIC bioGUNE, Parque
|
71
|
Abstract
Molecular modeling techniques have made significant advances in recent years and are becoming essential components of many chemical, physical and biological studies. Here we present three techniques widely used in the simulation of biomolecular systems: structural and homology modeling, molecular dynamics and molecular docking. For each of these topics we present a brief discussion of the underlying scientific basis of the technique, some simple examples of how the method is commonly applied, and some discussion of the limitations and caveats of which the user should be aware. References for further reading as well as an extensive list of software resources are provided.
Affiliation(s)
- Akansha Saxena
- Biomedical Engineering, Washington University, St Louis, Missouri, USA
| | - Diana Wong
- Biomedical Engineering, Washington University, St Louis, Missouri, USA
| | - Karthikeyan Diraviyam
- Biomedical Engineering and Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - David Sept
- Biomedical Engineering and Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
|
72
|
Affiliation(s)
- Alexey G Murzin
- MRC Centre for Protein Engineering, Hills Road, Cambridge CB2 0QH, UK.
|
73
|
Dubrava MS, Ingram WM, Roberts SA, Weichsel A, Montfort WR, Cordes MHJ. N15 Cro and lambda Cro: orthologous DNA-binding domains with completely different but equally effective homodimer interfaces. Protein Sci 2008; 17:803-12. [PMID: 18369196 DOI: 10.1110/ps.073330808] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 10/22/2022]
Abstract
Bacteriophage Cro proteins bind to target DNA as dimers but do not all dimerize with equal strength, and differ in fold in the region of the dimer interface. We report the structure of the Cro protein from Enterobacteria phage N15 at 1.05 Å resolution. The subunit fold contains five alpha-helices and is closely similar to the structure of P22 Cro (1.3 Å backbone root mean square difference over 52 residues), but quite different from that of lambda Cro, a structurally diverged member of this family with a mixed alpha-helix/beta-sheet fold. N15 Cro crystallizes as a biological dimer with an extensive interface (1303 Å² change in accessible surface area per dimer) and also dimerizes in solution with a Kd of 5.1 ± 1.5 µM. Its dimerization is much stronger than that of its structural homolog P22 Cro, which does not self-associate detectably in solution. Instead, the level of self-association and interfacial area for N15 Cro is similar to that of lambda Cro, even though these two orthologs do not share the same fold and have dimer interfaces that are qualitatively different in structure. The common Cro ancestor is thought to be an all-helical monomer similar to P22 Cro. We propose that two Cro descendants independently developed stronger dimerization by entirely different mechanisms.
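To give a feel for what a dimer dissociation constant of about 5 µM implies, the sketch below computes the fraction of protein chains found in dimers at a given total concentration. It assumes a simple two-state monomer-dimer equilibrium with Kd = [M]²/[D] and the total concentration expressed in monomer units; the abstract does not state which convention the authors used, so the function name, the convention and the example concentrations are assumptions for illustration only.

```python
import math


def dimer_fraction(total_um, kd_um):
    """Fraction of chains in dimers for 2M <-> D (assumed model).

    Assumes Kd = [M]^2 / [D] and total = [M] + 2[D], both in µM.
    Solving 2[M]^2 + Kd*[M] - Kd*total = 0 for the positive root:
        [M] = Kd * (sqrt(1 + 8*total/Kd) - 1) / 4
    """
    if total_um <= 0 or kd_um <= 0:
        raise ValueError("concentrations must be positive")
    monomer = kd_um * (math.sqrt(1.0 + 8.0 * total_um / kd_um) - 1.0) / 4.0
    return 1.0 - monomer / total_um


if __name__ == "__main__":
    kd = 5.1  # µM, value reported in the abstract
    for total in (0.5, 5.1, 15.3, 100.0):  # hypothetical concentrations
        print(f"{total:6.1f} µM total -> {dimer_fraction(total, kd):.2f} dimer")
```

Under this convention, half of the chains are dimeric when the total concentration equals Kd, which is one way to read the reported value.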
Affiliation(s)
- Matthew S Dubrava
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, Arizona 85721, USA
|
74
|
|