1
Li Y, Arcos S, Sabsay KR, te Velthuis AJW, Lauring AS. Deep mutational scanning reveals the functional constraints and evolutionary potential of the influenza A virus PB1 protein. J Virol 2023; 97:e0132923. [PMID: 37882522 PMCID: PMC10688322 DOI: 10.1128/jvi.01329-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 10/08/2023] [Indexed: 10/27/2023]
Abstract
IMPORTANCE The influenza virus polymerase is important for adaptation to new hosts and, as a determinant of mutation rate, for the process of adaptation itself. We performed a deep mutational scan of the polymerase basic 1 (PB1) protein to gain insights into the structural and functional constraints on the influenza RNA-dependent RNA polymerase. We find that PB1 is highly constrained at specific sites that are only moderately predicted by the global structure or larger domain. We identified a number of beneficial mutations, many of which have been shown to be functionally important or observed in influenza virus' natural evolution. Overall, our atlas of PB1 mutations and their fitness impacts serves as an important resource for future studies of influenza replication and evolution.
Affiliation(s)
- Yuan Li
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Sarah Arcos
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Kimberly R. Sabsay
- Department of Molecular Biology, Princeton University, Princeton, New Jersey, USA
- Lewis-Sigler Institute, Princeton University, Princeton, New Jersey, USA
- Adam S. Lauring
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA
2
Artsimovitch I, Ramírez-Sarmiento CA. Metamorphic proteins under a computational microscope: Lessons from a fold-switching RfaH protein. Comput Struct Biotechnol J 2022; 20:5824-5837. [PMID: 36382197 PMCID: PMC9630627 DOI: 10.1016/j.csbj.2022.10.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2022] [Revised: 10/18/2022] [Accepted: 10/18/2022] [Indexed: 11/28/2022]
Abstract
Metamorphic proteins constitute unexpected paradigms of the protein folding problem, as their sequences encode two alternative folds, which reversibly interconvert within biologically relevant timescales to trigger different cellular responses. Once considered a rare aberration, metamorphism may be common among proteins that must respond to rapidly changing environments, exemplified by NusG-like proteins, the only transcription factors present in every domain of life. RfaH, a specialized paralog of bacterial NusG, undergoes an all-α to all-β domain switch to activate expression of virulence and conjugation genes in many animal and plant pathogens and is the quintessential example of a metamorphic protein. The dramatic nature of the RfaH structural transformation and the richness of its evolutionary history make it an excellent model for studying how metamorphic proteins switch folds. Here, we summarize the structural and functional evidence that sparked the discovery of RfaH as a metamorphic protein, the experimental and computational approaches that enabled the description of the molecular mechanism and refolding pathways of its structural interconversion, and the ongoing efforts to find signatures and general properties to ultimately describe the protein metamorphome.
Affiliation(s)
- Irina Artsimovitch
- Department of Microbiology and The Center for RNA Biology, The Ohio State University, Columbus, OH, USA
- César A. Ramírez-Sarmiento
- Institute for Biological and Medical Engineering, Schools of Engineering, Medicine and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, Chile
- ANID, Millennium Science Initiative Program, Millennium Institute for Integrative Biology (iBio), Santiago, Chile
3
Jayaraman V, Toledo‐Patiño S, Noda‐García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Affiliation(s)
- Vijay Jayaraman
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Saacnicteh Toledo‐Patiño
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Lianet Noda‐García
- Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Paola Laurino
- Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
4
Hanau S, Helliwell JR. 6-Phosphogluconate dehydrogenase and its crystal structures. Acta Crystallogr F Struct Biol Commun 2022; 78:96-112. [PMID: 35234135 PMCID: PMC8900737 DOI: 10.1107/s2053230x22001091] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2021] [Accepted: 01/31/2022] [Indexed: 11/10/2022]
Abstract
6-Phosphogluconate dehydrogenase (6PGDH; EC 1.1.1.44) catalyses the oxidative decarboxylation of 6-phosphogluconate to ribulose 5-phosphate in the context of the oxidative part of the pentose phosphate pathway. Depending on the species, it can be a homodimer or a homotetramer. Oligomerization plays a functional role not only because the active site is at the interface between subunits but also due to the interlocking tail-modulating activity, similar to that of isocitrate dehydrogenase and malic enzyme, which catalyse a similar type of reaction. Since the pioneering crystal structure of sheep liver 6PGDH, which allowed motifs common to the β-hydroxyacid dehydrogenase superfamily to be recognized, several other 6PGDH crystal structures have been solved, including those of ternary complexes. These showed that more than one conformation exists, as had been suggested for many years from enzyme studies in solution. It is inferred that an asymmetrical conformation with a rearrangement of one of the two subunits underlies the homotropic cooperativity. There has been particular interest in the presence or absence of sulfate during crystallization. This might be related to the fact that this ion, which is a competitive inhibitor that binds in the active site, can induce the same 6PGDH configuration as in the complexes with physiological ligands. Mutagenesis, inhibitors, kinetic and binding studies, post-translational modifications and research on the enzyme in cancer cells have been complementary to the crystallographic studies. Computational modelling and new structural studies will probably help to refine the understanding of the functioning of this enzyme, which represents a promising therapeutic target in immunity, cancer and infective diseases. 6PGDH also has applied-science potential as a biosensor or a biobattery. To this end, the enzyme has been efficiently immobilized on specific polymers and nanoparticles. 
This review spans the 6PGDH literature and all of the 6PGDH crystal structure data files held by the Protein Data Bank.
5
Sanchez-Pulido L, Ponting CP. Extending the Horizon of Homology Detection with Coevolution-based Structure Prediction. J Mol Biol 2021; 433:167106. [PMID: 34139218 PMCID: PMC8527833 DOI: 10.1016/j.jmb.2021.167106] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/28/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 12/12/2022]
Abstract
Traditional sequence analysis algorithms fail to identify distant homologies when they lie beyond a detection horizon. In this review, we discuss how co-evolution-based contact and distance prediction methods are pushing back this homology detection horizon, thereby yielding new functional insights and experimentally testable hypotheses. Based on correlated substitutions, these methods divine three-dimensional constraints among amino acids in protein sequences that were previously devoid of all annotated domains and repeats. The new algorithms discern hidden structure in an otherwise featureless sequence landscape. Their revelatory impact promises to be as profound as the use, by archaeologists, of ground-penetrating radar to discern long-hidden, subterranean structures. As examples of this, we describe how triplicated structures reflecting longin domains in MON1A-like proteins, or UVR-like repeats in DISC1, emerge from their predicted contact and distance maps. These methods also help to resolve structures that do not conform to a "beads-on-a-string" model of protein domains. In one such example, we describe CFAP298 whose ubiquitin-like domain was previously challenging to perceive owing to a large sequence insertion within it. More generally, the new algorithms permit an easier appreciation of domain families and folds whose evolution involved structural insertion or rearrangement. As we exemplify with α1-antitrypsin, coevolution-based predicted contacts may also yield insights into protein dynamics and conformational change. This new combination of structure prediction (using innovative co-evolution based methods) and homology inference (using more traditional sequence analysis approaches) shows great promise for bringing into view a sea of evolutionary relationships that had hitherto lain far beyond the horizon of homology detection.
Affiliation(s)
- Luis Sanchez-Pulido
- Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK.
- Chris P Ponting
- Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh EH4 2XU, UK.
6
Maturana P, Tobar-Calfucoy E, Fuentealba M, Roversi P, Garratt R, Cabrera R. Crystal structure of the 6-phosphogluconate dehydrogenase from Gluconobacter oxydans reveals tetrameric 6PGDHs as the crucial intermediate in the evolution of structure and cofactor preference in the 6PGDH family. Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.16572.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
Background: The enzyme 6-phosphogluconate dehydrogenase (6PGDH) is the central enzyme of the oxidative pentose phosphate pathway. Members of the 6PGDH family belong to different classes: either homodimeric enzymes assembled from long-chain subunits or homotetrameric ones assembled from short-chain subunits. Dimeric 6PGDHs bear an internal duplication absent in tetrameric 6PGDHs and distant homologues of the β-hydroxyacid dehydrogenase (βHADH) superfamily. Methods: We use X-ray crystallography to determine the structure of the apo form of the 6PGDH from Gluconobacter oxydans (Go6PGDH). We carried out a structural and phylogenetic analysis of short and long-chain 6PGDHs. We put forward an evolutionary hypothesis explaining the differences seen in oligomeric state vs. dinucleotide preference of the 6PGDH family. We determined the cofactor preference of Go6PGDH at different 6-phosphogluconate concentrations, characterizing the wild-type enzyme and three-point mutants of residues in the cofactor binding site of Go6PGDH. Results: The structural comparison suggests that the 6PG binding site initially evolved by exchanging C-terminal α-helices between subunits. An internal duplication event changed the quaternary structure of the enzyme from a tetrameric to a dimeric arrangement. The phylogenetic analysis suggests that 6PGDHs have spread from Bacteria to Archaea and Eukarya on multiple occasions by lateral gene transfer. Sequence motifs consistent with NAD+- and NADP+-specificity are found in the β2-α2 loop of dimeric and tetrameric 6PGDHs. Site-directed mutagenesis of Go6PGDH inspired by this analysis fully reverses dinucleotide preference. One of the mutants we engineered has the highest efficiency and specificity for NAD+ so far described for a 6PGDH. Conclusions: The family 6PGDH comprises dimeric and tetrameric members whose active sites are conformed by a C-terminal α-helix contributed from adjacent subunits. 
Dimeric 6PGDHs have evolved from the duplication-fusion of the tetrameric C-terminal domain before independent transitions of cofactor specificity. Changes in the conserved β2-α2 loop are crucial to modulate the cofactor specificity in Go6PGDH.
7
Hussain A, Shahbaz M, Tariq M, Ibrahim M, Hong X, Naeem F, Khalid Z, Raza HMZ, Bo Z, Bin L. Genome re-seqeunce and analysis of Burkholderia glumae strain AU6208 and evidence of toxoflavin: A potential bacterial toxin. Comput Biol Chem 2020; 86:107245. [PMID: 32172200 DOI: 10.1016/j.compbiolchem.2020.107245] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/02/2019] [Revised: 03/01/2020] [Accepted: 03/03/2020] [Indexed: 12/29/2022]
Abstract
Burkholderia glumae, the primary causative agent of bacterial panicle blight in rice, has been reported as an opportunistic pathogen in patients with chronic infections. This study aimed to re-sequence the clinical isolate B. glumae strain AU6208 and comparatively analyze its genome using B. glumae strain BGR1 from rice as the reference. Re-sequencing revealed that the genome of strain AU6208 comprised 96 contigs corresponding to a 6.1 Mbp genome, with 5322 coding sequences and 68.2 % GC content; this is much larger than the genome previously sequenced by us and described by Seo et al (2015), which was reported to be 4.1 Mbp comprising >1200 contigs, 4361 coding sequences, and 67.31 % GC content. Moreover, this updated genome shares >80 % identity with the 7.2 Mbp genome of BGR1, which encodes 6491 coding sequences and has 68.3 % GC content. Further computational analysis revealed that the strain AU6208 encodes several bacteriocin biosynthesis genes, antibiotic, as well as virulent genes such as toxoflavin genes, which included 425 specialty genes and 12 toxoflavin genes. Upon further characterization, 12 toxoflavins (ToxA, B, C, D, E, F, G, H, I, J, TofI, and TofR) were found in AU6208 with 70-100 % sequence, family, and domain similarity with those of BGR1. Upon comparison with BGR1, the structural characterizations of selected toxoflavin genes (ToxB, ToxC, ToxG, H, and TofI) revealed variations in 2D and 3D structures such as differences in α-helices, β-sheets, loops, physiological properties of proteins, RMSD values, etc. These variations may play a significant role in different modes of action in different hosts, thereby indicating that in addition to their respective hosts, toxoflavins could also contribute to exploiting other hosts across the kingdom. 
In addition to understanding the epidemiology of strain AU6208, this updated genomics data will also unfold the pathogenicity of bacteria in diversity of various hosts and anti-virulence.
Affiliation(s)
- Annam Hussain
- State Key Laboratory of Rice Biology and Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou, China; Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Maham Shahbaz
- Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Maria Tariq
- Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Muhammad Ibrahim
- Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Xianxian Hong
- State Key Laboratory of Rice Biology and Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou, China
- Faryal Naeem
- Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Zunera Khalid
- Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Hafiz Muhammad Zeeshan Raza
- Genomics and Computational Biology Laboratory, Department of Biosciences, COMSATS University Islamabad, Sahiwal Campus, Pakistan
- Zhu Bo
- Key Laboratory of Urban Agriculture by Ministry of Agriculture of China, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai, China
- Li Bin
- State Key Laboratory of Rice Biology and Key Lab of Molecular Biology of Crop Pathogens and Insects, Institute of Biotechnology, Zhejiang University, Hangzhou, China
8
Joseph JA, Chakraborty D, Wales DJ. Energy Landscape for Fold-Switching in Regulatory Protein RfaH. J Chem Theory Comput 2018; 15:731-742. [DOI: 10.1021/acs.jctc.8b00912] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 02/06/2023]
Affiliation(s)
- Jerelle A. Joseph
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Debayan Chakraborty
- Department of Chemistry, University of Texas at Austin, Austin, Texas 78712, United States
- David J. Wales
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
9
Lessons from making the Structural Classification of Proteins (SCOP) and their implications for protein structure modelling. Biochem Soc Trans 2017; 44:937-43. [PMID: 27284063 PMCID: PMC5011417 DOI: 10.1042/bst20160053] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 02/03/2016] [Indexed: 12/04/2022]
Abstract
The Structural Classification of Proteins (SCOP) database has facilitated the development of many tools and algorithms and it has been successfully used in protein structure prediction and large-scale genome annotations. During the development of SCOP, numerous exceptions were found to topological rules, along with complex evolutionary scenarios and peculiarities in proteins including the ability to fold into alternative structures. This article reviews cases of structural variations observed for individual proteins and among groups of homologues, knowledge of which is essential for protein structure modelling.
10
Dybas JM, Fiser A. Development of a motif-based topology-independent structure comparison method to identify evolutionarily related folds. Proteins 2016; 84:1859-1874. [PMID: 27671894 PMCID: PMC5118133 DOI: 10.1002/prot.25169] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/31/2016] [Revised: 08/17/2016] [Accepted: 08/25/2016] [Indexed: 11/09/2022]
Abstract
Structure conservation, functional similarities, and homologous relationships that exist across diverse protein topologies suggest that some regions of the protein fold universe are continuous. However, the current structure classification systems are based on hierarchical organizations, which cannot accommodate structural relationships that span fold definitions. Here, we describe a novel, super-secondary-structure motif-based, topology-independent structure comparison method (SmotifCOMP) that is able to quantitatively identify structural relationships between disparate topologies. The basis of SmotifCOMP is a systematically defined super-secondary-structure motif library whose representative geometries are shown to be saturated in the Protein Data Bank and exhibit a unique distribution within the known folds. SmotifCOMP offers a robust and quantitative technique to compare domains that adopt different topologies since the method does not rely on a global superposition. SmotifCOMP is used to perform an exhaustive comparison of the known folds and the identified relationships are used to produce a nonhierarchical representation of the fold space that reflects the notion of a continuous and connected fold universe. The current work offers insight into previously hypothesized evolutionary relationships between disparate folds and provides a resource for exploring novel ones. Proteins 2016; 84:1859-1874. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Joseph M. Dybas
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue Bronx, NY 10461, USA
- Department of Biochemistry, Albert Einstein College of Medicine, 1300 Morris Park Avenue Bronx, NY 10461, USA
- Andras Fiser
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue Bronx, NY 10461, USA
- Department of Biochemistry, Albert Einstein College of Medicine, 1300 Morris Park Avenue Bronx, NY 10461, USA
11
An atypical segment swap in the DN and DC domains of the Acr_tran family resistance-nodulation-cell division pump. J Struct Biol 2016; 196:358-363. [PMID: 27542537 DOI: 10.1016/j.jsb.2016.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2016] [Revised: 08/12/2016] [Accepted: 08/16/2016] [Indexed: 11/23/2022]
Abstract
Domain/segment swapping is an exchange of equivalent secondary structure element(s) among two or more protein domains resulting in the reconstitution of the original fold while simultaneously causing oligomerization. Here we report an example of the outer membrane factor docking region of the Acr_tran family (PF00873) resistance-nodulation-cell division pump, in which a swapped, misfolded state, of the ferredoxin-like fold of the DN and DC domains, effectuates oligomerization. The atypical segment swap and the associated displacement of a region of the ferredoxin-like fold leads to a topology that is distinct from the original fold. To our knowledge, such segment swaps and associated fold change are rare. This exemplifies the role of functional constraints including oligomerization that determine the interplay between sequence and the three-dimensional structure of proteins.
12
Schaeffer RD, Kinch LN, Liao Y, Grishin NV. Classification of proteins with shared motifs and internal repeats in the ECOD database. Protein Sci 2016; 25:1188-203. [PMID: 26833690 DOI: 10.1002/pro.2893] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 12/09/2015] [Revised: 01/23/2016] [Accepted: 01/27/2016] [Indexed: 12/19/2022]
Abstract
Proteins and their domains evolve by a set of events commonly including the duplication and divergence of small motifs. The presence of short repetitive regions in domains has generally constituted a difficult case for structural domain classifications and their hierarchies. We developed the Evolutionary Classification Of protein Domains (ECOD) in part to implement a new schema for the classification of these types of proteins. Here we document the ways in which ECOD classifies proteins with small internal repeats, widespread functional motifs, and assemblies of small domain-like fragments in its evolutionary schema. We illustrate the ways in which the structural genomics project impacted the classification and characterization of new structural domains and sequence families over the decade.
Affiliation(s)
- R Dustin Schaeffer
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
- Lisa N Kinch
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
- Yuxing Liao
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
- Nick V Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390-9050
13
Cahn JKB, Brinkmann-Chen S, Buller AR, Arnold FH. Artificial domain duplication replicates evolutionary history of ketol-acid reductoisomerases. Protein Sci 2015; 25:1241-8. [PMID: 26644020 DOI: 10.1002/pro.2852] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/19/2015] [Accepted: 12/01/2015] [Indexed: 11/11/2022]
Abstract
The duplication of protein structural domains has been proposed as a common mechanism for the generation of new protein folds. A particularly interesting case is the class II ketol-acid reductoisomerase (KARI), which putatively arose from an ancestral class I KARI by duplication of the C-terminal domain and corresponding loss of obligate dimerization. As a result, the class II enzymes acquired a deeply embedded figure-of-eight knot. To test this evolutionary hypothesis we constructed a novel class II KARI by duplicating the C-terminal domain of a hyperthermostable class I KARI. The new protein is monomeric, as confirmed by gel filtration and X-ray crystallography, and has the deeply knotted class II KARI fold. Surprisingly, its catalytic activity is nearly unchanged from the parent KARI. This provides strong evidence in support of domain duplication as the mechanism for the evolution of the class II KARI fold and demonstrates the ability of domain duplication to generate topological novelty in a function-neutral manner.
Affiliation(s)
- Jackson K B Cahn
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, 91125
- Sabine Brinkmann-Chen
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, 91125
- Andrew R Buller
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, 91125
- Frances H Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California, 91125
14
Alva V, Söding J, Lupas AN. A vocabulary of ancient peptides at the origin of folded proteins. eLife 2015; 4:e09410. [PMID: 26653858 PMCID: PMC4739770 DOI: 10.7554/elife.09410] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Received: 06/13/2015] [Accepted: 12/13/2015] [Indexed: 01/01/2023]
Abstract
The seemingly limitless diversity of proteins in nature arose from only a few thousand domain prototypes, but the origin of these themselves has remained unclear. We are pursuing the hypothesis that they arose by fusion and accretion from an ancestral set of peptides active as co-factors in RNA-dependent replication and catalysis. Should this be true, contemporary domains may still contain vestiges of such peptides, which could be reconstructed by a comparative approach in the same way in which ancient vocabularies have been reconstructed by the comparative study of modern languages. To test this, we compared domains representative of known folds and identified 40 fragments whose similarity is indicative of common descent, yet which occur in domains currently not thought to be homologous. These fragments are widespread in the most ancient folds and enriched for iron-sulfur- and nucleic acid-binding. We propose that they represent the observable remnants of a primordial RNA-peptide world.
Affiliation(s)
- Vikram Alva
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Johannes Söding
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Andrei N Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
15
Edwards H, Deane CM. Structural Bridges through Fold Space. PLoS Comput Biol 2015; 11:e1004466. [PMID: 26372166 PMCID: PMC4570669 DOI: 10.1371/journal.pcbi.1004466] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/20/2015] [Accepted: 07/12/2015] [Indexed: 12/05/2022]
Abstract
Several protein structure classification schemes exist that partition the protein universe into structural units called folds. Yet these schemes do not discuss how these units sit relative to each other in a global structure space. In this paper we construct networks that describe such global relationships between folds in the form of structural bridges. We generate these networks using four different structural alignment methods across multiple score thresholds. The networks constructed using the different methods remain a similar distance apart regardless of the probability threshold defining a structural bridge. This suggests that at least some structural bridges are method specific and that any attempt to build a picture of structural space should not be reliant on a single structural superposition method. Despite these differences all representations agree on an organisation of fold space into five principal community structures: all-α, all-β sandwiches, all-β barrels, α/β and α + β. We project estimated fold ages onto the networks and find that not only are the pairings of unconnected folds associated with higher age differences than bridged folds, but this difference increases with the number of networks displaying an edge. We also examine different centrality measures for folds within the networks and how these relate to fold age. While these measures interpret the central core of fold space in varied ways they all identify the disposition of ancestral folds to fall within this core and that of the more recently evolved structures to provide the peripheral landscape. These findings suggest that evolutionary information is encoded along these structural bridges. Finally, we identify four highly central pivotal folds representing dominant topological features which act as key attractors within our landscapes.
Folds are considered to be the structural units which make up the protein universe. Structural classification schemes focus on the assignment and organisation of protein domains into folds. However, they do not suggest how different folds might relate to one another in a global way. We introduce the concept of bridges through fold space: significant similarities between these units. We consider four alignment methods and a dynamic approach to placing these bridges. A greater consensus between these methods cannot be achieved by simply increasing the stringency with which edges are assigned. Instead, we emphasise the importance of considering consensus maps and only report results where there is agreement across all networks. It is possible that a study of the bridges may reveal evolutionary relationships. Based on a phylogenetic analysis of structures, we find that bridges consistently fall between folds which evolved at similar times. Moreover, the landscapes all consist of a core of older folds, with younger structures more often seen at the periphery. Finally we identify four pivotal folds in the landscapes. They contain topological motifs which unite disparate regions of fold space.
Affiliation(s)
- Hannah Edwards
- Department of Statistics, University of Oxford, Oxford, United Kingdom
- Charlotte M. Deane
- Department of Statistics, University of Oxford, Oxford, United Kingdom
|
16
|
Eaton KV, Anderson WJ, Dubrava MS, Kumirov VK, Dykstra EM, Cordes MHJ. Studying protein fold evolution with hybrids of differently folded homologs. Protein Eng Des Sel 2015; 28:241-50. [PMID: 25991865 DOI: 10.1093/protein/gzv027] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/18/2015] [Accepted: 04/20/2015] [Indexed: 11/13/2022]
Abstract
To study the sequence determinants governing protein fold evolution, we generated hybrid sequences from two homologous proteins with 40% identity but different folds: Pfl 6 Cro, which has a mixed α + β structure, and Xfaso 1 Cro, which has an all α-helical structure. First, we examined eight chimeric hybrids in which the more structurally conserved N-terminal half of one protein was fused to the more structurally divergent C-terminal half of the other. None of these chimeras folded, as judged by circular dichroism spectra and thermal melts, suggesting that both halves have strong intrinsic preferences for the native global fold pattern, and/or that the interfaces between the halves are not readily interchangeable. Second, we examined 10 hybrids in which blocks of the structurally divergent C-terminal region were exchanged. These hybrids showed varying levels of thermal stability and suggested that the key residues in the Xfaso 1 C terminus specifying the all-α fold were concentrated near the end of helix 4 in Xfaso 1, which aligns to the end of strand 2 in Pfl 6. Finally, we generated hybrid substitutions for each individual residue in this critical region and measured thermal stabilities. The results suggested that R47 and V48 were the strongest factors that excluded formation of the α + β fold in the C-terminal region of Xfaso 1. In support of this idea, we found that the folding stability of one of the original eight chimeras could be rescued by back-substituting these two residues. Overall, the results show not only that the key factors for Cro fold specificity and evolution are global and multifarious, but also that some all-α Cro proteins have a C-terminal subdomain sequence within a few substitutions of switching to the α + β fold.
Affiliation(s)
- Karen V Eaton
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721-0088, USA
- William J Anderson
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721-0088, USA
- Matthew S Dubrava
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721-0088, USA
- Vlad K Kumirov
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721-0088, USA
- Emily M Dykstra
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721-0088, USA
- Matthew H J Cordes
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, AZ 85721-0088, USA
|
17
|
Sikosek T, Chan HS. Biophysics of protein evolution and evolutionary protein biophysics. J R Soc Interface 2015; 11:20140419. [PMID: 25165599 DOI: 10.1098/rsif.2014.0419] [Citation(s) in RCA: 150] [Impact Index Per Article: 16.7] [Indexed: 01/05/2023]
Abstract
The study of molecular evolution at the level of protein-coding genes often entails comparing large datasets of sequences to infer their evolutionary relationships. Despite the importance of a protein's structure and conformational dynamics to its function and thus its fitness, common phylogenetic methods embody minimal biophysical knowledge of proteins. To underscore the biophysical constraints on natural selection, we survey effects of protein mutations, highlighting the physical basis for marginal stability of natural globular proteins and how requirement for kinetic stability and avoidance of misfolding and misinteractions might have affected protein evolution. The biophysical underpinnings of these effects have been addressed by models with an explicit coarse-grained spatial representation of the polypeptide chain. Sequence-structure mappings based on such models are powerful conceptual tools that rationalize mutational robustness, evolvability, epistasis, promiscuous function performed by 'hidden' conformational states, resolution of adaptive conflicts and conformational switches in the evolution from one protein fold to another. Recently, protein biophysics has been applied to derive more accurate evolutionary accounts of sequence data. Methods have also been developed to exploit sequence-based evolutionary information to predict biophysical behaviours of proteins. The success of these approaches demonstrates a deep synergy between the fields of protein biophysics and protein evolution.
Affiliation(s)
- Tobias Sikosek
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
- Hue Sun Chan
- Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8; Department of Physics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
|
18
|
Andreeva A, Howorth D, Chothia C, Kulesha E, Murzin AG. Investigating Protein Structure and Evolution with SCOP2. Curr Protoc Bioinformatics 2015; 49:1.26.1-1.26.21. [PMID: 25754991 DOI: 10.1002/0471250953.bi0126s49] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
Abstract
SCOP2 is a successor to the Structural Classification of Proteins (SCOP) database that organizes proteins of known structure according to their structural and evolutionary relationships. It was designed to provide a more advanced framework for the classification of proteins. The SCOP2 classification is described in terms of a directed acyclic graph in which each node defines a relationship of particular type that is represented by a region of protein structure and sequence. The SCOP2 data are accessible via SCOP2-Browser and SCOP2-Graph. This protocol unit describes different ways to explore and investigate the SCOP2 evolutionary and structural groupings.
Affiliation(s)
- Antonina Andreeva
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, United Kingdom
- Dave Howorth
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, United Kingdom
- Cyrus Chothia
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, United Kingdom
- Eugene Kulesha
- European Bioinformatics Institute, Hinxton, Cambridge, United Kingdom
- Alexey G Murzin
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, United Kingdom
|
19
|
Lipinski CA, Litterman NK, Southan C, Williams AJ, Clark AM, Ekins S. Parallel worlds of public and commercial bioactive chemistry data. J Med Chem 2014; 58:2068-76. [PMID: 25415348 PMCID: PMC4360371 DOI: 10.1021/jm5011308] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 12/27/2022]
Abstract
The availability of structures and linked bioactivity data in databases is powerfully enabling for drug discovery and chemical biology. However, we now review some confounding issues with the divergent expansions of public and commercial sources of chemical structures. These are associated with not only expanding patent extraction but also increasingly large vendor collections amassed via different selection criteria between SciFinder from Chemical Abstracts Service (CAS) and major public sources such as PubChem, ChemSpider, UniChem, and others. These increasingly massive collections may include both real and virtual compounds, as well as so-called prophetic compounds from patents. We address a range of issues raised by the challenges faced resolving the NIH probe compounds. In addition we highlight the confounding of prior-art searching by virtual compounds that could impact the composition of matter patentability of a new medicinal chemistry lead. Finally, we propose some potential solutions.
Affiliation(s)
- Christopher A Lipinski
- Christopher A. Lipinski, Ph.D., LLC, 10 Connshire Drive, Waterford, Connecticut 06385-4122, United States
|
20
|
Porto WF, Fensterseifer GM, Franco OL. In silico identification, structural characterization, and phylogenetic analysis of MdesDEF-2: a novel defensin from the Hessian fly, Mayetiola destructor. J Mol Model 2014; 20:2339. [DOI: 10.1007/s00894-014-2339-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/29/2013] [Accepted: 06/08/2014] [Indexed: 10/25/2022]
|
21
|
Neuman BW, Chamberlain P, Bowden F, Joseph J. Atlas of coronavirus replicase structure. Virus Res 2013; 194:49-66. [PMID: 24355834 PMCID: PMC7114488 DOI: 10.1016/j.virusres.2013.12.004] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 11/01/2013] [Revised: 12/03/2013] [Accepted: 12/05/2013] [Indexed: 12/13/2022]
Abstract
Complete and up to date coverage of replicase protein structures for SARS-CoV. Discusses SARS-CoV structure in the context of other coronavirus structures. Summarizes data from a variety of structural methods to illuminate protein function. Uses models and predictions to fill gaps in the SARS-CoV structure. Discusses the high percentage of novel protein folds among SARS-CoV proteins.
The international response to SARS-CoV has produced an outstanding number of protein structures in a very short time. This review summarizes the findings of functional and structural studies including those derived from cryoelectron microscopy, small angle X-ray scattering, NMR spectroscopy, and X-ray crystallography, and incorporates bioinformatics predictions where no structural data is available. Structures that shed light on the function and biological roles of the proteins in viral replication and pathogenesis are highlighted. The high percentage of novel protein folds identified among SARS-CoV proteins is discussed.
Affiliation(s)
- Fern Bowden
- School of Biological Sciences, University of Reading, Reading, UK
|
22
|
Andreeva A, Howorth D, Chothia C, Kulesha E, Murzin AG. SCOP2 prototype: a new approach to protein structure mining. Nucleic Acids Res 2013; 42:D310-4. [PMID: 24293656 PMCID: PMC3964979 DOI: 10.1093/nar/gkt1242] [Citation(s) in RCA: 198] [Impact Index Per Article: 18.0] [Indexed: 01/06/2023]
Abstract
We present a prototype of a new structural classification of proteins, SCOP2 (http://scop2.mrc-lmb.cam.ac.uk/), that we have developed recently. SCOP2 is a successor to the Structural Classification of Proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. Similarly to SCOP, the main focus of SCOP2 is to organize structurally characterized proteins according to their structural and evolutionary relationships. SCOP2 was designed to provide a more advanced framework for protein structure annotation and classification. It defines a new approach to the classification of proteins that is essentially different from SCOP, but retains its best features. The SCOP2 classification is described in terms of a directed acyclic graph in which nodes form a complex network of many-to-many relationships and are represented by a region of protein structure and sequence. The new classification project is expected to ensure new advances in the field and open new areas of research.
Affiliation(s)
- Antonina Andreeva
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK and European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
|
23
|
Kopec KO, Lupas AN. β-Propeller blades as ancestral peptides in protein evolution. PLoS One 2013; 8:e77074. [PMID: 24143202 PMCID: PMC3797127 DOI: 10.1371/journal.pone.0077074] [Citation(s) in RCA: 67] [Impact Index Per Article: 6.1] [Received: 06/25/2013] [Accepted: 09/05/2013] [Indexed: 12/04/2022]
Abstract
Proteins of the β-propeller fold are ubiquitous in nature and widely used as structural scaffolds for ligand binding and enzymatic activity. This fold comprises between four and twelve four-stranded β-meanders, the so-called blades, which are arranged circularly around a central funnel-shaped pore. Despite the large size range of β-propellers, their blades frequently show sequence similarity indicative of a common ancestry and it has been proposed that the majority of β-propellers arose divergently by amplification and diversification of an ancestral blade. Given the structural versatility of β-propellers and the hypothesis that the first folded proteins evolved from a simpler set of peptides, we investigated whether this blade may have given rise to other folds as well. Using sequence comparisons, we identified proteins of four other folds as potential homologs of β-propellers: the luminal domain of inositol-requiring enzyme 1 (IRE1-LD), type II β-prisms, β-pinwheels, and WW domains. Because, with increasing evolutionary distance and decreasing sequence length, the statistical significance of sequence comparisons becomes progressively harder to distinguish from the background of convergent similarities, we complemented our analyses with a new method that evaluates possible homology based on the correlation between sequence and structure similarity. Our results indicate a homologous relationship of IRE1-LD and type II β-prisms with β-propellers, and an analogous one for β-pinwheels and WW domains. Whereas IRE1-LD most likely originated by fold-changing mutations from a fully formed PQQ motif β-propeller, type II β-prisms originated by amplification and differentiation of a single blade, possibly also of the PQQ type. We conclude that both β-propellers and type II β-prisms arose by independent amplification of a blade-sized fragment, which represents a remnant of an ancient peptide world.
Affiliation(s)
- Klaus O. Kopec
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen, Baden-Württemberg, Germany
- Andrei N. Lupas
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen, Baden-Württemberg, Germany
|
24
|
Protein-protein interactions as a strategy towards protein-specific drug design: the example of ataxin-1. PLoS One 2013; 8:e76456. [PMID: 24155902 PMCID: PMC3796545 DOI: 10.1371/journal.pone.0076456] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/28/2013] [Accepted: 08/26/2013] [Indexed: 11/20/2022]
Abstract
A main challenge for structural biologists is to understand the mechanisms that discriminate between molecular interactions and determine function. Here, we show how partner recognition of the AXH domain of the transcriptional co-regulator ataxin-1 is fine-tuned by a subtle balance between self- and hetero-associations. Ataxin-1 is the protein responsible for the hereditary spinocerebellar ataxia type 1, a disease linked to protein aggregation and transcriptional dysregulation. Expansion of a polyglutamine tract is essential for ataxin-1 aggregation, but the sequence-wise distant AXH domain plays an important aggravating role in the process. The AXH domain is also a key element for non-aberrant function as it intervenes in interactions with multiple protein partners. Previous data have shown that AXH is dimeric in solution and forms a dimer of dimers when crystallized. By solving the structure of a complex of AXH with a peptide from the interacting transcriptional repressor CIC, we show that the dimer interface of AXH is displaced by the new interaction and that, when blocked by the CIC peptide, AXH aggregation and misfolding are impaired. This is a unique example in which palindromic self- and hetero-interactions within a sequence with chameleon properties discriminate the partner. We propose a drug design strategy for the treatment of SCA1 that is based on the information gained from the AXH/CIC complex.
|
25
|
Mahajan S, Agarwal G, Iftekhar M, Offmann B, de Brevern AG, Srinivasan N. DoSA: Database of Structural Alignments. Database: The Journal of Biological Databases and Curation 2013; 2013:bat048. [PMID: 23846594 PMCID: PMC3708618 DOI: 10.1093/database/bat048] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/02/2022]
Abstract
Protein structure alignment is a crucial step in protein structure–function analysis. Despite the advances in protein structure alignment algorithms, some of the local conformationally similar regions are mislabeled as structurally variable regions (SVRs). These regions are not well superimposed because of differences in their spatial orientations. The Database of Structural Alignments (DoSA) addresses this gap in identification of local structural similarities obscured in global protein structural alignments by realigning SVRs using an algorithm based on protein blocks. A set of protein blocks is a structural alphabet that abstracts protein structures into 16 unique local structural motifs. DoSA provides unique information about 159 780 conformationally similar and 56 140 conformationally dissimilar SVRs in 74 705 pairwise structural alignments of homologous proteins. The information provided on conformationally similar and dissimilar SVRs can be helpful to model loop regions. It is also conceivable that conformationally similar SVRs with conserved residues could potentially contribute toward functional integrity of homologues, and hence identifying such SVRs could be helpful in understanding the structural basis of protein function. Database URL: http://bo-protscience.fr/dosa/
Affiliation(s)
- Swapnil Mahajan
- Dynamique des Structures et Interactions des Macromolécules Biologiques, UMR-S INSERM S665, Faculté des Sciences et Technologies, Université de La Réunion, F-97715 Saint Denis Messag Cedex 09, La Réunion, France
|
26
|
|
27
|
Bukhari SA, Caetano-Anollés G. Origin and evolution of protein fold designs inferred from phylogenomic analysis of CATH domain structures in proteomes. PLoS Comput Biol 2013; 9:e1003009. [PMID: 23555236 PMCID: PMC3610613 DOI: 10.1371/journal.pcbi.1003009] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Received: 10/25/2012] [Accepted: 02/13/2013] [Indexed: 12/22/2022]
Abstract
The spatial arrangements of secondary structures in proteins, irrespective of their connectivity, depict the overall shape and organization of protein domains. These features have been used in the CATH and SCOP classifications to hierarchically partition fold space and define the architectural make up of proteins. Here we use phylogenomic methods and a census of CATH structures in hundreds of genomes to study the origin and diversification of protein architectures (A) and their associated topologies (T) and superfamilies (H). Phylogenies that describe the evolution of domain structures and proteomes were reconstructed from the structural census and used to generate timelines of domain discovery. Phylogenies of CATH domains at T and H levels of structural abstraction and associated chronologies revealed patterns of reductive evolution, the early rise of Archaea, three epochs in the evolution of the protein world, and patterns of structural sharing between superkingdoms. Phylogenies of proteomes confirmed the early appearance of Archaea. While these findings are in agreement with previous phylogenomic studies based on the SCOP classification, phylogenies unveiled sharing patterns between Archaea and Eukarya that are recent and can explain the canonical bacterial rooting typically recovered from sequence analysis. Phylogenies of CATH domains at A level uncovered general patterns of architectural origin and diversification. The tree of A structures showed that ancient structural designs such as the 3-layer (αβα) sandwich (3.40) or the orthogonal bundle (1.10) are comparatively simpler in their makeup and are involved in basic cellular functions. In contrast, modern structural designs such as prisms, propellers, 2-solenoid, super-roll, clam, trefoil and box are not widely distributed and were probably adopted to perform specialized functions. Our timelines therefore uncover a universal tendency towards protein structural complexity that is remarkable.
Affiliation(s)
- Syed Abbas Bukhari
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
- Gustavo Caetano-Anollés
- Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, United States of America
|
28
|
de Chiara C, Rees M, Menon RP, Pauwels K, Lawrence C, Konarev PV, Svergun DI, Martin SR, Chen YW, Pastore A. Self-assembly and conformational heterogeneity of the AXH domain of ataxin-1: an unusual example of a chameleon fold. Biophys J 2013; 104:1304-13. [PMID: 23528090 DOI: 10.1016/j.bpj.2013.01.048] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 11/21/2012] [Revised: 01/20/2013] [Accepted: 01/28/2013] [Indexed: 10/27/2022]
Abstract
Ataxin-1 is a human protein responsible for spinocerebellar ataxia type 1, a hereditary disease associated with protein aggregation and misfolding. Essential for ataxin-1 aggregation is the anomalous expansion of a polyglutamine tract near the protein N-terminus, but the sequence-wise distant AXH domain modulates and contributes to the process. The AXH domain is also involved in the nonpathologic functions of the protein, including a variety of intermolecular interactions with other cellular partners. The domain forms a globular dimer in solution and displays a dimer of dimers arrangement in the crystal asymmetric unit. Here, we have characterized the domain further by studying its behavior in the crystal and in solution. We solved two new structures of the domain crystallized under different conditions that confirm an inherent plasticity of the AXH fold. In solution, the domain is present as a complex equilibrium mixture of monomeric, dimeric, and higher molecular weight species. This behavior, together with the tendency of the AXH fold to be trapped in local conformations, and the multiplicity of protomer interfaces, makes the AXH domain an unusual example of a chameleon protein whose properties bear potential relevance for the aggregation properties of ataxin-1 and thus for disease.
Affiliation(s)
- Cesira de Chiara
- Medical Research Council National Institute for Medical Research London, United Kingdom
|
29
|
Kuhns MS, Badgandi HB. Piecing together the family portrait of TCR-CD3 complexes. Immunol Rev 2013; 250:120-43. [PMID: 23046126 DOI: 10.1111/imr.12000] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Indexed: 12/31/2022]
Abstract
The pre-T-cell receptor (TCR)-, αβTCR-, and γδTCR-CD3 complexes are members of a family of modular biosensors that are responsible for driving T-cell development, activation, and effector functions. They inform essential checkpoint decisions by relaying key information from their ligand-binding modules (TCRs) to their signaling modules (CD3γε + CD3δε and CD3ζζ) and on to the intracellular signaling apparatus. Their actions shape the T-cell repertoire, as well as T-cell-mediated immunity; yet, the mechanisms that underlie their activity remain an enigma. As with any molecular machine, understanding how they function depends upon understanding how their parts fit and work together. In the 30 years since the initial biochemical and genetic characterizations of the αβTCR, the structure and function of the individual components of these family members have been extensively characterized. Cumulatively, this information has allowed us to piece together a portrait of the αβTCR-CD3 complex and outline the form of the remaining family members. Here we review the known structural and functional characteristics of the components of these TCR-CD3 complex family members. We then discuss how these data have informed our understanding of the architecture of the αβTCR-CD3 complex as well as their implications for the other family members. The intent is to provide a framework for considering: (i) how these thematically similar complexes diverge to execute their specific functions and (ii) how our knowledge of the form and function of these distinct family members can cross-inform our understanding of the other family members.
Affiliation(s)
- Michael S Kuhns
- Department of Immunobiology, The University of Arizona College of Medicine, Tucson, USA.
|
30
|
Abstract
In the early 1930s, phenylketonuria was among the first metabolic diseases to be defined. In the following years, multiple attempts to correlate genotype and phenotype in several inherited metabolic diseases, including phenylketonuria, encountered difficulties. It is becoming evident that the phenotype of metabolic disorders is often more multifaceted than expected from the disruption of a specific enzyme function caused by a single-gene disorder. Undoubtedly, revealing the factors contributing to the discrepancy between the loss of a single enzymatic function and the wide spectrum of clinical consequences would allow clinicians to optimize treatment for their patients. This article discusses several possible contributors to the unique, complex phenotypes observed in inherited metabolic disorders, using argininosuccinic aciduria as a disease model. Genet Med 2013;15(4):251-257.
Affiliation(s)
- Ayelet Erez
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
|
31
|
Bhattacharyya M, Upadhyay R, Vishveshwara S. Interaction signatures stabilizing the NAD(P)-binding Rossmann fold: a structure network approach. PLoS One 2012; 7:e51676. [PMID: 23284738 PMCID: PMC3524241 DOI: 10.1371/journal.pone.0051676] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 04/10/2012] [Accepted: 11/05/2012] [Indexed: 11/19/2022]
Abstract
The fidelity of the folding pathways being encoded in the amino acid sequence is met with challenge in instances where proteins with no sequence homology, performing different functions and no apparent evolutionary linkage, adopt a similar fold. The problem stated otherwise is that a limited fold space is available to a repertoire of diverse sequences. The key question is what factors lead to the formation of a fold from diverse sequences. Here, with the NAD(P)-binding Rossmann fold domains as a case study and using the concepts of network theory, we have unveiled the consensus structural features that drive the formation of this fold. We have proposed a graph theoretic formalism to capture the structural details in terms of the conserved atomic interactions in global milieu, and hence extract the essential topological features from diverse sequences. A unified mathematical representation of the different structures together with a judicious concoction of several network parameters enabled us to probe into the structural features driving the adoption of the NAD(P)-binding Rossmann fold. The atomic interactions at key positions seem to be better conserved in proteins, as compared to the residues participating in these interactions. We propose a "spatial motif" and several "fold specific hot spots" that form the signature structural blueprints of the NAD(P)-binding Rossmann fold domain. Excellent agreement of our data with previous experimental and theoretical studies validates the robustness and validity of the approach. Additionally, comparison of our results with statistical coupling analysis (SCA) provides further support. The methodology proposed here is general and can be applied to similar problems of interest.
Affiliation(s)
- Roopali Upadhyay
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
|
32
|
Burmann BM, Knauer SH, Sevostyanova A, Schweimer K, Mooney RA, Landick R, Artsimovitch I, Rösch P. An α helix to β barrel domain switch transforms the transcription factor RfaH into a translation factor. Cell 2012; 150:291-303. [PMID: 22817892 DOI: 10.1016/j.cell.2012.05.042] [Citation(s) in RCA: 162] [Impact Index Per Article: 13.5] [Received: 11/21/2011] [Revised: 03/28/2012] [Accepted: 05/07/2012] [Indexed: 12/24/2022]
Abstract
NusG homologs regulate transcription and coupled processes in all living organisms. The Escherichia coli (E. coli) two-domain paralogs NusG and RfaH have conformationally identical N-terminal domains (NTDs) but dramatically different carboxy-terminal domains (CTDs), a β barrel in NusG and an α hairpin in RfaH. Both NTDs interact with elongating RNA polymerase (RNAP) to reduce pausing. In NusG, NTD and CTD are completely independent, and NusG-CTD interacts with termination factor Rho or ribosomal protein S10. In contrast, RfaH-CTD makes extensive contacts with RfaH-NTD to mask an RNAP-binding site therein. Upon RfaH interaction with its DNA target, the operon polarity suppressor (ops) DNA, RfaH-CTD is released, allowing RfaH-NTD to bind to RNAP. Here, we show that the released RfaH-CTD completely refolds from an all-α to an all-β conformation identical to that of NusG-CTD. As a consequence, RfaH-CTD binding to S10 is enabled and translation of RfaH-controlled operons is strongly potentiated.
Affiliation(s)
- Björn M Burmann
- Lehrstuhl Biopolymere und Forschungszentrum für Bio-Makromoleküle, Universität Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany
|
33
|
Abstract
The wealth of available protein structural data provides an unprecedented opportunity to study and better understand the underlying principles of protein folding and protein structure evolution. A key to achieving this lies in the ability to analyse these data and to organize them in a coherent classification scheme. Over the past years several protein classifications have been developed that aim to group proteins based on their structural relationships. Some of these classification schemes explore the concept of structural neighbourhood (structural continuum), whereas others utilize the notion of protein evolution and thus provide a discrete rather than continuum view of protein structure space. This chapter presents a strategy for classification of proteins with known three-dimensional structure. Steps in the classification process along with basic definitions are introduced. Examples illustrating some fundamental concepts of protein folding and evolution, with a special focus on the exceptions to them, are presented.
|
34
|
Suhrer SJ, Gruber M, Wiederstein M, Sippl MJ. Effective techniques for protein structure mining. Methods Mol Biol 2012; 857:33-54. [PMID: 22323216 DOI: 10.1007/978-1-61779-588-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Retrieval and characterization of protein structure relationships are instrumental in a wide range of tasks in structural biology. The classification of protein structures (COPS) is a web service that provides efficient access to structure and sequence similarities for all currently available protein structures. Here, we focus on the application of COPS to the problem of template selection in homology modeling.
Affiliation(s)
- Stefan J Suhrer
- Center of Applied Molecular Engineering, Division of Bioinformatics, University of Salzburg, Salzburg, Austria.
|
35
|
Szilágyi A, Zhang Y, Závodszky P. Intra-chain 3D segment swapping spawns the evolution of new multidomain protein architectures. J Mol Biol 2011; 415:221-35. [PMID: 22079367 DOI: 10.1016/j.jmb.2011.10.045] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 08/05/2011] [Revised: 10/07/2011] [Accepted: 10/27/2011] [Indexed: 10/15/2022]
Abstract
Multidomain proteins form in evolution through the concatenation of domains, but structural domains may comprise multiple segments of the chain. In this work, we demonstrate that new multidomain architectures can evolve by an apparent three-dimensional swap of segments between structurally similar domains within a single-chain monomer. By a comprehensive structural search of the current Protein Data Bank (PDB), we identified 32 well-defined segment-swapped proteins (SSPs) belonging to 18 structural families. Nearly 13% of all multidomain proteins in the PDB may have a segment-swapped evolutionary precursor as estimated by more permissive searching criteria. The formation of SSPs can be explained by two principal evolutionary mechanisms: (i) domain swapping and fusion (DSF) and (ii) circular permutation (CP). By large-scale comparative analyses using structural alignment and hidden Markov model methods, it was found that the majority of SSPs have evolved via the DSF mechanism, and a much smaller fraction, via CP. Functional analyses further revealed that segment swapping, which results in two linkers connecting the domains, may impart directed flexibility to multidomain proteins and contributes to the development of new functions. Thus, inter-domain segment swapping represents a novel general mechanism by which new protein folds and multidomain architectures arise in evolution, and SSPs have structural and functional properties that make them worth defining as a separate group.
Affiliation(s)
- András Szilágyi
- Institute of Enzymology, Hungarian Academy of Sciences, Karolina út 29, H-1113 Budapest, Hungary
|
36
|
Jones DD. Recombining low homology, functionally rich regions of bacterial subtilisins by combinatorial fragment exchange. PLoS One 2011; 6:e24319. [PMID: 21915310 PMCID: PMC3168465 DOI: 10.1371/journal.pone.0024319] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/16/2011] [Accepted: 08/08/2011] [Indexed: 11/19/2022] Open
Abstract
Combinatorial fragment exchange was utilised to recombine key structural and functional low homology regions of bacilli subtilisins to generate new active hybrid proteases with altered substrate profiles. Up to six different regions consisting mostly of loop residues from the commercially important subtilisin Savinase were exchanged with the structurally equivalent regions of six other subtilisins. The six additional subtilisins derive from diverse origins and included thermophilic and intracellular subtilisins as well as other academically and commercially relevant subtilisins. Savinase was largely tolerant to fragment exchange; rational replacement of all six regions with 5 of 6 donating subtilisin sequences preserved activity, albeit reduced compared to Savinase. A combinatorial approach was used to generate hybrid Savinase variants in which the sequences derived from all seven subtilisins at each region were recombined to generate new region combinations. Variants with different substrate profiles and with greater apparent activity compared to Savinase and the rational fragment exchange variants were generated, with the substrate profile exhibited by a variant depending on the sequence combination at each region.
Affiliation(s)
- D Dafydd Jones
- School of Biosciences, Cardiff University, Cardiff, United Kingdom.
|
37
|
Meng EC, Babbitt PC. Topological variation in the evolution of new reactions in functionally diverse enzyme superfamilies. Curr Opin Struct Biol 2011; 21:391-7. [PMID: 21458983 PMCID: PMC3551608 DOI: 10.1016/j.sbi.2011.03.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 02/22/2011] [Revised: 03/05/2011] [Accepted: 03/09/2011] [Indexed: 10/18/2022]
Abstract
In functionally diverse enzyme superfamilies (SFs), conserved structural and active site features reflect catalytic capabilities 'hard-wired' in each SF architecture. Overlaid on this foundation, evolutionary changes in active site machinery, structural topology and other aspects of structural organization and interactions support the emergence of new reactions, mechanisms, and substrate specificity. This review connects topological with functional variation in each of the haloalkanoic acid dehalogenase (HAD) and vicinal oxygen chelate fold (VOC) SFs and a set of redox-active thioredoxin (Trx)-fold SFs to illustrate a few of the varied themes nature has used to evolve new functions from a limited set of structural scaffolds.
Affiliation(s)
- Elaine C. Meng
- Department of Pharmaceutical Chemistry, University of California, M/S 2240, 600 16th Street, San Francisco, CA 94158-2517, USA
- Patricia C. Babbitt
- Department of Pharmaceutical Chemistry, University of California, M/S 2240, 600 16th Street, San Francisco, CA 94158-2517, USA
- Department of Bioengineering and Therapeutic Sciences, University of California, M/S 2250, 1700 4th Street, San Francisco, CA 94158-2330, USA
- California Institute for Quantitative Biosciences, University of California, San Francisco
|
38
|
Hollup SM, Sadowski MI, Jonassen I, Taylor WR. Exploring the limits of fold discrimination by structural alignment: a large scale benchmark using decoys of known fold. Comput Biol Chem 2011; 35:174-88. [PMID: 21704264 PMCID: PMC3145973 DOI: 10.1016/j.compbiolchem.2011.04.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 01/21/2011] [Accepted: 04/23/2011] [Indexed: 11/10/2022]
Abstract
Protein structure comparison by pairwise alignment is commonly used to identify highly similar substructures in pairs of proteins and provide a measure of structural similarity based on the size and geometric similarity of the match. These scores are routinely applied in analyses of protein fold space under the assumption that high statistical significance is equivalent to a meaningful relationship; however, the truth of this assumption has previously been difficult to test since there is a lack of automated methods which do not rely on the same underlying principles. As a resolution to this we present a method based on the use of topological descriptions of global protein structure, providing an independent means to assess the ability of structural alignment to maintain meaningful structural correspondences on a large scale. Using a large set of decoys of specified global fold we benchmark three widely used methods for structure comparison, SAP, TM-align and DALI, and test the degree to which this assumption is justified for these methods. Application of a topological edit distance measure to provide a scale of the degree of fold change shows that while there is a broad correlation between high structural alignment scores and low edit distances, there remain many pairs with highly significant scores which differ by core strand swaps and therefore are structurally different on a global level. Possible causes of this problem and its meaning for present assessments of protein fold space are discussed.
|
39
|
Conformational conversion and prion disease. Nat Rev Mol Cell Biol 2011; 12:273; author reply 273. [PMID: 21427768 DOI: 10.1038/nrm3007-c1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
40
|
Agarwal G, Mahajan S, Srinivasan N, de Brevern AG. Identification of local conformational similarity in structurally variable regions of homologous proteins using protein blocks. PLoS One 2011; 6:e17826. [PMID: 21445259 PMCID: PMC3060819 DOI: 10.1371/journal.pone.0017826] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 09/09/2010] [Accepted: 02/15/2011] [Indexed: 11/18/2022] Open
Abstract
Structure comparison tools can be used to align related protein structures to identify structurally conserved and variable regions and to infer functional and evolutionary relationships. While the conserved regions often superimpose well, the variable regions appear non-superimposable. Differences in homologous protein structures are thought to be due to evolutionary plasticity to accommodate diverged sequences during evolution. One kind of difference between 3-D structures of homologous proteins is rigid body displacement. A glaring example is equivalent α-helical regions of homologous proteins that superimpose poorly because their spatial orientations differ. In a rigid body superimposition, these regions would appear variable although they may contain local similarity. Also, due to high spatial deviation in the variable region, one-to-one correspondence at the residue level cannot be determined accurately. Another kind of difference is conformational variability, and the most common example is topologically equivalent loops of two homologues that adopt different conformations. In the current study, we present a refined view of the “structurally variable” regions which may contain local similarity obscured in global alignment of homologous protein structures. As a structural alphabet is able to describe local structures of proteins precisely through the Protein Blocks approach, conformational similarity has been identified in a substantial number of ‘variable’ regions in a large data set of protein structural alignments; optimal residue-residue equivalences could be achieved on the basis of Protein Blocks, which led to improved local alignments. Also, through an example, we have demonstrated how the additional information on local backbone structures through protein blocks can aid in comparative modeling of a loop region. In addition, understanding of sequence-structure relationships can be enhanced through our approach. This has been illustrated through examples where the equivalent regions in homologous protein structures share sequence similarity to varied extent but do not preserve local structure.
Affiliation(s)
- Garima Agarwal
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- Swapnil Mahajan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, UAS-GKVK Campus, Bangalore, India
- Alexandre G. de Brevern
- Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), INSERM, U665, Paris, France
- Université Paris Diderot - Paris 7, UMR-S665, Paris, France
- Institut National de la Transfusion Sanguine (INTS), Paris, France
|
41
|
Tai CH, Sam V, Gibrat JF, Garnier J, Munson PJ, Lee B. Protein domain assignment from the recurrence of locally similar structures. Proteins 2010; 79:853-66. [PMID: 21287617 DOI: 10.1002/prot.22923] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/20/2010] [Revised: 10/14/2010] [Accepted: 10/18/2010] [Indexed: 11/10/2022]
Abstract
Domains are basic units of protein structure and essential for exploring protein fold space and structure evolution. With the structural genomics initiative, the number of protein structures in the Protein Data Bank (PDB) is increasing dramatically and domain assignments need to be done automatically. Most existing structural domain assignment programs define domains using the compactness of the domains and/or the number and strength of intra-domain versus inter-domain contacts. Here we present a different approach based on the recurrence of locally similar structural pieces (LSSPs) found by one-against-all structure comparisons with a dataset of 6373 protein chains from the PDB. Residues of the query protein are clustered using LSSPs via three different procedures to define domains. This approach gives results that are comparable to several existing programs that use geometrical and other structural information explicitly. Remarkably, most of the proteins that contribute the LSSPs defining a domain do not themselves contain the domain of interest. This study shows that domains can be defined by a collection of relatively small locally similar structural pieces containing, on average, four secondary structure elements. In addition, it indicates that domains are indeed made of recurrent small structural pieces that are used to build protein structures of many different folds, as suggested by recent studies.
Affiliation(s)
- Chin-Hsien Tai
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
42
|
Arbing MA, Handelman SK, Kuzin AP, Verdon G, Wang C, Su M, Rothenbacher FP, Abashidze M, Liu M, Hurley JM, Xiao R, Acton T, Inouye M, Montelione GT, Woychik NA, Hunt JF. Crystal structures of Phd-Doc, HigA, and YeeU establish multiple evolutionary links between microbial growth-regulating toxin-antitoxin systems. Structure 2010; 18:996-1010. [PMID: 20696400 DOI: 10.1016/j.str.2010.04.018] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.4] [Received: 12/12/2009] [Revised: 03/22/2010] [Accepted: 04/21/2010] [Indexed: 10/19/2022]
Abstract
Bacterial toxin-antitoxin (TA) systems serve a variety of physiological functions including regulation of cell growth and maintenance of foreign genetic elements. Sequence analyses suggest that TA families are linked by complex evolutionary relationships reflecting likely swapping of functional domains between different TA families. Our crystal structures of Phd-Doc from bacteriophage P1, the HigA antitoxin from Escherichia coli CFT073, and YeeU of the YeeUWV systems from E. coli K12 and Shigella flexneri confirm this inference and reveal additional, unanticipated structural relationships. The growth-regulating Doc toxin exhibits structural similarity to secreted virulence factors that are toxic for eukaryotic target cells. The Phd antitoxin possesses the same fold as both the YefM and NE2111 antitoxins that inhibit structurally unrelated toxins. YeeU, which has an antitoxin-like activity that represses toxin expression, is structurally similar to the ribosome-interacting toxins YoeB and RelE. These observations suggest extensive functional exchanges have occurred between TA systems during bacterial evolution.
Affiliation(s)
- Mark A Arbing
- Department of Biological Sciences, Columbia University, 702 Fairchild Center, MC2434, New York, NY 10027, USA
|
43
|
Andreeva A, Murzin AG. Structural classification of proteins and structural genomics: new insights into protein folding and evolution. Acta Crystallogr Sect F Struct Biol Cryst Commun 2010; 66:1190-7. [PMID: 20944210 PMCID: PMC2954204 DOI: 10.1107/s1744309110007177] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 01/11/2010] [Accepted: 02/24/2010] [Indexed: 11/10/2022]
Abstract
During the past decade, the Protein Structure Initiative (PSI) centres have become major contributors of new families, superfamilies and folds to the Structural Classification of Proteins (SCOP) database. The PSI results have increased the diversity of protein structural space and accelerated our understanding of it. This review article surveys a selection of protein structures determined by the Joint Center for Structural Genomics (JCSG). It presents previously undescribed β-sheet architectures such as the double barrel and spiral β-roll and discusses new examples of unusual topologies and peculiar structural features observed in proteins characterized by the JCSG and other Structural Genomics centres.
Affiliation(s)
- Antonina Andreeva
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, England
- Alexey G. Murzin
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, England
|
44
|
Phages have adapted the same protein fold to fulfill multiple functions in virion assembly. Proc Natl Acad Sci U S A 2010; 107:14384-9. [PMID: 20660769 DOI: 10.1073/pnas.1005822107] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
Evolutionary relationships may exist among very diverse groups of proteins even though they perform different functions and display little sequence similarity. The tailed bacteriophages present a uniquely amenable system for identifying such groups because of their huge diversity yet conserved genome structures. In this work, we used structural, functional, and genomic context comparisons to conclude that the head-tail connector protein and tail tube protein of bacteriophage lambda diverged from a common ancestral protein. Further comparisons of tertiary and quaternary structures indicate that the baseplate hub and tail terminator proteins of bacteriophage may also be part of this same family. We propose that all of these proteins evolved from a single ancestral tail tube protein fold, and that gene duplication followed by differentiation led to the specialized roles of these proteins seen in bacteriophages today. Although this type of evolutionary mechanism has been proposed for other systems, our work provides an evolutionary mechanism for a group of proteins with different functions that bear no sequence similarity. Our data also indicate that the addition of a structural element at the N terminus of the lambda head-tail connector protein endows it with a distinctive protein interaction capability compared with many of its putative homologues.
|
45
|
Bryant DH, Moll M, Chen BY, Fofanov VY, Kavraki LE. Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction. BMC Bioinformatics 2010; 11:242. [PMID: 20459833 PMCID: PMC2885373 DOI: 10.1186/1471-2105-11-242] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/13/2009] [Accepted: 05/11/2010] [Indexed: 12/02/2022] Open
Abstract
Background: Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels.
Results: This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs.
Conclusions: FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated, statistically rigorous procedure for incorporating structural variation data into protein function prediction pipelines. Our work provides an unbiased, automated assessment of the structural variability of identified binding site substructures among protein structure families and a technique for exploring the relation of substructural variation to protein function. As available proteomic data continues to expand, the techniques proposed will be indispensable for the large-scale analysis and interpretation of structural data.
Affiliation(s)
- Drew H Bryant
- Department of Computer Science, Rice University, Houston, TX, USA
|
46
|
Metamorphic proteins mediate evolutionary transitions of structure. Proc Natl Acad Sci U S A 2010; 107:7287-92. [PMID: 20368465 DOI: 10.1073/pnas.0912616107] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.8] [Indexed: 01/31/2023] Open
Abstract
The primary sequence of proteins usually dictates a single tertiary and quaternary structure. However, certain proteins undergo reversible backbone rearrangements. Such metamorphic proteins provide a means of facilitating the evolution of new folds and architectures. However, because natural folds emerged at the early stages of evolution, the potential role of metamorphic intermediates in mediating evolutionary transitions of structure remains largely unexplored. We evolved a set of new proteins based on fragments of approximately 100 amino acids derived from tachylectin-2, a monomeric, 236-amino-acid, five-bladed beta-propeller. Their structures reveal a unique pentameric assembly and novel beta-propeller structures. Although identical in sequence, the oligomeric subunits adopt two, or even three, different structures that together enable the pentameric assembly of two propellers connected via a small linker. Most of the subunits adopt a wild-type-like structure within individual five-bladed propellers. However, the bridging subunits exhibit domain swaps and asymmetric strand exchanges that allow them to complete the two propellers and connect them. Thus, the modular and metamorphic nature of these subunits enabled dramatic changes in tertiary and quaternary structure, while maintaining the lectin function. These oligomers therefore comprise putative intermediates via which beta-propellers can evolve from smaller elements. Our data also suggest that the ability of one sequence to equilibrate between different structures can be evolutionarily optimized, thus facilitating the emergence of new structures.
|
47
|
Abstract
Many protein classification systems capture homologous relationships by grouping domains into families and superfamilies on the basis of sequence similarity. Superfamilies with similar 3D structures are further grouped into folds. In the absence of discernable sequence similarity, these structural similarities were long thought to have originated independently, by convergent evolution. However, the growth of databases and advances in sequence comparison methods have led to the discovery of many distant evolutionary relationships that transcend the boundaries of superfamilies and folds. To investigate the contributions of convergent versus divergent evolution in the origin of protein folds, we clustered representative domains of known structure by their sequence similarity, treating them as point masses in a virtual 2D space which attract or repel each other depending on their pairwise sequence similarities. As expected, families in the same superfamily form tight clusters. But often, superfamilies of the same fold are linked with each other, suggesting that the entire fold evolved from an ancient prototype. Strikingly, some links connect superfamilies with different folds. They arise from modular peptide fragments of between 20 and 40 residues that co-occur in the connected folds in disparate structural contexts. These may be descendants of an ancestral pool of peptide modules that evolved as cofactors in the RNA world and from which the first folded proteins arose by amplification and recombination. Our galaxy of folds summarizes, in a single image, most known and many yet undescribed homologous relationships between protein superfamilies, providing new insights into the evolution of protein domains.
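The clustering idea in this abstract (domains treated as point masses that attract or repel according to pairwise sequence similarity) can be illustrated with a small force-directed layout. The sketch below is not the authors' implementation: the function name `layout`, the spring-like attraction, the uniform `0.1 / distance` repulsion, and every parameter value are assumptions chosen only to make the behaviour visible on toy data.

```python
import math
import random


def layout(similarity, n_points, steps=500, step_size=0.05, max_move=0.1, seed=0):
    """Place n_points in 2D so that similar pairs end up close together.

    similarity maps (i, j) with i < j to a weight in [0, 1]; missing pairs
    are treated as having no detectable similarity (hypothetical convention).
    """
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(n_points)]
    for _ in range(steps):
        forces = [[0.0, 0.0] for _ in range(n_points)]
        for i in range(n_points):
            for j in range(i + 1, n_points):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                # Assumed force law: attraction scales with similarity and
                # distance; repulsion is identical for all pairs and decays
                # with distance.
                pull = similarity.get((i, j), 0.0) * dist - 0.1 / dist
                fx, fy = pull * dx / dist, pull * dy / dist
                forces[i][0] += fx
                forces[i][1] += fy
                forces[j][0] -= fx
                forces[j][1] -= fy
        for p, f in zip(pos, forces):
            mx, my = step_size * f[0], step_size * f[1]
            length = math.hypot(mx, my)
            if length > max_move:
                # Cap the per-step displacement to keep the iteration stable.
                mx, my = mx * max_move / length, my * max_move / length
            p[0] += mx
            p[1] += my
    return pos


if __name__ == "__main__":
    # Toy input: items 0 and 1 are similar, items 2 and 3 are similar.
    coords = layout({(0, 1): 1.0, (2, 3): 1.0}, 4)
    for index, (x, y) in enumerate(coords):
        print(index, round(x, 2), round(y, 2))
```

In this toy setting the two similar pairs settle into tight groups while unrelated points drift apart, which is the qualitative behaviour the abstract describes for families within a superfamily.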
Affiliation(s)
- Vikram Alva
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Tübingen 72076, Germany
|
48
|
Kuhns MS, Girvin AT, Klein LO, Chen R, Jensen KD, Newell EW, Huppa JB, Lillemeier BF, Huse M, Chien YH, Garcia KC, Davis MM. Evidence for a functional sidedness to the alphabetaTCR. Proc Natl Acad Sci U S A 2010; 107:5094-9. [PMID: 20202921 PMCID: PMC2841884 DOI: 10.1073/pnas.1000925107] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 12/21/2022] Open
Abstract
The T cell receptor (TCR) and associated CD3gammaepsilon, deltaepsilon, and zetazeta signaling dimers allow T cells to discriminate between different antigens and respond accordingly, but our knowledge of how these parts fit and work together is incomplete. In this study, we provide additional evidence that the CD3 heterodimers congregate on one side of the TCR in both the alphabeta and gammadeltaTCR-CD3 complexes. We also report that the other side of the alphabetaTCR mediates homotypic alphabetaTCR interactions and signaling. Specifically, an erythropoietin receptor-based dimerization assay was used to show that, upon complex assembly, the CD3epsilon chains of two CD3 heterodimers are arranged side-by-side in both the alphabeta and gammadeltaTCR-CD3 complexes. This system was also used to show that alphabetaTCRs can dimerize in the cell membrane and that mutating the unusual outer strands of the Calpha domain impairs this dimerization. Finally, we present data showing that, for CD4 T cells, the mutations that impair alphabetaTCR dimerization also alter ligand-induced calcium mobilization, TCR accumulation at the site of pMHC contact, and polarization toward the site of antigen contact. These data reveal a "functional-sidedness" to the alphabetaTCR constant region, with dimerization occurring on the side of the TCR opposite from where the CD3 heterodimers are located.
MESH Headings
- Animals
- Antigen-Presenting Cells/cytology
- CD3 Complex/metabolism
- Calcium Signaling
- Cell Line
- Cell Membrane/metabolism
- Cell Polarity
- Humans
- Intracellular Space/metabolism
- Mice
- Models, Molecular
- Mutation/genetics
- Protein Multimerization
- Protein Structure, Secondary
- Protein Subunits/metabolism
- Receptors, Antigen, T-Cell, alpha-beta/chemistry
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/metabolism
- Receptors, Antigen, T-Cell, gamma-delta/metabolism
- T-Lymphocytes/cytology
Affiliation(s)
- Michael S. Kuhns
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Andrew T. Girvin
- Department of Microbiology and Immunology
- Graduate Program in Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Lawrence O. Klein
- Graduate Program in Biophysics
- Stanford University School of Medicine, Stanford, CA 94305
| | - Rebecca Chen
- CCIS/ITI Summer High School Research Program
- Stanford University School of Medicine, Stanford, CA 94305
| | - Kirk D.C. Jensen
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Evan W. Newell
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Johannes B. Huppa
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Björn F. Lillemeier
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Morgan Huse
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - Yueh-hsiu Chien
- Department of Microbiology and Immunology
- Stanford University School of Medicine, Stanford, CA 94305
| | - K. Christopher Garcia
- Department of Molecular and Cellular Physiology
- Department of Structural Biology
- The Howard Hughes Medical Institute, Chevy Chase, MD 20815
- Stanford University School of Medicine, Stanford, CA 94305
| | - Mark M. Davis
- Department of Microbiology and Immunology
- The Howard Hughes Medical Institute, Chevy Chase, MD 20815
- Stanford University School of Medicine, Stanford, CA 94305
| |
|
49
|
Khayrutdinov BI, Bae WJ, Yun YM, Lee JH, Tsuyama T, Kim JJ, Hwang E, Ryu KS, Cheong HK, Cheong C, Ko JS, Enomoto T, Karplus PA, Güntert P, Tada S, Jeon YH, Cho Y. Structure of the Cdt1 C-terminal domain: conservation of the winged helix fold in replication licensing factors. Protein Sci 2010; 18:2252-64. [PMID: 19722278 DOI: 10.1002/pro.236] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 01/22/2023]
Abstract
In eukaryotic replication licensing, Cdt1 plays a key role by recruiting the MCM2-7 complex onto chromosomal origins. The C-terminal domain of mouse Cdt1 (mCdt1C), the most conserved region in Cdt1, is essential for licensing and directly interacts with the MCM2-7 complex. We have determined the structures of mCdt1CS (mCdt1C_small; residues 452 to 557) and mCdt1CL (mCdt1C_large; residues 420 to 557) using X-ray crystallography and solution NMR spectroscopy, respectively. While the N-terminal 31 residues of mCdt1CL form a flexible loop with a short helix near the middle, the rest of mCdt1C folds into a winged helix structure. Together with the middle domain of mouse Cdt1 (mCdt1M, residues 172-368), this study reveals that Cdt1 contains a tandem repeat of the winged helix domain. The winged helix fold is also conserved in other licensing factors, including archaeal ORC and Cdc6, which supports the idea that these replication initiators may have evolved from a common ancestor. Based on the structure of mCdt1C, in conjunction with biochemical analysis, we propose a binding site for the MCM complex within mCdt1C.
Affiliation(s)
- Bulat I Khayrutdinov
- The Magnetic Resonance Team, Korea Basic Science Institute, 804-1 Yangchung-Ri, Ochang, Chungbuk 363-883, South Korea
|
50
|
Xiong B, Wu J, Burk DL, Xue M, Jiang H, Shen J. BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server. BMC Bioinformatics 2010; 11:47. [PMID: 20100327 PMCID: PMC3098077 DOI: 10.1186/1471-2105-11-47] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 07/16/2009] [Accepted: 01/25/2010] [Indexed: 11/17/2022]
Abstract
Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of the sequence-structure-function relationship of genes and their products. Although many sequence- and structure-based methods have been devised to decipher this delicate relationship, large gaps remain in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of genes newly identified by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method uses a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarity is confined to a sub-pocket. This fingerprint-based similarity measurement was also validated on a known binding site dataset by comparison with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After the database search, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may help researchers design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of the hit list from the database search.
Affiliation(s)
- Bing Xiong
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Zhangjiang Hi-Tech Park, Pudong, Shanghai, 201203, PR China.
|