151
Zubieta C, Krishna SS, Kapoor M, Kozbial P, McMullan D, Axelrod HL, Miller MD, Abdubek P, Ambing E, Astakhova T, Carlton D, Chiu HJ, Clayton T, Deller MC, Duan L, Elsliger MA, Feuerhelm J, Grzechnik SK, Hale J, Hampton E, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kumar A, Marciano D, Morse AT, Nigoghossian E, Okach L, Oommachen S, Reyes R, Rife CL, Schimmel P, van den Bedem H, Weekes D, White A, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structures of two novel dye-decolorizing peroxidases reveal a beta-barrel fold with a conserved heme-binding motif. Proteins 2009; 69:223-33. [PMID: 17654545 DOI: 10.1002/prot.21550] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Indexed: 11/11/2022]
Abstract
BtDyP from Bacteroides thetaiotaomicron (strain VPI-5482) and TyrA from Shewanella oneidensis are dye-decolorizing peroxidases (DyPs), members of a new family of heme-dependent peroxidases recently identified in fungi and bacteria. Here, we report the crystal structures of BtDyP and TyrA at 1.6 and 2.7 A, respectively. BtDyP assembles into a hexamer, while TyrA assembles into a dimer; the dimerization interface is conserved between the two proteins. Each monomer exhibits a two-domain, alpha+beta ferredoxin-like fold. A site for heme binding was identified computationally, and modeling of a heme into the proposed active site allowed for identification of residues likely to be functionally important. Structural and sequence comparisons with other DyPs demonstrate a conservation of putative heme-binding residues, including an absolutely conserved histidine. Isothermal titration calorimetry experiments confirm heme binding, but with a stoichiometry of 0.3:1 (heme:protein).
152
Xu Q, Traag BA, Willemse J, McMullan D, Miller MD, Elsliger MA, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Carlton D, Chen C, Chiu HJ, Chruszcz M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Grzechnik SK, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, Minor W, Mommaas AM, Morse AT, Nigoghossian E, Nopakun A, Okach L, Oommachen S, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Wang S, Weekes D, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA, van Wezel GP. Structural and functional characterizations of SsgB, a conserved activator of developmental cell division in morphologically complex actinomycetes. J Biol Chem 2009; 284:25268-79. [PMID: 19567872 DOI: 10.1074/jbc.m109.018564] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]
Abstract
SsgA-like proteins (SALPs) are a family of homologous cell division-related proteins that occur exclusively in morphologically complex actinomycetes. We show that SsgB, a subfamily of SALPs, is the archetypal SALP that is functionally conserved in all sporulating actinomycetes. Sporulation-specific cell division of Streptomyces coelicolor ssgB mutants is restored by introduction of distant ssgB orthologues from other actinomycetes. Interestingly, the number of septa (and spores) of the complemented null mutants is dictated by the specific ssgB orthologue that is expressed. The crystal structure of the SsgB from Thermobifida fusca was determined at 2.6 A resolution and represents the first structure for this family. The structure revealed similarities to a class of eukaryotic "whirly" single-stranded DNA/RNA-binding proteins. However, the electro-negative surface of the SALPs suggests that neither SsgB nor any of the other SALPs are likely to interact with nucleotide substrates. Instead, we show that a conserved hydrophobic surface is likely to be important for SALP function and suggest that proteins are the likely binding partners.
153
Das D, Kozbial P, Axelrod HL, Miller MD, McMullan D, Krishna SS, Abdubek P, Acosta C, Astakhova T, Burra P, Carlton D, Chen C, Chiu HJ, Clayton T, Deller MC, Duan L, Elias Y, Elsliger MA, Ernst D, Farr C, Feuerhelm J, Grzechnik A, Grzechnik SK, Hale J, Han GW, Jaroszewski L, Jin KK, Johnson HA, Klock HE, Knuth MW, Kumar A, Marciano D, Morse AT, Murphy KD, Nigoghossian E, Nopakun A, Okach L, Oommachen S, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Sudek S, Tien H, Trame C, Trout CV, van den Bedem H, Weekes D, White A, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of a novel Sm-like protein of putative cyanophage origin at 2.60 A resolution. Proteins 2009; 75:296-307. [PMID: 19173316 DOI: 10.1002/prot.22360] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
Abstract
ECX21941 represents a very large family (over 600 members) of novel, ocean metagenome-specific proteins identified by clustering of the dataset from the Global Ocean Sampling expedition. The crystal structure of ECX21941 reveals unexpected similarity to Sm/LSm proteins, which are important RNA-binding proteins, despite no detectable sequence similarity. The ECX21941 protein assembles as a homopentamer in solution and in the crystal structure when expressed in Escherichia coli and represents the first pentameric structure for this Sm/LSm family of proteins, although the actual oligomeric form in vivo is currently not known. The genomic neighborhood analysis of ECX21941 and its homologs, combined with sequence similarity searches, suggests a cyanophage origin for this protein. The specific functions of members of this family are unknown, but our structure analysis of ECX21941 indicates nucleic acid-binding capabilities and suggests a role in RNA and/or DNA processing.
154
Xu Q, Carlton D, Miller MD, Elsliger MA, Krishna SS, Abdubek P, Astakhova T, Burra P, Chiu HJ, Clayton T, Deller MC, Duan L, Elias Y, Feuerhelm J, Grant JC, Grzechnik A, Grzechnik SK, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Kumar A, Marciano D, McMullan D, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, Rife CL, Sefcovic N, Trame C, Trout CV, van den Bedem H, Weekes D, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of histidine phosphotransfer protein ShpA, an essential regulator of stalk biogenesis in Caulobacter crescentus. J Mol Biol 2009; 390:686-98. [PMID: 19450606 DOI: 10.1016/j.jmb.2009.05.023] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 03/18/2009] [Revised: 05/08/2009] [Accepted: 05/13/2009] [Indexed: 11/27/2022]
Abstract
Cell-cycle-regulated stalk biogenesis in Caulobacter crescentus is controlled by a multistep phosphorelay system consisting of the hybrid histidine kinase ShkA, the histidine phosphotransfer (HPt) protein ShpA, and the response regulator TacA. ShpA shuttles phosphoryl groups between ShkA and TacA. When phosphorylated, TacA triggers a downstream transcription cascade for stalk synthesis in an RpoN-dependent manner. The crystal structure of ShpA was determined to 1.52 A resolution. ShpA belongs to a family of monomeric HPt proteins that feature a highly conserved four-helix bundle. The phosphorylatable histidine His56 is located on the surface of the helix bundle and is fully solvent exposed. One end of the four-helix bundle in ShpA is shorter compared with other characterized HPt proteins, whereas the face that potentially interacts with the response regulators is structurally conserved. Similarities of the interaction surface around the phosphorylation site suggest that ShpA is likely to share a common mechanism for molecular recognition and phosphotransfer with yeast phosphotransfer protein YPD1 despite their low overall sequence similarity.
155
Das D, Krishna SS, McMullan D, Miller MD, Xu Q, Abdubek P, Acosta C, Astakhova T, Axelrod HL, Burra P, Carlton D, Chiu HJ, Clayton T, Deller MC, Duan L, Elias Y, Elsliger MA, Ernst D, Feuerhelm J, Grzechnik A, Grzechnik SK, Hale J, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Kumar A, Marciano D, Morse AT, Murphy KD, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, Rife CL, Sefcovic N, Tien H, Trame CB, Trout CV, van den Bedem H, Weekes D, White A, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of the Fic (Filamentation induced by cAMP) family protein SO4266 (gi|24375750) from Shewanella oneidensis MR-1 at 1.6 A resolution. Proteins 2009; 75:264-71. [PMID: 19127588 DOI: 10.1002/prot.22338] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 12/20/2022]
156
An Y, Chen CY, Moyer B, Rotkiewicz P, Elsliger MA, Godzik A, Wilson IA, Balch WE. Structural and functional analysis of the globular head domain of p115 provides insight into membrane tethering. J Mol Biol 2009; 391:26-41. [PMID: 19414022 DOI: 10.1016/j.jmb.2009.04.062] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 01/13/2009] [Revised: 04/10/2009] [Accepted: 04/15/2009] [Indexed: 01/02/2023]
Abstract
Molecular tethers have a central role in the organization of the complex membrane architecture of eukaryotic cells. p115 is a ubiquitous, essential tether involved in vesicle transport and the structural organization of the exocytic pathway. We describe two crystal structures of the N-terminal domain of p115 at 2.0 A resolution. The p115 structures show a novel alpha-solenoid architecture constructed of 12 armadillo-like, tether-repeat, alpha-helical tripod motifs. We find that the H1 TR binds the Rab1 GTPase involved in endoplasmic reticulum to Golgi transport. Mutation of the H1 motif results in the dominant negative inhibition of endoplasmic reticulum to Golgi trafficking. We propose that the H1 helical tripod contributes to the assembly of Rab-dependent complexes responsible for the tether and SNARE-dependent fusion of membranes.
157
158
Cho DH, Nakamura T, Fang J, Cieplak P, Godzik A, Gu Z, Lipton SA. S-nitrosylation of Drp1 mediates beta-amyloid-related mitochondrial fission and neuronal injury. Science 2009; 324:102-5. [PMID: 19342591 PMCID: PMC2823371 DOI: 10.1126/science.1171091] [Citation(s) in RCA: 825] [Impact Index Per Article: 55.0] [Indexed: 12/22/2022]
Abstract
Mitochondria continuously undergo two opposing processes, fission and fusion. The disruption of this dynamic equilibrium may herald cell injury or death and may contribute to developmental and neurodegenerative disorders. Nitric oxide functions as a signaling molecule, but in excess it mediates neuronal injury, in part via mitochondrial fission or fragmentation. However, the underlying mechanism for nitric oxide-induced pathological fission remains unclear. We found that nitric oxide produced in response to beta-amyloid protein, thought to be a key mediator of Alzheimer's disease, triggered mitochondrial fission, synaptic loss, and neuronal damage, in part via S-nitrosylation of dynamin-related protein 1 (forming SNO-Drp1). Preventing nitrosylation of Drp1 by cysteine mutation abrogated these neurotoxic events. SNO-Drp1 is increased in brains of human Alzheimer's disease patients and may thus contribute to the pathogenesis of neurodegeneration.
159
Xu Q, Rife CL, Carlton D, Miller MD, Krishna SS, Elsliger MA, Abdubek P, Astakhova T, Chiu HJ, Clayton T, Duan L, Feuerhelm J, Grzechnik SK, Hale J, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kumar A, McMullan D, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, van den Bedem H, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of a novel archaeal AAA+ ATPase SSO1545 from Sulfolobus solfataricus. Proteins 2009; 74:1041-9. [PMID: 19089981 DOI: 10.1002/prot.22325] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/08/2022]
160
Schwede T, Sali A, Honig B, Levitt M, Berman HM, Jones D, Brenner SE, Burley SK, Das R, Dokholyan NV, Dunbrack RL, Fidelis K, Fiser A, Godzik A, Huang YJ, Humblet C, Jacobson MP, Joachimiak A, Krystek SR, Kortemme T, Kryshtafovych A, Montelione GT, Moult J, Murray D, Sanchez R, Sosnick TR, Standley DM, Stouch T, Vajda S, Vasquez M, Westbrook JD, Wilson IA. Outcome of a workshop on applications of protein models in biomedical research. Structure 2009; 17:151-9. [PMID: 19217386 PMCID: PMC2739730 DOI: 10.1016/j.str.2008.12.014] [Citation(s) in RCA: 109] [Impact Index Per Article: 7.3] [Received: 11/14/2008] [Revised: 11/14/2008] [Accepted: 12/16/2008] [Indexed: 02/05/2023]
Abstract
We describe the proceedings and conclusions from the "Workshop on Applications of Protein Models in Biomedical Research" (the Workshop) that was held at the University of California, San Francisco on 11 and 12 July, 2008. At the Workshop, international scientists involved with structure modeling explored (i) how models are currently used in biomedical research, (ii) the requirements and challenges for different applications, and (iii) how the interaction between the computational and experimental research communities could be strengthened to advance the field.
161
Nair R, Liu J, Soong TT, Acton TB, Everett JK, Kouranov A, Fiser A, Godzik A, Jaroszewski L, Orengo C, Montelione GT, Rost B. Structural genomics is the largest contributor of novel structural leverage. J Struct Funct Genomics 2009; 10:181-91. [PMID: 19194785 PMCID: PMC2705706 DOI: 10.1007/s10969-008-9055-6] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Received: 11/15/2008] [Accepted: 12/08/2008] [Indexed: 11/28/2022]
Abstract
The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e. structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may just take another ~15 years to cover most sequences in the current UniProt database.
162
Xu Q, McMullan D, Abdubek P, Astakhova T, Carlton D, Chen C, Chiu HJ, Clayton T, Das D, Deller MC, Duan L, Elsliger MA, Feuerhelm J, Hale J, Han GW, Jaroszewski L, Jin KK, Johnson HA, Klock HE, Knuth MW, Kozbial P, Sri Krishna S, Kumar A, Marciano D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Oommachen S, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Trame C, van den Bedem H, Weekes D, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. A structural basis for the regulatory inactivation of DnaA. J Mol Biol 2008; 385:368-80. [PMID: 19000695 DOI: 10.1016/j.jmb.2008.10.059] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 08/16/2008] [Revised: 10/18/2008] [Accepted: 10/22/2008] [Indexed: 11/25/2022]
Abstract
Regulatory inactivation of DnaA is dependent on Hda (homologous to DnaA), a protein homologous to the AAA+ (ATPases associated with diverse cellular activities) ATPase region of the replication initiator DnaA. When bound to the sliding clamp loaded onto duplex DNA, Hda can stimulate the transformation of active DnaA-ATP into inactive DnaA-ADP. The crystal structure of Hda from Shewanella amazonensis SB2B at 1.75 A resolution reveals that Hda resembles typical AAA+ ATPases. The arrangement of the two subdomains in Hda (residues 1-174 and 175-241) differs dramatically from that of DnaA. A CDP molecule anchors the Hda domains in a conformation that promotes dimer formation. The Hda dimer adopts a novel oligomeric assembly for AAA+ proteins in which the arginine finger, crucial for ATP hydrolysis, is fully exposed and available to hydrolyze DnaA-ATP through a typical AAA+ type of mechanism. The sliding clamp binding motifs at the N-terminus of each Hda monomer are partially buried and combine to form an antiparallel beta-sheet at the dimer interface. The inaccessibility of the clamp binding motifs in the CDP-bound structure of Hda suggests that conformational changes are required for Hda to form a functional complex with the clamp. Thus, the CDP-bound Hda dimer likely represents an inactive form of Hda.
163
Verberkmoes NC, Russell AL, Shah M, Godzik A, Rosenquist M, Halfvarson J, Lefsrud MG, Apajalahti J, Tysk C, Hettich RL, Jansson JK. Shotgun metaproteomics of the human distal gut microbiota. ISME J 2008; 3:179-89. [DOI: 10.1038/ismej.2008.108] [Citation(s) in RCA: 428] [Impact Index Per Article: 26.8] [Indexed: 12/13/2022]
164
Li W, Wooley JC, Godzik A. Probing metagenomics by rapid cluster analysis of very large datasets. PLoS One 2008; 3:e3375. [PMID: 18846219 PMCID: PMC2557142 DOI: 10.1371/journal.pone.0003375] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 04/03/2008] [Accepted: 09/04/2008] [Indexed: 11/21/2022]
Abstract
Background: The scale and diversity of metagenomic sequencing projects challenge both our technical and conceptual approaches in gene and genome annotations. The recent Sorcerer II Global Ocean Sampling (GOS) expedition yielded millions of predicted protein sequences, which significantly altered the landscape of known protein space by more than doubling its size and adding thousands of new families (Yooseph et al., 2007 PLoS Biol 5, e16). Such datasets, not only by their sheer size, but also by many other features, defy conventional analysis and annotation methods.
Methodology/Principal Findings: In this study, we describe an approach for rapid analysis of the sequence diversity and the internal structure of such very large datasets by advanced clustering strategies using the newly modified CD-HIT algorithm. We performed a hierarchical clustering analysis on the 17.4 million Open Reading Frames (ORFs) identified from the GOS study and found over 33 thousand large predicted protein clusters comprising nearly 6 million sequences. Twenty percent of these clusters did not match known protein families by sequence similarity search and might represent novel protein families. Distributions of the large clusters were illustrated on organism composition, functional class, and sample locations.
Conclusion/Significance: Our clustering took about two orders of magnitude less computational effort than the similar protein family analysis of original GOS study. This approach will help to analyze other large metagenomic datasets in the future. A Web server with our clustering results and annotations of predicted protein clusters is available online at http://tools.camera.calit2.net/gos under the CAMERA project.
165
Igarashi Y, Heureux E, Doctor KS, Talwar P, Gramatikova S, Gramatikoff K, Zhang Y, Blinov M, Ibragimova SS, Boyd S, Ratnikov B, Cieplak P, Godzik A, Smith JW, Osterman AL, Eroshkin AM. PMAP: databases for analyzing proteolytic events and pathways. Nucleic Acids Res 2008; 37:D611-8. [PMID: 18842634 PMCID: PMC2686432 DOI: 10.1093/nar/gkn683] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Indexed: 11/17/2022]
Abstract
The Proteolysis MAP (PMAP, http://www.proteolysis.org) is a user-friendly website intended to aid the scientific community in reasoning about proteolytic networks and pathways. PMAP comprises five databases, linked together in one environment. The foundation databases, ProteaseDB and SubstrateDB, are driven by an automated annotation pipeline that generates dynamic ‘Molecule Pages’, rich in molecular information. PMAP also contains two community-annotated databases focused on function: CutDB has information on more than 5000 proteolytic events, and ProfileDB is dedicated to information on the substrate recognition specificity of proteases. Together, the content within these four databases will ultimately feed PathwayDB, which will comprise known pathways whose function can be dynamically modeled in a rule-based manner, and hypothetical pathways suggested by semi-automated culling of the literature. A Protease Toolkit is also available for the analysis of proteases and proteolysis. Here, we describe how the databases of PMAP can be used to foster understanding of proteolytic pathways and, equally significant, to reason about proteolysis.
166
Veeramalai M, Ye Y, Godzik A. TOPS++FATCAT: fast flexible structural alignment using constraints derived from TOPS+ Strings Model. BMC Bioinformatics 2008; 9:358. [PMID: 18759993 PMCID: PMC2553092 DOI: 10.1186/1471-2105-9-358] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 04/07/2008] [Accepted: 08/31/2008] [Indexed: 11/28/2022]
Abstract
Background: Protein structure analysis and comparison are major challenges in structural bioinformatics. Despite the existence of many tools and algorithms, very few of them have managed to capture the intuitive understanding of protein structures developed in structural biology, especially in the context of rapid database searches. Such intuitions could help speed up similarity searches and make it easier to understand the results of such analyses.
Results: We developed a TOPS++FATCAT algorithm that uses an intuitive description of the proteins' structures as captured in the popular TOPS diagrams to limit the search space of the aligned fragment pairs (AFPs) in the flexible alignment of protein structures performed by the FATCAT algorithm. The TOPS++FATCAT algorithm is faster than FATCAT by more than an order of magnitude with a minimal cost in classification and alignment accuracy. For beta-rich proteins its accuracy is better than FATCAT, because the TOPS+ strings model contains important information about the parallel and anti-parallel hydrogen-bond patterns between the beta-strand SSEs (Secondary Structural Elements). We show that the TOPS++FATCAT errors, rare as they are, can be clearly linked to oversimplifications of the TOPS diagrams and can be corrected by the development of more precise secondary structure element definitions.
Software Availability: The benchmark analysis results and the compressed archive of the TOPS++FATCAT program for the Linux platform can be downloaded from the following web site:
Conclusion: TOPS++FATCAT provides FATCAT accuracy and insights into protein structural changes at a speed comparable to sequence alignments, opening up a possibility of interactive protein structure similarity searches.
167
Stec B, Prasad B, Zhang Y, Godzik A. Defining a protein: mining the protein structure database. Acta Crystallogr A 2008. [DOI: 10.1107/s0108767308079804] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
168
Elsliger M, Deacon A, Godzik A, Lesley S, Wooley J, Wilson I. Joint Center for Structural Genomics: tools and resources for the community. Acta Crystallogr A 2008. [DOI: 10.1107/s0108767308088405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
169
Xu Q, Kozbial P, McMullan D, Krishna SS, Brittain SM, Ficarro SB, DiDonato M, Miller MD, Abdubek P, Axelrod HL, Chiu HJ, Clayton T, Duan L, Elsliger MA, Feuerhelm J, Grzechnik SK, Hale J, Han GW, Jaroszewski L, Klock HE, Morse AT, Nigoghossian E, Paulsen J, Reyes R, Rife CL, van den Bedem H, White A, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of an ADP-ribosylated protein with a cytidine deaminase-like fold, but unknown function (TM1506), from Thermotoga maritima at 2.70 A resolution. Proteins 2008; 71:1546-52. [PMID: 18275082 DOI: 10.1002/prot.21992] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
170
Le Negrate G, Krieg A, Faustin B, Loeffler M, Godzik A, Krajewski S, Reed JC. ChlaDub1 of Chlamydia trachomatis suppresses NF-kappaB activation and inhibits IkappaBalpha ubiquitination and degradation. Cell Microbiol 2008; 10:1879-92. [PMID: 18503636 DOI: 10.1111/j.1462-5822.2008.01178.x] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Indexed: 12/22/2022]
Abstract
Chlamydia trachomatis is an obligate intracellular bacterial pathogen that causes various human diseases, including blindness caused by ocular infection and sexually transmitted diseases resulting from urogenital infection. After infecting host cells, Chlamydiae avoid alarming the host's immune system. Among the immune evasion mechanisms, Chlamydiae can inhibit NF-kappaB activation, a crucial pathway for host inflammatory responses. In this study, we show that ChlaDub1, a deubiquitinating and deNeddylating protease from C. trachomatis, is expressed in infected cells. In transfection experiments, ChlaDub1 suppresses NF-kappaB activation induced by several pro-inflammatory stimuli and binds the NF-kappaB inhibitory subunit IkappaBalpha, impairing its ubiquitination and degradation. Thus, we provide further insight into the mechanism by which C. trachomatis may evade the host inflammatory response by demonstrating that ChlaDub1, a protease produced by this microorganism, is capable of inhibiting IkappaBalpha degradation and blocking NF-kappaB activation.
171
Holland LZ, Albalat R, Azumi K, Benito-Gutiérrez E, Blow MJ, Bronner-Fraser M, Brunet F, Butts T, Candiani S, Dishaw LJ, Ferrier DEK, Garcia-Fernàndez J, Gibson-Brown JJ, Gissi C, Godzik A, Hallböök F, Hirose D, Hosomichi K, Ikuta T, Inoko H, Kasahara M, Kasamatsu J, Kawashima T, Kimura A, Kobayashi M, Kozmik Z, Kubokawa K, Laudet V, Litman GW, McHardy AC, Meulemans D, Nonaka M, Olinski RP, Pancer Z, Pennacchio LA, Pestarino M, Rast JP, Rigoutsos I, Robinson-Rechavi M, Roch G, Saiga H, Sasakura Y, Satake M, Satou Y, Schubert M, Sherwood N, Shiina T, Takatori N, Tello J, Vopalensky P, Wada S, Xu A, Ye Y, Yoshida K, Yoshizaki F, Yu JK, Zhang Q, Zmasek CM, de Jong PJ, Osoegawa K, Putnam NH, Rokhsar DS, Satoh N, Holland PWH. The amphioxus genome illuminates vertebrate origins and cephalochordate biology. Genome Res 2008; 18:1100-11. [PMID: 18562680 DOI: 10.1101/gr.073676.107] [Citation(s) in RCA: 368] [Impact Index Per Article: 23.0] [Indexed: 02/07/2023]
Abstract
Cephalochordates, urochordates, and vertebrates evolved from a common ancestor over 520 million years ago. To improve our understanding of chordate evolution and the origin of vertebrates, we intensively searched for particular genes, gene families, and conserved noncoding elements in the sequenced genome of the cephalochordate Branchiostoma floridae, commonly called amphioxus or lancelets. Special attention was given to homeobox genes, opsin genes, genes involved in neural crest development, nuclear receptor genes, genes encoding components of the endocrine and immune systems, and conserved cis-regulatory enhancers. The amphioxus genome contains a basic set of chordate genes involved in development and cell signaling, including a fifteenth Hox gene. This set includes many genes that were co-opted in vertebrates for new roles in neural crest development and adaptive immunity. However, where amphioxus has a single gene, vertebrates often have two, three, or four paralogs derived from two whole-genome duplication events. In addition, several transcriptional enhancers are conserved between amphioxus and vertebrates--a very wide phylogenetic distance. In contrast, urochordate genomes have lost many genes, including a diversity of homeobox families and genes involved in steroid hormone function. The amphioxus genome also exhibits derived features, including duplications of opsins and genes proposed to function in innate immunity and endocrine systems. Our results indicate that the amphioxus genome is elemental to an understanding of the biology and evolution of nonchordate deuterostomes, invertebrate chordates, and vertebrates.
172
Kozbial P, Xu Q, Chiu HJ, McMullan D, Krishna SS, Miller MD, Abdubek P, Acosta C, Astakhova T, Axelrod HL, Carlton D, Clayton T, Deller M, Duan L, Elias Y, Elsliger MA, Feuerhelm J, Grzechnik SK, Hale J, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Kumar A, Marciano D, Morse AT, Murphy KD, Nigoghossian E, Okach L, Oommachen S, Reyes R, Rife CL, Spraggon G, Trout CV, van den Bedem H, Weekes D, White A, Wolf G, Zubieta C, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structures of MW1337R and lin2004: representatives of a novel protein family that adopt a four-helical bundle fold. Proteins 2008; 71:1589-96. [PMID: 18324683 DOI: 10.1002/prot.22020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
173
Ting JPY, Lovering RC, Alnemri ES, Bertin J, Boss JM, Davis BK, Flavell RA, Girardin SE, Godzik A, Harton JA, Hoffman HM, Hugot JP, Inohara N, Mackenzie A, Maltais LJ, Nunez G, Ogura Y, Otten LA, Philpott D, Reed JC, Reith W, Schreiber S, Steimle V, Ward PA. The NLR gene family: a standard nomenclature. Immunity 2008; 28:285-7. [PMID: 18341998 DOI: 10.1016/j.immuni.2008.02.005] [Citation(s) in RCA: 629] [Impact Index Per Article: 39.3] [Indexed: 02/07/2023]
|
174
|
Le Negrate G, Faustin B, Welsh K, Loeffler M, Krajewska M, Hasegawa P, Mukherjee S, Orth K, Krajewski S, Godzik A, Guiney DG, Reed JC. Salmonella Secreted Factor L Deubiquitinase of Salmonella typhimurium Inhibits NF-κB, Suppresses IκBα Ubiquitination and Modulates Innate Immune Responses. J Immunol 2008; 180:5045-56. [DOI: 10.4049/jimmunol.180.7.5045] [Citation(s) in RCA: 116] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
|
175
|
Mathews II, McMullan D, Miller MD, Canaves JM, Elsliger MA, Floyd R, Grzechnik SK, Jaroszewski L, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, McPhillips TM, Morse AT, Quijano K, Rife CL, Schwarzenbacher R, Spraggon G, Stevens RC, van den Bedem H, Weekes D, Wolf G, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of 2-keto-3-deoxygluconate kinase (TM0067) from Thermotoga maritima at 2.05 A resolution. Proteins 2008; 70:603-8. [PMID: 18004772 DOI: 10.1002/prot.21842] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
176
|
Slabinski L, Jaroszewski L, Rodrigues APC, Rychlewski L, Wilson IA, Lesley SA, Godzik A. The challenge of protein structure determination--lessons from structural genomics. Protein Sci 2007; 16:2472-82. [PMID: 17962404 DOI: 10.1110/ps.073037907] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
Abstract
The process of experimental determination of protein structure is marred with a high ratio of failures at many stages. With availability of large quantities of data from high-throughput structure determination in structural genomics centers, we can now learn to recognize protein features correlated with failures; thus, we can recognize proteins more likely to succeed and eventually learn how to modify those that are less likely to succeed. Here, we identify several protein features that correlate strongly with successful protein production and crystallization and combine them into a single score that assesses "crystallization feasibility." The formula derived here was tested with a jackknife procedure and validated on independent benchmark sets. The "crystallization feasibility" score described here is being applied to target selection in the Joint Center for Structural Genomics, and is now contributing to increasing the success rate, lowering the costs, and shortening the time for protein structure determination. Analyses of PDB depositions suggest that very similar features also play a role in non-high-throughput structure determination, suggesting that this crystallization feasibility score would also be of significant interest to structural biology, as well as to molecular and biochemistry laboratories.
|
177
|
Godzik A, Jambon M, Friedberg I. Computational protein function prediction: are we making progress? Cell Mol Life Sci 2007; 64:2505-11. [PMID: 17611711 DOI: 10.1007/s00018-007-7211-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/23/2022]
Abstract
The computational prediction of gene and protein function is rapidly gaining ground as a central undertaking in computational biology. Making sense of the flood of genomic data requires fast and reliable annotation. Many ingenious algorithms have been devised to infer a protein's function from its amino acid sequence, 3D structure and chromosomal location of the encoding genes. However, there are significant challenges in assessing how well these programs perform. In this article we explore those challenges and review our own attempt at assessing the performance of those programs. We conclude that the task is far from complete and that a critical assessment of the performance of function prediction programs is necessary to make true progress in computational function prediction.
|
178
|
Axelrod HL, McMullan D, Krishna SS, Miller MD, Elsliger MA, Abdubek P, Ambing E, Astakhova T, Carlton D, Chiu HJ, Clayton T, Duan L, Feuerhelm J, Grzechnik SK, Hale J, Han GW, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Quijano K, Reyes R, Rife CL, van den Bedem H, Weekes D, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of AICAR transformylase IMP cyclohydrolase (TM1249) from Thermotoga maritima at 1.88 Å resolution. Proteins 2008; 71:1042-9. [DOI: 10.1002/prot.21967] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
179
|
Schwarzenbacher R, Godzik A, Jaroszewski L. The JCSG MR pipeline: optimized alignments, multiple models and parallel searches. Acta Crystallogr D Biol Crystallogr 2008; 64:133-40. [PMID: 18094477 PMCID: PMC2394805 DOI: 10.1107/s0907444907050111] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/26/2007] [Accepted: 10/12/2007] [Indexed: 12/05/2022]
Abstract
The success rate of molecular replacement (MR) falls considerably when search models share less than 35% sequence identity with their templates, but can be improved significantly by using fold-recognition methods combined with exhaustive MR searches. Models based on alignments calculated with fold-recognition algorithms are more accurate than models based on conventional alignment methods such as FASTA or BLAST, which are still widely used for MR. In addition, by designing MR pipelines that integrate phasing and automated refinement and allow parallel processing of such calculations, one can effectively increase the success rate of MR. Here, updated results from the JCSG MR pipeline are presented, which to date has solved 33 MR structures with less than 35% sequence identity to the closest homologue of known structure. By using difficult MR problems as examples, it is demonstrated that successful MR phasing is possible even in cases where the similarity between the model and the template can only be detected with fold-recognition algorithms. In the first step, several search models are built based on all homologues found in the PDB by fold-recognition algorithms. The models resulting from this process are used in parallel MR searches with different combinations of input parameters of the MR phasing algorithm. The putative solutions are subjected to rigid-body and restrained crystallographic refinement and ranked based on the final values of free R factor, figure of merit and deviations from ideal geometry. Finally, crystal packing and electron-density maps are checked to identify the correct solution. If this procedure does not yield a solution with interpretable electron-density maps, then even more alternative models are prepared. The structurally variable regions of a protein family are identified based on alignments of sequences and known structures from that family and appropriate trimmings of the models are proposed. 
All combinations of these trimmings are applied to the search models and the resulting set of models is used in the MR pipeline. It is estimated that with the improvements in model building and exhaustive parallel searches with existing phasing algorithms, MR can be successful for more than 50% of recognizable homologues of known structures below the threshold of 35% sequence identity. This implies that about one-third of the proteins in a typical bacterial proteome are potential MR targets.
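The "all combinations of trimmings" step described in this abstract can be illustrated with a minimal sketch. This is not the JCSG pipeline code: the function name `trimmed_variants`, the residue numbering, and the region labels `loopA` and `loopB` are hypothetical, and a model is reduced here to a plain list of residue numbers.

```python
from itertools import combinations


def trimmed_variants(model_residues, variable_regions):
    """Yield (applied_trims, kept_residues) for every subset of trimmings.

    model_residues: list of residue numbers in the search model.
    variable_regions: dict mapping a label to an inclusive (start, end)
    residue range that may be trimmed away (hypothetical representation).
    """
    names = sorted(variable_regions)
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            removed = set()
            for name in subset:
                start, end = variable_regions[name]
                removed.update(range(start, end + 1))
            kept = [r for r in model_residues if r not in removed]
            yield subset, kept


# Hypothetical 20-residue model with two structurally variable regions.
model = list(range(1, 21))
regions = {"loopA": (5, 8), "loopB": (15, 17)}
variants = list(trimmed_variants(model, regions))
```

With n proposed trimmings this produces 2^n search models (including the untrimmed one), which is why the abstract pairs the approach with parallel MR searches.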
|
180
|
Premkumar L, Rife CL, Sri Krishna S, McMullan D, Miller MD, Abdubek P, Ambing E, Astakhova T, Axelrod HL, Canaves JM, Carlton D, Chiu HJ, Clayton T, DiDonato M, Duan L, Elsliger MA, Feuerhelm J, Floyd R, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Kovarik JS, Kreusch A, Levin I, McPhillips TM, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Quijano K, Reyes R, Rezezadeh F, Rodionov D, Schwarzenbacher R, Spraggon G, van den Bedem H, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of TM1030 from Thermotoga maritima at 2.3 A resolution reveals molecular details of its transcription repressor function. Proteins 2007; 68:418-24. [PMID: 17444523 DOI: 10.1002/prot.21436] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
|
181
|
Zhang Y, Stec B, Godzik A. Between order and disorder in protein structures: analysis of "dual personality" fragments in proteins. Structure 2007; 15:1141-7. [PMID: 17850753 PMCID: PMC2084070 DOI: 10.1016/j.str.2007.07.012] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2007] [Revised: 06/21/2007] [Accepted: 07/04/2007] [Indexed: 11/21/2022]
Abstract
In their natural environment, three-dimensional structures of proteins undergo significant fluctuations and are often partially or completely disordered. This phenomenon recently became the focus of much attention, as many proteins, especially from higher organisms, were shown to contain large intrinsically disordered regions. Such disordered regions may become ordered only under very specific circumstances, if at all, and can be recognized by specific amino acid composition and sequence signatures. Here, we suggest that the balance between order and disorder is much more subtle in that many regions are very close to the order/disorder boundary. Specifically, analysis of redundant sets of experimental models of protein structures, where emphasis is put on comparison of structures of identical proteins solved in different conditions and functional states, shows hundreds of fragments captured in two states: ordered and disordered. We show that such fragments, which we call here "dual personality" (DP) fragments, have distinctive features that differentiate them from both regularly folded and intrinsically disordered fragments. We hypothesize, and show on several examples, that such fragments are often targets of regulation, either by allostery or posttranslational modifications.
|
182
|
Xu Q, Saikatendu KS, Krishna SS, McMullan D, Abdubek P, Agarwalla S, Ambing E, Astakhova T, Axelrod HL, Carlton D, Chiu HJ, Clayton T, DiDonato M, Duan L, Elsliger MA, Feuerhelm J, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Miller MD, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, Rife CL, Schwarzenbacher R, van den Bedem H, White A, Wolf G, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of MtnX phosphatase from Bacillus subtilis at 2.0 Å resolution provides a structural basis for bipartite phosphomonoester hydrolysis of 2-hydroxy-3-keto-5-methylthiopentenyl-1-phosphate. Proteins 2007; 69:433-9. [PMID: 17654724 DOI: 10.1002/prot.21602] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
183
|
Yooseph S, Sutton G, Rusch DB, Halpern AL, Williamson SJ, Remington K, Eisen JA, Heidelberg KB, Manning G, Li W, Jaroszewski L, Cieplak P, Miller CS, Li H, Mashiyama ST, Joachimiak MP, van Belle C, Chandonia JM, Soergel DA, Zhai Y, Natarajan K, Lee S, Raphael BJ, Bafna V, Friedman R, Brenner SE, Godzik A, Eisenberg D, Dixon JE, Taylor SS, Strausberg RL, Frazier M, Venter JC. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol 2007; 5:e16. [PMID: 17355171 PMCID: PMC1821046 DOI: 10.1371/journal.pbio.0050016] [Citation(s) in RCA: 667] [Impact Index Per Article: 39.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2006] [Accepted: 08/15/2006] [Indexed: 02/04/2023] Open
Abstract
Metagenomics projects based on shotgun sequencing of populations of micro-organisms yield insight into protein families. We used sequence similarity clustering to explore proteins with a comprehensive dataset consisting of sequences from available databases together with 6.12 million proteins predicted from an assembly of 7.7 million Global Ocean Sampling (GOS) sequences. The GOS dataset covers nearly all known prokaryotic protein families. A total of 3,995 medium- and large-sized clusters consisting of only GOS sequences are identified, out of which 1,700 have no detectable homology to known families. The GOS-only clusters contain a higher than expected proportion of sequences of viral origin, thus reflecting a poor sampling of viral diversity until now. Protein domain distributions in the GOS dataset and current protein databases show distinct biases. Several protein domains that were previously categorized as kingdom specific are shown to have GOS examples in other kingdoms. About 6,000 sequences (ORFans) from the literature that heretofore lacked similarity to known proteins have matches in the GOS data. The GOS dataset is also used to improve remote homology detection. Overall, besides nearly doubling the number of current proteins, the predicted GOS proteins also add a great deal of diversity to known protein families and shed light on their evolution. These observations are illustrated using several protein families, including phosphatases, proteases, ultraviolet-irradiation DNA damage repair enzymes, glutamine synthetase, and RuBisCO. The diversity added by GOS data has implications for choosing targets for experimental structure characterization as part of structural genomics efforts. Our analysis indicates that new families are being discovered at a rate that is linear or almost linear with the addition of new sequences, implying that we are still far from discovering all protein families in nature. 
The rapidly emerging field of metagenomics seeks to examine the genomic content of communities of organisms to understand their roles and interactions in an ecosystem. Given the wide-ranging roles microbes play in many ecosystems, metagenomics studies of microbial communities will reveal insights into protein families and their evolution. Because most microbes will not grow in the laboratory using current cultivation techniques, scientists have turned to cultivation-independent techniques to study microbial diversity. One such technique—shotgun sequencing—allows random sampling of DNA sequences to examine the genomic material present in a microbial community. We used shotgun sequencing to examine microbial communities in water samples collected by the Sorcerer II Global Ocean Sampling (GOS) expedition. Our analysis predicted more than six million proteins in the GOS data—nearly twice the number of proteins present in current databases. These predictions add tremendous diversity to known protein families and cover nearly all known prokaryotic protein families. Some of the predicted proteins had no similarity to any currently known proteins and therefore represent new families. A higher than expected fraction of these novel families is predicted to be of viral origin. We also found that several protein domains that were previously thought to be kingdom specific have GOS examples in other kingdoms. Our analysis opens the door for a multitude of follow-up protein family analyses and indicates that we are a long way from sampling all the protein families that exist in nature. The GOS data identified 6.12 million predicted proteins covering nearly all known prokaryotic protein families, and several new families. This almost doubles the number of known proteins and shows that we are far from identifying all the proteins in nature.
|
184
|
Slabinski L, Jaroszewski L, Rychlewski L, Wilson IA, Lesley SA, Godzik A. XtalPred: a web server for prediction of protein crystallizability. Bioinformatics 2007; 23:3403-5. [PMID: 17921170 DOI: 10.1093/bioinformatics/btm477] [Citation(s) in RCA: 216] [Impact Index Per Article: 12.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
XtalPred is a web server for prediction of protein crystallizability. The prediction is made by comparing several features of the protein with distributions of these features in TargetDB and combining the results into an overall probability of crystallization. XtalPred provides: (1) a detailed comparison of the protein's features to the corresponding distribution from TargetDB; (2) a summary of protein features and predictions that indicate problems that are likely to be encountered during protein crystallization; (3) prediction of ligands; and (4) (optional) lists of close homologs from complete microbial genomes that are more likely to crystallize. AVAILABILITY: The XtalPred web server is freely available for academic users at http://ffas.burnham.org/XtalPred
|
185
|
Abstract
Large-scale genome sequencing and structural genomics projects generate numerous sequences and structures for 'hypothetical' proteins without functional characterizations. Detection of homology to experimentally characterized proteins can provide functional clues, but the accuracy of homology-based predictions is limited by the paucity of tools for quantitative comparison of diverging residues responsible for the functional divergence. SURF'S UP! is a web server for analysis of functional relationships in protein families, as inferred from protein surface maps comparison according to the algorithm. It assigns a numerical score to the similarity between patterns of physicochemical features (charge, hydrophobicity) on compared protein surfaces. It allows recognition of clusters of proteins that have similar surfaces, hence presumably similar functions. The server takes as input a set of protein coordinates and returns files with "spherical coordinates" of proteins in PDB format and their graphical presentation, a matrix with values of mutual similarities between the surfaces, and the unrooted tree that represents the clustering of similar surfaces, calculated by the neighbor-joining method. SURF'S UP! facilitates the comparative analysis of physicochemical features of the surface, which are the key determinants of protein function. By concentrating on coarse surface features, SURF'S UP! can work with models obtained from comparative modelling. Although it is designed to analyse the conservation among homologs, it can also be used to compare surfaces of non-homologous proteins with different three-dimensional folds, as long as a functionally meaningful structural superposition is supplied by the user. Another valuable characteristic of our method is the lack of initial assumptions about the functional features to be compared. SURF'S UP! is freely available for academic researchers at http://asia.genesilico.pl/surfs_up/.
|
186
|
Friedberg I, Godzik A. Connecting the protein structure universe by using sparse recurring fragments. Structure 2005; 13:1213-24. [PMID: 16084393 DOI: 10.1016/j.str.2005.05.009] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2005] [Revised: 04/22/2005] [Accepted: 05/11/2005] [Indexed: 10/25/2022]
Abstract
The quest to order and classify protein structures has led to various classification schemes, focusing mostly on hierarchical relationships between structural domains. At the coarsest classification level, such schemes typically identify hundreds of types of fundamental units called folds. As a result, we picture protein structure space as a collection of isolated fold islands. It is obvious, however, that many protein folds share structural and functional commonalities. Locating those commonalities is important for our understanding of protein structure, function, and evolution. Here, we present an alternative view of the protein fold space, based on an interfold similarity measure that is related to the frequency of fragments shared between folds. In this view, protein structures form a complicated, cross-connected network with very interesting topology. We show that interfold similarity based on sequence/structure fragments correlates well with similarities of functions between protein populations in different folds.
|
187
|
Ye Y, Godzik A. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 2003; 19 Suppl 2:ii246-55. [PMID: 14534198 DOI: 10.1093/bioinformatics/btg1086] [Citation(s) in RCA: 344] [Impact Index Per Article: 20.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Protein structures are flexible and undergo structural rearrangements as part of their function, and yet most existing protein structure comparison methods treat them as rigid bodies, which may lead to incorrect alignment. RESULTS: We have developed the Flexible structure AlignmenT by Chaining AFPs (Aligned Fragment Pairs) with Twists (FATCAT), a new method for structural alignment of proteins. The FATCAT approach simultaneously addresses the two major goals of flexible structure alignment: optimizing the alignment and minimizing the number of rigid-body movements (twists) around pivot points (hinges) introduced in the reference protein. In contrast, currently existing flexible structure alignment programs treat the hinge detection as a post-process of a standard rigid body alignment. We illustrate the advantages of the FATCAT approach by several examples of comparison between proteins known to adopt different conformations, where the FATCAT algorithm achieves more accurate structure alignments than current methods, while at the same time introducing fewer hinges.
|
188
|
Zubieta C, Krishna SS, McMullan D, Miller MD, Abdubek P, Agarwalla S, Ambing E, Astakhova T, Axelrod HL, Carlton D, Chiu HJ, Clayton T, Deller M, DiDonato M, Duan L, Elsliger MA, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Kumar A, Marciano D, Morse AT, Nigoghossian E, Oommachen S, Reyes R, Rife CL, van den Bedem H, Weekes D, White A, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of homoserine O-succinyltransferase from Bacillus cereus at 2.4 Å resolution. Proteins 2007; 68:999-1005. [PMID: 17546672 DOI: 10.1002/prot.21208] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
|
189
|
Rodrigues APC, Grant BJ, Godzik A, Friedberg I. The 2006 automated function prediction meeting. BMC Bioinformatics 2007; 8 Suppl 4:S1-4. [PMID: 17570143 PMCID: PMC1892079 DOI: 10.1186/1471-2105-8-s4-s1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
|
190
|
Weekes D, Miller MD, Krishna SS, McMullan D, McPhillips TM, Acosta C, Canaves JM, Elsliger MA, Floyd R, Grzechnik SK, Jaroszewski L, Klock HE, Koesema E, Kovarik JS, Kreusch A, Morse AT, Quijano K, Spraggon G, van den Bedem H, Wolf G, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of a transcription regulator (TM1602) from Thermotoga maritima at 2.3 A resolution. Proteins 2007; 67:247-52. [PMID: 17256761 DOI: 10.1002/prot.21221] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
191
|
Friedberg I, Nika K, Tautz L, Saito K, Cerignoli F, Friedberg I, Godzik A, Mustelin T. Identification and characterization of DUSP27, a novel dual-specific protein phosphatase. FEBS Lett 2007; 581:2527-33. [PMID: 17498703 DOI: 10.1016/j.febslet.2007.04.059] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2007] [Revised: 04/08/2007] [Accepted: 04/18/2007] [Indexed: 10/23/2022]
Abstract
A novel human dual-specific protein phosphatase (DSP), designated DUSP27, is here described. The DUSP27 gene contains three exons, rather than the predicted 4-14 exons, and encodes a 220 amino acid protein. DUSP27 is structurally similar to other small DSPs, like VHR and DUSP13. The location of DUSP27 on chromosome 10q22, 50 kb upstream of DUSP13, suggests that these two genes arose by gene duplication. DUSP27 is an active enzyme, and its kinetic parameters were determined. DUSP27 is a cytosolic enzyme, expressed in skeletal muscle, liver and adipose tissue, suggesting its possible role in energy metabolism.
|
192
|
Grynberg M, Li Z, Szczurek E, Godzik A. Putative type IV secretion genes in Bacillus anthracis. Trends Microbiol 2007; 15:191-5. [PMID: 17387016 DOI: 10.1016/j.tim.2007.03.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/08/2006] [Revised: 02/15/2007] [Accepted: 03/13/2007] [Indexed: 11/22/2022]
Abstract
Although the physiology of Bacillus anthracis, the causative agent of anthrax, has been studied extensively, we still do not know how toxins are dispatched from the bacterial cell. Here, by means of distant homology and genome context analyses, we identify genes encoding putative type IV secretion system-related elements on the B. anthracis plasmids pXO1 and pXO2 and in the chromosome. We argue that this type IV secretion system-like system could be responsible for anthrax toxin secretion, although we also discuss the possibilities of its involvement in the processes of sporulation, germination or conjugation.
|
193
|
Friedberg I, Harder T, Kolodny R, Sitbon E, Li Z, Godzik A. Using an alignment of fragment strings for comparing protein structures. Bioinformatics 2007; 23:e219-24. [PMID: 17237095 DOI: 10.1093/bioinformatics/btl310] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Most methods that are used to compare protein structures use three-dimensional (3D) structural information. At the same time, it has been shown that a 1D string representation of local protein structure retains a degree of structural information. This type of representation can be a powerful tool for protein structure comparison and classification, given the arsenal of sequence comparison tools developed by computational biology. However, in order to do so, there is a need to first understand how much information is contained in various possible 1D representations of protein structure. RESULTS: Here we describe the use of a particular structure fragment library, denoted here as KL-strings, for the 1D representation of protein structure. Using KL-strings, we develop an infrastructure for comparing protein structures with a 1D representation. This study focuses on the added value gained from such a description. We show the new local structure language adds resolution to the traditional three-state (helix, strand and coil) secondary structure description, and provides a high degree of accuracy in recognizing structural similarities when used with a pairwise alignment benchmark. The results of this study have immediate applications towards fast structure recognition, and for fold prediction and classification.
|
194
|
Xu Q, Krishna SS, McMullan D, Schwarzenbacher R, Miller MD, Abdubek P, Agarwalla S, Ambing E, Astakhova T, Axelrod HL, Canaves JM, Carlton D, Chiu HJ, Clayton T, DiDonato M, Duan L, Elsliger MA, Feuerhelm J, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Quijano K, Reyes R, Rife CL, Spraggon G, Stevens RC, van den Bedem H, White A, Wolf G, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of an ORFan protein (TM1622) from Thermotoga maritima at 1.75 A resolution reveals a fold similar to the Ran-binding protein Mog1p. Proteins 2006; 65:777-82. [PMID: 16948158 DOI: 10.1002/prot.21015] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
195
|
Igarashi Y, Eroshkin A, Gramatikova S, Gramatikoff K, Zhang Y, Smith JW, Osterman AL, Godzik A. CutDB: a proteolytic event database. Nucleic Acids Res 2006; 35:D546-9. [PMID: 17142225 PMCID: PMC1669773 DOI: 10.1093/nar/gkl813] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023] Open
Abstract
Beyond the well-known role of proteolytic machinery in protein degradation and turnover, many specialized proteases play a key role in various regulatory processes. Thousands of highly specific proteolytic events are associated with normal and pathological conditions, including bacterial and viral infections. However, the information about individual proteolytic events is dispersed over multiple publications and is not easily available for large-scale analysis. CutDB is one of the first systematic efforts to build an easily accessible collection of documented proteolytic events for natural proteins in vivo or in vitro. A CutDB entry is defined by a unique combination of these three attributes: protease, protein substrate and cleavage site. Currently, CutDB integrates 3070 proteolytic events for 470 different proteases captured from public archives (such as MEROPS and HPRD) and publications. CutDB supports various types of data searches and displays, including clickable network diagrams. Most importantly, CutDB is a community annotation resource based on a Wikipedia approach, providing a convenient user interface to input new data online. A recent contribution of 568 proteolytic events by several experts in the field of matrix metallopeptidases suggests that this approach will significantly accelerate the development of CutDB content. CutDB is publicly available at .
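The entry definition given in this abstract (a unique combination of protease, protein substrate and cleavage site) maps naturally onto a record keyed by that triple. The following is a minimal sketch under that reading only; it does not reflect the actual CutDB schema, and the class names, the integer encoding of the cleavage site, and the placeholder identifiers are all hypothetical.

```python
from typing import NamedTuple


class ProteolyticEvent(NamedTuple):
    """One entry, identified by the (protease, substrate, site) triple."""
    protease: str
    substrate: str
    cleavage_site: int  # hypothetical encoding: residue position of the cut


class EventStore:
    """Keeps one record per unique triple and collects its sources."""

    def __init__(self):
        self._sources = {}  # ProteolyticEvent -> set of source labels

    def add(self, event, source):
        """Register a source for an event; return True if the event is new."""
        is_new = event not in self._sources
        self._sources.setdefault(event, set()).add(source)
        return is_new

    def sources(self, event):
        return set(self._sources.get(event, ()))

    def __len__(self):
        return len(self._sources)


store = EventStore()
# Placeholder identifiers, not real database records.
first = store.add(ProteolyticEvent("protease_X", "substrate_Y", 120), "archive_1")
again = store.add(ProteolyticEvent("protease_X", "substrate_Y", 120), "paper_2")
other = store.add(ProteolyticEvent("protease_X", "substrate_Y", 245), "paper_2")
```

Reporting the same triple from a second source adds provenance without creating a duplicate entry, while a different cleavage site in the same substrate counts as a separate event.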
|
196
|
Kosloff M, Han GW, Krishna SS, Schwarzenbacher R, Fasnacht M, Elsliger MA, Abdubek P, Agarwalla S, Ambing E, Astakhova T, Axelrod HL, Canaves JM, Carlton D, Chiu HJ, Clayton T, DiDonato M, Duan L, Feuerhelm J, Grittini C, Grzechnik SK, Hale J, Hampton E, Haugen J, Jaroszewski L, Jin KK, Johnson H, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Levin I, McMullan D, Miller MD, Morse AT, Moy K, Nigoghossian E, Okach L, Oommachen S, Page R, Paulsen J, Quijano K, Reyes R, Rife CL, Sims E, Spraggon G, Sridhar V, Stevens RC, van den Bedem H, Velasquez J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Comparative structural analysis of a novel glutathione S-transferase (ATU5508) from Agrobacterium tumefaciens at 2.0 Å resolution. Proteins 2006; 65:527-37. [PMID: 16988933 DOI: 10.1002/prot.21130] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Glutathione S-transferases (GSTs) comprise a diverse superfamily of enzymes found in organisms from all kingdoms of life. GSTs are involved in diverse processes, notably small-molecule biosynthesis or detoxification, and are frequently also used in protein engineering studies or as biotechnology tools. Here, we report the high-resolution X-ray structure of Atu5508 from the pathogenic soil bacterium Agrobacterium tumefaciens (atGST1). Through use of comparative sequence and structural analysis of the GST superfamily, we identified local sequence and structural signatures, which allowed us to distinguish between different GST classes. This approach enables GST classification based on structure, without requiring additional biochemical or immunological data. Consequently, analysis of the atGST1 crystal structure suggests a new GST class, distinct from previously characterized GSTs, which would make it an attractive target for further biochemical studies.
|
197
|
DiDonato M, Krishna SS, Schwarzenbacher R, McMullan D, Agarwalla S, Brittain SM, Miller MD, Abdubek P, Ambing E, Axelrod HL, Canaves JM, Chiu HJ, Deacon AM, Duan L, Elsliger MA, Godzik A, Grzechnik SK, Hale J, Hampton E, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Lesley SA, Levin I, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Quijano K, Reyes R, Rife CL, Spraggon G, Stevens RC, van den Bedem H, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of 2-phosphosulfolactate phosphatase (ComB) from Clostridium acetobutylicum at 2.6 Å resolution reveals a new fold with a novel active site. Proteins 2006; 65:771-6. [PMID: 16927339 DOI: 10.1002/prot.20978] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
|
198
|
Berman HM, Burley SK, Chiu W, Sali A, Adzhubei A, Bourne PE, Bryant SH, Dunbrack RL, Fidelis K, Frank J, Godzik A, Henrick K, Joachimiak A, Heymann B, Jones D, Markley JL, Moult J, Montelione GT, Orengo C, Rossmann MG, Rost B, Saibil H, Schwede T, Standley DM, Westbrook JD. Outcome of a workshop on archiving structural models of biological macromolecules. Structure 2006; 14:1211-7. [PMID: 16955948 DOI: 10.1016/j.str.2006.06.005] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Indexed: 11/21/2022]
|
199
|
Zhi D, Krishna SS, Cao H, Pevzner P, Godzik A. Representing and comparing protein structures as paths in three-dimensional space. BMC Bioinformatics 2006; 7:460. [PMID: 17052359 PMCID: PMC1626488 DOI: 10.1186/1471-2105-7-460] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 05/18/2006] [Accepted: 10/20/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Most existing formulations of protein structure comparison are based on detailed atomic-level descriptions of protein structures and bypass potential insights that arise from a higher-level abstraction. RESULTS: We propose a structure comparison approach based on a simplified representation of a protein that describes its three-dimensional path by local curvature along the generalized backbone of the polypeptide. We have implemented a dynamic programming procedure that aligns curvatures of proteins by optimizing a defined sum turning angle deviation measure. CONCLUSION: Although our procedure does not directly optimize global structural similarity as measured by RMSD, our benchmarking results indicate that it recovers surprisingly well the structural similarity defined by structure classification databases and traditional structure alignment programs. In addition, our program can recognize similarities between structures with extensive conformational changes that are beyond the ability of traditional structure alignment programs. We demonstrate applications of the procedure in several contexts of structure comparison. An implementation of our procedure, CURVE, is available as a public webserver.
|
200
|
Awobuluyi M, Yang J, Ye Y, Chatterton JE, Godzik A, Lipton SA, Zhang D. Subunit-Specific Roles of Glycine-Binding Domains in Activation of NR1/NR3 N-Methyl-d-aspartate Receptors. Mol Pharmacol 2006; 71:112-22. [PMID: 17047094 DOI: 10.1124/mol.106.030700] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 11/22/2022] Open
Abstract
N-Methyl-D-aspartate receptors (NMDARs) composed of NR1 and NR3 subunits differ from other NMDAR subtypes in that they require glycine alone for activation. However, little else is known about the activation mechanism of these receptors. Using NMDAR glycine-site agonists/antagonists in conjunction with functional mutagenesis of the NR1 and NR3 ligand-binding cores, we demonstrate quite surprisingly that agonist binding to NR3 alone is sufficient to activate a significant component of NR1/NR3 receptor currents. Thus, the apo conformation of NR1 in NR1/NR3 receptors is permissive for receptor activation. Agonist-bound NR1 may also contribute to peak NR1/NR3 receptor currents but specifically enables significant NR1/NR3 receptor current decay under the conditions studied here, presumably via a slow component of desensitization. Ligand studies of NR1/NR3 receptors also suggest differential agonist selectivity between NR3 and NR1, as some high-affinity NR1 agonists only minimally activate NR1/NR3 receptors, whereas other NR1 agonists are as potent as glycine. Furthermore, liganded NR3 subunits seem necessary for effective engagement of NR1 in NR1/NR3 receptor activation, suggesting significant interactivity between the two subunits. NR3 subunits thus induce plasticity in NR1 with respect to subunit assembly and ligand binding/channel coupling that is unique among ligand-gated ion channel subunits.
|