201
|
Ginalski K, Godzik A, Rychlewski L. Novel SARS unique AdoMet-dependent methyltransferase. Cell Cycle 2006; 5:2414-6. [PMID: 17102613 DOI: 10.4161/cc.5.20.3361] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
|
202
|
Xu Q, Schwarzenbacher R, Krishna SS, McMullan D, Agarwalla S, Quijano K, Abdubek P, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Elsliger MA, Grittini C, Grzechnik SK, DiDonato M, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Miller MD, Moy K, Nigoghossian E, Paulsen J, Reyes R, Rife C, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, White A, Wolf G, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of acireductone dioxygenase (ARD) from Mus musculus at 2.06 angstrom resolution. Proteins 2006; 64:808-13. [PMID: 16783794 DOI: 10.1002/prot.20947] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
|
203
|
White AK, Hoch JA, Grynberg M, Godzik A, Perego M. Sensor domains encoded in Bacillus anthracis virulence plasmids prevent sporulation by hijacking a sporulation sensor histidine kinase. J Bacteriol 2006; 188:6354-60. [PMID: 16923903 PMCID: PMC1595385 DOI: 10.1128/jb.00656-06] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022]
Abstract
Anthrax toxin and capsule, determinants for successful infection by Bacillus anthracis, are encoded on the virulence plasmids pXO1 and pXO2, respectively. Each of these plasmids also encodes proteins that are highly homologous to the signal sensor domain of a chromosomally encoded major sporulation sensor histidine kinase (BA2291) in this organism. B. anthracis Sterne overexpressing the plasmid pXO2-61-encoded signal sensor domain exhibited a significant decrease in sporulation that was suppressed by the deletion of the BA2291 gene. Expression of the sensor domains from the pXO1-118 and pXO2-61 genes in Bacillus subtilis strains carrying the B. anthracis sporulation sensor kinase BA2291 gene resulted in BA2291-dependent inhibition of sporulation. These results indicate that sporulation sensor kinase BA2291 is converted from an activator to an inhibitor of sporulation in its native host by the virulence plasmid-encoded signal sensor domains. We speculate that activation of these signal sensor domains contributes to the initiation of B. anthracis sporulation in the bloodstream of its infected host, a salient characteristic in the virulence of this organism, and provides an additional role for the virulence plasmids in anthrax pathogenesis.
|
204
|
Mathews II, Krishna SS, Schwarzenbacher R, McMullan D, Jaroszewski L, Miller MD, Abdubek P, Agarwalla S, Ambing E, Axelrod HL, Canaves JM, Carlton D, Chiu HJ, Clayton T, DiDonato M, Duan L, Elsliger MA, Grzechnik SK, Hale J, Hampton E, Haugen J, Jin KK, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, Levin I, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Quijano K, Reyes R, Rife CL, Spraggon G, Stevens RC, van den Bedem H, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of phosphoribosylformyl-glycinamidine synthase II, PurS subunit (TM1244) from Thermotoga maritima at 1.90 Å resolution. Proteins 2006; 65:249-54. [PMID: 16865708 DOI: 10.1002/prot.21024] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
|
205
|
Han GW, Sri Krishna S, Schwarzenbacher R, McMullan D, Ginalski K, Elsliger MA, Brittain SM, Abdubek P, Agarwalla S, Ambing E, Astakhova T, Axelrod H, Canaves JM, Chiu HJ, DiDonato M, Grzechnik SK, Hale J, Hampton E, Haugen J, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Miller MD, Morse AT, Moy K, Nigoghossian E, Oommachen S, Ouyang J, Paulsen J, Quijano K, Reyes R, Rife C, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Wang X, West B, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of the ApbE protein (TM1553) from Thermotoga maritima at 1.58 Å resolution. Proteins 2006; 64:1083-90. [PMID: 16779835 DOI: 10.1002/prot.20950] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
|
206
|
Abstract
With the high number of sequences and structures streaming in from genomic projects, there is a need for more powerful and sophisticated annotation tools. Most problematic of the annotation efforts is predicting gene and protein function. Over the past few years there has been considerable progress in automated protein function prediction, using a diverse set of methods. Nevertheless, no single method reports all the information possible, and molecular biologists resort to 'shopping around' using different methods: a cumbersome and time-consuming practice. Here we present the Joined Assembly of Function Annotations, or JAFA server. JAFA queries several function prediction servers with a protein sequence and assembles the returned predictions in a legible, non-redundant format. In this manner, JAFA combines the predictions of several servers to provide a comprehensive view of what are the predicted functions of the proteins. JAFA also offers its own output, and the individual programs' predictions for further processing. JAFA is available for use from http://jafa.burnham.org.
|
207
|
|
208
|
Schwarzenbacher R, McMullan D, Krishna SS, Xu Q, Miller MD, Canaves JM, Elsliger MA, Floyd R, Grzechnik SK, Jaroszewski L, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, McPhillips TM, Morse AT, Quijano K, Spraggon G, Stevens RC, van den Bedem H, Wolf G, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA. Crystal structure of a glycerate kinase (TM1585) from Thermotoga maritima at 2.70 Å resolution reveals a new fold. Proteins 2006; 65:243-8. [PMID: 16865707 DOI: 10.1002/prot.21058] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/11/2022]
|
209
|
Mathews II, Krishna SS, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Levin I, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of phosphoribosylformylglycinamidine synthase II (smPurL) from Thermotoga maritima at 2.15 Å resolution. Proteins 2006; 63:1106-11. [PMID: 16544324 DOI: 10.1002/prot.20650] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/08/2022]
|
210
|
Jin KK, Krishna SS, Schwarzenbacher R, McMullan D, Abdubek P, Agarwalla S, Ambing E, Axelrod H, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Feuerhelm J, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Lesley SA, Miller MD, Moy K, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Quijano K, Reyes R, Rife C, Stevens RC, Spraggon G, van den Bedem H, Velasquez J, White A, Wolf G, Han GW, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of TM1367 from Thermotoga maritima at 1.90 Å resolution reveals an atypical member of the cyclophilin (peptidylprolyl isomerase) fold. Proteins 2006; 63:1112-8. [PMID: 16544291 DOI: 10.1002/prot.20894] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
|
212
|
Doukhanina EV, Chen S, van der Zalm E, Godzik A, Reed J, Dickman MB. Identification and Functional Characterization of the BAG Protein Family in Arabidopsis thaliana. J Biol Chem 2006; 281:18793-801. [PMID: 16636050 DOI: 10.1074/jbc.m511794200] [Citation(s) in RCA: 142] [Impact Index Per Article: 7.9] [Indexed: 11/06/2022]
Abstract
The genes that control mammalian programmed cell death are conserved across wide evolutionary distances. Although plant cells can undergo apoptosis-like cell death, plant homologs of mammalian regulators of apoptosis have, in general, not been found. This is in part due to the lack of primary sequence conservation between animal and putative plant regulators of apoptosis. Thus, alternative approaches beyond sequence similarities are required to find functional plant homologs of apoptosis regulators. Here, we present the results of using advanced bioinformatic tools to uncover the Arabidopsis family of BAG proteins. The mammalian BAG (Bcl-2-associated athanogene) proteins are a family of chaperone regulators that modulate a number of diverse processes ranging from proliferation to growth arrest and cell death. Such proteins are distinguished by a conserved BAG domain that directly interacts with Hsp70 and Hsc70 proteins to regulate their activity. Our searches of the Arabidopsis thaliana genome sequence revealed seven homologs of the BAG protein family. We further show that plant BAG family members are also multifunctional and remarkably similar to their animal counterparts, as they regulate apoptosis-like processes ranging from pathogen attack to abiotic stress and development.
|
213
|
Ye Y, Osterman A, Overbeek R, Godzik A. Automatic detection of subsystem/pathway variants in genome analysis. Bioinformatics 2006; 21 Suppl 1:i478-86. [PMID: 15961494 DOI: 10.1093/bioinformatics/bti1052] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Proteins work together in pathways and networks, collectively comprising the cellular machinery. A subsystem (a generalization of pathway concept) is a group of related functional roles (such as enzymes) jointly involved in a specific aspect of the cellular machinery. Subsystems provide a natural framework for comparative genome analysis and functional annotation. A subsystem may be implemented in a number of different functional variants in individual species. In order to reliably project functional assignments across multiple genomes, we have to be able to identify the variants implemented in each genome. The analysis of such variants across diverse species is an interesting problem by itself and may provide new evolutionary insights. However, no computational techniques are presently available for an automated detection and analysis of subsystem variants. RESULTS: Here we formulate the subsystem variant detection problem as finding the minimum number of subgraphs of a subsystem, which is represented as a graph, and solve the optimization problem by integer programming approach. The performance of our method was tested on subsystems encoded in the SEED, a genomic integration platform developed by the Fellowship for Interpretation of Genomes as a component of a large-scale effort on comparative analysis and annotation of multiple diverse genomes. Here we illustrate the results obtained for two expert-encoded subsystems of the biosynthesis of Coenzyme A and FMN/FAD cofactors. Applications of variant detection, to support genomic annotations and to assess divergence of species, are briefly discussed in the context of these universally conserved and essential metabolic subsystems. SUPPLEMENTARY INFORMATION: The details of the variant detection results are available at http://ffas.burnham.org/svar/supp.html.
|
214
|
Han GW, Schwarzenbacher R, Page R, Jaroszewski L, Abdubek P, Ambing E, Biorac T, Canaves JM, Chiu HJ, Dai X, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Haugen J, Hornsby M, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Levin I, McMullan D, McPhillips TM, Miller MD, Morse A, Moy K, Nigoghossian E, Ouyang J, Paulsen J, Quijano K, Reyes R, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, von Delft F, Wang X, West B, White A, Wolf G, Xu Q, Zagnitko O, Hodgson KO, Wooley J, Wilson IA. Crystal structure of an alanine-glyoxylate aminotransferase from Anabaena sp. at 1.70 Å resolution reveals a noncovalently linked PLP cofactor. Proteins 2006; 58:971-5. [PMID: 15657930 DOI: 10.1002/prot.20360] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
|
215
|
DiDonato M, Krishna SS, Schwarzenbacher R, McMullan D, Jaroszewski L, Miller MD, Abdubek P, Agarwalla S, Ambing E, Axelrod H, Biorac T, Chiu HJ, Deacon AM, Elsliger MA, Feuerhelm J, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Haugen J, Hornsby M, Klock HE, Knuth MW, Koesema E, Kreusch A, Kuhn P, Lesley SA, Moy K, Nigoghossian E, Okach L, Paulsen J, Quijano K, Reyes R, Rife C, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a single-stranded DNA-binding protein (TM0604) from Thermotoga maritima at 2.60 Å resolution. Proteins 2006; 63:256-60. [PMID: 16435371 DOI: 10.1002/prot.20841] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]
|
216
|
Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006; 22:1658-9. [PMID: 16731699 DOI: 10.1093/bioinformatics/btl158] [Citation(s) in RCA: 6572] [Impact Index Per Article: 365.1] [Indexed: 12/31/2022]
Abstract
MOTIVATION: In 2001 and 2002, we published two papers (Bioinformatics, 17, 282-283, Bioinformatics, 18, 77-82) describing an ultrafast protein sequence clustering program called cd-hit. This program can efficiently cluster a huge protein database with millions of sequences. However, the applications of the underlying algorithm are not limited to protein sequence clustering. Here we present several new programs using the same algorithm, including cd-hit-2d, cd-hit-est and cd-hit-est-2d. Cd-hit-2d compares two protein datasets and reports similar matches between them; cd-hit-est clusters a DNA/RNA sequence database and cd-hit-est-2d compares two nucleotide datasets. All these programs can handle huge datasets with millions of sequences and can be hundreds of times faster than methods based on the popular sequence comparison and database search tools, such as BLAST.
|
217
|
Rife C, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of the global regulatory protein CsrA from Pseudomonas putida at 2.05 Å resolution reveals a new fold. Proteins 2006; 61:449-53. [PMID: 16104018 DOI: 10.1002/prot.20502] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 11/11/2022]
|
218
|
Xu Q, Schwarzenbacher R, McMullan D, Abdubek P, Agarwalla S, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Rife C, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, White A, Wolf G, Hodgson KO, Wooley J, Wilson IA. Crystal structure of virulence factor CJ0248 from Campylobacter jejuni at 2.25 Å resolution reveals a new fold. Proteins 2006; 62:292-6. [PMID: 16287129 DOI: 10.1002/prot.20611] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
|
219
|
Rife C, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a putative modulator of DNA gyrase (pmbA) from Thermotoga maritima at 1.95 Å resolution reveals a new fold. Proteins 2006; 61:444-8. [PMID: 16104019 DOI: 10.1002/prot.20468] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
|
220
|
Klock HE, Schwarzenbacher R, Xu Q, McMullan D, Abdubek P, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Koesema E, Kreusch A, Kuhn P, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Rife C, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, White A, Wolf G, Hodgson KO, Wooley J, Lesley SA, Wilson IA. Crystal structure of a conserved hypothetical protein (gi: 13879369) from Mouse at 1.90 Å resolution reveals a new fold. Proteins 2006; 61:1132-6. [PMID: 16224779 DOI: 10.1002/prot.20610] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
|
221
|
Shiryaev S, Ratnikov B, Chekanov A, Sikora S, Rozanov D, Godzik A, Wang J, Smith J, Huang Z, Lindberg I, Samuel M, Diamond M, Strongin A. Cleavage targets and the D-arginine-based inhibitors of the West Nile virus NS3 processing proteinase. Biochem J 2006; 393:503-11. [PMID: 16229682 PMCID: PMC1360700 DOI: 10.1042/bj20051374] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Indexed: 11/17/2022]
Abstract
Mosquito-borne WNV (West Nile virus) is an emerging global threat. The NS3 proteinase, which is essential for the proteolytic processing of the viral polyprotein precursor, is a promising drug target. We have isolated and biochemically characterized the recombinant, highly active NS3 proteinase. We have determined that the NS3 proteinase functions in a manner that is distantly similar to furin in cleaving the peptide and protein substrates. We determined that aprotinin and D-arginine-based 9-12-mer peptides are potent inhibitors of WNV NS3 with K(i) values of 26 nM and 1 nM respectively. Consistent with the essential role of NS3 activity in the life cycle of WNV and with the sensitivity of NS3 activity to the D-arginine-based peptides, we showed that nona-D-Arg-NH2 reduced WNV infection in primary neurons. We have also shown that myelin basic protein, a deficiency of which is linked to neurological abnormalities of the brain, is sensitive to NS3 proteolysis in vitro and therefore this protein represents a convenient test substrate for the studies of NS3. A three-dimensional model of WNV NS3 that we created may provide a structural guidance and a rationale for the subsequent design of fine-tuned inhibitors. Overall, our findings represent a foundation for in-depth mechanistic and structural studies as well as for the design of novel and efficient inhibitors of WNV NS3.
|
222
|
Levin I, Miller MD, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Biorac T, Cambell J, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Karlak C, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Morse A, Moy K, Nigoghossian E, Ouyang J, Page R, Quijano K, Reyes R, Robb A, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, Wang X, West B, Wolf G, Xu Q, Zagnitko O, Hodgson KO, Wooley J, Wilson IA. Crystal structure of an indigoidine synthase A (IndA)-like protein (TM1464) from Thermotoga maritima at 1.90 Å resolution reveals a new fold. Proteins 2006; 59:864-8. [PMID: 15822122 DOI: 10.1002/prot.20420] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
|
223
|
Han GW, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Dai X, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, McPhillips TM, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of an Apo mRNA decapping enzyme (DcpS) from Mouse at 1.83 Å resolution. Proteins 2006; 60:797-802. [PMID: 16001405 DOI: 10.1002/prot.20467] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
|
224
|
Li Z, Ye Y, Godzik A. Flexible Structural Neighborhood--a database of protein structural similarities and alignments. Nucleic Acids Res 2006; 34:D277-80. [PMID: 16381864 PMCID: PMC1347486 DOI: 10.1093/nar/gkj124] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 12/03/2022]
Abstract
Protein structures are flexible, changing their shapes not only upon substrate binding, but also during evolution as a collective effect of mutations, deletions and insertions. A new generation of protein structure comparison algorithms allows for such flexibility; they go beyond identifying the largest common part between two proteins and find hinge regions and patterns of flexibility in protein families. Here we present a Flexible Structural Neighborhood (FSN), a database of structural neighbors of proteins deposited in PDB as seen by a flexible protein structure alignment program FATCAT, developed previously in our group. The database, searchable by a protein PDB code, provides lists of proteins with statistically significant structural similarity and on lower menu levels provides detailed alignments, interactive superposition of structures and positions of hinges that were identified in the comparison. While superficially similar to other structural protein alignment resources, FSN provides a unique resource to study not only protein structural similarity, but also how protein structures change. FSN is available from a server and by direct links from the PDB database.
|
225
|
Newman RM, Salunkhe P, Godzik A, Reed JC. Identification and characterization of a novel bacterial virulence factor that shares homology with mammalian Toll/interleukin-1 receptor family proteins. Infect Immun 2006; 74:594-601. [PMID: 16369016 PMCID: PMC1346628 DOI: 10.1128/iai.74.1.594-601.2006] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.7] [Indexed: 11/20/2022]
Abstract
Many important bacterial virulence factors act as mimics of mammalian proteins to subvert normal host cell processes. To identify bacterial protein mimics of components of the innate immune signaling pathway, we searched the bacterial genome database for proteins with homology to the Toll/interleukin-1 receptor (TIR) domain of the mammalian Toll-like receptors (TLRs) and their adaptor proteins. A previously uncharacterized gene, which we have named tlpA (for TIR-like protein A), was identified in the Salmonella enterica serovar Enteritidis genome that is predicted to encode a protein resembling mammalian TIR domains. We show that overexpression of TlpA in mammalian cells suppresses the ability of mammalian TIR-containing proteins TLR4, IL-1 receptor, and MyD88 to induce the transactivation and DNA-binding activities of NF-kappaB, a downstream target of the TIR signaling pathway. In addition, TlpA mimics the previously characterized Salmonella virulence factor SipB in its ability to induce activation of caspase-1 in a mammalian cell transfection model. Disruption of the chromosomal tlpA gene rendered a virulent serovar Enteritidis strain defective in intracellular survival and IL-1beta secretion in a cell culture infection model using human THP1 macrophages. Bacteria with disrupted tlpA also displayed reduced lethality in mice, further confirming an important role for this factor in pathogenesis. Taken together, our findings demonstrate that the bacterial TIR-like protein TlpA is a novel prokaryotic modulator of NF-kappaB activity and IL-1beta secretion that contributes to serovar Enteritidis virulence.
|
226
|
Robinson-Rechavi M, Alibés A, Godzik A. Contribution of Electrostatic Interactions, Compactness and Quaternary Structure to Protein Thermostability: Lessons from Structural Genomics of Thermotoga maritima. J Mol Biol 2006; 356:547-57. [PMID: 16375925 DOI: 10.1016/j.jmb.2005.11.065] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.6] [Received: 08/23/2005] [Revised: 11/21/2005] [Accepted: 11/21/2005] [Indexed: 10/25/2022]
Abstract
Studies of the structural basis of protein thermostability have produced a confusing picture. Small sets of proteins have been analyzed from a variety of thermophilic species, suggesting different structural features as responsible for protein thermostability. Taking advantage of the recent advances in structural genomics, we have compiled a relatively large protein structure dataset, which was constructed very carefully and selectively; that is, the dataset contains only experimentally determined structures of proteins from one specific organism, the hyperthermophilic bacterium Thermotoga maritima, and those of close homologs from mesophilic bacteria. In contrast to the conclusions of previous studies, our analyses show that oligomerization order, hydrogen bonds, and secondary structure play minor roles in adaptation to hyperthermophily in bacteria. On the other hand, the data exhibit very significant increases in the density of salt-bridges and in compactness for proteins from T.maritima. The latter effect can be measured by contact order or solvent accessibility, and network analysis shows a specific increase in highly connected residues in this thermophile. These features account for changes in 96% of the protein pairs studied. Our results provide a clear picture of protein thermostability in one species, and a framework for future studies of thermal adaptation.
|
227
|
Li W, Godzik A. VISSA: a program to visualize structural features from structure sequence alignment. Bioinformatics 2006; 22:887-8. [PMID: 16434438 DOI: 10.1093/bioinformatics/btl019] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Multiple sequence alignment is an important tool to understand and analyze functions of homologous proteins. However, the logic of residue conservation/variation is usually apparent only in three-dimensional (3D) space, not on a primary sequence level. Thus, in a traditional multiple alignment it is often difficult to directly visualize and analyze key residues because they are masked by other residues along the alignment. Here we present an integrated multiple alignment and 3D structure visualization program that can (1) map and highlight residues from a 1D alignment onto a 3D structure and vice versa and (2) display only the alignment of preselected, key residues. This program, called Visualize Structure Sequence Alignment, also has many other built-in tools that can help analyze multiple sequence alignments. AVAILABILITY: http://bioinformatics.burnham.org/liwz/vissa CONTACT: liwz@burnham.org.
|
228
|
Ye Y, Li Z, Godzik A. Modeling and analyzing three-dimensional structures of human disease proteins. Pac Symp Biocomput 2006:439-50. [PMID: 17094259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Three-dimensional structures of proteins, experimental or predicted, show us how these molecular machines actually work. With the help of information on disease-related mutations, they can also show us how they malfunction in diseases. Such understanding, currently lacking for most human diseases, is an important first step before designing drugs or therapies to cure specific diseases. Here we used homology modeling to model human disease-related proteins, studied structural characteristics of disease-related mutations, and compared them with non-synonymous SNPs. 1484 domains from 874 proteins were modeled, and together with experimentally determined structures of 369 domains they provided structural coverage of 48% of total residues in 1237 human disease proteins. We found that disease-related mutations have a statistically significant preference to form clusters on protein surfaces. In contrast, the non-synonymous SNPs appear to be randomly distributed on the surface. We interpret these results as an indication that disease mutations affect protein-protein interaction interfaces. This interpretation is supported by the analysis of 8 experimentally determined complexes between disease proteins, where disease-related mutations are clearly located in the binding interface of proteins, while SNPs are not. The non-uniform distribution of disease mutations indicates that we can use this feature as guidance in modeling and evaluating human disease proteins and their complexes. We set up a resource for Disease Protein Models (DPM at http://ffas.burnham.org/DPM), which can be used for studying the relation between disease and mutation/polymorphism sites in the context of protein 3D structures and complexes.
|
229
|
Plewczynski D, Tkacz A, Wyrwicz LS, Godzik A, Kloczkowski A, Rychlewski L. Support-vector-machine classification of linear functional motifs in proteins. J Mol Model 2005; 12:453-61. [PMID: 16341901 DOI: 10.1007/s00894-005-0070-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 02/03/2005] [Accepted: 10/18/2005] [Indexed: 10/25/2022]
Abstract
Our algorithm predicts short linear functional motifs in proteins using only sequence information. Statistical models for short linear functional motifs in proteins are built using the database of short sequence fragments taken from proteins in the current release of the Swiss-Prot database. Those segments are confirmed by experiments to have single-residue post-translational modification. The sensitivities of the classification for various types of short linear motifs are in the range of 70%. The query protein sequence is dissected into short overlapping fragments. All segments are represented as vectors. Each vector is then classified by a machine learning algorithm (Support Vector Machine) as potentially modifiable or not. The resulting list of plausible post-translational sites in the query protein is returned to the user. We also present a study of the human protein kinase C family as a biological application of our method.
|
230
|
Robinson-Rechavi M, Godzik A. Structural genomics of Thermotoga maritima proteins shows that contact order is a major determinant of protein thermostability. Structure 2005; 13:857-60. [PMID: 15939017 DOI: 10.1016/j.str.2005.03.011] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Received: 12/20/2004] [Revised: 03/17/2005] [Accepted: 03/17/2005] [Indexed: 11/29/2022]
Abstract
Despite numerous studies, understanding the structural basis of protein stability in thermophilic organisms has remained elusive. The main reasons are the limited number of thermostable protein structures available for analysis and the difficulty of identifying relevant features to compare. Notably, an intuitive feeling of "compactness" of thermostable proteins has eluded quantification. With the unprecedented opportunity to assemble a data set for comparative analyses due to the recent advances in structural genomics, we can now revisit this issue and focus on experimentally determined structures of proteins from the hyperthermophilic bacterium Thermotoga maritima. We find that 73% of T. maritima proteins have higher contact order than their mesophilic homologs. Thus, contact order, a structural feature that was originally introduced to explain differences in folding rates of different protein families, is a significant parameter that can now be correlated with thermostability.
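The abstract does not spell out how contact order is computed. A minimal sketch, assuming the commonly used relative contact order (mean sequence separation of contacting residue pairs, divided by chain length), is given below. The function name, the toy contact list, and the choice of this particular variant are illustrative assumptions, not the authors' code.

```python
from typing import Iterable, Tuple


def relative_contact_order(contacts: Iterable[Tuple[int, int]], chain_length: int) -> float:
    """Mean sequence separation of contacting residue pairs, normalized by chain length.

    `contacts` holds (i, j) residue-index pairs already judged to be in contact;
    how contacts are detected (distance cutoff, atom selection) is left to the caller.
    """
    separations = [abs(i - j) for i, j in contacts]
    if not separations:
        raise ValueError("at least one contact is required")
    if chain_length <= 0:
        raise ValueError("chain_length must be positive")
    return sum(separations) / (len(separations) * chain_length)


# Hypothetical toy input: three contacts in a 20-residue chain.
toy_contacts = [(1, 4), (2, 10), (5, 20)]
rco = relative_contact_order(toy_contacts, chain_length=20)
```

A higher value means that contacts are, on average, formed between residues farther apart in sequence, which is the sense in which the abstract compares T. maritima proteins with their mesophilic homologs.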
|
231
|
Jaroszewski L, Schwarzenbacher R, McMullan D, Abdubek P, Agarwalla S, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Miller MD, Moy K, Nigoghossian E, Paulsen J, Quijano K, Reyes R, Rife C, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of Hsp33 chaperone (TM1394) from Thermotoga maritima at 2.20 Å resolution. Proteins 2005; 61:669-73. [PMID: 16167343 DOI: 10.1002/prot.20542] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
|
232
|
Sikora S, Strongin A, Godzik A. Convergent evolution as a mechanism for pathogenic adaptation. Trends Microbiol 2005; 13:522-7. [PMID: 16153847 DOI: 10.1016/j.tim.2005.08.010] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Received: 03/28/2005] [Revised: 08/11/2005] [Accepted: 08/30/2005] [Indexed: 11/26/2022]
Abstract
The survival of human pathogens depends on their ability to modulate defence pathways in human host cells. This was thought to be attained mainly by pathogen specific "virulence factors". However, pathogens are increasingly being discovered that use distant homologs of the human regulatory proteins as virulence factors. We analyzed several cases of this approach, with a particular focus on virulence proteases. The analysis reveals clear cases of bacterial proteases mimicking the specificity of their human counterparts, such as strong similarities in their active and/or binding sites. With more sensitive tools for distant homology recognition, we could expect to discover many more such cases.
|
233
|
Elsliger MA, Deacon AM, Godzik A, Kuhn P, Lesley SA, Stevens RC, Hodgson KO, Wooley J, Wilson IA. The Joint Center for Structural Genomics: a multi-tiered approach to structural genomics. Acta Crystallogr A 2005. [DOI: 10.1107/s0108767305089014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
234
|
Joseph J, Brooun A, Neuman B, Abola E, Stevens J, Saikatendu K, Johnson M, Recht M, Kraus M, Nelson M, Burrer R, Coon S, Subramanian V, Li W, Godzik A, Wilson I. Functional and structural proteomics of SARS: defining a rational response to emerging diseases. Acta Crystallogr A 2005. [DOI: 10.1107/s0108767305098934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
235
|
Godzik A. A systematic study of flexibility in protein structures and its implications in protein structure prediction. Acta Crystallogr A 2005. [DOI: 10.1107/s0108767305098247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
236
|
Plewczynski D, Jaroszewski L, Godzik A, Kloczkowski A, Rychlewski L. Molecular modeling of phosphorylation sites in proteins using a database of local structure segments. J Mol Model 2005; 11:431-8. [PMID: 16094535 DOI: 10.1007/s00894-005-0235-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 05/12/2004] [Accepted: 11/30/2004] [Indexed: 11/29/2022]
Abstract
A new bioinformatics tool for molecular modeling of the local structure around phosphorylation sites in proteins has been developed. Our method is based on a library of short sequence and structure motifs. The basic structural elements to be predicted are local structure segments (LSSs). This enables us to avoid the problem of non-exact local description of structures, caused by either diversity in the structural context, or uncertainties in prediction methods. We have developed a library of LSSs and a profile–profile matching algorithm that predicts local structures of proteins from their sequence information. Our fragment library prediction method is publicly available on a server (FRAGlib), at http://ffas.ljcrf.edu/Servers/frag.html. The algorithm has been applied successfully to the characterization of local structure around phosphorylation sites in proteins. Our computational predictions of sequence and structure preferences around phosphorylated residues have been confirmed by phosphorylation experiments for PKA and PKC kinases. The quality of predictions has been evaluated with several independent statistical tests. We have observed a significant improvement in the accuracy of predictions by incorporating structural information into the description of the neighborhood of the phosphorylated site. Our results strongly suggest that sequence information ought to be supplemented with additional structural context information (predicted with our segment similarity method) for more successful predictions of phosphorylation sites in proteins.
|
237
|
Jaroszewski L, Rychlewski L, Li Z, Li W, Godzik A. FFAS03: a server for profile–profile sequence alignments. Nucleic Acids Res 2005; 33:W284-8. [PMID: 15980471 PMCID: PMC1160179 DOI: 10.1093/nar/gki418] [Citation(s) in RCA: 456] [Impact Index Per Article: 24.0] [Indexed: 11/22/2022] Open
Abstract
The FFAS03 server provides a web interface to the third generation of the profile–profile alignment and fold-recognition algorithm of fold and function assignment system (FFAS) [L. Rychlewski, L. Jaroszewski, W. Li and A. Godzik (2000), Protein Sci., 9, 232–241]. Profile–profile algorithms use information present in sequences of homologous proteins to amplify the patterns defining the family. As a result, they enable detection of remote homologies beyond the reach of other methods. FFAS, initially developed in 2000, is consistently one of the best ranked fold prediction methods in the CAFASP and LiveBench competitions. It is also used by several fold-recognition consensus methods and meta-servers. The FFAS03 server accepts a user supplied protein sequence and automatically generates a profile, which is then compared with several sets of sequence profiles of proteins from PDB, COG, PFAM and SCOP. The profile databases used by the server are automatically updated with the latest structural and sequence information. The server provides access to the alignment analysis, multiple alignment, and comparative modeling tools. Access to the server is open for both academic and commercial researchers. The FFAS03 server is available at .
|
238
|
Abstract
The Fragnostic (http://ffas.burnham.org/Fragnostic) web tool implements a novel and useful view of protein structure space. We mined a non-redundant subset of the PDB for common fragments shared between proteins inhabiting different SCOP folds. Subsequently, we formulated an inter-fold similarity measure based on fragment sharing. Fold space is described as a graph whose nodes are folds between which the edges are drawn depending on the extent of fragment sharing. In this fashion, Fragnostic helps discover meaningful relationships between proteins belonging to different folds, based on sharing similar fragments in the proteins comprising those folds. Distant fold similarity information is supplemented by annotations taken from Gene Ontology, SCOP and CATH. Overall, Fragnostic is a tool which helps discover structural and functional relationships between proteins which are distantly related or seemingly unrelated.
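The graph view of fold space described in this abstract can be illustrated with a small sketch. The abstract does not give the inter-fold similarity measure, so the version below simply counts shared fragment identifiers and draws an edge when the count reaches a threshold; the function name, the `min_shared` parameter, and the toy fold and fragment labels are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def build_fold_graph(fold_fragments: Dict[str, Set[str]], min_shared: int = 1) -> Dict[Tuple[str, str], int]:
    """Return edges between folds, weighted by the number of fragments they share.

    Nodes are the keys of `fold_fragments`; an edge (a, b) exists only if the two
    folds share at least `min_shared` fragment identifiers.
    """
    edges: Dict[Tuple[str, str], int] = {}
    for fold_a, fold_b in combinations(sorted(fold_fragments), 2):
        shared = fold_fragments[fold_a] & fold_fragments[fold_b]
        if len(shared) >= min_shared:
            edges[(fold_a, fold_b)] = len(shared)
    return edges


# Hypothetical toy data: fragment identifiers observed in each fold.
toy_folds = {
    "fold_A": {"f1", "f2", "f3"},
    "fold_B": {"f2", "f3", "f9"},
    "fold_C": {"f7"},
}
fold_graph = build_fold_graph(toy_folds, min_shared=2)
```

In this toy case only fold_A and fold_B are connected, since they share two fragments and fold_C shares none with either.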
|
239
|
Mathews I, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Axelrod H, Biorac T, Canaves JM, Chiu HJ, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Levin I, Miller MD, Moy K, Nigoghossian E, Ouyang J, Paulsen J, Quijano K, Reyes R, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, White A, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of S-adenosylmethionine:tRNA ribosyltransferase-isomerase (QueA) from Thermotoga maritima at 2.0 Å resolution reveals a new fold. Proteins 2005; 59:869-74. [PMID: 15822125 DOI: 10.1002/prot.20419] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/08/2022]
|
240
|
Ginalski K, Grishin NV, Godzik A, Rychlewski L. Practical lessons from protein structure prediction. Nucleic Acids Res 2005; 33:1874-91. [PMID: 15805122 PMCID: PMC1074308 DOI: 10.1093/nar/gki327] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.2] [Indexed: 11/25/2022] Open
Abstract
Despite recent efforts to develop automated protein structure determination protocols, structural genomics projects are slow in generating fold assignments for complete proteomes, and spatial structures remain unknown for many protein families. Alternative cheap and fast methods to assign folds using prediction algorithms continue to provide valuable structural information for many proteins. The development of high-quality prediction methods has been boosted in the last years by objective community-wide assessment experiments. This paper gives an overview of the currently available practical approaches to protein structure prediction capable of generating accurate fold assignment. Recent advances in assessment of the prediction quality are also discussed.
|
241
|
Abstract
MOTIVATION: Existing comparisons of protein structures are not able to describe structural divergence and flexibility in the structures being compared because they focus on identifying a common invariant core and ignore parts of the structures outside this core. Understanding the structural divergence and flexibility is critical for studying the evolution of functions and specificities of proteins. RESULTS: A new method of multiple protein structure alignment, POSA (Partial Order Structure Alignment), was developed using a partial order graph representation of multiple alignments. POSA has two unique features: (1) it identifies and classifies regions that are conserved only in a subset of input structures and (2) it allows internal rearrangements in protein structures. POSA outperforms other programs in cases where structural flexibility exists and provides new insights by visualizing the mosaic nature of multiple structural alignments. POSA is an ideal tool for studying the variation of protein structures within diverse structural families. AVAILABILITY: POSA is freely available for academic users on a Web server at http://fatcat.burnham.org/POSA
|
242
|
Miller MD, Schwarzenbacher R, von Delft F, Abdubek P, Ambing E, Biorac T, Brinen LS, Canaves JM, Cambell J, Chiu HJ, Dai X, Deacon AM, DiDonato M, Elsliger MA, Eshagi S, Floyd R, Godzik A, Grittini C, Grzechnik SK, Hampton E, Jaroszewski L, Karlak C, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, Lesley SA, Levin I, McMullan D, McPhillips TM, Morse A, Moy K, Ouyang J, Page R, Quijano K, Robb A, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, Wang X, West B, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a tandem cystathionine-beta-synthase (CBS) domain protein (TM0935) from Thermotoga maritima at 1.87 Å resolution. Proteins 2005; 57:213-7. [PMID: 15326606 DOI: 10.1002/prot.20024] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 11/09/2022]
|
243
|
Abstract
We have recently developed a flexible protein structure alignment program (FATCAT) that identifies structural similarity, at the same time accounting for flexibility of protein structures. One of the most important applications of a structure alignment method is to aid in functional annotations by identifying similar structures in large structural databases. However, none of the flexible structure alignment methods were applied in this task because of a lack of significance estimation of flexible alignments. In this paper, we developed an estimate of the statistical significance of FATCAT alignment score, allowing us to use it as a database-searching tool. The results reported here show that (1) the distribution of the similarity score of FATCAT alignment between two unrelated protein structures follows the extreme value distribution (EVD), adding one more example to the current collection of EVDs of sequence and structure similarities; (2) introducing flexibility into structure comparison only slightly influences the sensitivity and specificity of identifying similar structures; and (3) the overall performance of FATCAT as a database searching tool is comparable to that of the widely used rigid-body structure comparison programs DALI and CE. Two examples illustrating the advantages of using flexible structure alignments in database searching are also presented. The conformational flexibilities that were detected in the first example may be involved with substrate specificity, and the conformational flexibilities detected in the second example may reflect the evolution of structures by block building.
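To make the significance estimate concrete: if unrelated-pair scores follow an extreme value (Gumbel) distribution, the probability of seeing a score at least as high as x by chance is P(S >= x) = 1 - exp(-exp(-(x - mu) / beta)). The sketch below evaluates that expression; the location `mu` and scale `beta` would have to be fitted to scores of unrelated structure pairs, and the numbers used here are placeholders rather than fitted FATCAT parameters.

```python
import math


def evd_p_value(score: float, mu: float, beta: float) -> float:
    """Right-tail probability of a Gumbel (type I extreme value) distribution.

    mu is the location and beta the scale parameter; both are assumed to have been
    fitted beforehand to similarity scores between unrelated structures.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    z = (score - mu) / beta
    # -expm1(-t) equals 1 - exp(-t) but stays accurate when t is tiny (very high scores).
    return -math.expm1(-math.exp(-z))


# Placeholder parameters, for illustration only.
p_at_mode = evd_p_value(10.0, mu=10.0, beta=2.0)
p_high_score = evd_p_value(30.0, mu=10.0, beta=2.0)
```

A database search would then rank hits by this P-value (or a derived E-value) instead of the raw alignment score.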
|
244
|
Xu Q, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Biorac T, Canaves JM, Chiu HJ, Dai X, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hampton E, Hornsby M, Jaroszewski L, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, Levin I, Miller MD, Morse A, Moy K, Ouyang J, Page R, Quijano K, Reyes R, Robb A, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, von Delft F, Wang X, West B, White A, Wolf G, Zagnitko O, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a formiminotetrahydrofolate cyclodeaminase (TM1560) from Thermotoga maritima at 2.80 Å resolution reveals a new fold. Proteins 2005; 58:976-81. [PMID: 15651027 DOI: 10.1002/prot.20364] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
|
245
|
Plewczyński D, Tkacz A, Godzik A, Rychlewski L. A support vector machine approach to the identification of phosphorylation sites. Cell Mol Biol Lett 2005; 10:73-89. [PMID: 15809681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/02/2023] Open
Abstract
We describe a bioinformatics tool that can be used to predict the position of phosphorylation sites in proteins based only on sequence information. The method uses the support vector machine (SVM) statistical learning theory. The statistical models for phosphorylation by various types of kinases are built using a dataset of short (9-amino acid long) sequence fragments. The sequence segments are dissected around post-translationally modified sites of proteins that are in the current release of the Swiss-Prot database, and that were experimentally confirmed to be phosphorylated by any kinase. We represent them as vectors in a multidimensional abstract space of short sequence fragments. The prediction method is as follows. First, a given query protein sequence is dissected into overlapping short segments. All the fragments are then projected into the multidimensional space of sequence fragments via a collection of different representations. Those points are classified with pre-built statistical models (the SVM method with linear, polynomial and radial kernel functions) as either phosphorylated or inactive. The resulting list of plausible sites for phosphorylation by various types of kinases in the query protein is returned to the user. The efficiency of the method for each type of phosphorylation is estimated using leave-one-out tests and presented here. The sensitivities of the models can reach over 70%, depending on the type of kinase. The additional information from profile representations of short sequence fragments helps in gaining a higher degree of accuracy in some phosphorylation types. The further development of an automatic phosphorylation site annotation predictor based on our algorithm should yield a significant improvement when using statistical algorithms in order to quantify the results.
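The first two steps of the prediction procedure (dissecting the query into overlapping 9-residue segments and turning each into a vector) can be sketched as follows. The abstract mentions "a collection of different representations" without listing them, so the one-hot encoding here is an assumed stand-in, and the helper names and the example sequence are hypothetical. The SVM classification step itself is not shown.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def dissect(sequence: str, window: int = 9) -> List[Tuple[int, str]]:
    """Slide a window along the sequence; return (0-based center index, fragment) pairs."""
    half = window // 2
    return [
        (start + half, sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]


def one_hot(fragment: str) -> List[float]:
    """Encode a fragment as a flat vector with 20 slots per residue (assumed representation)."""
    vector: List[float] = []
    for residue in fragment:
        vector.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vector


# Hypothetical 13-residue query sequence.
fragments = dissect("MKTAYIAKQRSLG")
vectors = [one_hot(fragment) for _, fragment in fragments]
```

Each vector would then be passed to a trained classifier, and the center indices of fragments classified as positive give the candidate phosphorylation sites reported to the user.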
|
246
|
Arndt JW, Schwarzenbacher R, Page R, Abdubek P, Ambing E, Biorac T, Canaves JM, Chiu HJ, Dai X, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hale J, Hampton E, Han GW, Haugen J, Hornsby M, Klock HE, Koesema E, Kreusch A, Kuhn P, Jaroszewski L, Lesley SA, Levin I, McMullan D, McPhillips TM, Miller MD, Morse A, Moy K, Nigoghossian E, Ouyang J, Peti WS, Quijano K, Reyes R, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, von Delft F, Wang X, West B, White A, Wolf G, Xu Q, Zagnitko O, Hodgson KO, Wooley J, Wilson IA. Crystal structure of an α/β serine hydrolase (YDR428C) from Saccharomyces cerevisiae at 1.85 Å resolution. Proteins 2004; 58:755-8. [PMID: 15624212 DOI: 10.1002/prot.20336] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
|
247
|
McMullan D, Schwarzenbacher R, Jaroszewski L, von Delft F, Klock HE, Vincent J, Quijano K, Abdubek P, Ambing E, Biorac T, Brinen LS, Canaves JM, Dai X, Deacon AM, DiDonato M, Elsliger MA, Eshaghi S, Floyd R, Godzik A, Grittini C, Grzechnik SK, Hampton E, Karlak C, Koesema E, Kreusch A, Kuhn P, Levin I, McPhillips TM, Miller MD, Morse A, Moy K, Ouyang J, Page R, Reyes R, Rezezadeh F, Robb A, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Wang X, West B, Wolf G, Xu Q, Hodgson KO, Wooley J, Lesley SA, Wilson IA. Crystal structure of a novel Thermotoga maritima enzyme (TM1112) from the cupin family at 1.83 Å resolution. Proteins 2004; 56:615-8. [PMID: 15229894 DOI: 10.1002/prot.20139] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
|
248
|
Jaroszewski L, Schwarzenbacher R, von Delft F, McMullan D, Brinen LS, Canaves JM, Dai X, Deacon AM, DiDonato M, Elsliger MA, Eshagi S, Floyd R, Godzik A, Grittini C, Grzechnik SK, Hampton E, Levin I, Karlak C, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, Lesley SA, McPhillips TM, Miller MD, Morse A, Moy K, Ouyang J, Page R, Quijano K, Reyes R, Rezezadeh F, Robb A, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, Wang X, West B, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a novel manganese-containing cupin (TM1459) from Thermotoga maritima at 1.65 Å resolution. Proteins 2004; 56:611-4. [PMID: 15229893 DOI: 10.1002/prot.20130] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
|
249
|
Levin I, Schwarzenbacher R, McMullan D, Abdubek P, Ambing E, Biorac T, Cambell J, Canaves JM, Chiu HJ, Dai X, Deacon AM, DiDonato M, Elsliger MA, Godzik A, Grittini C, Grzechnik SK, Hampton E, Jaroszewski L, Karlak C, Klock HE, Koesema E, Kreusch A, Kuhn P, Lesley SA, McPhillips TM, Miller MD, Morse A, Moy K, Ouyang J, Page R, Quijano K, Reyes R, Robb A, Sims E, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, von Delft F, Wang X, West B, Wolf G, Xu Q, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a putative NADPH-dependent oxidoreductase (GI: 18204011) from mouse at 2.10 Å resolution. Proteins 2004; 56:629-33. [PMID: 15229897 DOI: 10.1002/prot.20163] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/05/2022]
|
250
|
Xu Q, Schwarzenbacher R, McMullan D, von Delft F, Brinen LS, Canaves JM, Dai X, Deacon AM, Elsliger MA, Eshagi S, Floyd R, Godzik A, Grittini C, Grzechnik SK, Jaroszewski L, Karlak C, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, Lesley SA, Levin I, McPhillips TM, Miller MD, Morse A, Moy K, Ouyang J, Page R, Quijano K, Robb A, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, Wang X, West B, Wolf G, Hodgson KO, Wooley J, Wilson IA. Crystal structure of a ribose-5-phosphate isomerase RpiB (TM1080) from Thermotoga maritima at 1.90 Å resolution. Proteins 2004; 56:171-5. [PMID: 15162497 DOI: 10.1002/prot.20129] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
|