301
Alonso A, Rahmouni S, Williams S, van Stipdonk M, Jaroszewski L, Godzik A, Abraham RT, Schoenberger SP, Mustelin T. Tyrosine phosphorylation of VHR phosphatase by ZAP-70. Nat Immunol 2003; 4:44-8. [PMID: 12447358 DOI: 10.1038/ni856] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.0] [Received: 07/30/2002] [Accepted: 10/01/2002] [Indexed: 11/09/2022]
Abstract
The ZAP-70 tyrosine kinase is a key component of the signaling machinery for the T cell antigen receptor (TCR). Whereas recruitment and activation of ZAP-70 are relatively well understood, the proteins phosphorylated by ZAP-70 are incompletely known. We report here that VHR, a Vaccinia virus VH1-related dual-specific protein phosphatase that inactivates the mitogen-activated kinases Erk2 and Jnk, is phosphorylated at Y138 by ZAP-70. Tyr138 phosphorylation was required for VHR to inhibit the Erk2-Elk-1 pathway and, conversely, the VHR(Y138F) mutant augmented TCR-induced Erk2 kinase and activation of the gene encoding interleukin 2. These results suggest that VHR is a target for ZAP-70 and tempers activation of the Erk2 pathway in a ZAP-70-controlled manner.
302
Okazaki Y, Furuno M, Kasukawa T, Adachi J, Bono H, Kondo S, Nikaido I, Osato N, Saito R, Suzuki H, Yamanaka I, Kiyosawa H, Yagi K, Tomaru Y, Hasegawa Y, Nogami A, Schönbach C, Gojobori T, Baldarelli R, Hill DP, Bult C, Hume DA, Quackenbush J, Schriml LM, Kanapin A, Matsuda H, Batalov S, Beisel KW, Blake JA, Bradt D, Brusic V, Chothia C, Corbani LE, Cousins S, Dalla E, Dragani TA, Fletcher CF, Forrest A, Frazer KS, Gaasterland T, Gariboldi M, Gissi C, Godzik A, Gough J, Grimmond S, Gustincich S, Hirokawa N, Jackson IJ, Jarvis ED, Kanai A, Kawaji H, Kawasawa Y, Kedzierski RM, King BL, Konagaya A, Kurochkin IV, Lee Y, Lenhard B, Lyons PA, Maglott DR, Maltais L, Marchionni L, McKenzie L, Miki H, Nagashima T, Numata K, Okido T, Pavan WJ, Pertea G, Pesole G, Petrovsky N, Pillai R, Pontius JU, Qi D, Ramachandran S, Ravasi T, Reed JC, Reed DJ, Reid J, Ring BZ, Ringwald M, Sandelin A, Schneider C, Semple CAM, Setou M, Shimada K, Sultana R, Takenaka Y, Taylor MS, Teasdale RD, Tomita M, Verardo R, Wagner L, Wahlestedt C, Wang Y, Watanabe Y, Wells C, Wilming LG, Wynshaw-Boris A, Yanagisawa M, Yang I, Yang L, Yuan Z, Zavolan M, Zhu Y, Zimmer A, Carninci P, Hayatsu N, Hirozane-Kishikawa T, Konno H, Nakamura M, Sakazume N, Sato K, Shiraki T, Waki K, Kawai J, Aizawa K, Arakawa T, Fukuda S, Hara A, Hashizume W, Imotani K, Ishii Y, Itoh M, Kagawa I, Miyazaki A, Sakai K, Sasaki D, Shibata K, Shinagawa A, Yasunishi A, Yoshino M, Waterston R, Lander ES, Rogers J, Birney E, Hayashizaki Y. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature 2002; 420:563-73. [PMID: 12466851 DOI: 10.1038/nature01266] [Citation(s) in RCA: 1226] [Impact Index Per Article: 55.7] [Received: 09/19/2002] [Accepted: 10/28/2002] [Indexed: 01/10/2023]
Abstract
Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
MESH Headings
- Alternative Splicing/genetics
- Amino Acid Motifs
- Animals
- Chromosomes, Mammalian/genetics
- Cloning, Molecular
- DNA, Complementary/genetics
- Databases, Genetic
- Expressed Sequence Tags
- Genes/genetics
- Genomics/methods
- Humans
- Membrane Proteins/genetics
- Mice/genetics
- Physical Chromosome Mapping
- Protein Structure, Tertiary
- Proteome/chemistry
- Proteome/genetics
- RNA, Antisense/genetics
- RNA, Messenger/analysis
- RNA, Messenger/genetics
- RNA, Untranslated/analysis
- RNA, Untranslated/genetics
- Transcription Initiation Site
- Transcription, Genetic/genetics
303
Kuhn P, Lesley SA, Mathews II, Canaves JM, Brinen LS, Dai X, Deacon AM, Elsliger MA, Eshaghi S, Floyd R, Godzik A, Grittini C, Grzechnik SK, Guda C, Hodgson KO, Jaroszewski L, Karlak C, Klock HE, Koesema E, Kovarik JM, Kreusch AT, McMullan D, McPhillips TM, Miller MA, Miller M, Morse A, Moy K, Ouyang J, Robb A, Rodrigues K, Selby TL, Spraggon G, Stevens RC, Taylor SS, van den Bedem H, Velasquez J, Vincent J, Wang X, West B, Wolf G, Wooley J, Wilson IA. Crystal structure of thy1, a thymidylate synthase complementing protein from Thermotoga maritima at 2.25 A resolution. Proteins 2002; 49:142-5. [PMID: 12211025 DOI: 10.1002/prot.10202] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
304
Fiorentino L, Stehlik C, Oliveira V, Ariza ME, Godzik A, Reed JC. A novel PAAD-containing protein that modulates NF-kappa B induction by cytokines tumor necrosis factor-alpha and interleukin-1beta. J Biol Chem 2002; 277:35333-40. [PMID: 12093792 DOI: 10.1074/jbc.m200446200] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.8] [Indexed: 11/06/2022]
Abstract
PAAD domains are found in diverse proteins of unknown function and are structurally related to a superfamily of protein interaction modules that includes death domains, death effector domains, and Caspase activation and recruitment domains. Using bioinformatics strategies, cDNAs were identified that encode a novel protein of 110 kDa containing a PAAD domain followed by a putative nucleotide-binding (NACHT) domain and several leucine-rich repeat domains. This protein thus resembles Cryopyrin, a protein implicated in hereditary hyperinflammation syndromes, and was termed PAN2 for PAAD and NACHT-containing protein 2. When expressed in HEK293 cells, PAN2 suppressed NF-kappaB induction by the cytokines tumor necrosis factor-alpha (TNFalpha) and interleukin-1beta (IL-1beta), suggesting that this protein operates at a point of convergence in these two cytokine signaling pathways. This PAN2-mediated suppression of NF-kappaB was evident both in reporter gene assays that measured NF-kappaB transcriptional activity and electromobility shift assays that measured NF-kappaB DNA binding activity. PAN2 also suppressed NF-kappaB induction resulting from overexpression of several adapter proteins and protein kinases involved in the TNF or IL-1 receptor signal transduction, including TRAF2, TRAF6, RIP, IRAK2, and NF-kappaB-inducing kinase as well as the IkappaB kinases IKKalpha and IKKbeta. PAN2 also inhibited the cytokine-mediated activation of IKKalpha and IKKbeta as measured by in vitro kinase assays. Furthermore, PAN2 association with IKKalpha was demonstrated by co-immunoprecipitation assays, suggesting a direct effect on the IKK complex. These observations suggest a role for PAN2 in modulating NF-kappaB activity in cells, thus providing the insights into the potential functions of PAAD family proteins and their roles in controlling inflammatory responses.
305
Lesley SA, Kuhn P, Godzik A, Deacon AM, Mathews I, Kreusch A, Spraggon G, Klock HE, McMullan D, Shin T, Vincent J, Robb A, Brinen LS, Miller MD, McPhillips TM, Miller MA, Scheibe D, Canaves JM, Guda C, Jaroszewski L, Selby TL, Elsliger MA, Wooley J, Taylor SS, Hodgson KO, Wilson IA, Schultz PG, Stevens RC. Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline. Proc Natl Acad Sci U S A 2002; 99:11664-9. [PMID: 12193646 PMCID: PMC129326 DOI: 10.1073/pnas.142413399] [Citation(s) in RCA: 357] [Impact Index Per Article: 16.2] [Accepted: 07/11/2002] [Indexed: 11/18/2022]
Abstract
Structural genomics is emerging as a principal approach to define protein structure-function relationships. To apply this approach on a genomic scale, novel methods and technologies must be developed to determine large numbers of structures. We describe the design and implementation of a high-throughput structural genomics pipeline and its application to the proteome of the thermophilic bacterium Thermotoga maritima. By using this pipeline, we successfully cloned and attempted expression of 1,376 of the predicted 1,877 genes (73%) and have identified crystallization conditions for 432 proteins, comprising 23% of the T. maritima proteome. Representative structures from TM0423 glycerol dehydrogenase and TM0449 thymidylate synthase-complementing protein are presented as examples of final outputs from the pipeline.
306
Lipton SA, Choi YB, Takahashi H, Zhang D, Li W, Godzik A, Bankston LA. Cysteine regulation of protein function--as exemplified by NMDA-receptor modulation. Trends Neurosci 2002; 25:474-80. [PMID: 12183209 DOI: 10.1016/s0166-2236(02)02245-2] [Citation(s) in RCA: 298] [Impact Index Per Article: 13.5] [Indexed: 11/17/2022]
Abstract
Until recently cysteine residues, especially those located extracellularly, were thought to be important for metal coordination, catalysis and protein structure by forming disulfide bonds - but they were not thought to regulate protein function. However, this is not the case. Crucial cysteine residues can be involved in modulation of protein activity and signaling events via other reactions of their thiol (sulfhydryl; -SH) groups. These reactions can take several forms, such as redox events (chemical reduction or oxidation), chelation of transition metals (chiefly Zn(2+), Mn(2+) and Cu(2+)) or S-nitrosylation [the catalyzed transfer of a nitric oxide (NO) group to a thiol group]. In several cases, these disparate reactions can compete with one another for the same thiol group on a single cysteine residue, forming a molecular switch composed of a latticework of possible redox, NO or Zn(2+) modifications to control protein function. Thiol-mediated regulation of protein function can also involve reactions of cysteine residues that affect ligand binding allosterically. This article reviews the basis for these molecular cysteine switches, drawing on the NMDA receptor as an exemplary protein, and proposes a molecular model for the action of S-nitrosylation based on recently derived crystal structures.
307
Abstract
Most genome annotation protocols combine ab initio predictions with transcription and homology analyses to produce reliable gene predictions but they often fail to detect many actual genes. Alternative approaches involving more sensitive homology recognition methods are playing an increasingly important role in the next stage of gene discovery. The hunt for new genes is far from over.
308
Li W, Jaroszewski L, Godzik A. Sequence clustering strategies improve remote homology recognitions while reducing search times. Protein Eng Des Sel 2002; 15:643-9. [PMID: 12364578 DOI: 10.1093/protein/15.8.643] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022]
Abstract
Sequence databases are rapidly growing, thereby increasing the coverage of protein sequence space, but this coverage is uneven because most sequencing efforts have concentrated on a small number of organisms. The resulting granularity of sequence space creates many problems for profile-based sequence comparison programs. In this paper, we suggest several strategies that address these problems, and at the same time speed up the searches for homologous proteins and improve the ability of profile methods to recognize distant homologies. One of our strategies combines database clustering, which removes highly redundant sequence, and a two-step PSI-BLAST (PDB-BLAST), which separates sequence spaces of profile composition and space of homology searching. The combination of these strategies improves distant homology recognitions by more than 100%, while using only 10% of the CPU time of the standard PSI-BLAST search. Another method, intermediate profile searches, allows for the exploration of additional search directions that are normally dominated by large protein sub-families within very diverse families. All methods are evaluated with a large fold-recognition benchmark.
309
Abstract
A major bottleneck in comparative modeling is the alignment quality; this is especially true for proteins whose distant relationships could be reliably recognized only by recent advances in fold recognition. The best algorithms excel in recognizing distant homologs but often produce incorrect alignments for over 50% of protein pairs in large fold-prediction benchmarks. The alignments obtained by sequence-sequence or sequence-structure matching algorithms differ significantly from the structural alignments. To study this problem, we developed a simplified method to explicitly enumerate all possible alignments for a pair of proteins. This allowed us to estimate the number of significantly different alignments for a given scoring method that score better than the structural alignment. Using several examples of distantly related proteins, we show that for standard sequence-sequence alignment methods, the number of significantly different alignments is usually large, often about 10^10 alternatives. This distance decreases when the alignment method is improved, but the number is still too large for the brute force enumeration approach. More effective strategies were needed, so we evaluated and compared two well-known approaches for searching the space of suboptimal alignments. We combined their best features and produced a hybrid method, which yielded alignments that surpassed the original alignments for about 50% of protein pairs with minimal computational effort.
310
Kridel SJ, Sawai H, Ratnikov BI, Chen EI, Li W, Godzik A, Strongin AY, Smith JW. A unique substrate binding mode discriminates membrane type-1 matrix metalloproteinase from other matrix metalloproteinases. J Biol Chem 2002; 277:23788-93. [PMID: 11959855 DOI: 10.1074/jbc.m111574200] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.5] [Indexed: 01/09/2023]
Abstract
In our study, we characterized the substrate recognition properties of membrane type-1 matrix metalloproteinase (MT1-MMP; also known as MMP-14), a key enzyme in tumor cell invasion and metastasis. A panel of optimal peptide substrates for MT1-MMP was identified using substrate phage display. The substrates can be segregated into four groups based on their degree of selectivity for MT1-MMP. Substrates with poor selectivity for MT1-MMP are comprised predominately of the Pro-X-X-↓-X(Hy) motif that is recognized by a number of MMPs. Highly selective substrates lack the characteristic Pro at the P(3) position; instead they contain an Arg at the P(4) position. This P(4) Arg is essential for efficient hydrolysis and for selectivity for MT1-MMP. Molecular modeling indicates that the selective substrates adopt a linear conformation that extends along the entire catalytic pocket of MT1-MMP, whereas non-selective substrates are kinked at the conserved P(3) Pro residue. Importantly, the selective substrates can be made non-selective by insertion of a proline kink at P(3), without significantly reducing overall k(cat)/K(m) values. Altogether the study provides a structural basis for selective and non-selective substrate recognition by MT1-MMP. The findings in this report are likely to explain several aspects of MT1-MMP biology.
311
Wu X, Li W, Sharma V, Godzik A, Freeze HH. Cloning and characterization of glucose transporter 11, a novel sugar transporter that is alternatively spliced in various tissues. Mol Genet Metab 2002; 76:37-45. [PMID: 12175779 DOI: 10.1016/s1096-7192(02)00018-5] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
We have cloned and characterized a novel glucose transporter (GLUT11) that is alternatively spliced. The GLUT11 gene maps to chromosome 22q11.2 and consists of 13 exons. The long form (GLUT11-L) cDNA uses 13 exons to produce a protein containing 503 amino acids. The short form of GLUT11 (GLUT-11) cDNA is missing exon 2 and produces a protein of 496 amino acids with a 14 amino acid N-terminal difference compared to the long form. GLUT11 has significant similarity to known GLUTs and contains 12 putative membrane-spanning helices along with sugar transporter signature motifs that have previously been shown to be essential for transport activity. The putative glycosylation site of GLUT11 is present in loop 1. Northern blot analysis showed that GLUT11 mRNA is expressed in a number of tissues and most abundantly in the skeletal muscle and heart. RT-PCR assay showed that GLUT11 is alternatively spliced and the two isoforms are distributed differently in various tissues. Immunofluorescence microscopy demonstrated that GLUT11-L resides on the plasma membrane when overexpressed in HEK293T cells. Western blot analysis revealed that GLUT11-L runs as a broad band of approximately 42 kDa that was converted to a 38 kDa polypeptide by PNGase F digestion. Furthermore, a liposome reconstitution functional assay showed that GLUT11-L has glucose transport activity.
312
Stenner-Liewen F, Liewen H, Zapata JM, Pawlowski K, Godzik A, Reed JC. CADD, a Chlamydia protein that interacts with death receptors. J Biol Chem 2002; 277:9633-6. [PMID: 11805081 DOI: 10.1074/jbc.c100693200] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022]
Abstract
We report here the identification of a bacterial protein capable of interacting with mammalian death receptors in vitro and in vivo. The protein is encoded in the genome of Chlamydia trachomatis and has homologues in other Chlamydia species. This protein, which we refer to as "Chlamydia protein associating with death domains" (CADD), induces apoptosis in a variety of mammalian cell lines when expressed by transient gene transfection. Apoptosis induction can be blocked by Caspase inhibitors, indicating that CADD triggers cell death by engaging the host apoptotic machinery. CADD interacts with death domains of tumor necrosis factor (TNF) family receptors TNFR1, Fas, DR4, and DR5 but not with the respective downstream adaptors. In infected epithelial cells, CADD is expressed late in the infectious cycle of C. trachomatis and co-localizes with Fas in the proximity of the inclusion body. The results suggest a role for CADD modulating the apoptosis pathways of cells infected, revealing a new mechanism of host-pathogen interaction.
MESH Headings
- Amino Acid Sequence
- Animals
- Antigens, CD/metabolism
- Apoptosis
- Bacterial Proteins
- COS Cells
- Cell Line
- Chlamydia/genetics
- Chlamydia muridarum/genetics
- Chlamydia trachomatis/metabolism
- Cloning, Molecular
- DNA, Complementary/metabolism
- Fungal Proteins/chemistry
- Fungal Proteins/genetics
- Fungal Proteins/metabolism
- Glutathione Transferase/metabolism
- HeLa Cells
- Humans
- Immunoblotting
- Mice
- Mice, Inbred BALB C
- Microscopy, Fluorescence
- Molecular Sequence Data
- Plasmids/metabolism
- Precipitin Tests
- Protein Binding
- Protein Structure, Tertiary
- Receptors, TNF-Related Apoptosis-Inducing Ligand
- Receptors, Tumor Necrosis Factor/metabolism
- Receptors, Tumor Necrosis Factor, Type I
- Recombinant Fusion Proteins/metabolism
- Reverse Transcriptase Polymerase Chain Reaction
- Sequence Homology, Amino Acid
- Species Specificity
- Transfection
- Tumor Cells, Cultured
- fas Receptor/metabolism
313
Roth W, Stenner-Liewen F, Pawlowski K, Godzik A, Reed JC. Identification and characterization of DEDD2, a death effector domain-containing protein. J Biol Chem 2002; 277:7501-8. [PMID: 11741985 DOI: 10.1074/jbc.m110749200] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.4] [Indexed: 12/20/2022]
Abstract
A novel Death Effector Domain-containing protein was identified, DEDD2, which is closest in amino acid sequence homology to death effector domain-containing DNA-binding protein, DEDD. DEDD2 mRNA is expressed widely in adult human tissues with highest levels in liver, kidney, and peripheral blood leukocytes. DEDD2 interacts with FLIP, but not with Fas-associated death domain (FADD) or caspase-8. Overexpression of DEDD2 induces moderate apoptosis and results in substantial sensitization to apoptosis induced by Fas (CD95/APO-1), tumor necrosis factor-related apoptosis-inducing ligand (TRAIL, Apo2L), or FADD. In contrast, Bax- or staurosporine-mediated cell death is not affected by expression of DEDD2. Fluorescence microscopy showed that overexpressed DEDD2 translocates to the nucleus, which is dependent on the presence of a bipartite nuclear localization signal in the DEDD2 protein. Mutagenesis studies revealed that the translocation of the DED of DEDD2 to the nucleus is essential for its pro-apoptotic activity. These findings suggest that DEDD2 is involved in the regulation of nuclear events mediated by the extrinsic apoptosis pathway.
314
Alonso A, Merlo JJ, Na S, Kholod N, Jaroszewski L, Kharitonenkov A, Williams S, Godzik A, Posada JD, Mustelin T. Inhibition of T cell antigen receptor signaling by VHR-related MKPX (VHX), a new dual specificity phosphatase related to VH1 related (VHR). J Biol Chem 2002; 277:5524-8. [PMID: 11733513 DOI: 10.1074/jbc.m107653200] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.2] [Indexed: 11/06/2022]
Abstract
A cDNA encoding a novel, human, dual-specific protein phosphatase was identified in the Incyte data base. The open reading frame predicted a protein of 184 amino acids related to the Vaccinia virus VH1 and human VH1-related (VHR) phosphatases. Expression VHR-related MKPX (VHX) was highest in thymus, but also detectable in monocytes and lymphocytes. A VHX-specific antiserum detected a protein with an apparent molecular mass of 19 kDa in many cells, including T lymphocytes and monocytes. VHX expression was not induced by T cell activation, but decreased somewhat at later time points. In vitro, VHX dephosphorylated the Erk2 mitogen-activated protein kinase with faster kinetics than did VHR, which is thought to be specific for Erk1 and 2. When expressed in Jurkat T cells, VHX had the capacity to suppress T cell antigen receptor-induced activation of Erk2 and of an NFAT/AP-1 luciferase reporter, but not an NF-kappaB reporter. Thus, VHX is a new member of the VH1/VHR group of small dual-specific phosphatases that act in mitogen-activated protein kinase signaling pathways.
315
Chen EI, Kridel SJ, Howard EW, Li W, Godzik A, Smith JW. A unique substrate recognition profile for matrix metalloproteinase-2. J Biol Chem 2002; 277:4485-91. [PMID: 11694539 DOI: 10.1074/jbc.m109469200] [Citation(s) in RCA: 94] [Impact Index Per Article: 4.3] [Indexed: 11/06/2022]
Abstract
The catalytic domains of the matrix metalloproteinases (MMPs) are structurally homologous, raising questions as to the degree of distinction, or overlap, in substrate recognition. The primary objective of the present study was to define the substrate recognition profile of MMP-2, a protease that was historically referred to as gelatinase A. By cleaving a phage peptide library with recombinant MMP-2, four distinct sets of substrates were identified. The first set is structurally related to substrates previously reported for other MMPs. These substrates contain the PXX/X(Hy) consensus motif (where X(Hy) is a hydrophobic residue) and are not generally selective for MMP-2 over the other MMPs tested. Two other groups of substrates were selected from the phage library with similar frequency. Substrates in group II contain the L/IXX/X(Hy) consensus motif. Substrates in group III contain a consensus motif with a sequence of X(Hy)SX/L, and the fourth set of substrates contain the HXX/X(Hy) sequence. Substrates in Group II, III, and IV were found to be 8- to almost 200-fold more selective for MMP-2 over MMP-9. To gain an understanding of the structural basis for substrate selectivity, individual residues within substrates were mutated, revealing that the P(2) residue is a key element in conferring selectivity. These findings indicate that MMP-2 and MMP-9 exhibit different substrate recognition profiles and point to the P(2) subsite as a primary determinant in substrate distinction.
316
Li W, Jaroszewski L, Godzik A. Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics 2002; 18:77-82. [PMID: 11836214 DOI: 10.1093/bioinformatics/18.1.77] [Citation(s) in RCA: 318] [Impact Index Per Article: 14.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Sequence clustering replaces groups of similar sequences in a database with single representatives. Clustering large protein databases like the NCBI Non-Redundant database (NR) using even the best currently available clustering algorithms is very time-consuming and only practical at relatively high sequence identity thresholds. Our previous program, CD-HI, clustered NR at 90% identity in approximately 1 h and at 75% identity in approximately 1 day on a 1 GHz Linux PC (Li et al., Bioinformatics, 17, 282, 2001); however even faster clustering speed is needed because the size of protein databases are rapidly growing and many applications desire a lower attainable thresholds. RESULTS: For our previous algorithm (CD-HI), we have employed short-word filters to speed up the clustering. In this paper, we show that tolerating some redundancy makes for more efficient use of these short-word filters and increases the program's speed 100 times. Our new program implements this technique and clusters NR at 70% identity within 2 h, and at 50% identity in approximately 5 days. Although some redundancy is present after clustering, our new program's results only differ from our previous program's by less than 0.4%.
317
Miyazaki T, Kim HR, Godzik A, Krajewski S, Reed JC. Response to 'Interaction of DAP3 and FADD only after cellular disruption'. Nat Immunol 2002. [DOI: 10.1038/ni0102-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
318
Pathan N, Marusawa H, Krajewska M, Matsuzawa S, Kim H, Okada K, Torii S, Kitada S, Krajewski S, Welsh K, Pio F, Godzik A, Reed JC. TUCAN, an antiapoptotic caspase-associated recruitment domain family protein overexpressed in cancer. J Biol Chem 2001; 276:32220-9. [PMID: 11408476 DOI: 10.1074/jbc.m100433200] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.6] [Indexed: 11/06/2022]
Abstract
Caspase-associated recruitment domains (CARDs) are protein interaction domains that participate in activation or suppression of CARD-carrying members of the caspase family of apoptosis-inducing proteases. A novel CARD-containing protein was identified that is overexpressed in some types of cancer and that binds and suppresses activation of procaspase-9, which we term TUCAN (tumor-up-regulated CARD-containing antagonist of caspase nine). The CARD domain of TUCAN selectively binds itself and procaspase-9. TUCAN interferes with binding of Apaf1 to procaspase-9 and suppresses caspase activation induced by the Apaf1 activator, cytochrome c. Overexpression of TUCAN in cells by stable or transient transfection inhibits apoptosis and caspase activation induced by Apaf1/caspase-9-dependent stimuli, including Bax, VP16, and staurosporine, but not by Apaf1/caspase-9-independent stimuli, Fas and granzyme B. High levels of endogenous TUCAN protein were detected in several tumor cell lines and in colon cancer specimens, correlating with shorter patient survival. Thus, TUCAN represents a new member of the CARD family that selectively suppresses apoptosis induced via the mitochondrial pathway for caspase activation.
319
Abstract
Proteins governing cell death form the basis of many normal processes and contribute to the pathogenesis of many diseases when dysregulated. Here we report the cloning of a novel human CED-4-like gene, CLAN, and several of its alternatively spliced isoforms. These caspase-associated recruitment domain (CARD)-containing proteins are expressed at varying degrees in normal human tissues and may contribute to a number of intracellular processes including apoptosis, cytokine processing, and NF-kappa B activation. The CARD of the CLAN proteins binds a number of other CARD-containing proteins including caspase-1, BCL10, NOD2, and NAC. Once their physiologic functions are uncovered, CLAN proteins may prove to be valuable therapeutic targets.
320
Zapata JM, Pawlowski K, Haas E, Ware CF, Godzik A, Reed JC. A diverse family of proteins containing tumor necrosis factor receptor-associated factor domains. J Biol Chem 2001; 276:24242-52. [PMID: 11279055 DOI: 10.1074/jbc.m100354200] [Citation(s) in RCA: 156] [Impact Index Per Article: 6.8] [Indexed: 11/06/2022]
Abstract
We have identified three new tumor necrosis factor-receptor associated factor (TRAF) domain-containing proteins in humans using bioinformatics approaches, including: MUL, the product of the causative gene in Mulibrey Nanism syndrome; USP7 (HAUSP), an ubiquitin protease; and SPOP, a POZ domain-containing protein. Unlike classical TRAF family proteins involved in TNF family receptor (TNFR) signaling, the TRAF domains (TDs) of MUL, USP7, and SPOP are located near the NH(2) termini or central region of these proteins, rather than carboxyl end. MUL and USP7 are capable of binding in vitro via their TDs to all of the previously identified TRAF family proteins (TRAF1, TRAF2, TRAF3, TRAF4, TRAF5, and TRAF6), whereas the TD of SPOP interacts weakly with TRAF1 and TRAF6 only. The TD of MUL also interacted with itself, whereas the TDs of USP7 and SPOP did not self-associate. Analysis of various MUL and USP7 mutants by transient transfection assays indicated that the TDs of these proteins are necessary and sufficient for suppressing NF-kappaB induction by TRAF2 and TRAF6 as well as certain TRAF-binding TNF family receptors. In contrast, the TD of SPOP did not inhibit NF-kappaB induction. Immunofluorescence confocal microscopy indicated that MUL localizes to cytosolic bodies, with targeting to these structures mediated by a RBCC tripartite domain within the MUL protein. USP7 localized predominantly to the nucleus, in a TD-dependent manner. Data base searches revealed multiple proteins containing TDs homologous to those found in MUL, USP7, and SPOP throughout eukaryotes, including yeast, protists, plants, invertebrates, and mammals, suggesting that this branch of the TD family arose from an ancient gene. We propose the moniker TEFs (TD-encompassing factors) for this large family of proteins.
|
321
|
Marchenko GN, Ratnikov BI, Rozanov DV, Godzik A, Deryugina EI, Strongin AY. Characterization of matrix metalloproteinase-26, a novel metalloproteinase widely expressed in cancer cells of epithelial origin. Biochem J 2001; 356:705-18. [PMID: 11389678 PMCID: PMC1221897 DOI: 10.1042/0264-6021:3560705] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]
Abstract
Identification of expanding roles for matrix metalloproteinases (MMPs) in complex regulatory processes of tissue remodelling has stimulated the search for genes encoding proteinases with unique functions, regulation and expression patterns. By using a novel cloning strategy, we identified three previously unknown human MMPs, i.e. MMP-21, MMP-26 and MMP-28, in comprehensive gene libraries. The present study is focused on the gene and the protein of a novel MMP, MMP-26. Our findings show that MMP-26 is specifically expressed in cancer cells of epithelial origin, including carcinomas of lung, prostate and breast. Several unique structural and regulatory features, including an unusual 'cysteine-switch' motif, discriminate broad-spectrum MMP-26 from most other MMPs. MMP-26 efficiently cleaves fibrinogen and extracellular matrix proteins, including fibronectin, vitronectin and denatured collagen. Protein sequence, minimal modular domain structure, exon-intron mapping and computer modelling demonstrate similarity between MMP-26 and MMP-7 (matrilysin). However, substrate specificity and transcriptional regulation, as well as the functional role of MMP-26 and MMP-7 in cancer, are likely to be distinct. Despite these differences, matrilysin-2 may be a suitable trivial name for MMP-26. Our observations suggest an important specific function for MMP-26 in tumour progression and angiogenesis, and confirm and extend the recent findings of other authors [Park, Ni, Gerkema, Liu, Belozerov and Sang (2000) J. Biol. Chem. 275, 20540-20544; Uría and López-Otín (2000) Cancer Res. 60, 4745-4751; de Coignac, Elson, Delneste, Magistrelli, Jeannin, Aubry, Berthier, Schmitt, Bonnefoy and Gauchat (2000) Eur. J. Biochem. 267, 3323-3329].
|
322
|
Pawlowski K, Godzik A. Surface map comparison: studying function diversity of homologous proteins. J Mol Biol 2001; 309:793-806. [PMID: 11397097 DOI: 10.1006/jmbi.2001.4630] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
A simplified protein surface cartography approach has been developed to assist in the analysis of surface features in homologous families, and thus to predict conservation or divergence of protein functions and protein-protein interaction patterns. A spherical approximation of protein surface was used, with a focus on charged and hydrophobic residues. The resulting surface map allows for qualitative analysis and comparison of surfaces of proteins, but can also be used to define a simple numerical measure of map similarity between two or more proteins. The latter was shown to be useful for function based classifications within large protein families. Surface map analysis was tested on several test cases: haemoglobins, death domains and TRAF domains. It was shown that surface map comparison allows a better function prediction than general sequence analysis methods and can reproduce known examples of functional variation within a divergent group of proteins. In another example, we predict novel, unexpected sets of common functional properties for seemingly distant members of a large group of divergent proteins. The method was also shown to be robust enough to allow using protein models from comparative modelling instead of experimental structures.
|
323
|
Abstract
Fold assignments for newly sequenced genomes are among the most important and interesting applications of the booming field of protein structure prediction. We present a brief survey and a discussion of such assignments completed to date, using as an example several fold assignment projects for proteins from the Escherichia coli genome. This review focuses on steps that are necessary to go beyond the simple assignment projects and into the development of tools extending our understanding of functions of proteins in newly sequenced genomes. This paper also discusses several problems seldom addressed in the literature, such as the problem of domain prediction and complementary predictions (e.g., transmembrane regions and flexible regions) and cross-correlation of predictions from different servers. The influence of sequence and structure database growth on prediction success is also addressed. Finally, we discuss the perspectives of the field in the context of massive sequence and structure determination projects, as well as the development of novel prediction methods.
|
324
|
Ke N, Godzik A, Reed JC. Bcl-B, a novel Bcl-2 family member that differentially binds and regulates Bax and Bak. J Biol Chem 2001; 276:12481-4. [PMID: 11278245 DOI: 10.1074/jbc.c000871200] [Citation(s) in RCA: 95] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open
Abstract
A novel human member of the Bcl-2 family was identified, Bcl-B, which is closest in amino acid sequence homology to the Boo (Diva) protein. The Bcl-B protein contains four Bcl-2 homology (BH) domains (BH1, BH2, BH3, BH4) and a predicted carboxyl-terminal transmembrane (TM) domain. The BCL-B mRNA is widely expressed in adult human tissues. The Bcl-B protein binds Bcl-2, Bcl-X(L), and Bax but not Bak. In transient transfection assays, Bcl-B suppresses apoptosis induced by Bax but not Bak. Deletion of the TM domain of Bcl-B impairs its association with intracellular organelles and diminishes its anti-apoptotic function. Bcl-B thus displays a unique pattern of selectivity for binding and regulating the function of other members of the Bcl-2 family.
|
325
|
Chu ZL, Pio F, Xie Z, Welsh K, Krajewska M, Krajewski S, Godzik A, Reed JC. A novel enhancer of the Apaf1 apoptosome involved in cytochrome c-dependent caspase activation and apoptosis. J Biol Chem 2001; 276:9239-45. [PMID: 11113115 DOI: 10.1074/jbc.m006309200] [Citation(s) in RCA: 137] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Apaf1/CED4 family members play central roles in apoptosis regulation as activators of caspase family cell death proteases. These proteins contain a nucleotide-binding (NB) self-oligomerization domain and a caspase recruitment domain (CARD). A novel human protein was identified, NAC, that contains an NB domain and CARD. The CARD of NAC interacts selectively with the CARD domain of Apaf1, a caspase-activating protein that couples mitochondria-released cytochrome c (cyt-c) to activation of cytosolic caspases. Cyt-c-mediated activation of caspases in cytosolic extracts and in cells is enhanced by overexpressing NAC and inhibited by reducing NAC using antisense/DNAzymes. Furthermore, association of NAC with Apaf1 is cyt c-inducible, resulting in a mega-complex (>1 MDa) containing both NAC and Apaf1 and correlating with enhanced recruitment and proteolytic processing of pro-caspase-9. NAC also collaborates with Apaf1 in inducing caspase activation and apoptosis in intact cells, whereas fragments of NAC representing only the CARD or NB domain suppress Apaf1-dependent apoptosis induction. NAC expression in vivo is associated with terminal differentiation of short lived cells in epithelia and some other tissues. The ability of NAC to enhance Apaf1-apoptosome function reveals a novel paradigm for apoptosis regulation.
|
326
|
Li W, Jaroszewski L, Godzik A. Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 2001; 17:282-3. [PMID: 11294794 DOI: 10.1093/bioinformatics/17.3.282] [Citation(s) in RCA: 626] [Impact Index Per Article: 27.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
Abstract
We present a fast and flexible program for clustering large protein databases at different sequence identity levels. It takes less than 2 h for the all-against-all sequence comparison and clustering of the non-redundant protein database of over 560,000 sequences on a high-end PC. The output database, including only the representative sequences, can be used for more efficient and sensitive database searches.
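The abstract does not spell out the clustering procedure, so the following is only a minimal sketch of one common approach to this kind of redundancy reduction: greedy, longest-first clustering against a sequence identity threshold. The function names, the ungapped identity measure, and the default threshold are illustrative assumptions, not the program's actual implementation.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions over the shorter sequence.

    Assumption: an ungapped, position-by-position comparison used purely
    for illustration; a real tool would align the sequences first.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def greedy_cluster(seqs, threshold=0.9):
    """Greedy longest-first clustering.

    Returns a dict mapping each representative sequence to its members.
    A sequence joins the first representative it matches at or above
    `threshold`; otherwise it becomes a new representative.
    """
    clusters = {}
    for s in sorted(seqs, key=len, reverse=True):
        for rep in clusters:
            if identity(rep, s) >= threshold:
                clusters[rep].append(s)
                break
        else:
            clusters[s] = [s]
    return clusters
```

The keys of the returned dictionary play the role of the reduced database of representative sequences mentioned in the abstract.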
|
327
|
Jaroszewski L, Godzik A. Search for a new description of protein topology and local structure. PROCEEDINGS. INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS FOR MOLECULAR BIOLOGY 2001; 8:211-7. [PMID: 10977082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 02/17/2023]
Abstract
A novel description of protein structure in terms of generalized secondary structure elements (GSSEs) is proposed. GSSEs are defined as fragments of the protein structure in which the chain does not radically change its direction. In this new language, global protein topology becomes a particular arrangement of a relatively small number of large, rod-like GSSEs. Protein topology can be described by an adjacency matrix that records which GSSEs are close to each other in space and defines a graph in which GSSEs correspond to vertices and the interactions between them to edges. The information about the local structure is translated into the local density of pseudo-Cα atoms along the chain and the curvature of the chain. This new description has a number of interesting and useful features. For instance, enumeration theorems of graph theory can be used to estimate the number of possible topologies for a protein built from a given number of elements. Different topologies, including novel ones, can be generated from the known ones by various permutations of elements. Many new regularities in protein structures become visible in the new description. The new local structure description is more amenable to prediction and easier to use in fold prediction.
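As a rough illustration of the adjacency-matrix idea, the sketch below treats each element as a single 3D point and marks two elements as adjacent when those points lie within a distance cutoff. Both the point representation and the 10.0 cutoff are assumptions made for this example; the abstract does not state how closeness in space is measured.

```python
import math


def adjacency_matrix(centers, cutoff=10.0):
    """Build a symmetric 0/1 adjacency matrix for structural elements.

    `centers` is a list of (x, y, z) points, one per element (hypothetical
    representation). Elements i and j are adjacent when their points are
    at most `cutoff` apart (hypothetical criterion).
    """
    n = len(centers)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(centers[i], centers[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1
    return adj
```

Read as a graph, each row index is a vertex and each 1 entry is an edge, which is the form graph-theoretic enumeration arguments operate on.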
|
328
|
Pawłowski K, Pio F, Chu Z, Reed JC, Godzik A. PAAD - a new protein domain associated with apoptosis, cancer and autoimmune diseases. Trends Biochem Sci 2001; 26:85-7. [PMID: 11166558 DOI: 10.1016/s0968-0004(00)01729-1] [Citation(s) in RCA: 114] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
A new protein domain was found in several proteins involved in apoptosis, inflammation, cancer and immune responses. Its location within these proteins and predicted fold suggests that it functions as a protein-protein interaction domain, possibly uniting different signaling pathways.
|
329
|
Abstract
A new member of the Bcl-2 family was identified, Bcl-G. The human BCL-G gene consists of 6 exons, resides on chromosome 12p12, and encodes two proteins through alternative mRNA splicing, Bcl-G(L) (long) and Bcl-G(S) (short) consisting of 327 and 252 amino acids in length, respectively. Bcl-G(L) and Bcl-G(S) have identical sequences for the first 226 amino acids but diverge thereafter. Among the Bcl-2 homology (BH) domains previously recognized in Bcl-2 family proteins, the BH3 domain is found in both Bcl-G(L) and Bcl-G(S), but only the longer Bcl-G(L) protein possesses a BH2 domain. Bcl-G(L) mRNA is expressed widely in adult human tissues, whereas Bcl-G(S) mRNA was found only in testis. Overexpression of Bcl-G(L) or Bcl-G(S) in cells induced apoptosis although Bcl-G(S) was far more potent than Bcl-G(L). Apoptosis induction by Bcl-G(S) depended on the BH3 domain and was suppressed by coexpression of anti-apoptotic Bcl-X(L) protein. Bcl-X(L) also coimmunoprecipitated with Bcl-G(S) but not with mutants of Bcl-G(S) in which the BH3 domain was deleted or mutated or with Bcl-G(L). Bcl-G(S) was predominantly localized to cytosolic organelles, whereas Bcl-G(L) was diffusely distributed throughout the cytosol. A mutant of Bcl-G(L) in which the BH2 domain was deleted displayed increased apoptotic activity and coimmunoprecipitated with Bcl-X(L), suggesting that the BH2 domain autorepresses Bcl-G(L).
|
330
|
Li W, Pio F, Pawłowski K, Godzik A. Saturated BLAST: an automated multiple intermediate sequence search used to detect distant homology. Bioinformatics 2000; 16:1105-10. [PMID: 11159329 DOI: 10.1093/bioinformatics/16.12.1105] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Two proteins can have a similar 3-dimensional structure and biological function, but have sequences sufficiently different that traditional protein sequence comparison algorithms do not identify their relationship. The desire to identify such relations has led to the development of more sensitive sequence alignment strategies. One such strategy is the Intermediate Sequence Search (ISS), which connects two proteins through one or more intermediate sequences. In its brute-force implementation, ISS is a strategy that repetitively uses the results of the previous query as new search seeds, making it time-consuming and difficult to analyze. RESULTS: Saturated BLAST is a package that performs ISS in an efficient and automated manner. It was developed using Perl and Perl/Tk and implemented on the LINUX operating system. Starting with a protein sequence, Saturated BLAST runs a BLAST search and identifies representative sequences for the next generation of searches. The procedure is run until convergence or until some predefined criteria are met. Saturated BLAST has a friendly graphic user interface, a built-in BLAST result parser, several multiple alignment tools, clustering algorithms and various filters for the elimination of false positives, thereby providing an easy way to edit, visualize, analyze, monitor and control the search. Besides detecting remote homologies, Saturated BLAST can be used to maintain protein family databases and to search for new genes in genomic databases.
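The iterate-until-convergence loop described above can be sketched in a few lines. This is a generic illustration of an intermediate sequence search, not the Saturated BLAST code: `search` is a hypothetical callable standing in for a database search, and `max_rounds` is an assumed stopping criterion. The sketch also omits the selection of representative sequences between generations.

```python
def intermediate_sequence_search(seed, search, max_rounds=5):
    """Expand a set of hits generation by generation.

    `search(query)` is a hypothetical callable returning an iterable of
    hit identifiers for `query`. New hits from one round become the
    queries of the next. The loop stops when a round yields no new hits
    (convergence) or after `max_rounds` rounds.
    """
    found = {seed}
    frontier = [seed]
    rounds = 0
    while frontier and rounds < max_rounds:
        next_frontier = []
        for query in frontier:
            for hit in search(query):
                if hit not in found:
                    found.add(hit)
                    next_frontier.append(hit)
        frontier = next_frontier
        rounds += 1
    return found
```

With a toy lookup table in place of a real search, a seed `A` that hits `B`, which in turn hits `C`, links `A` to `C` through the intermediate `B` even though `A` never hits `C` directly.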
|
331
|
Grynberg M, Topczewski J, Godzik A, Paszewski A. The Aspergillus nidulans cysA gene encodes a novel type of serine O-acetyltransferase which is homologous to homoserine O-acetyltransferases. MICROBIOLOGY (READING, ENGLAND) 2000; 146 ( Pt 10):2695-2703. [PMID: 11021945 DOI: 10.1099/00221287-146-10-2695] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
The Aspergillus nidulans cysA gene was cloned by functional complementation of the cysA1 mutation that impairs the synthesis of O-acetylserine. The molecular nature of cysA1 and cysA103 alleles was characterized; a nucleotide substitution and a frame shift were found in the former and a deletion mutation in the latter. The CYSA protein is 525 amino acids long and is encoded by an uninterrupted open reading frame. Expression of the cysA gene appears not to be regulated by sulfur, carbon and nitrogen sources. Protein sequence analysis reveals extensive similarity to homoserine O-acetyltransferases, particularly the bacterial ones, and no homology with known serine O-acetyltransferases. The authors propose that the CYSA protein is analogous to serine O-acetyltransferases, i.e. it catalyses the same reaction but has an independent evolutionary origin.
|
332
|
Pawłowski K, Jaroszewski L, Rychlewski L, Godzik A. Sensitive sequence comparison as protein function predictor. PACIFIC SYMPOSIUM ON BIOCOMPUTING. PACIFIC SYMPOSIUM ON BIOCOMPUTING 2000:42-53. [PMID: 10902155 DOI: 10.1142/9789814447331_0005] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
Abstract
Protein function assignments based on postulated homology as recognized by high sequence similarity are used routinely in genome analysis. Improvements in the sensitivity of sequence comparison algorithms have reached the point that proteins with previously undetectable sequence similarity, for instance 10-15% identical residues, can sometimes be classified as similar. What is the relation between such proteins? Is it possible that they are homologous? What is the practical significance of detecting such similarities? A simplified analysis of the relation between sequence similarity and function similarity is presented here for the well-characterized proteins from the E. coli genome. Using a simple measure of functional similarity based on the E.C. classification of enzymes, it is shown that functional similarity correlates well with sequence similarity measured by the statistical significance of the alignment score. Proteins that are similar by this standard, even in cases of low sequence identity, have a much larger chance of having similar function than randomly chosen protein pairs. Interesting exceptions to these rules are discussed.
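The abstract does not define its E.C.-based measure. One simple measure of this kind, shown here purely as an assumption, counts how many leading fields two E.C. numbers share, so that 0 means different enzyme classes and 4 means the same fully specified enzyme. The function name and the treatment of unspecified `-` fields are illustrative choices.

```python
def ec_similarity(ec_a: str, ec_b: str) -> int:
    """Count the leading E.C. fields shared by two E.C. numbers (0 to 4).

    Comparison stops at the first differing field or at an unspecified
    field written as "-" (an assumption of this sketch).
    """
    shared = 0
    for x, y in zip(ec_a.split("."), ec_b.split(".")):
        if x != y or x == "-":
            break
        shared += 1
    return shared
```

A score like this could then be compared with alignment significance across many protein pairs to look for the kind of correlation the abstract reports.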
|
333
|
Zhang H, Huang Q, Ke N, Matsuyama S, Hammock B, Godzik A, Reed JC. Drosophila Pro-apoptotic Bcl-2/Bax Homologue Reveals Evolutionary Conservation of Cell Death Mechanisms. J Biol Chem 2000. [DOI: 10.1016/s0021-9258(19)61510-3] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open
|
334
|
Zhang H, Huang Q, Ke N, Matsuyama S, Hammock B, Godzik A, Reed JC. Drosophila pro-apoptotic Bcl-2/Bax homologue reveals evolutionary conservation of cell death mechanisms. J Biol Chem 2000; 275:27303-6. [PMID: 10811653 DOI: 10.1074/jbc.m002846200] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022] Open
Abstract
Genetic analysis of programmed cell death in Drosophila reveals many similarities with mammals. Heretofore, a missing link in the fly has been the absence of any Bcl-2/Bax family members, proteins that function in mammals as regulators of mitochondrial cytochrome c release. A Drosophila homologue of the human killer protein Bok (DBok) was identified. The predicted structure of DBok is similar to pore-forming Bcl-2/Bax family members. DBok induces apoptosis in insect and human cells, which is suppressible by anti-apoptotic human Bcl-2 family proteins. A caspase inhibitor suppressed DBok-induced apoptosis but did not prevent DBok-induced cell death. Moreover, DBok targets mitochondria and triggers cytochrome c release through a caspase-independent mechanism. These characteristics of DBok reveal evolutionary conservation of cell death mechanisms in flies and humans.
|
335
|
Godzik A. Simple understanding. Trends Biotechnol 2000. [DOI: 10.1016/s0167-7799(00)01468-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
336
|
Poole LB, Godzik A, Nayeem A, Schmitt JD. AhpF can be dissected into two functional units: tandem repeats of two thioredoxin-like folds in the N-terminus mediate electron transfer from the thioredoxin reductase-like C-terminus to AhpC. Biochemistry 2000; 39:6602-15. [PMID: 10828978 DOI: 10.1021/bi000405w] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
AhpF, the flavin-containing component of the Salmonella typhimurium alkyl hydroperoxide reductase system, catalyzes the NADH-dependent reduction of an active-site disulfide bond in the other component, AhpC, which in turn reduces hydroperoxide substrates. The amino acid sequence of the C-terminus of AhpF is 35% identical to that of thioredoxin reductase (TrR) from Escherichia coli. AhpF contains an additional 200-residue N-terminal domain possessing a second redox-active disulfide center also required for AhpC reduction. Our studies indicate that this N-terminus contains a tandem repeat of two thioredoxin (Tr)-like folds, the second of which contains the disulfide redox center. Structural and catalytic properties of independently expressed fragments of AhpF corresponding to the TrR-like C-terminus (F[208-521]) and the 2Tr-like N-terminal domain (F[1-202]) have been addressed. Enzymatic assays, reductive titrations, and circular dichroism studies of the fragments indicate that each folds properly and retains many functional properties. Electron transfer between F[208-521] and F[1-202] is, however, relatively slow (4 × 10⁴ M⁻¹ s⁻¹ at 25 °C) and nonsaturable up to 100 µM F[1-202]. TrR is nearly as efficient at F[1-202] reduction as is F[208-521], although neither the latter fragment, nor intact AhpF, can reduce Tr. An engineered mutant AhpC substrate with a fluorophore attached via a disulfide bond has been used to demonstrate that only F[1-202], and not F[208-521], is capable of electron transfer to AhpC, thereby establishing the direct role this N-terminal domain plays in mediating electron transfer between the TrR-like part of AhpF and AhpC.
|
337
|
Jaroszewski L, Rychlewski L, Reed JC, Godzik A. ATP-activated oligomerization as a mechanism for apoptosis regulation: fold and mechanism prediction for CED-4. Proteins 2000; 39:197-203. [PMID: 10737940 DOI: 10.1002/(sici)1097-0134(20000515)39:3<197::aid-prot10>3.0.co;2-v] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
Abstract
The fold recognition algorithm FFAS (Rychlewski et al., Protein Sci, 2000;9:232-241) was used to match the nucleotide-binding adaptor shared by APAF-1, certain R gene products and CED-4 (NB-ARC domain) to the structure of the D2 domain of N-ethylmaleimide-sensitive fusion protein and the δ subunit of the clamp loader of DNA polymerase III. The predicted structure consists of the P-loop ATP-binding domain, followed by two alpha-helical domains that regulate the oligomerization process. This prediction suggests a detailed molecular mechanism for the "induced proximity" hypothesis (Salvesen and Dixit, Proc Natl Acad Sci USA 1999;96:10964-10967) for CED3/caspase-9 activation by the CED4/APAF-1 complex. According to this model, ATP binding acts as a trigger in CED-4 oligomerization and the helical domain immediately following the ATP-binding domain provides additional mechanisms for regulation of the oligomerization process. This model explains most of the known experimental data about CED-4-mediated caspase activation and, at the same time, suggests experiments that could test this hypothesis.
|
338
|
Pawłowski K, Rychlewski L, Reed JC, Godzik A. From fold to function predictions: an apoptosis regulator protein BID. COMPUTERS & CHEMISTRY 2000; 24:511-7. [PMID: 10816020 DOI: 10.1016/s0097-8485(99)00081-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
With the rapidly increasing pace of genome sequencing projects and the resulting flood of predicted amino acid sequences of uncharacterized proteins, protein sequence analysis, and in particular, protein structure prediction is quickly gaining in importance. Prediction algorithms can be used for preliminary annotation of newly sequenced proteins and, at least in some cases, provide insights into their function and specific mode of action. Such annotations for several microbial genomes were performed by several groups and placed in public domain for evaluation. An example presented in this work comes from a related project of structural and functional predictions for proteins involved in the process of controlled cell death (apoptosis). The BID protein belongs to an important class of regulators of apoptosis identified by short sequence motifs. Here, several fold prediction methods are used to build a series of three-dimensional models. Structure analysis of the models with reference to the biological data available allows selection of the most appropriate model. It is found that the most likely structural model of BID is built on the structure of Bcl-X(L). The model is discussed in terms of experimental data on specific proteolytic cleavage of BID and its effect on BID interactions with other proteins and membranes.
|
339
|
Zapata JM, Matsuzawa S, Godzik A, Leo E, Wasserman SA, Reed JC. The Drosophila tumor necrosis factor receptor-associated factor-1 (DTRAF1) interacts with Pelle and regulates NFkappaB activity. J Biol Chem 2000; 275:12102-7. [PMID: 10766844 DOI: 10.1074/jbc.275.16.12102] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
A member of the tumor necrosis factor (TNF) receptor-associated factor (TRAF) family was identified in Drosophila. DTRAF1 contains 7 zinc finger domains followed by a TRAF domain, similar to mammalian TRAFs and other members of the family identified in data bases from Caenorhabditis elegans, Arabidopsis, and Dictyostelium. Analysis of DTRAF1 binding to different members of the human TNF receptor family showed that this protein can interact through its TRAF domain with the p75 neurotrophin receptor and weakly with the lymphotoxin-beta receptor. DTRAF1 can also self-associate and binds to human TRAF1, TRAF2, and TRAF4. Interestingly, DTRAF1 interacts with human cIAP-1 and cIAP-2 but not with Drosophila DIAP-1 and -2. By itself, DTRAF1 did not induce significant NFkappaB activation when overexpressed in mammalian cells, although it specifically increased NFkappaB induction by TRAF6. In contrast, TRAF2-mediated NFkappaB induction was partially inhibited by DTRAF1. Mutants of DTRAF1 lacking the N-terminal region inhibited NFkappaB induction by either TRAF2 or TRAF6. DTRAF1 specifically associated with the regulatory N-terminal domain of Pelle, a Drosophila homolog of the human kinase interleukin-1 receptor-associated kinase (IRAK). Interestingly, though Pelle and DTRAF1 individually were unable to induce NFkappaB in a human cell line, co-expression of Pelle and DTRAF1 resulted in significant NFkappaB activity. Interactions of DTRAF1 with human TRAF-, TNF receptor-, and IAP-family proteins imply strong evolutionary conservation of TRAF protein structure and function throughout Metazoan evolution.
|
340
|
Zhang H, Xu Q, Krajewski S, Krajewska M, Xie Z, Fuess S, Kitada S, Pawlowski K, Godzik A, Reed JC. BAR: An apoptosis regulator at the intersection of caspases and Bcl-2 family proteins. Proc Natl Acad Sci U S A 2000; 97:2597-602. [PMID: 10716992 PMCID: PMC15974 DOI: 10.1073/pnas.97.6.2597] [Citation(s) in RCA: 138] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Two major pathways for induction of apoptosis have been identified: intrinsic and extrinsic. The extrinsic pathway is represented by tumor necrosis factor family receptors, which utilize protein interaction modules known as death domains and death effector domains (DEDs) to assemble receptor signaling complexes that recruit and activate certain caspase-family cell death proteases, namely procaspases-8 and -10. The intrinsic pathway for apoptosis involves the participation of mitochondria, which release caspase-activating proteins. Bcl-2 family proteins govern this mitochondria-dependent apoptosis pathway, with proteins such as Bax functioning as inducers and proteins such as Bcl-2 and Bcl-X(L) serving as suppressors of cell death. An apoptosis regulator, BAR, was identified by using a yeast-based screen for inhibitors of Bax-induced cell death. The BAR protein contains a SAM domain, which is required for its interactions with Bcl-2 and Bcl-X(L) and for suppression of Bax-induced cell death in both mammalian cells and yeast. In addition, BAR contains a DED-like domain responsible for its interaction with DED-containing procaspases and suppression of Fas-induced apoptosis. Furthermore, BAR can bridge procaspase-8 and Bcl-2 into a protein complex. The BAR protein is anchored in intracellular membranes where Bcl-2 resides. BAR therefore may represent a scaffold protein capable of bridging two major apoptosis pathways.
|
341
|
Rychlewski L, Jaroszewski L, Li W, Godzik A. Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci 2000; 9:232-41. [PMID: 10716175 PMCID: PMC2144550 DOI: 10.1110/ps.9.2.232] [Citation(s) in RCA: 385] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
Distant homologies between proteins are often discovered only after three-dimensional structures of both proteins are solved. The sequence divergence for such proteins can be so large that simple comparison of their sequences fails to identify any similarity. A new generation of sensitive alignment tools uses averaged sequences of entire homologous families (profiles) to detect such homologies. Several algorithms, including the newest generation of BLAST algorithms and BASIC, an algorithm used in our group to assign fold predictions for proteins from several genomes, are compared to each other on a large set of structurally similar proteins with little sequence similarity. Proteins in the benchmark are classified according to the level of their similarity, which allows us to demonstrate that most of the improvement of the new algorithms is achieved for proteins with strong functional similarities, with almost no progress in recognizing distant fold similarities. It is also shown that details of profile calculation strongly influence its sensitivity in recognizing distant homologies. The most important choice is how to include information from diverging members of the family, avoiding false predictions while accounting for the entire sequence divergence within a family. PSI-BLAST takes a conservative approach, deriving a profile from core members of the family, providing a solid improvement with almost no false predictions. BASIC strives for better sensitivity by increasing the weight of divergent family members, paying the price in lower reliability. A new FFAS algorithm introduced here uses a new procedure for profile generation that takes into account all the relations within the family and matches BASIC sensitivity with PSI-BLAST-like reliability.
|
342
|
Fischer D, Barret C, Bryson K, Elofsson A, Godzik A, Jones D, Karplus KJ, Kelley LA, MacCallum RM, Pawłowski K, Rost B, Rychlewski L, Sternberg M. CAFASP-1: critical assessment of fully automated structure prediction methods. Proteins 1999; Suppl 3:209-17. [PMID: 10526371 DOI: 10.1002/(sici)1097-0134(1999)37:3+<209::aid-prot27>3.3.co;2-p] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
The results of the first Critical Assessment of Fully Automated Structure Prediction (CAFASP-1) are presented. The objective was to evaluate the success rates of fully automatic web servers for fold recognition which are available to the community. This study was based on the targets used in the third meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP-3). However, unlike CASP-3, the study was not a blind trial, as it was held after the structures of the targets were known. The aim was to assess the performance of methods without the user intervention that several groups used in their CASP-3 submissions. Although it is clear that "human plus machine" predictions are superior to automated ones, this CAFASP-1 experiment is extremely valuable for users of our methods; it provides an indication of the performance of the methods alone, and not of the "human plus machine" performance assessed in CASP. This information may aid users in choosing which programs they wish to use and in evaluating the reliability of the programs when applied to their specific prediction targets. In addition, evaluation of fully automated methods is particularly important to assess their applicability at genomic scales. For each target, groups submitted the top-ranking folds generated from their servers. In CAFASP-1 we concentrated on fold-recognition web servers only and evaluated only recognition of the correct fold, and not, as in CASP-3, alignment accuracy. Although some performance differences appeared within each of the four target categories used here, overall, no single server has proved markedly superior to the others. The results showed that current fully automated fold recognition servers can often identify remote similarities when pairwise sequence search methods fail. 
Nevertheless, in only a few cases outside the family-level targets has the score of the top-ranking fold been significant enough to allow for a confident fully automated prediction. Because the goals, rules, and procedures of CAFASP-1 were different from those used at CASP-3, the results reported here are not comparable with those reported in CASP-3. Nevertheless, it is clear that current automated fold recognition methods cannot yet compete with "human-expert plus machine" predictions. Finally, CAFASP-1 has been useful in identifying the requirements for a future blind trial of automated server-based protein structure prediction.
|
343
|
Schendel SL, Azimov R, Pawlowski K, Godzik A, Kagan BL, Reed JC. Ion channel activity of the BH3 only Bcl-2 family member, BID. J Biol Chem 1999; 274:21932-6. [PMID: 10419515 DOI: 10.1074/jbc.274.31.21932] [Citation(s) in RCA: 149] [Impact Index Per Article: 6.0] [Indexed: 11/06/2022]
Abstract
BID is a member of the BH3-only subgroup of Bcl-2 family proteins that displays pro-apoptotic activity. The NH2-terminal region of BID contains a caspase-8 (Casp-8) cleavage site and the cleaved form of BID translocates to mitochondrial membranes where it is a potent inducer of cytochrome c release. Secondary structure and fold predictions suggest that BID has a high degree of alpha-helical content and structural similarity to Bcl-XL, which itself is highly similar to bacterial pore-forming toxins. Moreover, circular dichroism analysis confirmed a high alpha-helical content of BID. Amino-terminal truncated BIDΔ1-55, mimicking the Casp-8-cleaved molecule, formed channels in planar bilayers at neutral pH and in liposomes at acidic pH. In contrast, full-length BID displayed channel activity only at nonphysiological pH 4.0 (but not at neutral pH) in planar bilayers and failed to form channels in liposomes even under acidic conditions. On a single channel level, BIDΔ1-55 channels were voltage-gated and exhibited multiconductance behavior at neutral pH. When full-length BID was cleaved by Casp-8, it too demonstrated channel activity similar to that seen with BIDΔ1-55. Thus, BID appears to share structural and functional similarity with other Bcl-2 family proteins known to have channel-forming activity, but its activity exhibits a novel form of activation: proteolytic cleavage.
|
344
|
Pawłowski K, Zhang B, Rychlewski L, Godzik A. The Helicobacter pylori genome: from sequence analysis to structural and functional predictions. Proteins 1999; 36:20-30. [PMID: 10373003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Abstract
Fold assignments for proteins from the Helicobacter pylori genome are carried out using BASIC, a profile-profile alignment algorithm recently tested on the Mycoplasma genitalium and Escherichia coli genomes. The fold assignments are followed by automated function evaluation, based on the multilevel description of functional sites in proteins. Over 40% of the proteins encoded in the H. pylori genome can be recognized as belonging to a protein family with known structure. Previous estimates suggested that only 10-15% of genome proteins could be characterized this way. This dramatic increase in the number of recognized homologies between H. pylori proteins and structurally characterized protein families is partly due to the rapid increase of the database of known protein structures, but mostly it is due to the significant improvement in prediction algorithms. Knowledge of a protein fold adds a new dimension to our understanding of its function and, similarly, structure prediction can also add to understanding, verification, and/or prediction of function for uncharacterized proteins. Several examples analyzed in more detail in this article illustrate insights that can be achieved from structure and detailed function prediction.
|
345
|
Pawłowski K, Zhang B, Rychlewski L, Godzik A. The Helicobacter pylori genome: From sequence analysis to structural and functional predictions. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(19990701)36:1<20::aid-prot2>3.0.co;2-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
|
346
|
Zhang B, Rychlewski L, Pawłowski K, Fetrow JS, Skolnick J, Godzik A. From fold predictions to function predictions: automation of functional site conservation analysis for functional genome predictions. Protein Sci 1999; 8:1104-15. [PMID: 10338021 PMCID: PMC2144342 DOI: 10.1110/ps.8.5.1104] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.8] [Indexed: 10/21/2022]
Abstract
A database of functional sites for proteins with known structures, SITE, is constructed and used in conjunction with a simple pattern matching program, SiteMatch, to evaluate possible function conservation in a recently constructed database of fold predictions for Escherichia coli proteins (Rychlewski L et al., 1999, Protein Sci 8:614-624). In this and other prediction databases, fold predictions are based on algorithms that can recognize weak sequence similarities and putatively assign new proteins into already characterized protein families. It is not clear whether such sequence similarities arise from distant homologies or general similarity of physicochemical features along the sequence. Leaving aside the important question of the nature of relations within fold superfamilies, it is possible to assess possible function conservation by looking at the pattern of conservation of crucial functional residues. SITE consists of a multilevel function description based on structure annotations and structure analyses. In particular, active site residues, ligand binding residues, and patterns of hydrophobic residues on the protein surface are used to describe different functional features. SiteMatch, a simple pattern matching program, is designed to check the conservation of residues involved in protein activity in alignments generated by any alignment method. Here, this procedure is used to study conservation of functional features in alignments between protein sequences from the E. coli genome and their optimal structural templates. The optimal templates were identified, and the alignments taken, from the database of genomic structural predictions described in a previous publication (Rychlewski L et al., 1999, Protein Sci 8:614-624). An automated assessment of function conservation is used to analyze the relation between fold and function similarity for a large number of fold predictions. 
For instance, it is shown that identifying low significance predictions with a high level of functional residue conservation can be used to extend the prediction sensitivity of fold prediction methods. Over 100 new fold/function predictions in this class were obtained in the E. coli genome. At the same time, about 30% of our previous fold predictions are not confirmed as function predictions, further highlighting the problem of function divergence in fold superfamilies.
|
347
|
Rychlewski L, Zhang B, Godzik A. Functional insights from structural predictions: analysis of the Escherichia coli genome. Protein Sci 1999; 8:614-24. [PMID: 10091664 PMCID: PMC2144289 DOI: 10.1110/ps.8.3.614] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.4] [Indexed: 10/19/2022]
Abstract
Fold assignments for proteins from the Escherichia coli genome are carried out using BASIC, a profile-profile alignment algorithm recently tested on fold recognition benchmarks and on the Mycoplasma genitalium genome, and PSI-BLAST, the newest generation of the de facto standard in homology search algorithms. The fold assignments are followed by automated modeling and the resulting three-dimensional models are analyzed for possible function prediction. Close to 30% of the proteins encoded in the E. coli genome can be recognized as homologous to a protein family with known structure. Most of these homologies (23% of the entire genome) can be recognized by both the PSI-BLAST and BASIC algorithms, but the latter recognizes an additional 260 homologies. Previous estimates suggested that only 10-15% of E. coli proteins can be characterized this way. This dramatic increase in the number of recognized homologies between E. coli proteins and structurally characterized protein families is partly due to the rapid increase of the database of known protein structures, but mostly it is due to the significant improvement in prediction algorithms. Knowing protein structure adds a new dimension to our understanding of its function and the predictions presented here can be used to predict function for uncharacterized proteins. Several examples, analyzed in more detail in this paper, include the DPS protein protecting DNA from oxidative damage (predicted to be homologous to ferritin with iron ion acting as a reducing agent) and the ahpC/tsa family of proteins, which provides resistance to various oxidizing agents (predicted to be homologous to glutathione peroxidase).
|
348
|
Zhang L, Godzik A, Skolnick J, Fetrow JS. Functional analysis of the Escherichia coli genome for members of the alpha/beta hydrolase family. Folding & Design 1999; 3:535-48. [PMID: 9889164 DOI: 10.1016/s1359-0278(98)00069-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Database-searching methods based on sequence similarity have become the most commonly used tools for characterizing newly sequenced proteins. Due to the often underestimated functional diversity in protein families and superfamilies, however, it is difficult to make the characterization specific and accurate. In this work, we have extended a method for active-site identification from predicted protein structures. RESULTS: The structural conservation and variation of the active sites of the alpha/beta hydrolases with known structures were studied. The similarities were incorporated into a three-dimensional motif that specifies essential requirements for the enzymatic functions. A threading algorithm was used to align 651 Escherichia coli open reading frames (ORFs) to one of the members of the alpha/beta hydrolase fold family. These ORFs were then screened according to our three-dimensional motif and with an extra requirement that demands conservation of the key active-site residues among the proteins that bear significant sequence similarity to the ORFs. 17 ORFs from E. coli were predicted to have hydrolase activity and their putative active-site residues were identified. Most were in agreement with the experiments and results of other database-searching methods. The study further suggests that YHET_ECOLI, a hypothetical protein classified as a member of the UPF0017 family (an uncharacterized protein family), bears all the hallmarks of the alpha/beta hydrolase family. CONCLUSIONS: The novel feature of our method is that it uses three-dimensional structural information for function prediction. The results demonstrate the importance and necessity of such a method to fill the gap between sequence alignment and function prediction; furthermore, the method provides a way to verify the structure predictions, which enables an expansion of the applicable scope of the threading algorithms.
|
349
|
Fischer D, Barret C, Bryson K, Elofsson A, Godzik A, Jones D, Karplus KJ, Kelley LA, MacCallum RM, Pawłowski K, Rost B, Rychlewski L, Sternberg M. CAFASP-1: Critical assessment of fully automated structure prediction methods. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(1999)37:3+<209::aid-prot27>3.0.co;2-y] [Citation(s) in RCA: 107] [Impact Index Per Article: 4.3] [Indexed: 11/10/2022]
|
350
|
Jaroszewski L, Pawlowski K, Godzik A. Multiple Model Approach: Exploring the Limits of Comparative Modeling. J Mol Model 1998. [DOI: 10.1007/s008940050087] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 10/28/2022]
|