1. Wang S, Sun E, Liu Y, Yin B, Zhang X, Li M, Huang Q, Tan C, Qian P, Rao VB, Tao P. Landscape of New Nuclease-Containing Antiphage Systems in Escherichia coli and the Counterdefense Roles of Bacteriophage T4 Genome Modifications. J Virol 2023; 97:e0059923. [PMID: 37306585 PMCID: PMC10308915 DOI: 10.1128/jvi.00599-23] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/22/2023] [Accepted: 05/19/2023] [Indexed: 06/13/2023]
Abstract
Many phages, such as T4, protect their genomes against the nucleases of bacterial restriction-modification (R-M) and CRISPR-Cas systems through covalent modification of their genomes. Recent studies have revealed many novel nuclease-containing antiphage systems, raising the question of the role of phage genome modifications in countering these systems. Here, by focusing on phage T4 and its host Escherichia coli, we depicted the landscape of the new nuclease-containing systems in E. coli and demonstrated the roles of T4 genome modifications in countering these systems. Our analysis identified at least 17 nuclease-containing defense systems in E. coli, with type III Druantia being the most abundant system, followed by Zorya, Septu, Gabija, AVAST type 4, and qatABCD. Of these, 8 nuclease-containing systems were found to be active against phage T4 infection. During T4 replication in E. coli, 5-hydroxymethyl dCTP is incorporated into the newly synthesized DNA instead of dCTP. The 5-hydroxymethylcytosines (hmCs) are further modified by glycosylation to form glucosyl-5-hydroxymethylcytosine (ghmC). Our data showed that the ghmC modification of the T4 genome abolished the defense activities of Gabija, Shedu, Restriction-like, type III Druantia, and qatABCD systems. The anti-phage T4 activities of the last two systems can also be counteracted by hmC modification. Interestingly, the Restriction-like system specifically restricts phage T4 containing an hmC-modified genome. The ghmC modification cannot abolish the anti-phage T4 activities of Septu, SspBCDE, and mzaABCDE, although it reduces their efficiency. Our study reveals the multidimensional defense strategies of E. coli nuclease-containing systems and the complex roles of T4 genomic modification in countering these defense systems.
IMPORTANCE: Cleavage of foreign DNA is a well-known mechanism used by bacteria to protect themselves from phage infections. Two well-known bacterial defense systems, R-M and CRISPR-Cas, both contain nucleases that cleave the phage genomes through specific mechanisms. However, phages have evolved different strategies to modify their genomes to prevent cleavage. Recent studies have revealed many novel nuclease-containing antiphage systems from various bacteria and archaea. However, no studies have systematically investigated the nuclease-containing antiphage systems of a specific bacterial species. In addition, the role of phage genome modifications in countering these systems remains unknown. Here, by focusing on phage T4 and its host Escherichia coli, we depicted the landscape of the new nuclease-containing systems in E. coli using all 2,289 genomes available in NCBI. Our studies reveal the multidimensional defense strategies of E. coli nuclease-containing systems and the complex roles of genomic modification of phage T4 in countering these defense systems.
Affiliation(s)
- Shuangshuang Wang
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Erchao Sun
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Yuepeng Liu
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Baoqi Yin
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Xueqi Zhang
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Mengling Li
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Qi Huang
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
- Chen Tan
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China
- Ping Qian
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
- Venigalla B. Rao
  - Bacteriophage Medical Research Center, Department of Biology, The Catholic University of America, Washington, DC, USA
- Pan Tao
  - State Key Laboratory of Agricultural Microbiology, Huazhong Agricultural University, Wuhan, Hubei, China
  - Cooperative Innovation Center for Sustainable Pig Production, Huazhong Agricultural University, Wuhan, Hubei, China
  - Hubei Hongshan Lab, Wuhan, Hubei, China

2. Chen H, Zhang M, Hochstrasser M. The Biochemistry of Cytoplasmic Incompatibility Caused by Endosymbiotic Bacteria. Genes (Basel) 2020; 11:852. [PMID: 32722516 PMCID: PMC7465683 DOI: 10.3390/genes11080852] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 06/23/2020] [Revised: 07/19/2020] [Accepted: 07/20/2020] [Indexed: 12/29/2022]
Abstract
Many species of arthropods carry maternally inherited bacterial endosymbionts that can influence host sexual reproduction to benefit the bacterium. The most well-known of such reproductive parasites is Wolbachia pipientis. Wolbachia are obligate intracellular α-proteobacteria found in nearly half of all arthropod species. This success has been attributed in part to their ability to manipulate host reproduction to favor infected females. Cytoplasmic incompatibility (CI), a phenomenon wherein Wolbachia infection renders males sterile when they mate with uninfected females, but not infected females (the rescue mating), appears to be the most common. CI provides a reproductive advantage to infected females in the presence of a threshold level of infected males. The molecular mechanisms of CI and other reproductive manipulations, such as male killing, parthenogenesis, and feminization, have remained mysterious for many decades. It had been proposed by Werren more than two decades ago that CI is caused by a Wolbachia-mediated sperm modification and that rescue is achieved by a Wolbachia-encoded rescue factor in the infected egg. In the past few years, new research has highlighted a set of syntenic Wolbachia gene pairs encoding CI-inducing factors (Cifs) as the key players for the induction of CI and its rescue. Within each Cif pair, the protein encoded by the upstream gene is denoted A and the downstream gene B. To date, two types of Cifs have been characterized based on the enzymatic activity identified in the B protein of each protein pair; one type encodes a deubiquitylase (thus named CI-inducing deubiquitylase or cid), and a second type encodes a nuclease (named CI-inducing nuclease or cin). The CidA and CinA proteins bind tightly and specifically to their respective CidB and CinB partners. 
In transgenic Drosophila melanogaster, the expression of either the Cid or Cin protein pair in the male germline induces CI and the expression of the cognate A protein in females is sufficient for rescue. With the identity of the Wolbachia CI induction and rescue factors now known, research in the field has turned to directed studies on the molecular mechanisms of CI, which we review here.
Affiliation(s)
- Hongli Chen
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06511, USA
- Mengwen Zhang
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06511, USA
  - Department of Chemistry, Yale University, New Haven, CT 06511, USA
- Mark Hochstrasser
  - Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06511, USA
  - Department of Molecular, Cellular, & Developmental Biology, Yale University, New Haven, CT 06511, USA

3. Jana B, Fridman CM, Bosis E, Salomon D. A modular effector with a DNase domain and a marker for T6SS substrates. Nat Commun 2019; 10:3595. [PMID: 31399579 PMCID: PMC6688995 DOI: 10.1038/s41467-019-11546-6] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 01/15/2019] [Accepted: 07/16/2019] [Indexed: 12/30/2022]
Abstract
Bacteria deliver toxic effectors via type VI secretion systems (T6SSs) to dominate competitors, but the identity and function of many effectors remain unknown. Here we identify a Vibrio antibacterial T6SS effector that contains a previously undescribed, widespread DNase toxin domain that we call PoNe (Polymorphic Nuclease effector). PoNe belongs to a diverse superfamily of PD-(D/E)xK phosphodiesterases, and is associated with several toxin delivery systems including type V, type VI, and type VII. PoNe toxicity is antagonized by cognate immunity proteins (PoNi) containing DUF1911 and DUF1910 domains. In addition to PoNe, the effector contains a domain of unknown function (FIX domain) that is also found N-terminal to known toxin domains and is genetically and functionally linked to T6SS. FIX sequences can be used to identify T6SS effector candidates with potentially novel toxin domains. Our findings underline the modular nature of bacterial effectors harboring delivery or marker domains, specific to a secretion system, fused to interchangeable toxins. Bacteria deliver toxic effectors via type VI secretion systems (T6SSs) to dominate competitors. Here, the authors identify a Vibrio antibacterial effector that contains a new DNase toxin domain and a domain of unknown function that can be used as a marker to identify new T6SS effectors.
Affiliation(s)
- Biswanath Jana
  - Department of Clinical Microbiology and Immunology, Sackler Faculty of Medicine, Tel Aviv University, 6997801, Tel Aviv, Israel
- Chaya M Fridman
  - Department of Clinical Microbiology and Immunology, Sackler Faculty of Medicine, Tel Aviv University, 6997801, Tel Aviv, Israel
- Eran Bosis
  - Department of Biotechnology Engineering, ORT Braude College of Engineering, 2161002, Karmiel, Israel
- Dor Salomon
  - Department of Clinical Microbiology and Immunology, Sackler Faculty of Medicine, Tel Aviv University, 6997801, Tel Aviv, Israel

4. Characterization of a DUF820 family protein Alr3200 of the cyanobacterium Anabaena sp. strain PCC7120. J Biosci 2016; 41:589-600. [DOI: 10.1007/s12038-016-9646-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 01/19/2023]

5. Afanas’ev MV, Balakhonov SV, Tokmakova EG, Polovinkina VS, Sidorova EA, Sinkov VV. Analysis of complete sequence of cryptic plasmid pTP33 from Yersinia pestis isolated in Tuva natural focus of plague. Russ J Genet 2016. [DOI: 10.1134/s1022795416090027] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]

6. Johnson PM, Gucinski GC, Garza-Sánchez F, Wong T, Hung LW, Hayes CS, Goulding CW. Functional Diversity of Cytotoxic tRNase/Immunity Protein Complexes from Burkholderia pseudomallei. J Biol Chem 2016; 291:19387-400. [PMID: 27445337 DOI: 10.1074/jbc.m116.736074] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 05/03/2016] [Indexed: 12/23/2022]
Abstract
Contact-dependent growth inhibition (CDI) is a widespread mechanism of inter-bacterial competition. CDI(+) bacteria deploy large CdiA effector proteins, which carry variable C-terminal toxin domains (CdiA-CT). CDI(+) cells also produce CdiI immunity proteins that specifically neutralize cognate CdiA-CT toxins to prevent auto-inhibition. Here, we present the crystal structure of the CdiA-CT/CdiI(E479) toxin/immunity protein complex from Burkholderia pseudomallei isolate E479. The CdiA-CT(E479) tRNase domain contains a core α/β-fold that is characteristic of PD(D/E)XK superfamily nucleases. Unexpectedly, the closest structural homolog of CdiA-CT(E479) is another CDI toxin domain from B. pseudomallei 1026b. Although unrelated in sequence, the two B. pseudomallei nuclease domains share similar folds and active-site architectures. By contrast, the CdiI(E479) and CdiI(1026b) immunity proteins share no significant sequence or structural homology. CdiA-CT(E479) and CdiA-CT(1026b) are both tRNases; however, each nuclease cleaves tRNA at a distinct position. We used a molecular docking approach to model each toxin bound to tRNA substrate. The resulting models fit into electron density envelopes generated by small-angle x-ray scattering analysis of catalytically inactive toxin domains bound stably to tRNA. CdiA-CT(E479) is the third CDI toxin found to have structural homology to the PD(D/E)XK superfamily. We propose that CDI systems exploit the inherent sequence variability and active-site plasticity of PD(D/E)XK nucleases to generate toxin diversity. These findings raise the possibility that many other uncharacterized CDI toxins may belong to the PD(D/E)XK superfamily.
Affiliation(s)
- Fernando Garza-Sánchez
  - Department of Molecular, Cellular and Developmental Biology, University of California at Santa Barbara, Santa Barbara, California 93106-9625
- Timothy Wong
  - Departments of Molecular Biology and Biochemistry
- Li-Wei Hung
  - Physics Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545
- Christopher S Hayes
  - Biomolecular Science and Engineering Program and Department of Molecular, Cellular and Developmental Biology, University of California at Santa Barbara, Santa Barbara, California 93106-9625
- Celia W Goulding
  - Departments of Molecular Biology and Biochemistry and Pharmaceutical Sciences, University of California at Irvine, Irvine, California 92697

7. Lopes-Kulishev CO, Alves IR, Valencia EY, Pidhirnyj MI, Fernández-Silva FS, Rodrigues TR, Guzzo CR, Galhardo RS. Functional characterization of two SOS-regulated genes involved in mitomycin C resistance in Caulobacter crescentus. DNA Repair (Amst) 2015; 33:78-89. [DOI: 10.1016/j.dnarep.2015.06.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/04/2014] [Revised: 06/24/2015] [Accepted: 06/26/2015] [Indexed: 10/23/2022]

8. Hooton SPT, Timms AR, Cummings NJ, Moreton J, Wilson R, Connerton IF. The complete plasmid sequences of Salmonella enterica serovar Typhimurium U288. Plasmid 2014; 76:32-9. [PMID: 25175817 DOI: 10.1016/j.plasmid.2014.08.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 10/24/2013] [Revised: 08/11/2014] [Accepted: 08/21/2014] [Indexed: 12/20/2022]
Abstract
Salmonella enterica Serovar Typhimurium U288 is an emerging pathogen of pigs. The strain contains three plasmids of diverse origin that encode traits that are of concern for food security and safety, these include antibiotic resistant determinants, an array of functions that can modify cell physiology and permit genetic mobility. At 148,711 bp, pSTU288-1 appears to be a hybrid plasmid containing a conglomerate of genes found in pSLT of S. Typhimurium LT2, coupled with a mosaic of horizontally-acquired elements. Class I integron containing gene cassettes conferring resistance against clinically important antibiotics and compounds are present in pSTU288-1. A curious feature of the plasmid involves the deletion of two genes encoded in the Salmonella plasmid virulence operon (spvR and spvA) following the insertion of a tnpA IS26-like element coupled to a blaTEM gene. The spv operon is considered to be a major plasmid-encoded Salmonella virulence factor that is essential for the intracellular lifecycle. The loss of the positive regulator SpvR may impact on the pathogenesis of S. Typhimurium U288. A second 11,067 bp plasmid designated pSTU288-2 contains further antibiotic resistance determinants, as well as replication and mobilization genes. Finally, a small 4675 bp plasmid pSTU288-3 was identified containing mobilization genes and a pleD-like G-G-D/E-E-F conserved domain protein that modulate intracellular levels of cyclic di-GMP, and are associated with motile to sessile transitions in growth.
Affiliation(s)
- Steven P T Hooton
  - Division of Food Sciences, School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough LE12 5RD, UK
- Andrew R Timms
  - Division of Food Sciences, School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough LE12 5RD, UK
- Nicola J Cummings
  - Division of Food Sciences, School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough LE12 5RD, UK
- Joanna Moreton
  - School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Loughborough LE12 5RD, UK
- Ray Wilson
  - DeepSeq, Queens Medical Centre, University of Nottingham, Nottingham NG7 2UH, UK
- Ian F Connerton
  - Division of Food Sciences, School of Biosciences, University of Nottingham, Sutton Bonington Campus, Loughborough LE12 5RD, UK

9. Mukha DV, Pasyukova EG, Kapelinskaya TV, Kagramanova AS. Endonuclease domain of the Drosophila melanogaster R2 non-LTR retrotransposon and related retroelements: a new model for transposition. Front Genet 2013; 4:63. [PMID: 23637706 PMCID: PMC3636483 DOI: 10.3389/fgene.2013.00063] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/11/2012] [Accepted: 04/05/2013] [Indexed: 01/25/2023]
Abstract
The molecular mechanisms of the transposition of non-long terminal repeat (non-LTR) retrotransposons are not well understood; the key questions of how the 3′-ends of cDNA copies integrate and how site-specific integration occurs remain unresolved. Integration depends on properties of the endonuclease (EN) domain of retrotransposons. Using the EN domain of the Drosophila R2 retrotransposon as a model for other, closely related non-LTR retrotransposons, we investigated the EN domain and found that it resembles archaeal Holliday-junction resolvases. We suggest that these non-LTR retrotransposons are co-transcribed with the host transcript. Combined with the proposed resolvase activity of the EN domain, this model yields a novel mechanism for site-specific retrotransposition within this class of retrotransposons, with resolution proceeding via a Holliday junction intermediate.
Affiliation(s)
- Dmitry V Mukha
  - Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia

10. Steczkiewicz K, Muszewska A, Knizewski L, Rychlewski L, Ginalski K. Sequence, structure and functional diversity of PD-(D/E)XK phosphodiesterase superfamily. Nucleic Acids Res 2012; 40:7016-45. [PMID: 22638584 PMCID: PMC3424549 DOI: 10.1093/nar/gks382] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.1] [Indexed: 01/04/2023]
Abstract
Proteins belonging to PD-(D/E)XK phosphodiesterases constitute a functionally diverse superfamily with representatives involved in replication, restriction, DNA repair and tRNA-intron splicing. Their malfunction in humans triggers severe diseases, such as Fanconi anemia and Xeroderma pigmentosum. To date there have been several attempts to identify and classify new PD-(D/E)XK phosphodiesterases using remote homology detection methods. Such efforts are complicated, because the superfamily exhibits extreme sequence and structural divergence. Using advanced homology detection methods supported with superfamily-wide domain architecture and horizontal gene transfer analyses, we provide a comprehensive reclassification of proteins containing a PD-(D/E)XK domain. The PD-(D/E)XK phosphodiesterases span over 21,900 proteins, which can be classified into 121 groups of various families. Eleven of them, including DUF4420, DUF3883, DUF4263, COG5482, COG1395, Tsp45I, HaeII, Eco47II, ScaI, HpaII and Replic_Relax, are newly assigned to the PD-(D/E)XK superfamily. Some groups of PD-(D/E)XK proteins are present in all domains of life, whereas others occur within small numbers of organisms. We observed multiple horizontal gene transfers even between human pathogenic bacteria or from Prokaryota to Eukaryota. Uncommon domain arrangements greatly elaborate the PD-(D/E)XK world. These include domain architectures suggesting regulatory roles in Eukaryotes, like stress sensing and cell-cycle regulation. Our results may inspire further experimental studies aimed at identification of exact biological functions, specific substrates and molecular mechanisms of reactions performed by these highly diverse proteins.
Affiliation(s)
- Kamil Steczkiewicz
  - Laboratory of Bioinformatics and Systems Biology, CENT, University of Warsaw, Zwirki i Wigury 93, 02-089 Warsaw, Poland

11. Zylicz-Stachula A, Zolnierkiewicz O, Lubys A, Ramanauskaite D, Mitkaite G, Bujnicki JM, Skowron PM. Related bifunctional restriction endonuclease-methyltransferase triplets: TspDTI, Tth111II/TthHB27I and TsoI with distinct specificities. BMC Mol Biol 2012; 13:13. [PMID: 22489904 PMCID: PMC3384240 DOI: 10.1186/1471-2199-13-13] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 01/19/2012] [Accepted: 04/10/2012] [Indexed: 01/05/2023]
Abstract
BACKGROUND We previously defined a family of restriction endonucleases (REases) from Thermus sp., which share common biochemical and biophysical features, such as the fusion of both the nuclease and methyltransferase (MTase) activities in a single polypeptide, cleavage at a distance from the recognition site, large molecular size, modulation of activity by S-adenosylmethionine (SAM), and incomplete cleavage of the substrate DNA. Members include related thermophilic REases with five distinct specificities: TspGWI, TaqII, Tth111II/TthHB27I, TspDTI and TsoI. RESULTS TspDTI, TsoI and isoschizomers Tth111II/TthHB27I recognize different, but related sequences: 5'-ATGAA-3', 5'-TARCCA-3' and 5'-CAARCA-3' respectively. Their amino acid sequences are similar, which is unusual among REases of different specificity. To gain insight into this group of REases, TspDTI, the prototype member of the Thermus sp. enzyme family, was cloned and characterized using a recently developed method for partially cleaving REases. CONCLUSIONS TspDTI, TsoI and isoschizomers Tth111II/TthHB27I are closely related bifunctional enzymes. They comprise a tandem arrangement of Type I-like domains, like other Type IIC enzymes (those with a fusion of a REase and MTase domains), e.g. TspGWI, TaqII and MmeI, but their sequences are only remotely similar to these previously characterized enzymes. The characterization of TspDTI, a prototype member of this group, extends our understanding of sequence-function relationships among multifunctional restriction-modification enzymes.

12. Laganeckas M, Margelevicius M, Venclovas C. Identification of new homologs of PD-(D/E)XK nucleases by support vector machines trained on data derived from profile-profile alignments. Nucleic Acids Res 2010; 39:1187-96. [PMID: 20961958 PMCID: PMC3045609 DOI: 10.1093/nar/gkq958] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 01/09/2023]
Abstract
PD-(D/E)XK nucleases, initially represented by only Type II restriction enzymes, now comprise a large and extremely diverse superfamily of proteins. They participate in many different nucleic acids transactions including DNA degradation, recombination, repair and RNA processing. Different PD-(D/E)XK families, although sharing a structurally conserved core, typically display little or no detectable sequence similarity except for the active site motifs. This makes the identification of new superfamily members using standard homology search techniques challenging. To tackle this problem, we developed a method for the detection of PD-(D/E)XK families based on the binary classification of profile–profile alignments using support vector machines (SVMs). Using a number of both superfamily-specific and general features, SVMs were trained to identify true positive alignments of PD-(D/E)XK representatives. With this method we identified several PFAM families of uncharacterized proteins as putative new members of the PD-(D/E)XK superfamily. In addition, we assigned several unclassified restriction enzymes to the PD-(D/E)XK type. Results show that the new method is able to make confident assignments even for alignments that have statistically insignificant scores. We also implemented the method as a freely accessible web server at http://www.ibt.lt/bioinformatics/software/pdexk/.

13. The crystal structure of D212 from Sulfolobus spindle-shaped virus Ragged Hills reveals a new member of the PD-(D/E)XK nuclease superfamily. J Virol 2010; 84:5890-7. [PMID: 20375162 PMCID: PMC2876643 DOI: 10.1128/jvi.01663-09] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 01/07/2023]
Abstract
Structural studies have made significant contributions to our understanding of Sulfolobus spindle-shaped viruses (Fuselloviridae), an important model system for archaeal viruses. Continuing these efforts, we report the structure of D212 from Sulfolobus spindle-shaped virus Ragged Hills. The overall fold and conservation of active site residues place D212 in the PD-(D/E)XK nuclease superfamily. The greatest structural similarity is found to the archaeal Holliday junction cleavage enzymes, strongly suggesting a role in DNA replication, repair, or recombination. Other roles associated with nuclease activity are also considered.

14. Bertrand L, Leiva-Torres GA, Hyjazie H, Pearson A. Conserved residues in the UL24 protein of herpes simplex virus 1 are important for dispersal of the nucleolar protein nucleolin. J Virol 2010; 84:109-18. [PMID: 19864385 PMCID: PMC2798432 DOI: 10.1128/jvi.01428-09] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 07/10/2009] [Accepted: 10/20/2009] [Indexed: 12/13/2022]
Abstract
The UL24 family of proteins is widely conserved among herpesviruses. We demonstrated previously that UL24 of herpes simplex virus 1 (HSV-1) is important for the dispersal of nucleolin from nucleolar foci throughout the nuclei of infected cells. Furthermore, the N-terminal portion of UL24 localizes to nuclei and can disperse nucleolin in the absence of any other viral proteins. In this study, we tested the hypothesis that highly conserved residues in UL24 are important for the ability of the protein to modify the nuclear distribution of nucleolin. We constructed a panel of substitution mutations in UL24 and tested their effects on nucleolin staining patterns. We found that modified UL24 proteins exhibited a range of subcellular distributions. Mutations associated with a wild-type localization pattern for UL24 correlated with high levels of nucleolin dispersal. Interestingly, mutations targeting two regions, namely, within the first homology domain and overlapping or near the previously identified PD-(D/E)XK endonuclease motif, caused the most altered UL24 localization pattern and the most drastic reduction in its ability to disperse nucleolin. Viral mutants corresponding to the substitutions G121A and E99A/K101A both exhibited a syncytial plaque phenotype at 39°C. vUL24-E99A/K101A replicated to lower titers than did vUL24-G121A or KOS. Furthermore, the E99A/K101A mutation caused the greatest impairment of HSV-1-induced dispersal of nucleolin. Our results identified residues in UL24 that are critical for the ability of UL24 to alter nucleoli and further support the notion that the endonuclease motif is important for the function of UL24 during infection.
Affiliation(s)
- Luc Bertrand
  - INRS-Institut Armand-Frappier, Université du Québec, Laval, Québec, Canada
- Huda Hyjazie
  - INRS-Institut Armand-Frappier, Université du Québec, Laval, Québec, Canada
- Angela Pearson
  - INRS-Institut Armand-Frappier, Université du Québec, Laval, Québec, Canada

15. Makarova KS, Wolf YI, Koonin EV. Comprehensive comparative-genomic analysis of type 2 toxin-antitoxin systems and related mobile stress response systems in prokaryotes. Biol Direct 2009; 4:19. [PMID: 19493340 PMCID: PMC2701414 DOI: 10.1186/1745-6150-4-19] [Citation(s) in RCA: 318] [Impact Index Per Article: 21.2] [Received: 05/26/2009] [Accepted: 06/03/2009] [Indexed: 11/13/2022]
Abstract
Background The prokaryotic toxin-antitoxin systems (TAS, also referred to as TA loci) are widespread, mobile two-gene modules that can be viewed as selfish genetic elements because they evolved mechanisms to become addictive for replicons and cells in which they reside, but also possess "normal" cellular functions in various forms of stress response and management of prokaryotic population. Several distinct TAS of type 1, where the toxin is a protein and the antitoxin is an antisense RNA, and numerous, unrelated TAS of type 2, in which both the toxin and the antitoxin are proteins, have been experimentally characterized, and it is suspected that many more remain to be identified. Results We report a comprehensive comparative-genomic analysis of Type 2 toxin-antitoxin systems in prokaryotes. Using sensitive methods for distant sequence similarity search, genome context analysis and a new approach for the identification of mobile two-component systems, we identified numerous, previously unnoticed protein families that are homologous to toxins and antitoxins of known type 2 TAS. In addition, we predict 12 new families of toxins and 13 families of antitoxins, and also, predict a TAS or TAS-like activity for several gene modules that were not previously suspected to function in that capacity. In particular, we present indications that the two-gene module that encodes a minimal nucleotidyl transferase and the accompanying HEPN protein, and is extremely abundant in many archaea and bacteria, especially, thermophiles might comprise a novel TAS. We present a survey of previously known and newly predicted TAS in 750 complete genomes of archaea and bacteria, quantitatively demonstrate the exceptional mobility of the TAS, and explore the network of toxin-antitoxin pairings that combines plasticity with selectivity. 
Conclusion The defining properties of the TAS, namely, the typically small size of the toxin and antitoxin genes, fast evolution, and extensive horizontal mobility, make the task of comprehensive identification of these systems particularly challenging. However, these same properties can be exploited to develop context-based computational approaches which, combined with exhaustive analysis of subtle sequence similarities were employed in this work to substantially expand the current collection of TAS by predicting both previously unnoticed, derived versions of known toxins and antitoxins, and putative novel TAS-like systems. In a broader context, the TAS belong to the resistome domain of the prokaryotic mobilome which includes partially selfish, addictive gene cassettes involved in various aspects of stress response and organized under the same general principles as the TAS. The "selfish altruism", or "responsible selfishness", of TAS-like systems appears to be a defining feature of the resistome and an important characteristic of the entire prokaryotic pan-genome given that in the prokaryotic world the mobilome and the "stable" chromosomes form a dynamic continuum. Reviewers This paper was reviewed by Kenn Gerdes (nominated by Arcady Mushegian), Daniel Haft, Arcady Mushegian, and Andrei Osterman. For full reviews, go to the Reviewers' Reports section.
Affiliation(s)
- Kira S Makarova
- National Center for Biotechnology Information, NLM, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
16
|
Zylicz-Stachula A, Bujnicki JM, Skowron PM. Cloning and analysis of a bifunctional methyltransferase/restriction endonuclease TspGWI, the prototype of a Thermus sp. enzyme family. BMC Mol Biol 2009; 10:52. [PMID: 19480701 PMCID: PMC2700111 DOI: 10.1186/1471-2199-10-52] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 12/31/2008] [Accepted: 05/29/2009] [Indexed: 01/09/2023] Open
Abstract
Background Restriction-modification systems are a diverse class of enzymes. They are classified into four major types: I, II, III and IV. We have previously proposed the existence of a Thermus sp. enzyme family, which belongs to type II restriction endonucleases (REases), however, it features also some characteristics of types I and III. Members include related thermophilic endonucleases: TspGWI, TaqII, TspDTI, and Tth111II. Results Here we describe cloning, mutagenesis and analysis of the prototype TspGWI enzyme that recognises the 5'-ACGGA-3' site and cleaves 11/9 nt downstream. We cloned, expressed, and mutagenised the tspgwi gene and investigated the properties of its product, the bifunctional TspGWI restriction/modification enzyme. Since TspGWI does not cleave DNA completely, a cloning method was devised, based on amino acid sequencing of internal proteolytic fragments. The deduced amino acid sequence of the enzyme shares significant sequence similarity with another representative of the Thermus sp. family – TaqII. Interestingly, these enzymes recognise similar, yet different sequences in the DNA. Both enzymes cleave DNA at the same distance, but differ in their ability to cleave single sites and in the requirement of S-adenosylmethionine as an allosteric activator for cleavage. Both the restriction endonuclease (REase) and methyltransferase (MTase) activities of wild type (wt) TspGWI (either recombinant or isolated from Thermus sp.) are dependent on the presence of divalent cations. Conclusion TspGWI is a bifunctional protein comprising a tandem arrangement of Type I-like domains; particularly noticeable is the central HsdM-like module comprising a helical domain and a highly conserved S-adenosylmethionine-binding/catalytic MTase domain, containing DPAVGTG and NPPY motifs. 
TspGWI also possesses an N-terminal PD-(D/E)XK nuclease domain related to the corresponding domains in HsdR subunits, but lacks the ATP-dependent translocase module of the HsdR subunit and the additional domains that are involved in subunit-subunit interactions in Type I systems. The MTase and REase activities of TspGWI are autonomous and can be uncoupled. Structurally and functionally, the TspGWI protomer appears to be a streamlined 'half' of a Type I enzyme.
Affiliation(s)
- Agnieszka Zylicz-Stachula
- Division of Environmental Molecular Biotechnology, Department of Chemistry, University of Gdansk, Sobieskiego 18, Gdansk 80-952, Poland.
|
17
|
Type II restriction endonuclease R.Hpy188I belongs to the GIY-YIG nuclease superfamily, but exhibits an unusual active site. BMC Struct Biol 2008; 8:48. [PMID: 19014591 PMCID: PMC2630997 DOI: 10.1186/1472-6807-8-48] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 07/06/2008] [Accepted: 11/14/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Catalytic domains of Type II restriction endonucleases (REases) belong to a few unrelated three-dimensional folds. While the PD-(D/E)XK fold is most common among these enzymes, crystal structures have been also determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI). Bioinformatics analyses supported by mutagenesis experiments suggested that some REases belong to the HNH fold (e.g. R.KpnI), and that a small group represented by R.Eco29kI belongs to the GIY-YIG fold. However, for a large fraction of REases with known sequences, the three-dimensional fold and the architecture of the active site remain unknown, mostly due to extreme sequence divergence that hampers detection of homology to enzymes with known folds. RESULTS R.Hpy188I is a Type II REase with unknown structure. PSI-BLAST searches of the non-redundant protein sequence database reveal only 1 homolog (R.HpyF17I, with nearly identical amino acid sequence and the same DNA sequence specificity). Standard application of state-of-the-art protein fold-recognition methods failed to predict the relationship of R.Hpy188I to proteins with known structure or to other protein families. In order to increase the amount of evolutionary information in the multiple sequence alignment, we have expanded our sequence database searches to include sequences from metagenomics projects. This search resulted in identification of 23 further members of R.Hpy188I family, both from metagenomics and the non-redundant database. Moreover, fold-recognition analysis of the extended R.Hpy188I family revealed its relationship to the GIY-YIG domain and allowed for computational modeling of the R.Hpy188I structure. Analysis of the R.Hpy188I model in the light of sequence conservation among its homologs revealed an unusual variant of the active site, in which the typical Tyr residue of the YIG half-motif had been substituted by a Lys residue. 
Moreover, some of its homologs have the otherwise invariant Arg residue in a non-homologous position in sequence that nonetheless allows for spatial conservation of the guanidino group potentially involved in phosphate binding. CONCLUSION The present study eliminates a significant "white spot" on the structural map of REases. It also provides important insight into sequence-structure-function relationships in the GIY-YIG nuclease superfamily. Our results reveal that in the case of proteins with no or few detectable homologs in the standard "non-redundant" database, it is useful to expand this database by adding the metagenomic sequences, which may provide evolutionary linkage to detect more remote homologs.
|
18
|
Orlowski J, Bujnicki JM. Structural and evolutionary classification of Type II restriction enzymes based on theoretical and experimental analyses. Nucleic Acids Res 2008; 36:3552-69. [PMID: 18456708 PMCID: PMC2441816 DOI: 10.1093/nar/gkn175] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.7] [Indexed: 01/05/2023] Open
Abstract
For a very long time, Type II restriction enzymes (REases) have been a paradigm of ORFans: proteins with no detectable similarity to each other and to any other protein in the database, despite common cellular and biochemical function. Crystallographic analyses published until January 2008 provided high-resolution structures for only 28 of 1637 Type II REase sequences available in the Restriction Enzyme database (REBASE). Among these structures, all but two possess catalytic domains with the common PD-(D/E)XK nuclease fold. Two structures are unrelated to the others: R.BfiI exhibits the phospholipase D (PLD) fold, while R.PabI has a new fold termed 'half-pipe'. Thus far, bioinformatic studies supported by site-directed mutagenesis have extended the number of tentatively assigned REase folds to five (now including also GIY-YIG and HNH folds identified earlier in homing endonucleases) and provided structural predictions for dozens of REase sequences without experimentally solved structures. Here, we present a comprehensive study of all Type II REase sequences available in REBASE together with their homologs detectable in the nonredundant and environmental samples databases at the NCBI. We present the summary and critical evaluation of structural assignments and predictions reported earlier, new classification of all REase sequences into families, domain architecture analysis and new predictions of three-dimensional folds. Among 289 experimentally characterized (not putative) Type II REases, whose apparently full-length sequences are available in REBASE, we assign 199 (69%) to contain the PD-(D/E)XK domain. The HNH domain is the second most common, with 24 (8%) members. When putative REases are taken into account, the fraction of PD-(D/E)XK and HNH folds changes to 48% and 30%, respectively. Fifty-six characterized (and 521 predicted) REases remain unassigned to any of the five REase folds identified so far, and may exhibit new architectures. 
These enzymes are proposed as the most interesting targets for structure determination by high-resolution experimental methods. Our analysis provides the first comprehensive map of sequence-structure relationships among Type II REases and will help to focus the efforts of structural and functional genomics of this large and biotechnologically important class of enzymes.
Affiliation(s)
- Jerzy Orlowski
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
|
19
|
Obarska-Kosinska A, Taylor JEN, Callow P, Orlowski J, Bujnicki JM, Kneale GG. HsdR subunit of the type I restriction-modification enzyme EcoR124I: biophysical characterisation and structural modelling. J Mol Biol 2008; 376:438-452. [PMID: 18164032 PMCID: PMC2878639 DOI: 10.1016/j.jmb.2007.11.024] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 10/10/2007] [Revised: 11/08/2007] [Accepted: 11/09/2007] [Indexed: 01/19/2023]
Abstract
Type I restriction-modification (RM) systems are large, multifunctional enzymes composed of three different subunits. HsdS and HsdM form a complex in which HsdS recognizes the target DNA sequence, and HsdM carries out methylation of adenosine residues. The HsdR subunit, when associated with the HsdS-HsdM complex, translocates DNA in an ATP-dependent process and cleaves unmethylated DNA at a distance of several thousand base-pairs from the recognition site. The molecular mechanism by which these enzymes translocate the DNA is not fully understood, in part because of the absence of crystal structures. To date, crystal structures have been determined for the individual HsdS and HsdM subunits and models have been built for the HsdM-HsdS complex with the DNA. However, no structure is available for the HsdR subunit. In this work, the gene coding for the HsdR subunit of EcoR124I was re-sequenced, which showed that there was an error in the published sequence. This changed the position of the stop codon and altered the last 17 amino acid residues of the protein sequence. An improved purification procedure was developed to enable HsdR to be purified efficiently for biophysical and structural analysis. Analytical ultracentrifugation shows that HsdR is monomeric in solution, and the frictional ratio of 1.21 indicates that the subunit is globular and fairly compact. Small angle neutron-scattering of the HsdR subunit indicates a radius of gyration of 3.4 nm and a maximum dimension of 10 nm. We constructed a model of the HsdR using protein fold-recognition and homology modelling to model individual domains, and small-angle neutron scattering data as restraints to combine them into a single molecule. The model reveals an ellipsoidal shape of the enzymatic core comprising the N-terminal and central domains, and suggests conformational heterogeneity of the C-terminal region implicated in binding of HsdR to the HsdS-HsdM complex.
Affiliation(s)
- Agnieszka Obarska-Kosinska
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
- James E N Taylor
- Biophysics Laboratories, Institute of Biomedical and Biomolecular Sciences, University of Portsmouth, PO1 2DT, UK
- Philip Callow
- EPSAM and ISTM Research Institutes, Keele University, Staffordshire ST5 5BG, UK; ILL-EMBL Deuteration Laboratory, Partnership for Structural Biology, Institut Laue Langevin, 38042 Grenoble Cedex 9, Grenoble, France
- Jerzy Orlowski
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
- G Geoff Kneale
- Biophysics Laboratories, Institute of Biomedical and Biomolecular Sciences, University of Portsmouth, PO1 2DT, UK
|
20
|
Functional differentiation of proteins: implications for structural genomics. Structure 2007; 15:405-15. [PMID: 17437713 DOI: 10.1016/j.str.2007.02.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/27/2006] [Revised: 02/15/2007] [Accepted: 02/16/2007] [Indexed: 01/06/2023]
Abstract
Structural genomics is a broad initiative of various centers aiming to provide complete coverage of protein structure space. Because it is not feasible to experimentally determine the structures of all proteins, it is generally agreed that the only viable strategy to achieve such coverage is to carefully select specific proteins (targets), determine their structure experimentally, and then use comparative modeling techniques to model the rest. Here we suggest that structural genomics centers refine the structure-driven approach in target selection by adopting function-based criteria. We suggest targeting functionally divergent superfamilies within a given structural fold so that each function receives a structural characterization. We have developed a method to do so, and an itemized survey of several functionally rich folds shows that they are only partially functionally characterized. We call upon structural genomics centers to consider this approach and upon computational biologists to further develop function-based targeting methods.
|
21
|
Guzzo CR, Nagem RAP, Barbosa JARG, Farah CS. Structure of Xanthomonas axonopodis pv. citri YaeQ reveals a new compact protein fold built around a variation of the PD-(D/E)XK nuclease motif. Proteins 2007; 69:644-51. [PMID: 17623842 DOI: 10.1002/prot.21556] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/29/2023]
Abstract
The YaeQ family of proteins are found in many Gram-negative and a few Gram-positive bacteria. We have determined the first structure of a member of the YaeQ family by X-ray crystallography. Comparisons with other structures indicate that YaeQ represents a new compact protein fold built around a variation of the PD-(D/E)XK nuclease motif found in type II endonucleases and enzymes involved in DNA replication, repair, and recombination. We show that catalytically important residues in the PD-(D/E)XK nuclease superfamily are spatially conserved in YaeQ and other highly conserved YaeQ residues may be poised to interact with nucleic acid structures.
Affiliation(s)
- Cristiane R Guzzo
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, Avenida Prof. Lineu Prestes 748, São Paulo, SP, CEP 05508-000, Brazil
|
22
|
Cymerman IA, Obarska A, Skowronek KJ, Lubys A, Bujnicki JM. Identification of a new subfamily of HNH nucleases and experimental characterization of a representative member, HphI restriction endonuclease. Proteins 2007; 65:867-76. [PMID: 17029241 DOI: 10.1002/prot.21156] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 12/18/2022]
Abstract
The restriction endonuclease (REase) R.HphI is a Type IIS enzyme that recognizes the asymmetric target DNA sequence 5'-GGTGA-3' and in the presence of Mg²⁺ hydrolyzes phosphodiester bonds in both strands of the DNA at a distance of 8 nucleotides towards the 3' side of the target, producing a 1 nucleotide 3'-staggered cut in an unspecified sequence at this position. REases are typically ORFans that exhibit little similarity to each other and to any proteins in the database. However, bioinformatics analyses revealed that R.HphI is a member of a relatively big sequence family with a conserved C-terminal domain and a variable N-terminal domain. We predict that the C-terminal domains of proteins from this family correspond to the nuclease domain of the HNH superfamily rather than to the most common PD-(D/E)XK superfamily of nucleases. We constructed a three-dimensional model of the R.HphI catalytic domain and validated our predictions by site-directed mutagenesis and studies of DNA-binding and catalytic activities of the mutant proteins. We also analyzed the genomic neighborhood of R.HphI homologs and found that putative nucleases accompanied by a DNA methyltransferase (i.e. predicted REases) do not form a single group on a phylogenetic tree, but are dispersed among free-standing putative nucleases. This suggests that nucleases from the HNH superfamily were independently recruited to become REases in the context of RM systems multiple times in evolution and that members of the HNH superfamily may be much more frequent among the so far unassigned REase sequences than previously thought.
Affiliation(s)
- Iwona A Cymerman
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
|
23
|
Chovancová E, Kosinski J, Bujnicki JM, Damborský J. Phylogenetic analysis of haloalkane dehalogenases. Proteins 2007; 67:305-16. [PMID: 17295320 DOI: 10.1002/prot.21313] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.3] [Indexed: 01/30/2023]
Abstract
Haloalkane dehalogenases (HLDs) are enzymes that catalyze the cleavage of carbon-halogen bonds by a hydrolytic mechanism. Although comparative biochemical analyses have been published, no classification system has been proposed for HLDs, to date, that reconciles their phylogenetic and functional relationships. In the study presented here, we have analyzed all sequences and structures of genuine HLDs and their homologs detectable by database searches. Phylogenetic analyses revealed that the HLD family can be divided into three subfamilies denoted HLD-I, HLD-II, and HLD-III, of which HLD-I and HLD-III are predicted to be sister-groups. A mismatch between the HLD protein tree and the tree of species, as well as the presence of more than one HLD gene in a few genomes, suggest that horizontal gene transfers, and perhaps also multiple gene duplications and losses have been involved in the evolution of this family. Most of the biochemically characterized HLDs are found in the HLD-II subfamily. The dehalogenating activity of two members of the newly identified HLD-III subfamily has only recently been confirmed, in a study motivated by this phylogenetic analysis. A novel type of the catalytic pentad (Asp-His-Asp+Asn-Trp) was predicted for members of the HLD-III subfamily. Calculation of the evolutionary rates and lineage-specific innovations revealed a common conserved core as well as a set of residues that characterizes each HLD subfamily. The N-terminal part of the cap domain is one of the most variable regions within the whole family as well as within individual subfamilies, and serves as a preferential site for the location of relatively long insertions. The highest variability of discrete sites was observed among residues that are structural components of the access channels. 
Mutations at these sites modify the anatomy of the channels, which are important for the exchange of ligands between the buried active site and the bulk solvent, thus creating a structural basis for the molecular evolution of new substrate specificities. Our analysis sheds light on the evolutionary history of HLDs and provides a structural framework for designing enzymes with new specificities.
Affiliation(s)
- Eva Chovancová
- Loschmidt Laboratories, Faculty of Science, Masaryk University, Brno, Czech Republic
|
24
|
Tamulaitiene G, Jakubauskas A, Urbanke C, Huber R, Grazulis S, Siksnys V. The crystal structure of the rare-cutting restriction enzyme SdaI reveals unexpected domain architecture. Structure 2006; 14:1389-400. [PMID: 16962970 DOI: 10.1016/j.str.2006.07.002] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 04/24/2006] [Revised: 07/04/2006] [Accepted: 07/05/2006] [Indexed: 01/31/2023]
Abstract
Rare-cutting restriction enzymes are important tools in genome analysis. We report here the crystal structure of SdaI restriction endonuclease, which is specific for the 8 bp sequence CCTGCA/GG ("/" designates the cleavage site). Unlike orthodox Type IIP enzymes, which are single domain proteins, the SdaI monomer is composed of two structural domains. The N domain contains a classical winged helix-turn-helix (wHTH) DNA binding motif, while the C domain shows a typical restriction endonuclease fold. The active site of SdaI is located within the C domain and represents a variant of the canonical PD-(D/E)XK motif. SdaI determinants of sequence specificity are clustered on the recognition helix of the wHTH motif at the N domain. The modular architecture of SdaI, wherein one domain mediates DNA binding while the other domain is predicted to catalyze hydrolysis, distinguishes SdaI from previously characterized restriction enzymes interacting with symmetric recognition sequences.
|
25
|
Koliński A, Bujnicki JM. Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models. Proteins 2006; 61 Suppl 7:84-90. [PMID: 16187348 DOI: 10.1002/prot.20723] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Indexed: 12/23/2022]
Abstract
To predict the tertiary structure of full-length sequences of all targets in CASP6, regardless of their potential category (from easy comparative modeling to fold recognition to apparent new folds) we used a novel combination of two very different approaches developed independently in our laboratories, which ranked quite well in different categories in CASP5. First, the GeneSilico metaserver was used to identify domains, predict secondary structure, and generate fold recognition (FR) alignments, which were converted to full-atom models using the "FRankenstein's Monster" approach for comparative modeling (CM) by recombination of protein fragments. Additional models generated "de novo" by fully automated servers were obtained from the CASP website. All these models were evaluated by VERIFY3D, and residues with scores better than 0.2 were used as a source of spatial restraints. Second, a new implementation of the lattice-based protein modeling tool CABS was used to carry out folding guided by the above-mentioned restraints with the Replica Exchange Monte Carlo sampling technique. Decoys generated in the course of simulation were subject to the average linkage hierarchical clustering. For a representative decoy from each cluster, a full-atom model was rebuilt. Finally, five models were selected for submission based on combination of various criteria, including the size, density, and average energy of the corresponding cluster, and the visual evaluation of the full-atom structures and their relationship to the original templates. The combination of FRankenstein and CABS was one of the best-performing algorithms over all categories in CASP6 (it is important to note that our human intervention was very limited, and all steps in our method can be easily automated). We were able to generate a number of very good models, especially in the Comparative Modeling and New Folds categories. 
Frequently, the best models were closer to the native structure than any of the templates used. The main problem we encountered was in the ranking of the final models (the only step of significant human intervention), due to the insufficient computational power, which precluded the possibility of full-atom refinement and energy-based evaluation.
|
26
|
Dunin-Horkawicz S, Feder M, Bujnicki JM. Phylogenomic analysis of the GIY-YIG nuclease superfamily. BMC Genomics 2006; 7:98. [PMID: 16646971 PMCID: PMC1564403 DOI: 10.1186/1471-2164-7-98] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.3] [Received: 01/29/2006] [Accepted: 04/28/2006] [Indexed: 11/28/2022] Open
Abstract
Background The GIY-YIG domain was initially identified in homing endonucleases and later in other selfish mobile genetic elements (including restriction enzymes and non-LTR retrotransposons) and in enzymes involved in DNA repair and recombination. However, to date no systematic search for novel members of the GIY-YIG superfamily or comparative analysis of these enzymes has been reported. Results We carried out database searches to identify all members of known GIY-YIG nuclease families. Multiple sequence alignments together with predicted secondary structures of identified families were represented as Hidden Markov Models (HMM) and compared by the HHsearch method to the uncharacterized protein families gathered in the COG, KOG, and PFAM databases. This analysis allowed for extending the GIY-YIG superfamily to include members of COG3680 and a number of proteins not classified in COGs and to predict that these proteins may function as nucleases, potentially involved in DNA recombination and/or repair. Finally, all old and new members of the GIY-YIG superfamily were compared and analyzed to infer the phylogenetic tree. Conclusion An evolutionary classification of the GIY-YIG superfamily is presented for the very first time, along with the structural annotation of all (sub)families. It provides a comprehensive picture of sequence-structure-function relationships in this superfamily of nucleases, which will help to design experiments to study the mechanism of action of known members (especially the uncharacterized ones) and will facilitate the prediction of function for the newly discovered ones.
Affiliation(s)
- Stanislaw Dunin-Horkawicz
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
- Marcin Feder
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, 02-109 Warsaw, Poland
|
27
|
Tress ML, Cozzetto D, Tramontano A, Valencia A. An analysis of the Sargasso Sea resource and the consequences for database composition. BMC Bioinformatics 2006; 7:213. [PMID: 16623953 PMCID: PMC1513258 DOI: 10.1186/1471-2105-7-213] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 12/22/2005] [Accepted: 04/19/2006] [Indexed: 01/20/2023] Open
Abstract
Background The environmental sequencing of the Sargasso Sea has introduced a huge new resource of genomic information. Unlike the protein sequences held in the current searchable databases, the Sargasso Sea sequences originate from a single marine environment and have been sequenced from species that are not easily obtainable by laboratory cultivation. The resource also contains very many fragments of whole protein sequences, a side effect of the shotgun sequencing method. These sequences form a significant addendum to the current searchable databases but also present us with some intrinsic difficulties. While it is important to know whether it is possible to assign function to these sequences with the current methods and whether they will increase our capacity to explore sequence space, it is also interesting to know how current bioinformatics techniques will deal with the new sequences in the resource. Results The Sargasso Sea sequences seem to introduce a bias that decreases the potential of current methods to propose structure and function for new proteins. In particular the high proportion of sequence fragments in the resource seems to result in poor quality multiple alignments. Conclusion These observations suggest that the new sequences should be used with care, especially if the information is to be used in large scale analyses. On a positive note, the results may just spark improvements in computational and experimental methods to take into account the fragments generated by environmental sequencing techniques.
Affiliation(s)
- Michael L Tress
- Protein Design Group, CNB-CSIC, Calle Darwin, Cantoblanco 28049 Madrid, Spain
- Domenico Cozzetto
- Department of Biochemical Sciences, University "La Sapienza" Rome, Italy
- Anna Tramontano
- Department of Biochemical Sciences, University "La Sapienza" Rome, Italy
- Alfonso Valencia
- Protein Design Group, CNB-CSIC, Calle Darwin, Cantoblanco 28049 Madrid, Spain
|
28
|
Skowronek KJ, Kosinski J, Bujnicki JM. Theoretical model of restriction endonuclease HpaI in complex with DNA, predicted by fold recognition and validated by site-directed mutagenesis. Proteins 2006; 63:1059-68. [PMID: 16498623 DOI: 10.1002/prot.20920] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 12/20/2022]
Abstract
Type II restriction enzymes are commercially important deoxyribonucleases and very attractive targets for protein engineering of new specificities. At the same time they are a very challenging test bed for protein structure prediction methods. Typically, enzymes that recognize different sequences show little or no amino acid sequence similarity to each other and to other proteins. Because crystallographic analyses revealed the same PD-(D/E)XK fold in more than a dozen cases, they were nevertheless considered to be related, until a combination of bioinformatics and mutational analyses demonstrated that some of these proteins belong to other, unrelated folds: PLD, HNH, and GIY-YIG. As part of a large-scale project aiming at identification of a three-dimensional fold for all type II REases with known sequences (currently approximately 1000 proteins), we carried out preliminary structure prediction and selected candidates for experimental validation. Here, we present the analysis of HpaI REase, an ORFan with no detectable homologs, for which we detected a structural template by protein fold recognition, constructed a model using the FRankenstein monster approach, and identified a number of residues important for DNA binding and catalysis. These predictions were confirmed by site-directed mutagenesis and in vitro analysis of the mutant proteins. The experimentally validated model of HpaI will serve as a low-resolution structural platform for evolutionary considerations in the subgroup of blunt-cutting REases with different specificities. The research protocol developed in the course of this work represents a streamlined version of the previously used techniques and can be used in a high-throughput fashion to build and validate models for other enzymes, especially ORFans that exhibit no sequence similarity to any other protein in the database.
|
29
|
Zhao F, Zhang X, Liang C, Wu J, Bao Q, Qin S. Genome-wide analysis of restriction-modification system in unicellular and filamentous cyanobacteria. Physiol Genomics 2005; 24:181-90. [PMID: 16368872 DOI: 10.1152/physiolgenomics.00255.2005] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 01/02/2023]
Abstract
Cyanobacteria are an ancient group of gram-negative bacteria with strong genome size variation ranging from 1.6 to 9.1 Mb. Here, we first retrieved all the putative restriction-modification (RM) genes in the draft genome of Spirulina and then performed a range of comparative and bioinformatic analyses on RM genes from unicellular and filamentous cyanobacterial genomes. We have identified 6 gene clusters containing putative Type I RMs and 11 putative Type II RMs or the solitary methyltransferases (MTases). RT-PCR analysis reveals that 6 of 18 MTases are not expressed in Spirulina, whereas one hsdM gene, with a mutated cognate hsdS, was detected to be expressed. Our results indicate that the number of RM genes in filamentous cyanobacteria is significantly higher than in unicellular species, and this expansion of RM systems in filamentous cyanobacteria may be related to their wide range of ecological tolerance. Furthermore, a coevolutionary pattern is found between hsdM and hsdR, with a large number of site pairs positively or negatively correlated, indicating the functional importance of these pairing interactions between their tertiary structures. No evidence for positive selection is found for the majority of RMs, e.g., hsdM, hsdS, hsdR, and Type II restriction endonuclease gene families, while a group of MTases exhibit a remarkable signature of adaptive evolution. Sites and genes identified here to have been under positive selection would provide targets for further research on their structural and functional evaluations.
Affiliation(s)
- Fangqing Zhao
- Institute of Oceanology, Chinese Academy of Sciences, Qingdao, China
|
30
|
Armalyte E, Bujnicki JM, Giedriene J, Gasiunas G, Kosiński J, Lubys A. Mva1269I: a monomeric type IIS restriction endonuclease from Micrococcus varians with two EcoRI- and FokI-like catalytic domains. J Biol Chem 2005; 280:41584-94. [PMID: 16223716 DOI: 10.1074/jbc.m506775200] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 01/04/2023]
Abstract
Type II restriction endonuclease Mva1269I recognizes an asymmetric DNA sequence 5'-GAATGCN / -3'/5'-NG / CATTC-3' and cuts the top and bottom DNA strands at the positions indicated by the "/" symbol. Most restriction endonucleases require dimerization to cleave both strands of DNA. We found that Mva1269I is a monomer both in solution and upon binding of cognate DNA. Protein fold-recognition analysis revealed that Mva1269I comprises two "PD-(D/E)XK" domains. The N-terminal domain is related to the 5'-GAATTC-3'-specific restriction endonuclease EcoRI, whereas the C-terminal one resembles the nonspecific nuclease domain of restriction endonuclease FokI. Inactivation of the C-terminal catalytic site transformed Mva1269I into a very active bottom strand-nicking enzyme, whereas mutants in the N-terminal domain nicked the top strand, but only at elevated enzyme concentrations. We found that the cleavage of the bottom strand is a prerequisite for the cleavage of the top strand. We suggest that Mva1269I evolved the ability to recognize and to cleave its asymmetrical target by a fusion of an EcoRI-like domain, which incises the bottom strand within the target, and a FokI-like domain that completes the cleavage within the nonspecific region outside the target sequence. Our results have implications for the molecular evolution of restriction endonucleases, as well as for perspectives of engineering new restriction and nicking enzymes with asymmetric target sites.
Affiliation(s)
- Elena Armalyte
- Institute of Biotechnology, Graiciuno 8, Vilnius LT-02241, Lithuania
|
31
|
Kosinski J, Feder M, Bujnicki JM. The PD-(D/E)XK superfamily revisited: identification of new members among proteins involved in DNA metabolism and functional predictions for domains of (hitherto) unknown function. BMC Bioinformatics 2005; 6:172. [PMID: 16011798 PMCID: PMC1189080 DOI: 10.1186/1471-2105-6-172] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.8] [Received: 04/13/2005] [Accepted: 07/12/2005] [Indexed: 01/02/2023]
Abstract
BACKGROUND The PD-(D/E)XK nuclease superfamily, initially identified in type II restriction endonucleases and later in many enzymes involved in DNA recombination and repair, is one of the most challenging targets for protein sequence analysis and structure prediction. Typically, the sequence similarity between these proteins is so low, that most of the relationships between known members of the PD-(D/E)XK superfamily were identified only after the corresponding structures were determined experimentally. Thus, it is tempting to speculate that among the uncharacterized protein families, there are potential nucleases that remain to be discovered, but their identification requires more sensitive tools than traditional PSI-BLAST searches. RESULTS The low degree of amino acid conservation hampers the possibility of identification of new members of the PD-(D/E)XK superfamily based solely on sequence comparisons to known members. Therefore, we used a recently developed method HHsearch for sensitive detection of remote similarities between protein families represented as profile Hidden Markov Models enhanced by secondary structure. We carried out a comparison of known families of PD-(D/E)XK nucleases to the database comprising the COG and PFAM profiles corresponding to both functionally characterized as well as uncharacterized protein families to detect significant similarities. The initial candidates for new nucleases were subsequently verified by sequence-structure threading, comparative modeling, and identification of potential active site residues. 
CONCLUSION In this article, we report identification of the PD-(D/E)XK nuclease domain in numerous proteins implicated in interactions with DNA but with unknown structure and mechanism of action (such as putative recombinase RmuC, DNA competence factor CoiA, a DNA-binding protein SfsA, a large human protein predicted to be a DNA repair enzyme, predicted archaeal transcription regulators, and the head completion protein of phage T4) and in proteins for which no function was assigned to date (such as YhcG, various phage proteins, novel candidates for restriction enzymes). Our results contribute to the reduction of "white spaces" on the sequence-structure-function map of the protein universe and will help to jump-start the experimental characterization of new nucleases, of which many may be of importance for the complete understanding of mechanisms that govern the evolution and stability of the genome.
Affiliation(s)
- Jan Kosinski
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, PL-02-109 Warsaw, Poland
- Marcin Feder
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Trojdena 4, PL-02-109 Warsaw, Poland
|
32
|
Rigden DJ. An inactivated nuclease-like domain in RecC with novel function: implications for evolution. BMC Struct Biol 2005; 5:9. [PMID: 15985153 PMCID: PMC1185551 DOI: 10.1186/1472-6807-5-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 04/15/2005] [Accepted: 06/28/2005] [Indexed: 02/03/2023]
Abstract
BACKGROUND The PD-(D/E)xK superfamily, containing a wide variety of other exo- and endonucleases, is a notable example of general function conservation in the face of extreme sequence and structural variation. Almost all members employ a small number of shared conserved residues to bind catalytically essential metal ions and thereby effect DNA cleavage. The crystal structure of the RecBCD prokaryotic DNA repair machinery shows that RecB contains such a nuclease domain at its C-terminus. The RecC C-terminal region was reported as having a novel fold. RESULTS The RecC C-terminal region can be divided into an alpha/beta domain and a smaller alpha-helical bundle domain. Here we show that the alpha/beta domain is homologous to the RecB nuclease domain but lacks the features necessary for catalysis. Instead, the domain has a novel function within the nuclease superfamily--providing a hoop through which single-stranded DNA passes. Comparison with other structures of nuclease domains bound to DNA reveals strikingly different modes of ligand binding. The alpha-helical bundle domain contributes the pin which splits the DNA duplex. CONCLUSION The demonstrated homology of RecB and RecC shows how evolution acted to produce the present RecBCD complex through aggregation of new domains as well as functional divergence and structural redeployment of existing domains. Distantly homologous nuclease(-like) domains bind DNA in highly diverse manners.
Affiliation(s)
- Daniel John Rigden
- School of Biological Sciences, University of Liverpool, Crown St., Liverpool L69 7ZB, UK.
|
33
|
Chmiel AA, Radlinska M, Pawlak SD, Krowarsch D, Bujnicki JM, Skowronek KJ. A theoretical model of restriction endonuclease NlaIV in complex with DNA, predicted by fold recognition and validated by site-directed mutagenesis and circular dichroism spectroscopy. Protein Eng Des Sel 2005; 18:181-9. [PMID: 15849215 DOI: 10.1093/protein/gzi019] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 01/10/2023]
Abstract
Restriction enzymes (REases) are commercial reagents commonly used in DNA manipulations and mapping. They are regarded as very attractive models for studying protein-DNA interactions and valuable targets for protein engineering. Their amino acid sequences usually show no similarities to other proteins, with rare exceptions of other REases that recognize identical or very similar sequences. Hence, they are extremely hard targets for structure prediction and modeling. NlaIV is a Type II REase, which recognizes the interrupted palindromic sequence GGNNCC (where N indicates any base) and cleaves it in the middle, leaving blunt ends. NlaIV shows no sequence similarity to other proteins and virtually nothing is known about its sequence-structure-function relationships. Using protein fold recognition, we identified a remote relationship between NlaIV and EcoRV, an extensively studied REase, which recognizes the GATATC sequence and whose crystal structure has been determined. Using the 'FRankenstein's monster' approach we constructed a comparative model of NlaIV based on the EcoRV template and used it to predict the catalytic and DNA-binding residues. The model was validated by site-directed mutagenesis and analysis of the activity of the mutants in vivo and in vitro as well as structural characterization of the wild-type enzyme and two mutants by circular dichroism spectroscopy. The structural model of the NlaIV-DNA complex suggests regions of the protein sequence that may interact with the 'non-specific' bases of the target and thus it provides insight into the evolution of sequence specificity in restriction enzymes and may help engineer REases with novel specificities. Before this analysis was carried out, neither the three-dimensional fold of NlaIV, nor its evolutionary relationships, nor its catalytic or DNA-binding residues were known. 
Hence our analysis may be regarded as a paradigm for studies aiming at reducing 'white spaces' on the evolutionary landscape of sequence-function relationships by combining bioinformatics with simple experimental assays.
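The cleavage rule stated in this abstract (NlaIV recognizes GGNNCC and cuts in the middle, leaving blunt ends) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not a method from the paper: the function names and the example sequences are hypothetical, and only one strand is modeled, which is sufficient here because the site is palindromic and the cut is blunt.

```python
import re

# Recognition site from the abstract: GGNNCC, where N is any base.
# A lookahead is used so that overlapping sites are all reported.
NLAIV_SITE = re.compile(r"(?=GG[ACGT]{2}CC)")


def nlaiv_cut_positions(seq: str) -> list[int]:
    """Return 0-based indices where the sequence is split.

    A cut at index i falls between seq[i-1] and seq[i]. The cut lies in the
    middle of the 6-bp site, i.e. 3 bases after the start of each match.
    """
    seq = seq.upper()
    return [m.start() + 3 for m in NLAIV_SITE.finditer(seq)]


def digest(seq: str) -> list[str]:
    """Split a linear DNA sequence into blunt-ended fragments (hypothetical helper)."""
    seq = seq.upper()
    bounds = [0] + nlaiv_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAGGATCCTT"  # made-up sequence containing one GGNNCC site
    print(nlaiv_cut_positions(example))
    print(digest(example))
```

The sketch ignores everything an actual digest depends on (methylation, reaction conditions, circular templates); it only encodes the recognition pattern and the mid-site blunt cut described above.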
Affiliation(s)
- Agnieszka A Chmiel
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, ul. ks. Trojdena 4, 02-109 Warsaw, Poland
|