1
Ham DT, Browne TS, Banglorewala PN, Wilson TL, Michael RK, Gloor GB, Edgell DR. A generalizable Cas9/sgRNA prediction model using machine transfer learning with small high-quality datasets. Nat Commun 2023; 14:5514. [PMID: 37679324 PMCID: PMC10485023 DOI: 10.1038/s41467-023-41143-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/22/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023] Open
Abstract
The CRISPR/Cas9 nuclease from Streptococcus pyogenes (SpCas9) can be used with single guide RNAs (sgRNAs) as a sequence-specific antimicrobial agent and as a genome-engineering tool. However, current bacterial sgRNA activity models struggle with accurate predictions and do not generalize well, possibly because the underlying datasets used to train the models do not accurately measure SpCas9/sgRNA activity and cannot distinguish on-target cleavage from toxicity. Here, we solve this problem by using a two-plasmid positive selection system to generate high-quality data that more accurately reports on SpCas9/sgRNA cleavage and that separates activity from toxicity. We develop a machine learning architecture (crisprHAL) that can be trained on existing datasets, that shows marked improvements in sgRNA activity prediction accuracy when transfer learning is used with small amounts of high-quality data, and that can generalize predictions to different bacteria. The crisprHAL model recapitulates known SpCas9/sgRNA-target DNA interactions and provides a pathway to a generalizable sgRNA bacterial activity prediction tool that will enable accurate antimicrobial and genome engineering applications.
Affiliation(s)
- Dalton T Ham
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Tyler S Browne
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Pooja N Banglorewala
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Gregory B Gloor
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- David R Edgell
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
2
Christie KA, Guo JA, Silverstein RA, Doll RM, Mabuchi M, Stutzman HE, Lin J, Ma L, Walton RT, Pinello L, Robb GB, Kleinstiver BP. Precise DNA cleavage using CRISPR-SpRYgests. Nat Biotechnol 2023; 41:409-416. [PMID: 36203014 PMCID: PMC10023266 DOI: 10.1038/s41587-022-01492-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/17/2021] [Accepted: 08/31/2022] [Indexed: 11/09/2022]
Abstract
Methods for in vitro DNA cleavage and molecular cloning remain unable to precisely cleave DNA directly adjacent to bases of interest. Restriction enzymes (REs) must bind specific motifs, whereas wild-type CRISPR-Cas9 or CRISPR-Cas12 nucleases require protospacer adjacent motifs (PAMs). Here we explore the utility of our previously reported near-PAMless SpCas9 variant, named SpRY, to serve as a universal DNA cleavage tool for various cloning applications. By performing SpRY DNA digests (SpRYgests) using more than 130 guide RNAs (gRNAs) sampling a wide diversity of PAMs, we discovered that SpRY is PAMless in vitro and can cleave DNA at practically any sequence, including sites refractory to cleavage with wild-type SpCas9. We illustrate the versatility and effectiveness of SpRYgests to improve the precision of several cloning workflows, including those not possible with REs or canonical CRISPR nucleases. We also optimize a rapid and simple one-pot gRNA synthesis protocol to streamline SpRYgest implementation. Together, SpRYgests can improve various DNA engineering applications that benefit from precise DNA breaks.
Affiliation(s)
- Kathleen A Christie
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Jimmy A Guo
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Biological and Biomedical Sciences Program, Harvard University, Boston, MA, USA
- Rachel A Silverstein
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Biological and Biomedical Sciences Program, Harvard University, Boston, MA, USA
- Roman M Doll
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Molecular Biosciences/Cancer Biology Program, Heidelberg University and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Hannah E Stutzman
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Jiecong Lin
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Massachusetts General Hospital, Boston, MA, USA
  - Center for Cancer Research, Massachusetts General Hospital Charlestown, Boston, MA, USA
- Linyuan Ma
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Russell T Walton
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Luca Pinello
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Massachusetts General Hospital, Boston, MA, USA
  - Center for Cancer Research, Massachusetts General Hospital Charlestown, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Benjamin P Kleinstiver
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
3
Lacoursiere RE, Shaw GS. Acetylated Ubiquitin Modulates the Catalytic Activity of the E1 Enzyme Uba1. Biochemistry 2021; 60:1276-1285. [PMID: 33848125 DOI: 10.1021/acs.biochem.1c00145] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]
Abstract
Ubiquitin (Ub) signaling requires the covalent passage of Ub among E1, E2, and E3 enzymes. The choice of E2 and E3 enzymes combined with multiple rounds of the cascade leads to the formation of polyubiquitin chains linked through any one of the seven lysines on Ub. The linkage type and length act as a signal to trigger important cellular processes such as protein degradation or the DNA damage response. Recently, proteomics studies have identified that Ub can be acetylated at six of its seven lysine residues under various cell stress conditions. To understand the potential differences in Ub signaling caused by acetylation, we synthesized all possible acetylated ubiquitin (acUb) variants and examined the E1-mediated formation of the corresponding E2∼acUb conjugates in vitro using kinetic methods. A Förster resonance energy transfer assay was optimized in which the Ub constructs were labeled with a CyPet fluorophore and the E2 UBE2D1 was labeled with a YPet fluorophore to monitor the formation of E2∼Ub conjugates. Our methods enable the detection of small differences that may otherwise be concealed in steady-state ubiquitination experiments. We determined that Ub, acetylated at K11, K27, K33, K48, or K63, has altered turnover numbers for E2∼Ub conjugate formation by the E1 enzyme Uba1. This work provides evidence that acetylation of Ub can alter the catalysis of ubiquitination early on in the pathway.
Affiliation(s)
- Gary S Shaw
  - Department of Biochemistry, Western University, London, Ontario N6A 5C1, Canada
4
Roy AC, Wilson GG, Edgell DR. Perpetuating the homing endonuclease life cycle: identification of mutations that modulate and change I-TevI cleavage preference. Nucleic Acids Res 2016; 44:7350-9. [PMID: 27387281 PMCID: PMC5009752 DOI: 10.1093/nar/gkw614] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/30/2016] [Accepted: 06/27/2016] [Indexed: 11/14/2022] Open
Abstract
Homing endonucleases are sequence-tolerant DNA endonucleases that act as mobile genetic elements. The ability of homing endonucleases to cleave substrates with multiple nucleotide substitutions suggests a high degree of adaptability in that changing or modulating cleavage preference would require relatively few amino acid substitutions. Here, using directed evolution experiments with the GIY-YIG homing endonuclease I-TevI that targets the thymidylate synthase gene of phage T4, we readily isolated variants that dramatically broadened I-TevI cleavage preference, as well as variants that fine-tuned cleavage preference. By combining substitutions, we observed an ∼10 000-fold improvement in cleavage on some substrates not cleaved by the wild-type enzyme, correlating with a decrease in readout of information content at the cleavage site. Strikingly, we were able to change the cleavage preference of I-TevI to that of the isoschizomer I-BmoI which targets a different cleavage site in the thymidylate synthase gene, recapitulating the evolution of cleavage preference in this family of homing endonucleases. Our results define a strategy to isolate GIY-YIG nuclease domains with distinct cleavage preferences, and provide insight into how homing endonucleases may escape a dead-end life cycle in a population of saturated target sites by promoting transposition to different target sites.
Affiliation(s)
- Alexander C Roy
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
- David R Edgell
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
5
Kleinstiver BP, Prew MS, Tsai SQ, Nguyen NT, Topkar VV, Zheng Z, Joung JK. Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition. Nat Biotechnol 2015; 33:1293-1298. [PMID: 26524662 DOI: 10.1038/nbt.3404] [Citation(s) in RCA: 429] [Impact Index Per Article: 47.7] [Received: 09/01/2015] [Accepted: 10/14/2015] [Indexed: 12/26/2022]
Abstract
CRISPR-Cas9 nucleases target specific DNA sequences using a guide RNA but also require recognition of a protospacer adjacent motif (PAM) by the Cas9 protein. Although longer PAMs can potentially improve the specificity of genome editing, they limit the range of sequences that Cas9 orthologs can target. One potential strategy to relieve this restriction is to relax the PAM recognition specificity of Cas9. Here we used molecular evolution to modify the NNGRRT PAM of Staphylococcus aureus Cas9 (SaCas9). One variant we identified, referred to as KKH SaCas9, showed robust genome editing activities at endogenous human target sites with NNNRRT PAMs, thereby increasing SaCas9 targeting range by two- to fourfold. Using GUIDE-seq, we show that wild-type and KKH SaCas9 induce comparable numbers of off-target effects in human cells. Our strategy for evolving PAM specificity does not require structural information and therefore should be applicable to a wide range of Cas9 orthologs.
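The effect of relaxing a PAM from NNGRRT to NNNRRT can be illustrated with a minimal sketch that expands IUPAC codes into a regular expression and lists matching positions on one strand. The function names and the toy sequence are hypothetical, and this is not the analysis used in the paper. In uniformly random sequence, NNGRRT matches 1/64 of 6-mers and NNNRRT matches 1/16, which is consistent with the upper end of the reported two- to fourfold increase.

```python
import re

# IUPAC degenerate nucleotide codes needed for the two PAM patterns.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def pam_regex(pam):
    """Translate an IUPAC PAM string such as 'NNGRRT' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in pam))


def pam_sites(seq, pam):
    """Return 0-based start positions of all (overlapping) PAM matches in seq."""
    pattern = pam_regex(pam)
    width = len(pam)
    return [i for i in range(len(seq) - width + 1)
            if pattern.fullmatch(seq, i, i + width)]


# Toy sequence (hypothetical), scanned on the given strand only.
toy = "ACGAGTTTAAATCC"
wild_type = pam_sites(toy, "NNGRRT")  # wild-type SaCas9 PAM
relaxed = pam_sites(toy, "NNNRRT")    # relaxed PAM of the KKH variant
print(wild_type, relaxed)
```

Every NNGRRT site is also an NNNRRT site, so the relaxed pattern can only add candidate positions. A real target search would also scan the reverse complement strand.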
Affiliation(s)
- Benjamin P Kleinstiver
  - Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts, USA
- Michelle S Prew
  - Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts, USA
- Shengdar Q Tsai
  - Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts, USA
- Nhu T Nguyen
  - Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts, USA
- Ved V Topkar
  - Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts, USA
- Zongli Zheng
  - Department of Biomedical Sciences, City University of Hong Kong, Hong Kong, China
- J Keith Joung
  - Molecular Pathology Unit, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts, USA
6
Kleinstiver BP, Prew MS, Tsai SQ, Topkar VV, Nguyen NT, Zheng Z, Gonzales APW, Li Z, Peterson RT, Yeh JRJ, Aryee MJ, Joung JK. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 2015; 523:481-5. [PMID: 26098369 PMCID: PMC4540238 DOI: 10.1038/nature14592] [Citation(s) in RCA: 1152] [Impact Index Per Article: 128.0] [Received: 03/01/2015] [Accepted: 06/01/2015] [Indexed: 12/25/2022]
Abstract
Although CRISPR-Cas9 nucleases are widely used for genome editing [1, 2], the range of sequences that Cas9 can recognize is constrained by the need for a specific protospacer adjacent motif (PAM) [3–6]. As a result, it can often be difficult to target double-stranded breaks (DSBs) with the precision that is necessary for various genome editing applications. The ability to engineer Cas9 derivatives with purposefully altered PAM specificities would address this limitation. Here we show that the commonly used Streptococcus pyogenes Cas9 (SpCas9) can be modified to recognize alternative PAM sequences using structural information, bacterial selection-based directed evolution, and combinatorial design. These altered PAM specificity variants enable robust editing of endogenous gene sites in zebrafish and human cells not currently targetable by wild-type SpCas9, and their genome-wide specificities are comparable to wild-type SpCas9 as judged by GUIDE-Seq analysis [7]. In addition, we identified and characterized another SpCas9 variant that exhibits improved specificity in human cells, possessing better discrimination against off-target sites with non-canonical NAG and NGA PAMs and/or mismatched spacers. We also found that two smaller-size Cas9 orthologues, Streptococcus thermophilus Cas9 (St1Cas9) and Staphylococcus aureus Cas9 (SaCas9), function efficiently in the bacterial selection systems and in human cells, suggesting that our engineering strategies could be extended to Cas9s from other species. Our findings provide broadly useful SpCas9 variants and, more importantly, establish the feasibility of engineering a wide range of Cas9s with altered and improved PAM specificities.
Affiliation(s)
- Benjamin P Kleinstiver
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Michelle S Prew
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
- Shengdar Q Tsai
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
- Ved V Topkar
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
- Nhu T Nguyen
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
- Zongli Zheng
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm SE-171 77, Sweden
- Andrew P W Gonzales
  - Cardiovascular Research Center, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Broad Institute, Cambridge, Massachusetts 02142, USA
- Zhuyun Li
  - Cardiovascular Research Center, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
- Randall T Peterson
  - Cardiovascular Research Center, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Systems Biology, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Broad Institute, Cambridge, Massachusetts 02142, USA
- Jing-Ruey Joanna Yeh
  - Cardiovascular Research Center, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Medicine, Harvard Medical School, Boston, Massachusetts 02115, USA
- Martin J Aryee
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02115, USA
- J Keith Joung
  - Molecular Pathology Unit & Center for Cancer Research, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Charlestown, Massachusetts 02129, USA
  - Department of Pathology, Harvard Medical School, Boston, Massachusetts 02115, USA
7
Wolfs JM, DaSilva M, Meister SE, Wang X, Schild-Poulter C, Edgell DR. MegaTevs: single-chain dual nucleases for efficient gene disruption. Nucleic Acids Res 2014; 42:8816-29. [PMID: 25013171 PMCID: PMC4117789 DOI: 10.1093/nar/gku573] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/12/2023] Open
Abstract
Targeting gene disruptions in complex genomes relies on imprecise repair by the non-homologous end-joining DNA pathway, creating mutagenic insertions or deletions (indels) at the break point. DNA end-processing enzymes are often co-expressed with genome-editing nucleases to enhance the frequency of indels, as the compatible cohesive ends generated by the nucleases can be precisely repaired, leading to a cycle of cleavage and non-mutagenic repair. Here, we present an alternative strategy to bias repair toward gene disruption by fusing two different nuclease active sites from I-TevI (a GIY-YIG enzyme) and I-OnuI E2 (an engineered meganuclease) into a single polypeptide chain. In vitro, the MegaTev enzyme generates two double-strand breaks to excise an intervening 30-bp fragment. In HEK 293 cells, we observe a high frequency of gene disruption without co-expression of DNA end-processing enzymes. Deep sequencing of disrupted target sites revealed minimal processing, consistent with the MegaTev sequestering the double-strand breaks from the DNA repair machinery. Off-target profiling revealed no detectable cleavage at sites where the I-TevI CNNNG cleavage motif is not appropriately spaced from the I-OnuI binding site. The MegaTev enzyme represents a small, programmable nuclease platform for extremely specific genome-engineering applications.
Affiliation(s)
- Jason M Wolfs
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
- Matthew DaSilva
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
- Sarah E Meister
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
- Xu Wang
  - Robarts Research Institute, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5B7, Canada
- Caroline Schild-Poulter
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
  - Robarts Research Institute, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5B7, Canada
- David R Edgell
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, ON, N6A 5C1, Canada
8
Abstract
Positions in a protein are thought to coevolve to maintain important structural and functional interactions over evolutionary time. The detection of putative coevolving positions can provide important new insights into a protein family in the same way that knowledge is gained by recognizing evolutionarily conserved characters and characteristics. Putatively coevolving positions can be detected with statistical methods that identify covarying positions. However, positions in protein alignments can covary for many other reasons than coevolution; thus, it is crucial to create high-quality multiple sequence alignments for coevolution inference. Furthermore, it is important to understand common signs and sources of error. When confounding factors are accounted for, coevolution is a rich resource for protein engineering information.
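As a concrete illustration of a covariation statistic, the sketch below computes the mutual information between two columns of a toy alignment. Mutual information is one common choice and is used here as an assumption; the abstract does not name a specific statistic, and the alignment and function names are hypothetical. The sketch applies none of the corrections for phylogeny, alignment error, or sampling that the abstract warns are needed before covariation is read as coevolution.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Extract one alignment column as a list of residues."""
    return [seq[index] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy alignment (hypothetical): columns 1 and 2 change together,
# column 0 is fully conserved, column 3 varies independently of column 1.
toy_alignment = ["AKLE", "AKLD", "ARIE", "ARID"]

covarying = mutual_information(column(toy_alignment, 1), column(toy_alignment, 2))
independent = mutual_information(column(toy_alignment, 1), column(toy_alignment, 3))
conserved = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
print(covarying, independent, conserved)
```

A fully conserved column carries no covariation signal under this statistic, which is one reason conservation and covariation give complementary information about an alignment.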
9
Kleinstiver BP, Wolfs JM, Edgell DR. The monomeric GIY-YIG homing endonuclease I-BmoI uses a molecular anchor and a flexible tether to sequentially nick DNA. Nucleic Acids Res 2013; 41:5413-27. [PMID: 23558745 PMCID: PMC3664794 DOI: 10.1093/nar/gkt186] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
Abstract
The GIY-YIG nuclease domain is found within protein scaffolds that participate in diverse cellular pathways and contains a single active site that hydrolyzes DNA by a one-metal ion mechanism. GIY-YIG homing endonucleases (GIY-HEs) are two-domain proteins with N-terminal GIY-YIG nuclease domains connected to C-terminal DNA-binding domains, and they are thought to function as monomers. Using I-BmoI as a model GIY-HE, we test mechanisms by which the single active site is used to generate a double-strand break. We show that I-BmoI is partially disordered in the absence of substrate, and that the GIY-YIG domain alone has weak affinity for DNA. Significantly, we show that I-BmoI functions as a monomer at all steps of the reaction pathway and does not transiently dimerize or use sequential transesterification reactions to cleave substrate. Our results are consistent with the I-BmoI DNA-binding domain acting as a molecular anchor to tether the GIY-YIG domain to substrate, permitting rotation of the GIY-YIG domain to sequentially nick each DNA strand. These data highlight the mechanistic differences between monomeric GIY-HEs and dimeric or tetrameric GIY-YIG restriction enzymes, and they have implications for the use of the GIY-YIG domain in genome-editing applications.
Affiliation(s)
- Benjamin P Kleinstiver
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, Western University, London, Ontario N6A 5C1, Canada
10
Dickson RJ, Gloor GB. Protein sequence alignment analysis by local covariation: coevolution statistics detect benchmark alignment errors. PLoS One 2012; 7:e37645. [PMID: 22715369 PMCID: PMC3371027 DOI: 10.1371/journal.pone.0037645] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/30/2011] [Accepted: 04/26/2012] [Indexed: 11/19/2022] Open
Abstract
The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criteria: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files/
Affiliation(s)
- Gregory B. Gloor
  - Department of Biochemistry, The University of Western Ontario, London, Canada
11
Abstract
Targeted manipulation of complex genomes often requires the introduction of a double-strand break at defined locations by site-specific DNA endonucleases. Here, we describe a monomeric nuclease domain derived from GIY-YIG homing endonucleases for genome-editing applications. Fusion of the GIY-YIG nuclease domain to three-member zinc-finger DNA binding domains generated chimeric GIY-zinc finger endonucleases (GIY-ZFEs). Significantly, the I-TevI-derived fusions (Tev-ZFEs) function in vitro as monomers to introduce a double-strand break, and discriminate in vitro and in bacterial and yeast assays against substrates lacking a preferred 5'-CNNNG-3' cleavage motif. The Tev-ZFEs function to induce recombination in a yeast-based assay with activity on par with a homodimeric Zif268 zinc-finger nuclease. We also fused the I-TevI nuclease domain to a catalytically inactive LAGLIDADG homing endonuclease (LHE) scaffold. The monomeric Tev-LHEs are active in vivo and similarly discriminate against substrates lacking the 5'-CNNNG-3' motif. The monomeric Tev-ZFEs and Tev-LHEs are distinct from the FokI-derived zinc-finger nuclease and TAL effector nuclease platforms as the GIY-YIG domain alleviates the requirement to design two nuclease fusions to target a given sequence, highlighting the diversity of nuclease domains with distinctive biochemical properties suitable for genome-editing applications.
12
Kleinstiver BP, Bérubé-Janzen W, Fernandes AD, Edgell DR. Divalent metal ion differentially regulates the sequential nicking reactions of the GIY-YIG homing endonuclease I-BmoI. PLoS One 2011; 6:e23804. [PMID: 21887323 PMCID: PMC3161791 DOI: 10.1371/journal.pone.0023804] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/29/2011] [Accepted: 07/26/2011] [Indexed: 01/30/2023] Open
Abstract
Homing endonucleases are site-specific DNA endonucleases that function as mobile genetic elements by introducing double-strand breaks or nicks at defined locations. Of the major families of homing endonucleases, the modular GIY-YIG endonucleases are least understood in terms of mechanism. The GIY-YIG homing endonuclease I-BmoI generates a double-strand break by sequential nicking reactions during which the single active site of the GIY-YIG nuclease domain must undergo a substantial reorganization. Here, we show that divalent metal ion plays a significant role in regulating the two independent nicking reactions by I-BmoI. Rate constant determination for each nicking reaction revealed that limiting divalent metal ion has a greater impact on the second strand than the first strand nicking reaction. We also show that substrate mutations within the I-BmoI cleavage site can modulate the first strand nicking reaction over a 314-fold range. Additionally, in-gel DNA footprinting with mutant substrates and modeling of an I-BmoI-substrate complex suggest that amino acid contacts to a critical GC-2 base pair are required to induce a bottom-strand distortion that likely directs conformational changes for reaction progress. Collectively, our data imply mechanistic roles for divalent metal ion and substrate bases, suggesting that divalent metal ion facilitates the re-positioning of the GIY-YIG nuclease domain between sequential nicking reactions.
Collapse
Affiliation(s)
- Benjamin P. Kleinstiver
- Department of Biochemistry, Schulich School of Medicine & Dentistry, The University of Western Ontario, London, Ontario, Canada
- Wesley Bérubé-Janzen
- Department of Biochemistry, Schulich School of Medicine & Dentistry, The University of Western Ontario, London, Ontario, Canada
- Andrew D. Fernandes
- Department of Biochemistry, Schulich School of Medicine & Dentistry, The University of Western Ontario, London, Ontario, Canada
- Department of Applied Mathematics, The University of Western Ontario, London, Ontario, Canada
- David R. Edgell
- Department of Biochemistry, Schulich School of Medicine & Dentistry, The University of Western Ontario, London, Ontario, Canada
|
13
|
Fernandes AD, Kleinstiver BP, Edgell DR, Wahl LM, Gloor GB. Estimating the evidence of selection and the reliability of inference in unigenic evolution. Algorithms Mol Biol 2010; 5:35. [PMID: 21059250 PMCID: PMC2994857 DOI: 10.1186/1748-7188-5-35] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/06/2010] [Accepted: 11/08/2010] [Indexed: 11/10/2022]
Abstract
Background: Unigenic evolution is a large-scale mutagenesis experiment used to identify residues that are potentially important for protein function. Both currently-used methods for the analysis of unigenic evolution data analyze 'windows' of contiguous sites, a strategy that increases statistical power but incorrectly assumes that functionally-critical sites are contiguous. In addition, both methods require the questionable assumption of asymptotically-large sample size due to the presumption of approximate normality.
Results: We develop a novel approach, termed the Evidence of Selection (EoS), removing the assumption that functionally important sites are adjacent in sequence and explicitly modelling the effects of limited sample-size. Precise statistical derivations show that the EoS score can be easily interpreted as an expected log-odds-ratio between two competing hypotheses, namely, the hypothetical presence or absence of functional selection for a given site. Using the EoS score, we then develop selection criteria by which functionally-important yet non-adjacent sites can be identified. An approximate power analysis is also developed to estimate the reliability of inference given the data. We validate and demonstrate the practical utility of our method by analysis of the homing endonuclease I-BmoI, comparing our predictions with the results of existing methods.
Conclusions: Our method is able to assess both the evidence of selection at individual amino acid sites and estimate the reliability of those inferences. Experimental validation with I-BmoI proves its utility to identify functionally-important residues of poorly characterized proteins, demonstrating increased sensitivity over previous methods without loss of specificity. With the ability to guide the selection of precise experimental mutagenesis conditions, our method helps make unigenic analysis a more broadly applicable technique with which to probe protein function.
Availability: Software to compute, plot, and summarize EoS data is available as an open-source package called 'unigenic' for the 'R' programming language at http://www.fernandes.org/txp/article/13/an-analytical-framework-for-unigenic-evolution.
|
14
|
Chan SH, Stoddard BL, Xu SY. Natural and engineered nicking endonucleases--from cleavage mechanism to engineering of strand-specificity. Nucleic Acids Res 2010; 39:1-18. [PMID: 20805246 PMCID: PMC3017599 DOI: 10.1093/nar/gkq742] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.0] [Indexed: 01/11/2023]
Abstract
Restriction endonucleases (REases) are highly specific DNA scissors that have facilitated the development of modern molecular biology. Intensive studies of double strand (ds) cleavage activity of Type IIP REases, which recognize 4–8 bp palindromic sequences, have revealed a variety of mechanisms of molecular recognition and catalysis. Less well-studied are REases which cleave only one of the strands of dsDNA, creating a nick instead of a ds break. Naturally occurring nicking endonucleases (NEases) range from frequent cutters such as Nt.CviPII (^CCD; ^ denotes the cleavage site) to rare-cutting homing endonucleases (HEases) such as I-HmuI. In addition to these bona fide NEases, individual subunits of some heterodimeric Type IIS REases have recently been shown to be natural NEases. The discovery and characterization of more REases that recognize asymmetric sequences, particularly Types IIS and IIA REases, has revealed recognition and cleavage mechanisms drastically different from the canonical Type IIP mechanisms, and has allowed researchers to engineer highly strand-specific NEases. Monomeric LAGLIDADG HEases use two separate catalytic sites for cleavage. Exploitation of this characteristic has also resulted in useful nicking HEases. This review aims at providing an overview of the cleavage mechanisms of Types IIS and IIA REases and LAGLIDADG HEases, the engineering of their nicking variants, and the applications of NEases and nicking HEases.
|
15
|
Dickson RJ, Wahl LM, Fernandes AD, Gloor GB. Identifying and seeing beyond multiple sequence alignment errors using intra-molecular protein covariation. PLoS One 2010; 5:e11082. [PMID: 20596526 PMCID: PMC2893159 DOI: 10.1371/journal.pone.0011082] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 03/12/2010] [Accepted: 05/17/2010] [Indexed: 11/23/2022]
Abstract
Background: There is currently no way to verify the quality of a multiple sequence alignment that is independent of the assumptions used to build it. Sequence alignments are typically evaluated by a number of established criteria: sequence conservation, the number of aligned residues, the frequency of gaps, and the probable correct gap placement. Covariation analysis is used to find putatively important residue pairs in a sequence alignment. Different alignments of the same protein family give different results, demonstrating that covariation depends on the quality of the sequence alignment. We thus hypothesized that current criteria are insufficient to build alignments for use with covariation analyses.
Methodology/Principal Findings: We show that current criteria are insufficient to build alignments for use with covariation analyses, as systematic sequence alignment errors are present even in hand-curated structure-based alignment datasets like those from the Conserved Domain Database. We show that current non-parametric covariation statistics are sensitive to sequence misalignments and that this sensitivity can be used to identify systematic alignment errors. We demonstrate that removing alignment errors due to 1) improper structure alignment, 2) the presence of paralogous sequences, and 3) partial or otherwise erroneous sequences, improves contact prediction by covariation analysis. Finally, we describe two non-parametric covariation statistics that are less sensitive to sequence alignment errors than those described previously in the literature.
Conclusions/Significance: Protein alignments with errors lead to false positive and false negative conclusions (incorrect assignment of covariation and conservation, respectively). Covariation analysis can provide a verification step, independent of traditional criteria, to identify systematic misalignments in protein alignments. Two non-parametric statistics are shown to be somewhat insensitive to misalignment errors, providing increased confidence in contact prediction when analyzing alignments with erroneous regions because they emphasize pairwise covariation over group covariation.
Affiliation(s)
- Russell J. Dickson
- Department of Biochemistry, The University of Western Ontario, London, Canada
- Lindi M. Wahl
- Department of Applied Mathematics, The University of Western Ontario, London, Canada
- Andrew D. Fernandes
- Department of Biochemistry, The University of Western Ontario, London, Canada
- Department of Applied Mathematics, The University of Western Ontario, London, Canada
- Gregory B. Gloor
- Department of Biochemistry, The University of Western Ontario, London, Canada
|