1
Ham DT, Browne TS, Banglorewala PN, Wilson TL, Michael RK, Gloor GB, Edgell DR. A generalizable Cas9/sgRNA prediction model using machine transfer learning with small high-quality datasets. Nat Commun 2023; 14:5514. [PMID: 37679324 PMCID: PMC10485023 DOI: 10.1038/s41467-023-41143-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/22/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023] Open
Abstract
The CRISPR/Cas9 nuclease from Streptococcus pyogenes (SpCas9) can be used with single guide RNAs (sgRNAs) as a sequence-specific antimicrobial agent and as a genome-engineering tool. However, current bacterial sgRNA activity models struggle with accurate predictions and do not generalize well, possibly because the underlying datasets used to train the models do not accurately measure SpCas9/sgRNA activity and cannot distinguish on-target cleavage from toxicity. Here, we solve this problem by using a two-plasmid positive selection system to generate high-quality data that more accurately reports on SpCas9/sgRNA cleavage and that separates activity from toxicity. We develop a machine learning architecture (crisprHAL) that can be trained on existing datasets, that shows marked improvements in sgRNA activity prediction accuracy when transfer learning is used with small amounts of high-quality data, and that can generalize predictions to different bacteria. The crisprHAL model recapitulates known SpCas9/sgRNA-target DNA interactions and provides a pathway to a generalizable sgRNA bacterial activity prediction tool that will enable accurate antimicrobial and genome engineering applications.
Affiliation(s)
- Dalton T Ham
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Tyler S Browne
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Pooja N Banglorewala
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Gregory B Gloor
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- David R Edgell
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
2
Huszár K, Welker Z, Györgypál Z, Tóth E, Ligeti Z, Kulcsár P, Dancsó J, Tálas A, Krausz S, Varga É, Welker E. Position-dependent sequence motif preferences of SpCas9 are largely determined by scaffold-complementary spacer motifs. Nucleic Acids Res 2023; 51:5847-5863. [PMID: 37140059 PMCID: PMC10287927 DOI: 10.1093/nar/gkad323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2022] [Revised: 04/04/2023] [Accepted: 05/02/2023] [Indexed: 05/05/2023] Open
Abstract
Streptococcus pyogenes Cas9 (SpCas9) nuclease exhibits considerable position-dependent sequence preferences. The reason behind these preferences is not well understood and is difficult to rationalise, since the protein establishes interactions with the target-spacer duplex in a sequence-independent manner. We revealed here that intramolecular interactions within the single guide RNA (sgRNA), between the spacer and the scaffold, cause most of these preferences. By using in cellulo and in vitro SpCas9 activity assays with systematically designed spacer and scaffold sequences and by analysing activity data from a large SpCas9 sequence library, we show that some long (>8 nucleotides) spacer motifs, that are complementary to the RAR unit of the scaffold, interfere with sgRNA loading, and that some motifs of more than 4 nucleotides, that are complementary to the SL1 unit, inhibit DNA binding and cleavage. Furthermore, we show that intramolecular interactions are present in the majority of the inactive sgRNA sequences of the library, suggesting that they are the most important intrinsic determinants of the activity of the SpCas9 ribonucleoprotein complex. We also found that in pegRNAs, sequences at the 3' extension of the sgRNA that are complementary to the SL2 unit are also inhibitory to prime editing, but not to the nuclease activity of SpCas9.
Affiliation(s)
- Krisztina Huszár
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Department of Genetics, Doctoral School of Biology, Faculty of Science, Eötvös Loránd University, Budapest, H-1117, Hungary
  - Gene Design Ltd, Szeged, Hungary
- Zsombor Welker
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Biospiral-2006 Ltd, Szeged, Hungary
- Zoltán Györgypál
  - Biospiral-2006 Ltd, Szeged, Hungary
  - Institute of Biophysics, Biological Research Centre, Szeged, Hungary
- Eszter Tóth
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Gene Design Ltd, Szeged, Hungary
- Zoltán Ligeti
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Institute of Biochemistry, Biological Research Centre, Szeged, Hungary
  - Doctoral School of Multidisciplinary Medical Science, University of Szeged, Hungary
- Péter István Kulcsár
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- János Dancsó
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Biospiral-2006 Ltd, Szeged, Hungary
- András Tálas
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
- Sarah Laura Krausz
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - School of Ph.D. Studies, Semmelweis University, Budapest, Hungary
- Éva Varga
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Institute of Biochemistry, Biological Research Centre, Szeged, Hungary
  - Doctoral School of Multidisciplinary Medical Science, University of Szeged, Hungary
- Ervin Welker
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary
  - Institute of Biochemistry, Biological Research Centre, Szeged, Hungary
3
Ramesh A, Trivedi V, Lee S, Tafrishi A, Schwartz C, Mohseni A, Li M, Lonardi S, Wheeldon I. acCRISPR: an activity-correction method for improving the accuracy of CRISPR screens. Commun Biol 2023; 6:617. [PMID: 37291233 PMCID: PMC10250353 DOI: 10.1038/s42003-023-04996-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 05/30/2023] [Indexed: 06/10/2023] Open
Abstract
High throughput CRISPR screens are revolutionizing the way scientists unravel the genetic underpinnings of engineered and evolved phenotypes. One of the critical challenges in accurately assessing screening outcomes is accounting for the variability in sgRNA cutting efficiency. Poorly active guides targeting genes essential to screening conditions obscure the growth defects that are expected from disrupting them. Here, we develop acCRISPR, an end-to-end pipeline that identifies essential genes in pooled CRISPR screens using sgRNA read counts obtained from next-generation sequencing. acCRISPR uses experimentally determined cutting efficiencies for each guide in the library to provide an activity correction to the screening outcomes via calculation of an optimization metric, thus determining the fitness effect of disrupted genes. CRISPR-Cas9 and -Cas12a screens were carried out in the non-conventional oleaginous yeast Yarrowia lipolytica and acCRISPR was used to determine a high-confidence set of essential genes for growth under glucose, a common carbon source used for the industrial production of oleochemicals. acCRISPR was also used in screens quantifying relative cellular fitness under high salt conditions to identify genes that were related to salt tolerance. Collectively, this work presents an experimental-computational framework for CRISPR-based functional genomics studies that may be expanded to other non-conventional organisms of interest.
Affiliation(s)
- Adithya Ramesh
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
- Varun Trivedi
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
- Sangcheon Lee
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
- Aida Tafrishi
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
- Cory Schwartz
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
  - iBio Inc., San Diego, CA, USA
- Amirsadra Mohseni
  - Department of Computer Science and Engineering, University of California, Riverside, CA, 92521, USA
- Mengwan Li
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
- Stefano Lonardi
  - Department of Computer Science and Engineering, University of California, Riverside, CA, 92521, USA
  - Integrative Institute for Genome Biology, University of California, Riverside, CA, 92521, USA
- Ian Wheeldon
  - Department of Chemical and Environmental Engineering, University of California, Riverside, CA, 92521, USA
  - Integrative Institute for Genome Biology, University of California, Riverside, CA, 92521, USA
  - Center for Industrial Biotechnology, University of California, Riverside, CA, 92521, USA
4
Christie KA, Guo JA, Silverstein RA, Doll RM, Mabuchi M, Stutzman HE, Lin J, Ma L, Walton RT, Pinello L, Robb GB, Kleinstiver BP. Precise DNA cleavage using CRISPR-SpRYgests. Nat Biotechnol 2023; 41:409-416. [PMID: 36203014 PMCID: PMC10023266 DOI: 10.1038/s41587-022-01492-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 11/17/2021] [Accepted: 08/31/2022] [Indexed: 11/09/2022]
Abstract
Methods for in vitro DNA cleavage and molecular cloning remain unable to precisely cleave DNA directly adjacent to bases of interest. Restriction enzymes (REs) must bind specific motifs, whereas wild-type CRISPR-Cas9 or CRISPR-Cas12 nucleases require protospacer adjacent motifs (PAMs). Here we explore the utility of our previously reported near-PAMless SpCas9 variant, named SpRY, to serve as a universal DNA cleavage tool for various cloning applications. By performing SpRY DNA digests (SpRYgests) using more than 130 guide RNAs (gRNAs) sampling a wide diversity of PAMs, we discovered that SpRY is PAMless in vitro and can cleave DNA at practically any sequence, including sites refractory to cleavage with wild-type SpCas9. We illustrate the versatility and effectiveness of SpRYgests to improve the precision of several cloning workflows, including those not possible with REs or canonical CRISPR nucleases. We also optimize a rapid and simple one-pot gRNA synthesis protocol to streamline SpRYgest implementation. Together, SpRYgests can improve various DNA engineering applications that benefit from precise DNA breaks.
Affiliation(s)
- Kathleen A Christie
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Jimmy A Guo
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Biological and Biomedical Sciences Program, Harvard University, Boston, MA, USA
- Rachel A Silverstein
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Biological and Biomedical Sciences Program, Harvard University, Boston, MA, USA
- Roman M Doll
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Molecular Biosciences/Cancer Biology Program, Heidelberg University and German Cancer Research Center (DKFZ), Heidelberg, Germany
- Hannah E Stutzman
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
- Jiecong Lin
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Massachusetts General Hospital, Boston, MA, USA
  - Center for Cancer Research, Massachusetts General Hospital Charlestown, Boston, MA, USA
- Linyuan Ma
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
- Russell T Walton
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Luca Pinello
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Molecular Pathology Unit, Massachusetts General Hospital, Boston, MA, USA
  - Center for Cancer Research, Massachusetts General Hospital Charlestown, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Benjamin P Kleinstiver
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Massachusetts General Hospital, Boston, MA, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
5
Li Z, Ma R, Liu D, Wang M, Zhu T, Deng Y. A straightforward plant prime editing system enabled highly efficient precise editing of rice Waxy gene. Plant Science: An International Journal of Experimental Plant Biology 2022; 323:111400. [PMID: 35905895 DOI: 10.1016/j.plantsci.2022.111400] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/26/2022] [Revised: 07/19/2022] [Accepted: 07/24/2022] [Indexed: 06/15/2023]
Abstract
CRISPR Cas9-mediated genome editing is highly efficient at targeted site-specific gene knock-out through NHEJ (Non-Homology End Joining), but ineffective for specific DNA integration through HDR (Homology Directed Repair) for precise gene editing. Base editors can make limited base substitutions but only within restricted small windows of the protospacer. Prime editing has been applied in plants with various degrees of success. However, several questions such as low and inconsistent editing efficiencies across different target sites need to be addressed. We compared two prime editing approaches PE3 and PE2 at two neighboring target sites within rice Waxy gene to partially address those questions. A straightforward PE2 plant prime editing system retrofitted from a regular CRISPR-Cas9 editing system can deliver highly efficient up to 66.7% precise gene editing. Various forms of precise editing including base substitutions, small deletions and insertions can be accurately achieved. The secondary structure variations of different pegRNAs may be the primary reason for inconsistent editing across different target sites and should be the optimization focus to further improve plant prime editing.
Affiliation(s)
- Zhongsen Li
  - Beidahuang Kenfeng Seed, 380 Changjiang Road, Nangang District, Harbin, Heilongjiang, PR China
- Rui Ma
  - Beidahuang Kenfeng Seed, 380 Changjiang Road, Nangang District, Harbin, Heilongjiang, PR China
- Dan Liu
  - Beidahuang Kenfeng Seed, 380 Changjiang Road, Nangang District, Harbin, Heilongjiang, PR China
- Mingyue Wang
  - Beidahuang Kenfeng Seed, 380 Changjiang Road, Nangang District, Harbin, Heilongjiang, PR China
- Ting Zhu
  - Beidahuang Kenfeng Seed, 380 Changjiang Road, Nangang District, Harbin, Heilongjiang, PR China
- Yanxue Deng
  - Beidahuang Kenfeng Seed, 380 Changjiang Road, Nangang District, Harbin, Heilongjiang, PR China