1
Jiang G, Gao Y, Zhou N, Wang B. CRISPR-powered RNA sensing in vivo. Trends Biotechnol 2024:S0167-7799(24)00094-5. [PMID: 38734565 DOI: 10.1016/j.tibtech.2024.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2024] [Revised: 04/02/2024] [Accepted: 04/02/2024] [Indexed: 05/13/2024]
Abstract
RNA sensing in vivo evaluates past or ongoing endogenous RNA disturbances, which is crucial for identifying cell types and states and diagnosing diseases. Recently, the CRISPR-driven genetic circuits have offered promising solutions to burgeoning challenges in RNA sensing. This review delves into the cutting-edge developments of CRISPR-powered RNA sensors in vivo, reclassifying these RNA sensors into four categories based on their working mechanisms, including programmable reassembly of split single-guide RNA (sgRNA), RNA-triggered RNA processing and protein cleavage, miRNA-triggered RNA interference (RNAi), and strand displacement reactions. Then, we discuss the advantages and challenges of existing methodologies in diverse application scenarios and anticipate and analyze obstacles and opportunities in forthcoming practical implementations.
Affiliation(s)
- Guo Jiang
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, Zhejiang, China; ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 311200, Zhejiang, China
- Yuanli Gao
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, Zhejiang, China; ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 311200, Zhejiang, China; School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, UK
- Nan Zhou
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, Zhejiang, China; ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 311200, Zhejiang, China
- Baojun Wang
- College of Chemical and Biological Engineering, Zhejiang University, Hangzhou 310058, Zhejiang, China; ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 311200, Zhejiang, China.
2
Mu W, Luo T, Barrera A, Bounds LR, Klann TS, Ter Weele M, Bryois J, Crawford GE, Sullivan PF, Gersbach CA, Love MI, Li Y. Machine learning methods for predicting guide RNA effects in CRISPR epigenome editing experiments. bioRxiv 2024:2024.04.18.590188. [PMID: 38659894 PMCID: PMC11042384 DOI: 10.1101/2024.04.18.590188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
CRISPR epigenomic editing technologies enable functional interrogation of non-coding elements. However, current computational methods for guide RNA (gRNA) design do not effectively predict the power potential or the molecular and cellular impact of gRNAs, predictions that are crucial for selecting efficient gRNAs and for successful applications of these technologies. We present "launch-dCas9" (machine LeArning based UNified CompreHensive framework for CRISPR-dCas9) to predict gRNA impact from multiple perspectives, including cell fitness, wildtype abundance (gauging power potential), and gene expression in single cells. Our launch-dCas9, built and evaluated using experiments involving >1 million gRNAs targeted across the human genome, demonstrates relatively high prediction accuracy (AUC up to 0.81) and generalizes across cell lines. Method-prioritized top gRNA(s) are 4.6-fold more likely to exert effects than other gRNAs in the same cis-regulatory region. Furthermore, launch-dCas9 identifies the most critical sequence-related features and functional annotations from >40 features considered. Our results establish launch-dCas9 as a promising approach to design gRNAs for CRISPR epigenomic experiments.
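The abstract above reports accuracy as AUC. As a reminder of what that number measures, here is a minimal Python sketch that computes AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. It is not taken from the cited work: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a random positive > score of a random negative).

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), fine for small illustrations.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = gRNA with an effect, 0 = gRNA without one.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
auc = roc_auc(labels, scores)  # 8 of 9 positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which a value such as 0.81 should be read.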
Affiliation(s)
- Wancen Mu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Tianyou Luo
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Alejandro Barrera
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Lexi R Bounds
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Tyler S Klann
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Maria Ter Weele
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Julien Bryois
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Gregory E Crawford
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Pediatrics, Division of Medical Genetics, Duke University Medical Center, Durham, NC, USA
- Patrick F Sullivan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Charles A Gersbach
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
- Center for Advanced Genomic Technologies, Duke University, Durham, NC, USA
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Michael I Love
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
3
Motoche-Monar C, Ordoñez JE, Chang O, Gonzales-Zubiate FA. gRNA Design: How Its Evolution Impacted on CRISPR/Cas9 Systems Refinement. Biomolecules 2023; 13:1698. [PMID: 38136570 PMCID: PMC10741458 DOI: 10.3390/biom13121698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 06/05/2023] [Accepted: 06/12/2023] [Indexed: 12/24/2023] Open
Abstract
Over the past decade, genetic engineering has witnessed a revolution with the emergence of a relatively new genetic editing tool based on RNA-guided nucleases: the CRISPR/Cas9 system. Since the first report in 1987 and characterization in 2007 as a bacterial defense mechanism, this system has garnered immense interest and research attention. CRISPR systems provide immunity to bacteria against invading genetic material; however, with specific modifications in sequence and structure, they become precise editing systems capable of modifying the genomes of a wide range of organisms. The refinement of these modifications encompasses diverse approaches, including the development of more accurate nucleases, understanding of the cellular context and epigenetic conditions, and the redesign of guide RNAs (gRNAs). Considering the critical importance of the correct performance of CRISPR/Cas9 systems, our scope will emphasize the latter approach. Hence, we present an overview of past and recent web-based guide RNA design tools, highlighting the evolution of their computational architecture and gRNA characteristics over the years. Our study explains computational approaches that use machine learning techniques, neural networks, and gRNA/target interaction data to enable predictions and classifications. This review could open the door to a dynamic community that uses up-to-date algorithms to optimize and create promising gRNAs suitable for modern CRISPR/Cas9 engineering.
Affiliation(s)
- Cristofer Motoche-Monar
- School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- Julián E. Ordoñez
- School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- Oscar Chang
- Departamento de Electrónica, Universidad Simon Bolivar, Caracas 1080, Venezuela
- MIND Research Group, Model Intelligent Networks Development, Urcuquí 100119, Ecuador
- Fernando A. Gonzales-Zubiate
- School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- MIND Research Group, Model Intelligent Networks Development, Urcuquí 100119, Ecuador
4
Capponi S, Daniels KG. Harnessing the power of artificial intelligence to advance cell therapy. Immunol Rev 2023; 320:147-165. [PMID: 37415280 DOI: 10.1111/imr.13236] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 05/02/2023] [Accepted: 06/17/2023] [Indexed: 07/08/2023]
Abstract
Cell therapies are powerful technologies in which human cells are reprogrammed for therapeutic applications such as killing cancer cells or replacing defective cells. The technologies underlying cell therapies are increasing in effectiveness and complexity, making rational engineering of cell therapies more difficult. Creating the next generation of cell therapies will require improved experimental approaches and predictive models. Artificial intelligence (AI) and machine learning (ML) methods have revolutionized several fields in biology including genome annotation, protein structure prediction, and enzyme design. In this review, we discuss the potential of combining experimental library screens and AI to build predictive models for the development of modular cell therapy technologies. Advances in DNA synthesis and high-throughput screening techniques enable the construction and screening of libraries of modular cell therapy constructs. AI and ML models trained on this screening data can accelerate the development of cell therapies by generating predictive models, design rules, and improved designs.
Affiliation(s)
- Sara Capponi
- Department of Functional Genomics and Cellular Engineering, IBM Almaden Research Center, San Jose, California, USA
- Center for Cellular Construction, San Francisco, California, USA
- Kyle G Daniels
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, California, USA
5
Zhang G, Luo Y, Dai X, Dai Z. Benchmarking deep learning methods for predicting CRISPR/Cas9 sgRNA on- and off-target activities. Brief Bioinform 2023; 24:bbad333. [PMID: 37775147 DOI: 10.1093/bib/bbad333] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/27/2023] [Revised: 08/31/2023] [Accepted: 09/04/2023] [Indexed: 10/01/2023] Open
Abstract
In silico design of single guide RNA (sgRNA) plays a critical role in the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) system. Continuous efforts are aimed at improving sgRNA design with efficient on-target activity and reduced off-target mutations. In the last 5 years, an increasing number of deep learning-based methods have achieved breakthrough performance in predicting sgRNA on- and off-target activities. Nevertheless, it is worthwhile to systematically evaluate these methods for their predictive abilities. In this review, we conducted a systematic survey on the progress in prediction of on- and off-target editing. We investigated the performances of 10 mainstream deep learning-based on-target predictors using nine public datasets with different sample sizes. We found that in most scenarios, these methods showed better predictive power on large- and medium-scale datasets than on small-scale datasets. In addition, we performed unbiased experiments to provide an in-depth comparison of eight representative approaches for off-target prediction on 12 publicly available datasets with various imbalance ratios of positive/negative samples. Most methods showed excellent performance on balanced datasets but have much room for improvement on moderately and severely imbalanced datasets. This study provides comprehensive perspectives on CRISPR/Cas9 sgRNA on- and off-target activity prediction and improvement for method development.
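The point about imbalanced off-target datasets can be made concrete with a short Python sketch. It is not drawn from the benchmarked tools: the function name `summarize` and the toy dataset (5 true off-target sites among 1000 candidates) are hypothetical. It shows why plain accuracy can look excellent while recall on the rare positive class is zero.

```python
def summarize(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Define the ratios as 0.0 when their denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical severely imbalanced dataset: 5 positives, 995 negatives.
y_true = [1] * 5 + [0] * 995
# A trivial predictor that never calls an off-target site.
y_pred = [0] * 1000
report = summarize(y_true, y_pred)
# Accuracy is 0.995 although not a single true off-target site is found.
```

For this reason, comparisons on imbalanced data are usually reported with class-sensitive metrics such as precision, recall, or areas under ROC and precision-recall curves rather than accuracy alone.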
Affiliation(s)
- Guishan Zhang
- College of Engineering, Shantou University, Shantou 515063, China
- Ye Luo
- College of Engineering, Shantou University, Shantou 515063, China
- Xianhua Dai
- School of Cyber Science and Technology, Sun Yat-sen University, Shenzhen 518107, China
- Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai 519000, China
- Zhiming Dai
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-sen University, Guangzhou 510006, China
6
Ansori ANM, Antonius Y, Susilo RJK, Hayaza S, Kharisma VD, Parikesit AA, Zainul R, Jakhmola V, Saklani T, Rebezov M, Ullah ME, Maksimiuk N, Derkho M, Burkov P. Application of CRISPR-Cas9 genome editing technology in various fields: A review. Narra J 2023; 3:e184. [PMID: 38450259 PMCID: PMC10916045 DOI: 10.52225/narra.v3i2.184] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/12/2023] [Accepted: 08/23/2023] [Indexed: 03/08/2024]
Abstract
CRISPR-Cas9 has emerged as a revolutionary tool that enables precise and efficient modifications of the genetic material. This review provides a comprehensive overview of CRISPR-Cas9 technology and its applications in genome editing. We begin by describing the fundamental principles of CRISPR-Cas9 technology, explaining how the system utilizes a single guide RNA (sgRNA) to direct the Cas9 nuclease to specific DNA sequences in the genome, resulting in targeted double-stranded breaks. In this review, we provide in-depth explorations of CRISPR-Cas9 technology and its applications in agriculture, medicine, environmental sciences, fisheries, nanotechnology, bioinformatics, and biotechnology. We also highlight its potential, ongoing research, and the ethical considerations and controversies surrounding its use. This review might contribute to the understanding of CRISPR-Cas9 technology and its implications in various fields, paving the way for future developments and responsible applications of this transformative technology.
Affiliation(s)
- Arif NM. Ansori
- Department of Biology, Faculty of Science and Technology, Universitas Airlangga, Surabaya, Indonesia
- Uttaranchal Institute of Pharmaceutical Sciences, Uttaranchal University, Dehradun, India
- European Virus Bioinformatics Center, Jena, Germany
- Yulanda Antonius
- Faculty of Biotechnology, Universitas Surabaya, Surabaya, Indonesia
- Raden JK. Susilo
- Nanotechnology Engineering Study Program, Faculty of Advanced Technology and Multidiscipline, Universitas Airlangga, Surabaya, Indonesia
- Suhailah Hayaza
- Nanotechnology Engineering Study Program, Faculty of Advanced Technology and Multidiscipline, Universitas Airlangga, Surabaya, Indonesia
- Viol D. Kharisma
- Doctoral Program of Mathematics and Natural Sciences, Faculty of Science and Technology, Universitas Airlangga, Surabaya, Indonesia
- Generasi Biologi Indonesia Foundation, Gresik, Indonesia
- Arli A. Parikesit
- Department of Bioinformatics, School of Life Sciences, Indonesia International Institute for Life Sciences (i3L), Jakarta, Indonesia
- Rahadian Zainul
- Department of Chemistry, Faculty of Mathematics and Natural Sciences, Universitas Negeri Padang, Padang, Indonesia
- Vikash Jakhmola
- Uttaranchal Institute of Pharmaceutical Sciences, Uttaranchal University, Dehradun, India
- Taru Saklani
- Uttaranchal Institute of Pharmaceutical Sciences, Uttaranchal University, Dehradun, India
- Maksim Rebezov
- Department of Scientific Research, V. M. Gorbatov Federal Research Center for Food Systems, Moscow, Russian Federation
- Faculty of Biotechnology and Food Engineering, Ural State Agrarian University, Yekaterinburg, Russian Federation
- Md. Emdad Ullah
- Department of Chemistry, Mississippi State University, Mississippi, United States
- Nikolai Maksimiuk
- Institute of Medical Education, Yaroslav-the-Wise Novgorod State University, Velikiy Novgorod, Russian Federation
- Marina Derkho
- Institute of Veterinary Medicine, South Ural State Agrarian University, Troitsk, Russian Federation
- Pavel Burkov
- Institute of Veterinary Medicine, South Ural State Agrarian University, Troitsk, Russian Federation
7
Alipanahi R, Safari L, Khanteymoori A. CRISPR genome editing using computational approaches: A survey. Front Bioinform 2023; 2:1001131. [PMID: 36710911 PMCID: PMC9875887 DOI: 10.3389/fbinf.2022.1001131] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/22/2022] [Accepted: 12/19/2022] [Indexed: 01/13/2023] Open
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR)-based gene editing has been widely used in various cell types and organisms. To make genome editing with CRISPR far more precise and practical, we must concentrate on the design of optimal gRNA and the selection of appropriate Cas enzymes. Numerous computational tools have been created in recent years to help researchers design the best gRNA for CRISPR research. There are two approaches for designing an appropriate gRNA sequence (one that targets the desired sites with high precision): experimental and prediction-based approaches. It is essential to reduce off-target sites when designing an optimal gRNA. Here we review both traditional and machine learning-based approaches for designing an appropriate gRNA sequence and predicting off-target sites. In this review, we summarize the key characteristics of all available tools (as far as possible) and compare them. Machine learning-based tools and web servers are expected to become the most effective and reliable methods for predicting on-target and off-target activities of CRISPR in the future. However, these predictions are not yet precise, and the performance of these algorithms, especially deep learning ones, depends on the amount of data used during the training phase. As more features are discovered and incorporated into these models, predictions become more in line with experimental observations. We must concentrate on the creation of ideal gRNA and the choice of suitable Cas enzymes in order to make genome editing with CRISPR far more accurate and feasible.
Affiliation(s)
- Leila Safari (corresponding author)
- Department of Computer Engineering, University of Zanjan, Zanjan, Iran
8
Zarate OA, Yang Y, Wang X, Wang JP. BoostMEC: predicting CRISPR-Cas9 cleavage efficiency through boosting models. BMC Bioinformatics 2022; 23:446. [PMID: 36289480 PMCID: PMC9597963 DOI: 10.1186/s12859-022-04998-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/23/2022] [Accepted: 10/21/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: In the CRISPR-Cas9 system, the efficiency of genetic modifications has been found to vary depending on the single guide RNA (sgRNA) used. A variety of sgRNA properties have been found to be predictive of CRISPR cleavage efficiency, including the position-specific sequence composition of sgRNAs, global sgRNA sequence properties, and thermodynamic features. While prevalent existing deep learning-based approaches provide competitive prediction accuracy, a more interpretable model is desirable to help understand how different features may contribute to CRISPR-Cas9 cleavage efficiency. RESULTS: We propose a gradient boosting approach, utilizing LightGBM to develop an integrated tool, BoostMEC (Boosting Model for Efficient CRISPR), for the prediction of wild-type CRISPR-Cas9 editing efficiency. We benchmark BoostMEC against 10 popular models on 13 external datasets and show its competitive performance. CONCLUSIONS: BoostMEC can provide state-of-the-art predictions of CRISPR-Cas9 cleavage efficiency for sgRNA design and selection. Relying on direct and derived sequence features of sgRNA sequences and based on conventional machine learning, BoostMEC maintains an advantage over other state-of-the-art CRISPR efficiency prediction models that are based on deep learning through its ability to produce more interpretable feature insights and predictions.
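To illustrate what "position-specific sequence composition" and "global sequence properties" can look like as model inputs, here is a minimal Python sketch. It is an assumption-laden illustration, not BoostMEC's actual feature set: the function name `sgrna_features`, the feature names, and the example spacer are hypothetical, and thermodynamic features are omitted.

```python
def sgrna_features(spacer):
    """Build a flat feature dictionary for one sgRNA spacer (DNA alphabet).

    Position-specific features: one indicator per (position, base) pair.
    Global feature: overall GC content of the spacer.
    """
    spacer = spacer.upper()
    if not spacer or set(spacer) - set("ACGT"):
        raise ValueError("spacer must be a non-empty string over A, C, G, T")
    features = {}
    for i, nt in enumerate(spacer, start=1):
        for base in "ACGT":
            features[f"pos{i}_{base}"] = 1 if nt == base else 0
    features["gc_content"] = sum(nt in "GC" for nt in spacer) / len(spacer)
    return features


# Hypothetical 20-nt spacer used only to exercise the function.
example = sgrna_features("GACGTTAGCCATGGACTTCA")
```

A table of such named features is the kind of input a tree-based learner can consume, and because each column has a direct meaning, its importance scores can be read back in terms of positions and bases.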
Affiliation(s)
- Oscar A. Zarate
- Department of Statistics and Data Science, Northwestern University, Evanston, IL, USA
- Yiben Yang
- Department of Statistics and Data Science, Northwestern University, Evanston, IL, USA
- Xiaozhong Wang
- Department of Molecular BioSciences, Northwestern University, Evanston, IL, USA
- Ji-Ping Wang
- Department of Statistics and Data Science, Northwestern University, Evanston, IL, USA
9
Predicting RNA solvent accessibility from multi-scale context feature via multi-shot neural network. Anal Biochem 2022; 654:114802. [PMID: 35809650 DOI: 10.1016/j.ab.2022.114802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 06/11/2022] [Accepted: 06/28/2022] [Indexed: 11/24/2022]
Abstract
Knowledge of RNA solvent accessibility has recently become attractive due to the increasing awareness of its importance for key biological processes. Accurately predicting the solvent accessibility of RNA is crucial for understanding its 3D structure and biological function. In this study, we develop a novel computational method, termed M2pred, for accurately predicting the solvent accessibility of RNA from sequence-based multi-scale context features. In M2pred, three single-view features, i.e., base-pairing probabilities, a position-specific frequency matrix, and a binary one-hot encoding, are first generated as three feature sources and immediately concatenated to form a super feature. Second, for the super feature, the matrix-format features of each nucleotide are extracted using an initialized sliding window technique and regularly stacked into a cube-format feature. Then, using a multi-scale context feature extraction strategy, a pyramid feature composed of contextual features at four scales around the target nucleotides is extracted from the cube-format feature. Finally, a customized multi-shot neural network framework, equipped with four different scales of receptive fields that mainly integrate several residual attention blocks, is designed to extract discriminative information from the contextual pyramid feature. Experimental results demonstrate that the proposed M2pred achieves high prediction performance and outperforms existing state-of-the-art prediction methods for RNA solvent accessibility.
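Two of the steps named in the abstract, binary one-hot encoding and per-nucleotide sliding windows, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not M2pred's implementation: the function names, the zero padding at sequence ends, and the window half-width are hypothetical, and the base-pairing and frequency-matrix features are left out.

```python
ALPHABET = "ACGU"  # RNA bases, in the column order used below


def one_hot(seq):
    """Encode each nucleotide as a 4-element 0/1 row; unknown symbols become all zeros."""
    return [[1 if nt == base else 0 for base in ALPHABET] for nt in seq.upper()]


def sliding_windows(rows, half_width):
    """Return one (2 * half_width + 1)-row window centred on every position.

    Positions beyond either end of the sequence are filled with zero rows
    (an assumed padding scheme, chosen here for simplicity).
    """
    pad = [0] * len(ALPHABET)
    padded = [pad] * half_width + rows + [pad] * half_width
    width = 2 * half_width + 1
    return [padded[i:i + width] for i in range(len(rows))]


# Hypothetical 4-nt sequence; each window is a 3 x 4 matrix, and stacking the
# windows gives a cube-shaped feature of size 4 x 3 x 4.
encoded = one_hot("GAUC")
windows = sliding_windows(encoded, half_width=1)
```

Stacking the per-nucleotide windows in this way is what turns a matrix-format feature into a cube-format one that convolutional layers with different receptive fields can then scan.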
10
Huang Y, Shang M, Liu T, Wang K. High-throughput methods for genome editing: the more the better. Plant Physiol 2022; 188:1731-1745. [PMID: 35134245 PMCID: PMC8968257 DOI: 10.1093/plphys/kiac017] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/08/2021] [Accepted: 11/29/2021] [Indexed: 05/04/2023]
Abstract
During the last decade, targeted genome-editing technologies, especially clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein (Cas) technologies, have permitted efficient targeting of genomes, thereby modifying these genomes to offer tremendous opportunities for deciphering gene function and engineering beneficial traits in many biological systems. As a powerful genome-editing tool, the CRISPR/Cas systems, combined with the development of next-generation sequencing and many other high-throughput techniques, have thus been quickly developed into a high-throughput engineering strategy in animals and plants. Therefore, here, we review recent advances in using high-throughput genome-editing technologies in animals and plants, such as the high-throughput design of targeted guide RNA (gRNA), construction of large-scale pooled gRNA, and high-throughput genome-editing libraries, high-throughput detection of editing events, and high-throughput supervision of genome-editing products. Moreover, we outline perspectives for future applications, ranging from medication using gene therapy to crop improvement using high-throughput genome-editing technologies.
Affiliation(s)
- Yong Huang
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310006, China
- Meiqi Shang
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310006, China
- Tingting Liu
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310006, China
- Kejian Wang
- State Key Laboratory of Rice Biology, China National Rice Research Institute, Chinese Academy of Agricultural Sciences, Hangzhou 310006, China
11
A systematic mapping study on machine learning techniques for the prediction of CRISPR/Cas9 sgRNA target cleavage. Comput Struct Biotechnol J 2022; 20:5813-5823. [PMID: 36382194 PMCID: PMC9630617 DOI: 10.1016/j.csbj.2022.10.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/27/2022] [Revised: 09/21/2022] [Accepted: 10/08/2022] [Indexed: 11/30/2022] Open
Abstract
CRISPR/Cas9 technology has greatly accelerated genome engineering research. The CRISPR/Cas9 complex, a bacterial immune response system, is widely adopted for RNA-driven targeted genome editing. The systematic mapping study presented in this paper examines the literature on machine learning (ML) techniques employed in the prediction of CRISPR/Cas9 sgRNA on/off-target cleavage, focusing on improving support in sgRNA design activities and identifying areas currently being researched. This area of research has greatly expanded recently, and we found it appropriate to carry out a Systematic Mapping Study (SMS), an investigation that has proven to be an effective secondary study method. Unlike a classic review, an SMS makes no comparison of methods or results; that task can instead be the subject of a systematic literature review that chooses one theme among those highlighted in this SMS. To the best of the authors' knowledge, no other SMS studies have been published on this topic. Fifty-seven papers published between 2017 and April 30, 2022 were analyzed. This study reveals that the most widely used ML model is the convolutional neural network (CNN), followed by the feedforward neural network (FNN), while the use of other models is marginal. Other interesting information has emerged, such as the wide availability of both open code and platforms dedicated to supporting the activity of researchers, and a clear prevalence of public funds financing research on this topic.
12
Xiao LM, Wan YQ, Jiang ZR. AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity. BMC Bioinformatics 2021; 22:589. [PMID: 34903170 PMCID: PMC8667445 DOI: 10.1186/s12859-021-04509-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 04/30/2021] [Accepted: 12/01/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: More and more Cas9 variants with higher specificity are being developed to avoid off-target effects, which generates a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while methods based on deep learning often lack interpretability, forcing researchers to trade off accuracy against interpretability. It is necessary to develop a method that not only matches deep learning-based methods in performance but also offers interpretability comparable to that of conventional machine learning methods. RESULTS: To overcome these problems, we propose an intrinsically interpretable method called AttCRISPR, based on deep learning, to predict on-target activity. The advantage of AttCRISPR lies in using the ensemble learning strategy to stack available encoding-based methods and embedding-based methods with strong interpretability. In comparison with state-of-the-art methods on the WT-SpCas9, eSpCas9(1.1), and SpCas9-HF1 datasets, AttCRISPR achieves average Spearman values of 0.872, 0.867, and 0.867, respectively, on several public datasets, which is superior to these methods. Furthermore, benefiting from two attention modules, one spatial and one temporal, AttCRISPR has good interpretability. Through these modules, we can understand the decisions made by AttCRISPR at both global and local levels without other post hoc explanation techniques. CONCLUSION: With the trained models, we reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset at a global level. At a local level, we prove that the interpretability of AttCRISPR can be used to guide researchers to design sgRNAs with higher activity.
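The Spearman values quoted in the abstract measure how well predicted activities preserve the rank order of measured activities. The following Python sketch shows one standard way to compute the coefficient (Pearson correlation of average ranks). It is not taken from AttCRISPR: the function names and the toy activity values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical measured vs. predicted on-target activities for four sgRNAs.
measured = [0.10, 0.40, 0.35, 0.80]
predicted = [0.20, 0.50, 0.30, 0.90]
rho = spearman(measured, predicted)  # identical rank order, so rho is 1.0
```

Because only ranks enter the calculation, a predictor can reach a high Spearman value without matching the measured activities numerically, which is why the metric suits ranking candidate sgRNAs.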
Affiliation(s)
- Li-Ming Xiao
- School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
- Yun-Qi Wan
- School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
- Zhen-Ran Jiang
- School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China.
13
Xiang X, Corsi GI, Anthon C, Qu K, Pan X, Liang X, Han P, Dong Z, Liu L, Zhong J, Ma T, Wang J, Zhang X, Jiang H, Xu F, Liu X, Xu X, Wang J, Yang H, Bolund L, Church GM, Lin L, Gorodkin J, Luo Y. Enhancing CRISPR-Cas9 gRNA efficiency prediction by data integration and deep learning. Nat Commun 2021; 12:3238. [PMID: 34050182 PMCID: PMC8163799 DOI: 10.1038/s41467-021-23576-0] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 05/19/2020] [Accepted: 05/06/2021] [Indexed: 02/07/2023] Open
Abstract
The design of CRISPR gRNAs requires accurate on-target efficiency predictions, which demand high-quality gRNA activity data and efficient modeling. To advance this, we report the generation of on-target gRNA activity data for 10,592 SpCas9 gRNAs. Integrating these with complementary published data, we train a deep learning model, CRISPRon, on 23,902 gRNAs. Compared to existing tools, CRISPRon exhibits significantly higher prediction performance on four test datasets that do not overlap with the training data used to develop those tools. Furthermore, we present an interactive gRNA design webserver based on the CRISPRon standalone software, both available via https://rth.dk/resources/crispr/. CRISPRon advances CRISPR applications by providing more accurate gRNA efficiency predictions than the existing tools.
Affiliation(s)
- Xi Xiang
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- BGI Education Center, University of Chinese Academy of Sciences, Shenzhen, China
- BGI-Shenzhen, Shenzhen, China
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Giulia I Corsi
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Christian Anthon
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Kunli Qu
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Xiaoguang Pan
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Xue Liang
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Peng Han
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Zhanying Dong
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Lijun Liu
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Tao Ma
- MGI, BGI-Shenzhen, Shenzhen, China
- Fengping Xu
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- BGI-Shenzhen, Shenzhen, China
- Xin Liu
- BGI-Shenzhen, Shenzhen, China
- Xun Xu
- BGI-Shenzhen, Shenzhen, China
- Guangdong Provincial Key Laboratory of Genome Read and Write, BGI-Shenzhen, Shenzhen, China
- Huanming Yang
- BGI-Shenzhen, Shenzhen, China
- Guangdong Provincial Academician Workstation of BGI Synthetic Genomics, BGI-Shenzhen, Shenzhen, China
- Lars Bolund
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- BGI-Shenzhen, Shenzhen, China
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- George M Church
- Department of Genetics, Blavatnik Institute, Harvard Medical School, Boston, MA, USA
- Lin Lin
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Steno Diabetes Center Aarhus, Aarhus University, Aarhus, Denmark
- Jan Gorodkin
- Center for non-coding RNA in Technology and Health, Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Frederiksberg, Denmark
- Yonglun Luo
- Lars Bolund Institute of Regenerative Medicine, Qingdao-Europe Advanced Institute for Life Sciences, BGI-Qingdao, Qingdao, China
- BGI-Shenzhen, Shenzhen, China
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Steno Diabetes Center Aarhus, Aarhus University, Aarhus, Denmark
|
14
|
Vinodkumar PK, Ozcinar C, Anbarjafari G. Prediction of sgRNA Off-Target Activity in CRISPR/Cas9 Gene Editing Using Graph Convolution Network. Entropy (Basel) 2021; 23:608. [PMID: 34069050 PMCID: PMC8156774 DOI: 10.3390/e23050608] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/28/2021] [Revised: 05/03/2021] [Accepted: 05/12/2021] [Indexed: 12/26/2022]
Abstract
CRISPR/Cas9 is a powerful genome-editing technology that has been widely applied in targeted gene repair and gene expression regulation. One of the main challenges for the CRISPR/Cas9 system is unexpected cleavage at unintended sites (off-targets), and predicting these sites is necessary because of their relevance to gene editing research. Very few deep learning models have been developed so far to predict the off-target propensity of single guide RNA (sgRNA) at specific DNA fragments using artificial feature extraction operations and machine learning techniques; moreover, this is a convoluted process that is difficult for researchers to understand and implement. In this work, we introduce a novel graph-based approach to predicting the off-target efficacy of sgRNA in the CRISPR/Cas9 system that is easy for researchers to understand and replicate. This is achieved by creating a graph with sequences as nodes and using a link prediction method to predict the presence of links between sgRNA and off-target-inducing target DNA sequences. Features for the sequences are extracted from within the sequences themselves. We used HEK293 and K562 datasets in our experiments. The GCN predicted off-target gene knockouts (using link prediction) by predicting the links between sgRNA and off-target sequences with an auROC value of 0.987.
Affiliation(s)
- Cagri Ozcinar
- iCV Lab, Institute of Technology, University of Tartu, 51009 Tartu, Estonia; (P.K.V.); (C.O.)
- Gholamreza Anbarjafari
- iCV Lab, Institute of Technology, University of Tartu, 51009 Tartu, Estonia; (P.K.V.); (C.O.)
- PwC Advisory Finland, 00180 Helsinki, Finland
|
15
|
Bhat MA, Bhat MA, Kumar V, Wani IA, Bashir H, Shah AA, Rahman S, Jan AT. The era of editing plant genomes using CRISPR/Cas: A critical appraisal. J Biotechnol 2020; 324:34-60. [DOI: 10.1016/j.jbiotec.2020.09.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/19/2020] [Revised: 09/08/2020] [Accepted: 09/14/2020] [Indexed: 12/11/2022]
|