1. Krsek A, Baticic L, Sotosek V, Braut T. The Role of Biomarkers in HPV-Positive Head and Neck Squamous Cell Carcinoma: Towards Precision Medicine. Diagnostics (Basel) 2024; 14:1448. [PMID: 39001338 PMCID: PMC11241541 DOI: 10.3390/diagnostics14131448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 07/03/2024] [Accepted: 07/04/2024] [Indexed: 07/16/2024]
Abstract
Head and neck cancer (HNC) represents a significant global health challenge, with squamous cell carcinomas (SCCs) accounting for approximately 90% of all HNC cases. These malignancies, collectively referred to as head and neck squamous cell carcinoma (HNSCC), originate from the mucosal epithelium lining the larynx, pharynx, and oral cavity. The primary risk factors associated with HNSCC in economically disadvantaged nations have been chronic alcohol consumption and tobacco use. However, in more affluent countries, the landscape of HNSCC has shifted with the identification of human papillomavirus (HPV) infection, particularly HPV-16, as a major risk factor, especially among nonsmokers. Understanding the evolving risk factors and the distinct biological behaviors of HPV-positive and HPV-negative HNSCC is critical for developing targeted treatment strategies and improving patient outcomes in this complex and diverse group of cancers. Accurate diagnosis of HPV-positive HNSCC is essential for developing a comprehensive model that integrates the molecular characteristics, immune microenvironment, and clinical outcomes. The aim of this comprehensive review was to summarize the current knowledge and advances in the identification of DNA, RNA, and protein biomarkers in bodily fluids and tissues that have introduced new possibilities for minimally or non-invasive cancer diagnosis, monitoring, and assessment of therapeutic responses.
Affiliation(s)
- Antea Krsek
  - Faculty of Medicine, University of Rijeka, 51000 Rijeka, Croatia
- Lara Baticic
  - Department of Medical Chemistry, Biochemistry and Clinical Chemistry, Faculty of Medicine, University of Rijeka, 51000 Rijeka, Croatia
- Vlatka Sotosek
  - Department of Clinical Medical Sciences I, Faculty of Health Studies, University of Rijeka, 51000 Rijeka, Croatia
  - Department of Anesthesiology, Reanimatology, Emergency and Intensive Care Medicine, Faculty of Medicine, University of Rijeka, 51000 Rijeka, Croatia
- Tamara Braut
  - Department of Otorhinolaryngology and Head and Neck Surgery, Clinical Hospital Centre Rijeka, 51000 Rijeka, Croatia
2. Bergman S, Tuller T. Strong association between genomic 3D structure and CRISPR cleavage efficiency. PLoS Comput Biol 2024; 20:e1012214. [PMID: 38848440 PMCID: PMC11189236 DOI: 10.1371/journal.pcbi.1012214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 06/20/2024] [Accepted: 05/30/2024] [Indexed: 06/09/2024]
Abstract
CRISPR is a gene editing technology which enables precise in-vivo genome editing; but its potential is hampered by its relatively low specificity and sensitivity. Improving CRISPR's on-target and off-target effects requires a better understanding of its mechanism and determinants. Here we demonstrate, for the first time, the chromosomal 3D spatial structure's association with CRISPR's cleavage efficiency, and its predictive capabilities. We used high-resolution Hi-C data to estimate the 3D distance between different regions in the human genome and utilized these spatial properties to generate 3D-based features, characterizing each region's density. We evaluated these features based on empirical, in-vivo CRISPR efficiency data and compared them to 425 features used in state-of-the-art models. The 3D features ranked in the top 13% of the features, and significantly improved the predictive power of LASSO and xgboost models trained with these features. The features indicated that sites with lower spatial density demonstrated higher efficiency. Understanding how CRISPR is affected by the 3D DNA structure provides insight into CRISPR's mechanism in general and improves our ability to correctly predict CRISPR's cleavage as well as design sgRNAs for therapeutic and scientific use.
Affiliation(s)
- Shaked Bergman
  - Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
- Tamir Tuller
  - Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel
  - The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel
3. Vora DS, Bhandari SM, Sundar D. DNA shape features improve prediction of CRISPR/Cas9 activity. Methods 2024; 226:120-126. [PMID: 38641083 DOI: 10.1016/j.ymeth.2024.04.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 03/27/2024] [Accepted: 04/10/2024] [Indexed: 04/21/2024]
Abstract
The CRISPR/Cas9 genome editing technology has transformed basic and translational research in biology and medicine. However, the advances are hindered by off-target effects and a paucity of knowledge about the mechanism of the Cas9 protein. Machine learning models have been proposed for the prediction of Cas9 activity at unintended sites, yet feature engineering plays a major role in the outcome of the predictors. This study evaluates the improvement in the performance of similar predictors upon inclusion of epigenetic and DNA shape feature groups in the conventionally used sequence-based Cas9 target and off-target datasets. The approach involved the utilization of neural networks trained on a diverse range of parameters, allowing us to systematically assess the performance increase for the meticulously designed datasets: (i) sequence only, (ii) sequence and epigenetic features, and (iii) sequence, epigenetic and DNA shape features. The addition of DNA shape information significantly improved predictive performance, evaluated by Akaike and Bayesian information criteria. The evaluation of individual feature importance by permutation and LIME-based methods also indicates that not only sequence features like mismatches and nucleotide composition, but also base pairing parameters like opening and stretch, which are indicative of distortion in the DNA-RNA hybrid in the presence of mismatches, influence model outcomes.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, New Delhi 110016, India
- Sakshi Manoj Bhandari
  - Department of Mathematics, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, New Delhi 110016, India
  - School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
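For readers unfamiliar with the two information criteria named in the Vora et al. abstract above, their standard textbook definitions are given below. The symbols are not taken from the paper: $k$ is assumed to denote the number of fitted parameters, $n$ the number of observations, and $\hat{L}$ the maximized likelihood of the model. How the authors computed these quantities for their neural networks is not stated in the abstract.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Under both criteria a lower value indicates a better trade-off between fit and model size. BIC penalizes each additional parameter more heavily than AIC once $\ln n > 2$, that is, for $n \ge 8$.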
4. Zhu W, Xie H, Chen Y, Zhang G. CrnnCrispr: An Interpretable Deep Learning Method for CRISPR/Cas9 sgRNA On-Target Activity Prediction. Int J Mol Sci 2024; 25:4429. [PMID: 38674012 PMCID: PMC11050447 DOI: 10.3390/ijms25084429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 04/11/2024] [Accepted: 04/15/2024] [Indexed: 04/28/2024]
Abstract
CRISPR/Cas9 is a powerful genome-editing tool in biology, but its wide applications are challenged by a lack of knowledge governing single-guide RNA (sgRNA) activity. Several deep-learning-based methods have been developed for the prediction of on-target activity. However, there is still room for improvement. Here, we proposed a hybrid neural network named CrnnCrispr, which integrates a convolutional neural network and a recurrent neural network for on-target activity prediction. We performed unbiased experiments with four mainstream methods on nine public datasets with varying sample sizes. Additionally, we incorporated a transfer learning strategy to boost the prediction power on small-scale datasets. Our results showed that CrnnCrispr outperformed existing methods in terms of accuracy and generalizability. Finally, we applied a visualization approach to investigate the generalizable nucleotide-position-dependent patterns of sgRNAs for on-target activity, which shows potential in terms of model interpretability and further helps in understanding the principles of sgRNA design.
Affiliation(s)
- Guishan Zhang
  - College of Engineering, Shantou University, Shantou 515063, China; (W.Z.); (H.X.); (Y.C.)
5. Zhong Z, Li Z, Yang J, Wang Q. Unified Model to Predict gRNA Efficiency across Diverse Cell Lines and CRISPR-Cas9 Systems. J Chem Inf Model 2023; 63:7320-7329. [PMID: 37983481 DOI: 10.1021/acs.jcim.3c01339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Computationally predicting the efficiency of a guide RNA (gRNA) from its sequence is crucial to designing the CRISPR-Cas9 system. Currently, machine learning (ML)-based models are widely used for such predictions. However, these ML models often show performance imbalance when applied to multiple data sets from diverse sources, hindering the practical utilization of these tools. To address this issue, we propose a Michaelis-Menten theoretical framework that integrates information from multiple data sets. We demonstrate that the binding free energy can serve as a useful invariant that bridges the data from different experimental setups. Building upon this framework, we develop a new ML model called Uni-deepSG. This model exhibits broad applicability on 27 data sets with different cell types, Cas9 variants, and gRNA designs. Our work confirms the existence of a generalized model for predicting gRNA efficiency and lays the theoretical groundwork necessary to finalize such a model.
Affiliation(s)
- Zhicheng Zhong
  - Department of Physics, University of Science and Technology of China, Hefei 230026, Anhui, China
- Zeying Li
  - Department of Physics, University of Science and Technology of China, Hefei 230026, Anhui, China
- Jie Yang
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Qian Wang
  - Department of Physics, University of Science and Technology of China, Hefei 230026, Anhui, China
6. Störtz F, Mak JK, Minary P. piCRISPR: Physically informed deep learning models for CRISPR/Cas9 off-target cleavage prediction. ARTIFICIAL INTELLIGENCE IN THE LIFE SCIENCES 2023; 3:100075. [PMID: 38047242 PMCID: PMC10316064 DOI: 10.1016/j.ailsci.2023.100075] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/29/2022] [Revised: 04/02/2023] [Accepted: 04/30/2023] [Indexed: 12/05/2023]
Abstract
CRISPR/Cas programmable nuclease systems have become ubiquitous in the field of gene editing. With progressing development, applications in in vivo therapeutic gene editing are increasingly within reach, yet limited by possible adverse side effects from unwanted edits. Recent years have thus seen continuous development of off-target prediction algorithms trained on in vitro cleavage assay data gained from immortalised cell lines. It has been shown that in contrast to experimental epigenetic features, computed physically informed features are so far underutilised despite bearing considerably larger correlation with cleavage activity. Here, we implement state-of-the-art deep learning algorithms and feature encodings for off-target prediction with emphasis on physically informed features that capture the biological environment of the cleavage site, hence terming our approach piCRISPR. Features were gained from the large, diverse crisprSQL off-target cleavage dataset. We find that our best-performing models highlight the importance of sequence context and chromatin accessibility for cleavage prediction and compare favourably with literature standard prediction performance. We further show that our novel, environmentally sensitive features are crucial to accurate prediction on sequence-identical locus pairs, making them highly relevant for clinical guide design. The source code and trained models can be found ready to use at github.com/florianst/picrispr.
Affiliation(s)
- Florian Störtz
  - Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
- Jeffrey K. Mak
  - Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
- Peter Minary
  - Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
7. Motoche-Monar C, Ordoñez JE, Chang O, Gonzales-Zubiate FA. gRNA Design: How Its Evolution Impacted on CRISPR/Cas9 Systems Refinement. Biomolecules 2023; 13:1698. [PMID: 38136570 PMCID: PMC10741458 DOI: 10.3390/biom13121698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Revised: 06/05/2023] [Accepted: 06/12/2023] [Indexed: 12/24/2023]
Abstract
Over the past decade, genetic engineering has witnessed a revolution with the emergence of a relatively new genetic editing tool based on RNA-guided nucleases: the CRISPR/Cas9 system. Since the first report in 1987 and characterization in 2007 as a bacterial defense mechanism, this system has garnered immense interest and research attention. CRISPR systems provide immunity to bacteria against invading genetic material; however, with specific modifications in sequence and structure, they become precise editing systems capable of modifying the genomes of a wide range of organisms. The refinement of these modifications encompasses diverse approaches, including the development of more accurate nucleases, understanding of the cellular context and epigenetic conditions, and the redesign of guide RNAs (gRNAs). Considering the critical importance of the correct performance of CRISPR/Cas9 systems, our scope will emphasize the latter approach. Hence, we present an overview of the past and the most recent guide RNA web-based design tools, highlighting the evolution of their computational architecture and gRNA characteristics over the years. Our study explains computational approaches that use machine learning techniques, neural networks, and gRNA/target interactions data to enable predictions and classifications. This review could open the door to a dynamic community that uses up-to-date algorithms to optimize and create promising gRNAs, suitable for modern CRISPR/Cas9 engineering.
Affiliation(s)
- Cristofer Motoche-Monar
  - School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- Julián E. Ordoñez
  - School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
- Oscar Chang
  - Departamento de Electrónica, Universidad Simon Bolivar, Caracas 1080, Venezuela
  - MIND Research Group, Model Intelligent Networks Development, Urcuquí 100119, Ecuador
- Fernando A. Gonzales-Zubiate
  - School of Biological Sciences and Engineering, Yachay Tech University, Urcuquí 100119, Ecuador
  - MIND Research Group, Model Intelligent Networks Development, Urcuquí 100119, Ecuador
8. Liu Y, Fan R, Yi J, Cui Q, Cui C. A fusion framework of deep learning and machine learning for predicting sgRNA cleavage efficiency. Comput Biol Med 2023; 165:107476. [PMID: 37696181 DOI: 10.1016/j.compbiomed.2023.107476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 08/22/2023] [Accepted: 09/04/2023] [Indexed: 09/13/2023]
Abstract
The CRISPR/Cas9 system is a powerful tool for genome editing. Numerous studies have shown that sgRNAs can strongly affect the efficiency of editing. However, it is still not clear what rules should be followed for designing sgRNAs with high cleavage efficiency. At present, several machine learning or deep learning methods have been developed to predict the cleavage efficiency of sgRNAs; however, the prediction accuracy of these tools is still not satisfactory. Here we propose a fusion framework of deep learning and machine learning, which first processes the primary sequence and secondary structure features of the sgRNAs using both a convolutional neural network (CNN) and a recurrent neural network (RNN), and then uses the features extracted by the deep neural network to train a conventional machine learning model with LGBM. As a result, the new approach outperformed previous methods. The Spearman's correlation coefficient between predicted and measured sgRNA cleavage efficiency of our model (0.917) is improved by over 5% compared with the most advanced method (0.865), and the mean square error is reduced from 7.89 × 10⁻³ to 4.75 × 10⁻³. Finally, we developed an online tool, CRISep (http://www.cuilab.cn/CRISep), to evaluate the availability of sgRNAs based on our models.
Affiliation(s)
- Yu Liu
  - Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Rui Fan
  - Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Jingkun Yi
  - Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Qinghua Cui
  - Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Chunmei Cui
  - Department of Biomedical Informatics, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
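The two evaluation metrics quoted in the Liu et al. abstract above, Spearman's rank correlation and mean squared error, can be sketched in plain Python as follows. This is a generic illustration and not the authors' evaluation code. All function names are hypothetical, and the only figures taken from the abstract are the two correlation values used in the relative-improvement calculation.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def mean_squared_error(y_true, y_pred):
    """Mean of squared differences between measured and predicted values."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def relative_improvement(new, old):
    """Fractional gain of `new` over `old`."""
    return (new - old) / old


# Correlation values quoted in the abstract: 0.917 (new) versus 0.865 (previous best).
gain = relative_improvement(0.917, 0.865)
```

With the quoted values, `gain` comes to roughly 0.060, which matches the abstract's "over 5%" when the improvement is read as a relative change.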
9. Zhang G, Luo Y, Dai X, Dai Z. Benchmarking deep learning methods for predicting CRISPR/Cas9 sgRNA on- and off-target activities. Brief Bioinform 2023; 24:bbad333. [PMID: 37775147 DOI: 10.1093/bib/bbad333] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/27/2023] [Revised: 08/31/2023] [Accepted: 09/04/2023] [Indexed: 10/01/2023]
Abstract
In silico design of single guide RNA (sgRNA) plays a critical role in clustered regularly interspaced, short palindromic repeats/CRISPR-associated protein 9 (CRISPR/Cas9) system. Continuous efforts are aimed at improving sgRNA design with efficient on-target activity and reduced off-target mutations. In the last 5 years, an increasing number of deep learning-based methods have achieved breakthrough performance in predicting sgRNA on- and off-target activities. Nevertheless, it is worthwhile to systematically evaluate these methods for their predictive abilities. In this review, we conducted a systematic survey on the progress in prediction of on- and off-target editing. We investigated the performances of 10 mainstream deep learning-based on-target predictors using nine public datasets with different sample sizes. We found that in most scenarios, these methods showed superior predictive power on large- and medium-scale datasets than on small-scale datasets. In addition, we performed unbiased experiments to provide in-depth comparison of eight representative approaches for off-target prediction on 12 publicly available datasets with various imbalanced ratios of positive/negative samples. Most methods showed excellent performance on balanced datasets but have much room for improvement on moderate- and severe-imbalanced datasets. This study provides comprehensive perspectives on CRISPR/Cas9 sgRNA on- and off-target activity prediction and improvement for method development.
Affiliation(s)
- Guishan Zhang
  - College of Engineering, Shantou University, Shantou 515063, China
- Ye Luo
  - College of Engineering, Shantou University, Shantou 515063, China
- Xianhua Dai
  - School of Cyber Science and Technology, Sun Yat-sen University, Shenzhen 518107, China
  - Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai 519000, China
- Zhiming Dai
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
  - Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-sen University, Guangzhou 510006, China
10. Ham DT, Browne TS, Banglorewala PN, Wilson TL, Michael RK, Gloor GB, Edgell DR. A generalizable Cas9/sgRNA prediction model using machine transfer learning with small high-quality datasets. Nat Commun 2023; 14:5514. [PMID: 37679324 PMCID: PMC10485023 DOI: 10.1038/s41467-023-41143-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/22/2023] [Accepted: 08/24/2023] [Indexed: 09/09/2023]
Abstract
The CRISPR/Cas9 nuclease from Streptococcus pyogenes (SpCas9) can be used with single guide RNAs (sgRNAs) as a sequence-specific antimicrobial agent and as a genome-engineering tool. However, current bacterial sgRNA activity models struggle with accurate predictions and do not generalize well, possibly because the underlying datasets used to train the models do not accurately measure SpCas9/sgRNA activity and cannot distinguish on-target cleavage from toxicity. Here, we solve this problem by using a two-plasmid positive selection system to generate high-quality data that more accurately reports on SpCas9/sgRNA cleavage and that separates activity from toxicity. We develop a machine learning architecture (crisprHAL) that can be trained on existing datasets, that shows marked improvements in sgRNA activity prediction accuracy when transfer learning is used with small amounts of high-quality data, and that can generalize predictions to different bacteria. The crisprHAL model recapitulates known SpCas9/sgRNA-target DNA interactions and provides a pathway to a generalizable sgRNA bacterial activity prediction tool that will enable accurate antimicrobial and genome engineering applications.
Affiliation(s)
- Dalton T Ham
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Tyler S Browne
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Pooja N Banglorewala
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- Gregory B Gloor
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
- David R Edgell
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, London, ON, N6A5C1, Canada
11. Lee M. Deep learning in CRISPR-Cas systems: a review of recent studies. Front Bioeng Biotechnol 2023; 11:1226182. [PMID: 37469443 PMCID: PMC10352112 DOI: 10.3389/fbioe.2023.1226182] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/24/2023] [Accepted: 06/22/2023] [Indexed: 07/21/2023]
Abstract
In genetic engineering, the revolutionary CRISPR-Cas system has proven to be a vital tool for precise genome editing. Simultaneously, the emergence and rapid evolution of deep learning methodologies has provided an impetus to the scientific exploration of genomic data. These concurrent advancements mandate regular investigation of the state-of-the-art, particularly given the pace of recent developments. This review focuses on the significant progress achieved during 2019-2023 in the utilization of deep learning for predicting guide RNA (gRNA) activity in the CRISPR-Cas system, a key element determining the effectiveness and specificity of genome editing procedures. In this paper, an analytical overview of contemporary research is provided, with emphasis placed on the amalgamation of artificial intelligence and genetic engineering. The importance of our review is underscored by the necessity to comprehend the rapidly evolving deep learning methodologies and their potential impact on the effectiveness of the CRISPR-Cas system. By analyzing recent literature, this review highlights the achievements and emerging trends in the integration of deep learning with the CRISPR-Cas systems, thus contributing to the future direction of this essential interdisciplinary research area.
12. Sherkatghanad Z, Abdar M, Charlier J, Makarenkov V. Using traditional machine learning and deep learning methods for on- and off-target prediction in CRISPR/Cas9: a review. Brief Bioinform 2023; 24:bbad131. [PMID: 37080758 DOI: 10.1093/bib/bbad131] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/02/2022] [Revised: 03/07/2023] [Accepted: 03/13/2023] [Indexed: 04/22/2023]
Abstract
CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated protein 9) is a popular and effective two-component technology used for targeted genetic manipulation. It is currently the most versatile and accurate method of gene and genome editing, which benefits from a large variety of practical applications. For example, in biomedicine, it has been used in research related to cancer, virus infections, pathogen detection, and genetic diseases. Current CRISPR/Cas9 research is based on data-driven models for on- and off-target prediction as a cleavage may occur at non-target sequence locations. Nowadays, conventional machine learning and deep learning methods are applied on a regular basis to accurately predict on-target knockout efficacy and off-target profile of given single-guide RNAs (sgRNAs). In this paper, we present an overview and a comparative analysis of traditional machine learning and deep learning models used in CRISPR/Cas9. We highlight the key research challenges and directions associated with target activity prediction. We discuss recent advances in the sgRNA-DNA sequence encoding used in state-of-the-art on- and off-target prediction models. Furthermore, we present the most popular deep learning neural network architectures used in CRISPR/Cas9 prediction models. Finally, we summarize the existing challenges and discuss possible future investigations in the field of on- and off-target prediction. Our paper provides valuable support for academic and industrial researchers interested in the application of machine learning methods in the field of CRISPR/Cas9 genome editing.
Affiliation(s)
- Zeinab Sherkatghanad
  - Departement d'Informatique, Universite du Quebec a Montreal, H2X 3Y7, Montreal, QC, Canada
- Moloud Abdar
  - Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, 3216, Geelong, VIC, Australia
- Jeremy Charlier
  - Departement d'Informatique, Universite du Quebec a Montreal, H2X 3Y7, Montreal, QC, Canada
- Vladimir Makarenkov
  - Departement d'Informatique, Universite du Quebec a Montreal, H2X 3Y7, Montreal, QC, Canada
13. Vora DS, Yadav S, Sundar D. Hybrid Multitask Learning Reveals Sequence Features Driving Specificity in the CRISPR/Cas9 System. Biomolecules 2023; 13:641. [PMID: 37189388 DOI: 10.3390/biom13040641] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/21/2023] [Revised: 03/27/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023]
Abstract
CRISPR/Cas9 technology is capable of precisely editing genomes and is at the heart of various scientific and medical advances in recent times. The advances in biomedical research are hindered because of the inadvertent burden on the genome when genome editors are employed—the off-target effects. Although experimental screens to detect off-targets have allowed understanding the activity of Cas9, that knowledge remains incomplete as the rules do not extrapolate well to new target sequences. Off-target prediction tools developed recently have increasingly relied on machine learning and deep learning techniques to reliably understand the complete threat of likely off-targets because the rules that drive Cas9 activity are not fully understood. In this study, we present a count-based as well as deep-learning-based approach to derive sequence features that are important in deciding on Cas9 activity at a sequence. There are two major challenges in off-target determination—the identification of a likely site of Cas9 activity and the prediction of the extent of Cas9 activity at that site. The hybrid multitask CNN–biLSTM model developed, named CRISP–RCNN, simultaneously predicts off-targets and the extent of activity on off-targets. Employing methods of integrated gradients and weighting kernels for feature importance approximation, analysis of nucleotide and position preference, and mismatch tolerance have been performed.
Affiliation(s)
- Dhvani Sandip Vora
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Shashank Yadav
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Durai Sundar
  - Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
  - Yardi School of Artificial Intelligence, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
14. Dallo T, Krishnakumar R, Kolker SD, Ruffing AM. High-Density Guide RNA Tiling and Machine Learning for Designing CRISPR Interference in Synechococcus sp. PCC 7002. ACS Synth Biol 2023; 12:1175-1186. [PMID: 36893454 DOI: 10.1021/acssynbio.2c00653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/11/2023]
Abstract
While CRISPRi was previously established in Synechococcus sp. PCC 7002 (hereafter 7002), the design principles for guide RNA (gRNA) effectiveness remain largely unknown. Here, 76 strains of 7002 were constructed with gRNAs targeting three reporter systems to evaluate features that impact gRNA efficiency. Correlation analysis of the data revealed that important features of gRNA design include the position relative to the start codon, GC content, protospacer adjacent motif (PAM) site, minimum free energy, and targeted DNA strand. Unexpectedly, some gRNAs targeting upstream of the promoter region showed small but significant increases in reporter expression, and gRNAs targeting the terminator region showed greater repression than gRNAs targeting the 3' end of the coding sequence. Machine learning algorithms enabled prediction of gRNA effectiveness, with Random Forest having the best performance across all training sets. This study demonstrates that high-density gRNA data and machine learning can improve gRNA design for tuning gene expression in 7002.
Affiliation(s)
- Tessa Dallo
- Molecular and Microbiology, Sandia National Laboratories, P.O. Box 5800, MS 1413, Albuquerque, New Mexico 87185, United States
- Raga Krishnakumar
- Systems Biology, Sandia National Laboratories, P.O. Box 969, MS 9292, Livermore, California 94551, United States
- Stephanie D Kolker
- Molecular and Microbiology, Sandia National Laboratories, P.O. Box 5800, MS 1413, Albuquerque, New Mexico 87185, United States
- Anne M Ruffing
- Molecular and Microbiology, Sandia National Laboratories, P.O. Box 5800, MS 1413, Albuquerque, New Mexico 87185, United States
|
15
|
Wan Y, Jiang Z. TransCrispr: Transformer Based Hybrid Model for Predicting CRISPR/Cas9 Single Guide RNA Cleavage Efficiency. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1518-1528. [PMID: 36006888 DOI: 10.1109/tcbb.2022.3201631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
CRISPR/Cas9 is a widely used genome editing tool for site-directed modification of deoxyribonucleic acid (DNA) nucleotide sequences. However, accurately predicting and evaluating the on- and off-target effects of single guide RNA (sgRNA) remains one of the key problems for the CRISPR/Cas9 system. Using computational methods to obtain high cell-specific sensitivity and specificity is a prerequisite for the optimal design of sgRNAs. Building on previous work, we found that sgRNA on-target knockout efficacy is related not only to the original sequence but also to important biological features. Hence, we introduce a novel approach called TransCrispr, which integrates Transformer and convolutional neural network (CNN) architectures to predict sgRNA knockout efficacy. First, we encode the sequence data and feed the transformed sgRNA sequence, positional information, and biological features into the network as input. Then, the convolutional neural network automatically learns an appropriate feature representation for the sgRNA sequence and combines it with the positional information for self-attention learning in the Transformer. Finally, a regression score is generated by predicting biological features. Experiments on seven public datasets illustrate that TransCrispr outperforms state-of-the-art methods in terms of prediction accuracy and generalization ability.
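The abstract says the sgRNA sequence is encoded and passed to the network together with positional information, but it does not name the encoding. The sketch below assumes one-hot encoding, a common choice for nucleotide sequences, purely for illustration; `one_hot_encode` and `with_positions` are hypothetical helper names, not part of TransCrispr.

```python
NUCLEOTIDES = "ACGT"  # column order of the one-hot vector (an assumption)


def one_hot_encode(seq: str) -> list[list[int]]:
    """Map each base to a 4-element indicator vector in A, C, G, T order."""
    encoded = []
    for base in seq.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unsupported base: {base!r}")
        encoded.append([1 if base == n else 0 for n in NUCLEOTIDES])
    return encoded


def with_positions(seq: str) -> list[tuple[int, list[int]]]:
    """Pair each one-hot row with its 0-based position in the guide."""
    return list(enumerate(one_hot_encode(seq)))


if __name__ == "__main__":
    for position, row in with_positions("GACC"):
        print(position, row)
```

In a real model the position index would normally be turned into a learned or sinusoidal positional embedding before the attention layers; that step is omitted here.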
|
16
|
Alipanahi R, Safari L, Khanteymoori A. CRISPR genome editing using computational approaches: A survey. FRONTIERS IN BIOINFORMATICS 2023; 2:1001131. [PMID: 36710911 PMCID: PMC9875887 DOI: 10.3389/fbinf.2022.1001131] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2022] [Accepted: 12/19/2022] [Indexed: 01/13/2023] Open
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR)-based gene editing has been widely used in various cell types and organisms. To make genome editing with CRISPR far more precise and practical, we must concentrate on the design of optimal gRNA and the selection of appropriate Cas enzymes. Numerous computational tools have been created in recent years to help researchers design the best gRNA for CRISPR research. There are two approaches for designing an appropriate gRNA sequence (one that targets the desired sites with high precision): experimental and prediction-based approaches. It is essential to reduce off-target sites when designing an optimal gRNA. Here we review both traditional and machine learning-based approaches for designing an appropriate gRNA sequence and predicting off-target sites. In this review, we summarize the key characteristics of all available tools (as far as possible) and compare them. Machine learning-based tools and web servers are expected to become the most effective and reliable methods for predicting on-target and off-target activities of CRISPR in the future. However, these predictions are not yet precise, and the performance of these algorithms, especially the deep learning ones, depends on the amount of data used during the training phase. As more features are discovered and incorporated into these models, predictions should come more in line with experimental observations. We must concentrate on the creation of ideal gRNA and the choice of suitable Cas enzymes in order to make genome editing with CRISPR far more accurate and feasible.
Affiliation(s)
- Leila Safari
- Department of Computer Engineering, University of Zanjan, Zanjan, Iran (correspondence: Leila Safari)
|
17
|
Comprehensive Review on the Use of Artificial Intelligence in Ophthalmology and Future Research Directions. Diagnostics (Basel) 2022; 13:diagnostics13010100. [PMID: 36611392 PMCID: PMC9818832 DOI: 10.3390/diagnostics13010100] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2022] [Revised: 12/12/2022] [Accepted: 12/26/2022] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Having several applications in medicine, and in ophthalmology in particular, artificial intelligence (AI) tools have been used to detect visual function deficits, thus playing a key role in diagnosing eye diseases and in predicting the evolution of these common and disabling diseases. AI tools, i.e., artificial neural networks (ANNs), are progressively involved in the detection and customized control of ophthalmic diseases. The studies that refer to the efficiency of AI in medicine and especially in ophthalmology were analyzed in this review. MATERIALS AND METHODS We conducted a comprehensive review in order to collect all accounts published between 2015 and 2022 that refer to these applications of AI in medicine and especially in ophthalmology. Neural networks have a major role in establishing the need to initiate preliminary anti-glaucoma therapy to stop the advance of the disease. RESULTS Different surveys in the literature review show the remarkable benefit of these AI tools in ophthalmology in evaluating the visual field, optic nerve, and retinal nerve fiber layer, thus ensuring a higher precision in detecting advances in glaucoma and retinal shifts in diabetes. We thus identified 1762 applications of artificial intelligence in ophthalmology: review articles and research articles (301 PubMed, 144 Scopus, 445 Web of Science, 872 ScienceDirect). Of these, we analyzed 70 articles and review papers (diabetic retinopathy (N = 24), glaucoma (N = 24), DMLV (N = 15), other pathologies (N = 7)) after applying the inclusion and exclusion criteria. CONCLUSION In medicine, AI tools are used in surgery, radiology, gynecology, oncology, etc., in making a diagnosis, predicting the evolution of a disease, and assessing the prognosis in patients with oncological pathologies. In ophthalmology, AI potentially increases the patient's access to screening/clinical diagnosis and decreases healthcare costs, mainly when there is a high risk of disease or communities face financial shortages. AI/DL (deep learning) algorithms using both OCT and FO images will change image analysis techniques and methodologies. Optimizing these (combined) technologies will accelerate progress in this area.
|
18
|
Comprehensive computational analysis of epigenetic descriptors affecting CRISPR-Cas9 off-target activity. BMC Genomics 2022; 23:805. [PMID: 36474180 PMCID: PMC9724382 DOI: 10.1186/s12864-022-09012-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2022] [Accepted: 10/17/2022] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND A common issue in CRISPR-Cas9 genome editing is off-target activity, which prevents the widespread use of CRISPR-Cas9 in medical applications. Among other factors, primary chromatin structure and epigenetics may influence off-target activity. METHODS In this work, we utilize crisprSQL, an off-target database, to analyze the effect of 19 epigenetic descriptors on CRISPR-Cas9 off-target activity. Termed as 19 epigenetic features/scores, they consist of 6 experimental epigenetic and 13 computed nucleosome organization-related features. In terms of novel features, 15 of the epigenetic scores are newly considered. The 15 newly considered scores consist of 13 freshly computed nucleosome occupancy/positioning scores and 2 experimental features (MNase and DRIP). The other 4 existing scores are experimental features (CTCF, DNase I, H3K4me3, RRBS) commonly used in deep learning models for off-target activity prediction. For data curation, MNase was aggregated from existing experimental nucleosome occupancy data. Based on the sequence context information available in crisprSQL, we also computed nucleosome occupancy/positioning scores for off-target sites. RESULTS To investigate the relationship between the 19 epigenetic features and off-target activity, we first conducted Spearman and Pearson correlation analysis. Such analysis shows that some computed scores derived from training-based models and training-free algorithms outperform all experimental epigenetic features. Next, we evaluated the contribution of all epigenetic features in two successful machine/deep learning models which predict off-target activity. We found that some computed scores, unlike all 6 experimental features, significantly contribute to the predictions of both models. As a practical research contribution, we make the off-target dataset containing all 19 epigenetic features available to the research community. 
CONCLUSIONS Our comprehensive computational analysis helps the CRISPR-Cas9 community better understand the relationship between epigenetic features and CRISPR-Cas9 off-target activity.
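The study above relates each epigenetic feature to off-target activity through Pearson and Spearman correlation. For readers unfamiliar with the two statistics, here is a minimal standard-library sketch, with Spearman computed as the Pearson correlation of tie-averaged ranks. The function names and the toy numbers are illustrative assumptions and do not come from crisprSQL.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: list[float]) -> list[float]:
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy values only: a hypothetical feature score versus cleavage activity.
    score = [0.1, 0.4, 0.35, 0.8, 0.9]
    activity = [0.02, 0.10, 0.20, 0.50, 0.45]
    print(pearson(score, activity), spearman(score, activity))
```

Spearman is the more forgiving of the two when the relationship is monotonic but not linear, which is one reason studies often report both.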
|
19
|
EpiCas-DL: Predicting sgRNA activity for CRISPR-mediated epigenome editing by deep learning. Comput Struct Biotechnol J 2022; 21:202-211. [PMID: 36582444 PMCID: PMC9763632 DOI: 10.1016/j.csbj.2022.11.034] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2022] [Revised: 11/15/2022] [Accepted: 11/15/2022] [Indexed: 11/21/2022] Open
Abstract
CRISPR-mediated epigenome editing enables gene expression regulation without changing the underlying DNA sequence, and thus has vast potential for basic research and gene therapy. Effective selection of a single guide RNA (sgRNA) with high on-target efficiency and specificity would facilitate the application of epigenome editing tools. Here we performed an extensive analysis of CRISPR-mediated epigenome editing tools on thousands of experimentally examined on-target sites and established EpiCas-DL, a deep learning framework to optimize sgRNA design for gene silencing or activation. EpiCas-DL achieves high accuracy in sgRNA activity prediction for targeted gene silencing or activation and outperforms other available in silico methods. In addition, EpiCas-DL also identifies both epigenetic and sequence features that affect sgRNA efficacy in gene silencing and activation, facilitating the application of epigenome editing for research and therapy. EpiCas-DL is available at http://www.sunlab.fun:3838/EpiCas-DL.
|
20
|
Integration of CRISPR/Cas9 with artificial intelligence for improved cancer therapeutics. J Transl Med 2022; 20:534. [PMID: 36401282 PMCID: PMC9673220 DOI: 10.1186/s12967-022-03765-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2022] [Accepted: 11/08/2022] [Indexed: 11/19/2022] Open
Abstract
Gene editing has great potential in treating diseases caused by well-characterized molecular alterations. The introduction of clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 (Cas9)–based gene-editing tools has substantially improved the precision and efficiency of gene editing. The CRISPR/Cas9 system offers several advantages over the existing gene-editing approaches, such as its ability to target practically any genomic sequence, enabling the rapid development and deployment of novel CRISPR-mediated knock-out/knock-in methods. CRISPR/Cas9 has been widely used to develop cancer models, validate essential genes as druggable targets, study drug-resistance mechanisms, explore gene non-coding areas, and develop biomarkers. CRISPR gene editing can create more-effective chimeric antigen receptor (CAR)-T cells that are durable, cost-effective, and more readily available. However, further research is needed to define the CRISPR/Cas9 system’s pros and cons, establish best practices, and determine social and ethical implications. This review summarizes recent CRISPR/Cas9 developments, particularly in cancer research and immunotherapy, and the potential of CRISPR/Cas9-based screening in developing cancer precision medicine and engineering models for targeted cancer therapy, highlighting the existing challenges and future directions. Lastly, we highlight the role of artificial intelligence in refining the CRISPR system's on-target and off-target effects, a critical factor for the broader application in cancer therapeutics.
|
21
|
Panda G, Ray A. Decrypting the mechanistic basis of CRISPR/Cas9 protein. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2022; 172:60-76. [PMID: 35577099 DOI: 10.1016/j.pbiomolbio.2022.05.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/07/2021] [Revised: 04/14/2022] [Accepted: 05/10/2022] [Indexed: 12/25/2022]
Abstract
The CRISPR/Cas system, a recent but extensively investigated genome-editing method, offers practical solutions for various genetic problems. It relies on short guide RNAs (gRNAs) to recruit the Cas9 protein, a DNA-cleaving enzyme, to its genomic target DNAs. The Cas9 enzyme exhibits some unique properties: the ability to differentiate self from non-self DNA strands using the base-pairing potential of the crRNA (only CRISPR DNA is entirely complementary to the CRISPR repeat sequences in the crRNA, whereas mismatches in the upstream region of the spacer permit CRISPR interference, which is inhibited in the case of CRISPR DNA), allosteric regulation of its domains, and domain reorientation on sgRNA binding. Several groups have contributed to understanding the functioning of the CRISPR/Cas system, but much remains to be explored in this area. The structural and sequence-based understanding of the whole CRISPR-associated bacterial ortholog family landscape is still ambiguous. A better understanding of the underlying energetics of the CRISPR/Cas9 system should reveal critical parameters to design better CRISPR/Cas9s.
Affiliation(s)
- Gayatri Panda
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
- Arjun Ray
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
|
22
|
Niu M, Zou Q. SgRNA-RF: Identification of SgRNA On-Target Activity With Imbalanced Datasets. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2442-2453. [PMID: 33979289 DOI: 10.1109/tcbb.2021.3079116] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
Single-guide RNA is a guide RNA (gRNA), which guides the insertion or deletion of uridine residues into kinetoplastid during RNA editing. It is a small non-coding RNA that can pair with pre-mRNA. SgRNA is a critical component of the CRISPR/Cas9 gene knockout system and plays an important role in gene editing and gene regulation. It is important to accurately and quickly identify sgRNAs with high on-target activity. Due to its importance, several computational predictors have been proposed to predict sgRNA on-target activity. All these methods have clearly contributed to the development of this very important field. However, they also have certain limitations. In this paper, we developed a new classifier, SgRNA-RF, which extracts features of the nucleic acid composition and structure of on-target activity sgRNA sequences and identifies them with the random forest algorithm. In addition, to address the imbalanced dataset, this paper proposes a new method called CS-Smote. We compared SgRNA-RF with state-of-the-art predictors on the five datasets, and found that SgRNA-RF significantly improved the identification accuracy, with accuracies of 0.8636, 0.9161, 0.894, 0.938, 0.965, 0.77, 0.979, and 0.973, respectively. The user-friendly web server that implements SgRNA-RF is freely available at http://server.malab.cn/sgRNA-RF/.
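The abstract does not describe how CS-Smote works, so the sketch below shows only the standard SMOTE idea it presumably builds on: creating synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbours. It is not the authors' method, and the names `smote_like_oversample` and `sq_dist`, the parameters, and the toy data are assumptions made for illustration.

```python
import random


def sq_dist(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like_oversample(minority: list[list[float]], n_new: int,
                          k: int = 2, seed: int = 0) -> list[list[float]]:
    """Generate n_new synthetic points by plain SMOTE-style interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)          # seeded for reproducible output
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()               # interpolation weight in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


if __name__ == "__main__":
    minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    for point in smote_like_oversample(minority_class, n_new=3):
        print(point)
```

The synthetic points would be appended to the minority class before training a classifier such as a random forest, so that both classes carry comparable weight.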
|
23
|
Mattiello L, Rütgers M, Sua-Rojas MF, Tavares R, Soares JS, Begcy K, Menossi M. Molecular and Computational Strategies to Increase the Efficiency of CRISPR-Based Techniques. FRONTIERS IN PLANT SCIENCE 2022; 13:868027. [PMID: 35712599 PMCID: PMC9194676 DOI: 10.3389/fpls.2022.868027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/02/2022] [Accepted: 04/27/2022] [Indexed: 06/15/2023]
Abstract
The prokaryote-derived Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas mediated gene editing tools have revolutionized our ability to precisely manipulate specific genome sequences in plants and animals. The simplicity, precision, affordability, and robustness of this technology have allowed a myriad of genomes from a diverse group of plant species to be successfully edited. Even though CRISPR/Cas, base editing, and prime editing technologies have been rapidly adopted and implemented in plants, their editing efficiency and specificity vary greatly. In this review, we provide a critical overview of the recent advances in CRISPR/Cas9-derived technologies and their implications for enhancing editing efficiency. We highlight the major efforts in engineering Cas9, Cas12a, Cas12b, and Cas12f proteins to improve their efficiencies. We also provide a perspective on the global future of agriculturally based products using DNA-free CRISPR/Cas techniques. Improving the efficiency of CRISPR-based technologies will enable the implementation of genome editing tools in a variety of crop plants, as well as accelerate progress in basic research and molecular breeding.
Affiliation(s)
- Lucia Mattiello
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, State University of Campinas (UNICAMP), Campinas, Brazil
- Mark Rütgers
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, State University of Campinas (UNICAMP), Campinas, Brazil
- Maria Fernanda Sua-Rojas
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, State University of Campinas (UNICAMP), Campinas, Brazil
- Rafael Tavares
- Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- José Sérgio Soares
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, State University of Campinas (UNICAMP), Campinas, Brazil
- Kevin Begcy
- Environmental Horticulture Department, University of Florida, Gainesville, FL, United States
- Marcelo Menossi
- Department of Genetics, Evolution, Microbiology and Immunology, Institute of Biology, State University of Campinas (UNICAMP), Campinas, Brazil
|
24
|
Konstantakos V, Nentidis A, Krithara A, Paliouras G. CRISPR-Cas9 gRNA efficiency prediction: an overview of predictive tools and the role of deep learning. Nucleic Acids Res 2022; 50:3616-3637. [PMID: 35349718 PMCID: PMC9023298 DOI: 10.1093/nar/gkac192] [Citation(s) in RCA: 51] [Impact Index Per Article: 25.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2021] [Revised: 03/09/2022] [Accepted: 03/28/2022] [Indexed: 12/26/2022] Open
Abstract
The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) system has become a successful and promising technology for gene-editing. To facilitate its effective application, various computational tools have been developed. These tools can assist researchers in the guide RNA (gRNA) design process by predicting cleavage efficiency and specificity and excluding undesirable targets. However, while many tools are available, assessment of their application scenarios and performance benchmarks are limited. Moreover, new deep learning tools have been explored lately for gRNA efficiency prediction, but have not been systematically evaluated. Here, we discuss the approaches that pertain to the on-target activity problem, focusing mainly on the features and computational methods they utilize. Furthermore, we evaluate these tools on independent datasets and give some suggestions for their usage. We conclude with some challenges and perspectives about future directions for CRISPR-Cas9 guide design.
Affiliation(s)
- Vasileios Konstantakos
- Institute of Informatics and Telecommunications, NCSR Demokritos, Patr. Gregoriou E & 27 Neapoleos Str, 15341 Athens, Greece
- Anastasios Nentidis
- Institute of Informatics and Telecommunications, NCSR Demokritos, Patr. Gregoriou E & 27 Neapoleos Str, 15341 Athens, Greece
- School of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
- Anastasia Krithara
- Institute of Informatics and Telecommunications, NCSR Demokritos, Patr. Gregoriou E & 27 Neapoleos Str, 15341 Athens, Greece
- Georgios Paliouras
- Institute of Informatics and Telecommunications, NCSR Demokritos, Patr. Gregoriou E & 27 Neapoleos Str, 15341 Athens, Greece
|
25
|
Li B, Ai D, Liu X. CNN-XG: A Hybrid Framework for sgRNA On-Target Prediction. Biomolecules 2022; 12:409. [PMID: 35327601 PMCID: PMC8945678 DOI: 10.3390/biom12030409] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/26/2022] [Revised: 02/23/2022] [Accepted: 03/03/2022] [Indexed: 02/04/2023] Open
Abstract
As a third-generation gene editing technology, CRISPR/Cas9 has a wide range of applications. The success of CRISPR depends on the editing of the target gene via a functional complex of sgRNA and Cas9 proteins. Therefore, sgRNAs with high specificity and high on-target cleavage efficiency can make this process more accurate and efficient. Although there are already many sophisticated machine learning or deep learning models to predict the on-target cleavage efficiency of sgRNA, prediction accuracy remains to be improved. XGBoost is good at classification because an ensemble model can overcome the deficiencies of a single classifier, and we aim to improve the prediction of sgRNA on-target activity by introducing XGBoost into the model. We present a novel machine learning framework which combines a convolutional neural network (CNN) and XGBoost to predict sgRNA on-target knockout efficacy. Our framework, called CNN-XG, is mainly composed of two parts: a CNN feature extractor automatically extracts features from sequences, and an XGBoost predictor is applied to the features extracted after convolution. Experiments on commonly used datasets show that CNN-XG performed significantly better than other existing frameworks in the predicted classification mode.
Affiliation(s)
- Bohao Li
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China
- Dongmei Ai
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China
- Basic Experimental Center of Natural Science, University of Science and Technology Beijing, Beijing 100083, China
- Xiuqin Liu
- School of Mathematics and Physics, University of Science and Technology Beijing, Beijing 100083, China
|
26
|
A systematic mapping study on machine learning techniques for the prediction of CRISPR/Cas9 sgRNA target cleavage. Comput Struct Biotechnol J 2022; 20:5813-5823. [PMID: 36382194 PMCID: PMC9630617 DOI: 10.1016/j.csbj.2022.10.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2022] [Revised: 09/21/2022] [Accepted: 10/08/2022] [Indexed: 11/30/2022] Open
Abstract
CRISPR/Cas9 technology has greatly accelerated genome engineering research. The CRISPR/Cas9 complex, a bacterial immune response system, is widely adopted for RNA-driven targeted genome editing. The systematic mapping study presented in this paper examines the literature on machine learning (ML) techniques employed in the prediction of CRISPR/Cas9 sgRNA on/off-target cleavage, focusing on improving support in sgRNA design activities and identifying areas currently being researched. This area of research has greatly expanded recently, and we found it appropriate to work on a Systematic Mapping Study (SMS), an investigation that has proven to be an effective secondary study method. Unlike a classic review, in an SMS, no comparison of methods or results is made, while this task can instead be the subject of a systematic literature review that chooses one theme among those highlighted in this SMS. The study is illustrated in this paper. To the best of the authors' knowledge, no other SMS studies have been published on this topic. Fifty-seven papers published in the period 2017–2022 (April, 30) were analyzed. This study reveals that the most widely used ML model is the convolutional neural network (CNN), followed by the feedforward neural network (FNN), while the use of other models is marginal. Other interesting information has emerged, such as the wide availability of both open code and platforms dedicated to supporting the activity of researchers or the fact that there is a clear prevalence of public funds that finance research on this topic.
|
27
|
Abstract
Clustered regularly interspaced short palindromic repeats along with CRISPR-associated protein mechanisms preserve the memory of previous experiences with DNA invaders, in particular spacers that are embedded in CRISPR arrays between coordinate repeats. There has been a fast progression in the comprehension of this immune system and its implementations; however, there are numerous points of view that anticipate explanations to make the field an energetic research zone. The efficiency of CRISPR-Cas depends upon well-considered single guide RNA; for this purpose, many bioinformatics methods and tools have been created to support the design of highly active and precise single guide RNA. In silico single guide RNA architecture is a crucial point for effective gene editing by means of the CRISPR technique. Persistent attempts have been made to improve in silico single guide RNA formulation having great on-target effectiveness and decreased off-target effects. This review offers a summary of the CRISPR computational tools to help different researchers pick a specific tool for their work according to pros and cons, along with new thoughts to make new computational tools to overcome all existing limitations.
Affiliation(s)
- Mohsin Ali Nasir
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
- Samia Nawaz
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
- Jian Huang
- Center for Informational Biology, University of Electronic Science and Technology of China, No. 2006, Xiyuan Ave, West Hi-Tech Zone, Chengdu 611731, China
|
28
|
Xiao LM, Wan YQ, Jiang ZR. AttCRISPR: a spacetime interpretable model for prediction of sgRNA on-target activity. BMC Bioinformatics 2021; 22:589. [PMID: 34903170 PMCID: PMC8667445 DOI: 10.1186/s12859-021-04509-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2021] [Accepted: 12/01/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND More and more Cas9 variants with higher specificity are being developed to avoid off-target effects, which generates a significant volume of experimental data. Conventional machine learning performs poorly on these datasets, while methods based on deep learning often lack interpretability, which forces researchers to trade off accuracy against interpretability. It is necessary to develop a method that not only matches deep learning-based methods in performance but also offers interpretability comparable to conventional machine learning methods. RESULTS To overcome these problems, we propose an intrinsically interpretable method called AttCRISPR based on deep learning to predict on-target activity. The advantage of AttCRISPR lies in using an ensemble learning strategy to stack available encoding-based methods and embedding-based methods with strong interpretability. In comparison with state-of-the-art methods on the WT-SpCas9, eSpCas9(1.1), and SpCas9-HF1 datasets, AttCRISPR achieves average Spearman values of 0.872, 0.867, and 0.867, respectively, on several public datasets, which is superior to these methods. Furthermore, benefiting from two attention modules, one spatial and one temporal, AttCRISPR has good interpretability. Through these modules, we can understand the decisions made by AttCRISPR at both global and local levels without other post hoc explanation techniques. CONCLUSION With the trained models, we reveal the preference for each position-dependent nucleotide on the sgRNA (short guide RNA) sequence in each dataset at a global level. At a local level, we show that the interpretability of AttCRISPR can be used to guide researchers in designing sgRNAs with higher activity.
Affiliation(s)
- Li-Ming Xiao
- School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
- Yun-Qi Wan
- School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
- Zhen-Ran Jiang
- School of Computer Science and Technology, East China Normal University, Shanghai, 200062, China
|
29
|
Li X, Wang C, Peng T, Chai Z, Ni D, Liu Y, Zhang J, Chen T, Lu S. Atomic-scale insights into allosteric inhibition and evolutional rescue mechanism of Streptococcus thermophilus Cas9 by the anti-CRISPR protein AcrIIA6. Comput Struct Biotechnol J 2021; 19:6108-6124. [PMID: 34900128 PMCID: PMC8632846 DOI: 10.1016/j.csbj.2021.11.010] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2021] [Revised: 11/10/2021] [Accepted: 11/11/2021] [Indexed: 12/26/2022] Open
Abstract
CRISPR-Cas systems provide prokaryotes with adaptive immunity against invading phages and plasmids. Phages have evolved diverse protein inhibitors of CRISPR-Cas systems, called anti-CRISPR (Acr) proteins, to neutralize this CRISPR machinery. In response, bacteria have co-evolved Cas variants, called anti-anti-CRISPR systems, to escape the phages' anti-CRISPR strategies. Here we explore the anti-CRISPR allosteric inhibition and anti-anti-CRISPR rescue mechanisms between Streptococcus thermophilus Cas9 (St1Cas9) and the anti-CRISPR protein AcrIIA6 at the atomic level by generating mutants of key residues in St1Cas9. Extensive unbiased molecular dynamics simulations show that the functional motions of St1Cas9 in the presence of AcrIIA6 differ substantially from those of St1Cas9 alone. AcrIIA6 binding triggers a shift of the St1Cas9 conformational ensemble towards a less catalytically competent state; this state significantly compromises protospacer adjacent motif (PAM) recognition and nuclease activity by interdependently altering conformational dynamics and allosteric signals among nuclease domains, PAM-interacting (PI) regions, and AcrIIA6 binding motifs. Via in vitro DNA cleavage assays, we further elucidate the rescue mechanism by which St1Cas9 harboring triple mutations (G993K/K1008M/K1010E) in the PI domain efficiently escapes AcrIIA6 inhibition, and identify the evolutionary landscape of such mutational escape within species. Our results provide mechanistic insights into Acr proteins as natural brakes for CRISPR-Cas systems and show promising potential for the design of allosteric Acr peptidomimetics.
Affiliation(s)
- Xinyi Li
  - Department of Cardiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
- Chengxiang Wang
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
- Ting Peng
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
- Zongtao Chai
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, Naval Medical University, Shanghai 200438, China
- Duan Ni
  - The Charles Perkins Centre, University of Sydney, Sydney, NSW 2006, Australia
- Yaqin Liu
  - Medicinal Chemistry and Bioinformatics Centre, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
- Jian Zhang
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
  - Medicinal Chemistry and Bioinformatics Centre, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
- Ting Chen
  - Department of Cardiology, Changzheng Hospital, Naval Medical University, Shanghai 200003, China
- Shaoyong Lu
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
  - Medicinal Chemistry and Bioinformatics Centre, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
|
30
|
Hesami M, Yoosefzadeh Najafabadi M, Adamek K, Torkamaneh D, Jones AMP. Synergizing Off-Target Predictions for In Silico Insights of CENH3 Knockout in Cannabis through CRISPR/Cas. Molecules 2021; 26:2053. [PMID: 33916717 PMCID: PMC8038328 DOI: 10.3390/molecules26072053] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 02/05/2021] [Revised: 03/25/2021] [Accepted: 03/31/2021] [Indexed: 12/13/2022] Open
Abstract
The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas-mediated genome editing system has recently been used for haploid production in plants. Haploid induction using the CRISPR/Cas system represents an attractive approach in cannabis, an economically important industrial, recreational, and medicinal plant. However, the CRISPR system requires the design of precise (on-target) single-guide RNA (sgRNA). Therefore, it is essential to predict the off-target activity of the designed sgRNAs to avoid unexpected outcomes. The current study aimed to assess the predictive ability of three machine learning (ML) algorithms (radial basis function (RBF), support vector machine (SVM), and random forest (RF)) alongside the ensemble-bagging (E-B) strategy by synergizing MIT and cutting frequency determination (CFD) scores to predict sgRNA off-target activity through in silico targeting of a histone H3-like centromeric protein, HTR12, in cannabis. The RF algorithm exhibited the highest precision, recall, and F-measure of all the tested individual algorithms, with values of 0.61, 0.64, and 0.62, respectively. We then used the RF algorithm as a meta-classifier for the E-B method, which led to an increased precision and F-measure of 0.62 and 0.66, respectively. The E-B algorithm had the highest area under the precision-recall curve (AUC-PRC; 0.74) and area under the receiver operating characteristic (ROC) curve (AUC-ROC; 0.71), demonstrating the success of E-B as one of the common ensemble strategies. This study constitutes a foundational resource for utilizing ML models to predict gRNA off-target activities in cannabis.
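As a quick consistency check on the figures above, the F-measure is the harmonic mean of precision and recall. The sketch below assumes the abstract reports the balanced F1 score (beta = 1); the function name is illustrative and not taken from the paper.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing is predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# RF values reported in the abstract: precision 0.61, recall 0.64.
rf_f1 = f_measure(0.61, 0.64)
print(round(rf_f1, 2))  # 0.62, matching the reported F-measure
```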
Affiliation(s)
- Mohsen Hesami
  - Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada; (M.H.); (M.Y.N.); (K.A.); (D.T.)
- Kristian Adamek
  - Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada; (M.H.); (M.Y.N.); (K.A.); (D.T.)
- Davoud Torkamaneh
  - Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada; (M.H.); (M.Y.N.); (K.A.); (D.T.)
  - Département de Phytologie, Université Laval, Québec City, QC G1V 0A6, Canada
- Andrew Maxwell Phineas Jones
  - Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada; (M.H.); (M.Y.N.); (K.A.); (D.T.)
  - Correspondence:
|
31
|
Niu M, Lin Y, Zou Q. sgRNACNN: identifying sgRNA on-target activity in four crops using ensembles of convolutional neural networks. Plant Molecular Biology 2021; 105:483-495. [PMID: 33385273 DOI: 10.1007/s11103-020-01102-y] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 05/17/2020] [Accepted: 12/01/2020] [Indexed: 06/12/2023]
Abstract
KEY MESSAGE We proposed an ensemble convolutional neural network model to identify sgRNAs with high on-target activity in four crops, using one-hot encoding and k-mers for sequence encoding. As an important component of the CRISPR/Cas9 system, single-guide RNA (sgRNA) plays a key role in gene redirection and editing. sgRNA has contributed substantially to the improvement of agronomic species, but there is a lack of effective bioinformatics tools to identify the activity of sgRNA in agronomic species. Therefore, it is necessary to develop a machine learning-based method to identify sgRNAs with high on-target activity. In this work, we proposed a simple convolutional neural network method for this purpose. Our study used one-hot encoding and k-mers for sequence data conversion and a voting algorithm to construct the convolutional neural network ensemble model sgRNACNN for the prediction of sgRNA activity. The ensemble model sgRNACNN was used for predictions in four crops: Glycine max, Zea mays, Sorghum bicolor and Triticum aestivum. The accuracy of sgRNACNN for the four crops was 82.43%, 80.33%, 78.25% and 87.49%, respectively. The experimental results showed that sgRNACNN can identify sgRNAs with high on-target activity in agronomic data and can meet the demands of sgRNA activity prediction in agronomy to a certain extent. These results can help guide crop gene editing and academic research. The source code and relevant dataset can be found at the following link: https://github.com/nmt315320/sgRNACNN.git.
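The two sequence encodings and the voting step named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the sgRNACNN implementation: the function names, the DNA alphabet ACGT, the choice of k = 2, and simple majority voting are all assumptions, since the abstract does not give those details.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed alphabet; sequences containing other symbols raise KeyError


def one_hot(seq):
    """Encode each nucleotide as a 4-element indicator vector (order A, C, G, T)."""
    table = {n: [1 if i == j else 0 for j in range(4)] for i, n in enumerate(NUCLEOTIDES)}
    return [table[n] for n in seq.upper()]


def kmer_counts(seq, k=2):
    """Count overlapping k-mers and return a fixed-length vector of 4**k counts."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product(NUCLEOTIDES, repeat=k)]


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


print(one_hot("AC"))              # [[1, 0, 0, 0], [0, 1, 0, 0]]
print(sum(kmer_counts("ACGT")))   # 3 overlapping 2-mers: AC, CG, GT
print(majority_vote([1, 0, 1]))   # 1
```

In an ensemble of this kind, each base model would be trained on one encoding of the same sequences, and `majority_vote` would combine their per-sequence class predictions.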
Affiliation(s)
- Mengting Niu
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yuan Lin
  - Department of System Integration, Sparebanken Vest, Bergen, Norway
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
|
32
|
Simpson KE, Venkateshappa R, Pang ZK, Faizi S, Tibbits GF, Claydon TW. Utility of Zebrafish Models of Acquired and Inherited Long QT Syndrome. Front Physiol 2021; 11:624129. [PMID: 33519527 PMCID: PMC7844309 DOI: 10.3389/fphys.2020.624129] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/30/2020] [Accepted: 12/21/2020] [Indexed: 01/12/2023] Open
Abstract
Long-QT Syndrome (LQTS) is a cardiac electrical disorder, distinguished by irregular heart rates and sudden death. LQTS Type 2 (LQTS2), which accounts for ∼40% of cases, is caused by defects in the Kv11.1 (hERG) potassium channel that is critical for cardiac repolarization. Drug block of hERG channels or dysfunctional channel variants can result in acquired or inherited LQTS2, respectively, which are typified by delayed repolarization and predisposition to lethal arrhythmia. As such, there is significant interest in clear identification of drugs and channel variants that produce clinically meaningful perturbation of hERG channel function. While toxicological screening of hERG channels and phenotypic assessment of inherited channel variants in heterologous systems are now commonplace, affordable, efficient, and insightful whole organ models for acquired and inherited LQTS2 are lacking. Recent work has shown that zebrafish provide a viable in vivo or whole organ model of cardiac electrophysiology. Characterization of cardiac ion currents and toxicological screening work in intact embryos, as well as adult whole hearts, has demonstrated the utility of the zebrafish model in contributing to the development of therapeutics that lack hERG-blocking off-target effects. Moreover, forward and reverse genetic approaches show zebrafish to be a tractable model in which LQTS2 can be studied. With the development of new tools and technologies, zebrafish lines carrying precise channel variants associated with LQTS2 have recently begun to be generated and explored. In this review, we discuss the present knowledge and questions raised related to the use of zebrafish as models of acquired and inherited LQTS2. We focus discussion, in particular, on developments in precise gene-editing approaches in zebrafish to create whole heart inherited LQTS2 models and evidence that zebrafish hearts can be used to study arrhythmogenicity and to identify potential anti-arrhythmic compounds.
Affiliation(s)
- Kyle E. Simpson
  - Molecular Cardiac Physiology Group, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
- Ravichandra Venkateshappa
  - Molecular Cardiac Physiology Group, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
- Zhao Kai Pang
  - Molecular Cardiac Physiology Group, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
- Shoaib Faizi
  - Molecular Cardiac Physiology Group, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
- Glen F. Tibbits
  - Molecular Cardiac Physiology Group, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
  - Department of Cardiovascular Science, British Columbia Children’s Hospital, Vancouver, BC, Canada
- Tom W. Claydon
  - Molecular Cardiac Physiology Group, Department of Biomedical Physiology and Kinesiology, Simon Fraser University, Burnaby, BC, Canada
|
33
|
Störtz F, Minary P. crisprSQL: a novel database platform for CRISPR/Cas off-target cleavage assays. Nucleic Acids Res 2021; 49:D855-D861. [PMID: 33084893 PMCID: PMC7778913 DOI: 10.1093/nar/gkaa885] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/14/2020] [Revised: 09/23/2020] [Accepted: 10/17/2020] [Indexed: 12/20/2022] Open
Abstract
With ongoing development of the CRISPR/Cas programmable nuclease system, applications in the area of in vivo therapeutic gene editing are increasingly within reach. However, non-negligible off-target effects remain a major concern for clinical applications. Even though a multitude of off-target cleavage datasets have been published, a comprehensive, transparent overview tool has not yet been established. Here, we present crisprSQL (http://www.crisprsql.com), an interactive and bioinformatically enhanced collection of CRISPR/Cas9 off-target cleavage studies aimed at enriching the fields of cleavage profiling, gene editing safety analysis and transcriptomics. The current version of crisprSQL contains cleavage data from 144 guide RNAs on 25,632 guide-target pairs from human and rodent cell lines, with interaction-specific references to epigenetic markers and gene names. The first curated database of this standard, it promises to enhance safety quantification research, inform experiment design and fuel development of computational off-target prediction algorithms.
Affiliation(s)
- Florian Störtz
  - Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
- Peter Minary
  - Department of Computer Science, University of Oxford, Parks Road, Oxford OX1 3QD, UK
|
34
|
Bhat MA, Bhat MA, Kumar V, Wani IA, Bashir H, Shah AA, Rahman S, Jan AT. The era of editing plant genomes using CRISPR/Cas: A critical appraisal. J Biotechnol 2020; 324:34-60. [DOI: 10.1016/j.jbiotec.2020.09.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/19/2020] [Revised: 09/08/2020] [Accepted: 09/14/2020] [Indexed: 12/11/2022]
|