
Deep sampling of gRNA in the human genome and deep-learning-informed prediction of gRNA activities

Authors :
Heng Zhang
Jianfeng Yan
Zhike Lu
Yangfan Zhou
Qingfeng Zhang
Tingting Cui
Yini Li
Hui Chen
Lijia Ma
Source :
Cell Discovery, Vol 9, Iss 1, Pp 1-20 (2023)
Publication Year :
2023
Publisher :
Nature Publishing Group, 2023.

Abstract

Life science studies involving clustered regularly interspaced short palindromic repeat (CRISPR) editing generally apply the best-performing guide RNA (gRNA) for a gene of interest. Computational models are combined with massive experimental quantification on synthetic gRNA-target libraries to accurately predict gRNA activity and mutational patterns. However, the measurements are inconsistent between studies due to differences in the designs of the gRNA-target pair constructs, and there has not yet been an integrated investigation that concurrently focuses on multiple facets of gRNA capacity. In this study, we analyzed the DNA double-strand break (DSB)-induced repair outcomes and measured SpCas9/gRNA activities at both matched and mismatched locations using 926,476 gRNAs covering 19,111 protein-coding genes and 20,268 non-coding genes. We developed machine learning models to forecast the on-target cleavage efficiency (AIdit_ON), off-target cleavage specificity (AIdit_OFF), and mutational profiles (AIdit_DSB) of SpCas9/gRNA from a uniformly collected and processed dataset by deep sampling and massively quantifying gRNA capabilities in K562 cells. Each of these models exhibited superlative performance in predicting SpCas9/gRNA activities on independent datasets when benchmarked against previous models. A previously unknown parameter was also empirically determined: the “sweet spot” in the size of datasets needed to establish an effective model for predicting gRNA capabilities at a manageable experimental scale. In addition, we observed cell type-specific mutational profiles and were able to identify nucleotidylexotransferase as the key factor driving these outcomes. These massive datasets and deep learning algorithms have been implemented in the user-friendly web service http://crispr-aidit.com to evaluate and rank gRNAs for life science studies.

Subjects

Subjects :
Cytology
QH573-671

Details

Language :
English
ISSN :
20565968
Volume :
9
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Cell Discovery
Publication Type :
Academic Journal
Accession number :
edsdoj.881e6e86c01463aa5b6f0f3696e4ad5
Document Type :
article
Full Text :
https://doi.org/10.1038/s41421-023-00549-9