TY - JOUR
T1 - sgRNA-PSM
T2 - Predict sgRNAs On-Target Activity Based on Position-Specific Mismatch
AU - Liu, Bin
AU - Luo, Zhihua
AU - He, Juan
N1 - Publisher Copyright: © 2020 The Author(s)
PY - 2020/6/5
Y1 - 2020/6/5
AB - Identification of single-guide RNA (sgRNA) on-target activity, a key technique for the CRISPR-Cas9 system, is critical for both theoretical research (investigation of RNA functions) and real-world applications (genome editing and synthetic biology). Because of its importance, several computational predictors of sgRNA on-target activity have been proposed. These methods have all contributed to the development of this field, but they suffer from certain limitations. We propose two new methods, “sgRNA-PSM” and “sgRNA-ExPSM”, for sgRNA on-target activity prediction. The methods capture long-range sequence information and evolutionary information, and use a new way to reduce the dimension of the feature vector to avoid the risk of overfitting. Rigorous leave-one-gene-out cross-validation on a benchmark dataset with 11 human genes and 6 mouse genes, as well as tests on an independent dataset, indicated that the two new methods outperformed other competing methods. To make the proposed sgRNA-PSM predictor easier to use, we have established a corresponding web server, available at http://bliulab.net/sgRNA-PSM/.
KW - XGBoost
KW - position-specific mismatch
KW - sgRNAs on-target activity
UR - http://www.scopus.com/inward/record.url?scp=85081666985&partnerID=8YFLogxK
U2 - 10.1016/j.omtn.2020.01.029
DO - 10.1016/j.omtn.2020.01.029
M3 - Article
AN - SCOPUS:85081666985
SN - 2162-2531
VL - 20
SP - 323
EP - 330
JO - Molecular Therapy Nucleic Acids
JF - Molecular Therapy Nucleic Acids
ER -