sgRNA-2wPSM: Identify sgRNAs on-target activity by combining two-window-based position specific mismatch and synthetic minority oversampling technique

  • Lichao Zhang
  • Tao Bai*
  • Hao Wu*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Motivation: Predicting sgRNA on-target activity is a critical step in using the CRISPR-Cas9 system. Because of its importance to RNA function research and genome editing applications, several computational methods have been introduced that treat the problem as either a binary classification task or a regression task. Among these, sgRNA-PSM is a state-of-the-art method. In this work, we improve on it by proposing a new feature extraction method called two-window-based PSM, which divides each DNA sequence into two non-overlapping segments so that different patterns can be extracted from each segment. The two-window-based PSM features are fed into Support Vector Machines (SVMs) to build a new predictor called sgRNA-2wPSM. Furthermore, a new oversampling method called SCORE-SVM-SMOTE, based on the SVM-SMOTE algorithm, is proposed to address the imbalanced training set problem. Results on the benchmark datasets indicate that sgRNA-2wPSM outperforms the other methods.
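The two-window idea can be illustrated with a minimal sketch. The abstract does not give the exact PSM definition, the window boundary, or the parameter values, so everything below is an assumption made for illustration: the feature is taken to be a position-specific k-mer indicator that tolerates a limited number of mismatches, and the names `window_psm` and `two_window_features`, the `split` position, and the default `k`/mismatch values are hypothetical. It is not the authors' implementation.

```python
from itertools import product

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def window_psm(window, k, max_mismatch):
    """Assumed stand-in for a position-specific mismatch feature.

    For every start position in `window` and every possible k-mer, emit 1
    if the observed k-mer at that position is within `max_mismatch`
    substitutions of the candidate k-mer, else 0.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    feats = []
    for i in range(len(window) - k + 1):
        observed = window[i:i + k]
        feats.extend(
            1 if hamming(observed, kmer) <= max_mismatch else 0
            for kmer in kmers
        )
    return feats


def two_window_features(seq, split, k1=2, m1=1, k2=1, m2=0):
    """Split `seq` into two non-overlapping windows at `split` (hypothetical
    boundary) and extract features from each with its own (k, mismatch)
    setting, then concatenate the two feature vectors."""
    left, right = seq[:split], seq[split:]
    return window_psm(left, k1, m1) + window_psm(right, k2, m2)


if __name__ == "__main__":
    feats = two_window_features("ACGTAC", split=4)
    print(len(feats), sum(feats))
```

Using separate parameters per window is what lets each segment contribute a different kind of pattern. In a full pipeline, vectors like these would be the input to an SVM classifier, with oversampling applied to the training set only.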

Original language: English
Article number: 106489
Journal: Computers in Biology and Medicine
Volume: 155
Publication status: Published - Mar 2023

Keywords

  • SCORE-SVM-SMOTE
  • Support vector machine
  • Two-window-based PSM
  • sgRNAs on-target activity
