DiagAF: A More Accurate and Efficient Pre-Alignment Filter for Sequence Alignment

Changyong Yu*, Yuhai Zhao, Chu Zhao, Haitao Ma, Guoren Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Sequence alignment is an essential step in computational genomics. More accurate and efficient pre-alignment filters, which run before the expensive computation required for final verification, are still urgently needed. In this article, we propose DiagAF, a more accurate and efficient pre-alignment algorithm for sequence alignment. First, DiagAF uses a new lower bound on edit distance based on shift Hamming masks. The new lower bound uses fewer shift Hamming masks than state-of-the-art algorithms such as SHD and MAGNET. Moreover, it takes into account information about how the edit-distance path moves between shift Hamming masks. Second, DiagAF can handle sequence pairs of unequal length, whereas state-of-the-art methods handle only pairs of equal length. Third, DiagAF can terminate early on true alignments. In our experiments, we compared DiagAF with state-of-the-art methods: DiagAF achieves a much smaller error rate while also taking less time. We believe that DiagAF can further improve the performance of state-of-the-art sequence alignment software. The source code of DiagAF can be downloaded from https://github.com/BioLab-cz/DiagAF.
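
To make the idea of a pre-alignment filter built on shift Hamming masks more concrete, here is a minimal sketch in Python. It is not the DiagAF lower bound, nor the SHD or MAGNET procedure. It is a deliberately simple stand-in that assumes global edit distance and uses 2e + 1 masks for a threshold e. The function names (shift_hamming_masks, passes_filter, edit_distance) are hypothetical and do not come from the DiagAF source code. The bound it uses is: if the edit distance is at most e, every matched read character lies on a diagonal within [-e, e], so any read position that mismatches on all of those diagonals must be an edit.

```python
def shift_hamming_masks(read, ref, e):
    """Return {shift: mask} for shifts -e..e (illustrative helper, not DiagAF's API).

    mask[i] is 0 if read[i] equals ref[i + shift], and 1 otherwise
    (positions that fall outside ref count as mismatches).
    """
    masks = {}
    for s in range(-e, e + 1):
        masks[s] = [
            0 if 0 <= i + s < len(ref) and read[i] == ref[i + s] else 1
            for i in range(len(read))
        ]
    return masks


def passes_filter(read, ref, e):
    """Simple filter: reject only when edit distance is provably > e.

    Counts read positions that mismatch in every mask. Each such position
    needs its own edit, so more than e of them rules the pair out.
    """
    masks = shift_hamming_masks(read, ref, e)
    uncovered = sum(
        all(mask[i] for mask in masks.values()) for i in range(len(read))
    )
    return uncovered <= e


def edit_distance(a, b):
    """Standard dynamic-programming edit distance (the expensive verification)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    read, ref, e = "ACGTACGT", "ACGTTCGT", 1
    if passes_filter(read, ref, e):
        # Only pairs that survive the cheap filter reach the costly step.
        print("verify:", edit_distance(read, ref) <= e)
    else:
        print("rejected by filter")
```

A filter this loose never rejects a pair whose edit distance is within the threshold, but it accepts many pairs that later fail verification. Reducing that error rate while using fewer masks is the kind of improvement the abstract describes.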

Original language: English
Pages (from-to): 3404-3415
Number of pages: 12
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 19
Issue number: 6
Publication status: Published - 1 Nov 2022

Keywords

  • Sequence alignment
  • edit distance
  • filter
  • read mapping
  • shift Hamming mask
