Protein Remote Homology Detection and Fold Recognition Based on Sequence-Order Frequency Matrix

Bin Liu*, Junjie Chen, Mingyue Guo, Xiaolong Wang

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

23 Citations (Scopus)

Abstract

Protein remote homology detection and fold recognition are two critical tasks in the study of protein structures and functions. Currently, profile-based methods achieve state-of-the-art performance in these fields. However, the widely used sequence profiles, such as the position-specific frequency matrix (PSFM) and the position-specific scoring matrix (PSSM), ignore sequence-order effects along the protein sequence. In this study, we propose a novel profile, called the sequence-order frequency matrix (SOFM), to extract the sequence-order information of neighboring residues from a multiple sequence alignment (MSA). Combined with two profile feature extraction approaches, top-n-grams and the Smith-Waterman algorithm, SOFMs are applied to protein remote homology detection and fold recognition, yielding two predictors called SOFM-Top and SOFM-SW. Experimental results show that SOFM contains more information content than other profiles, and that these two predictors outperform other state-of-the-art methods. It is anticipated that SOFM will become a very useful profile in the study of protein structures and functions.
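
The contrast between a per-position profile and one that keeps neighboring-residue order can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact SOFM definition, so `neighbor_pair_frequencies` is a hypothetical stand-in that simply counts adjacent residue pairs per column pair, and `psfm`, `toy_msa`, and the gap handling are assumptions made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def psfm(msa):
    """Per-column residue frequencies from an MSA (gaps are ignored).

    Each column is treated independently, so no information about
    which residue follows which is retained.
    """
    length = len(msa[0])
    matrix = []
    for col in range(length):
        counts = Counter(seq[col] for seq in msa if seq[col] in AMINO_ACIDS)
        total = sum(counts.values())
        matrix.append(
            {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return matrix


def neighbor_pair_frequencies(msa):
    """Frequencies of adjacent residue pairs at columns (i, i+1).

    Hypothetical illustration of keeping sequence-order information;
    NOT the SOFM definition from the paper.
    """
    length = len(msa[0])
    result = []
    for col in range(length - 1):
        pairs = Counter(
            seq[col] + seq[col + 1]
            for seq in msa
            if seq[col] in AMINO_ACIDS and seq[col + 1] in AMINO_ACIDS
        )
        total = sum(pairs.values())
        result.append({p: c / total for p, c in pairs.items()} if total else {})
    return result


# Toy alignment of four equal-length rows; '-' marks a gap (assumed convention).
toy_msa = ["ACDE", "ACDF", "GCDE", "A-DE"]
profile = psfm(toy_msa)
pairs = neighbor_pair_frequencies(toy_msa)
```

In this toy alignment, `profile` records only how often each residue appears in a column, while `pairs` also records which residue follows which between adjacent columns. The latter is the kind of information the abstract says PSFM and PSSM leave out.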

Original language: English
Article number: 8078207
Pages (from-to): 292-300
Number of pages: 9
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 16
Issue number: 1
Publication status: Published - 1 Jan 2019
Externally published: Yes

Keywords

  • Protein remote homology detection
  • Smith-Waterman local alignment algorithm
  • protein fold recognition
  • sequence-order frequency matrix
  • top-n-gram
