Protein remote homology detection by combining Chou's pseudo amino acid composition and profile-based protein representation

Bin Liu*, Xiaolong Wang, Quan Zou, Qiwen Dong, Qingcai Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

102 Citations (Scopus)

Abstract

Protein remote homology detection is a key problem in bioinformatics. Currently, discriminative methods such as the Support Vector Machine (SVM) achieve the best performance. The most efficient way to improve SVM-based methods is to find a general protein representation that converts proteins of different lengths into fixed-length vectors and captures the properties of the proteins that are useful for discrimination. The bottleneck in designing such a representation is that native proteins have different lengths. Motivated by the success of the pseudo amino acid composition (PseAAC) proposed by Chou, we applied this approach to protein remote homology detection. Some new indices derived from the amino acid index (AAIndex) database are incorporated into the PseAAC to improve the generalization ability of this method. Finally, the performance is further improved by combining the modified PseAAC with a profile-based protein representation containing the evolutionary information extracted from the frequency profiles. Our experiments on a well-known benchmark show that this method achieves performance superior or comparable to current state-of-the-art methods.
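
As an illustration of how a PseAAC-style encoding maps sequences of any length to a fixed-length vector, the following is a minimal sketch of the classic (type 1) pseudo amino acid composition. It is not the modified PseAAC of this article: the function names (`normalize`, `pse_aac`), the defaults `lam=3` and `weight=0.05`, and the use of the Kyte-Doolittle hydropathy scale as the only physicochemical property are assumptions made for the example, whereas the article draws its indices from the AAIndex database.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used here only as an illustrative
# stand-in for the AAIndex-derived indices mentioned in the abstract.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def normalize(prop):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [prop[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (prop[a] - mean) / std for a in AMINO_ACIDS}


def pse_aac(seq, lam=3, weight=0.05, props=(KYTE_DOOLITTLE,)):
    """Return a (20 + lam)-dimensional vector for a sequence of any length > lam."""
    seq = seq.upper()
    if any(c not in AMINO_ACIDS for c in seq):
        raise ValueError("sequence contains non-standard residues")
    if len(seq) <= lam:
        raise ValueError("sequence must be longer than lam")

    normed = [normalize(p) for p in props]
    n = len(seq)

    # Conventional amino acid composition: 20 occurrence frequencies.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # Sequence-order correlation factors for gaps 1..lam: mean squared
    # property difference between residues that are j positions apart.
    thetas = []
    for j in range(1, lam + 1):
        total = 0.0
        for i in range(n - j):
            total += sum(
                (p[seq[i]] - p[seq[i + j]]) ** 2 for p in normed
            ) / len(normed)
        thetas.append(total / (n - j))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Calling `pse_aac` on two proteins of different lengths with the same `lam` yields vectors of identical length (20 + `lam`), which is the property that lets them be passed to an SVM. The first 20 components carry composition, and the remaining `lam` components carry sequence-order information.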

Original language: English
Pages (from-to): 775-782
Number of pages: 8
Journal: Molecular Informatics
Volume: 32
Issue number: 9-10
Publication status: Published - Oct 2013
Externally published: Yes

Keywords

  • Frequency profile
  • Protein remote homology
  • Pseudo amino acid composition
  • Support Vector Machine
