A discriminative method for protein remote homology detection based on N-nary profiles

Bin Liu, Lei Lin, Xiaolong Wang, Qiwen Dong, Xuan Wang

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Citations (Scopus)

Abstract

Protein homology detection is a key problem in computational biology. This paper presents a novel building block for proteins, the N-nary profile, which captures the evolutionary information contained in protein sequence frequency profiles. The frequency profiles are calculated from the multiple sequence alignments output by PSI-BLAST and then converted into N-nary profiles. The N-nary profiles are filtered with a feature selection algorithm, the chi-square algorithm. Each protein sequence is transformed into a fixed-dimension feature vector holding the number of occurrences of each N-nary profile, and the resulting vectors are used as input to a support vector machine (SVM). The latent semantic analysis (LSA) model, an efficient feature extraction algorithm, is adopted to further improve the performance of the method. When tested on the SCOP 1.53 data set, the N-nary profile method outperforms all compared methods of protein remote homology detection. Its ROC50 score is 0.736, nearly 4 percent higher than that of the best current method.
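The feature-construction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not define how a profile column becomes an N-nary profile, so the rule used here (concatenate the N most frequent residues of each column) is an assumption, as are all function names and the toy data. The chi-square score is the standard 2x2 statistic for term presence versus class label, not necessarily the exact variant used in the paper. The LSA and SVM stages are omitted.

```python
from collections import Counter


def nnary_words(profile, n=2):
    """Map each profile column to a word built from its n most frequent residues.

    `profile` is a list of columns; each column is a dict residue -> frequency.
    ASSUMPTION: this top-n rule is illustrative, not taken from the paper.
    """
    words = []
    for column in profile:
        # Sort by descending frequency; break ties alphabetically for determinism.
        top = sorted(column.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        words.append("".join(residue for residue, _ in top))
    return words


def count_vector(words, vocabulary):
    """Fixed-dimension vector of occurrence counts, one entry per vocabulary word."""
    counts = Counter(words)
    return [counts[w] for w in vocabulary]


def chi_square(word, docs, labels):
    """Standard 2x2 chi-square statistic: presence of `word` vs. binary label.

    `docs` is a list of sets of words; `labels` is a list of 0/1 class labels.
    """
    a = sum(1 for d, y in zip(docs, labels) if word in d and y == 1)
    b = sum(1 for d, y in zip(docs, labels) if word in d and y == 0)
    c = sum(1 for d, y in zip(docs, labels) if word not in d and y == 1)
    e = sum(1 for d, y in zip(docs, labels) if word not in d and y == 0)
    total = a + b + c + e
    denom = (a + b) * (c + e) * (a + c) * (b + e)
    # A word present in all or none of the sequences carries no signal.
    return total * (a * e - b * c) ** 2 / denom if denom else 0.0


# Toy frequency profile with three columns (hypothetical values).
toy_profile = [
    {"A": 0.6, "L": 0.3, "G": 0.1},
    {"K": 0.5, "R": 0.5},
    {"A": 0.6, "L": 0.3, "G": 0.1},
]
toy_words = nnary_words(toy_profile, n=2)
toy_vector = count_vector(toy_words, ["AL", "KR", "GG"])
```

In a fuller pipeline, only the words with the highest chi-square scores would be kept as the vocabulary, and the resulting count vectors would then be reduced with LSA and passed to an SVM, as the abstract outlines.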

Original language: English
Title of host publication: Bioinformatics Research and Development - Second International Conference, BIRD 2008, Proceedings
Publisher: Springer Verlag
Pages: 74-86
Number of pages: 13
ISBN (Print): 9783540705987
Publication status: Published - 2008
Externally published: Yes
Event: 2nd International Conference on Bioinformatics Research and Development, BIRD 2008 - Vienna, Austria
Duration: 7 Jul 2008 to 9 Jul 2008

Publication series

Name: Communications in Computer and Information Science
Volume: 13
ISSN (Print): 1865-0929

Conference

Conference: 2nd International Conference on Bioinformatics Research and Development, BIRD 2008
Country/Territory: Austria
City: Vienna
Period: 7/07/08 to 9/07/08

Keywords

  • Chi-square algorithm
  • Latent semantic analysis
  • N-nary profiles
  • Remote homology
