
Combining evolutionary information extracted from frequency profiles with sequence-based kernels for protein remote homology detection

  • Bin Liu*
  • Deyuan Zhang
  • Ruifeng Xu
  • Jinghao Xu
  • Xiaolong Wang
  • Qingcai Chen
  • Qiwen Dong
  • Kuo Chen Chou
  • *Corresponding author for this work
  • School of Computer Science and Technology
  • Harbin Institute of Technology Shenzhen
  • Shanghai Key Laboratory of Intelligent Information Processing
  • Gordon Life Science Institute
  • Shenyang Aerospace University
  • Fudan University
  • King Abdulaziz University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Motivation: Owing to its importance in both basic research (such as molecular evolution and protein attribute prediction) and practical application (such as timely modeling the 3D structures of proteins targeted for drug development), protein remote homology detection has attracted a great deal of interest. It is intriguing to note that the profile-based approach is promising and holds high potential in this regard. To further improve protein remote homology detection, a key step is how to find an optimal means to extract the evolutionary information into the profiles.

Results: Here, we propose a novel approach, the so-called profile-based protein representation, to extract the evolutionary information via the frequency profiles. The latter can be calculated from the multiple sequence alignments generated by PSI-BLAST. Three top performing sequence-based kernels (SVM-Ngram, SVM-pairwise and SVM-LA) were combined with the profile-based protein representation. Various tests were conducted on a SCOP benchmark dataset that contains 54 families and 23 superfamilies. The results showed that the new approach is promising, and can obviously improve the performance of the three kernels. Furthermore, our approach can also provide useful insights for studying the features of proteins in various families. It has not escaped our notice that the current approach can be easily combined with the existing sequence-based methods so as to improve their performance as well.

Availability and implementation: For users' convenience, the source code of generating the profile-based proteins and the multiple kernel learning was also provided at http://bioinformatics.hitsz.edu.cn/main/~binliu/remote/

Contact: or bliu@gordonlifescience.org

Supplementary information: Supplementary data are available at Bioinformatics online.
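
The frequency-profile step mentioned under Results can be illustrated with a minimal sketch. It is not the authors' released code: the function names `frequency_profile` and `profile_based_sequence` are hypothetical, the toy alignment is invented for illustration, and the rule of taking the most frequent residue in each column is only one plausible reading of "profile-based protein representation" that the abstract itself does not spell out. Real PSI-BLAST output would also involve sequence weighting and pseudocounts, which are omitted here.

```python
from collections import Counter

# The 20 standard amino acids, in alphabetical order of their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def frequency_profile(msa):
    """Return, for each alignment column, the relative frequency of each amino acid.

    `msa` is a list of equal-length aligned sequences; gaps ('-') and any
    non-standard symbols are ignored when counting.
    """
    if not msa:
        raise ValueError("alignment is empty")
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in msa if row[col] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append(
            {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}
        )
    return profile


def profile_based_sequence(profile):
    """Assumed conversion rule: keep the most frequent residue of each column.

    Columns with no counted residues become 'X'; ties go to the residue that
    comes first in AMINO_ACIDS.
    """
    residues = []
    for column in profile:
        best = max(AMINO_ACIDS, key=lambda aa: column[aa])
        residues.append(best if column[best] > 0 else "X")
    return "".join(residues)


# Toy alignment (invented for illustration, not taken from the SCOP benchmark).
msa = ["MKVLA", "MKILA", "MRVLG", "MKV-A"]
profile = frequency_profile(msa)
converted = profile_based_sequence(profile)
print(converted)  # MKVLA
```

A string produced this way could then be passed to any sequence-based kernel in place of the original sequence, which is the kind of combination the abstract describes for SVM-Ngram, SVM-pairwise and SVM-LA.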

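Original language: English
Pages (from-to): 472-479
Number of pages: 8
Journal: Bioinformatics
Volume: 30
Issue number: 4
DOI
Publication status: Published - 15 Feb 2014
Externally published: Yes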
