Exploratory predicting protein folding model with random forest and hybrid features

Xuewei Zhao, Quan Zou, Bin Liu, Xiangrong Liu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

58 Citations (Scopus)

Abstract

Recent developments in bioinformatics have highlighted the importance of protein structure prediction. Information about structural classes forms the foundation of this task and plays an important role in predicting protein folds and tertiary structure. Most previous studies have focused on only four protein classes in the Structural Classification of Proteins (SCOP) database. In this paper, we focus on finding the best-performing prediction method using SCOP-extended (SCOPe, Release 2.03; previously known as version 1.75C in SCOP), which contains seven major protein classes: all-α proteins, all-β proteins, α/β proteins, α+β proteins, multi-domain proteins, membrane and cell surface proteins and peptides, and small proteins. The framework that we developed consists of two stages: in the first stage, we used a hybrid frequency method for feature extraction from a SCOPe dataset; in the second stage, we determined an effective parameter value (the number of trees) for the Random Forest classifier. Our computational results on the SCOPe dataset demonstrate the efficiency and effectiveness of our model, which generated predictions with an accuracy of 88%, much higher than the accuracies reported in previous studies. These encouraging results may be helpful for future research on protein structure and protein fold prediction. Our code is available at http://datamining.xmu.edu.cn/~zhaoxuewei/PSP.
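
The first stage described in the abstract, frequency-based feature extraction, can be illustrated with a minimal sketch. The abstract does not specify which frequencies make up the hybrid feature set, so the choice below (concatenated 1-gram and 2-gram residue frequencies) is an assumption, and the names `ngram_frequencies` and `hybrid_features` are hypothetical rather than taken from the authors' code.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def ngram_frequencies(sequence, n):
    """Relative frequency of every possible n-gram, in a fixed order.

    The returned vector has 20**n entries. N-grams containing a
    non-standard residue (e.g. 'X') count toward the total but have no
    entry of their own, so the vector may sum to less than 1.
    """
    sequence = sequence.upper()
    grams = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    counts = Counter(grams)
    total = len(grams)
    vocabulary = ("".join(p) for p in product(AMINO_ACIDS, repeat=n))
    return [counts[g] / total if total else 0.0 for g in vocabulary]


def hybrid_features(sequence, orders=(1, 2)):
    """Concatenate n-gram frequency vectors for each order (assumed hybrid)."""
    features = []
    for n in orders:
        features.extend(ngram_frequencies(sequence, n))
    return features


if __name__ == "__main__":
    vector = hybrid_features("MKTAYIAKQR")  # toy sequence, not from SCOPe
    print(len(vector))  # 20 + 400 features
```

Vectors of this kind would then be passed to the second stage, a Random Forest classifier whose number of trees is tuned; that stage is not shown here.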

Original language: English
Pages (from-to): 289-299
Number of pages: 11
Journal: Current Proteomics
Volume: 11
Issue number: 4
DOI: https://doi.org/10.2174/157016461104150121115154
Publication status: Published - 1 Dec 2014
Externally published: Yes

Keywords

  • Classifier
  • Protein class
  • Protein structure prediction
  • Random forests
  • SCOP dataset
  • n-gram feature

Cite this

Zhao, X., Zou, Q., Liu, B., & Liu, X. (2014). Exploratory predicting protein folding model with random forest and hybrid features. Current Proteomics, 11(4), 289-299. https://doi.org/10.2174/157016461104150121115154