Abstract
Protein fold recognition is one of the most critical tasks for exploring the structures and functions of proteins from their primary sequence information. Existing protein fold recognition approaches rely on features that reflect the characteristics of protein folds. However, feature extraction remains the bottleneck limiting the performance of these methods. In this paper, we propose two new feature extraction methods, MotifCNN and MotifDCNN, which build motif-based convolutional neural networks (CNNs) from structural motif kernels to extract more discriminative fold-specific features. The pairwise sequence similarity scores calculated from these fold-specific features are then fed into support vector machines to construct a fold recognition predictor called MotifCNN-fold. Experimental results on the benchmark dataset show that MotifCNN-fold clearly outperforms all the other competing methods. In particular, the fold-specific features extracted by MotifCNN and MotifDCNN are more discriminative than those extracted by other deep learning techniques, indicating that incorporating structural motifs into the CNN helps capture the characteristics of protein folds.
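The step in which pairwise similarity scores are computed from fold-specific features can be illustrated with a minimal sketch. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the feature vectors, `cosine_similarity`, and `pairwise_score_matrix` are hypothetical names introduced only for illustration. Each row of the resulting matrix could then serve as the input vector for a support vector machine.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: cosine similarity stands in for the similarity score;
    the abstract does not specify the measure actually used.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # define similarity with an all-zero vector as 0
    return dot / (norm_u * norm_v)


def pairwise_score_matrix(features):
    """Row i holds the similarity of protein i to every protein in the set."""
    return [[cosine_similarity(u, v) for v in features] for u in features]


# Hypothetical fold-specific feature vectors for three proteins
# (in practice these would come from a trained feature extractor).
features = [
    [1.0, 0.0, 2.0],
    [2.0, 0.0, 4.0],
    [0.0, 3.0, 0.0],
]

scores = pairwise_score_matrix(features)
for row in scores:
    print([round(s, 3) for s in row])
```

In this toy input the first two vectors point in the same direction and the third is orthogonal to both, so the first two proteins score as highly similar to each other and dissimilar to the third.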
Original language | English |
---|---|
Pages (from-to) | 2133-2141 |
Number of pages | 9 |
Journal | Briefings in Bioinformatics |
Volume | 21 |
Issue number | 6 |
Publication status | Published - 1 Nov 2020 |
Keywords
- Motif-based convolutional neural networks
- Protein fold recognition
- Structural motif kernel
- Support vector machine