SelfAT-Fold: Protein Fold Recognition Based on Residue-Based and Motif-Based Self-Attention Networks

Yihe Pang, Bin Liu*

*此作品的通讯作者

科研成果: 期刊稿件文章同行评审

10 引用 (Scopus)

摘要

The protein fold recognition is a fundamental and crucial step of tertiary structure determination. In this regard, several computational predictors have been proposed. Recently, the predictive performance has been obviously improved by the fold-specific features generated by deep learning techniques. However, these methods failed to measure the global associations among residues or motifs along the protein sequences. Furthermore, these deep learning techniques are often treated as black boxes without interpretability. Inspired by the similarities between protein sequences and natural language sentences, we applied the self-attention mechanism derived from natural language processing (NLP) field to protein fold recognition. The motif-based self-attention network (MSAN) and the residue-based self-attention network (RSAN) were constructed based on a training set to capture the global associations among the structure motifs and residues along the protein sequences, respectively. The fold-specific attention features trained and generated from the training set were then combined with Support Vector Machines (SVMs) to predict the samples in the widely used LE benchmark dataset, which is fully independent from the training set. Experimental results showed that the proposed two SelfAT-Fold predictors outperformed 34 existing state-of-the-art computational predictors. The two SelfAT-Fold predictors were further tested on an independent dataset SCOP_TEST, and they can achieve stable performance. Furthermore, the fold-specific attention features can be used to analyse the characteristics of protein folds. The trained models and data of SelfAT-Fold can be downloaded from http://bliulab.net/selfAT_fold/.

源语言英语
页(从-至)1861-1869
页数9
期刊IEEE/ACM Transactions on Computational Biology and Bioinformatics
19
3
DOI
出版状态已出版 - 2022

指纹

探究 'SelfAT-Fold: Protein Fold Recognition Based on Residue-Based and Motif-Based Self-Attention Networks' 的科研主题。它们共同构成独一无二的指纹。

引用此