DeepIDP-2L: Protein intrinsically disordered region prediction by combining convolutional attention network and hierarchical attention network

Yi Jun Tang; Yi He Pang; Bin Liu

doi:10.1093/bioinformatics/btab810

Yi Jun Tang, Yi He Pang, Bin Liu^*

^*Corresponding author for this work

School of Computer Science

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

23 citations (Scopus)

Abstract

Motivation: Intrinsically disordered regions (IDRs) are widely distributed in proteins. Accurate prediction of IDRs is critical for protein structure and function analysis. IDRs are divided into long disordered regions (LDRs) and short disordered regions (SDRs) according to their lengths. Previous studies have shown that LDRs and SDRs have different properties. However, existing computational methods fail to extract different features for LDRs and SDRs separately. As a result, they achieve unstable performance on datasets with different ratios of LDRs and SDRs. Results: In this study, a two-layer predictor called DeepIDP-2L is proposed. In the first layer, two kinds of attention-based models are used to extract different features for LDRs and SDRs, respectively. The hierarchical attention network is used to capture the distribution pattern features of LDRs, and the convolutional attention network is used to capture the local correlation features of SDRs. The second layer of DeepIDP-2L maps the features extracted in the first layer into a new feature space. A convolutional network and bidirectional long short-term memory are used to capture the local and long-range information for predicting both SDRs and LDRs. Experimental results show that DeepIDP-2L can achieve more stable performance than other existing predictors on independent test sets with different ratios of SDRs and LDRs.
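The length-based split into LDRs and SDRs described in the abstract can be illustrated with a minimal Python sketch. It is not taken from DeepIDP-2L: the function name `split_disordered_regions`, the `'1'`/`'0'` label encoding, and the default cutoff of 30 residues are all assumptions made for illustration, since the abstract does not state the cutoff the authors use.

```python
from itertools import groupby
from typing import List, Tuple

Region = Tuple[int, int]  # half-open interval [start, end) in residue indices


def split_disordered_regions(
    labels: str, long_threshold: int = 30
) -> Tuple[List[Region], List[Region]]:
    """Split disordered runs into long (LDR) and short (SDR) regions.

    labels: per-residue annotation, '1' = disordered, '0' = ordered
            (hypothetical encoding chosen for this sketch).
    long_threshold: assumed minimum length of a long region; the
            abstract does not give the cutoff used by DeepIDP-2L.
    """
    long_regions: List[Region] = []
    short_regions: List[Region] = []
    pos = 0
    for flag, run in groupby(labels):
        length = len(list(run))
        if flag == "1":
            region = (pos, pos + length)
            if length >= long_threshold:
                long_regions.append(region)
            else:
                short_regions.append(region)
        pos += length
    return long_regions, short_regions


# Example: one 4-residue run and one 35-residue run of disorder.
example = "0" * 5 + "1" * 4 + "0" * 3 + "1" * 35 + "0" * 2
ldrs, sdrs = split_disordered_regions(example)
print("LDRs:", ldrs)
print("SDRs:", sdrs)
```

A test set's ratio of LDRs to SDRs, which the abstract identifies as the source of unstable performance in earlier predictors, could then be computed by counting the regions in each list across all proteins.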

Original language	English
Pages (from-to)	1252-1260
Number of pages	9
Journal	Bioinformatics
Volume	38
Issue number	5
DOI	https://doi.org/10.1093/bioinformatics/btab810
Publication status	Published - 1 Mar 2022

Cite this

@article{bc7320ba67ea41a194ef3e512ed7e202,

title = "DeepIDP-2L: Protein intrinsically disordered region prediction by combining convolutional attention network and hierarchical attention network",

abstract = "Motivation: Intrinsically disordered regions (IDRs) are widely distributed in proteins. Accurate prediction of IDRs is critical for protein structure and function analysis. IDRs are divided into long disordered regions (LDRs) and short disordered regions (SDRs) according to their lengths. Previous studies have shown that LDRs and SDRs have different properties. However, existing computational methods fail to extract different features for LDRs and SDRs separately. As a result, they achieve unstable performance on datasets with different ratios of LDRs and SDRs. Results: In this study, a two-layer predictor called DeepIDP-2L is proposed. In the first layer, two kinds of attention-based models are used to extract different features for LDRs and SDRs, respectively. The hierarchical attention network is used to capture the distribution pattern features of LDRs, and the convolutional attention network is used to capture the local correlation features of SDRs. The second layer of DeepIDP-2L maps the features extracted in the first layer into a new feature space. A convolutional network and bidirectional long short-term memory are used to capture the local and long-range information for predicting both SDRs and LDRs. Experimental results show that DeepIDP-2L can achieve more stable performance than other existing predictors on independent test sets with different ratios of SDRs and LDRs.",

author = "Tang, {Yi Jun} and Pang, {Yi He} and Liu, Bin",

year = "2022",

month = mar,

day = "1",

doi = "10.1093/bioinformatics/btab810",

language = "English",

volume = "38",

pages = "1252--1260",

journal = "Bioinformatics",

issn = "1367-4803",

publisher = "Oxford University Press",

number = "5",

}

TY - JOUR

T1 - DeepIDP-2L

T2 - Protein intrinsically disordered region prediction by combining convolutional attention network and hierarchical attention network

AU - Tang, Yi Jun

AU - Pang, Yi He

AU - Liu, Bin

PY - 2022/3/1

Y1 - 2022/3/1

AB - Motivation: Intrinsically disordered regions (IDRs) are widely distributed in proteins. Accurate prediction of IDRs is critical for protein structure and function analysis. IDRs are divided into long disordered regions (LDRs) and short disordered regions (SDRs) according to their lengths. Previous studies have shown that LDRs and SDRs have different properties. However, existing computational methods fail to extract different features for LDRs and SDRs separately. As a result, they achieve unstable performance on datasets with different ratios of LDRs and SDRs. Results: In this study, a two-layer predictor called DeepIDP-2L is proposed. In the first layer, two kinds of attention-based models are used to extract different features for LDRs and SDRs, respectively. The hierarchical attention network is used to capture the distribution pattern features of LDRs, and the convolutional attention network is used to capture the local correlation features of SDRs. The second layer of DeepIDP-2L maps the features extracted in the first layer into a new feature space. A convolutional network and bidirectional long short-term memory are used to capture the local and long-range information for predicting both SDRs and LDRs. Experimental results show that DeepIDP-2L can achieve more stable performance than other existing predictors on independent test sets with different ratios of SDRs and LDRs.

UR - http://www.scopus.com/inward/record.url?scp=85124197747&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/btab810

DO - 10.1093/bioinformatics/btab810

M3 - Article

C2 - 34864847

AN - SCOPUS:85124197747

SN - 1367-4803

VL - 38

SP - 1252

EP - 1260

JO - Bioinformatics

JF - Bioinformatics

IS - 5

ER -
