TY - JOUR
T1 - Position-aware self-attention based neural sequence labeling
AU - Wei, Wei
AU - Wang, Zanbo
AU - Mao, Xianling
AU - Zhou, Guangyou
AU - Zhou, Pan
AU - Jiang, Sheng
N1 - Publisher Copyright:
© 2020
PY - 2021/2
Y1 - 2021/2
N2 - Sequence labeling is a fundamental task in natural language processing and has been widely studied. Recently, RNN-based sequence labeling models have increasingly gained attention. Despite the superior performance achieved by learning long short-term (i.e., successive) dependencies, processing inputs sequentially may limit the ability to capture non-continuous relations over tokens within a sentence. To tackle this problem, we focus on how to effectively model both the successive and the discrete dependencies of each token to enhance sequence labeling performance. Specifically, we propose an innovative attention-based model (called position-aware self-attention, i.e., PSA) together with a well-designed self-attentional context fusion layer within a neural network architecture, which exploits the positional information of an input sequence to capture the latent relations among tokens. Extensive experiments on three classical tasks in the sequence labeling domain, i.e., part-of-speech (POS) tagging, named entity recognition (NER) and phrase chunking, demonstrate that our proposed model outperforms the state of the art in terms of various metrics, without using any external knowledge.
AB - Sequence labeling is a fundamental task in natural language processing and has been widely studied. Recently, RNN-based sequence labeling models have increasingly gained attention. Despite the superior performance achieved by learning long short-term (i.e., successive) dependencies, processing inputs sequentially may limit the ability to capture non-continuous relations over tokens within a sentence. To tackle this problem, we focus on how to effectively model both the successive and the discrete dependencies of each token to enhance sequence labeling performance. Specifically, we propose an innovative attention-based model (called position-aware self-attention, i.e., PSA) together with a well-designed self-attentional context fusion layer within a neural network architecture, which exploits the positional information of an input sequence to capture the latent relations among tokens. Extensive experiments on three classical tasks in the sequence labeling domain, i.e., part-of-speech (POS) tagging, named entity recognition (NER) and phrase chunking, demonstrate that our proposed model outperforms the state of the art in terms of various metrics, without using any external knowledge.
KW - Discrete context dependency
KW - Sequence labeling
KW - Self-attention
UR - http://www.scopus.com/inward/record.url?scp=85090424400&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2020.107636
DO - 10.1016/j.patcog.2020.107636
M3 - Article
AN - SCOPUS:85090424400
SN - 0031-3203
VL - 110
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 107636
ER -