Position-aware self-attention based neural sequence labeling

Wei Wei; Zanbo Wang; Xianling Mao; Guangyou Zhou; Pan Zhou; Sheng Jiang

doi:10.1016/j.patcog.2020.107636

Corresponding author: Wei Wei

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)

Abstract

Sequence labeling is a fundamental task in natural language processing and has been widely studied. Recently, RNN-based sequence labeling models have gained increasing attention. Despite the superior performance achieved by learning the long short-term (i.e., successive) dependencies, processing inputs sequentially might limit the ability to capture the non-continuous relations among tokens within a sentence. To tackle this problem, we focus on how to effectively model both successive and discrete dependencies of each token to enhance sequence labeling performance. Specifically, we propose an attention-based model (called position-aware self-attention, i.e., PSA) as well as a self-attentional context fusion layer within a neural network architecture, to exploit the positional information of an input sequence for capturing the latent relations among tokens. Extensive experiments on three classical tasks in the sequence labeling domain, i.e., part-of-speech (POS) tagging, named entity recognition (NER) and phrase chunking, demonstrate that our proposed model outperforms the state of the art without any external knowledge, in terms of various metrics.

Original language	English
Article number	107636
Journal	Pattern Recognition
Volume	110
DOIs	https://doi.org/10.1016/j.patcog.2020.107636
Publication status	Published - Feb 2021

Keywords

Discrete context dependency
Sequence labeling
Self-attention

Cite this

@article{6b38750dc61b4f2ba594a5a998640344,

title = "Position-aware self-attention based neural sequence labeling",

abstract = "Sequence labeling is a fundamental task in natural language processing and has been widely studied. Recently, RNN-based sequence labeling models have increasingly gained attentions. Despite superior performance achieved by learning the long short-term (i.e., successive) dependencies, the way of sequentially processing inputs might limit the ability to capture the non-continuous relations over tokens within a sentence. To tackle the problem, we focus on how to effectively model successive and discrete dependencies of each token for enhancing the sequence labeling performance. Specifically, we propose an innovative attention-based model (called position-aware self-attention, i.e., PSA) as well as a well-designed self-attentional context fusion layer within a neural network architecture, to explore the positional information of an input sequence for capturing the latent relations among tokens. Extensive experiments on three classical tasks in sequence labeling domain, i.e., part-of-speech (POS) tagging, named entity recognition (NER) and phrase chunking, demonstrate our proposed model outperforms the state-of-the-arts without any external knowledge, in terms of various metrics.",

keywords = "Discrete context dependency, Sequence labeling, Self-attention",

author = "Wei Wei and Zanbo Wang and Xianling Mao and Guangyou Zhou and Pan Zhou and Sheng Jiang",

note = "Publisher Copyright: {\textcopyright} 2020",

year = "2021",

month = feb,

doi = "10.1016/j.patcog.2020.107636",

language = "English",

volume = "110",

journal = "Pattern Recognition",

issn = "0031-3203",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Position-aware self-attention based neural sequence labeling

AU - Wei, Wei

AU - Wang, Zanbo

AU - Mao, Xianling

AU - Zhou, Guangyou

AU - Zhou, Pan

AU - Jiang, Sheng

PY - 2021/2

Y1 - 2021/2

AB - Sequence labeling is a fundamental task in natural language processing and has been widely studied. Recently, RNN-based sequence labeling models have increasingly gained attentions. Despite superior performance achieved by learning the long short-term (i.e., successive) dependencies, the way of sequentially processing inputs might limit the ability to capture the non-continuous relations over tokens within a sentence. To tackle the problem, we focus on how to effectively model successive and discrete dependencies of each token for enhancing the sequence labeling performance. Specifically, we propose an innovative attention-based model (called position-aware self-attention, i.e., PSA) as well as a well-designed self-attentional context fusion layer within a neural network architecture, to explore the positional information of an input sequence for capturing the latent relations among tokens. Extensive experiments on three classical tasks in sequence labeling domain, i.e., part-of-speech (POS) tagging, named entity recognition (NER) and phrase chunking, demonstrate our proposed model outperforms the state-of-the-arts without any external knowledge, in terms of various metrics.

KW - Discrete context dependency

KW - Sequence labeling

KW - Self-attention

UR - http://www.scopus.com/inward/record.url?scp=85090424400&partnerID=8YFLogxK

U2 - 10.1016/j.patcog.2020.107636

DO - 10.1016/j.patcog.2020.107636

M3 - Article

AN - SCOPUS:85090424400

SN - 0031-3203

VL - 110

JO - Pattern Recognition

JF - Pattern Recognition

M1 - 107636

ER -
