A pattern-aware self-attention network for distant supervised relation extraction

Yu Ming Shang; Heyan Huang; Xin Sun; Wei Wei; Xian Ling Mao

doi:10.1016/j.ins.2021.10.047

Yu Ming Shang, Heyan Huang, Xin Sun^*, Wei Wei, Xian Ling Mao

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

28 Citations (Scopus)

Abstract

Distant supervised relation extraction is an efficient strategy for finding relational facts in unstructured text without labeled training data. A recent paradigm for developing relation extractors uses pre-trained Transformer language models to produce high-quality sentence representations. However, because the original Transformer is weak at capturing local dependencies and phrasal structures, existing Transformer-based methods cannot identify various relational patterns in sentences. To address this issue, we propose a novel distant supervised relation extraction model, which employs a specially designed pattern-aware self-attention network to automatically discover relational patterns for pre-trained Transformers in an end-to-end manner. Specifically, the proposed method assumes that the correlation between two adjacent tokens reflects the probability that they belong to the same pattern. Based on this assumption, a novel self-attention network is designed to generate the probability distribution of all patterns in a sentence. Then, the probability distribution is applied as a constraint in the first Transformer layer to encourage its attention heads to follow the relational pattern structures. As a result, fine-grained pattern information is enhanced in the pre-trained Transformer without losing global dependencies. Extensive experimental results on two popular benchmark datasets demonstrate that our model performs better than the state-of-the-art baselines.
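
The idea in the abstract (adjacent-token probabilities that constrain attention) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the product-of-links prior, the multiplicative reweighting, and every name below (`same_pattern_prior`, `constrained_attention`, `link_probs`) are assumptions chosen only to make the idea concrete.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def same_pattern_prior(link_probs):
    """Build an n x n prior from n - 1 adjacent-token probabilities.

    link_probs[k] is the (assumed) probability that tokens k and k + 1
    belong to the same pattern. Entry (i, j) is the product of the link
    probabilities between i and j; this product rule is an assumption.
    """
    n = len(link_probs) + 1
    prior = [[1.0] * n for _ in range(n)]
    for i in range(n):
        p = 1.0
        for j in range(i + 1, n):
            p *= link_probs[j - 1]
            prior[i][j] = prior[j][i] = p
    return prior


def constrained_attention(scores, prior):
    """Row-wise softmax of raw scores, reweighted by the prior and renormalised."""
    out = []
    for score_row, prior_row in zip(scores, prior):
        weights = [a * p for a, p in zip(softmax(score_row), prior_row)]
        total = sum(weights)
        out.append([w / total for w in weights])
    return out


# Hypothetical 4-token sentence: tokens 0-1 and 2-3 are strongly linked,
# while the link between tokens 1 and 2 is weak.
link_probs = [0.9, 0.1, 0.8]
prior = same_pattern_prior(link_probs)
scores = [[0.0] * 4 for _ in range(4)]  # uniform raw attention scores
attn = constrained_attention(scores, prior)
```

With uniform raw scores, each row of `attn` concentrates on tokens that share a likely pattern with the query token, while every token keeps a non-zero weight, which loosely mirrors the claim that local pattern information is added without discarding global dependencies.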

Original language	English
Pages (from-to)	269-279
Number of pages	11
Journal	Information Sciences
Volume	584
DOIs	https://doi.org/10.1016/j.ins.2021.10.047
Publication status	Published - Jan 2022

Keywords

distant supervision
pre-trained Transformer
relation extraction
relational pattern
self-attention network

Cite this

@article{70471eee0abe49778a43075e3111fe74,

title = "A pattern-aware self-attention network for distant supervised relation extraction",

abstract = "Distant supervised relation extraction is an efficient strategy for finding relational facts in unstructured text without labeled training data. A recent paradigm for developing relation extractors uses pre-trained Transformer language models to produce high-quality sentence representations. However, because the original Transformer is weak at capturing local dependencies and phrasal structures, existing Transformer-based methods cannot identify various relational patterns in sentences. To address this issue, we propose a novel distant supervised relation extraction model, which employs a specially designed pattern-aware self-attention network to automatically discover relational patterns for pre-trained Transformers in an end-to-end manner. Specifically, the proposed method assumes that the correlation between two adjacent tokens reflects the probability that they belong to the same pattern. Based on this assumption, a novel self-attention network is designed to generate the probability distribution of all patterns in a sentence. Then, the probability distribution is applied as a constraint in the first Transformer layer to encourage its attention heads to follow the relational pattern structures. As a result, fine-grained pattern information is enhanced in the pre-trained Transformer without losing global dependencies. Extensive experimental results on two popular benchmark datasets demonstrate that our model performs better than the state-of-the-art baselines.",

keywords = "distant supervision, pre-trained Transformer, relation extraction, relational pattern, self-attention network",

author = "Shang, {Yu Ming} and Heyan Huang and Xin Sun and Wei Wei and Mao, {Xian Ling}",

note = "Publisher Copyright: {\textcopyright} 2021 Elsevier Inc.",

year = "2022",

month = jan,

doi = "10.1016/j.ins.2021.10.047",

language = "English",

volume = "584",

pages = "269--279",

journal = "Information Sciences",

issn = "0020-0255",

publisher = "Elsevier Inc.",

}

TY - JOUR

T1 - A pattern-aware self-attention network for distant supervised relation extraction

AU - Shang, Yu Ming

AU - Huang, Heyan

AU - Sun, Xin

AU - Wei, Wei

AU - Mao, Xian Ling

PY - 2022/1

Y1 - 2022/1

AB - Distant supervised relation extraction is an efficient strategy for finding relational facts in unstructured text without labeled training data. A recent paradigm for developing relation extractors uses pre-trained Transformer language models to produce high-quality sentence representations. However, because the original Transformer is weak at capturing local dependencies and phrasal structures, existing Transformer-based methods cannot identify various relational patterns in sentences. To address this issue, we propose a novel distant supervised relation extraction model, which employs a specially designed pattern-aware self-attention network to automatically discover relational patterns for pre-trained Transformers in an end-to-end manner. Specifically, the proposed method assumes that the correlation between two adjacent tokens reflects the probability that they belong to the same pattern. Based on this assumption, a novel self-attention network is designed to generate the probability distribution of all patterns in a sentence. Then, the probability distribution is applied as a constraint in the first Transformer layer to encourage its attention heads to follow the relational pattern structures. As a result, fine-grained pattern information is enhanced in the pre-trained Transformer without losing global dependencies. Extensive experimental results on two popular benchmark datasets demonstrate that our model performs better than the state-of-the-art baselines.

KW - distant supervision

KW - pre-trained Transformer

KW - relation extraction

KW - relational pattern

KW - self-attention network

UR - http://www.scopus.com/inward/record.url?scp=85118828887&partnerID=8YFLogxK

U2 - 10.1016/j.ins.2021.10.047

DO - 10.1016/j.ins.2021.10.047

M3 - Article

AN - SCOPUS:85118828887

SN - 0020-0255

VL - 584

SP - 269

EP - 279

JO - Information Sciences

JF - Information Sciences

ER -
