How does Attention Affect the Model?

Cheng Zhang, Qiuchi Li, Lingyu Hua, Dawei Song*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

The attention layer has become a prevalent component for improving the effectiveness of neural network models on NLP tasks. Understanding why attention is effective and how to interpret it has attracted widespread deliberation. Current studies mostly investigate the effect of the attention mechanism based on the attention distribution it generates within a single neural network structure. However, they do not consider the changes in the semantic capability of different components of the model caused by the attention mechanism, which can vary across network structures. In this paper, we propose a comprehensive analytical framework that exploits a convex hull representation of sequence semantics in an n-dimensional Semantic Euclidean Space and defines a series of indicators to capture the impact of attention on sequence semantics. Through a series of experiments on various NLP tasks and three representative recurrent units, we analyze why and how attention benefits the semantic capacity of different types of recurrent neural networks, based on the indicators defined in the proposed framework.
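The abstract describes representing a sequence's semantics as the convex hull of points in an n-dimensional Semantic Euclidean Space, with indicators defined over that hull. The paper's actual indicators are not given here, so the sketch below is purely illustrative: it assumes the hull is taken over token (or hidden-state) embedding vectors and uses hull volume as a hypothetical stand-in indicator.

```python
# Illustrative sketch only: hull volume as a hypothetical indicator of how much
# "semantic space" a sequence's representations span. This is NOT the paper's
# defined indicator, just an assumption-labeled example of the general idea.
import numpy as np
from scipy.spatial import ConvexHull


def semantic_hull_volume(token_embeddings: np.ndarray) -> float:
    """Volume of the convex hull spanned by a sequence's embedding vectors.

    token_embeddings: array of shape (seq_len, n_dims). A non-degenerate hull
    needs seq_len > n_dims, so in practice one would project the embeddings to
    a low dimension first (assumed here).
    """
    hull = ConvexHull(token_embeddings)
    return hull.volume


# Hypothetical usage: compare hull volumes of hidden states with and without
# attention weighting to gauge how attention reshapes the semantic span.
rng = np.random.default_rng(0)
states_plain = rng.normal(size=(50, 3))   # e.g. RNN hidden states projected to 3-D
states_attended = states_plain * 0.5      # stand-in for attention-weighted states
print(semantic_hull_volume(states_plain), semantic_hull_volume(states_attended))
```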

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL-IJCNLP 2021
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics (ACL)
Pages: 256-268
Number of pages: 13
ISBN (Electronic): 9781954085541
Publication status: Published - 2021
Event: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 2021 - 6 Aug 2021

Publication series

Name: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Conference

Conference: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Location: Virtual, Online
Period: 1/08/21 - 6/08/21
