@inproceedings{18b17257e1894a42967443d79d66caf7,
title = "How does Attention Affect the Model?",
abstract = "The attention layer has become a prevalent component in improving the effectiveness of neural network models for NLP tasks. Figuring out why attention is effective and its interpretability has attracted a widespread deliberation. Current studies mostly investigate the effect of attention mechanism based on the attention distribution it generates with one single neural network structure. However they do not consider the changes in semantic capability of different components in the model due to the attention mechanism, which can vary across different network structures. In this paper, we propose a comprehensive analytical framework that exploits a convex hull representation of sequence semantics in an n-dimensional Semantic Euclidean Space and defines a series of indicators to capture the impact of attention on sequence semantics. Through a series of experiments on various NLP tasks and three representative recurrent units, we analyze why and how attention benefits the semantic capacity of different types of recurrent neural networks based on the indicators defined in the proposed framework.",
author = "Cheng Zhang and Qiuchi Li and Lingyu Hua and Dawei Song",
note = "Publisher Copyright: {\textcopyright} 2021 Association for Computational Linguistics; Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 ; Conference date: 01-08-2021 Through 06-08-2021",
year = "2021",
language = "English",
series = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
publisher = "Association for Computational Linguistics (ACL)",
pages = "256--268",
editor = "Chengqing Zong and Fei Xia and Wenjie Li and Roberto Navigli",
booktitle = "Findings of the Association for Computational Linguistics",
address = "United States",
}