AttenWalker: Unsupervised Long-Document Question Answering via Attention-based Graph Walking

Yuxiang Nie, Heyan Huang*, Wei Wei, Xian Ling Mao

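*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Citation (Scopus)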

Abstract

Annotating long-document question answering (long-document QA) pairs is time-consuming and expensive. To alleviate the problem, it might be possible to generate long-document QA pairs via unsupervised question answering (UQA) methods. However, existing UQA tasks are based on short documents, and can hardly incorporate long-range information. To tackle the problem, we propose a new task, named unsupervised long-document question answering (ULQA), aiming to generate high-quality long-document QA instances in an unsupervised manner. Besides, we propose AttenWalker, a novel unsupervised method to aggregate and generate answers with long-range dependency so as to construct long-document QA pairs. Specifically, AttenWalker is composed of three modules, i.e., span collector, span linker and answer aggregator. Firstly, the span collector takes advantage of constituent parsing and reconstruction loss to select informative candidate spans for constructing answers. Secondly, by going through the attention graph of a pre-trained long-document model, potentially interrelated text spans (that might be far apart) could be linked together via an attention-walking algorithm. Thirdly, in the answer aggregator, linked spans are aggregated into the final answer via the mask-filling ability of a pre-trained model. Extensive experiments show that AttenWalker outperforms previous methods on Qasper and NarrativeQA. In addition, AttenWalker also shows strong performance in the few-shot learning setting.
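
The span-linking step described in the abstract can be pictured with a small sketch. This is not the paper's algorithm: it is a minimal greedy walk written under assumptions of our own, namely that attention has already been aggregated into a span-by-span score matrix, that the walk always follows the strongest unvisited edge, and that it stops at a score threshold or a step limit. The names `attention_walk`, `toy_attn`, `max_steps`, and `threshold` are hypothetical and do not come from the source.

```python
from typing import List, Sequence


def attention_walk(
    attn: Sequence[Sequence[float]],
    start: int,
    max_steps: int = 3,
    threshold: float = 0.1,
) -> List[int]:
    """Greedy walk over a span-level attention graph (illustrative only).

    attn[i][j] is a hypothetical score for how strongly span i attends
    to span j. Returns the indices of the spans visited, in walk order.
    """
    path = [start]
    visited = {start}
    current = start
    for _ in range(max_steps):
        # Candidate next spans: unvisited and attended to strongly enough.
        candidates = [
            (score, j)
            for j, score in enumerate(attn[current])
            if j not in visited and score >= threshold
        ]
        if not candidates:
            break  # no sufficiently strong edge left to follow
        _, nxt = max(candidates)  # follow the strongest remaining edge
        path.append(nxt)
        visited.add(nxt)
        current = nxt
    return path


# Toy 4-span graph; the scores are made up for illustration.
toy_attn = [
    [0.0, 0.05, 0.6, 0.2],
    [0.1, 0.0, 0.1, 0.3],
    [0.2, 0.02, 0.0, 0.5],
    [0.3, 0.4, 0.1, 0.0],
]
linked = attention_walk(toy_attn, start=0)
print(linked)  # [0, 2, 3, 1]
```

In this toy run the walk links span 0 to span 2, then to spans 3 and 1, even though spans 0 and 1 are only weakly connected directly. That is the kind of indirect, possibly long-range link the abstract refers to; how the actual method scores edges and decides when to stop is specified in the paper itself.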

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics, ACL 2023
Publisher: Association for Computational Linguistics (ACL)
Pages: 13650-13663
Number of pages: 14
ISBN (electronic): 9781959429623
Publication status: Published - 2023
Event: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023 - Toronto, Canada
Duration: 9 Jul 2023 to 14 Jul 2023

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (print): 0736-587X

Conference

Conference: 61st Annual Meeting of the Association for Computational Linguistics, ACL 2023
Country/Territory: Canada
City: Toronto
Period: 9/07/23 to 14/07/23
