Domain Adaptation and Summary Distillation for Unsupervised Query Focused Summarization

Jiancheng Du, Yang Gao*

*此作品的通讯作者

科研成果: 期刊稿件文章同行评审

1 引用 (Scopus)

摘要

Text summarizing is the task of reducing a document's length while maintaining its essential information. In the age of information explosion, how to obtain the content that users needed from a large volume of information becomes particularly significant. Under such circumstances, query-focused abstractive summarization (qfs) becomes more dominant since it is able to focus on user needs while delivering fluent, concise, succinct paraphrased summaries. However, unlike generic summarization, which has achieved remarkable progress driven by a substantial amount of parallel data, the qfs struggles due to a deficiency of parallel corpus. Therefore, in this paper, we leverage a typical large generic summarization dataset to facilitate the pressing demands on unsupervised qfs. The large-scale query-free benchmark is automatically transformed into a query-focused dataset (Query-CNNDM) while preserving its informative summaries. We propose a simple yet effective unsupervised method, called Domain Adaptation and Summary Distillation method (DASD). In the model, to achieve the domain adaptation for unsupervised qfs, we design a query-aware gap sentence generation (q-GSG) strategy to equip the model with the capability of learning target textual knowledge and obtaining a good initialization at the target domain. As instance-specific regularization, we train a teacher model with the Query-CNNDM to generate pseudo-labels for summary distillation. Experimental results indicate that our DASD model achieves state-of-the-art performance on two benchmark datasets, Debatepedia and Wikiref, in a zero-shot setting and shows good generalization to the abstractive few-shot qfs.

源语言英语
页(从-至)1044-1055
页数12
期刊IEEE Transactions on Knowledge and Data Engineering
36
3
DOI
出版状态已出版 - 1 3月 2024

指纹

探究 'Domain Adaptation and Summary Distillation for Unsupervised Query Focused Summarization' 的科研主题。它们共同构成独一无二的指纹。

引用此