Domain Adaptation and Summary Distillation for Unsupervised Query Focused Summarization

Jiancheng Du, Yang Gao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


Abstract

Text summarization is the task of reducing a document's length while preserving its essential information. In the age of information explosion, obtaining the content users need from a large volume of information has become particularly important. Under these circumstances, query-focused abstractive summarization (QFS) is increasingly prominent, since it focuses on user needs while delivering fluent, concise, paraphrased summaries. However, unlike generic summarization, which has achieved remarkable progress driven by a substantial amount of parallel data, QFS struggles due to a shortage of parallel corpora. In this paper, we therefore leverage a typical large generic summarization dataset to meet the pressing demand for unsupervised QFS. The large-scale query-free benchmark is automatically transformed into a query-focused dataset (Query-CNNDM) while preserving its informative summaries. We propose a simple yet effective unsupervised method, Domain Adaptation and Summary Distillation (DASD). To achieve domain adaptation for unsupervised QFS, we design a query-aware gap sentence generation (q-GSG) strategy that equips the model with the ability to learn target-domain textual knowledge and obtain a good initialization in the target domain. As instance-specific regularization, we train a teacher model on Query-CNNDM to generate pseudo-labels for summary distillation. Experimental results indicate that our DASD model achieves state-of-the-art performance on two benchmark datasets, Debatepedia and Wikiref, in a zero-shot setting, and generalizes well to abstractive few-shot QFS.
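
To make the method concrete, the following is a minimal, hypothetical sketch of the query-aware gap sentence generation (q-GSG) objective described above. The relevance scorer (unigram overlap), the mask token, the mask ratio, and the query separator are illustrative assumptions, not the paper's exact choices.

    def qgsg_example(query, sentences, mask_ratio=0.3, mask_token="<mask>"):
        """Mask the sentences most relevant to the query; the masked
        sentences become the generation target, so the model learns to
        produce query-focused content during domain-adaptive pretraining."""
        def relevance(sentence):
            # Hypothetical scorer: unigram overlap between query and sentence.
            q = set(query.lower().split())
            s = set(sentence.lower().split())
            return len(q & s) / (len(q) or 1)

        n_mask = max(1, int(len(sentences) * mask_ratio))
        ranked = sorted(range(len(sentences)),
                        key=lambda i: relevance(sentences[i]), reverse=True)
        masked = set(ranked[:n_mask])

        source = " ".join(mask_token if i in masked else s
                          for i, s in enumerate(sentences))
        target = " ".join(sentences[i] for i in sorted(masked))
        # Prepend the query so the encoder conditions on it.
        return query + " </s> " + source, target

The summary-distillation step can be sketched in the same spirit, here assuming Hugging Face seq2seq models; the checkpoint name, decoding settings, and loss handling are assumptions rather than the paper's exact configuration.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tok = AutoTokenizer.from_pretrained("google/pegasus-large")
    # Stand-in for a teacher fine-tuned on Query-CNNDM, as in the paper.
    teacher = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-large")
    student = AutoModelForSeq2SeqLM.from_pretrained("google/pegasus-large")

    def distill_step(query, document):
        # Condition both models on the query by prepending it to the document.
        inputs = tok(query + " </s> " + document,
                     return_tensors="pt", truncation=True)
        # The teacher generates an instance-specific pseudo-label summary.
        pseudo = teacher.generate(**inputs, max_new_tokens=64)
        labels = pseudo[:, 1:].clone()             # drop the decoder start token
        labels[labels == tok.pad_token_id] = -100  # ignore padding in the loss
        # The student is trained to reproduce the teacher's pseudo-label,
        # acting as instance-specific regularization.
        return student(**inputs, labels=labels).loss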

Original language: English
Pages (from-to): 1044-1055
Number of pages: 12
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 36
Issue number: 3
DOIs
Publication status: Published - 1 Mar 2024

Keywords

  • Abstractive summarization
  • domain adaptation
  • query-focused summarization
  • summary distillation
  • unsupervised learning
