TY - JOUR
T1 - Query-focused summarization via relevance distillation
AU - Yue, Ye
AU - Li, Yuanli
AU - Zhan, Jiaao
AU - Gao, Yang
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
PY - 2023/8
Y1 - 2023/8
N2 - Creating a concise and relevant summary for a specific query can broadly meet users’ information needs in many areas. In a summarization system, the extractive technique is attractive because it is simple and fast and produces reliable outputs. Salience and relevance are two key aspects of extractive summarization. Most existing approaches achieve them by augmenting input features, incorporating additional attention mechanisms, or expanding the training scale. Yet much unsupervised, query-related knowledge remains underexplored. To this end, in this paper, we frame query-focused document summarization as a combination of salience prediction and relevance prediction. Concretely, in addition to the oracle summary set for the salience task, we further create a pseudo-summary set based on user-specific queries (i.e., the title or image captions as the query) for the relevance task. Then, building on a modified BERT fine-tuned summarization model, we propose two methods, called guidance and distillation. Specifically, guidance training shares salient information to reinforce useful contextual representations through two-stage training with a salience-and-relevance objective. For distillation, we propose a new “guide-student” learning paradigm in which the query’s relevance knowledge is distilled and transferred from a guide model to a salience-oriented student model. Experimental results demonstrate that guidance training excels at improving summary salience, while distillation training is significantly better at relevance learning. Both achieve state-of-the-art results in unsupervised query-focused settings on the CNN and DailyMail datasets.
AB - Creating a concise and relevant summary for a specific query can broadly meet users’ information needs in many areas. In a summarization system, the extractive technique is attractive because it is simple and fast and produces reliable outputs. Salience and relevance are two key aspects of extractive summarization. Most existing approaches achieve them by augmenting input features, incorporating additional attention mechanisms, or expanding the training scale. Yet much unsupervised, query-related knowledge remains underexplored. To this end, in this paper, we frame query-focused document summarization as a combination of salience prediction and relevance prediction. Concretely, in addition to the oracle summary set for the salience task, we further create a pseudo-summary set based on user-specific queries (i.e., the title or image captions as the query) for the relevance task. Then, building on a modified BERT fine-tuned summarization model, we propose two methods, called guidance and distillation. Specifically, guidance training shares salient information to reinforce useful contextual representations through two-stage training with a salience-and-relevance objective. For distillation, we propose a new “guide-student” learning paradigm in which the query’s relevance knowledge is distilled and transferred from a guide model to a salience-oriented student model. Experimental results demonstrate that guidance training excels at improving summary salience, while distillation training is significantly better at relevance learning. Both achieve state-of-the-art results in unsupervised query-focused settings on the CNN and DailyMail datasets.
KW - Document summarization
KW - Knowledge distillation
KW - Unsupervised method
UR - http://www.scopus.com/inward/record.url?scp=85153627598&partnerID=8YFLogxK
U2 - 10.1007/s00521-023-08525-w
DO - 10.1007/s00521-023-08525-w
M3 - Article
AN - SCOPUS:85153627598
SN - 0941-0643
VL - 35
SP - 16543
EP - 16557
JO - Neural Computing and Applications
JF - Neural Computing and Applications
IS - 22
ER -