TY - GEN
T1 - Query-focused Abstractive Summarization via Question-answering Model
AU - Du, Jiancheng
AU - Gao, Yang
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Text summarization is the task of creating a short version of a document while preserving its main content. In the age of information explosion, obtaining the content that users care about from a large amount of information becomes particularly significant. Under these circumstances, query-focused abstractive summarization (QFS) becomes more prominent, since it focuses on user needs while generating fluent, concise, paraphrased summaries. However, unlike generic summarization, which has achieved remarkable results driven by large-scale parallel data, QFS suffers from a lack of parallel corpora. To address these issues, in this paper we convert large-scale generic summarization datasets into query-focused datasets while preserving the informative summaries. Based on the synthetic queries and data, we propose a new model, called SQAS, which extracts fine-grained factual information with respect to a specific question and takes reasoning information into account by understanding the source document through a question-answering model. Given the extracted content, the summary generator can not only generate semantically relevant content but also produce fluent and readable sentences thanks to the language generation capability of a pre-trained language model. Experimental results on both generic and query-focused summarization datasets demonstrate the effectiveness of our proposed model in terms of automatic ROUGE metrics and analysis of real cases.
AB - Text summarization is the task of creating a short version of a document while preserving its main content. In the age of information explosion, obtaining the content that users care about from a large amount of information becomes particularly significant. Under these circumstances, query-focused abstractive summarization (QFS) becomes more prominent, since it focuses on user needs while generating fluent, concise, paraphrased summaries. However, unlike generic summarization, which has achieved remarkable results driven by large-scale parallel data, QFS suffers from a lack of parallel corpora. To address these issues, in this paper we convert large-scale generic summarization datasets into query-focused datasets while preserving the informative summaries. Based on the synthetic queries and data, we propose a new model, called SQAS, which extracts fine-grained factual information with respect to a specific question and takes reasoning information into account by understanding the source document through a question-answering model. Given the extracted content, the summary generator can not only generate semantically relevant content but also produce fluent and readable sentences thanks to the language generation capability of a pre-trained language model. Experimental results on both generic and query-focused summarization datasets demonstrate the effectiveness of our proposed model in terms of automatic ROUGE metrics and analysis of real cases.
KW - Abstractive summarization
KW - Query-focused summarization
KW - Question answering
UR - http://www.scopus.com/inward/record.url?scp=85125098792&partnerID=8YFLogxK
U2 - 10.1109/ICKG52313.2021.00065
DO - 10.1109/ICKG52313.2021.00065
M3 - Conference contribution
AN - SCOPUS:85125098792
T3 - Proceedings - 12th IEEE International Conference on Big Knowledge, ICBK 2021
SP - 440
EP - 447
BT - Proceedings - 12th IEEE International Conference on Big Knowledge, ICBK 2021
A2 - Gong, Zhiguo
A2 - Li, Xue
A2 - Oguducu, Sule Gunduz
A2 - Chen, Lei
A2 - Manjon, Baltasar Fernandez
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 12th IEEE International Conference on Big Knowledge, ICBK 2021
Y2 - 7 December 2021 through 8 December 2021
ER -