Improving Dialogue Summarization with Mixup Label Smoothing

Saihua Cheng, Dandan Song*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Abstractive dialogue summarization models trained with Maximum Likelihood Estimation suffer from overconfidence because the training objective encourages the model to assign all probability mass to the hard target. Although label smoothing is widely adopted to prevent models from becoming overconfident, it assumes a pre-defined uniform distribution that is not adaptive and is therefore not an ideal soft target. We propose a Mixup Label Smoothing method that exploits the general knowledge of a pretrained language model to construct a flexible soft target representing diverse candidates. We treat the hypothesis distribution produced by the pretrained language model as a context-smoothing target: it encodes knowledge from the massive pretraining corpus and implies a richer set of possible candidate summaries. Extensive experiments on three popular dialogue summarization datasets demonstrate that our method outperforms various strong baselines, including in low-resource settings.
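The contrast the abstract draws between uniform label smoothing and a language-model-derived soft target can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the interpolation weight `alpha`, and the toy distributions are all assumptions for exposition.

```python
# Illustrative sketch (not the paper's code). Standard label smoothing mixes
# the one-hot target with a fixed uniform prior; the mixup variant described
# in the abstract instead mixes it with a context-dependent distribution
# obtained from a pretrained language model.

def uniform_label_smoothing(one_hot, epsilon):
    """Classic label smoothing: interpolate toward a uniform distribution."""
    vocab_size = len(one_hot)
    return [(1 - epsilon) * p + epsilon / vocab_size for p in one_hot]

def mixup_label_smoothing(one_hot, lm_dist, alpha):
    """Interpolate the hard target with a pretrained LM's hypothesis
    distribution, yielding an adaptive soft target (alpha is assumed)."""
    return [(1 - alpha) * p + alpha * q for p, q in zip(one_hot, lm_dist)]

if __name__ == "__main__":
    one_hot = [0.0, 1.0, 0.0, 0.0]      # gold token at index 1
    lm_dist = [0.1, 0.6, 0.25, 0.05]    # toy distribution from a pretrained LM
    print(uniform_label_smoothing(one_hot, 0.1))  # same mass on every non-gold token
    print(mixup_label_smoothing(one_hot, lm_dist, 0.1))  # mass follows the LM's preferences
```

Both soft targets remain valid probability distributions, but only the mixup target spreads the smoothing mass according to which candidate tokens the language model actually considers plausible in context.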

Original language: English
Title of host publication: Proceedings of 2023 Chinese Intelligent Automation Conference
Editors: Zhidong Deng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 460-475
Number of pages: 16
ISBN (Print): 9789819961863
DOIs
Publication status: Published - 2023
Event: Chinese Intelligent Automation Conference, CIAC 2023 - Nanjing, China
Duration: 2 Oct 2023 - 5 Oct 2023

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 1082 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: Chinese Intelligent Automation Conference, CIAC 2023
Country/Territory: China
City: Nanjing
Period: 2/10/23 - 5/10/23

Keywords

  • Dialogue summarization
  • Label smoothing
  • Pretrained language model
