Bilevel Scheduled Sampling for Dialogue Generation

Jiawen Liu; Kan Li

doi:10.1007/978-3-031-44693-1_64

Jiawen Liu, Kan Li^*

^*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Exposure bias is a common challenge in many natural language processing tasks, particularly in dialogue generation. In response, researchers have devised various techniques, among which scheduled sampling has proven effective at mitigating exposure bias. However, existing state-of-the-art scheduled sampling methods consider only the quality of the currently sampled words when performing threshold truncation sampling, which overlooks sentence-level information; the threshold truncation method itself also warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes sentence-level information into account and combines it with word-level quality. To enhance sampling diversity and improve the model’s adaptability, we propose a smooth function that maps the combined sentence-level and word-level result to an appropriate range, and we employ probabilistic sampling based on the mapped values instead of threshold truncation. Experiments on the DailyDialog and PersonaChat datasets demonstrate the effectiveness of the proposed methods, which significantly alleviate the exposure bias problem and outperform state-of-the-art scheduled sampling methods.
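The abstract gives no formulas, so the following is only an illustrative sketch of the general idea it describes, not the authors' method. It assumes a word-level score and a sentence-level score that both lie in [0, 1], combines them with a hypothetical weight `alpha`, uses a sigmoid as a stand-in for the "smooth function", and then draws a random sample per position instead of comparing against a hard threshold. The function names, the weighting, the `steepness` parameter, and the choice that higher scores favour the model's own prediction are all assumptions.

```python
import math
import random


def smooth_map(word_score, sentence_score, alpha=0.5, steepness=5.0):
    """Map a combined score to a sampling probability in (0, 1).

    Assumption: a weighted average followed by a sigmoid centred at 0.5.
    The paper's actual smooth function is not given in the abstract.
    """
    combined = alpha * word_score + (1.0 - alpha) * sentence_score
    return 1.0 / (1.0 + math.exp(-steepness * (combined - 0.5)))


def choose_decoder_inputs(gold_tokens, predicted_tokens, word_scores,
                          sentence_score, rng):
    """Pick, per position, the gold token or the model's predicted token.

    Probabilistic sampling: the predicted token is used with probability
    smooth_map(...), rather than whenever a score crosses a fixed threshold.
    """
    inputs = []
    for gold, pred, w in zip(gold_tokens, predicted_tokens, word_scores):
        p = smooth_map(w, sentence_score)
        inputs.append(pred if rng.random() < p else gold)
    return inputs


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    gold = ["how", "are", "you", "today"]
    pred = ["how", "is", "you", "doing"]
    word_scores = [0.9, 0.2, 0.8, 0.4]  # made-up per-word quality scores
    print(choose_decoder_inputs(gold, pred, word_scores,
                                sentence_score=0.6, rng=rng))
```

Under these assumptions, a hard threshold would always make the same choice for a given score, while the sampled version can still occasionally feed a low-scoring prediction back into the decoder, which is one way to read the "sampling diversity" the abstract mentions.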

Original language	English
Title of host publication	Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings
Editors	Fei Liu, Nan Duan, Qingting Xu, Yu Hong
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	827-839
Number of pages	13
ISBN (Print)	9783031446924
DOIs	https://doi.org/10.1007/978-3-031-44693-1_64
Publication status	Published - 2023
Event	12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023 - Foshan, China
Duration	12 Oct 2023 → 15 Oct 2023

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	14302 LNAI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023
Country/Territory	China
City	Foshan
Period	12/10/23 → 15/10/23

Keywords

Dialog Generation
Exposure Bias
Scheduled Sampling

Cite this

Liu, J., & Li, K. (2023). Bilevel Scheduled Sampling for Dialogue Generation. In F. Liu, N. Duan, Q. Xu, & Y. Hong (Eds.), Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings (pp. 827-839). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14302 LNAI). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-44693-1_64

Liu, Jiawen ; Li, Kan. / Bilevel Scheduled Sampling for Dialogue Generation. Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings. editor / Fei Liu ; Nan Duan ; Qingting Xu ; Yu Hong. Springer Science and Business Media Deutschland GmbH, 2023. pp. 827-839 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{05118148011b4ebe88990af9cb6b30fe,

title = "Bilevel Scheduled Sampling for Dialogue Generation",

abstract = "Exposure bias poses a common challenge in numerous natural language processing tasks, particularly in the dialog generation. In response to this issue, researchers have devised various techniques, among which scheduled sampling has proven to be an effective method for mitigating exposure bias. However, the existing state-of-the-art scheduled sampling methods solely consider the current sampling words{\textquoteright} quality for threshold truncation sampling, which overlooks the importance of sentence-level information and the method of threshold truncation warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes the sentence-level information into account and incorporates it with word-level quality. To enhance sampling diversity and improve the model{\textquoteright}s adaptability, we propose a smooth function that maps the combined result of sentence-level and word-level information to an appropriate range, and employ probabilistic sampling based on the mapped values instead of threshold truncation. Experiments conducted on the DailyDialog and PersonaChat datasets demonstrate the effectiveness of our proposed methods, which significantly alleviate the exposure bias problem and outperform state-of-the-art scheduled sampling methods.",

keywords = "Dialog Generation, Exposure Bias, Scheduled Sampling",

author = "Jiawen Liu and Kan Li",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.; 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023 ; Conference date: 12-10-2023 Through 15-10-2023",

year = "2023",

doi = "10.1007/978-3-031-44693-1_64",

language = "English",

isbn = "9783031446924",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "827--839",

editor = "Fei Liu and Nan Duan and Qingting Xu and Yu Hong",

booktitle = "Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings",

address = "Germany",

}

Liu, J & Li, K 2023, Bilevel Scheduled Sampling for Dialogue Generation. in F Liu, N Duan, Q Xu & Y Hong (eds), Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 14302 LNAI, Springer Science and Business Media Deutschland GmbH, pp. 827-839, 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023, Foshan, China, 12/10/23. https://doi.org/10.1007/978-3-031-44693-1_64

Bilevel Scheduled Sampling for Dialogue Generation. / Liu, Jiawen; Li, Kan.
Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings. ed. / Fei Liu; Nan Duan; Qingting Xu; Yu Hong. Springer Science and Business Media Deutschland GmbH, 2023. p. 827-839 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14302 LNAI).

TY - GEN

T1 - Bilevel Scheduled Sampling for Dialogue Generation

AU - Liu, Jiawen

AU - Li, Kan

PY - 2023

Y1 - 2023

N2 - Exposure bias poses a common challenge in numerous natural language processing tasks, particularly in the dialog generation. In response to this issue, researchers have devised various techniques, among which scheduled sampling has proven to be an effective method for mitigating exposure bias. However, the existing state-of-the-art scheduled sampling methods solely consider the current sampling words’ quality for threshold truncation sampling, which overlooks the importance of sentence-level information and the method of threshold truncation warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes the sentence-level information into account and incorporates it with word-level quality. To enhance sampling diversity and improve the model’s adaptability, we propose a smooth function that maps the combined result of sentence-level and word-level information to an appropriate range, and employ probabilistic sampling based on the mapped values instead of threshold truncation. Experiments conducted on the DailyDialog and PersonaChat datasets demonstrate the effectiveness of our proposed methods, which significantly alleviate the exposure bias problem and outperform state-of-the-art scheduled sampling methods.

KW - Dialog Generation

KW - Exposure Bias

KW - Scheduled Sampling

UR - http://www.scopus.com/inward/record.url?scp=85174679294&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-44693-1_64

DO - 10.1007/978-3-031-44693-1_64

M3 - Conference contribution

AN - SCOPUS:85174679294

SN - 9783031446924

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 827

EP - 839

BT - Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings

A2 - Liu, Fei

A2 - Duan, Nan

A2 - Xu, Qingting

A2 - Hong, Yu

PB - Springer Science and Business Media Deutschland GmbH

T2 - 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023

Y2 - 12 October 2023 through 15 October 2023

ER -

Liu J, Li K. Bilevel Scheduled Sampling for Dialogue Generation. In Liu F, Duan N, Xu Q, Hong Y, editors, Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings. Springer Science and Business Media Deutschland GmbH. 2023. p. 827-839. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-44693-1_64