TY - GEN
T1 - Bilevel Scheduled Sampling for Dialogue Generation
AU - Liu, Jiawen
AU - Li, Kan
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2023
Y1 - 2023
N2 - Exposure bias poses a common challenge in numerous natural language processing tasks, particularly in dialogue generation. In response to this issue, researchers have devised various techniques, among which scheduled sampling has proven to be an effective method for mitigating exposure bias. However, existing state-of-the-art scheduled sampling methods consider only the quality of the currently sampled words for threshold-truncation sampling, overlooking the importance of sentence-level information; moreover, the threshold-truncation method itself warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes sentence-level information into account and combines it with word-level quality. To enhance sampling diversity and improve the model’s adaptability, we propose a smooth function that maps the combined sentence-level and word-level information to an appropriate range, and we employ probabilistic sampling based on the mapped values instead of threshold truncation. Experiments conducted on the DailyDialog and PersonaChat datasets demonstrate the effectiveness of our proposed methods, which significantly alleviate the exposure bias problem and outperform state-of-the-art scheduled sampling methods.
AB - Exposure bias poses a common challenge in numerous natural language processing tasks, particularly in dialogue generation. In response to this issue, researchers have devised various techniques, among which scheduled sampling has proven to be an effective method for mitigating exposure bias. However, existing state-of-the-art scheduled sampling methods consider only the quality of the currently sampled words for threshold-truncation sampling, overlooking the importance of sentence-level information; moreover, the threshold-truncation method itself warrants further discussion. In this paper, we propose a bilevel scheduled sampling model that takes sentence-level information into account and combines it with word-level quality. To enhance sampling diversity and improve the model’s adaptability, we propose a smooth function that maps the combined sentence-level and word-level information to an appropriate range, and we employ probabilistic sampling based on the mapped values instead of threshold truncation. Experiments conducted on the DailyDialog and PersonaChat datasets demonstrate the effectiveness of our proposed methods, which significantly alleviate the exposure bias problem and outperform state-of-the-art scheduled sampling methods.
KW - Dialog Generation
KW - Exposure Bias
KW - Scheduled Sampling
UR - http://www.scopus.com/inward/record.url?scp=85174679294&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-44693-1_64
DO - 10.1007/978-3-031-44693-1_64
M3 - Conference contribution
AN - SCOPUS:85174679294
SN - 9783031446924
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 827
EP - 839
BT - Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Proceedings
A2 - Liu, Fei
A2 - Duan, Nan
A2 - Xu, Qingting
A2 - Hong, Yu
PB - Springer Science and Business Media Deutschland GmbH
T2 - 12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023
Y2 - 12 October 2023 through 15 October 2023
ER -