Diversifying Neural Dialogue Generation via Negative Distillation

Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li

Research output: Chapter in Book/Report/Conference proceeding · Conference contribution · peer-review

8 Citations (Scopus)

Abstract

Generative dialogue models suffer badly from the generic response problem, which limits their application to a few toy scenarios. Recently, an interesting approach, namely negative training, was proposed to alleviate this problem by reminding the model not to generate high-frequency responses during training. However, its performance is hindered by two issues: it ignores responses that are low-frequency yet still generic, and it introduces low-frequency but meaningless responses. In this paper, we propose a novel negative training paradigm, called negative distillation, that keeps the model away from undesirable generic responses while avoiding both problems. First, we introduce a negative teacher model that produces query-wise generic responses; the student model is then required to maximize its distance from this multi-level negative knowledge. Empirical results show that our method significantly outperforms previous negative training methods.
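The training signal sketched in the abstract can be illustrated with a short, hypothetical PyTorch snippet. This is not the authors' implementation: the function and parameter names (negative_distillation_loss, lambda_neg, kl_cap) are assumptions, and the sketch collapses the paper's multi-level negative knowledge into a single logit-level divergence term for brevity.

```python
# Minimal sketch of the negative-distillation idea from the abstract.
# Assumptions (not from the paper): all names, the single logit-level
# divergence term, and the clamp used to bound it.
import torch
import torch.nn.functional as F


def negative_distillation_loss(student_logits, teacher_logits, target_ids,
                               lambda_neg=0.5, pad_id=0, kl_cap=5.0):
    """NLL on the gold response, minus a term that grows as the student's
    token distribution diverges from the negative teacher's distribution
    (the teacher models query-wise generic responses)."""
    vocab = student_logits.size(-1)
    # Positive part: ordinary cross-entropy against the reference response.
    nll = F.cross_entropy(student_logits.view(-1, vocab),
                          target_ids.view(-1), ignore_index=pad_id)
    # Negative part: KL(student || teacher). We want to MAXIMIZE this
    # divergence, so it enters the loss with a minus sign; clamping keeps
    # the unbounded maximization from dominating training.
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    teacher_log_probs = F.log_softmax(teacher_logits.detach(), dim=-1)
    kl = F.kl_div(teacher_log_probs, student_log_probs,
                  reduction="batchmean", log_target=True)
    return nll - lambda_neg * torch.clamp(kl, max=kl_cap)


# Toy usage: batch of 2 responses, 5 tokens each, vocabulary of 100.
B, T, V = 2, 5, 100
student_logits = torch.randn(B, T, V, requires_grad=True)
teacher_logits = torch.randn(B, T, V)   # from the frozen negative teacher
targets = torch.randint(1, V, (B, T))
loss = negative_distillation_loss(student_logits, teacher_logits, targets)
loss.backward()
```

Clamping the divergence term is one simple way to keep the "push away from the teacher" objective from degenerating; the paper's actual multi-level formulation should be consulted for the real objective.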

Original language: English
Title of host publication: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 407-418
Number of pages: 12
ISBN (Electronic): 9781955917711
Publication status: Published - 2022
Event: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022 - Seattle, United States
Duration: 10 Jul 2022 - 15 Jul 2022

Publication series

Name: NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

Conference

Conference: 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022
Country/Territory: United States
City: Seattle
Period: 10/07/22 - 15/07/22
