Efficient and Effective Role Player: A Compact Knowledge-grounded Persona-based Dialogue Model Enhanced by LLM Distillation

Linmei Hu*, Xinyu Zhang, Dandan Song, Changzhi Zhou, Hongyu He, Liqiang Nie

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Incorporating explicit personas into dialogue models is critical for generating responses that fulfill specific user needs and preferences, creating more personalized and engaging interactions. Early works on persona-based dialogue generation directly feed the concatenated persona descriptions and dialogue history into relatively small pre-trained language models (PLMs) for response generation, which leads to uninformative and inferior results due to the sparse persona information and the limited generation capabilities of these models. Recently, large language models (LLMs) have shown surprising capabilities in language generation, and prompting LLMs with persona descriptions for role-playing dialogue generation has also achieved promising results. However, deploying LLMs in practical applications is challenging due to their large scale, spurring efforts to distill their generation capabilities into more concise and compact models through teacher-student learning. In this article, we propose an efficient and compact Knowledge-grounded Persona-based Dialogue model enhanced by LLM Distillation (KPDD). Specifically, we first enrich the annotated persona descriptions by integrating external knowledge graphs (KGs) with a mixed encoding network, coupled with a mixture-of-experts (MoE) module for both informative and diverse response generation. The mixed encoding network contains multiple layers of modality interaction operations, enabling information from each modality to propagate to the other. Second, to fully exploit the generation capabilities of LLMs, we distill them into our model, using a natural language inference (NLI)-based filtering mechanism to extract high-quality information from the LLM outputs. In addition, we employ a curriculum learning strategy that trains our model first on the high-quality filtered distilled data and then progressively on the relatively noisy original data, enhancing its adaptability and performance. Extensive experiments show that KPDD outperforms state-of-the-art baselines in terms of both automatic and human evaluation.
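
As a rough illustration of the NLI-based filtering mechanism mentioned in the abstract, the sketch below keeps only those LLM-distilled responses that an NLI model judges to be entailed by the persona description. The specific model (an off-the-shelf MNLI checkpoint), the entailment threshold, and the persona/response input format are illustrative assumptions, not the authors' actual configuration.

```python
from transformers import pipeline

# Hypothetical sketch of NLI-based filtering of distilled data:
# keep an LLM-generated response only if an off-the-shelf NLI model
# scores it as entailed by the persona description.
# Model choice and threshold are assumptions, not the paper's setup.
nli = pipeline("text-classification", model="roberta-large-mnli")

def filter_distilled_responses(persona, candidates, threshold=0.7):
    """Keep responses whose entailment score w.r.t. the persona
    exceeds an (assumed) threshold."""
    kept = []
    for response in candidates:
        # Premise = persona description, hypothesis = LLM-generated response.
        scores = nli({"text": persona, "text_pair": response}, top_k=None)
        entailment = next(s["score"] for s in scores if s["label"] == "ENTAILMENT")
        if entailment >= threshold:
            kept.append(response)
    return kept
```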

Original language: English
Article number: 59
Journal: ACM Transactions on Information Systems
Volume: 43
Issue number: 3
DOIs
Publication status: Published - 27 Feb 2025

Keywords

  • Curriculum Learning
  • Distillation
  • Knowledge Graph
  • Large Language Model
  • MoE
  • Persona-based Dialogue Generation
