OSTOD: One-Step Task-Oriented Dialogue with activated state and retelling response

Heyan Huang; Puhai Yang; Wei Wei; Shumin Shi; Xian Ling Mao

doi:10.1016/j.knosys.2024.111677

Corresponding author: Puhai Yang

School of Computer Science

Research output: Contribution to journal › Article › peer-review

Abstract

At present, progress in conversational AI research is being greatly propelled by large-scale pre-trained language models. In particular, task-oriented dialogue systems have gained widespread attention owing to their immense potential in helping individuals accomplish diverse objectives, such as booking hotels, making restaurant reservations, and purchasing train tickets. In the past, task-oriented dialogue systems were typically viewed as a multi-step process that included spoken language understanding, dialogue state tracking, dialogue policy learning, and natural language generation. More recently, large-scale pre-trained language models have enabled the development of end-to-end neural pipeline task-oriented dialogue systems, which combine multiple steps into a single model, allowing for joint optimization and preventing error propagation. However, in order to explicitly retrieve information from databases to ensure the interpretability of the system, almost all end-to-end neural pipeline methods inevitably require predicting the dialogue state as an intermediate result specialized for the domain or task, which poses significant challenges for generalization. To address this problem, we propose One-Step Task-Oriented Dialogue (OSTOD), which models task-oriented dialogue by synchronously generating activated states and retelling responses, where activated states are slot values that contribute to database access, and retelling responses are system responses that contain activated state information. Specifically, automatic methods are first designed to build data containing activated states and retelling responses. Then, a joint generation model that synchronously predicts activated states and retelling responses in a single step is proposed for task-oriented dialogue modelling. Empirical results on the MultiWOZ 2.0 and MultiWOZ 2.1 datasets show that our OSTOD model performs comparably to state-of-the-art baselines. Moreover, our model exhibits exceptional generalization capabilities in few-shot learning and domain transfer scenarios.

Original language	English
Article number	111677
Journal	Knowledge-Based Systems
Volume	293
DOI	https://doi.org/10.1016/j.knosys.2024.111677
Publication status	Published - 7 Jun 2024


Cite this

@article{6adfdd78ef694600a6752ffce33feb59,

title = "OSTOD: One-Step Task-Oriented Dialogue with activated state and retelling response",

abstract = "At present, progress in conversational AI research is being greatly propelled by large-scale pre-trained language models. In particular, task-oriented dialogue systems have gained widespread attention owing to their immense potential in helping individuals accomplish diverse objectives, such as booking hotels, making restaurant reservations, and purchasing train tickets. In the past, task-oriented dialogue systems were typically viewed as a multi-step process that included spoken language understanding, dialogue state tracking, dialogue policy learning, and natural language generation. More recently, large-scale pre-trained language models have enabled the development of end-to-end neural pipeline task-oriented dialogue systems, which combine multiple steps into a single model, allowing for joint optimization and preventing error propagation. However, in order to explicitly retrieve information from databases to ensure the interpretability of the system, almost all end-to-end neural pipeline methods inevitably require predicting the dialogue state as an intermediate result specialized for the domain or task, which poses significant challenges for generalization. To address this problem, we propose One-Step Task-Oriented Dialogue (OSTOD), which models task-oriented dialogue by synchronously generating activated states and retelling responses, where activated states are slot values that contribute to database access, and retelling responses are system responses that contain activated state information. Specifically, automatic methods are first designed to build data containing activated states and retelling responses. Then, a joint generation model that synchronously predicts activated states and retelling responses in a single step is proposed for task-oriented dialogue modelling. Empirical results on the MultiWOZ 2.0 and MultiWOZ 2.1 datasets show that our OSTOD model performs comparably to state-of-the-art baselines. Moreover, our model exhibits exceptional generalization capabilities in few-shot learning and domain transfer scenarios.",

keywords = "Dialogue state tracking, End-to-end dialogue, Response generation, Retelling response, Task-oriented dialogue",

author = "Heyan Huang and Puhai Yang and Wei Wei and Shumin Shi and Mao, {Xian Ling}",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier B.V.",

year = "2024",

month = jun,

day = "7",

doi = "10.1016/j.knosys.2024.111677",

language = "English",

volume = "293",

journal = "Knowledge-Based Systems",

issn = "0950-7051",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - OSTOD

T2 - One-Step Task-Oriented Dialogue with activated state and retelling response

AU - Huang, Heyan

AU - Yang, Puhai

AU - Wei, Wei

AU - Shi, Shumin

AU - Mao, Xian Ling

PY - 2024/6/7

Y1 - 2024/6/7

N2 - At present, progress in conversational AI research is being greatly propelled by large-scale pre-trained language models. In particular, task-oriented dialogue systems have gained widespread attention owing to their immense potential in helping individuals accomplish diverse objectives, such as booking hotels, making restaurant reservations, and purchasing train tickets. In the past, task-oriented dialogue systems were typically viewed as a multi-step process that included spoken language understanding, dialogue state tracking, dialogue policy learning, and natural language generation. More recently, large-scale pre-trained language models have enabled the development of end-to-end neural pipeline task-oriented dialogue systems, which combine multiple steps into a single model, allowing for joint optimization and preventing error propagation. However, in order to explicitly retrieve information from databases to ensure the interpretability of the system, almost all end-to-end neural pipeline methods inevitably require predicting the dialogue state as an intermediate result specialized for the domain or task, which poses significant challenges for generalization. To address this problem, we propose One-Step Task-Oriented Dialogue (OSTOD), which models task-oriented dialogue by synchronously generating activated states and retelling responses, where activated states are slot values that contribute to database access, and retelling responses are system responses that contain activated state information. Specifically, automatic methods are first designed to build data containing activated states and retelling responses. Then, a joint generation model that synchronously predicts activated states and retelling responses in a single step is proposed for task-oriented dialogue modelling. Empirical results on the MultiWOZ 2.0 and MultiWOZ 2.1 datasets show that our OSTOD model performs comparably to state-of-the-art baselines. Moreover, our model exhibits exceptional generalization capabilities in few-shot learning and domain transfer scenarios.


KW - Dialogue state tracking

KW - End-to-end dialogue

KW - Response generation

KW - Retelling response

KW - Task-oriented dialogue

UR - http://www.scopus.com/inward/record.url?scp=85189671986&partnerID=8YFLogxK

U2 - 10.1016/j.knosys.2024.111677

DO - 10.1016/j.knosys.2024.111677

M3 - Article

AN - SCOPUS:85189671986

SN - 0950-7051

VL - 293

JO - Knowledge-Based Systems

JF - Knowledge-Based Systems

M1 - 111677

ER -
