TY - JOUR
T1 - AgentTOD
T2 - A Task-Oriented Dialogue Agent with a Flexible and Adaptive API Calling Paradigm
AU - Xu, Heng Da
AU - Mao, Xian Ling
AU - Sun, Fanshu
AU - Che, Tian Yi
AU - Xu, Chun
AU - Huang, Heyan
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/8/8
Y1 - 2025/8/8
AB - Task-oriented dialogue (TOD) systems play a vital role in numerous assistance and service scenarios, significantly improving people’s daily lives. Conventionally, a TOD system adheres to a fixed paradigm in which it must first extract user goals and query external databases before it can generate the final response. However, this fixed extract-and-query paradigm is not optimal for every dialogue turn: it is redundant for simple turns that need no external information and inadequate for complex turns that require multiple interactions with the external world. To address these limitations, we propose AgentTOD, a novel TOD framework that uses a large language model (LLM) as an intelligent agent to achieve a flexible dialogue paradigm. AgentTOD replaces the traditional modular architecture (including dialogue state tracking and dialogue policy) with an LLM controller that determines when and how to call the provided APIs to obtain external information. It can call APIs any number of times with various parameters until it has enough information to reply to the user. To train AgentTOD, we also construct a large and comprehensive TOD dataset, TrajsTOD (Trajectories of TODs), which consists of 66k+ user-agent dialogue trajectories converted from eight popular TOD datasets covering 60 domains. TrajsTOD requires only minimal dialogue annotation, namely the API calling logs, and gives AgentTOD the general ability to call APIs and generate responses according to the task definition. Extensive experimental results on the MultiWOZ-series and SGD datasets demonstrate that AgentTOD achieves superior performance on TODs as well as superior adaptability to new task scenarios.
KW - Intelligent Agents
KW - Large Language Models
KW - Task-Oriented Dialogue
UR - https://www.scopus.com/pages/publications/105018466395
U2 - 10.1145/3745021
DO - 10.1145/3745021
M3 - Article
AN - SCOPUS:105018466395
SN - 1046-8188
VL - 43
JO - ACM Transactions on Information Systems
JF - ACM Transactions on Information Systems
IS - 5
M1 - 136
ER -