AgentTOD: A Task-Oriented Dialogue Agent with a Flexible and Adaptive API Calling Paradigm

Research output: Contribution to journalArticlepeer-review

Abstract

Task-oriented dialogue (TOD) systems play a vital role in numerous assistance and service scenarios, significantly improving people’s daily lives. Conventionally, a TOD system adheres to a fixed paradigm, where it must first extract user goals and query external databases before it can generate the final response. However, this fixed extract-and-query paradigm is not always optimal for all dialogue turns, which is redundant for the simple turns that do not need external information, and is inadequate for the complex turns that need to interact with the external world multiple times. To address the limitations, in this article, we propose AgentTOD, a novel TOD framework that uses a large language model (LLM) as the intelligent agent to achieve a flexible dialogue paradigm. AgentTOD deprecates the traditional modular architecture (including dialogue state tracking and dialogue policy) by utilizing an LLM as the controller brain to determine when and how to call the provided APIs to obtain external information. It can choose to call APIs any number of times with various parameters until it’s enough to reply to the user. Besides, to train AgentTOD, we construct a large and comprehensive TOD dataset, called TrajsTOD (Trajectories of TODs), which consists of 66k+ user-agent dialogue trajectories converted from eight popular TOD datasets covering 60 domains. TrajsTOD is constructed with minimal dialogue annotations where only the API calling logs are needed and can empower AgentTOD with the general ability to call APIs and generate responses according to the task definition. Extensive experimental results on the MultiWOZ-series and SGD datasets demonstrate AgentTOD has superior performance on TODs as well as a superior adaptability to new task scenarios.

Original languageEnglish
Article number136
JournalACM Transactions on Information Systems
Volume43
Issue number5
DOIs
Publication statusPublished - 8 Aug 2025
Externally publishedYes

Keywords

  • Intelligent Agents
  • Large Language Models
  • Task-Oriented Dialogue

Fingerprint

Dive into the research topics of 'AgentTOD: A Task-Oriented Dialogue Agent with a Flexible and Adaptive API Calling Paradigm'. Together they form a unique fingerprint.

Cite this