LLM-Guided Reinforcement Learning for Interactive Environments

Fuxue Yang, Jiawen Liu, Kan Li*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

We propose LLM-Guided Reinforcement Learning (LGRL), a novel framework that leverages large language models (LLMs) to decompose high-level objectives into a sequence of manageable subgoals in interactive environments. Our approach decouples high-level planning from low-level action execution by dynamically generating context-aware subgoals that guide the reinforcement learning (RL) agent. During training, intermediate subgoals—each associated with partial rewards—are generated based on the agent’s current progress, providing fine-grained feedback that facilitates structured exploration and accelerates convergence. At inference time, a chain-of-thought strategy enables the LLM to adaptively update subgoals in response to evolving environmental states. Although demonstrated in a representative interactive setting, our method generalizes to a wide range of complex, goal-oriented tasks. Experimental results show that LGRL achieves higher success rates, improved efficiency, and faster convergence compared to baseline approaches.
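To illustrate the training scheme described in the abstract, the following is a minimal, hedged sketch of how an LLM-proposed subgoal sequence might add partial rewards on top of the environment reward during an RL episode. The helper names (`llm_propose_subgoals`, `subgoal_achieved`), the Gym-style `env` interface, the `agent.act`/`agent.update` methods, and the reward value are illustrative assumptions, not the authors' implementation.

```python
import random


def llm_propose_subgoals(high_level_goal, progress_summary):
    """Stand-in for an LLM call that decomposes the goal into
    context-aware subgoals given the agent's current progress.
    In practice this would prompt a large language model."""
    return [f"{high_level_goal}: step {i}" for i in range(3)]


def subgoal_achieved(subgoal, state):
    """Stand-in check for whether the current state satisfies a subgoal."""
    return random.random() < 0.1


def train_episode(env, agent, high_level_goal, partial_reward=0.1):
    """One training episode: LLM-generated subgoals contribute partial
    rewards on top of the environment reward, giving the RL agent
    denser, fine-grained feedback (assumed Gym-style env and an agent
    exposing act/update)."""
    state = env.reset()
    subgoals = llm_propose_subgoals(high_level_goal, progress_summary="start")
    done = False
    while not done:
        action = agent.act(state)
        next_state, env_reward, done, _ = env.step(action)
        reward = env_reward
        if subgoals and subgoal_achieved(subgoals[0], next_state):
            reward += partial_reward  # partial reward for completing a subgoal
            subgoals.pop(0)
            # Re-query the LLM so the remaining subgoals track current progress.
            subgoals = llm_propose_subgoals(
                high_level_goal, progress_summary=str(next_state)
            )
        agent.update(state, action, reward, next_state, done)
        state = next_state
```

At inference time, the same subgoal-generation call would be wrapped in a chain-of-thought prompt and re-invoked as the environment state evolves, rather than being used to shape rewards.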

Original language: English
Article number: 1932
Journal: Mathematics
Volume: 13
Issue number: 12
DOIs
Publication status: Published - Jun 2025
Externally published: Yes

Keywords

  • chain of thought
  • large language models
  • reinforcement learning
