TY - GEN
T1 - PITAR
T2 - SIGGRAPH Asia 2025 XR, SA 2025
AU - Jiang, Haiyan
AU - Qiu, Dongyu
AU - Stanescu, Ana
AU - Wang, Yidi
AU - Duh, Henry Been Lirn
AU - Guan, Frank
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/12/14
Y1 - 2025/12/14
N2 - We present PITAR, a large language model (LLM)-powered agent for intelligent and accurate manipulation in extended reality (XR) through multimodal interaction. PITAR integrates eye gaze, pointing gestures, and speech (particularly pronoun-based commands) to correctly infer user intent and control virtual objects. Using real-time data from an XR headset and a few-shot prompting strategy, PITAR reasons jointly over multimodal signals and a memory of the scenario to identify the target object and determine the desired action and interaction parameters. A prototype VR system implementing PITAR demonstrates intuitive, human-like communication between users and virtual environments, advancing the development of intelligent agents for immersive interaction.
AB - We present PITAR, a large language model (LLM)-powered agent for intelligent and accurate manipulation in extended reality (XR) through multimodal interaction. PITAR integrates eye gaze, pointing gestures, and speech (particularly pronoun-based commands) to correctly infer user intent and control virtual objects. Using real-time data from an XR headset and a few-shot prompting strategy, PITAR reasons jointly over multimodal signals and a memory of the scenario to identify the target object and determine the desired action and interaction parameters. A prototype VR system implementing PITAR demonstrates intuitive, human-like communication between users and virtual environments, advancing the development of intelligent agents for immersive interaction.
UR - https://www.scopus.com/pages/publications/105029415168
U2 - 10.1145/3761667.3761944
DO - 10.1145/3761667.3761944
M3 - Conference contribution
AN - SCOPUS:105029415168
T3 - Proceedings - SIGGRAPH Asia 2025 XR, SA 2025
BT - Proceedings - SIGGRAPH Asia 2025 XR, SA 2025
A2 - Spencer, Stephen N.
A2 - Komura, Taku
A2 - Peng, Evan Yifan
PB - Association for Computing Machinery, Inc
Y2 - 15 December 2025 through 18 December 2025
ER -