TY - JOUR
T1 - Personalized decision-making for agents in face-to-face interaction in virtual reality
AU - Dongye, Xiaonuo
AU - Weng, Dongdong
AU - Jiang, Haiyan
AU - Tian, Zeyu
AU - Bao, Yihua
AU - Chen, Pukun
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.
PY - 2025/2
Y1 - 2025/2
N2 - Intelligent agents for face-to-face interaction in virtual reality are expected to make decisions and provide appropriate feedback based on the user’s multimodal interaction inputs. Designing the agent’s decision-making process poses a significant challenge owing to the limited availability of multimodal interaction decision-making datasets and the complexities associated with providing personalized interaction feedback to diverse users. To overcome these challenges, we propose a novel design framework that involves generating and labeling symbolic interaction data, pre-training a small-scale real-time decision-making network, collecting personalized interaction data within interactions, and fine-tuning the network using personalized data. We develop a prototype system to demonstrate our design framework, which utilizes interaction distances, head orientations, and hand postures as inputs in virtual reality. The agent is capable of delivering personalized feedback to different users. We evaluate the proposed design framework by demonstrating the utilization of large language models for data labeling, emphasizing reliability and robustness. Furthermore, we evaluate the incorporation of personalized data fine-tuning for decision-making networks within our design framework, highlighting its importance in improving the user interaction experience. The design principles of this framework can be further explored and applied to various domains involving virtual agents.
KW - Face-to-face interaction
KW - Human–agent interaction
KW - Large language model
KW - Virtual reality
UR - http://www.scopus.com/inward/record.url?scp=85212947825&partnerID=8YFLogxK
DO - 10.1007/s00530-024-01591-7
M3 - Article
AN - SCOPUS:85212947825
SN - 0942-4962
VL - 31
JO - Multimedia Systems
JF - Multimedia Systems
IS - 1
M1 - 28
ER -