Personalized decision-making for agents in face-to-face interaction in virtual reality

Xiaonuo Dongye, Dongdong Weng*, Haiyan Jiang*, Zeyu Tian, Yihua Bao, Pukun Chen

*Corresponding authors for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Intelligent agents for face-to-face interaction in virtual reality are expected to make decisions and provide appropriate feedback based on the user’s multimodal interaction inputs. Designing the agent’s decision-making process poses a significant challenge owing to the limited availability of multimodal interaction decision-making datasets and the complexity of providing personalized interaction feedback to diverse users. To overcome these challenges, we propose a novel design framework that involves generating and labeling symbolic interaction data, pre-training a small-scale real-time decision-making network, collecting personalized interaction data during interactions, and fine-tuning the network on the personalized data. We develop a prototype system to demonstrate the framework, which uses interaction distances, head orientations, and hand postures as inputs in virtual reality. The agent is capable of delivering personalized feedback to different users. We evaluate the proposed framework by demonstrating the use of large language models for data labeling, emphasizing reliability and robustness. Furthermore, we evaluate the incorporation of personalized-data fine-tuning for decision-making networks within our framework, highlighting its importance in improving the user interaction experience. The design principles of this framework can be further explored and applied to other domains involving virtual agents.
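The pipeline the abstract describes (pre-train a small real-time decision network on generated, label-assigned symbolic data, then fine-tune it on a user's personalized interaction data) can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' implementation: the feature layout (distance, head orientation, hand posture flags), the three-action output, the toy labeling rule, and the network size are all assumptions made here for demonstration.

```python
import numpy as np

# Assumed feature layout: [distance, head_yaw, head_pitch, hand_open, hand_wave]
N_FEATURES = 5
# Assumed action set: 0 = wait, 1 = approach, 2 = greet (illustrative only)
N_ACTIONS = 3

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class DecisionNet:
    """Small one-hidden-layer MLP mapping multimodal features to action
    probabilities -- a stand-in for the paper's real-time decision network."""

    def __init__(self, hidden=16):
        self.W1 = rng.normal(0, 0.1, (N_FEATURES, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0, 0.1, (hidden, N_ACTIONS))
        self.b2 = np.zeros(N_ACTIONS)

    def forward(self, x):
        h = np.tanh(x @ self.W1 + self.b1)
        return softmax(h @ self.W2 + self.b2), h

    def train_step(self, x, y, lr=0.1):
        """One cross-entropy gradient step. The same routine serves both
        pre-training and per-user fine-tuning (smaller lr, smaller batch)."""
        p, h = self.forward(x)
        g = p.copy()
        g[np.arange(len(y)), y] -= 1.0          # d(loss)/d(logits)
        g /= len(y)
        gW2, gb2 = h.T @ g, g.sum(0)
        gh = (g @ self.W2.T) * (1.0 - h ** 2)   # backprop through tanh
        gW1, gb1 = x.T @ gh, gh.sum(0)
        self.W2 -= lr * gW2; self.b2 -= lr * gb2
        self.W1 -= lr * gW1; self.b1 -= lr * gb1
        return -np.log(p[np.arange(len(y)), y] + 1e-12).mean()

# Stage 1: pre-train on generated symbolic data with automatic labels
# (here a toy rule standing in for LLM-produced labels).
X_pre = rng.normal(size=(256, N_FEATURES))
y_pre = (X_pre[:, 0] > 0).astype(int)
net = DecisionNet()
for _ in range(200):
    net.train_step(X_pre, y_pre, lr=0.5)

# Stage 2: fine-tune on a small personalized batch collected in-interaction.
X_user = rng.normal(size=(16, N_FEATURES))
y_user = np.full(16, 2)  # pretend this user consistently prefers "greet"
for _ in range(100):
    net.train_step(X_user, y_user, lr=0.05)

probs, _ = net.forward(X_user)
```

The design point the sketch illustrates is that pre-training supplies a generic interaction policy from cheap synthetic data, while a brief low-learning-rate fine-tuning pass shifts the network's predictions toward one user's observed preferences without retraining from scratch.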

Original language: English
Article number: 28
Journal: Multimedia Systems
Volume: 31
Issue number: 1
Publication status: Published - Feb 2025

Keywords

  • Face-to-face interaction
  • Human–agent interaction
  • Large language model
  • Virtual reality
