TY - JOUR
T1 - Learning Fused State Representations for Control from Multi-View Observations
AU - Wang, Zeyu
AU - Li, Yao Hui
AU - Li, Xin
AU - Zang, Hongyu
AU - Laroche, Romain
AU - Islam, Riashat
N1 - Publisher Copyright:
© 2025 by the author(s).
PY - 2025
Y1 - 2025
N2 - Multi-View Reinforcement Learning (MVRL) seeks to provide agents with multi-view observations, enabling them to perceive environment with greater effectiveness and precision. Recent advancements in MVRL focus on extracting latent representations from multiview observations and leveraging them in control tasks. However, it is not straightforward to learn compact and task-relevant representations, particularly in the presence of redundancy, distracting information, or missing views. In this paper, we propose Multi-view Fusion State for Control (MFSC), firstly incorporating bisimulation metric learning into MVRL to learn task-relevant representations. Furthermore, we propose a multiview-based mask and latent reconstruction auxiliary task that exploits shared information across views and improves MFSC’s robustness in missing views by introducing a mask token. Extensive experimental results demonstrate that our method outperforms existing approaches in MVRL tasks. Even in more realistic scenarios with interference or missing views, MFSC consistently maintains high performance. The project code is available at https://github.com/zpwdev/MFSC.
AB - Multi-View Reinforcement Learning (MVRL) seeks to provide agents with multi-view observations, enabling them to perceive environment with greater effectiveness and precision. Recent advancements in MVRL focus on extracting latent representations from multiview observations and leveraging them in control tasks. However, it is not straightforward to learn compact and task-relevant representations, particularly in the presence of redundancy, distracting information, or missing views. In this paper, we propose Multi-view Fusion State for Control (MFSC), firstly incorporating bisimulation metric learning into MVRL to learn task-relevant representations. Furthermore, we propose a multiview-based mask and latent reconstruction auxiliary task that exploits shared information across views and improves MFSC’s robustness in missing views by introducing a mask token. Extensive experimental results demonstrate that our method outperforms existing approaches in MVRL tasks. Even in more realistic scenarios with interference or missing views, MFSC consistently maintains high performance. The project code is available at https://github.com/zpwdev/MFSC.
UR - https://www.scopus.com/pages/publications/105023463917
M3 - Conference article
AN - SCOPUS:105023463917
SN - 2640-3498
VL - 267
SP - 63365
EP - 63386
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 42nd International Conference on Machine Learning, ICML 2025
Y2 - 13 July 2025 through 19 July 2025
ER -