TY - JOUR
T1 - Principled Offline RL in the Presence of Rich Exogenous Information
AU - Islam, Riashat
AU - Tomar, Manan
AU - Lamb, Alex
AU - Efroni, Yonathan
AU - Zang, Hongyu
AU - Didolkar, Aniket
AU - Misra, Dipendra
AU - Li, Xin
AU - Van Seijen, Harm
AU - Des Combes, Remi Tachet
AU - Langford, John
N1 - Publisher Copyright:
© 2023 Proceedings of Machine Learning Research. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Learning to control an agent from offline data collected in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenous information, i.e., any control-irrelevant information contained in observations. For example, a robot navigating in busy streets needs to ignore irrelevant information, such as other people walking in the background, textures of objects, or birds in the sky. In this paper, we focus on the setting with visually detailed exogenous information and introduce new offline RL benchmarks that offer the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process, which is prevalent in practical applications. To address these, we propose to use multi-step inverse models to learn Agent-Centric Representations for Offline-RL (ACRO). Despite being simple and reward-free, we show theoretically and empirically that the representation created by this objective greatly outperforms baselines.
AB - Learning to control an agent from offline data collected in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenous information, i.e., any control-irrelevant information contained in observations. For example, a robot navigating in busy streets needs to ignore irrelevant information, such as other people walking in the background, textures of objects, or birds in the sky. In this paper, we focus on the setting with visually detailed exogenous information and introduce new offline RL benchmarks that offer the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process, which is prevalent in practical applications. To address these, we propose to use multi-step inverse models to learn Agent-Centric Representations for Offline-RL (ACRO). Despite being simple and reward-free, we show theoretically and empirically that the representation created by this objective greatly outperforms baselines.
UR - http://www.scopus.com/inward/record.url?scp=85174415375&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85174415375
SN - 2640-3498
VL - 202
SP - 14390
EP - 14421
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 40th International Conference on Machine Learning, ICML 2023
Y2 - 23 July 2023 through 29 July 2023
ER -