Representation Learning in Deep RL via Discrete Information Bottleneck

Riashat Islam; Hongyu Zang; Manan Tomar; Aniket Didolkar; Md Mofijul Islam; Samin Yeasar Arnob; Tariq Iqbal; Xin Li; Anirudh Goyal; Nicolas Heess; Alex Lamb

School of Computer Science

Research output: Contribution to journal › Conference article › Peer-reviewed

Abstract

Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, termed REPDIB, to learn structured factorized representations. Exploiting the expressiveness brought by factorized representations, we introduce a simple yet effective bottleneck that can be integrated with any existing self-supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state while ignoring irrelevant information.

Original language	English
Pages (from-to)	8699-8722
Number of pages	24
Journal	Proceedings of Machine Learning Research
Volume	206
Publication status	Published - 2023
Event	26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 - Valencia, Spain; Duration: 25 Apr 2023 → 27 Apr 2023

Cite this

Islam, R., Zang, H., Tomar, M., Didolkar, A., Islam, M. M., Arnob, S. Y., Iqbal, T., Li, X., Goyal, A., Heess, N., & Lamb, A. (2023). Representation Learning in Deep RL via Discrete Information Bottleneck. Proceedings of Machine Learning Research, 206, 8699-8722.

@article{c4f80bb3f5944d2fb4b8faf9304368b7,
title = "Representation Learning in Deep RL via Discrete Information Bottleneck",
abstract = "Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as REPDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state, while ignoring irrelevant information.",
author = "Riashat Islam and Hongyu Zang and Manan Tomar and Aniket Didolkar and Islam, {Md Mofijul} and Arnob, {Samin Yeasar} and Tariq Iqbal and Xin Li and Anirudh Goyal and Nicolas Heess and Alex Lamb",
note = "Publisher Copyright: Copyright {\textcopyright} 2023 by the author(s); 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 ; Conference date: 25-04-2023 Through 27-04-2023",
year = "2023",
language = "English",
volume = "206",
pages = "8699--8722",
journal = "Proceedings of Machine Learning Research",
issn = "2640-3498",
publisher = "ML Research Press",
}

TY  - JOUR
T1  - Representation Learning in Deep RL via Discrete Information Bottleneck
AU  - Islam, Riashat
AU  - Zang, Hongyu
AU  - Tomar, Manan
AU  - Didolkar, Aniket
AU  - Islam, Md Mofijul
AU  - Arnob, Samin Yeasar
AU  - Iqbal, Tariq
AU  - Li, Xin
AU  - Goyal, Anirudh
AU  - Heess, Nicolas
AU  - Lamb, Alex
PY  - 2023
Y1  - 2023
N2  - Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as REPDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state, while ignoring irrelevant information.
AB  - Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as REPDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state, while ignoring irrelevant information.
UR  - http://www.scopus.com/inward/record.url?scp=85165157845&partnerID=8YFLogxK
M3  - Conference article
AN  - SCOPUS:85165157845
SN  - 2640-3498
VL  - 206
SP  - 8699
EP  - 8722
JO  - Proceedings of Machine Learning Research
JF  - Proceedings of Machine Learning Research
T2  - 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023
Y2  - 25 April 2023 through 27 April 2023
ER  - 
