Representation Learning in Deep RL via Discrete Information Bottleneck

Riashat Islam; Hongyu Zang; Manan Tomar; Aniket Didolkar; Md Mofijul Islam; Samin Yeasar Arnob; Tariq Iqbal; Xin Li; Anirudh Goyal; Nicolas Heess; Alex Lamb

School of Computer Science and Technology

Research output: Contribution to journal › Conference article › peer-review

Abstract

Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real-world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, termed REPDIB, to learn structured factorized representations. Exploiting the expressiveness offered by factorized representations, we introduce a simple yet effective bottleneck that can be integrated with any existing self-supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state while ignoring irrelevant information.

Original language	English
Pages (from-to)	8699-8722
Number of pages	24
Journal	Proceedings of Machine Learning Research
Volume	206
Publication status	Published - 2023
Event	26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 - Valencia, Spain
Duration	25 Apr 2023 → 27 Apr 2023

Cite this

@article{c4f80bb3f5944d2fb4b8faf9304368b7,
  title = "Representation Learning in Deep RL via Discrete Information Bottleneck",
  abstract = "Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as REPDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state, while ignoring irrelevant information.",
  author = "Riashat Islam and Hongyu Zang and Manan Tomar and Aniket Didolkar and Islam, {Md Mofijul} and Arnob, {Samin Yeasar} and Tariq Iqbal and Xin Li and Anirudh Goyal and Nicolas Heess and Alex Lamb",
  note = "Publisher Copyright: Copyright {\textcopyright} 2023 by the author(s); 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 ; Conference date: 25-04-2023 Through 27-04-2023",
  year = "2023",
  language = "English",
  volume = "206",
  pages = "8699--8722",
  journal = "Proceedings of Machine Learning Research",
  issn = "2640-3498",
  publisher = "ML Research Press",
}

TY - JOUR

T1 - Representation Learning in Deep RL via Discrete Information Bottleneck

AU - Islam, Riashat

AU - Zang, Hongyu

AU - Tomar, Manan

AU - Didolkar, Aniket

AU - Islam, Md Mofijul

AU - Arnob, Samin Yeasar

AU - Iqbal, Tariq

AU - Li, Xin

AU - Goyal, Anirudh

AU - Heess, Nicolas

AU - Lamb, Alex

PY - 2023

Y1 - 2023

N2 - Several self-supervised representation learning methods have been proposed for reinforcement learning (RL) with rich observations. For real world applications of RL, recovering underlying latent states is crucial, particularly when sensory inputs contain irrelevant and exogenous information. In this work, we study how information bottlenecks can be used to construct latent states efficiently in the presence of task irrelevant information. We propose architectures that utilize variational and discrete information bottlenecks, coined as REPDIB, to learn structured factorized representations. Exploiting the expressiveness bought by factorized representations, we introduce a simple, yet effective, bottleneck that can be integrated with any existing self supervised objective for RL. We demonstrate this across several online and offline RL benchmarks, along with a real robot arm task, where we find that compressed representations with REPDIB can lead to strong performance improvements, as the learnt bottlenecks help predict only the relevant state, while ignoring irrelevant information.

UR - http://www.scopus.com/inward/record.url?scp=85165157845&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:85165157845

SN - 2640-3498

VL - 206

SP - 8699

EP - 8722

JO - Proceedings of Machine Learning Research

JF - Proceedings of Machine Learning Research

T2 - 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023

Y2 - 25 April 2023 through 27 April 2023

ER -
