TY - GEN
T1 - Observation-Time-Action Deep Stacking Strategy
T2 - 2024 International Joint Conference on Neural Networks, IJCNN 2024
AU - Jiang, Keyang
AU - Wang, Qiang
AU - Xu, Yahao
AU - Deng, Hongbin
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Reinforcement learning tasks involving visual input continue to pose a challenge in the presence of partial observability. Although prior research has introduced methods such as LSTM, GTrXL, and DNC, each of these approaches has its own limitations. To address partial observability in a more universal context, this paper proposes the Observation-Time-Action deep stacking algorithm. First, observations, actions, and time data are combined into a tuple and stacked into a longer sequence. Then, convolutional and fully connected layers extract relevant features from the sequence, which are fed into the algorithm for processing. We designed a number of partially observable experiments corresponding to typical reinforcement learning scenarios. The experimental results demonstrate that the proposed method achieves a higher success rate. Moreover, we investigated the effect of stacking frame length and of different reinforcement learning elements on the algorithm. Finally, we conducted a Hardware-in-the-Loop (HITL) experiment to further verify the effectiveness of our algorithm.
AB - Reinforcement learning tasks involving visual input continue to pose a challenge in the presence of partial observability. Although prior research has introduced methods such as LSTM, GTrXL, and DNC, each of these approaches has its own limitations. To address partial observability in a more universal context, this paper proposes the Observation-Time-Action deep stacking algorithm. First, observations, actions, and time data are combined into a tuple and stacked into a longer sequence. Then, convolutional and fully connected layers extract relevant features from the sequence, which are fed into the algorithm for processing. We designed a number of partially observable experiments corresponding to typical reinforcement learning scenarios. The experimental results demonstrate that the proposed method achieves a higher success rate. Moreover, we investigated the effect of stacking frame length and of different reinforcement learning elements on the algorithm. Finally, we conducted a Hardware-in-the-Loop (HITL) experiment to further verify the effectiveness of our algorithm.
KW - Partial Observability Problems
KW - Reinforcement learning
KW - Visual perception
UR - http://www.scopus.com/inward/record.url?scp=85204994951&partnerID=8YFLogxK
U2 - 10.1109/IJCNN60899.2024.10650736
DO - 10.1109/IJCNN60899.2024.10650736
M3 - Conference contribution
AN - SCOPUS:85204994951
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 30 June 2024 through 5 July 2024
ER -