复杂动态环境下基于深度强化学习的AGV避障方法

Ze Cai; Yaoguang Hu; Jingqian Wen; Lixiang Zhang

doi:10.13196/j.cims.2023.01.020

Translated title of the contribution: Collision avoidance for AGV based on deep reinforcement learning in complex dynamic environment

Ze Cai, Yaoguang Hu^*, Jingqian Wen, Lixiang Zhang

^*Corresponding author for this work

School of Mechanical Engineering

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

To improve the collision avoidance capability of Automated Guided Vehicles (AGVs) in the complex dynamic environment of smart factories, and to enable them to carry out material handling tasks more safely and efficiently while following the global path, a local collision avoidance method based on deep reinforcement learning was proposed. The AGV collision avoidance problem was formulated as a Partially Observable Markov Decision Process (POMDP), in which the observation space, action space and reward function were defined in detail. Tracking of the global path was achieved by setting different reward values. A Deep Deterministic Policy Gradient (DDPG) method was then implemented to solve for the collision avoidance policy. The trained policy was validated in various simulated scenarios, which demonstrated its effectiveness. The experimental results showed that the proposed approach could respond to the complex dynamic environment and reduce the time and distance of collision avoidance.
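The abstract does not give the reward values or the exact reward terms, so the following is only a minimal sketch of how global-path tracking might be encouraged through reward shaping. Every name and number in it (`RewardConfig`, `step_reward`, the weights, radii and penalty values) is a hypothetical assumption for illustration and is not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class RewardConfig:
    # All values are illustrative assumptions, not the paper's settings.
    collision_penalty: float = -10.0  # terminal penalty on collision
    goal_reward: float = 10.0         # terminal reward on reaching the goal
    progress_weight: float = 1.0      # weight on distance gained toward the goal
    path_weight: float = 0.5          # weight on deviation from the global path
    goal_radius: float = 0.2          # metres; goal counts as reached inside this
    collision_radius: float = 0.3     # metres; obstacle closer than this = collision


def distance_to_segment(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def step_reward(prev_pos, pos, goal, path, nearest_obstacle_dist, cfg=None):
    """Hypothetical per-step reward: collision and goal terms first,
    otherwise progress toward the goal minus deviation from the global path."""
    cfg = cfg or RewardConfig()
    if nearest_obstacle_dist <= cfg.collision_radius:
        return cfg.collision_penalty
    goal_dist = math.dist(pos, goal)
    if goal_dist <= cfg.goal_radius:
        return cfg.goal_reward
    progress = math.dist(prev_pos, goal) - goal_dist
    deviation = min(distance_to_segment(pos, a, b) for a, b in zip(path, path[1:]))
    return cfg.progress_weight * progress - cfg.path_weight * deviation


if __name__ == "__main__":
    path = [(0.0, 0.0), (10.0, 0.0)]  # global path as a list of waypoints
    goal = path[-1]
    print(step_reward((0.0, 0.0), (1.0, 0.0), goal, path, 5.0))  # stays on the path
    print(step_reward((0.0, 0.0), (1.0, 1.0), goal, path, 5.0))  # drifts off the path
```

Under these assumed weights, a step that stays on the global path earns a higher reward than a step of similar length that drifts away from it. That is one way "setting different reward values" could bias a learned policy toward returning to the path after an avoidance manoeuvre.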

Translated title of the contribution	Collision avoidance for AGV based on deep reinforcement learning in complex dynamic environment
Original language	Chinese (Traditional)
Pages (from-to)	236-245
Number of pages	10
Journal	Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS
Volume	29
Issue number	1
DOIs	https://doi.org/10.13196/j.cims.2023.01.020
Publication status	Published - 31 Jan 2023

Cite this

@article{25cd79c3555447fe8ed91d7c130ac283,

title = "复杂动态环境下基于深度强化学习的AGV避障方法",

abstract = "To improve the collision avoidance capability of Automated Guided Vehicles (AGVs) in the complex dynamic environment of smart factories, and to enable them to carry out material handling tasks more safely and efficiently while following the global path, a local collision avoidance method based on deep reinforcement learning was proposed. The AGV collision avoidance problem was formulated as a Partially Observable Markov Decision Process (POMDP), in which the observation space, action space and reward function were defined in detail. Tracking of the global path was achieved by setting different reward values. A Deep Deterministic Policy Gradient (DDPG) method was then implemented to solve for the collision avoidance policy. The trained policy was validated in various simulated scenarios, which demonstrated its effectiveness. The experimental results showed that the proposed approach could respond to the complex dynamic environment and reduce the time and distance of collision avoidance.",

keywords = "deep reinforcement learning, dynamic collision avoidance, smart factory, tracking of global path",

author = "Ze Cai and Yaoguang Hu and Jingqian Wen and Lixiang Zhang",

year = "2023",

month = jan,

day = "31",

doi = "10.13196/j.cims.2023.01.020",

language = "Chinese (Traditional)",

volume = "29",

pages = "236--245",

journal = "Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS",

issn = "1006-5911",

publisher = "Computer Integrated Manufacturing Systems",

number = "1",

}

TY - JOUR

T1 - 复杂动态环境下基于深度强化学习的AGV避障方法

AU - Cai, Ze

AU - Hu, Yaoguang

AU - Wen, Jingqian

AU - Zhang, Lixiang

PY - 2023/1/31

Y1 - 2023/1/31

N2 - To improve the collision avoidance capability of Automated Guided Vehicles (AGVs) in the complex dynamic environment of smart factories, and to enable them to carry out material handling tasks more safely and efficiently while following the global path, a local collision avoidance method based on deep reinforcement learning was proposed. The AGV collision avoidance problem was formulated as a Partially Observable Markov Decision Process (POMDP), in which the observation space, action space and reward function were defined in detail. Tracking of the global path was achieved by setting different reward values. A Deep Deterministic Policy Gradient (DDPG) method was then implemented to solve for the collision avoidance policy. The trained policy was validated in various simulated scenarios, which demonstrated its effectiveness. The experimental results showed that the proposed approach could respond to the complex dynamic environment and reduce the time and distance of collision avoidance.

AB - To improve the collision avoidance capability of Automated Guided Vehicles (AGVs) in the complex dynamic environment of smart factories, and to enable them to carry out material handling tasks more safely and efficiently while following the global path, a local collision avoidance method based on deep reinforcement learning was proposed. The AGV collision avoidance problem was formulated as a Partially Observable Markov Decision Process (POMDP), in which the observation space, action space and reward function were defined in detail. Tracking of the global path was achieved by setting different reward values. A Deep Deterministic Policy Gradient (DDPG) method was then implemented to solve for the collision avoidance policy. The trained policy was validated in various simulated scenarios, which demonstrated its effectiveness. The experimental results showed that the proposed approach could respond to the complex dynamic environment and reduce the time and distance of collision avoidance.

KW - deep reinforcement learning

KW - dynamic collision avoidance

KW - smart factory

KW - tracking of global path

UR - http://www.scopus.com/inward/record.url?scp=85151539780&partnerID=8YFLogxK

U2 - 10.13196/j.cims.2023.01.020

DO - 10.13196/j.cims.2023.01.020

M3 - Article

AN - SCOPUS:85151539780

SN - 1006-5911

VL - 29

SP - 236

EP - 245

JO - Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS

JF - Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS

IS - 1

ER -
