TY - GEN
T1 - Smooth Actor-Critic Algorithm for End-to-End Autonomous Driving
AU - Song, Wenjie
AU - Liu, Shixian
AU - Li, Yujun
AU - Yang, Yi
AU - Xiang, Changle
N1 - Publisher Copyright: © 2020 AACC.
PY - 2020/7
Y1 - 2020/7
N2 - For the intelligent sequential decision-making tasks like autonomous driving, decisions or actions made by the agent in a short period of time should be smooth enough or not too choppy. In order to help the agent learn smooth actions (steering, accelerating, braking) for autonomous driving, this paper proposes the smooth actor-critic algorithm for both deterministic policy and stochastic policy systems. Specifically, a regularization term is added to the objective function of actor-critic methods to constrain the difference between neighbouring actions in a small region without affecting the convergence performance of the whole system. Then, the theoretical analysis and proof for the modified methods are conducted so that it can be theoretically guaranteed in terms of iterative improvements. Moreover, experiments in different simulation systems also prove that the methods can generate much smoother actions and obtain more robust performance for reinforcement learning-based End-to-End autonomous driving.
UR - http://www.scopus.com/inward/record.url?scp=85089591388&partnerID=8YFLogxK
U2 - 10.23919/ACC45564.2020.9147960
DO - 10.23919/ACC45564.2020.9147960
M3 - Conference contribution
AN - SCOPUS:85089591388
T3 - Proceedings of the American Control Conference
SP - 3242
EP - 3248
BT - 2020 American Control Conference, ACC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 American Control Conference, ACC 2020
Y2 - 1 July 2020 through 3 July 2020
ER -