TY - GEN
T1 - Application of adaptive critic design on angle bracket inverted pendulum control
AU - Wang, Zhen Yu
AU - Dai, Ya Ping
AU - Li, Yong Wei
AU - Yao, Yuan
PY - 2010
Y1 - 2010
N2 - The angle bracket inverted pendulum is a complex nonlinear system composed of a cart, a pole, and a ramp supported by a bracket. Compared with a normal inverted pendulum system, the angle bracket inverted pendulum has an inertia weight caused by gravity acting downward along the ramp. This characteristic of the system increases the difficulty of balancing the pole. In this paper, the adaptive dynamic programming method is used to control the system. Two neural networks are designed separately to estimate the cost-to-go function and to output the control action through continuous learning. The utility function design takes the influence of inertia into account, which makes the estimate of the system cost more exact. In the output layer of the action network, we use the S function as the transfer function, which makes the control action outputs continuous variables. Furthermore, we add a compensation part to decrease the influence caused by the inertia factor. Simulation results show that the method performs well and that it is feasible for the adaptive dynamic programming method to solve the inertia problem.
AB - The angle bracket inverted pendulum is a complex nonlinear system composed of a cart, a pole, and a ramp supported by a bracket. Compared with a normal inverted pendulum system, the angle bracket inverted pendulum has an inertia weight caused by gravity acting downward along the ramp. This characteristic of the system increases the difficulty of balancing the pole. In this paper, the adaptive dynamic programming method is used to control the system. Two neural networks are designed separately to estimate the cost-to-go function and to output the control action through continuous learning. The utility function design takes the influence of inertia into account, which makes the estimate of the system cost more exact. In the output layer of the action network, we use the S function as the transfer function, which makes the control action outputs continuous variables. Furthermore, we add a compensation part to decrease the influence caused by the inertia factor. Simulation results show that the method performs well and that it is feasible for the adaptive dynamic programming method to solve the inertia problem.
KW - Action-dependent adaptive dynamic programming
KW - Adaptive dynamic critic designs
KW - Angle bracket inverted pendulum
KW - Neural network
UR - http://www.scopus.com/inward/record.url?scp=78149327196&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2010.5580633
DO - 10.1109/ICMLC.2010.5580633
M3 - Conference contribution
AN - SCOPUS:78149327196
SN - 9781424465262
T3 - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
SP - 2198
EP - 2203
BT - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
T2 - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Y2 - 11 July 2010 through 14 July 2010
ER -