Abstract
In the new power system, uncertainty on both the source and load sides significantly increases power flow fluctuations. Security correction control can eliminate power flow limit violations and ensure the safe operation of the power grid. However, traditional security correction control methods involve numerous constraints and complex computation, making real-time multi-step decision-making difficult for large-scale power grids. Therefore, this paper proposes a two-stage training method based on the deep deterministic policy gradient (DDPG) algorithm to determine the security correction control strategy. First, the security correction control problem is combined with deep reinforcement learning: a Markov decision process (MDP) model of security correction is constructed by designing the state, action, and reward function. Second, a two-stage training framework is proposed to obtain the optimal correction strategy. In the imitation-learning pre-training stage, imitation learning based on an expert strategy provides the agent with an initial neural network and accelerates training. In the reinforcement-learning training stage, the agent is further trained through continuous interaction between the DDPG agent and the environment. The trained agent can then be applied in real time to obtain optimal decisions. Finally, the effectiveness of the proposed method is verified by a simulation case based on a provincial power grid in China.
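To make the two-stage training idea concrete, the sketch below shows the general shape of such a pipeline in PyTorch: behavior cloning on expert (state, action) pairs to initialize the actor, followed by standard DDPG updates. This is a minimal illustration, not the authors' implementation; the state/action dimensions, network sizes, the expert dataset, and all hyperparameters are hypothetical assumptions.

```python
# Minimal sketch of imitation-learning pre-training followed by DDPG updates.
# All dimensions, datasets, and hyperparameters are illustrative assumptions,
# not taken from the paper.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 64, 16  # e.g. line loadings / generator adjustments

def mlp(in_dim, out_dim, out_act=nn.Tanh):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                         nn.Linear(256, out_dim), out_act())

actor = mlp(STATE_DIM, ACTION_DIM)                      # a = pi(s)
critic = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
                       nn.Linear(256, 1))               # Q(s, a)

# ---- Stage 1: imitation-learning pre-training (behavior cloning) ----
# Fit the actor to expert correction strategies so DDPG starts from a
# reasonable initial policy instead of random weights.
def pretrain_actor(expert_states, expert_actions, epochs=50):
    opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(actor(expert_states), expert_actions)
        opt.zero_grad(); loss.backward(); opt.step()

# ---- Stage 2: DDPG training through interaction with the environment ----
def ddpg_update(batch, actor_tgt, critic_tgt, a_opt, c_opt, gamma=0.99):
    s, a, r, s2, done = batch
    with torch.no_grad():  # TD target computed with the target networks
        q_tgt = r + gamma * (1 - done) * critic_tgt(
            torch.cat([s2, actor_tgt(s2)], dim=1))
    c_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), q_tgt)
    c_opt.zero_grad(); c_loss.backward(); c_opt.step()
    # Deterministic policy gradient: adjust the actor to maximize Q(s, pi(s)).
    a_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    a_opt.zero_grad(); a_loss.backward(); a_opt.step()
```

In this reading of the abstract, once training converges the actor alone is deployed: given the current grid state, a single forward pass yields the corrective action, which is what enables real-time decision-making.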
| Translated title of the contribution | Power System Security Correction Control Based on Deep Reinforcement Learning |
|---|---|
| Original language | Chinese (Traditional) |
| Pages (from-to) | 121-129 |
| Number of pages | 9 |
| Journal | Dianli Xitong Zidonghua/Automation of Electric Power Systems |
| Volume | 47 |
| Issue number | 12 |
| DOIs | |
| Publication status | Published - 2023 |
| Externally published | Yes |