Power System Security Correction Control Based on Deep Reinforcement Learning (基于深度强化学习的电力系统安全校正控制)

Yidi Wang, Lixin Li*, Yijun Yu, Nan Yang, Meng Liu, Tong Li

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

In the new power system, uncertainty on both the source and load sides significantly increases power flow fluctuations. Power system security correction control eliminates power flow over-limits and keeps the grid operating safely. However, traditional security correction control methods involve many constraints and complex calculations, which makes real-time multi-step decision-making difficult for large-scale power grids. This paper therefore proposes a two-stage training method based on deep deterministic policy gradient (DDPG) to determine the security correction control strategy. First, the security correction control problem is combined with deep reinforcement learning: a Markov decision process (MDP) model of security correction is constructed by designing the state, action, and reward function. Second, a two-stage training framework is proposed to obtain the optimal correction strategy. In the imitation-learning pre-training stage, imitation learning based on an expert strategy provides the agent with an initial neural network and speeds up training. In the reinforcement-learning training stage, the DDPG agent is trained further through continuous interaction with the environment. The trained agent can then be applied in real time to obtain the optimal decision. Finally, the effectiveness of the proposed method is verified in a simulation case based on a provincial power grid in China.
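The MDP elements and the imitation-learning pre-training stage described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-line, two-generator toy grid, the sensitivity matrix `SENS`, the line ratings `LIMIT`, the reward weights, the stand-in expert strategy, and the linear actor fitted by least squares in place of a neural network. The DDPG training stage is not shown.

```python
import numpy as np

# Hypothetical toy grid: 2 monitored lines, 2 adjustable generators.
# All numbers are illustrative assumptions, not values from the paper.
SENS = np.array([[0.5, -0.2],
                 [0.1,  0.4]])    # line-flow change (MW) per MW of generator adjustment
LIMIT = np.array([100.0, 100.0])  # assumed line ratings (MW)

def excess(flow):
    """Signed over-limit part of each line flow; zero for lines within limits."""
    return np.sign(flow) * np.maximum(np.abs(flow) - LIMIT, 0.0)

def overload(flow):
    """Total power flow over-limit across all lines (MW)."""
    return np.abs(excess(flow)).sum()

def step(flow, action):
    """One MDP transition: state = line flows, action = generator adjustments."""
    next_flow = flow + SENS @ action
    # Assumed reward: penalize remaining over-limit plus a small adjustment cost.
    reward = -overload(next_flow) - 0.01 * np.abs(action).sum()
    return next_flow, reward

def expert_action(flow):
    """Stand-in expert strategy: the adjustment that exactly cancels the excess."""
    return np.linalg.solve(SENS, -excess(flow))

def pretrain_actor(n_samples=500, seed=0):
    """Imitation pre-training (behavior cloning): fit a linear actor to expert actions."""
    rng = np.random.default_rng(seed)
    flows = rng.uniform(-150.0, 150.0, size=(n_samples, 2))
    features = np.array([excess(f) for f in flows])
    targets = np.array([expert_action(f) for f in flows])
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights  # shape (2, 2); action = excess(flow) @ weights

weights = pretrain_actor()

def actor(flow):
    """Pre-trained policy that a reinforcement learning stage would refine further."""
    return excess(flow) @ weights

flow = np.array([130.0, -120.0])  # both lines start over their limits
action = actor(flow)
next_flow, reward = step(flow, action)
print("over-limit before:", overload(flow), "after:", overload(next_flow))
```

In this toy setting the expert is linear in the over-limit vector, so the cloned actor reproduces it almost exactly. In a realistic grid the expert would be more complex and the imitation would be imperfect, which is the gap the second training stage is meant to close through interaction with the environment.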

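Translated title of the contribution: Power System Security Correction Control Based on Deep Reinforcement Learning
Original language: Traditional Chinese
Pages (from-to): 121-129
Number of pages: 9
Journal: Dianli Xitong Zidonghua/Automation of Electric Power Systems
Volume: 47
Issue number: 12
DOI
Publication status: Published - 2023
Externally published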

Keywords

  • deep reinforcement learning
  • imitation learning
  • power flow over-limit
  • security correction control
