TY - JOUR
T1 - Approximate Nash Solutions for Multiplayer Mixed-Zero-Sum Game with Reinforcement Learning
AU - Lv, Yongfeng
AU - Ren, Xuemei
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2019/12
Y1 - 2019/12
N2 - Inspired by Nash game theory, a multiplayer mixed-zero-sum (MZS) nonlinear game covering two situations [zero-sum and nonzero-sum (NZS) Nash games] is proposed in this paper. A synchronous reinforcement learning (RL) scheme based on the identifier-critic structure is developed to learn the Nash equilibrium solution of the proposed MZS game. First, the MZS game formulation is presented: performance indexes are defined for the NZS Nash game among players 1 to $N-1$ and player $N$, and another performance index is defined for the zero-sum game between players $N$ and $N+1$, such that player $N$ cooperates with players 1 to $N-1$ while competing with player $N+1$, which leads to a Nash equilibrium of all players. A single-layer neural network (NN) is then used to approximate the unknown dynamics of the nonlinear game system. Finally, an NN-based RL scheme is developed to learn the optimal performance indexes, which yield the optimal control policy of every player such that the Nash equilibrium can be obtained; thus, the actor NN widely used in the RL literature is not needed. To this end, a recently proposed adaptive law is used to estimate the unknown identifier coefficient vectors, and an improved adaptive law with an error performance index is further developed to update the critic coefficient vectors. Both linear and nonlinear simulations are presented to demonstrate the existence of the Nash equilibrium for the MZS game and the performance of the proposed algorithm.
AB - Inspired by Nash game theory, a multiplayer mixed-zero-sum (MZS) nonlinear game covering two situations [zero-sum and nonzero-sum (NZS) Nash games] is proposed in this paper. A synchronous reinforcement learning (RL) scheme based on the identifier-critic structure is developed to learn the Nash equilibrium solution of the proposed MZS game. First, the MZS game formulation is presented: performance indexes are defined for the NZS Nash game among players 1 to $N-1$ and player $N$, and another performance index is defined for the zero-sum game between players $N$ and $N+1$, such that player $N$ cooperates with players 1 to $N-1$ while competing with player $N+1$, which leads to a Nash equilibrium of all players. A single-layer neural network (NN) is then used to approximate the unknown dynamics of the nonlinear game system. Finally, an NN-based RL scheme is developed to learn the optimal performance indexes, which yield the optimal control policy of every player such that the Nash equilibrium can be obtained; thus, the actor NN widely used in the RL literature is not needed. To this end, a recently proposed adaptive law is used to estimate the unknown identifier coefficient vectors, and an improved adaptive law with an error performance index is further developed to update the critic coefficient vectors. Both linear and nonlinear simulations are presented to demonstrate the existence of the Nash equilibrium for the MZS game and the performance of the proposed algorithm.
KW - Approximate dynamic programming (ADP)
KW - Nash games
KW - neural networks (NNs)
KW - reinforcement learning (RL)
KW - system identification
UR - http://www.scopus.com/inward/record.url?scp=85051791120&partnerID=8YFLogxK
U2 - 10.1109/TSMC.2018.2861826
DO - 10.1109/TSMC.2018.2861826
M3 - Article
AN - SCOPUS:85051791120
SN - 2168-2216
VL - 49
SP - 2739
EP - 2750
JO - IEEE Transactions on Systems, Man, and Cybernetics: Systems
JF - IEEE Transactions on Systems, Man, and Cybernetics: Systems
IS - 12
M1 - 8438886
ER -