Opponent portrait for multiagent reinforcement learning in competitive environment

Yuxi Ma; Meng Shen; Yuhang Zhao; Zhao Li; Xiaoyao Tong; Quanxin Zhang; Zhi Wang

doi:10.1002/int.22594

Yuxi Ma, Meng Shen, Yuhang Zhao, Zhao Li, Xiaoyao Tong, Quanxin Zhang^*, Zhi Wang^*

^*Corresponding author for this work

Research output: Contribution to journal › Article › Peer-reviewed

19 citations (Scopus)

Abstract

Existing investigations of opponent modeling and intention inferencing cannot make clear descriptions and practical explanations of the opponent's behaviors and intentions, which may inevitably limit the applicability of them. In this work, we propose a novel approach for opponent's policy explanation and intention inference based on the behavioral portrait of opponent. Specifically, we use the multiagent deep deterministic policy gradients (MADDPG) algorithm to train the agent and opponent in the competitive environment, and collect the behavioral data of opponent based on agent's observations. Then we perform pattern segmentation and extract the opponent's behavior events via Toeplitz inverse covariance-based clustering (TICC) algorithm; hence the opponent's behavior data can be encoded into a knowledge graph, named opponent's behavior knowledge graph (OKG). Based on this, we built a question-answer system (QA system) to query and match opponent historical information in OKG, so that the agent can obtain additional experience and gradually infer the intention of opponent with the episodes of iteration. We evaluate the proposed method on the competitive scenario in multiagent particle environment (MPE). Simulation results show that the agents are able to learn better policies with opponent portrait in competitive settings.

Original language	English
Pages (from-to)	7461-7474
Number of pages	14
Journal	International Journal of Intelligent Systems
Volume	36
Issue	12
DOI	https://doi.org/10.1002/int.22594
Publication status	Published - Dec 2021

Cite this

Ma, Y., Shen, M., Zhao, Y., Li, Z., Tong, X., Zhang, Q., & Wang, Z. (2021). Opponent portrait for multiagent reinforcement learning in competitive environment. International Journal of Intelligent Systems, 36(12), 7461-7474. https://doi.org/10.1002/int.22594

@article{a9af4e80b5ac4691b90503b9de3a333a,

title = "Opponent portrait for multiagent reinforcement learning in competitive environment",

abstract = "Existing investigations of opponent modeling and intention inferencing cannot make clear descriptions and practical explanations of the opponent's behaviors and intentions, which may inevitably limit the applicability of them. In this work, we propose a novel approach for opponent's policy explanation and intention inference based on the behavioral portrait of opponent. Specifically, we use the multiagent deep deterministic policy gradients (MADDPG) algorithm to train the agent and opponent in the competitive environment, and collect the behavioral data of opponent based on agent's observations. Then we perform pattern segmentation and extract the opponent's behavior events via Toeplitz inverse covariance-based clustering (TICC) algorithm; hence the opponent's behavior data can be encoded into a knowledge graph, named opponent's behavior knowledge graph (OKG). Based on this, we built a question-answer system (QA system) to query and match opponent historical information in OKG, so that the agent can obtain additional experience and gradually infer the intention of opponent with the episodes of iteration. We evaluate the proposed method on the competitive scenario in multiagent particle environment (MPE). Simulation results show that the agents are able to learn better policies with opponent portrait in competitive settings.",

keywords = "deep reinforcement learning, intention inference, knowledge graph, multiagent system, opponent modeling",

author = "Yuxi Ma and Meng Shen and Yuhang Zhao and Zhao Li and Xiaoyao Tong and Quanxin Zhang and Zhi Wang",

note = "Publisher Copyright: {\textcopyright} 2021 Wiley Periodicals LLC",

year = "2021",

month = dec,

doi = "10.1002/int.22594",

language = "English",

volume = "36",

pages = "7461--7474",

journal = "International Journal of Intelligent Systems",

issn = "0884-8173",

publisher = "John Wiley and Sons Inc.",

number = "12",

}

TY - JOUR

T1 - Opponent portrait for multiagent reinforcement learning in competitive environment

AU - Ma, Yuxi

AU - Shen, Meng

AU - Zhao, Yuhang

AU - Li, Zhao

AU - Tong, Xiaoyao

AU - Zhang, Quanxin

AU - Wang, Zhi

PY - 2021/12

Y1 - 2021/12

N2 - Existing investigations of opponent modeling and intention inferencing cannot make clear descriptions and practical explanations of the opponent's behaviors and intentions, which may inevitably limit the applicability of them. In this work, we propose a novel approach for opponent's policy explanation and intention inference based on the behavioral portrait of opponent. Specifically, we use the multiagent deep deterministic policy gradients (MADDPG) algorithm to train the agent and opponent in the competitive environment, and collect the behavioral data of opponent based on agent's observations. Then we perform pattern segmentation and extract the opponent's behavior events via Toeplitz inverse covariance-based clustering (TICC) algorithm; hence the opponent's behavior data can be encoded into a knowledge graph, named opponent's behavior knowledge graph (OKG). Based on this, we built a question-answer system (QA system) to query and match opponent historical information in OKG, so that the agent can obtain additional experience and gradually infer the intention of opponent with the episodes of iteration. We evaluate the proposed method on the competitive scenario in multiagent particle environment (MPE). Simulation results show that the agents are able to learn better policies with opponent portrait in competitive settings.

AB - Existing investigations of opponent modeling and intention inferencing cannot make clear descriptions and practical explanations of the opponent's behaviors and intentions, which may inevitably limit the applicability of them. In this work, we propose a novel approach for opponent's policy explanation and intention inference based on the behavioral portrait of opponent. Specifically, we use the multiagent deep deterministic policy gradients (MADDPG) algorithm to train the agent and opponent in the competitive environment, and collect the behavioral data of opponent based on agent's observations. Then we perform pattern segmentation and extract the opponent's behavior events via Toeplitz inverse covariance-based clustering (TICC) algorithm; hence the opponent's behavior data can be encoded into a knowledge graph, named opponent's behavior knowledge graph (OKG). Based on this, we built a question-answer system (QA system) to query and match opponent historical information in OKG, so that the agent can obtain additional experience and gradually infer the intention of opponent with the episodes of iteration. We evaluate the proposed method on the competitive scenario in multiagent particle environment (MPE). Simulation results show that the agents are able to learn better policies with opponent portrait in competitive settings.

KW - deep reinforcement learning

KW - intention inference

KW - knowledge graph

KW - multiagent system

KW - opponent modeling

UR - http://www.scopus.com/inward/record.url?scp=85112479835&partnerID=8YFLogxK

U2 - 10.1002/int.22594

DO - 10.1002/int.22594

M3 - Article

AN - SCOPUS:85112479835

SN - 0884-8173

VL - 36

SP - 7461

EP - 7474

JO - International Journal of Intelligent Systems

JF - International Journal of Intelligent Systems

IS - 12

ER -
