System-of-systems approach to spatio-temporal crowdsourcing design using improved PPO algorithm based on an invalid action masking

Wei Ding; Zhenjun Ming; Guoxin Wang; Yan Yan

doi:10.1016/j.knosys.2024.111381

Wei Ding, Zhenjun Ming^*, Guoxin Wang, Yan Yan

^*Corresponding author for this work

School of Mechanical and Vehicle Engineering

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

3 Citations (Scopus)

Abstract

Spatio-temporal crowdsourcing (STC) is a typical case of complex system-of-systems (SoSs) design, in which the primary objective is to allocate real-time tasks to suitable groups of workers. Over time, STC allocation has evolved into a dynamic matching among three distinct entities: tasks, workers, and workplaces. To address the poor convergence, slow response, and sparse actions caused by the spatial complexity and temporal dynamics of STC, this paper proposes an improved proximal policy optimization algorithm based on invalid action masking (IAM-IPPO) for the SoSs design of STC. First, the ternary dynamic matching (TDM) of tasks, workers, and workplaces in STC is described. The STC allocation is then formulated as a Markov decision process, with corresponding definitions of the state space, action space, and reward mechanism. On this basis, an invalid action masking (IAM) method is introduced to update the policy network of proximal policy optimization (PPO), so that actions are sampled only from the valid set and invalid action selections are masked out. The algorithmic framework of IAM-IPPO is then elaborated, and the model is trained to generate an effective allocation scheme. Comparative experiments on real-world datasets assess the performance of the proposed approach. The results show a substantial performance improvement for IAM-IPPO over the baselines, which helps in exploring high-quality design schemes for crowdsourcing SoSs, especially in dynamic, large-scale cases.
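
The masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `masked_softmax` and `sample_action` are hypothetical, and the sketch assumes a discrete action space with one finite logit per action and a boolean validity mask supplied by the environment. It shows only the common pattern of setting the logits of invalid actions to negative infinity before the softmax, so those actions receive zero probability and are never sampled.

```python
import math
import random


def masked_softmax(logits, valid_mask):
    """Softmax restricted to valid actions; invalid actions get probability 0.

    `logits` and `valid_mask` are hypothetical inputs for illustration:
    one finite score and one boolean flag per discrete action.
    """
    if not any(valid_mask):
        raise ValueError("at least one action must be valid")
    # Replace the logits of invalid actions with -inf so exp() yields exactly 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid_mask)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, valid_mask, rng=random):
    """Draw an action index from the masked distribution."""
    probs = masked_softmax(logits, valid_mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0]
    valid = [True, False, True]  # assume action 1 is currently infeasible
    print(masked_softmax(logits, valid))
    print(sample_action(logits, valid))
```

In a PPO-style update the same mask would presumably also be applied when computing the log-probabilities of stored actions, so that sampling and training use the same distribution; how the paper handles this is not stated in the abstract.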

Original language	English
Article number	111381
Journal	Knowledge-Based Systems
Volume	285
DOI	https://doi.org/10.1016/j.knosys.2024.111381
Publication status	Published - 15 Feb 2024

Cite this

@article{cea9e8cd8f3f49d38d0b1eaa2420583f,

title = "System-of-systems approach to spatio-temporal crowdsourcing design using improved PPO algorithm based on an invalid action masking",

abstract = "Spatio-temporal crowdsourcing (STC) is a typical case of complex system-of-systems (SoSs) design, wherein the primary objective is to allocate real-time tasks to suitable groups of workers. Over time, the STC allocation has gradually evolved into a dynamic matching involving three distinct entities: tasks, workers, and workplaces. Aiming at addressing the problems of poor convergence, slow response and sparse actions caused by the spatial complexity and time dynamics of the STC, this paper proposes an improved proximal policy optimization algorithm based on an invalid action masking (IAM-IPPO) for the SoSs design of the STC. Initially, the ternary dynamic matching (TDM) of tasks, workers and workplaces in the STC is described. Furthermore, the STC allocation is formulated as a Markov decision process, with the corresponding definition of state space, action space, and reward mechanism. On this basis, an invalid action masking (IAM) method is mainly introduced to update the policy-based network of proximal policy optimization (PPO), realizing sampling only from valid actions to masking invalid action selection. Subsequently, the algorithmic framework of IAM-IPPO is elaborated upon, and the model is trained to generate an effective allocation scheme. Comparative experiments are conducted on authentic datasets, aiming to assess performance indicators of the presented approach. The findings demonstrate a substantial enhancement in performance for the IAM-IPPO algorithm compared to other baselines, which is helpful in exploring excellent design schemes of the crowdsourcing SoSs, especially in dynamic large-scale cases.",

keywords = "Improved PPO algorithm, Invalid action masking, Spatio-temporal crowdsourcing, System-of-systems design, Task allocation",

author = "Wei Ding and Zhenjun Ming and Guoxin Wang and Yan Yan",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier B.V.",

year = "2024",

month = feb,

day = "15",

doi = "10.1016/j.knosys.2024.111381",

language = "English",

volume = "285",

journal = "Knowledge-Based Systems",

issn = "0950-7051",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - System-of-systems approach to spatio-temporal crowdsourcing design using improved PPO algorithm based on an invalid action masking

AU - Ding, Wei

AU - Ming, Zhenjun

AU - Wang, Guoxin

AU - Yan, Yan

PY - 2024/2/15

Y1 - 2024/2/15

N2 - Spatio-temporal crowdsourcing (STC) is a typical case of complex system-of-systems (SoSs) design, wherein the primary objective is to allocate real-time tasks to suitable groups of workers. Over time, the STC allocation has gradually evolved into a dynamic matching involving three distinct entities: tasks, workers, and workplaces. Aiming at addressing the problems of poor convergence, slow response and sparse actions caused by the spatial complexity and time dynamics of the STC, this paper proposes an improved proximal policy optimization algorithm based on an invalid action masking (IAM-IPPO) for the SoSs design of the STC. Initially, the ternary dynamic matching (TDM) of tasks, workers and workplaces in the STC is described. Furthermore, the STC allocation is formulated as a Markov decision process, with the corresponding definition of state space, action space, and reward mechanism. On this basis, an invalid action masking (IAM) method is mainly introduced to update the policy-based network of proximal policy optimization (PPO), realizing sampling only from valid actions to masking invalid action selection. Subsequently, the algorithmic framework of IAM-IPPO is elaborated upon, and the model is trained to generate an effective allocation scheme. Comparative experiments are conducted on authentic datasets, aiming to assess performance indicators of the presented approach. The findings demonstrate a substantial enhancement in performance for the IAM-IPPO algorithm compared to other baselines, which is helpful in exploring excellent design schemes of the crowdsourcing SoSs, especially in dynamic large-scale cases.

AB - Spatio-temporal crowdsourcing (STC) is a typical case of complex system-of-systems (SoSs) design, wherein the primary objective is to allocate real-time tasks to suitable groups of workers. Over time, the STC allocation has gradually evolved into a dynamic matching involving three distinct entities: tasks, workers, and workplaces. Aiming at addressing the problems of poor convergence, slow response and sparse actions caused by the spatial complexity and time dynamics of the STC, this paper proposes an improved proximal policy optimization algorithm based on an invalid action masking (IAM-IPPO) for the SoSs design of the STC. Initially, the ternary dynamic matching (TDM) of tasks, workers and workplaces in the STC is described. Furthermore, the STC allocation is formulated as a Markov decision process, with the corresponding definition of state space, action space, and reward mechanism. On this basis, an invalid action masking (IAM) method is mainly introduced to update the policy-based network of proximal policy optimization (PPO), realizing sampling only from valid actions to masking invalid action selection. Subsequently, the algorithmic framework of IAM-IPPO is elaborated upon, and the model is trained to generate an effective allocation scheme. Comparative experiments are conducted on authentic datasets, aiming to assess performance indicators of the presented approach. The findings demonstrate a substantial enhancement in performance for the IAM-IPPO algorithm compared to other baselines, which is helpful in exploring excellent design schemes of the crowdsourcing SoSs, especially in dynamic large-scale cases.

KW - Improved PPO algorithm

KW - Invalid action masking

KW - Spatio-temporal crowdsourcing

KW - System-of-systems design

KW - Task allocation

UR - http://www.scopus.com/inward/record.url?scp=85182266748&partnerID=8YFLogxK

U2 - 10.1016/j.knosys.2024.111381

DO - 10.1016/j.knosys.2024.111381

M3 - Article

AN - SCOPUS:85182266748

SN - 0950-7051

VL - 285

JO - Knowledge-Based Systems

JF - Knowledge-Based Systems

M1 - 111381

ER -
