TY - JOUR
T1 - System-of-systems approach to spatio-temporal crowdsourcing design using improved PPO algorithm based on an invalid action masking
AU - Ding, Wei
AU - Ming, Zhenjun
AU - Wang, Guoxin
AU - Yan, Yan
N1 - Publisher Copyright:
© 2024 Elsevier B.V.
PY - 2024/2/15
Y1 - 2024/2/15
N2 - Spatio-temporal crowdsourcing (STC) is a typical case of complex system-of-systems (SoS) design, in which the primary objective is to allocate real-time tasks to suitable groups of workers. Over time, STC allocation has evolved into a dynamic matching among three distinct entities: tasks, workers, and workplaces. To address the poor convergence, slow response, and sparse actions caused by the spatial complexity and temporal dynamics of STC, this paper proposes an improved proximal policy optimization algorithm based on invalid action masking (IAM-IPPO) for the SoS design of STC. First, the ternary dynamic matching (TDM) of tasks, workers, and workplaces in STC is described. Next, STC allocation is formulated as a Markov decision process, with corresponding definitions of the state space, action space, and reward mechanism. On this basis, an invalid action masking (IAM) method is introduced into the policy network update of proximal policy optimization (PPO), so that actions are sampled only from the valid set and invalid actions are never selected. The algorithmic framework of IAM-IPPO is then presented, and the model is trained to generate an effective allocation scheme. Comparative experiments on real-world datasets assess the performance of the proposed approach. The results show that IAM-IPPO substantially outperforms the baselines, which helps in exploring high-quality design schemes for crowdsourcing SoSs, especially in dynamic large-scale cases.
KW - Improved PPO algorithm
KW - Invalid action masking
KW - Spatio-temporal crowdsourcing
KW - System-of-systems design
KW - Task allocation
UR - http://www.scopus.com/inward/record.url?scp=85182266748&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2024.111381
DO - 10.1016/j.knosys.2024.111381
M3 - Article
AN - SCOPUS:85182266748
SN - 0950-7051
VL - 285
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 111381
ER -