Abstract
Multiagent reinforcement learning (MARL) remains fundamentally challenged by partial observability, unstable value learning, and inefficient exploration, difficulties that intensify in high-dimensional robotic control and large-scale coordination scenarios. Existing algorithms also lack a mechanism for guiding the long-term improvement of strategies. We propose policy representation integration for metavalue evolution (PRIME), a MARL method that addresses these limitations through representation-asymmetric policy parameterization, metavalue-augmented optimization, and metavalue-modulated evolutionary search. PRIME constructs a shared nonlinear encoder with lightweight team-specific linear heads, providing a coherent latent policy manifold that supports both fine-grained robotic manipulation and large-population coordination. A learned metavalue function estimates the long-horizon utility of policy updates, and its gradients shape both actor learning and representation formation. In parallel, two evolutionary operators, direction-aware crossover and metagradient-scaled low-rank mutation, enable globally diverse yet strategically targeted exploration in policy space. Evaluations on multiagent MuJoCo (MA-MuJoCo), DexHands dexterous manipulation, and the large-scale decentralized collective assault (DCA) benchmark demonstrate that PRIME achieves consistently superior performance, faster convergence, and stronger robustness than state-of-the-art baselines.
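The abstract gives no implementation details, so the following is only a minimal NumPy sketch of two of the ideas it names: a shared nonlinear encoder with one linear head per team, and a low-rank mutation of a head. All sizes, the single tanh layer, the function names (`encode`, `act`, `low_rank_mutation`), and the scalar `scale` standing in for a metagradient-derived factor are assumptions made for illustration, not the authors' method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these come from the paper.
OBS_DIM, LATENT_DIM, ACT_DIM, N_TEAMS = 8, 16, 4, 3

# Shared nonlinear encoder (assumed here to be a single tanh layer).
W_enc = rng.normal(scale=0.1, size=(OBS_DIM, LATENT_DIM))
b_enc = np.zeros(LATENT_DIM)

# One lightweight linear head per team, all reading the same latent space.
heads = [rng.normal(scale=0.1, size=(LATENT_DIM, ACT_DIM)) for _ in range(N_TEAMS)]


def encode(obs):
    """Map a batch of observations to the shared latent representation."""
    return np.tanh(obs @ W_enc + b_enc)


def act(obs, team):
    """Compute actions for one team using its own linear head."""
    return encode(obs) @ heads[team]


def low_rank_mutation(head, scale, rank=2):
    """Return a copy of `head` perturbed by a random matrix of rank <= `rank`.

    `scale` is a hypothetical placeholder for a metagradient-derived factor.
    """
    u = rng.normal(size=(head.shape[0], rank))
    v = rng.normal(size=(rank, head.shape[1]))
    return head + scale * (u @ v)


obs = rng.normal(size=(5, OBS_DIM))        # batch of 5 observations
actions = act(obs, team=1)                 # array of shape (5, ACT_DIM)
mutated = low_rank_mutation(heads[1], scale=0.01)
```

Because the perturbation is the product of a 16×2 and a 2×4 matrix, it changes the head along at most two directions, which is one way to keep a mutation cheap and structured. Whether PRIME mutates the heads, the encoder, or both is not stated in the abstract.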
| Original language | English |
|---|---|
| Pages (from-to) | 24893-24911 |
| Number of pages | 19 |
| Journal | IEEE Internet of Things Journal |
| Volume | 13 |
| Issue number | 11 |
| Publication status | Published - 2026 |
Keywords
- Deep learning
- large-scale multiagent systems (LMASs)
- metavalue
- multiagent reinforcement learning (MARL)
- multirobot control
Fingerprint
Dive into the research topics of 'PRIME: Policy Representation Integration With Metavalue-Modulated Evolution in Multiagent Reinforcement Learning'. Together they form a unique fingerprint.