Quantization Enabled Differential Privacy in Bandit Games With Cooperative Players

Yeming Lin, Kun Liu*, Ilai Bistritz, Qian Ma, Yuanqing Xia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This article addresses the bandit game problem subject to privacy leakage, where the cooperative players aim to learn the optimal action profile that minimizes the global cost. The players do not have closed-form expressions for their payoff functions and can only receive the feedback of their local costs. We propose a privacy-preserving distributed bandit learning algorithm based on the residual gradient estimator, which adopts the stochastic quantization with a binary randomized response scheme to mask action profile estimates before communication. The theoretical analysis demonstrates that our algorithm can achieve an expected regret order of O(T^{3/4}) and preserve ϵ_dp-differential privacy for the players.
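The masking step named in the abstract, stochastic quantization followed by a binary randomized response, can be illustrated with a minimal sketch. This is not the algorithm from the article: the one-bit quantizer, the range [lo, hi], the keep probability e^ϵ/(1 + e^ϵ), the debiasing step, and all function names below are assumptions chosen to show the general technique on a single scalar estimate.

```python
import math
import random


def keep_probability(eps):
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(eps) / (1.0 + math.exp(eps))


def stochastic_quantize(x, lo, hi, rng):
    """Map x to one bit with P(bit = 1) = (x - lo) / (hi - lo), after clipping to [lo, hi]."""
    x = min(max(x, lo), hi)
    q = (x - lo) / (hi - lo)
    return 1 if rng.random() < q else 0


def randomized_response(bit, eps, rng):
    """Keep the bit with probability e^eps / (1 + e^eps); otherwise flip it."""
    return bit if rng.random() < keep_probability(eps) else 1 - bit


def debias(bit, eps, lo, hi):
    """Receiver-side unbiased estimate of x from the reported bit."""
    p = keep_probability(eps)
    q_hat = (bit - (1.0 - p)) / (2.0 * p - 1.0)
    return lo + (hi - lo) * q_hat


def privatize(x, eps, lo, hi, rng):
    """Quantize, randomize, and return (reported_bit, debiased_estimate)."""
    bit = randomized_response(stochastic_quantize(x, lo, hi, rng), eps, rng)
    return bit, debias(bit, eps, lo, hi)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical values: a scalar estimate 0.3 in [-1, 1] with eps = 1.
    estimates = [privatize(0.3, 1.0, -1.0, 1.0, rng)[1] for _ in range(100000)]
    print(sum(estimates) / len(estimates))
```

In this sketch the probability of reporting either bit always lies between 1 − p and p, where p is the keep probability, so the likelihood ratio between any two inputs is at most p/(1 − p) = e^ϵ, which is the ϵ-differential privacy condition for a single report. The debiased value is unbiased but noisy, and the noise grows as ϵ shrinks.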

Original language: English
Pages (from-to): 7771-7778
Number of pages: 8
Journal: IEEE Transactions on Automatic Control
Volume: 70
Issue number: 11
DOIs
Publication status: Published - 2025
Externally published: Yes

Keywords

  • Bandit games
  • cooperative optimization
  • privacy preservation
  • stochastic quantization
