Dynamic privacy pricing: A multi-armed bandit approach with time-variant rewards

Lei Xu, Chunxiao Jiang, Yi Qian, Youjian Zhao, Jianhua Li, Yong Ren

Research output: Contribution to journal › Article › peer-review

41 Citations (Scopus)

Abstract

Recently, the conflict between exploiting the value of personal data and protecting individuals' privacy has attracted much attention. A personal data market offers a promising solution to this conflict, but determining the price of privacy is a difficult problem. In this paper, we study the pricing problem in a setting where a data collector sequentially buys data from multiple data owners whose valuations of privacy are randomly drawn from an unknown distribution. To maximize the total payoff, the collector needs to dynamically adjust the prices offered to owners. We model the collector's sequential decision-making problem as a multi-armed bandit problem in which each arm represents a candidate price. In particular, the model takes into account the privacy protection technique adopted by the collector. Protecting privacy generally reduces the value of the data, and this effect is captured by the time-variant distributions of the rewards associated with the arms. Based on the classic upper confidence bound policy, we propose two learning policies for the bandit problem. The first policy estimates the expected reward of a price by counting how many times that price has been accepted by data owners. The second policy treats the time-variant data value as a context and uses ridge regression to estimate the rewards in different contexts. Simulation results on real-world data demonstrate that, by applying the proposed policies, the collector can obtain a payoff close to the one it would obtain by offering all data owners the single fixed price that is best in hindsight.
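The first policy described above can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified UCB-style rule written under stated assumptions. It assumes that an owner accepts an offer whenever the price is at least the owner's valuation, that the reward of an accepted offer is the current data value minus the price, and that the time-variant data value is known to the collector in each round. The names `run_ucb_pricing`, `prices`, `valuations`, and `data_value` are hypothetical and chosen only for illustration. The second, ridge-regression policy is not covered here.

```python
import math
import random


def run_ucb_pricing(prices, valuations, data_value):
    """Offer one candidate price to each data owner in turn.

    prices:      candidate prices, one per bandit arm
    valuations:  each owner's privacy valuation, hidden from the collector
    data_value:  function t -> value of one record in round t (assumed known)
    Returns (total_payoff, offers, accepts).
    """
    offers = [0] * len(prices)   # how often each price was offered
    accepts = [0] * len(prices)  # how often each price was accepted
    total_payoff = 0.0

    for t, valuation in enumerate(valuations, start=1):
        value_now = data_value(t)

        def index(k):
            # Untried prices are explored first.
            if offers[k] == 0:
                return math.inf
            # Acceptance rate estimated by counting accepted offers,
            # plus a UCB1-style exploration bonus, capped at 1.
            accept_rate = accepts[k] / offers[k]
            bonus = math.sqrt(2.0 * math.log(t) / offers[k])
            optimistic_rate = min(1.0, accept_rate + bonus)
            # Optimistic estimate of the expected reward of this price.
            return optimistic_rate * (value_now - prices[k])

        arm = max(range(len(prices)), key=index)
        offers[arm] += 1

        # Assumed acceptance rule: the owner sells if the price covers
        # the owner's valuation.
        if prices[arm] >= valuation:
            accepts[arm] += 1
            total_payoff += value_now - prices[arm]

    return total_payoff, offers, accepts


# Illustrative run with synthetic valuations and a slowly decaying data value.
rng = random.Random(0)
demo_valuations = [rng.random() for _ in range(2000)]
payoff, offers, accepts = run_ucb_pricing(
    prices=[0.2, 0.5, 0.8],
    valuations=demo_valuations,
    data_value=lambda t: 1.0 - 0.0001 * t,
)
print(payoff, offers, accepts)
```

The acceptance count is the only statistic learned per price; the time-variant data value enters the index only as a multiplier, which is one simple way to separate what must be learned (the acceptance probability) from what changes over time.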

Original language: English
Article number: 7572170
Pages (from-to): 271-285
Number of pages: 15
Journal: IEEE Transactions on Information Forensics and Security
Volume: 12
Issue number: 2
DOIs: https://doi.org/10.1109/TIFS.2016.2611487
Publication status: Published - Feb 2017
Externally published: Yes

Keywords

  • Bandit problems
  • Data anonymization
  • Dynamic pricing
  • Learning policy
  • Private data collecting

Cite this

Xu, L., Jiang, C., Qian, Y., Zhao, Y., Li, J., & Ren, Y. (2017). Dynamic privacy pricing: A multi-armed bandit approach with time-variant rewards. IEEE Transactions on Information Forensics and Security, 12(2), 271-285. Article 7572170. https://doi.org/10.1109/TIFS.2016.2611487