PrivMiner: a similar-first approach to frequent itemset mining under local differential privacy

Yanhui Li*, Chen Huang, Mengyuan Cheng, Tianci Lv, Yuxin Zhao, Yongjiao Sun, Ye Yuan

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

The discovery of frequent itemset can serve valuable economic and research purposes. Mining such data, however, presents privacy challenges. Local differential privacy (LDP) has been established as a strong and rigorous privacy scheme for collecting sensitive information from users. Due to the inherent high-dimensionality and heterogeneity of set-valued data, it is quite difficult to mine frequent itemsets under LDP. Existing solutions for this issue are either limited to single items or rely on sampling-based frequency oracle protocols, leading to poor overall result quality. Motivated by this, we present PrivMiner, a two-phase LDP protocol which directly collects user record information comprehensively. While the methodology of directly collecting an entire user record is an obvious choice, obtaining an accurate record under LDP is highly challenging. PrivMiner accomplishes this with a novel technique called the Group-wise Estimation Mechanism by exploiting the similarity between records. PrivMiner needs significantly less perturbations than previous methods, and it achieves higher overall result quality, even for millions of items. Extensive experiments conducted on real-world datasets demonstrate the effectiveness of our proposed method and its advantages over existing solutions. In particular, PrivMiner improves the error of estimation count by at least one order of magnitude.

Original languageEnglish
Article number25
JournalWorld Wide Web
Volume28
Issue number2
DOIs
Publication statusPublished - Mar 2025

Keywords

  • Frequent itemset mining
  • Group-wise estimation mechanism
  • Local differential privacy
  • Privacy protection

Fingerprint

Dive into the research topics of 'PrivMiner: a similar-first approach to frequent itemset mining under local differential privacy'. Together they form a unique fingerprint.

Cite this