Optimal differentially private algorithms for k-means clustering

Zhiyi Huang, Jinyan Liu

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

32 Citations (Scopus)
Plum Print visual indicator of research metrics
  • Citations
    • Citation Indexes: 32
  • Captures
    • Readers: 27
see details

Abstract

We consider privacy-preserving k-means clustering. For the objective of minimizing the Wasserstein distance between the output and the optimal solution, we show that there is a polynomial-time (,)-differentially private algorithm which, for any sufficiently large2 well-separated datasets, outputs k centers that are within Wasserstein distance O(2) from the optimal. This result improves the previous bounds by removing the dependence on, number of centers k, and dimension d. Further, we prove a matching lower bound that no (,)-differentially private algorithm can guarantee Wasserstein distance less than Ω(2) and, thus, our positive result is optimal up to a constant factor. For minimizing the kmeans objective when the dimension d is bounded, we propose a polynomial-time private local search algorithm that outputs an n-additive approximation when the size of the dataset is at least Õk3/2 · d ·−1 · poly(−1).

Original languageEnglish
Title of host publicationPODS 2018 - Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
EditorsJan Van den Bussche, Mart�n Ugarte, Marcelo Arenas
PublisherAssociation for Computing Machinery
Pages395-408
Number of pages14
ISBN (Electronic)9781450347068
DOIs
Publication statusPublished - 27 May 2018
Externally publishedYes
Event37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2018 - Houston, United States
Duration: 10 Jun 201815 Jun 2018

Publication series

NameProceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems

Conference

Conference37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2018
Country/TerritoryUnited States
CityHouston
Period10/06/1815/06/18

Keywords

  • Differential privacy
  • K-means clustering
  • Well separation

Fingerprint

Dive into the research topics of 'Optimal differentially private algorithms for k-means clustering'. Together they form a unique fingerprint.

Cite this

Huang, Z., & Liu, J. (2018). Optimal differentially private algorithms for k-means clustering. In J. Van den Bussche, M. Ugarte, & M. Arenas (Eds.), PODS 2018 - Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (pp. 395-408). (Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems). Association for Computing Machinery. https://doi.org/10.1145/3196959.3196977