TY - GEN
T1 - Optimal differentially private algorithms for k-means clustering
AU - Huang, Zhiyi
AU - Liu, Jinyan
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/5/27
Y1 - 2018/5/27
N2 - We consider privacy-preserving k-means clustering. For the objective of minimizing the Wasserstein distance between the output and the optimal solution, we show that there is a polynomial-time (,)-differentially private algorithm which, for any sufficiently large2 well-separated datasets, outputs k centers that are within Wasserstein distance O(2) from the optimal. This result improves the previous bounds by removing the dependence on, number of centers k, and dimension d. Further, we prove a matching lower bound that no (,)-differentially private algorithm can guarantee Wasserstein distance less than Ω(2) and, thus, our positive result is optimal up to a constant factor. For minimizing the kmeans objective when the dimension d is bounded, we propose a polynomial-time private local search algorithm that outputs an n-additive approximation when the size of the dataset is at least Õk3/2 · d ·−1 · poly(−1).
AB - We consider privacy-preserving k-means clustering. For the objective of minimizing the Wasserstein distance between the output and the optimal solution, we show that there is a polynomial-time (,)-differentially private algorithm which, for any sufficiently large2 well-separated datasets, outputs k centers that are within Wasserstein distance O(2) from the optimal. This result improves the previous bounds by removing the dependence on, number of centers k, and dimension d. Further, we prove a matching lower bound that no (,)-differentially private algorithm can guarantee Wasserstein distance less than Ω(2) and, thus, our positive result is optimal up to a constant factor. For minimizing the kmeans objective when the dimension d is bounded, we propose a polynomial-time private local search algorithm that outputs an n-additive approximation when the size of the dataset is at least Õk3/2 · d ·−1 · poly(−1).
KW - Differential privacy
KW - K-means clustering
KW - Well separation
UR - http://www.scopus.com/inward/record.url?scp=85047982941&partnerID=8YFLogxK
U2 - 10.1145/3196959.3196977
DO - 10.1145/3196959.3196977
M3 - Conference contribution
AN - SCOPUS:85047982941
T3 - Proceedings of the ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
SP - 395
EP - 408
BT - PODS 2018 - Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
A2 - Van den Bussche, Jan
A2 - Ugarte, Mart�n
A2 - Arenas, Marcelo
PB - Association for Computing Machinery
T2 - 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2018
Y2 - 10 June 2018 through 15 June 2018
ER -