TY - GEN
T1 - Analysis of optimal privacy protection mechanism based on IT-MPD
AU - Li, Binhan
AU - Zhao, Xiaolin
AU - Liu, Zhenyan
AU - Chang, Yue
AU - Ren, Xuanyu
AU - Liu, Yuhao
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/12/19
Y1 - 2025/12/19
N2 - Balancing privacy preservation and data utility remains a critical challenge in data-driven systems. Traditional metrics, such as k-anonymity and differential privacy, often adopt static thresholds or oversimplified assumptions, limiting their adaptability to dynamic privacy-utility trade-offs. This paper proposes Information-Theoretic Maximum Privacy Disclosure (IT-MPD), a novel metric leveraging entropy and KL divergence to quantify the worst-case privacy leakage. We formulate privacy protection as a constrained optimization problem, minimizing IT-MPD while ensuring mutual information between raw and anonymized data exceeds a utility threshold ϵ. By modeling the data release process as a Markov chain, we demonstrate that optimal mechanisms emerge at the Pareto frontier of privacy-utility trade-offs. Experiments on age datasets anonymized via k-anonymity show IT-MPD values decline from 6.644 (K=1) to 0.020 (K=50), while utility (measured by mutual information) drops from 4.605 to 0.693. Polynomial fitting of these trends enables practitioners to select K based on operational requirements; e.g., ϵ = 3 yields K=5 as the optimal parameter. This work bridges theoretical rigor with practical applicability, offering a dynamic framework for privacy compliance in heterogeneous data ecosystems.
AB - Balancing privacy preservation and data utility remains a critical challenge in data-driven systems. Traditional metrics, such as k-anonymity and differential privacy, often adopt static thresholds or oversimplified assumptions, limiting their adaptability to dynamic privacy-utility trade-offs. This paper proposes Information-Theoretic Maximum Privacy Disclosure (IT-MPD), a novel metric leveraging entropy and KL divergence to quantify the worst-case privacy leakage. We formulate privacy protection as a constrained optimization problem, minimizing IT-MPD while ensuring mutual information between raw and anonymized data exceeds a utility threshold ϵ. By modeling the data release process as a Markov chain, we demonstrate that optimal mechanisms emerge at the Pareto frontier of privacy-utility trade-offs. Experiments on age datasets anonymized via k-anonymity show IT-MPD values decline from 6.644 (K=1) to 0.020 (K=50), while utility (measured by mutual information) drops from 4.605 to 0.693. Polynomial fitting of these trends enables practitioners to select K based on operational requirements; e.g., ϵ = 3 yields K=5 as the optimal parameter. This work bridges theoretical rigor with practical applicability, offering a dynamic framework for privacy compliance in heterogeneous data ecosystems.
KW - Constrained optimization
KW - Data utility
KW - Entropy
KW - Information theory
KW - Markov chain
KW - Privacy preservation
UR - https://www.scopus.com/pages/publications/105026342209
U2 - 10.1145/3773365.3773645
DO - 10.1145/3773365.3773645
M3 - Conference contribution
AN - SCOPUS:105026342209
T3 - Proceedings of 2025 8th International Conference on Computer Information Science and Artificial Intelligence, CISAI 2025
SP - 1771
EP - 1778
BT - Proceedings of 2025 8th International Conference on Computer Information Science and Artificial Intelligence, CISAI 2025
PB - Association for Computing Machinery, Inc
T2 - 2025 8th International Conference on Computer Information Science and Artificial Intelligence, CISAI 2025
Y2 - 12 September 2025 through 14 September 2025
ER -