S2JSD-LSH: A locality-sensitive hashing schema for probability distributions

Xian Ling Mao, Bo Si Feng, Yi Jing Hao, Liqiang Nie, Heyan Huang*, Guihua Wen

*Corresponding author for this work

Research output: Contribution to conference › Paper › peer-review

6 Citations (Scopus)

Abstract

To compare probability distributions, information-theoretically motivated measures such as Kullback-Leibler divergence (KL) and Jensen-Shannon divergence (JSD) are often more appropriate than vector metrics such as Euclidean and angular distance. However, existing locality-sensitive hashing (LSH) algorithms do not support these information-theoretically motivated measures for probability distributions. In this paper, we first introduce a new approximation formula for the S2JSD distance, and then propose a novel LSH scheme adapted to the S2JSD distance for approximate nearest neighbor search over high-dimensional probability distributions. We define the specific hash functions and prove their locality sensitivity. Furthermore, extensive empirical evaluations on six public image datasets and two text datasets illustrate the effectiveness of the proposed hashing schema in terms of mean Average Precision, Precision@N, and Precision-Recall curves.
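As a rough illustration of the quantities named in the abstract, the following Python sketch computes KL, JSD, and an S2JSD-style distance for two discrete distributions. The abstract does not define S2JSD; the sketch assumes it denotes the square root of twice the Jensen-Shannon divergence, and the function names `kl_divergence`, `jsd`, and `s2jsd` are hypothetical. It does not implement the paper's approximation formula or its hash functions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def s2jsd(p, q):
    """ASSUMPTION: S2JSD is taken to be sqrt(2 * JSD(p, q))."""
    # Clamp at zero to guard against tiny negative rounding errors.
    return math.sqrt(2 * max(jsd(p, q), 0.0))


p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
print(kl_divergence(p, q), kl_divergence(q, p))  # KL is not symmetric
print(s2jsd(p, q), s2jsd(q, p))                  # the JSD-based distance is
```

Unlike KL, the JSD-based quantity is symmetric and bounded, which is one reason it is a natural candidate for a distance that a hashing scheme can target.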

Original language: English
Pages: 3244-3251
Number of pages: 8
Publication status: Published - 2017
Event: 31st AAAI Conference on Artificial Intelligence, AAAI 2017 - San Francisco, United States
Duration: 4 Feb 2017 to 10 Feb 2017

Conference

Conference: 31st AAAI Conference on Artificial Intelligence, AAAI 2017
Country/Territory: United States
City: San Francisco
Period: 4/02/17 to 10/02/17
