TY - CONF
T1 - S2JSD-LSH
T2 - 31st AAAI Conference on Artificial Intelligence, AAAI 2017
AU - Mao, Xian Ling
AU - Feng, Bo Si
AU - Hao, Yi Jing
AU - Nie, Liqiang
AU - Huang, Heyan
AU - Wen, Guihua
N1 - Publisher Copyright:
Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2017
Y1 - 2017
N2 - To compare the similarity of probability distributions, the information-theoretically motivated metrics like Kullback-Leibler divergence (KL) and Jensen-Shannon divergence (JSD) are often more reasonable compared with metrics for vectors like Euclidean and angular distance. However, existing locality-sensitive hashing (LSH) algorithms cannot support the information-theoretically motivated metrics for probability distributions. In this paper, we first introduce a new approximation formula for S2JSD-distance, and then propose a novel LSH scheme adapted to S2JSD-distance for approximate nearest neighbors search in high-dimensional probability distributions. We define the specific hashing functions, and prove their local-sensitivity. Furthermore, extensive empirical evaluations well illustrate the effectiveness of the proposed hashing schema on six public image datasets and two text datasets, in terms of mean Average Precision, Precision@N and Precision-Recall curve.
AB - To compare the similarity of probability distributions, the information-theoretically motivated metrics like Kullback-Leibler divergence (KL) and Jensen-Shannon divergence (JSD) are often more reasonable compared with metrics for vectors like Euclidean and angular distance. However, existing locality-sensitive hashing (LSH) algorithms cannot support the information-theoretically motivated metrics for probability distributions. In this paper, we first introduce a new approximation formula for S2JSD-distance, and then propose a novel LSH scheme adapted to S2JSD-distance for approximate nearest neighbors search in high-dimensional probability distributions. We define the specific hashing functions, and prove their local-sensitivity. Furthermore, extensive empirical evaluations well illustrate the effectiveness of the proposed hashing schema on six public image datasets and two text datasets, in terms of mean Average Precision, Precision@N and Precision-Recall curve.
UR - http://www.scopus.com/inward/record.url?scp=85030479577&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85030479577
SP - 3244
EP - 3251
Y2 - 4 February 2017 through 10 February 2017
ER -