Locality-sensitive hashing for finding nearest neighbors in probability distributions

Yi Kun Tang, Xian Ling Mao*, Yi Jing Hao, Cheng Xu, Heyan Huang

*此作品的通讯作者

科研成果: 书/报告/会议事项章节会议稿件同行评审

3 引用 (Scopus)

摘要

In the past ten years, new powerful algorithms based on efficient data structures have been proposed to solve the problem of Approximate Nearest Neighbors search (ANN). To find the nearest neighbors in probability-distribution-type data, the existing Locality Sensitive Hashing (LSH) algorithms for vector-type data can be directly used to solve it. However, these methods do not consider the special properties of probability distributions. In this paper, based on the special properties of probability distributions, we present a novel LSH scheme adapted to angular distance for ANN search in high-dimensional probability distributions. We define the specific hashing functions, and prove their local-sensitivity. Also, we propose a Sequential Interleaving algorithm based on the “Unbalance Effect” of Euclidean and angular metrics for probability distributions. Finally, we compare, through experiments, our methods with the state-of-the-art LSH algorithms in the context of ANN on six public image databases. The results prove the proposed algorithms can provide far better accuracy in the context of ANN than baselines.

源语言英语
主期刊名Social Media Processing - 6th National Conference, SMP 2017, Proceedings
编辑Huan Liu, Xing Xie, Xueqi Cheng, Huawei Shen, Weiying Ma, Shizheng Feng
出版商Springer Verlag
3-15
页数13
ISBN(印刷版)9789811068041
DOI
出版状态已出版 - 2017
活动6th National Conference on Social Media Processing, SMP 2017 - Beijing, 中国
期限: 14 9月 201717 9月 2017

出版系列

姓名Communications in Computer and Information Science
774
ISSN(印刷版)1865-0929

会议

会议6th National Conference on Social Media Processing, SMP 2017
国家/地区中国
Beijing
时期14/09/1717/09/17

指纹

探究 'Locality-sensitive hashing for finding nearest neighbors in probability distributions' 的科研主题。它们共同构成独一无二的指纹。

引用此