LASH: Large-Scale Academic Deep Semantic Hashing

Jia Nan Guo, Xian Ling Mao*, Tian Lan, Rong Xin Tu, Wei Wei, Heyan Huang

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

8 Citations (Scopus)

Abstract

With the explosive growth in the number of academic papers, efficient academic document retrieval is becoming an essential requirement for large-scale information retrieval systems. Given its success in general document retrieval, deep semantic hashing is a promising approach for academic document retrieval, mapping academic documents into efficient hash codes. However, for academic document retrieval, the existing deep semantic hashing methods suffer from the following two problems: (1) they cannot differentiate the importance of different field labels; (2) they cannot fully utilize the structure information in paper citations. To address these problems, we propose a novel Large-scale Academic deep Semantic Hashing method, called LASH. Specifically, LASH first treats paper citations as a citation network, and then employs a multi-input deep autoencoder to directly encode both the structure information of the citation network and the semantic information of academic documents into unified hash codes. Moreover, a weighted percentage similarity, which is a linear combination of Jaccard and cosine similarity, is designed to measure the importance of different field labels. Supervised by this similarity, the learned unified hash codes can further preserve the importance of different field labels. Extensive experiments show that LASH significantly outperforms state-of-the-art baselines on three proposed real-world large-scale academic document datasets.
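
The abstract describes the similarity only as a linear combination of Jaccard and cosine similarity and does not give the formula. The following is a minimal Python sketch of what such a combination could look like, not the authors' definition. The mixing weight `alpha`, the per-label weights, and the function names are all assumptions made for illustration.

```python
import math


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two sets of field labels."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse label-weight vectors."""
    dot = sum(w * v.get(label, 0.0) for label, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combined_similarity(u: dict, v: dict, alpha: float = 0.5) -> float:
    """Hypothetical linear combination of Jaccard and cosine similarity.

    `u` and `v` map field labels to assumed importance weights;
    `alpha` is an assumed mixing weight, not a value from the paper.
    """
    return alpha * jaccard(set(u), set(v)) + (1.0 - alpha) * cosine(u, v)


# Illustrative label weights (made up for this example).
paper_a = {"hashing": 0.7, "retrieval": 0.3}
paper_b = {"hashing": 0.6, "databases": 0.4}
score = combined_similarity(paper_a, paper_b, alpha=0.5)
```

In this sketch the Jaccard term looks only at which labels two papers share, while the cosine term also reflects how much weight each paper places on those labels. That is one plausible reading of how a combined measure could differentiate label importance.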

Original language: English
Pages (from-to): 1734-1746
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 35
Issue number: 2
Publication status: Published - 1 Feb 2023
