Query intent detection based on clustering of phrase embedding

Jiahui Gu*, Chong Feng, Xiong Gao, Yashen Wang, Heyan Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Citations (Scopus)

Abstract

Understanding ambiguous or multi-faceted search queries is essential for information retrieval. The task of identifying the major aspects or senses of a query can be viewed as query intent detection, where the intents are represented as a number of clusters. The challenge in this task is therefore how to generate intent candidates and group them semantically. This paper explores the competence of lexical statistics and embedding methods. First, a novel term expansion algorithm is designed to sketch all possible intent candidates. Next, an efficient query intent generation model is proposed, which learns latent representations for intent candidates via embedding-based methods. The vectorized intent candidates are then clustered and detected as query intents. Experimental results on the NTCIR-12 IMine-2 corpus show that the query intent generation model via phrase embedding significantly outperforms state-of-the-art clustering algorithms in query intent detection.
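
The last two steps described in the abstract (vectorizing intent candidates and clustering them) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy 2-d word vectors in `WORD_VECS`, the averaging of word vectors as a stand-in for phrase embedding, the plain k-means with farthest-point initialization as a stand-in for the clustering step, and the example candidates for the query "jaguar".

```python
import math

# Toy 2-d word vectors, invented for illustration only.
# A real system would load learned embeddings instead.
WORD_VECS = {
    "jaguar": [0.5, 0.5],
    "car": [1.0, 0.0],
    "price": [0.9, 0.1],
    "dealer": [0.8, 0.0],
    "animal": [0.0, 1.0],
    "habitat": [0.1, 0.9],
    "diet": [0.0, 0.8],
}


def embed_phrase(phrase):
    """Embed a phrase as the mean of its known word vectors (an assumption)."""
    vecs = [WORD_VECS[w] for w in phrase.split() if w in WORD_VECS]
    if not vecs:
        raise ValueError(f"no known words in {phrase!r}")
    return [sum(dims) / len(vecs) for dims in zip(*vecs)]


def kmeans(points, k, iters=20):
    """Plain k-means; returns one cluster label per point."""
    # Deterministic farthest-point initialization: start from the first
    # point, then repeatedly add the point farthest from all chosen centers.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(dims) / len(members) for dims in zip(*members)]
    return labels


# Hypothetical intent candidates for the ambiguous query "jaguar".
candidates = [
    "jaguar car price",
    "jaguar car dealer",
    "jaguar animal habitat",
    "jaguar animal diet",
]
labels = kmeans([embed_phrase(c) for c in candidates], k=2)
for phrase, label in zip(candidates, labels):
    print(label, phrase)
```

Each resulting cluster plays the role of one detected intent: here the two car-related candidates fall into one group and the two animal-related candidates into the other.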

Original language: English
Title of host publication: Social Media Processing - 5th National Conference, SMP 2016, Proceedings
Editors: Hongfei Lin, Yuming Li, Guoxiong Xiang, Mingwen Wang
Publisher: Springer Verlag
Pages: 110-122
Number of pages: 13
ISBN (Print): 9789811029929
Publication status: Published - 2016
Event: 5th National Conference on Social Media Processing, SMP 2016 - Nanchang, China
Duration: 29 Oct 2016 → 30 Oct 2016

Publication series

Name: Communications in Computer and Information Science
Volume: 669
ISSN (Print): 1865-0929

Conference

Conference: 5th National Conference on Social Media Processing, SMP 2016
Country/Territory: China
City: Nanchang
Period: 29/10/16 → 30/10/16

Keywords

  • Phrase embedding
  • Query intents
  • Term expansion algorithm
