Pattern-based topic modelling for query expansion

Yang Gao, Yue Xu, Yuefeng Li

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

A major problem in information retrieval (IR) is that queries are usually short and their keywords are often ambiguous or inconsistent. Automatic query expansion is a widely recognized and effective technique for dealing with this problem. However, many query expansion methods require extra information, such as explicit relevance feedback from users or pseudo-relevance feedback from retrieval results. In this paper, we propose an unsupervised query expansion method, called Topical Query Expansion (TQE), which does not require extra information. The proposed TQE method expands a given query based on topical patterns, which link the more closely associated and semantically related words within each topic. The model also discovers topics that are related to the original query. Based on the expanded terms and related topics, we propose to rank document relevance with different ranking strategies. We conduct experiments on widely used TREC datasets to evaluate the proposed methods. The results show strong performance against several state-of-the-art models.
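As a rough illustration of the idea described in the abstract, the following Python sketch expands a query from weighted word patterns grouped by topic, records which topics the query touches, and ranks documents by the expanded terms. It is not the authors' TQE implementation: the abstract does not specify how patterns are mined or weighted, so the `TOPICS` data, the function names `expand_query` and `rank_documents`, and the additive scoring rule are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy input (not from the paper): each topic is a list of
# (pattern, weight) pairs, where a pattern is a set of words assumed to
# occur together within that topic.
TOPICS = {
    "finance": [({"bank", "loan", "interest"}, 0.6), ({"stock", "market"}, 0.4)],
    "rivers": [({"bank", "river", "water"}, 0.7), ({"fish", "stream"}, 0.3)],
    "sports": [({"team", "match"}, 1.0)],
}


def expand_query(query_terms, topics):
    """Return (expanded term weights, related topic scores) for a query."""
    query = set(query_terms)
    term_weights = defaultdict(float)
    topic_scores = {}
    for topic, patterns in topics.items():
        score = 0.0
        for pattern, weight in patterns:
            # A pattern contributes only if it shares a word with the query.
            if pattern & query:
                score += weight
                for term in pattern - query:
                    term_weights[term] += weight
        if score > 0:
            topic_scores[topic] = score
    # Assumption: original query terms keep a fixed weight of 1.0.
    for term in query:
        term_weights[term] += 1.0
    return dict(term_weights), topic_scores


def rank_documents(docs, term_weights):
    """Rank document ids by the summed weight of expanded terms they contain."""
    scored = []
    for doc_id, text in docs.items():
        tokens = set(text.lower().split())
        score = sum(w for term, w in term_weights.items() if term in tokens)
        scored.append((score, doc_id))
    # Highest score first; ties broken by document id.
    return [doc_id for score, doc_id in sorted(scored, key=lambda s: (-s[0], s[1]))]


docs = {
    "d1": "the river water level rose",
    "d2": "team won the match",
    "d3": "loan interest at the bank",
}
weights, related = expand_query(["bank"], TOPICS)
ranking = rank_documents(docs, weights)
```

In this toy setup the ambiguous query "bank" picks up expansion terms from both the "finance" and "rivers" topics, so documents that never mention "bank" can still be retrieved through related words.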

Original language: English
Title of host publication: Data Mining and Analytics 2014 - Proceedings of the 12th Australasian Data Mining Conference, AusDM 2014
Editors: Yanchang Zhao, Lin Liu, Kok-Leong Ong, Xue Li
Publisher: Australian Computer Society
Pages: 165-174
Number of pages: 10
ISBN (Electronic): 9781921770173
Publication status: Published - 2014
Externally published: Yes

Publication series

Name: Conferences in Research and Practice in Information Technology Series
Volume: 158
ISSN (Print): 1445-1336

Keywords

  • Information retrieval
  • Query expansion
  • Topical pattern
