Pattern-based topic models for information filtering

Research output: Contribution to conference › Paper › peer-review

Abstract

Topic modelling techniques, such as Latent Dirichlet Allocation (LDA), were proposed to generate statistical models that represent multiple topics in a collection of documents, and they have been widely used in machine learning, information retrieval, and related fields. However, their effectiveness in information filtering remains largely unknown. Patterns are generally thought to be more representative than single terms for representing documents. In this paper, a novel information filtering model, the Pattern-based Topic Model (PBTM), is proposed. It represents text documents using both topic distributions at a general level and semantic pattern representations at a detailed, specific level, both of which contribute to accurate document representation and document relevance ranking. Extensive experiments are conducted to evaluate the effectiveness of PBTM on the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model achieves outstanding performance.

Original language: English
Pages: 921-928
Number of pages: 8
Publication status: Published - 2013
Externally published: Yes
Event: 2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013 - Dallas, TX, United States
Duration: 7 Dec 2013 to 10 Dec 2013

Conference

Conference: 2013 13th IEEE International Conference on Data Mining Workshops, ICDMW 2013
Country/Territory: United States
City: Dallas, TX
Period: 7/12/13 to 10/12/13

Keywords

  • Closed pattern
  • Information filtering
  • Pattern mining
  • Topic models
  • User modelling
