Efficient keyword search on uncertain graph data

Ye Yuan, Guoren Wang, Lei Chen, Haixun Wang

Research output: Contribution to journalArticlepeer-review

50 Citations (Scopus)

Abstract

As a popular search mechanism, keyword search has been applied to retrieve useful data in documents, texts, graphs, and even relational databases. However, so far, there is no work on keyword search over uncertain graph data even though the uncertain graphs have been widely used in many real applications, such as modeling road networks, influential detection in social networks, and data analysis on PPI networks. Therefore, in this paper, we study the problem of top-k keyword search over uncertain graph data. Following the similar answer definition for keyword search over deterministic graphs, we consider a subtree in the uncertain graph as an answer to a keyword query if 1) it contains all the keywords; 2) it has a high score (defined by users or applications) based on keyword matching; and 3) it has low uncertainty. Keyword search over deterministic graphs is already a hard problem as stated in [1], [2], [3]. Due to the existence of uncertainty, keyword search over uncertain graphs is much harder. Therefore, to improve the search efficiency, we employ a filtering-and-verification strategy based on a probabilistic keyword index, PKIndex. For each keyword, we offline compute path-based top-k probabilities, and attach these values to PKIndex in an optimal, compressed way. In the filtering phase, we perform existence, path-based and tree-based probabilistic pruning phases, which filter out most false subtrees. In the verification, we propose a sampling algorithm to verify the candidates. Extensive experimental results demonstrate the effectiveness of the proposed algorithms.

Original languageEnglish
Article number6648594
Pages (from-to)2767-2779
Number of pages13
JournalIEEE Transactions on Knowledge and Data Engineering
Volume25
Issue number12
DOIs
Publication statusPublished - 2013
Externally publishedYes

Keywords

  • Algorithm
  • Database
  • Graph data
  • Uncertain data

Fingerprint

Dive into the research topics of 'Efficient keyword search on uncertain graph data'. Together they form a unique fingerprint.

Cite this