Abstract
Retrieving graphs containing a query graph from a large graph database is a key task in many graph-based applications, including chemical compound discovery, protein complex prediction, and structural pattern recognition. However, graph data handled by these applications is often noisy, incomplete, and inaccurate because of the way the data is produced. In this paper, we study subgraph queries over uncertain graphs. Specifically, we consider the problem of answering threshold-based probabilistic queries over a large uncertain graph database under the possible world semantics. We prove that the problem is #P-complete; we therefore adopt a filtering-and-verification strategy to speed up the search. In the filtering phase, we use a probabilistic inverted index, PIndex, based on subgraph features obtained by an optimal feature selection process. During the verification phase, we develop exact and bound algorithms to validate the remaining candidates. Extensive experimental results demonstrate the effectiveness of the proposed algorithms.
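
To make the query semantics concrete, the following is a minimal sketch of a threshold-based probabilistic subgraph query evaluated by brute force over all possible worlds. It is not the paper's algorithm and does not implement PIndex. It assumes unlabeled, undirected graphs with independent edge probabilities, and the function names (`contains_subgraph`, `subgraph_probability`, `threshold_query`) are hypothetical. The enumeration visits 2^|E| worlds per graph, which illustrates why a filtering-and-verification strategy is needed in practice.

```python
from itertools import permutations, product


def contains_subgraph(world_edges, n_vertices, query_edges, q_vertices):
    """Brute-force test: is there an injective vertex mapping that sends
    every query edge onto an edge of this (certain) possible world?"""
    world = {frozenset(e) for e in world_edges}
    for mapping in permutations(range(n_vertices), q_vertices):
        if all(frozenset((mapping[u], mapping[v])) in world
               for u, v in query_edges):
            return True
    return False


def subgraph_probability(n_vertices, edge_probs, query_edges, q_vertices):
    """Sum the probabilities of all possible worlds that contain the query.
    Assumption: edges exist independently with the given probabilities."""
    edges = list(edge_probs)
    total = 0.0
    for present in product([False, True], repeat=len(edges)):
        prob = 1.0
        world = []
        for edge, keep in zip(edges, present):
            prob *= edge_probs[edge] if keep else 1.0 - edge_probs[edge]
            if keep:
                world.append(edge)
        if contains_subgraph(world, n_vertices, query_edges, q_vertices):
            total += prob
    return total


def threshold_query(database, query_edges, q_vertices, epsilon):
    """Return ids of uncertain graphs whose containment probability
    is at least the threshold epsilon."""
    return [
        gid
        for gid, (n_vertices, edge_probs) in database.items()
        if subgraph_probability(n_vertices, edge_probs,
                                query_edges, q_vertices) >= epsilon
    ]


# Toy database: an uncertain triangle and an uncertain single edge.
database = {
    "g1": (3, {(0, 1): 0.5, (1, 2): 0.5, (0, 2): 0.5}),
    "g2": (2, {(0, 1): 0.9}),
}
# Query: a path with two edges on three vertices.
path_query = [(0, 1), (1, 2)]
answers = threshold_query(database, path_query, 3, epsilon=0.4)
```

In this toy database, the path occurs in `g1` whenever at least two of its three edges are present, so its containment probability is 0.5, while `g2` has too few vertices to contain the query at all.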
Original language | English |
---|---|
Pages (from-to) | 876-886 |
Number of pages | 11 |
Journal | Proceedings of the VLDB Endowment |
Volume | 4 |
Issue number | 11 |
Publication status | Published - Aug 2011 |
Externally published | Yes |
Event | 37th International Conference on Very Large Data Bases, VLDB 2011 - Seattle, United States (29 Aug 2011 → 3 Sept 2011) |