Keyword Search over Probabilistic XML Documents Based on Node Classification

Yue Zhao*, Ye Yuan, Guoren Wang

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

3 Citations (Scopus)

Abstract

This paper describes a keyword search measure on probabilistic XML data based on ELM (extreme learning machine). We use this method to carry out keyword search on probabilistic XML data. A probabilistic XML document differs from a traditional XML document to realize keyword search in the consideration of possible world semantics. A probabilistic XML document can be seen as a set of nodes consisting of ordinary nodes and distributional nodes. ELM has good performance in text classification applications. As the typical semistructured data; the label of XML data possesses the function of definition itself. Label and context of the node can be seen as the text data of this node. ELM offers significant advantages such as fast learning speed, ease of implementation, and effective node classification. Set intersection can compute SLCA quickly in the node sets which is classified by using ELM. In this paper, we adopt ELM to classify nodes and compute probability. We propose two algorithms that are based on ELM and probability threshold to improve the overall performance. The experimental results verify the benefits of our methods according to various evaluation metrics.

Original languageEnglish
Article number210961
JournalMathematical Problems in Engineering
Volume2015
DOIs
Publication statusPublished - 2015
Externally publishedYes

Fingerprint

Dive into the research topics of 'Keyword Search over Probabilistic XML Documents Based on Node Classification'. Together they form a unique fingerprint.

Cite this