PathGuide: An efficient clustering based indexing method for XML path expressions

Jiefeng Cheng, Ge Yu, Guoren Wang, J. X. Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)

Abstract

This paper focuses on improving the performance of long-path XML query processing. It is motivated by the fact that existing inverted-index and join algorithms are efficient for short-path XML queries but inefficient for long-path queries, since their response time is exponential in the path length. We propose a clustering-based indexing method, called PathGuide, which enhances the XML inverted index with a clustering technique. Element nodes are clustered by their path patterns, and a summary of this path information is kept in a suffix tree that serves as the index of these element nodes. In addition, new operations are proposed to fully utilize PathGuide. With PathGuide, unlike the path expansion approach used in Lore, the set of nodes matching a relative location path can be found via a one-step index lookup. Compared to the existing structural join method, PathGuide significantly reduces both join overhead and disk I/O cost. Extensive experimental studies show that PathGuide outperforms structural joins by a factor of at least four in most cases.
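
To make the idea in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes that a "path pattern" is simply the root-to-node sequence of element tags, and it uses a trie over reversed tag sequences as a simplified stand-in for the suffix tree. The names `build_index`, `lookup`, and `CLUSTERS` are hypothetical, and only child steps (no wildcards or predicates) are handled.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

CLUSTERS = "$clusters"  # reserved trie key; "$" cannot start an XML tag name


def build_index(xml_text):
    """Return (clusters, trie) for an XML document.

    clusters: root-to-node tag path (tuple) -> node ids in document order
    trie:     trie over reversed tag paths; each trie node lists the
              clusters whose path ends with the tags spelled so far
    """
    root = ET.fromstring(xml_text)
    clusters = defaultdict(list)
    next_id = 0
    stack = [(root, (root.tag,))]
    while stack:
        elem, path = stack.pop()
        clusters[path].append(next_id)  # ids follow preorder (document order)
        next_id += 1
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(elem)):
            stack.append((child, path + (child.tag,)))

    trie = {}
    for path in clusters:
        node = trie
        for tag in reversed(path):
            node = node.setdefault(tag, {})
            node.setdefault(CLUSTERS, []).append(path)
    return dict(clusters), trie


def lookup(clusters, trie, relative_path):
    """Node ids matching //a/b/c, with the path given as 'a/b/c'."""
    node = trie
    for tag in reversed(relative_path.split("/")):
        node = node.get(tag)
        if node is None:
            return []
    ids = []
    for path in node[CLUSTERS]:
        ids.extend(clusters[path])
    return sorted(ids)


doc = (
    "<lib>"
    "<book><title/><author><name/></author></book>"
    "<paper><title/><author><name/></author></paper>"
    "</lib>"
)
clusters, trie = build_index(doc)
matches = lookup(clusters, trie, "author/name")  # nodes for //author/name
```

In this sketch a relative path is answered by one descent through the trie followed by a union of the matching clusters, so no structural join between the `author` and `name` element lists is performed. How the paper stores clusters on disk and what its additional operations do is not stated in the abstract and is not modeled here.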

Original language: English
Title of host publication: Proceedings - 8th International Conference on Database Systems for Advanced Applications, DASFAA 2003
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 257-264
Number of pages: 8
ISBN (Electronic): 0769518958, 9780769518954
Publication status: Published - 2003
Externally published: Yes
Event: 8th International Conference on Database Systems for Advanced Applications, DASFAA 2003 - Kyoto, Japan
Duration: 26 Mar 2003 to 28 Mar 2003

Publication series

Name: Proceedings - 8th International Conference on Database Systems for Advanced Applications, DASFAA 2003

Conference

Conference: 8th International Conference on Database Systems for Advanced Applications, DASFAA 2003
Country/Territory: Japan
City: Kyoto
Period: 26/03/03 to 28/03/03

Keywords

  • Clustering algorithms
  • Costs
  • Database languages
  • Delay
  • Indexing
  • Navigation
  • Proposals
  • Query processing
  • Tree data structures
  • XML
