Construction of fault-tolerant synopsis over data stream based on prefix-tree

  • Yuyang You*
  • , Jianpei Zhang
  • , Zhihong Yang
  • , Yong You*
  • *Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

Complexity of data mining algorithm over data stream is the most important and it should be more focused on algorithm efficiency because of the great consumption of algorithm resources. Fault-tolerant frequent pattern mining is more generalized and suitable for extracting interesting knowledge from real-world data stream polluted by noise. An algorithm, called data stream fault-tolerant frequent pattern tree(DSFT-tree), was proposed. It could achieve a frequency-descending and highly compact prefix-tree structure with a single-pass to capture fault-tolerant frequent itemsets in recent sliding window. To completely and efficiently perform the tree restructuring operation, an efficient mechanism based on sliding window pointer and bit-vector representation were utilized to restructure the tree. The efficient reconstruction mechanism greatly reduced the consumption of calculation resources and achieved fault-tolerant frequent itemsets mining. Experimental transaction database was generated by IBM synthetic data generator. The number of frequent itemsets extracted by DSFT-tree is 1.5 fold greater than that extracted by FP-stream. Experimental results show that developed algorithm is an efficient fault-tolerant synopsis.

Original languageEnglish
Pages (from-to)564-568
Number of pages5
JournalBeijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics
Volume37
Issue number5
Publication statusPublished - May 2011
Externally publishedYes

Keywords

  • Data stream
  • Fault-tolerant frequent pattern
  • Prefix-tree
  • Synopsis

Fingerprint

Dive into the research topics of 'Construction of fault-tolerant synopsis over data stream based on prefix-tree'. Together they form a unique fingerprint.

Cite this