Abstract
Complexity of data mining algorithm over data stream is the most important and it should be more focused on algorithm efficiency because of the great consumption of algorithm resources. Fault-tolerant frequent pattern mining is more generalized and suitable for extracting interesting knowledge from real-world data stream polluted by noise. An algorithm, called data stream fault-tolerant frequent pattern tree(DSFT-tree), was proposed. It could achieve a frequency-descending and highly compact prefix-tree structure with a single-pass to capture fault-tolerant frequent itemsets in recent sliding window. To completely and efficiently perform the tree restructuring operation, an efficient mechanism based on sliding window pointer and bit-vector representation were utilized to restructure the tree. The efficient reconstruction mechanism greatly reduced the consumption of calculation resources and achieved fault-tolerant frequent itemsets mining. Experimental transaction database was generated by IBM synthetic data generator. The number of frequent itemsets extracted by DSFT-tree is 1.5 fold greater than that extracted by FP-stream. Experimental results show that developed algorithm is an efficient fault-tolerant synopsis.
| Original language | English |
|---|---|
| Pages (from-to) | 564-568 |
| Number of pages | 5 |
| Journal | Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics |
| Volume | 37 |
| Issue number | 5 |
| Publication status | Published - May 2011 |
| Externally published | Yes |
Keywords
- Data stream
- Fault-tolerant frequent pattern
- Prefix-tree
- Synopsis
Fingerprint
Dive into the research topics of 'Construction of fault-tolerant synopsis over data stream based on prefix-tree'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver