TY - GEN
T1 - Fast discovery of frequent closed sequential patterns based on positional data
AU - Huang, Guo Yan
AU - Yang, Fei
AU - Hu, Chang Zhen
AU - Ren, Jia Dong
PY - 2010
Y1 - 2010
N2 - Frequent closed sequential pattern mining is one of the hot topics in data mining. In this paper, a novel frequent closed sequential pattern mining algorithm, FCSM-PD (frequent closed sequential pattern mining algorithm based on positional data), is proposed; it is an improved version of the BIDE algorithm based on positional data. The positional data is used to record the position information of items in the algorithm. By storing all the position information of the prefix sequences in advance, the existence of a position extension for a prefix sequence can be verified simply by scanning the position information of that prefix sequence, rather than by repeatedly scanning the pseudo-projected database as in the BI-Directional Extension closure checking scheme, which is the most time-consuming phase of BIDE. In addition, an optimization strategy is applied to reduce the time and memory cost of the mining process. The experimental results show that FCSM-PD has a significantly lower running time than BIDE, especially on dense databases.
AB - Frequent closed sequential pattern mining is one of the hot topics in data mining. In this paper, a novel frequent closed sequential pattern mining algorithm, FCSM-PD (frequent closed sequential pattern mining algorithm based on positional data), is proposed; it is an improved version of the BIDE algorithm based on positional data. The positional data is used to record the position information of items in the algorithm. By storing all the position information of the prefix sequences in advance, the existence of a position extension for a prefix sequence can be verified simply by scanning the position information of that prefix sequence, rather than by repeatedly scanning the pseudo-projected database as in the BI-Directional Extension closure checking scheme, which is the most time-consuming phase of BIDE. In addition, an optimization strategy is applied to reduce the time and memory cost of the mining process. The experimental results show that FCSM-PD has a significantly lower running time than BIDE, especially on dense databases.
KW - BI-directional extension closure check
KW - Closed sequential pattern
KW - Positional data
UR - http://www.scopus.com/inward/record.url?scp=78149316216&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2010.5581020
DO - 10.1109/ICMLC.2010.5581020
M3 - Conference contribution
AN - SCOPUS:78149316216
SN - 9781424465262
T3 - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
SP - 444
EP - 449
BT - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
T2 - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Y2 - 11 July 2010 through 14 July 2010
ER -