Mining frequent itemsets based on projection array

Hai Tao He; Hai Yan Cao; Rui Xia Yao; Jia Dong Ren; Chang Zhen Hu

doi:10.1109/ICMLC.2010.5581018

Hai Tao He^*, Hai Yan Cao, Rui Xia Yao, Jia Dong Ren, Chang Zhen Hu

^*Corresponding author for this work

School of Cyberspace Science and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Frequent itemset mining is a crucial problem in data mining. Although many algorithms have been proposed, they may suffer from high computation cost and space complexity on dense databases, especially when mining long frequent itemsets or when the support threshold is very low. To address this problem, a new data structure called PArray is proposed. Like BitTableFI, PArray makes use of the data both horizontally and vertically, and the itemsets that co-occur with single frequent items are found by computing intersections in PArray. A new algorithm called MFIPA is then proposed based on PArray. Frequent itemsets that have the same support as a single frequent item are found first by joining that item with every nonempty subset of its projection; all other frequent itemsets are then found using a depth-first search strategy. The experimental results show that the proposed algorithm is superior to BitTableFI in execution efficiency and memory requirement, especially for dense databases.
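The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MFIPA implementation: the abstract does not specify the PArray layout, so plain Python sets of transaction ids stand in for it, and the "projection" of an item is assumed to mean the other frequent items that occur in every transaction containing that item. All function names (`vertical_tidsets`, `projection`, `same_support_itemsets`, `mine`) are hypothetical, and the depth-first step is a generic tidset-intersection enumeration.

```python
from itertools import combinations


def vertical_tidsets(transactions, min_support):
    """Map each frequent item to the set of transaction ids containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return {i: t for i, t in tidsets.items() if len(t) >= min_support}


def projection(item, tidsets):
    """Assumed meaning: items present in every transaction that contains `item`."""
    return sorted(j for j, t in tidsets.items()
                  if j != item and tidsets[item] <= t)


def same_support_itemsets(item, tidsets):
    """Join `item` with every nonempty subset of its projection.

    Each resulting itemset has exactly the support of `item`.
    """
    proj = projection(item, tidsets)
    support = len(tidsets[item])
    return {frozenset((item,) + combo): support
            for r in range(1, len(proj) + 1)
            for combo in combinations(proj, r)}


def mine(transactions, min_support):
    """Enumerate all frequent itemsets depth-first by intersecting tidsets."""
    tidsets = vertical_tidsets(transactions, min_support)
    items = sorted(tidsets)
    result = {}

    def dfs(prefix, tids, start):
        for k in range(start, len(items)):
            # For a single item the tidset is used directly; otherwise intersect.
            new_tids = tids & tidsets[items[k]] if prefix else tidsets[items[k]]
            if len(new_tids) >= min_support:
                itemset = prefix + (items[k],)
                result[frozenset(itemset)] = len(new_tids)
                dfs(itemset, new_tids, k + 1)

    dfs((), set(), 0)
    return result


# Toy data (invented for illustration, not taken from the paper).
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}, {"b"}]
tidsets = vertical_tidsets(transactions, min_support=2)
frequent = mine(transactions, min_support=2)
```

In this toy data, `b` occurs in every transaction that contains `a`, so `{a, b}` has the same support as `a` and is found without any further counting. For simplicity the sketch does not use that shortcut to prune the depth-first search; it only shows the two ideas side by side.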

Original language	English
Title of host publication	2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Pages	454-459
Number of pages	6
DOIs	https://doi.org/10.1109/ICMLC.2010.5581018
Publication status	Published - 2010
Event	2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010 - Qingdao, China Duration: 11 Jul 2010 → 14 Jul 2010

Publication series

Name	2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Volume	1

Conference

Conference	2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Country/Territory	China
City	Qingdao
Period	11/07/10 → 14/07/10

Keywords

Depth-first search
Frequent itemsets
Projection array

Cite this

@inproceedings{31cce5cf42774314b945df59042c5466,

title = "Mining frequent itemsets based on projection array",

abstract = "Frequent itemset mining is a crucial problem in data mining. Although many algorithms have been proposed, they may suffer from high computation cost and space complexity on dense databases, especially when mining long frequent itemsets or when the support threshold is very low. To address this problem, a new data structure called PArray is proposed. Like BitTableFI, PArray makes use of the data both horizontally and vertically, and the itemsets that co-occur with single frequent items are found by computing intersections in PArray. A new algorithm called MFIPA is then proposed based on PArray. Frequent itemsets that have the same support as a single frequent item are found first by joining that item with every nonempty subset of its projection; all other frequent itemsets are then found using a depth-first search strategy. The experimental results show that the proposed algorithm is superior to BitTableFI in execution efficiency and memory requirement, especially for dense databases.",

keywords = "Depth-first search, Frequent itemsets, Projection array",

author = "He, {Hai Tao} and Cao, {Hai Yan} and Yao, {Rui Xia} and Ren, {Jia Dong} and Hu, {Chang Zhen}",

year = "2010",

doi = "10.1109/ICMLC.2010.5581018",

language = "English",

isbn = "9781424465262",

series = "2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010",

pages = "454--459",

booktitle = "2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010",

note = "2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010 ; Conference date: 11-07-2010 Through 14-07-2010",

}

He, HT, Cao, HY, Yao, RX, Ren, JD & Hu, CZ 2010, Mining frequent itemsets based on projection array. in 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010., 5581018, 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010, vol. 1, pp. 454-459, 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010, Qingdao, China, 11/07/10. https://doi.org/10.1109/ICMLC.2010.5581018

Mining frequent itemsets based on projection array. / He, Hai Tao; Cao, Hai Yan; Yao, Rui Xia et al.
2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010. 2010. p. 454-459 5581018 (2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010; Vol. 1).

TY - GEN

T1 - Mining frequent itemsets based on projection array

AU - He, Hai Tao

AU - Cao, Hai Yan

AU - Yao, Rui Xia

AU - Ren, Jia Dong

AU - Hu, Chang Zhen

PY - 2010

Y1 - 2010

N2 - Frequent itemset mining is a crucial problem in data mining. Although many algorithms have been proposed, they may suffer from high computation cost and space complexity on dense databases, especially when mining long frequent itemsets or when the support threshold is very low. To address this problem, a new data structure called PArray is proposed. Like BitTableFI, PArray makes use of the data both horizontally and vertically, and the itemsets that co-occur with single frequent items are found by computing intersections in PArray. A new algorithm called MFIPA is then proposed based on PArray. Frequent itemsets that have the same support as a single frequent item are found first by joining that item with every nonempty subset of its projection; all other frequent itemsets are then found using a depth-first search strategy. The experimental results show that the proposed algorithm is superior to BitTableFI in execution efficiency and memory requirement, especially for dense databases.

AB - Frequent itemset mining is a crucial problem in data mining. Although many algorithms have been proposed, they may suffer from high computation cost and space complexity on dense databases, especially when mining long frequent itemsets or when the support threshold is very low. To address this problem, a new data structure called PArray is proposed. Like BitTableFI, PArray makes use of the data both horizontally and vertically, and the itemsets that co-occur with single frequent items are found by computing intersections in PArray. A new algorithm called MFIPA is then proposed based on PArray. Frequent itemsets that have the same support as a single frequent item are found first by joining that item with every nonempty subset of its projection; all other frequent itemsets are then found using a depth-first search strategy. The experimental results show that the proposed algorithm is superior to BitTableFI in execution efficiency and memory requirement, especially for dense databases.

KW - Depth-first search

KW - Frequent itemsets

KW - Projection array

UR - http://www.scopus.com/inward/record.url?scp=78149343914&partnerID=8YFLogxK

U2 - 10.1109/ICMLC.2010.5581018

DO - 10.1109/ICMLC.2010.5581018

M3 - Conference contribution

AN - SCOPUS:78149343914

SN - 9781424465262

T3 - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010

SP - 454

EP - 459

BT - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010

T2 - 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010

Y2 - 11 July 2010 through 14 July 2010

ER -
