TY - GEN
T1 - A K-means approach based on concept hierarchical tree for search results clustering
AU - Jiang, Peng
AU - Zhang, Chunxia
AU - Guo, Guisuo
AU - Niu, Zhendong
AU - Gao, Dongping
PY - 2009
Y1 - 2009
N2 - Search results clustering aims to facilitate users' information retrieval and query refinement by grouping, online, similar documents returned by a search engine. It has stringent requirements on performance and on the meaningfulness of cluster labels. Thus, most existing clustering algorithms, such as K-means and agglomerative hierarchical clustering, cannot be directly applied to online search results clustering. In this paper, we propose a K-means approach based on a concept hierarchical tree to cluster search results. This algorithm not only overcomes two weaknesses of the classic K-means method (the results depend on the initial seeds, and the parameter k is often unknown) but also satisfies the requirements of online search results clustering. Our method exploits the semantic relations among documents by mapping terms to concepts in the concept hierarchical tree, which can be constructed from WordNet. We have developed a meta-search and clustering system based on our approach and evaluated it with an objective and repeatable procedure. Experimental results indicate that the proposed algorithm is effective and well suited to clustering search results.
AB - Search results clustering aims to facilitate users' information retrieval and query refinement by grouping, online, similar documents returned by a search engine. It has stringent requirements on performance and on the meaningfulness of cluster labels. Thus, most existing clustering algorithms, such as K-means and agglomerative hierarchical clustering, cannot be directly applied to online search results clustering. In this paper, we propose a K-means approach based on a concept hierarchical tree to cluster search results. This algorithm not only overcomes two weaknesses of the classic K-means method (the results depend on the initial seeds, and the parameter k is often unknown) but also satisfies the requirements of online search results clustering. Our method exploits the semantic relations among documents by mapping terms to concepts in the concept hierarchical tree, which can be constructed from WordNet. We have developed a meta-search and clustering system based on our approach and evaluated it with an objective and repeatable procedure. Experimental results indicate that the proposed algorithm is effective and well suited to clustering search results.
UR - http://www.scopus.com/inward/record.url?scp=76349084066&partnerID=8YFLogxK
U2 - 10.1109/FSKD.2009.658
DO - 10.1109/FSKD.2009.658
M3 - Conference contribution
AN - SCOPUS:76349084066
SN - 9780769537351
T3 - 6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009
SP - 380
EP - 386
BT - 6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009
T2 - 6th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2009
Y2 - 14 August 2009 through 16 August 2009
ER -