A hierarchical clustering algorithm based on dynamic programming for categorical sequences

Jiadong Ren*, Shiyuan Cao, Changzhen Hu

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

4 Citations (Scopus)

Abstract

Sequence mining has attracted increasing attention. In this paper, a new clustering algorithm for categorical sequences is proposed. Because sequences may differ in length, we introduce a similarity measure for clustering categorical, sequential attributes. The similarity measure is derived from regular sequence alignment and is based on the idea of dynamic programming. The relative distance between element pairs is used to compute the similarity value of two sequences. The measure is then applied within the traditional hierarchical clustering algorithm to cluster sequences. Using a splice dataset and synthetic datasets, we show the quality of the clusters generated by the proposed approach and the scalability of the algorithm.
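
The general idea of pairing a dynamic-programming similarity with hierarchical clustering can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not define the relative-distance measure, so the sketch swaps in the longest common subsequence (LCS) length, normalised by the combined sequence lengths so that sequences of unequal length can be compared. The function names (`lcs_length`, `similarity`, `agglomerative`), the average-linkage rule, and the toy sequences are all assumptions made for this example.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)          # DP row for the previous element of a
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed stand-in measure in [0, 1]; works for unequal lengths."""
    if not a and not b:
        return 1.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))


def agglomerative(seqs, k):
    """Merge the most similar pair of clusters (average linkage) until k remain."""
    n = len(seqs)
    sim = {(i, j): similarity(seqs[i], seqs[j])
           for i, j in combinations(range(n), 2)}

    def linkage(c1, c2):
        total = sum(sim[min(i, j), max(i, j)] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [[i] for i in range(n)]  # start with one cluster per sequence
    while len(clusters) > k:
        p, q = max(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]  # p < q, so deleting q is safe
        del clusters[q]
    return clusters


if __name__ == "__main__":
    # Hypothetical toy data: two families of sequences with unequal lengths.
    seqs = ["ACGTACGT", "ACGTACG", "TTTTGGGG", "TTTGGGG", "ACGAACGT"]
    print(agglomerative(seqs, 2))
```

The returned clusters hold indices into the input list. Precomputing all pairwise similarities costs one quadratic-time dynamic-programming pass per pair, which is acceptable for a small illustration but says nothing about the scalability reported in the paper.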

Original language: English
Pages (from-to): 1575-1581
Number of pages: 7
Journal: Journal of Computational Information Systems
Volume: 7
Issue number: 5
Publication status: Published - May 2011

Keywords

  • Categorical sequences
  • Clustering
  • Dynamic programming
