Active query selection for constraint-based clustering algorithms

Walid Atwa, Kan Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

Semi-supervised clustering uses a small amount of supervised data in the form of pairwise constraints to improve clustering performance. However, most current methods are passive in the sense that the pairwise constraints are provided beforehand and selected randomly. This may lead to the use of constraints that are redundant, unnecessary, or even harmful to the clustering results. In this paper, we address the problem of constraint selection to improve the performance of constraint-based clustering algorithms. Based on the concept of Maximum Mean Discrepancy, we select the set of most informative instances that minimizes the difference in distribution between the labeled and unlabeled data. Then, we query these instances against the existing neighborhoods to determine which neighborhood they belong to. Experimental comparisons with state-of-the-art methods on different real-world datasets demonstrate the effectiveness and efficiency of the proposed method.
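The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no kernel, estimator, or search procedure, so the Gaussian kernel, the biased squared-MMD estimate, the greedy search, and the one-representative-per-neighborhood query loop are all assumptions made here for illustration. The names `rbf_kernel`, `mmd2`, `greedy_select`, `assign_to_neighborhoods`, and the `oracle` callback are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def mmd2(x, y, gamma=1.0):
    """Biased empirical estimate of the squared MMD between samples x and y."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())


def greedy_select(data, budget, gamma=1.0):
    """Assumed search strategy: add, one at a time, the row whose inclusion
    gives the smallest MMD between the selected rows and the remaining rows."""
    selected = []
    for _ in range(budget):
        best_i, best_val = None, np.inf
        for i in range(len(data)):
            if i in selected:
                continue
            trial = selected + [i]
            rest = np.delete(data, trial, axis=0)
            val = mmd2(data[trial], rest, gamma)
            if val < best_val:
                best_i, best_val = i, val
        selected.append(best_i)
    return selected


def assign_to_neighborhoods(indices, oracle):
    """Assumed query loop: compare each selected instance with one member of
    every existing neighborhood; oracle(i, j) returns True for a must-link.
    An instance that matches no neighborhood starts a new one."""
    neighborhoods, queries = [], 0
    for i in indices:
        for hood in neighborhoods:
            queries += 1
            if oracle(i, hood[0]):
                hood.append(i)
                break
        else:
            neighborhoods.append([i])
    return neighborhoods, queries
```

In this sketch, `greedy_select(data, budget)` returns row indices of `data`, and `assign_to_neighborhoods(indices, oracle)` returns the resulting neighborhoods together with the number of pairwise queries spent. The greedy search recomputes kernel matrices at every step, so it is only practical for small datasets.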

Original language: English
Title of host publication: Database and Expert Systems Applications - 25th International Conference, DEXA 2014, Proceedings
Publisher: Springer Verlag
Pages: 438-445
Number of pages: 8
Edition: PART 1
ISBN (Print): 9783319100722
DOIs
Publication status: Published - 2014
Event: 25th International Conference on Database and Expert Systems Applications, DEXA 2014 - Munich, Germany
Duration: 1 Sept 2014 → 4 Sept 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 8644 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on Database and Expert Systems Applications, DEXA 2014
Country/Territory: Germany
City: Munich
Period: 1/09/14 → 4/09/14

Keywords

  • Semi-supervised clustering
  • active learning
  • pairwise constraints
