A discretization algorithm of numerical attributes for digital library evaluation based on data mining technology

Yumin Zhao*, Zhendong Niu, Xueping Peng, Lin Dai

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

We present a discretization algorithm for the numerical attributes of digital collections. In our research, data mining technology is introduced into digital library evaluation to provide better decision-making support. However, data prediction algorithms perform poorly when the data are discretized with traditional methods during the data mining process. The reason is that the numerical attributes of digital collections are complex and do not share a common scale of distribution distance. We study the characteristics of these numerical attributes and propose a discretization method based on the Z-score concept from mathematical statistics. The algorithm reflects the dynamic semantic distance of different numerical attributes and significantly improves the precision and recall of data prediction algorithms. Furthermore, a 'nonlinear conditional relationship' among attributes of digital collections was discovered while studying the discretization algorithm; this relationship affects the results that traditional data mining algorithms achieve in practice.
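
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of Z-score-based discretization, not the authors' method. It standardizes each attribute by its own mean and standard deviation and then bins values by how many standard deviations they lie from the mean, so attributes on very different scales map to comparable bins. The function name `zscore_discretize`, the cut points, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore_discretize(values, cut_points=(-1.0, 1.0)):
    """Map each value to an integer bin index based on its Z-score.

    cut_points are hypothetical thresholds in standard-deviation units;
    they are illustrative and not taken from the paper.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values identical: treat every Z-score as 0.
        z_scores = [0.0 for _ in values]
    else:
        z_scores = [(v - mu) / sigma for v in values]
    # Bin index = number of cut points the Z-score is at or above.
    return [sum(1 for c in cut_points if z >= c) for z in z_scores]


# Two made-up attributes on very different scales.
page_counts = [10, 20, 30, 40, 50]
download_counts = [1000, 2000, 3000, 4000, 5000]

page_bins = zscore_discretize(page_counts)
download_bins = zscore_discretize(download_counts)
```

Because each attribute is scaled by its own spread, the two sample attributes above receive the same bin labels even though their raw values differ by a factor of 100. Fixed-width bins on the raw values would not have this property.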

Original language: English
Title of host publication: Digital Libraries
Subtitle of host publication: For Cultural Heritage, Knowledge Dissemination, and Future Creation - 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011, Proceedings
Pages: 70-76
Number of pages: 7
DOIs
Publication status: Published - 2011
Event: 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011 - Beijing, China
Duration: 24 Oct 2011 to 27 Oct 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7008 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th International Conference on Asia-Pacific Digital Libraries, ICADL 2011
Country/Territory: China
City: Beijing
Period: 24/10/11 to 27/10/11

Keywords

  • Discretization algorithm
  • data mining
  • digital library evaluation
