KNN text categorization algorithm based on semantic centre

Xiao Fei Zhang; He Yan Huang; Ke Liang Zhang

doi:10.1109/ITCS.2009.57

Xiao Fei Zhang^*, He Yan Huang, Ke Liang Zhang

^*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Citations (Scopus)

Abstract

As a classical statistical pattern recognition algorithm characterized by high accuracy and stability, KNN has been widely used in text categorization. However, because KNN's time complexity is directly proportional to the sample size, its classification speed is very slow. In this paper, we propose a new KNN text categorization algorithm based on semantic centre, which we call SKNN, to speed up classification. The basic idea is to replace the large number of original sample documents with a small number of sample semantic centres. Experiments show that SKNN classifies over 10 times as fast as traditional KNN, and that its F1 value is approximately equal to those of SVM and traditional KNN.
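The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how a semantic centre is computed, so the code below makes an assumption: each class is collapsed into a single averaged term-frequency vector, and KNN then runs against those few vectors instead of every training document. The function names (`vectorize`, `cosine`, `build_centers`, `knn_classify`) and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term-frequency vector for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_centers(docs, labels):
    """ASSUMPTION: one 'centre' per class, the mean of its document vectors.
    The paper's actual construction of semantic centres may differ."""
    sums = defaultdict(Counter)
    counts = Counter(labels)
    for doc, label in zip(docs, labels):
        sums[label].update(vectorize(doc))
    return [
        ({t: v / counts[label] for t, v in vec.items()}, label)
        for label, vec in sums.items()
    ]


def knn_classify(query, samples, k=1):
    """Plain KNN: rank (vector, label) samples by cosine similarity
    and take a majority vote among the top k."""
    q = vectorize(query)
    ranked = sorted(samples, key=lambda s: cosine(q, s[0]), reverse=True)[:k]
    votes = Counter(label for _, label in ranked)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only.
docs = [
    "stock market shares fall",
    "bank shares rise market",
    "team wins football match",
    "football team loses final match",
]
labels = ["finance", "finance", "sport", "sport"]

# Four training documents are reduced to two centres, so each query
# needs two similarity computations instead of four.
centers = build_centers(docs, labels)
prediction = knn_classify("market shares today", centers, k=1)
```

In this sketch the cost of classifying one query grows with the number of centres rather than the number of training documents, which is the kind of saving the abstract attributes to SKNN. The speed and F1 figures reported in the abstract come from the paper's experiments and are not reproduced by this example.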

Original language	English
Title of host publication	Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009
Pages	249-252
Number of pages	4
DOIs	https://doi.org/10.1109/ITCS.2009.57
Publication status	Published - 2009
Externally published	Yes
Event	2009 International Conference on Information Technology and Computer Science, ITCS 2009 - Kiev, Ukraine; Duration: 25 Jul 2009 → 26 Jul 2009

Publication series

Name	Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009
Volume	1

Conference

Conference	2009 International Conference on Information Technology and Computer Science, ITCS 2009
Country/Territory	Ukraine
City	Kiev
Period	25/07/09 → 26/07/09

Keywords

KNN
Semantic center
Text categorization

Access to Document

10.1109/ITCS.2009.57

Cite this

Zhang, X. F., Huang, H. Y., & Zhang, K. L. (2009). KNN text categorization algorithm based on semantic centre. In Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009 (pp. 249-252). Article 5190062 (Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009; Vol. 1). https://doi.org/10.1109/ITCS.2009.57

@inproceedings{bd328172fdf647b681cc4ce0a97e8f7e,

title = "KNN text categorization algorithm based on semantic centre",

abstract = "As a classical statistical pattern recognition algorithm characterized by high accuracy and stability, KNN has been widely used in text categorization. However, because KNN's time complexity is directly proportional to the sample size, its classification speed is very slow. In this paper, we propose a new KNN text categorization algorithm based on semantic centre, which we call SKNN, to speed up classification. The basic idea is to replace the large number of original sample documents with a small number of sample semantic centres. Experiments show that SKNN classifies over 10 times as fast as traditional KNN, and that its F1 value is approximately equal to those of SVM and traditional KNN.",

keywords = "KNN, Semantic center, Text categorization",

author = "Zhang, {Xiao Fei} and Huang, {He Yan} and Zhang, {Ke Liang}",

year = "2009",

doi = "10.1109/ITCS.2009.57",

language = "English",

isbn = "9780769536880",

series = "Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009",

pages = "249--252",

booktitle = "Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009",

note = "2009 International Conference on Information Technology and Computer Science, ITCS 2009 ; Conference date: 25-07-2009 Through 26-07-2009",

}

Zhang, XF, Huang, HY & Zhang, KL 2009, KNN text categorization algorithm based on semantic centre. in Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009., 5190062, Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009, vol. 1, pp. 249-252, 2009 International Conference on Information Technology and Computer Science, ITCS 2009, Kiev, Ukraine, 25/07/09. https://doi.org/10.1109/ITCS.2009.57

KNN text categorization algorithm based on semantic centre. / Zhang, Xiao Fei; Huang, He Yan; Zhang, Ke Liang.
Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009. 2009. p. 249-252 5190062 (Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009; Vol. 1).

TY - GEN

T1 - KNN text categorization algorithm based on semantic centre

AU - Zhang, Xiao Fei

AU - Huang, He Yan

AU - Zhang, Ke Liang

PY - 2009

Y1 - 2009

N2 - As a classical statistical pattern recognition algorithm characterized by high accuracy and stability, KNN has been widely used in text categorization. However, because KNN's time complexity is directly proportional to the sample size, its classification speed is very slow. In this paper, we propose a new KNN text categorization algorithm based on semantic centre, which we call SKNN, to speed up classification. The basic idea is to replace the large number of original sample documents with a small number of sample semantic centres. Experiments show that SKNN classifies over 10 times as fast as traditional KNN, and that its F1 value is approximately equal to those of SVM and traditional KNN.

AB - As a classical statistical pattern recognition algorithm characterized by high accuracy and stability, KNN has been widely used in text categorization. However, because KNN's time complexity is directly proportional to the sample size, its classification speed is very slow. In this paper, we propose a new KNN text categorization algorithm based on semantic centre, which we call SKNN, to speed up classification. The basic idea is to replace the large number of original sample documents with a small number of sample semantic centres. Experiments show that SKNN classifies over 10 times as fast as traditional KNN, and that its F1 value is approximately equal to those of SVM and traditional KNN.

KW - KNN

KW - Semantic center

KW - Text categorization

UR - http://www.scopus.com/inward/record.url?scp=71049172088&partnerID=8YFLogxK

U2 - 10.1109/ITCS.2009.57

DO - 10.1109/ITCS.2009.57

M3 - Conference contribution

AN - SCOPUS:71049172088

SN - 9780769536880

T3 - Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009

SP - 249

EP - 252

BT - Proceedings - 2009 International Conference on Information Technology and Computer Science, ITCS 2009

T2 - 2009 International Conference on Information Technology and Computer Science, ITCS 2009

Y2 - 25 July 2009 through 26 July 2009

ER -
