Design of text categorization system based on SVM

Zhenyan Liu*, Weiping Wang, Yong Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

2 Citations (Scopus)

Abstract

This paper presents the design of a text categorization system based on the Support Vector Machine (SVM). It analyzes the high-dimensional nature of text data and explains why SVM is well suited to text categorization. The system is constructed according to its data flow and consists of three subsystems: text representation, classifier training, and text classification. Classifier training is the core of the system, but text representation directly influences the accuracy of the classifier and the performance of the system. The text feature vector space can be built with different feature selection and feature extraction methods. Because no research has established which method is best, the system implements many feature selection and feature extraction methods. For a specific classification task, every feature selection method and every feature extraction method is tested, and the best-performing set of methods is then adopted.
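The text representation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the methods used, so document-frequency ranking stands in for feature selection and TF-IDF weighting stands in for the vector space model (VSM). The function names, the whitespace tokenizer, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for this toy example.
    return text.lower().split()


def select_by_document_frequency(docs, k):
    """Keep the k terms that occur in the most documents.

    Ties are broken alphabetically so the result is deterministic.
    Returns the selected vocabulary and the document-frequency table.
    """
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each term once per document
    ranked = sorted(df, key=lambda term: (-df[term], term))
    return ranked[:k], df


def tfidf_vectors(docs, vocabulary, df):
    """Map each document to a TF-IDF vector over the selected vocabulary."""
    n = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(tokenize(doc))
        # Weight = raw term frequency * log(N / document frequency).
        vectors.append([tf[t] * math.log(n / df[t]) for t in vocabulary])
    return vectors


# Hypothetical toy corpus: two finance documents, two sports documents.
docs = [
    "stock market falls",
    "stock prices rise",
    "team wins match",
    "team loses match",
]
vocabulary, df = select_by_document_frequency(docs, k=3)
vectors = tfidf_vectors(docs, vocabulary, df)
```

In a system of the kind the abstract describes, vectors like these would be passed to the classifier training subsystem, and the selection function would be one of several interchangeable candidates compared on the task at hand.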

Original language: English
Title of host publication: Materials Science and Information Technology II
Pages: 1191-1195
Number of pages: 5
Publication status: Published - 2012
Event: 2012 2nd International Conference on Materials Science and Information Technology, MSIT 2012 - Xi'an, Shaan, China
Duration: 24 Aug 2012 to 26 Aug 2012

Publication series

Name: Advanced Materials Research
Volume: 532-533
ISSN (Print): 1022-6680

Conference

Conference: 2012 2nd International Conference on Materials Science and Information Technology, MSIT 2012
Country/Territory: China
City: Xi'an, Shaan
Period: 24/08/12 to 26/08/12

Keywords

  • Feature extraction
  • Feature selection
  • SVM
  • Text categorization
  • VSM
