A domain-specific Chinese term extraction method based on prefix and suffix

Dongmei Li*, Qinglin Wang, Yuan Li, Qian Peng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Term recognition and extraction are the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the testing corpus is segmented to obtain statistics on the words adjacent to these prefixes and suffixes, and the frequency information of each word is used to judge whether the word combined with a prefix or suffix forms a candidate term. Third, the initial candidate term set is enlarged by frequency judgment. Finally, the candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.
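The first two steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes seed terms and corpus sentences are already segmented into tokens, treats the first and last token of a seed term as its prefix and suffix, and uses a simple count threshold on the joined pair in place of the paper's frequency judgment. The candidate-set expansion and co-occurrence filtering steps are omitted. The function names, thresholds, and sample data are all hypothetical.

```python
from collections import Counter


def common_affixes(seed_terms, min_count=2):
    """Collect first/last tokens that recur across multi-token seed terms.

    Assumption: each seed term is a tuple of pre-segmented tokens.
    """
    prefixes = Counter(t[0] for t in seed_terms if len(t) > 1)
    suffixes = Counter(t[-1] for t in seed_terms if len(t) > 1)

    def keep(counter):
        return {tok for tok, n in counter.items() if n >= min_count}

    return keep(prefixes), keep(suffixes)


def candidate_terms(sentences, prefixes, suffixes, min_pair_count=2):
    """Join a word with an adjacent prefix/suffix and keep frequent pairs.

    Assumption: a plain count threshold stands in for the paper's
    frequency-based judgment, which the abstract does not specify.
    """
    pairs = Counter()
    for tokens in sentences:
        for left, right in zip(tokens, tokens[1:]):
            # A pair qualifies if it starts with a known prefix
            # or ends with a known suffix.
            if left in prefixes or right in suffixes:
                pairs[left + right] += 1
    return {term for term, n in pairs.items() if n >= min_pair_count}


# Hypothetical, hand-segmented sample data.
seed_terms = [
    ("操作", "系统"), ("文件", "系统"), ("数据库", "系统"),
    ("非", "线性"), ("非", "对称"),
]
sentences = [
    ["嵌入式", "系统", "很", "常见"],
    ["嵌入式", "系统", "的", "设计"],
    ["非", "平衡", "状态"],
    ["检索", "系统"],
]

prefixes, suffixes = common_affixes(seed_terms)
candidates = candidate_terms(sentences, prefixes, suffixes)
```

With this sample data, only pairs seen at least twice next to a recurring affix survive the threshold; lowering `min_pair_count` to 1 would admit every pair that touches a known prefix or suffix.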

Original language: English
Title of host publication: Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012
Pages: 1356-1359
Number of pages: 4
Publication status: Published - 2012
Event: 2012 International Conference on Computer Science and Service System, CSSS 2012 - Nanjing, China
Duration: 11 Aug 2012 → 13 Aug 2012

Publication series

Name: Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012

Conference

Conference: 2012 International Conference on Computer Science and Service System, CSSS 2012
Country/Territory: China
City: Nanjing
Period: 11/08/12 → 13/08/12

Keywords

  • co-occurrence analysis
  • domain-specific term
  • term extraction
  • term recognition
