A domain-specific Chinese term extraction method based on prefix and suffix

Dongmei Li; Qinglin Wang; Yuan Li; Qian Peng

doi:10.1109/CSSS.2012.342

Dongmei Li^*, Qinglin Wang, Yuan Li, Qian Peng

^*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Term recognition and extraction are the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the test corpus is segmented to obtain statistics on the words adjacent to those prefixes and suffixes, and each combination of a word with a prefix or suffix is judged to be a candidate term or not according to the word's frequency information. Third, the initial candidate term set is enlarged by frequency judgment. Finally, candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.
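The abstract gives only an outline of the pipeline, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It assumes the seed terms and corpus are already word-segmented, treats the first and last word of a seed term as its prefix and suffix, and uses a simple ratio (pair count divided by the count of the adjacent word) as a stand-in for the co-occurrence analysis. The candidate-set enlargement step is omitted because the abstract does not specify it. All function names, thresholds, and the toy data are hypothetical.

```python
from collections import Counter

def common_affixes(seed_terms, min_count=2):
    """Collect first/last words that recur across segmented seed terms."""
    prefixes = Counter(term[0] for term in seed_terms if len(term) > 1)
    suffixes = Counter(term[-1] for term in seed_terms if len(term) > 1)
    keep = lambda counts: {w for w, c in counts.items() if c >= min_count}
    return keep(prefixes), keep(suffixes)

def extract_terms(seed_terms, corpus, min_affix_count=2,
                  min_pair_count=2, min_ratio=0.6):
    """Return candidate terms formed by a word next to a common affix."""
    prefixes, suffixes = common_affixes(seed_terms, min_affix_count)
    word_counts = Counter(w for sent in corpus for w in sent)

    # Count adjacent pairs that start with a prefix or end with a suffix.
    pair_counts = Counter()
    for sent in corpus:
        for left, right in zip(sent, sent[1:]):
            if left in prefixes or right in suffixes:
                pair_counts[(left, right)] += 1

    terms = []
    for (left, right), n in pair_counts.items():
        if n < min_pair_count:          # frequency judgment (assumed form)
            continue
        word = right if left in prefixes else left
        # Assumed co-occurrence score: how often the word appears
        # together with the affix, relative to all its occurrences.
        if n / word_counts[word] >= min_ratio:
            terms.append(left + right)
    return sorted(terms)

# Toy, hand-segmented data (hypothetical, for illustration only).
SEED_TERMS = [["控制", "系统"], ["操作", "系统"], ["导航", "系统"],
              ["自动", "控制"], ["自动", "驾驶"]]
CORPUS = [
    ["我们", "设计", "了", "伺服", "系统"],
    ["伺服", "系统", "的", "性能", "很", "好"],
    ["这个", "系统", "使用", "自动", "检测"],
    ["自动", "检测", "可以", "提高", "效率"],
    ["这个", "方法", "很", "好"],
]

terms = extract_terms(SEED_TERMS, CORPUS)
print(terms)
```

In this toy run, "系统" is kept as a common suffix and "自动" as a common prefix, so pairs such as "伺服" + "系统" become candidates. Lowering `min_pair_count` to 1 admits "这个系统" as a candidate, which the ratio check then rejects because "这个" also occurs away from the suffix.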

Original language	English
Title of host publication	Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012
Pages	1356-1359
Number of pages	4
DOIs	https://doi.org/10.1109/CSSS.2012.342
Publication status	Published - 2012
Event	2012 International Conference on Computer Science and Service System, CSSS 2012 - Nanjing, China; Duration: 11 Aug 2012 → 13 Aug 2012

Publication series

Name	Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012

Conference

Conference	2012 International Conference on Computer Science and Service System, CSSS 2012
Country/Territory	China
City	Nanjing
Period	11/08/12 → 13/08/12

Keywords

co-occurrence analysis
domain-specific term
term extraction
term recognition

Access to Document

10.1109/CSSS.2012.342

Cite this

Li, D., Wang, Q., Li, Y., & Peng, Q. (2012). A domain-specific Chinese term extraction method based on prefix and suffix. In Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012 (pp. 1356-1359). Article 6394580 (Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012). https://doi.org/10.1109/CSSS.2012.342

@inproceedings{6111349a9ecf4ce2914e78d98ac3f994,

title = "A domain-specific Chinese term extraction method based on prefix and suffix",

abstract = "Term recognition and extraction are the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the test corpus is segmented to obtain statistics on the words adjacent to those prefixes and suffixes, and each combination of a word with a prefix or suffix is judged to be a candidate term or not according to the word's frequency information. Third, the initial candidate term set is enlarged by frequency judgment. Finally, candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.",

keywords = "co-occurrence analysis, domain-specific term, term extraction, term recognition",

author = "Dongmei Li and Qinglin Wang and Yuan Li and Qian Peng",

year = "2012",

doi = "10.1109/CSSS.2012.342",

language = "English",

isbn = "9780769547190",

series = "Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012",

pages = "1356--1359",

booktitle = "Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012",

note = "2012 International Conference on Computer Science and Service System, CSSS 2012 ; Conference date: 11-08-2012 Through 13-08-2012",

}

Li, D, Wang, Q, Li, Y & Peng, Q 2012, A domain-specific Chinese term extraction method based on prefix and suffix. in Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012, 6394580, Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012, pp. 1356-1359, 2012 International Conference on Computer Science and Service System, CSSS 2012, Nanjing, China, 11/08/12. https://doi.org/10.1109/CSSS.2012.342

A domain-specific Chinese term extraction method based on prefix and suffix. / Li, Dongmei; Wang, Qinglin; Li, Yuan et al.
Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012. 2012. p. 1356-1359 6394580 (Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012).

TY - GEN

T1 - A domain-specific Chinese term extraction method based on prefix and suffix

AU - Li, Dongmei

AU - Wang, Qinglin

AU - Li, Yuan

AU - Peng, Qian

PY - 2012

Y1 - 2012

AB - Term recognition and extraction are the foundation of text information processing. This paper presents a domain-specific Chinese term extraction method based on prefixes and suffixes. First, commonly used prefixes and suffixes are extracted from a given set of seed terms. Second, the test corpus is segmented to obtain statistics on the words adjacent to those prefixes and suffixes, and each combination of a word with a prefix or suffix is judged to be a candidate term or not according to the word's frequency information. Third, the initial candidate term set is enlarged by frequency judgment. Finally, candidate terms are filtered by co-occurrence analysis. Experiments show that terms with common prefixes and suffixes can be extracted well.

KW - co-occurrence analysis

KW - domain-specific term

KW - term extraction

KW - term recognition

UR - http://www.scopus.com/inward/record.url?scp=84873849541&partnerID=8YFLogxK

U2 - 10.1109/CSSS.2012.342

DO - 10.1109/CSSS.2012.342

M3 - Conference contribution

AN - SCOPUS:84873849541

SN - 9780769547190

T3 - Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012

SP - 1356

EP - 1359

BT - Proceedings - 2012 International Conference on Computer Science and Service System, CSSS 2012

T2 - 2012 International Conference on Computer Science and Service System, CSSS 2012

Y2 - 11 August 2012 through 13 August 2012

ER -
