Application of Conditional Random Fields model in Unknown Words Identification

Hai Jun Zhang*, Wei Min Pan, Shu Min Shi, Chao Yong Zhu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

4 Citations (Scopus)

Abstract

This paper proposes a method for Unknown Words Identification (UWI) based on repeats. To ground the identification of unknown words in sound theory, we put forward a formal model of the UWI process, which gives theoretical guidance on the selection of features used in UWI. We propose employing the Conditional Random Fields (CRF) model as the statistical framework for solving this formal model. Under this framework, UWI becomes the process of finding effective features that capture the essential properties of unknown words. The experiments show that the method is effective and that a reasonable combination of the features used in the CRF can markedly improve UWI results. The final F-scores of this method are 47.81% in the open test and 69.83% in word extraction, which are better than the best results reported in previous work.

Original language: English
Title of host publication: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Pages: 1839-1843
Number of pages: 5
Publication status: Published - 2010
Event: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010 - Qingdao, China
Duration: 11 Jul 2010 to 14 Jul 2010

Publication series

Name: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Volume: 4

Conference

Conference: 2010 International Conference on Machine Learning and Cybernetics, ICMLC 2010
Country/Territory: China
City: Qingdao
Period: 11 Jul 2010 to 14 Jul 2010
