A method of part-of-speech guessing of Chinese unknown words based on combined features

Hai Jun Zhang*, Shu Min Shi, Chong Feng, He Yan Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

Part-of-speech (POS) guessing for unknown words is an essential phase of unknown word identification. This paper applies combined features, namely both external and internal features, to POS guessing of Chinese unknown words under a Conditional Random Field (CRF) model. To achieve high precision in POS guessing, the paper proposes integrating the Chinese radical, as a new internal feature of Chinese characters, into the existing feature set. Experiments show that combined features are effective for POS guessing and that the new feature significantly improves performance (precision of up to 94.67%). The results also show that the Chinese radical, as an effective internal feature in lexical analysis, has some practical value.
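As a rough illustration of what "combined features" could look like in practice, the sketch below builds one feature dictionary per word from internal features (characters and their radicals) and external features (neighbouring words). It is not the authors' implementation: the function names, the feature names, and the tiny `RADICALS` table are all assumptions made for this example, and the abstract does not specify the paper's actual feature templates.

```python
from typing import Dict, List

# Hypothetical toy radical table (an assumption for illustration only);
# a real system would need a full character-to-radical dictionary.
RADICALS: Dict[str, str] = {
    "河": "氵", "湖": "氵",
    "树": "木", "林": "木",
    "打": "扌", "拉": "扌",
}


def internal_features(word: str) -> Dict[str, str]:
    """Features drawn from the word itself: length, boundary characters, radicals."""
    if not word:
        raise ValueError("word must be non-empty")
    return {
        "length": str(len(word)),
        "first_char": word[0],
        "last_char": word[-1],
        # Characters missing from the toy table map to a placeholder value.
        "first_radical": RADICALS.get(word[0], "UNK"),
        "last_radical": RADICALS.get(word[-1], "UNK"),
    }


def external_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Features drawn from the context: the previous and next words."""
    return {
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i + 1 < len(tokens) else "<EOS>",
    }


def combined_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Merge internal and external features for the word at position i."""
    feats = internal_features(tokens[i])
    feats.update(external_features(tokens, i))
    return feats
```

Feature dictionaries of this shape, one per token in a sentence, are the kind of input that sequence-labelling toolkits for CRFs commonly accept; the training step itself is omitted here.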

Original language: English
Title of host publication: Proceedings of the 2009 International Conference on Machine Learning and Cybernetics
Pages: 328-332
Number of pages: 5
Publication status: Published - 2009
Event: 2009 International Conference on Machine Learning and Cybernetics - Baoding, China
Duration: 12 Jul 2009 to 15 Jul 2009

Publication series

Name: Proceedings of the 2009 International Conference on Machine Learning and Cybernetics
Volume: 1

Conference

Conference: 2009 International Conference on Machine Learning and Cybernetics
Country/Territory: China
City: Baoding
Period: 12/07/09 to 15/07/09
