TY - GEN
T1 - A method of part-of-speech guessing of Chinese unknown words based on combined features
AU - Zhang, Hai Jun
AU - Shi, Shu Min
AU - Feng, Chong
AU - Huang, He Yan
PY - 2009
Y1 - 2009
N2 - Part-of-speech (POS) guessing of unknown words is an essential phase of unknown word identification. This paper applies combined features (both external and internal) to POS guessing of Chinese unknown words under a Conditional Random Field (CRF) model. To achieve high precision in POS guessing, the paper integrates the Chinese radical, as a new internal feature of Chinese characters, into the existing feature set. Experiments show that combined features are effective for POS guessing and that the new feature significantly improves performance (precision of up to 94.67%). The results also show that the Chinese radical is an effective internal feature for lexical analysis and has practical value.
AB - Part-of-speech (POS) guessing of unknown words is an essential phase of unknown word identification. This paper applies combined features (both external and internal) to POS guessing of Chinese unknown words under a Conditional Random Field (CRF) model. To achieve high precision in POS guessing, the paper integrates the Chinese radical, as a new internal feature of Chinese characters, into the existing feature set. Experiments show that combined features are effective for POS guessing and that the new feature significantly improves performance (precision of up to 94.67%). The results also show that the Chinese radical is an effective internal feature for lexical analysis and has practical value.
KW - CRF
KW - Chinese word segmentation
KW - POS guessing
KW - Unknown words
UR - http://www.scopus.com/inward/record.url?scp=70350728138&partnerID=8YFLogxK
U2 - 10.1109/ICMLC.2009.5212477
DO - 10.1109/ICMLC.2009.5212477
M3 - Conference contribution
AN - SCOPUS:70350728138
SN - 9781424437030
T3 - Proceedings of the 2009 International Conference on Machine Learning and Cybernetics
SP - 328
EP - 332
BT - Proceedings of the 2009 International Conference on Machine Learning and Cybernetics
T2 - 2009 International Conference on Machine Learning and Cybernetics
Y2 - 12 July 2009 through 15 July 2009
ER -