The application of CRFs in part-of-speech tagging

Xiaofei Zhang*, Heyan Huang, Liang Zhang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

6 Citations (Scopus)

Abstract

Conditional random fields (CRFs) for sequence labeling offer advantages over both generative models such as the hidden Markov model (HMM) and classifiers applied independently at each sequence position. First, CRFs are not bound by the independence assumption and can therefore depend on arbitrary, non-independent features without modeling the distribution of those dependencies. Because CRF models can flexibly use a wide variety of features, the training-data sparsity problem can be effectively resolved. Moreover, parameter estimation for CRFs is global, which effectively resolves the label bias problem. In this paper, CRFs with Gaussian prior smoothing are used for part-of-speech (POS) tagging. Experiments show that the POS tagging error rate is reduced by 55.17% in the closed test and 43.64% in the open test relative to an HMM-based baseline, yielding accuracies of 98.05% in the closed test and 95.79% in the open test. These results confirm the superior performance of CRFs.
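The abstract reports relative error reductions together with the resulting CRF accuracies, but it does not state the HMM baseline figures. As a rough sketch, assuming that "error rate" means 100% minus tagging accuracy and that the reductions are relative to the baseline error, the implied baseline can be backed out as follows. The function name `baseline_error` and the dictionary layout are illustrative choices, not taken from the paper.

```python
def baseline_error(crf_accuracy_pct: float, relative_reduction_pct: float) -> float:
    """Return the baseline error rate (in percent) implied by a relative error reduction.

    Assumes: crf_error = baseline_error * (1 - reduction), with error = 100 - accuracy.
    """
    crf_error = 100.0 - crf_accuracy_pct
    return crf_error / (1.0 - relative_reduction_pct / 100.0)


# (CRF accuracy %, relative error reduction %) as reported in the abstract.
reported = {"closed": (98.05, 55.17), "open": (95.79, 43.64)}

# Implied (not reported) HMM baseline error rates, in percent.
implied = {name: baseline_error(acc, red) for name, (acc, red) in reported.items()}

for name, err in implied.items():
    print(f"{name} test: implied HMM error ~{err:.2f}%, accuracy ~{100.0 - err:.2f}%")
```

Under these assumptions the implied HMM error is roughly 4.35% in the closed test and 7.47% in the open test. These values are derived from the abstract's figures and are not reported by the authors.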

Original language: English
Title of host publication: 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2009
Pages: 347-350
Number of pages: 4
DOI
Publication status: Published - 2009
Event: 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2009 - Hangzhou, Zhejiang, China
Duration: 26 Aug 2009 to 27 Aug 2009

Publication series

Name: 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2009
2

Conference

Conference: 2009 International Conference on Intelligent Human-Machine Systems and Cybernetics, IHMSC 2009
Country/Territory: China
Hangzhou, Zhejiang
Period: 26/08/09 to 27/08/09
