Abstract
Maximal-length noun phrase identification is important for machine translation and many other natural language processing tasks. To study Chinese maximal-length noun phrases, we build on current methods and, starting from the linguistic particularities of Chinese and the characteristics of sequence labeling algorithms based on support vector machines (SVMs), explore the adaptability of a combination algorithm based on hybrid features. Theoretical analysis and experimental results show that labeling over hybrid units of words and base chunks is effective for identifying Chinese maximal-length noun phrases, and that the labeling results from the two directions are complementary. On this basis, a combination algorithm for bi-directional labeling based on the "boundary fork" can exploit the complementarity of the two directions' identification results and achieve high combination accuracy.
Original language | English |
---|---|
Pages (from-to) | 1274-1282 |
Number of pages | 9 |
Journal | Zidonghua Xuebao/Acta Automatica Sinica |
Volume | 41 |
Issue number | 7 |
Publication status | Published - 1 Jul 2015 |
Keywords
- Base chunk
- Bi-directional labeling
- Hybrid feature
- Maximal-length noun phrase