Neural Chinese word segmentation as sequence to sequence translation

Xuewen Shi, Heyan Huang, Ping Jian*, Yuhang Guo, Xiaochi Wei, Yi Kun Tang

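*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

6 Citations (Scopus)

Abstract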

Recently, Chinese word segmentation (CWS) methods using neural networks have made impressive progress. Most of them treat CWS as a sequence labeling problem and build models on local features rather than considering global information from the input sequence. In this paper, we cast CWS as a sequence translation problem and propose a novel sequence-to-sequence CWS model with an attention-based encoder-decoder framework. The model captures global information from the input and directly outputs the segmented sequence. It can also tackle other NLP tasks jointly with CWS in an end-to-end manner. Experiments on the Weibo, PKU and MSRA benchmark datasets show that our approach achieves competitive performance compared with state-of-the-art methods. We also successfully applied the proposed model to jointly learning CWS and Chinese spelling correction, which demonstrates its applicability to multi-task fusion.
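
The contrast the abstract draws between sequence labeling and sequence translation can be illustrated with a small data-preparation sketch. It is not taken from the paper: the boundary token `SEP`, the BMES tag scheme, and all function names are assumptions chosen for illustration, and the authors' actual target format may differ. The sketch only shows how one gold-segmented sentence could yield either per-character tags or a source/target pair for an encoder-decoder model.

```python
from typing import List, Tuple

# Hypothetical word-boundary token; the paper's real target vocabulary is not given here.
SEP = "</w>"


def to_bmes_tags(words: List[str]) -> List[str]:
    """Sequence-labeling view: one B/M/E/S tag per character."""
    tags: List[str] = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # Begin, zero or more Middle, End
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def to_translation_pair(words: List[str]) -> Tuple[List[str], List[str]]:
    """Sequence-translation view: raw characters in, segmented sequence out."""
    source = [ch for word in words for ch in word]
    target: List[str] = []
    for word in words:
        target.extend(word)   # copy the characters of the word
        target.append(SEP)    # mark the end of the word
    return source, target


def decode_segmentation(target: List[str]) -> List[str]:
    """Recover the word list from a generated target sequence."""
    words: List[str] = []
    current: List[str] = []
    for token in target:
        if token == SEP:
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(token)
    if current:  # tolerate a missing final boundary token
        words.append("".join(current))
    return words


gold_words = ["我", "喜欢", "北京"]
tags = to_bmes_tags(gold_words)
source, target = to_translation_pair(gold_words)
recovered = decode_segmentation(target)
```

Because the decoder in this framing emits characters as well as boundaries, its output need not copy the input exactly, which is what makes a joint task such as spelling correction expressible in the same target sequence.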

Original language: English
Title of host publication: Social Media Processing - 6th National Conference, SMP 2017, Proceedings
Editors: Huan Liu, Xing Xie, Xueqi Cheng, Huawei Shen, Weiying Ma, Shizheng Feng
Publisher: Springer Verlag
Pages: 91-103
Number of pages: 13
ISBN (Print): 9789811068041
DOI
Publication status: Published - 2017
Event: 6th National Conference on Social Media Processing, SMP 2017 - Beijing, China
Duration: 14 Sep 2017 to 17 Sep 2017

Publication series

Name: Communications in Computer and Information Science
Volume: 774
ISSN (Print): 1865-0929

Conference

Conference: 6th National Conference on Social Media Processing, SMP 2017
Country/Territory: China
City: Beijing
Period: 14/09/17 to 17/09/17
