Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

Heng Da Xu, Zhongli Li, Qingyu Zhou*, Chao Li, Zizhen Wang, Yunbo Cao, Heyan Huang, Xian Ling Mao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

55 Citations (Scopus)

Abstract

Chinese Spell Checking (CSC) aims to detect and correct erroneous characters in user-generated Chinese text. Most Chinese spelling errors arise from misusing semantically, phonetically, or graphically similar characters. Previous attempts have noticed this phenomenon and tried to utilize the similarity relationship for this task. However, these methods use either heuristics or handcrafted confusion sets to predict the correct character. In this paper, we propose a Chinese spell checker called REALISE that directly leverages the multimodal information of Chinese characters. The REALISE model tackles the CSC task by (1) capturing the semantic, phonetic, and graphic information of the input characters, and (2) selectively mixing the information from these modalities to predict the correct output. Experiments on the SIGHAN benchmarks show that the proposed model outperforms strong baselines by a large margin.

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL-IJCNLP 2021
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics (ACL)
Pages: 716-728
Number of pages: 13
ISBN (electronic): 9781954085541
Publication status: Published - 2021
Event: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 2021 - 6 Aug 2021

Publication series

Name: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Conference

Conference: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Location: Virtual, Online
Period: 1/08/21 - 6/08/21

