Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking

Heng Da Xu, Zhongli Li, Qingyu Zhou*, Chao Li, Zizhen Wang, Yunbo Cao, Heyan Huang, Xian Ling Mao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

55 Citations (Scopus)

Abstract

Chinese Spell Checking (CSC) aims to detect and correct erroneous characters in user-generated Chinese text. Most Chinese spelling errors arise from misusing semantically, phonetically, or graphically similar characters. Previous attempts have noticed this phenomenon and tried to utilize the similarity relationship for this task. However, these methods use either heuristics or handcrafted confusion sets to predict the correct character. In this paper, we propose a Chinese spell checker called REALISE that directly leverages the multimodal information of Chinese characters. The REALISE model tackles the CSC task by (1) capturing the semantic, phonetic, and graphic information of the input characters, and (2) selectively mixing the information from these modalities to predict the correct output. Experiments on the SIGHAN benchmarks show that the proposed model outperforms strong baselines by a large margin.

Original language: English
Title of host publication: Findings of the Association for Computational Linguistics
Subtitle of host publication: ACL-IJCNLP 2021
Editors: Chengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
Publisher: Association for Computational Linguistics (ACL)
Pages: 716-728
Number of pages: 13
ISBN (electronic): 9781954085541
Publication status: Published - 2021
Event: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 2021 - 6 Aug 2021

Publication series

Name: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Conference

Conference: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Location: Virtual, Online
Period: 1/08/21 - 6/08/21

