DRMSpell: dynamically reweighting multimodality for Chinese spelling correction

Yinghao Li, Heyan Huang, Baojun Wang, Yang Gao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Chinese spelling correction (CSC) aims to detect and correct spelling errors in Chinese text. The Chinese language is highly complex: its phonetic representations, known as pinyin, carry distinct tonal variations, and the same pinyin can correspond to many different characters. This complexity makes CSC important for ensuring the accuracy and clarity of written communication. Recent research has incorporated external knowledge into models through phonological and visual modalities. However, these methods do not effectively target the use of modality information at the different types of errors. In this paper, we propose DRMSpell, a multimodal pretrained language model for CSC that takes the interaction between modalities into account. A dynamically reweighting multimodality (DRM) module is introduced to reweight the modalities and obtain more multimodal information. To make full use of this information and further strengthen the model, an independent-modality masking strategy (IMS) is proposed to independently mask the three modalities of a token in the pretraining stage. Our method achieves state-of-the-art performance on most metrics of widely used benchmarks. The experiments demonstrate that our method can model the interactive information between modalities and is robust to incorrect modal information.

Translated title of the contribution: DRMSpell: 中文拼写纠正中的动态多模态重新加权技术
Original language: English
Pages (from-to): 354-366
Number of pages: 13
Journal: Frontiers of Information Technology and Electronic Engineering
Volume: 26
Issue number: 3
Publication status: Published - Mar 2025

Keywords

  • Chinese spelling correction
  • Masking strategy
  • Multimodality
  • TP391.1
