VisPhone: Chinese named entity recognition model enhanced by visual and phonetic features

Baohua Zhang, Jiahao Cai, Huaping Zhang*, Jianyun Shang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

22 Citations (Scopus)

Abstract

Many Chinese NER models focus only on lexical and radical information, ignoring the fact that the pronunciation of Chinese entities also follows certain rules. In this paper, we propose VisPhone, which incorporates the phonetic features of Chinese characters into a Transformer encoder along with lattice and visual features. We present the common rules for the pronunciation of Chinese entities and explore the most appropriate method to encode them. VisPhone uses two identical cross transformer encoders to fuse the visual and phonetic features of the input characters with the text embedding. A selective fusion module then produces the final features. We conducted experiments on four well-known Chinese NER benchmark datasets, OntoNotes4.0, MSRA, Resume, and Weibo, achieving F1 scores of 82.63%, 96.07%, 96.26%, and 70.79%, respectively, which are improvements of 0.79%, 0.32%, 0.39%, and 3.47%. Our ablation experiments also demonstrate the effectiveness of VisPhone.
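The abstract does not describe the internals of the selective fusion module. As a rough illustration only, the sketch below shows one common way to fuse two feature vectors selectively: a per-dimension sigmoid gate that decides how much of the visual-side feature and how much of the phonetic-side feature to keep. The function name `gated_fusion`, the gate form, and the parameters `w_visual`, `w_phonetic`, and `bias` are assumptions made for this example and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(visual, phonetic, w_visual, w_phonetic, bias):
    """Blend two equal-length feature vectors with a per-dimension gate.

    Hypothetical illustration, not the paper's module:
        gate_i  = sigmoid(w_visual_i * v_i + w_phonetic_i * p_i + bias_i)
        fused_i = gate_i * v_i + (1 - gate_i) * p_i
    """
    lengths = {len(visual), len(phonetic), len(w_visual), len(w_phonetic), len(bias)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")

    fused = []
    for v, p, wv, wp, b in zip(visual, phonetic, w_visual, w_phonetic, bias):
        gate = sigmoid(wv * v + wp * p + b)         # how much to trust the visual side
        fused.append(gate * v + (1.0 - gate) * p)   # convex combination of the two
    return fused


if __name__ == "__main__":
    # Toy 2-dimensional features for a single character (made-up numbers).
    visual_feat = [1.0, 2.0]
    phonetic_feat = [3.0, 4.0]
    zeros = [0.0, 0.0]
    # With all-zero parameters the gate is 0.5, so the result is the plain average.
    print(gated_fusion(visual_feat, phonetic_feat, zeros, zeros, zeros))
```

In a trained model the gate parameters would be learned, and the inputs would be the outputs of the two cross transformer encoders rather than hand-written lists. The gate makes the mix input-dependent, which a fixed average or a plain concatenation would not.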

Original language: English
Article number: 103314
Journal: Information Processing and Management
Volume: 60
Issue number: 3
Publication status: Published - May 2023

Keywords

  • Chinese NER
  • Cross transformer
  • Phonetic information
  • Selective fusion
  • Visual information
