Multi-granularity interaction model based on pinyins and radicals for Chinese semantic matching

Pengyu Zhao, Wenpeng Lu*, Shoujin Wang, Xueping Peng, Ping Jian, Hao Wu, Weiyu Zhang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Semantic matching plays a critical role in many downstream tasks of natural language processing. Existing semantic matching methods focus on learning sentence semantic features at the character and word granularities and neglect characteristics specific to Chinese, e.g., pinyins and radicals. However, both pinyins and radicals carry rich semantic information that can enhance Chinese sentence representations. In this paper, we propose a multi-granularity interaction model based on pinyins and radicals (MIPR) for Chinese semantic matching. MIPR first employs an input encoding layer to incorporate information at the character, word, pinyin and radical granularities. It then uses a soft-alignment attention mechanism in a multi-granularity interaction layer to capture interaction features across the different granularities and sentences. A feature aggregation layer merges these interaction features into the final matching representation, and a prediction layer judges the matching degree of the input sentence pair. Extensive experiments on two public Chinese datasets demonstrate that MIPR achieves significant improvements over the compared models and performance comparable to a BERT-based model on the Chinese semantic matching task.
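
For readers unfamiliar with soft-alignment attention, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes dot-product similarity between token vectors and plain Python lists as inputs; the function names `softmax`, `dot` and `soft_align` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def soft_align(a, b):
    """Soft-align two sequences of vectors (assumed dot-product scoring).

    a, b: non-empty lists of vectors with the same dimension.
    Returns (a_aligned, b_aligned): each position in one sequence is
    replaced by an attention-weighted sum over the other sequence.
    """
    dim = len(a[0])
    # scores[i][j] is the similarity between a[i] and b[j].
    scores = [[dot(ai, bj) for bj in b] for ai in a]

    # For each a[i], normalize over positions of b and summarize b.
    a_aligned = []
    for i in range(len(a)):
        w = softmax(scores[i])
        a_aligned.append(
            [sum(w[j] * b[j][d] for j in range(len(b))) for d in range(dim)]
        )

    # For each b[j], normalize over positions of a and summarize a.
    b_aligned = []
    for j in range(len(b)):
        w = softmax([scores[i][j] for i in range(len(a))])
        b_aligned.append(
            [sum(w[i] * a[i][d] for i in range(len(a))) for d in range(dim)]
        )
    return a_aligned, b_aligned
```

In a multi-granularity setting, a function like this could in principle be applied to any pair of encoded sequences (for example, character and pinyin encodings of the two sentences); which pairs MIPR actually aligns is not specified in the abstract.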

Original language: English
Pages (from-to): 1703-1723
Number of pages: 21
Journal: World Wide Web
Volume: 25
Issue number: 4
Publication status: Published - Jul 2022

Keywords

  • Semantic matching
  • multi-granularity interaction
  • pinyin
  • radical
  • soft-alignment attention
