Exploring Conditional Variational Mechanism to Pinyin Input Method for Addressing One-to-Many Mappings in Low-Resource Scenarios

Bin Sun, Jianfeng Li, Hao Zhou, Fandong Meng, Kan Li*, Jie Zhou

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

A pinyin input method engine (IME) converts pinyin sequences into Chinese characters and is widely used in mobile phone applications. Because of homophones, a pinyin IME faces a one-to-many mapping problem when converting pinyin sequences into Chinese characters. To address this issue, this paper makes the first exploration of an effective conditional variational mechanism (CVM) for pinyin IME. However, to keep a pinyin IME running stably and smoothly under low-resource conditions (e.g., on offline mobile devices), CVM must balance diversity, accuracy, and efficiency, which remains challenging. To this end, we employ a novel strategy that reduces the complexity of semantic encoding by facilitating interaction between pinyin and Chinese character information during the construction of continuous latent variables. In addition, discrete latent variables are used to improve the accuracy of the results. Experimental results demonstrate the superior performance of our method.

Original language: English
Title of host publication: Short Papers
Editors: Lun-Wei Ku, Andre F. T. Martins, Vivek Srikumar
Publisher: Association for Computational Linguistics (ACL)
Pages: 616-629
Number of pages: 14
ISBN (Electronic): 9798891760950
Publication status: Published - 2024
Event: 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Bangkok, Thailand
Duration: 11 Aug 2024 to 16 Aug 2024

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume: 2
ISSN (Print): 0736-587X

Conference

Conference: 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Country/Territory: Thailand
City: Bangkok
Period: 11/08/24 to 16/08/24
