TY - JOUR
T1 - RS-BERT: Pre-training radical enhanced sense embedding for Chinese word sense disambiguation
AU - Zhou, Xiaofeng
AU - Huang, Heyan
AU - Chi, Zewen
AU - Ren, Mucheng
AU - Gao, Yang
N1 - Publisher Copyright:
© 2024
PY - 2024/7
Y1 - 2024/7
AB - Word sense disambiguation is a crucial task for testing whether a model can perform deep language understanding. The rise of pre-trained language models has brought substantial success to such tasks. However, most current pre-training objectives operate at the token level and ignore the linguistic senses of the tokens themselves, so it is questionable whether token-prediction objectives alone are enough to capture polysemy and disambiguate senses. To explore this question, we introduce RS-BERT, a radical-enhanced sense embedding model with a novel pre-training objective, sense-aware language modeling, which adds sense-level information to the model. At each training step, we first predict the senses and then update the model given the predicted senses; during training, we alternate between these two steps in an expectation–maximization manner. We also inject radical information into RS-BERT at the beginning of pre-training. We conduct experiments on two Chinese word sense disambiguation datasets. Experimental results show that RS-BERT is competitive, and when combined with other dedicated adaptations for specific datasets, it shows impressive performance. Moreover, our analysis shows that RS-BERT successfully clusters Chinese characters into distinct senses. These results demonstrate that token-prediction objectives are not enough and that the sense-level objective performs better for polysemy and sense disambiguation.
KW - Language models
KW - Sense embeddings
KW - Word sense disambiguation
UR - http://www.scopus.com/inward/record.url?scp=85191181628&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2024.103740
DO - 10.1016/j.ipm.2024.103740
M3 - Article
AN - SCOPUS:85191181628
SN - 0306-4573
VL - 61
JO - Information Processing and Management
JF - Information Processing and Management
IS - 4
M1 - 103740
ER -