TY - JOUR
T1 - RS-BERT: Pre-training radical enhanced sense embedding for Chinese word sense disambiguation
AU - Zhou, Xiaofeng
AU - Huang, Heyan
AU - Chi, Zewen
AU - Ren, Mucheng
AU - Gao, Yang
N1 - Publisher Copyright:
© 2024
PY - 2024/7
Y1 - 2024/7
AB - Word sense disambiguation is a crucial task for testing whether a model can perform deep language understanding. The rise of pre-trained language models has brought substantial success to such tasks. However, most current pre-training objectives operate at the token level and ignore the linguistic senses of the tokens themselves, so it is questionable whether token-prediction objectives alone are enough to capture polysemy and disambiguate senses. To explore this question, we introduce RS-BERT, a radical-enhanced sense embedding model with a novel pre-training objective, sense-aware language modeling, which adds sense-level information to the model. At each training step, we first predict the senses and then update the model given the predicted senses; during training, we alternate between these two steps in an expectation–maximization manner. We also inject radical information into RS-BERT at the beginning of pre-training. We conduct experiments on two Chinese word sense disambiguation datasets. Experimental results show that RS-BERT is competitive, and when combined with other dedicated adaptations for specific datasets, it shows impressive performance. Moreover, our analysis shows that RS-BERT successfully clusters Chinese characters into distinct senses. These results demonstrate that token-prediction objectives are not enough and that the sense-level objective performs better for polysemy and sense disambiguation.
KW - Language models
KW - Sense embeddings
KW - Word sense disambiguation
UR - http://www.scopus.com/inward/record.url?scp=85191181628&partnerID=8YFLogxK
U2 - 10.1016/j.ipm.2024.103740
DO - 10.1016/j.ipm.2024.103740
M3 - Article
AN - SCOPUS:85191181628
SN - 0306-4573
VL - 61
JO - Information Processing and Management
JF - Information Processing and Management
IS - 4
M1 - 103740
ER -