TY - GEN
T1 - NEB-Filter
T2 - 2022 International Conference on Asian Language Processing, IALP 2022
AU - Jian, Linzhen
AU - Jian, Ping
AU - Fei, Weilun
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Named Entity Recognition (NER) on low-resource languages remains a challenging task due to the scarcity of labeled data. With the advent of large-scale neural machine translation systems and multilingual pre-training models, it has become possible to transfer knowledge from the resource-rich source language side (e.g., English) to the resource-poor target language side. In this paper, we focus on scenarios where the source language side labeled data is also insufficient. We found that transferring entity boundary information alone across languages is much easier than transferring the whole labeled information, which includes not only the entity boundary but also the entity category. Therefore, we propose for the first time a simple but effective cross-lingual NER method called NEB-Filter. NEB-Filter trains a filter based on the boundaries of named entities, which greatly improves the precision of the cross-lingual NER model at the cost of only a small reduction in recall. Moreover, our method has been shown to benefit from knowledge distillation (KD), leading to greater improvements in F1 score. A series of experiments has shown encouraging results for our approach in low-resource cross-lingual NER.
AB - Named Entity Recognition (NER) on low-resource languages remains a challenging task due to the scarcity of labeled data. With the advent of large-scale neural machine translation systems and multilingual pre-training models, it has become possible to transfer knowledge from the resource-rich source language side (e.g., English) to the resource-poor target language side. In this paper, we focus on scenarios where the source language side labeled data is also insufficient. We found that transferring entity boundary information alone across languages is much easier than transferring the whole labeled information, which includes not only the entity boundary but also the entity category. Therefore, we propose for the first time a simple but effective cross-lingual NER method called NEB-Filter. NEB-Filter trains a filter based on the boundaries of named entities, which greatly improves the precision of the cross-lingual NER model at the cost of only a small reduction in recall. Moreover, our method has been shown to benefit from knowledge distillation (KD), leading to greater improvements in F1 score. A series of experiments has shown encouraging results for our approach in low-resource cross-lingual NER.
KW - cross-lingual named entity recognition
KW - knowledge distillation
KW - low-resource
KW - named entity boundary
UR - http://www.scopus.com/inward/record.url?scp=85143989806&partnerID=8YFLogxK
U2 - 10.1109/IALP57159.2022.9961320
DO - 10.1109/IALP57159.2022.9961320
M3 - Conference contribution
AN - SCOPUS:85143989806
T3 - 2022 International Conference on Asian Language Processing, IALP 2022
SP - 482
EP - 487
BT - 2022 International Conference on Asian Language Processing, IALP 2022
A2 - Tong, Rong
A2 - Lu, Yanfeng
A2 - Dong, Minghui
A2 - Gong, Wengao
A2 - Li, Haizhou
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 October 2022 through 28 October 2022
ER -