TY - GEN
T1 - End-to-End Speech Named Entity Recognition Fused with Pre-trained Models
AU - Lan, Tianwei
AU - Guo, Yuhang
N1 - Publisher Copyright:
© 2023 China National Conference on Computational Linguistics.
PY - 2023
Y1 - 2023
N2 - Speech Named Entity Recognition (SNER) aims to recognize the boundaries, types, and content of named entities directly from audio, and is one of the important tasks in spoken language understanding. Recognizing named entities directly from speech, i.e., the end-to-end approach, is the current mainstream method for SNER. However, training corpora for speech named entity recognition are scarce, and end-to-end models suffer from the following problems: (1) recognition performance drops sharply in cross-domain settings; (2) during recognition, the model may miss or mislabel named entities due to phenomena such as homophones, further reducing recognition accuracy. To address problem (1), this paper proposes using a pre-trained entity recognition model to construct a training corpus for speech entity recognition. To address problem (2), this paper proposes using a pre-trained language model to re-score the N-best list produced by speech named entity recognition, exploiting the external knowledge in the pre-trained model to help the end-to-end model select the best result. To verify the domain transfer ability of the model, we annotated the few-sample MAGICDATA-NER dataset. Experiments on this data show that the proposed method improves F1 by 43.29% over the traditional method.
KW - Cross-domain Recognition
KW - External Knowledge
KW - Few-shot Training
KW - Fusion of Pre-trained Models
KW - Speech Named Entity Recognition
UR - http://www.scopus.com/inward/record.url?scp=85218007075&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85218007075
T3 - Proceedings of the 22nd Chinese National Conference on Computational Linguistics, CCL 2023
SP - 174
EP - 185
BT - Proceedings of the 22nd Chinese National Conference on Computational Linguistics, CCL 2023
A2 - Sun, Maosong
A2 - Qin, Bing
A2 - Qiu, Xipeng
A2 - Jiang, Jing
A2 - Han, Xianpei
PB - Association for Computational Linguistics (ACL)
T2 - 22nd Chinese National Conference on Computational Linguistics, CCL 2023
Y2 - 3 August 2023 through 5 August 2023
ER -