End-to-end Oriental Language Speech Recognition with Integrated Language Identification

Anbin Qi, Xiang Xie*, Qingran Zhan, Chenguang Hu, Xinmei Su

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Peer-reviewed

Abstract

In recent years, the rise of human-computer interaction and the success of end-to-end models in speech recognition have drawn wide attention to building end-to-end speech recognition models. Drawing on multi-task learning and the connection between language identification and speech recognition, we propose an end-to-end Transformer model for multilingual speech recognition with integrated language identification. The model treats speech recognition as the main task and language identification as the auxiliary task. We validate the model on the 13-language datasets of the 2021 Oriental Language Recognition (OLR) challenge. Experimental results show a relative improvement of 37.46% on the speech recognition task over the baseline system provided by the OLR organizers. Language identification accuracy reaches 89.70%. These results are equivalent to fifth place in the constrained speech recognition track of OLR 2021.
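The abstract does not say how the main and auxiliary objectives are combined, nor which error metric the relative improvement refers to. As a rough illustration only, the sketch below assumes the common multi-task formulation in which the total loss is a weighted sum of the speech recognition loss and the language identification loss. The function names, the weight `lid_weight`, and all numeric inputs are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability of `target` under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[target]


def multitask_loss(asr_loss: float, lid_loss: float, lid_weight: float = 0.1) -> float:
    """Weighted sum of the main (ASR) loss and the auxiliary (LID) loss.

    `lid_weight` is an assumed hyperparameter, not a value from the paper.
    """
    if not 0.0 <= lid_weight <= 1.0:
        raise ValueError("lid_weight must lie in [0, 1]")
    return (1.0 - lid_weight) * asr_loss + lid_weight * lid_loss


def relative_improvement(baseline_error: float, system_error: float) -> float:
    """Relative error reduction over a baseline, in percent."""
    return 100.0 * (baseline_error - system_error) / baseline_error


# Toy example: 13 language classes, matching the 13 OLR 2021 languages.
# The logits and the ASR loss value are made up for illustration.
lid_logits = [0.0] * 13
lid_logits[4] = 2.0
lid_loss = cross_entropy(lid_logits, target=4)
total = multitask_loss(asr_loss=1.5, lid_loss=lid_loss, lid_weight=0.1)
```

With a small `lid_weight`, the auxiliary task shapes the shared representation without dominating the recognition objective. Under this reading, the 37.46% figure would be the output of `relative_improvement` applied to the baseline and proposed systems' error rates.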

Original language: English
Title of host publication: Proceedings - 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 27-31
Number of pages: 5
ISBN (Electronic): 9781665454599
DOI
Publication status: Published - 2022
Event: 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022 - Suzhou, China
Duration: 29 Oct 2022 to 31 Oct 2022

Publication series

Name: Proceedings - 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022

Conference

Conference: 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022
Country/Territory: China
City: Suzhou
Period: 29/10/22 to 31/10/22
