End-to-end Oriental Language Speech Recognition with Integrated Language Identification

Anbin Qi, Xiang Xie*, Qingran Zhan, Chenguang Hu, Xinmei Su

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

End-to-end models have been applied successfully to speech recognition in recent years, and, with the rise of human-computer interaction, building such models has received extensive attention. Exploiting the connection between language identification and speech recognition, we propose an end-to-end Transformer model for multilingual speech recognition that integrates language identification through multi-task learning. The model treats speech recognition as the main task and language identification as the auxiliary task. We validate the model on the datasets of 13 languages from the 2021 Oriental Language Recognition (OLR) challenge. Experimental results show a relative improvement of 37.46% on the speech recognition task over the baseline system provided by the OLR organizers, and the language identification accuracy reaches 89.70%. These results would tie for fifth place in the constrained speech recognition track of OLR 2021.
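The abstract does not give the training objective, so the following is only a minimal sketch of how a main speech recognition loss and an auxiliary language identification loss are commonly combined in multi-task learning. The interpolation form, the weight `lid_weight`, and all function names are assumptions for illustration, not the authors' implementation. The `relative_improvement` helper shows the usual definition of a relative gain over a baseline error rate.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(asr_loss, lid_logits, lid_target, lid_weight=0.1):
    """Interpolate a main ASR loss with an auxiliary LID loss.

    `lid_weight` is a hypothetical hyperparameter; the abstract does not
    state how the two tasks are weighted.
    """
    if not 0.0 <= lid_weight <= 1.0:
        raise ValueError("lid_weight must lie in [0, 1]")
    lid_loss = cross_entropy(lid_logits, lid_target)
    return (1.0 - lid_weight) * asr_loss + lid_weight * lid_loss


def relative_improvement(baseline_error, model_error):
    """Relative error reduction over a baseline, in percent."""
    return 100.0 * (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    # Illustrative values only: 13 language classes, uniform LID logits.
    lid_logits = [0.0] * 13
    print(multitask_loss(asr_loss=2.0, lid_logits=lid_logits, lid_target=3))
    print(relative_improvement(baseline_error=50.0, model_error=25.0))
```

Setting `lid_weight` to 0 recovers pure speech recognition training, which makes the auxiliary task easy to ablate in a sketch like this.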

Original language: English
Title of host publication: Proceedings - 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 27-31
Number of pages: 5
ISBN (Electronic): 9781665454599
DOIs
Publication status: Published - 2022
Event: 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022 - Suzhou, China
Duration: 29 Oct 2022 to 31 Oct 2022

Publication series

Name: Proceedings - 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022

Conference

Conference: 2022 International Conference on Machine Learning, Control, and Robotics, MLCR 2022
Country/Territory: China
City: Suzhou
Period: 29/10/22 to 31/10/22

Keywords

  • End-to-end
  • Language Identification
  • Multi-task learning
  • Oriental Languages
  • Speech Recognition
