End-to-End Code-Switching ASR for Low-Resourced Language Pairs

Xianghu Yue, Grandee Lee, Emre Yilmaz, Fang Deng, Haizhou Li

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

26 Citations (Scopus)

Abstract

Despite significant progress in end-to-end (E2E) automatic speech recognition (ASR), E2E ASR for low-resourced code-switching (CS) speech has not been well studied. In this work, we describe an E2E ASR pipeline for the recognition of CS speech in which a low-resourced language is mixed with a high-resourced language. Scarcity of acoustic data hinders the performance of E2E ASR systems more severely than that of conventional ASR systems. To mitigate this problem in the transcription of archives with code-switching Frisian-Dutch speech, we integrate a designated decoding scheme and perform rescoring with neural network-based language models to enable better utilization of the available textual resources. We first incorporate a multi-graph decoding approach, which creates parallel search spaces for the monolingual and mixed recognition tasks to maximize the utilization of the textual resources from each language. Further, language model rescoring is performed using a recurrent neural network pre-trained with cross-lingual embedding and further adapted with the limited amount of in-domain CS text. The ASR experiments demonstrate the effectiveness of the described techniques in improving the recognition performance of an E2E CS ASR system in a low-resourced scenario.
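
The rescoring step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an N-best list of hypotheses with first-pass log scores, and it swaps the recurrent neural network language model for a toy unigram model. The names `rescore_nbest`, `toy_unigram_lm`, `TOY_PROBS`, and `lm_weight` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy unigram probabilities; a stand-in for a trained neural LM.
TOY_PROBS = {"a": 0.5, "b": 0.4, "c": 0.1}


def toy_unigram_lm(text, probs=TOY_PROBS, floor=1e-4):
    """Return the log-probability of a whitespace-tokenized text.

    Unknown tokens receive a small floor probability.
    """
    return sum(math.log(probs.get(token, floor)) for token in text.split())


def rescore_nbest(nbest, lm_logprob, lm_weight=0.5):
    """Re-rank hypotheses by first-pass score plus weighted LM log-probability.

    nbest: list of (text, first_pass_logprob) pairs.
    lm_logprob: callable mapping a text to a log-probability.
    lm_weight: assumed interpolation weight for the LM score.
    """
    rescored = [
        (text, score + lm_weight * lm_logprob(text))
        for text, score in nbest
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    nbest = [("a b", -1.0), ("a c", -0.9)]
    for text, score in rescore_nbest(nbest, toy_unigram_lm):
        print(f"{score:.3f}\t{text}")
```

In this toy case the first pass slightly prefers "a c", but the language model score moves "a b" to the top, which is the effect rescoring is meant to have when text resources are richer than acoustic data.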

Original language: English
Title of host publication: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 972-979
Number of pages: 8
ISBN (Electronic): 9781728103068
DOI
Publication status: Published - Dec 2019
Event: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Singapore, Singapore
Duration: 15 Dec 2019 to 18 Dec 2019

Publication series

Name: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019 - Proceedings

Conference

Conference: 2019 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2019
Country/Territory: Singapore
City: Singapore
Period: 15/12/19 to 18/12/19
