Comparison of Two Cross-lingual AF Extraction Methods

Shixuan Du, Qingran Zhan, Yahui Shan, Xiang Xie

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In this paper we propose two cross-lingual articulatory feature (AF) extraction methods and build recognition systems based on cross-lingual AFs. The AF extractors are trained on a source language (English), and the trained extractors are then used to generate cross-lingual AFs for the target language (Mandarin). Experiments are carried out with two AF extraction architectures: a multilayer perceptron (MLP) and Bidirectional Long Short-Term Memory (BLSTM) based connectionist temporal classification (CTC). The MLP architecture requires frame-level AF labels, which are converted from phone alignments obtained from a GMM-HMM system using a phone-to-AF mapping, while the BLSTM-based CTC eliminates the need for alignments. The Mandarin speech recognition system is built on joint features formed by concatenating AFs with MFCCs. The results show that using cross-lingual AFs improves ASR performance on THCHS-30. Of the two architectures, cross-lingual AFs extracted using BLSTM-based CTC give better recognition performance.
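The two data-preparation steps named in the abstract, converting a frame-level phone alignment into frame-level AF labels and concatenating AFs with MFCCs, can be sketched as follows. This is a minimal illustration only: the `PHONE_TO_AF` table, the AF groups (`manner`, `place`), the function names, and the toy feature values are all assumptions made for the example, not the mapping or dimensions used in the paper.

```python
# Hypothetical phone-to-AF table (illustrative values, not the paper's inventory).
# Each phone maps to one class per AF group.
PHONE_TO_AF = {
    "sil": {"manner": "silence", "place": "silence"},
    "b": {"manner": "stop", "place": "bilabial"},
    "s": {"manner": "fricative", "place": "alveolar"},
    "aa": {"manner": "vowel", "place": "low"},
}


def phones_to_af_labels(phones, group):
    """Map a sequence of phones to AF labels for one AF group.

    Given a frame-level alignment (one phone per frame), this yields
    frame-level AF labels of the kind an MLP extractor is trained on.
    Given an unaligned phone transcript, it yields an unaligned AF label
    sequence, which is the kind of target a CTC model can train on.
    """
    return [PHONE_TO_AF[phone][group] for phone in phones]


def concat_features(mfcc_frames, af_frames):
    """Concatenate per-frame MFCC and AF vectors into joint feature vectors."""
    if len(mfcc_frames) != len(af_frames):
        raise ValueError("MFCC and AF streams must have the same number of frames")
    return [list(m) + list(a) for m, a in zip(mfcc_frames, af_frames)]


# Toy frame-level alignment, assumed to come from a GMM-HMM forced alignment.
alignment = ["sil", "b", "b", "aa", "aa", "aa", "s"]
manner_labels = phones_to_af_labels(alignment, "manner")

# Toy 2-dimensional MFCC frames and 2-dimensional AF frames.
joint = concat_features([[1.0, 2.0], [3.0, 4.0]], [[0.1, 0.9], [0.8, 0.2]])
```

In a real system the AF stream would typically be the extractor's per-frame outputs rather than hand-written vectors, and the joint features would feed the Mandarin acoustic model.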

Original language: English
Title of host publication: 2019 2nd IEEE International Conference on Information Communication and Signal Processing, ICICSP 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 262-266
Number of pages: 5
ISBN (Electronic): 9781728151021
Publication status: Published - Sept 2019
Event: 2nd IEEE International Conference on Information Communication and Signal Processing, ICICSP 2019 - Weihai, China
Duration: 28 Sept 2019 to 30 Sept 2019

Publication series

Name: 2019 2nd IEEE International Conference on Information Communication and Signal Processing, ICICSP 2019

Conference

Conference: 2nd IEEE International Conference on Information Communication and Signal Processing, ICICSP 2019
Country/Territory: China
City: Weihai
Period: 28/09/19 to 30/09/19

Keywords

  • Articulatory feature
  • Connectionist temporal classification
  • Cross-lingual
  • Speech recognition
