Semi-supervised Cross-Lingual Speech Recognition Exploiting Articulatory Features

Xinmei Su, Xiang Xie, Chenguang Hu, Shu Wu*, Jing Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

State-of-the-art (SOTA) Automatic Speech Recognition (ASR) systems are mostly based on data-driven methods. However, low-resource languages may lack sufficient training data. Articulatory Features (AFs) describe the movements of the vocal organs, which can be shared across languages. This paper therefore investigates AF-based semi-supervised techniques for sharing data between languages. First, traditional acoustic features and AFs are combined as front-end features to provide articulatory information for cross-lingual knowledge transfer. Then, dropout-based decoded lattices are used as pseudo-labels for the unlabeled data to address the problem of data deficiency. In addition, the Lattice-free Maximum Mutual Information (LF-MMI) objective is adopted to better adapt to small datasets. Experiments show that our system obtains a relative improvement of 58.6% in Character Error Rate (CER) compared to the baseline system. More specifically, the smaller the dataset, the more pronounced the advantage of our system.
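The abstract reports its result as a relative improvement in CER. As a minimal sketch of how such a figure is typically computed, the Python below assumes that CER is the character-level Levenshtein distance divided by the reference length and that relative improvement is the CER reduction divided by the baseline CER. The function names are hypothetical and the paper's actual scoring setup is not described in this record.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_improvement(baseline_cer, system_cer):
    """Relative CER reduction with respect to the baseline (assumed definition)."""
    if baseline_cer <= 0:
        raise ValueError("baseline CER must be positive")
    return (baseline_cer - system_cer) / baseline_cer
```

Under these assumptions, a hypothetical baseline CER of 50.0% and a system CER of 20.7% would correspond to a 58.6% relative improvement. These two CER values are illustrative only and are not taken from the paper.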

Original language: English
Title of host publication: Pattern Recognition - 27th International Conference, ICPR 2024, Proceedings
Editors: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 141-153
Number of pages: 13
ISBN (Print): 9783031801358
Publication status: Published - 2025
Event: 27th International Conference on Pattern Recognition, ICPR 2024 - Kolkata, India
Duration: 1 Dec 2024 to 5 Dec 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15333 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 27th International Conference on Pattern Recognition, ICPR 2024
Country/Territory: India
City: Kolkata
Period: 1/12/24 to 5/12/24

Keywords

  • Articulatory features
  • Automatic speech recognition
  • Semi-supervised
