Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation

  • Yi Chang
  • Zhao Ren
  • Zhonghao Zhao
  • Thanh Tam Nguyen
  • Kun Qian
  • Tanja Schultz
  • Björn W. Schuller

Research output: Contribution to journal › Conference article › peer-review

Abstract

Speech emotion recognition (SER) plays a crucial role in human-computer interaction. The emergence of edge devices in the Internet of Things (IoT) presents challenges in constructing intricate deep learning models due to constraints in memory and computational resources. Moreover, emotional speech data often contains private information, raising concerns about privacy leakage during the deployment of SER models. To address these challenges, we propose a data distillation framework to facilitate the efficient development of SER models in IoT applications using a synthesised, smaller, distilled dataset. Our experiments demonstrate that the distilled dataset can be effectively utilised to train SER models with fixed initialisation, achieving performance comparable to that of models developed using the original full emotional speech dataset.
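The abstract describes training SER models from a fixed initialisation on a small synthesised dataset. As a minimal, illustrative sketch only (not the paper's method or data), the general one-step dataset-distillation idea can be shown on a toy linear-regression task: synthetic samples are optimised so that a single gradient step from the fixed initialisation already fits the full "real" dataset. All variable names, sizes, and learning rates below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" dataset: 200 samples, 5 features (stand-ins for speech features).
X_real = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y_real = X_real @ w_true + 0.01 * rng.normal(size=200)

def mse_grad(X, y, w):
    """Gradient of mean-squared error with respect to the weights w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

w0 = np.zeros(5)   # fixed initialisation the distilled data is tied to
lr_inner = 0.5     # learning rate of the inner (model) update

# Distilled dataset: only 10 synthetic samples, learned from scratch.
X_syn = rng.normal(size=(10, 5))
y_syn = rng.normal(size=10)

for _ in range(500):
    # Inner step: one gradient update from the FIXED init on distilled data.
    w1 = w0 - lr_inner * mse_grad(X_syn, y_syn, w0)
    # Outer objective: loss of the one-step model on the real data.
    dL_dw1 = 2.0 * X_real.T @ (X_real @ w1 - y_real) / len(y_real)
    # Backprop through the inner step (closed form; exact here because
    # w0 = 0, so the inner gradient is -2 X_syn.T @ y_syn / n).
    n = len(y_syn)
    grad_X = lr_inner * 2.0 / n * np.outer(y_syn, dL_dw1)
    grad_y = lr_inner * 2.0 / n * X_syn @ dL_dw1
    X_syn -= 0.1 * grad_X
    y_syn -= 0.1 * grad_y

# A model trained from the same fixed init on 10 distilled samples ...
w_distilled = w0 - lr_inner * mse_grad(X_syn, y_syn, w0)
# ... evaluated on the full real dataset.
mse_real = np.mean((X_real @ w_distilled - y_real) ** 2)
```

The sketch keeps the two properties the abstract emphasises: the distilled set is far smaller than the original (10 vs. 200 samples), and it is only guaranteed to work with the fixed initialisation it was optimised against.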

Original language: English
Pages (from-to): 141-145
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 26th Interspeech Conference 2025 - Rotterdam, Netherlands
Duration: 17 Aug 2025 - 21 Aug 2025

Keywords

  • computational paralinguistics
  • human-computer interaction
  • speech recognition

