
ANA-Mix: A Synthetic Corpus of Mandarin Speech in Airport Noise Conditions

  • Xiaoliang Wang*
  • Yu Wang
  • Ye Liu
  • Xudong Zhou
  • Fengming Liu
  • Fengge Yu
  • Shuai Zhang
  • Guozheng Li
  • *Corresponding author for this work
  • TravelSky Technology Limited
  • Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

Abstract

This paper presents the Airport Noise-AISHELL Mix (ANA-Mix), a rich and realistic dataset tailored for advancing speech recognition and interactive systems in complex airport acoustic conditions. The noisy speech dataset is constructed by combining the publicly available AISHELL-3 Mandarin speech dataset with environmental noise recorded at airports. The AISHELL-3 dataset provides a rich variety of high-quality sentence recordings, while the airport noise data captures a range of typical airport noise scenarios, including crowd conversations, luggage rolling, and boarding announcements. A data mixing method superimposes clean speech and randomly selected airport noise at the waveform level to create 200,000 noisy speech samples: approximately 100,000 single-speaker samples and another 100,000 multi-speaker samples (2 to 4 speakers). The resulting speech closely approximates the actual deployment environment. The dataset can be used for a variety of tasks such as speech recognition, voiceprint recognition, and speech enhancement, demonstrating its potential value in improving the performance of voice interaction systems.
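The waveform-level mixing described in the abstract can be illustrated with a minimal sketch. The abstract does not state how noise levels were chosen, so the target signal-to-noise ratio (`snr_db`), the looping of short noise clips, the peak normalisation, and the function names `mix_at_snr` and `overlap_speakers` are all assumptions made for illustration, not the authors' published procedure. Waveforms are plain lists of floats in [-1, 1] to keep the example dependency-free.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a waveform given as a list of floats."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def overlap_speakers(utterances):
    """Sum several utterances sample by sample, zero-padding to the longest.

    Assumption: multi-speaker samples are formed by simple addition.
    """
    length = max(len(u) for u in utterances)
    mixed = [0.0] * length
    for utterance in utterances:
        for i, sample in enumerate(utterance):
            mixed[i] += sample
    return mixed


def mix_at_snr(speech, noise, snr_db, rng):
    """Add a randomly chosen noise segment to speech at a target SNR (dB)."""
    # Loop the noise if it is shorter than the speech (assumption).
    if len(noise) < len(speech):
        noise = noise * math.ceil(len(speech) / len(noise))
    # Pick a random segment of the noise recording.
    start = rng.randrange(len(noise) - len(speech) + 1)
    segment = noise[start:start + len(speech)]
    noise_rms = rms(segment)
    if noise_rms == 0.0:
        return list(speech)  # silent noise segment: nothing to add
    # Scale so that 20*log10(rms(speech) / rms(gain * segment)) == snr_db.
    gain = rms(speech) / (noise_rms * 10 ** (snr_db / 20))
    mixed = [s + gain * n for s, n in zip(speech, segment)]
    # Rescale only if the sum would clip outside [-1, 1].
    peak = max(abs(m) for m in mixed)
    if peak > 1.0:
        mixed = [m / peak for m in mixed]
    return mixed


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-ins for real audio: a 440 Hz tone and uniform random noise.
    speech = [0.1 * math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
    noise = [rng.uniform(-0.5, 0.5) for _ in range(800)]
    noisy = mix_at_snr(speech, noise, snr_db=10.0, rng=rng)
    print(len(noisy))
```

For a multi-speaker sample under these assumptions, two to four utterances would first be combined with `overlap_speakers` and the result passed to `mix_at_snr` as the speech signal.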

Original language: English
Title of host publication: 2025 IEEE 3rd International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 98-102
Number of pages: 5
ISBN (Electronic): 9798331503567
DOI
Publication status: Published - 2025
Event: 3rd IEEE International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025 - Jinzhou, China
Duration: 29 Aug 2025 to 31 Aug 2025

Publication series

Name: 2025 IEEE 3rd International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025

Conference

Conference: 3rd IEEE International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025
Country/Territory: China
City: Jinzhou
Period: 29/08/25 to 31/08/25
