Learning multi-resolution representations for acoustic scene classification via neural networks

Zijiang Yang, Kun Qian*, Zhao Ren, Alice Baird, Zixing Zhang, Björn Schuller

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

5 Citations (Scopus)

Abstract

This study investigates the performance of wavelet features as well as conventional temporal and spectral features for acoustic scene classification, testing the effectiveness of both feature sets when combined with neural networks. The TUT Acoustic Scenes 2017 Database is used to evaluate the system. The model with the wavelet energy feature achieved 74.8 % and 60.2 % on the development and evaluation sets, respectively, which is better than the model using the temporal and spectral feature set (72.9 % and 59.4 %). Additionally, to improve the generalisation and robustness of the models, a decision fusion method based on the posterior probability of each audio scene is used. Compared with the baseline system of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE 2017) challenge, the best decision fusion model achieves 79.2 % and 63.8 % on the development and evaluation sets, respectively; both results significantly exceed the baseline results of 74.8 % and 61.0 % (confirmed by a one-tailed z-test, p < 0.01 and p < 0.05, respectively).

Original language: English
Title of host publication: Proceedings of the 7th Conference on Sound and Music Technology CSMT 2019, Revised Selected Papers
Editors: Haifeng Li, Lin Ma, Shengchen Li, Chunying Fang, Yidan Zhu
Publisher: Springer
Pages: 133-143
Number of pages: 11
ISBN (Print): 9789811527555
DOI
Publication status: Published - 2020
Externally published: Yes
Event: 7th Conference on Sound and Music Technology, CSMT 2019 - Harbin, China
Duration: 26 Dec 2019 to 29 Dec 2019

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 635
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 7th Conference on Sound and Music Technology, CSMT 2019
Country/Territory: China
City: Harbin
Period: 26/12/19 to 29/12/19
