Learning multi-resolution representations for acoustic scene classification via neural networks

Zijiang Yang, Kun Qian*, Zhao Ren, Alice Baird, Zixing Zhang, Björn Schuller

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Citations (Scopus)

Abstract

This study investigates the performance of wavelet features as well as conventional temporal and spectral features for acoustic scene classification, testing the effectiveness of both feature sets when combined with neural networks. The TUT Acoustic Scenes 2017 Database is used to evaluate the system. The model with the wavelet energy feature achieved 74.8 % and 60.2 % on the development and evaluation sets, respectively, which is better than the model using the temporal and spectral feature set (72.9 % and 59.4 %). Additionally, to optimise the generalisation and robustness of the models, a decision fusion method based on the posterior probability of each audio scene is used. Compared with the baseline system of the Detection and Classification of Acoustic Scenes and Events 2017 (DCASE 2017) challenge, the best decision fusion model achieves 79.2 % and 63.8 % on the development and evaluation sets, respectively; both results significantly exceed the baseline results of 74.8 % and 61.0 % (confirmed by a one-tailed z-test, p < 0.01 and p < 0.05, respectively).
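The abstract does not state which fusion rule is applied to the posterior probabilities, so the following is only a minimal sketch that assumes a (weighted) average of per-scene posteriors from several models, followed by an argmax. The function names, the three scene labels, and the probability values are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
from typing import Dict, Optional, Sequence


def fuse_posteriors(
    model_posteriors: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Combine per-scene posteriors from several models by weighted averaging.

    Assumption: every model reports a probability for the same set of scenes.
    """
    if not model_posteriors:
        raise ValueError("at least one model output is required")
    if weights is None:
        weights = [1.0] * len(model_posteriors)  # equal weights by default
    if len(weights) != len(model_posteriors):
        raise ValueError("one weight per model is required")

    total_weight = sum(weights)
    fused = {}
    for scene in model_posteriors[0]:
        weighted_sum = sum(w * p[scene] for w, p in zip(weights, model_posteriors))
        fused[scene] = weighted_sum / total_weight
    return fused


def predict_scene(fused: Dict[str, float]) -> str:
    """Return the scene label with the highest fused posterior."""
    return max(fused, key=fused.get)


# Hypothetical posteriors for a single audio clip from two models,
# e.g. one trained on wavelet features and one on temporal/spectral features.
wavelet_model = {"beach": 0.5, "bus": 0.3, "park": 0.2}
spectral_model = {"beach": 0.2, "bus": 0.6, "park": 0.2}

fused = fuse_posteriors([wavelet_model, spectral_model])
prediction = predict_scene(fused)
print(fused, prediction)
```

In this toy case the two models disagree (the first favours "beach", the second "bus"), and averaging lets the more confident model decide. Other fusion rules, such as a product of posteriors or majority voting, would fit the same interface.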

Original language: English
Title of host publication: Proceedings of the 7th Conference on Sound and Music Technology CSMT 2019, Revised Selected Papers
Editors: Haifeng Li, Lin Ma, Shengchen Li, Chunying Fang, Yidan Zhu
Publisher: Springer
Pages: 133-143
Number of pages: 11
ISBN (Print): 9789811527555
Publication status: Published - 2020
Externally published: Yes
Event: 7th Conference on Sound and Music Technology, CSMT 2019 - Harbin, China
Duration: 26 Dec 2019 to 29 Dec 2019

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 635
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 7th Conference on Sound and Music Technology, CSMT 2019
Country/Territory: China
City: Harbin
Period: 26/12/19 to 29/12/19

Keywords

  • Acoustic Scene Classification
  • Machine Learning
  • Neural Networks
  • Wavelets
