Enhancing music information retrieval by incorporating image-based local features

Leszek Kaliciak*, Ben Horsburgh, Dawei Song, Nirmalie Wiratunga, Jeff Pan

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding; conference contribution; peer-reviewed

1 citation (Scopus)

Abstract

This paper presents a novel approach to music genre classification. Having represented music tracks as two-dimensional images, we apply the "bag of visual words" method from visual IR to classify the songs into 19 genres. By switching to the visual domain, we can abstract from musical concepts such as melody, timbre and rhythm. We obtained a classification accuracy of 46% (against a 5% theoretical baseline for random classification), which is comparable with existing state-of-the-art approaches. Moreover, the novel features characterize different properties of the signal than standard methods do. Therefore, combining them should further improve the performance of existing techniques. The motivation behind this work was the hypothesis that 2D images of music tracks (spectrograms) perceived as similar would correspond to the same music genres. Conversely, it is possible to treat real-life images as spectrograms and use music-based features to represent these images in vector form. This points to an interesting interchangeability between visual and music information retrieval.
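The pipeline outlined in the abstract (spectrogram image, local features, visual vocabulary, histogram per track) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which local descriptors or clustering settings were used, so raw grid patches, a plain k-means, the synthetic test tones and every function name and parameter value below are hypothetical stand-ins.

```python
import numpy as np

def spectrogram(signal, frame=256, hop=128):
    """Log-magnitude spectrogram: rows are frequency bins, columns are time frames."""
    window = np.hanning(frame)
    starts = range(0, len(signal) - frame + 1, hop)
    frames = np.stack([signal[s:s + frame] * window for s in starts])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T

def extract_patches(image, size=8, step=8):
    """Flattened size x size grid patches (assumed stand-in for local descriptors)."""
    rows, cols = image.shape
    return np.array([image[r:r + size, c:c + size].ravel()
                     for r in range(0, rows - size + 1, step)
                     for c in range(0, cols - size + 1, step)])

def kmeans(data, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centres act as the visual vocabulary."""
    rng = np.random.default_rng(seed)
    centres = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        dists = ((data[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave a centre in place if its cluster is empty
                centres[j] = members.mean(axis=0)
    return centres

def bovw_histogram(patches, centres):
    """Assign each patch to its nearest visual word and return normalized counts."""
    dists = ((patches[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(dists.argmin(axis=1), minlength=len(centres))
    return counts / counts.sum()

# Two synthetic "tracks" (noisy tones) stand in for real audio; this is an assumption.
rng = np.random.default_rng(0)
t = np.arange(8192) / 8000.0
tracks = [np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(t.size)
          for f in (220.0, 1760.0)]

patch_sets = [extract_patches(spectrogram(x)) for x in tracks]
vocabulary = kmeans(np.vstack(patch_sets), k=16)
histograms = np.array([bovw_histogram(p, vocabulary) for p in patch_sets])

# Chance level for 19 genres, assuming uniform random guessing over balanced classes.
chance = 1.0 / 19
```

Each row of `histograms` is a fixed-length vector that a standard classifier could take as input for genre prediction. The `chance` value of about 0.053 matches the roughly 5% random baseline quoted in the abstract.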

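Original language: English
Title of host publication: Information Retrieval Technology - 8th Asia Information Retrieval Societies Conference, AIRS 2012, Proceedings
Pages: 226-237
Number of pages: 12
DOI: https://doi.org/10.1007/978-3-642-35341-3_19
Publication status: Published - 2012
Externally published: Yes
Event: 8th Asia Information Retrieval Societies Conference, AIRS 2012 - Tianjin, China
Duration: 17 Dec 2012 to 19 Dec 2012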

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7675 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Asia Information Retrieval Societies Conference, AIRS 2012
Country/Territory: China
City: Tianjin
Period: 17/12/12 to 19/12/12

Cite this

Kaliciak, L., Horsburgh, B., Song, D., Wiratunga, N., & Pan, J. (2012). Enhancing music information retrieval by incorporating image-based local features. In Information Retrieval Technology - 8th Asia Information Retrieval Societies Conference, AIRS 2012, Proceedings (pp. 226-237). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7675 LNCS). https://doi.org/10.1007/978-3-642-35341-3_19