Automatic Audio Augmentation for Requests Sub-Challenge

Yanjie Sun; Kele Xu; Chaorun Liu; Yong Dou; Kun Qian

doi:10.1145/3581783.3612849

Yanjie Sun, Kele Xu^*, Chaorun Liu, Yong Dou, Kun Qian

^*Corresponding author for this work

School of Medical and Technology

National University of Defense Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

This paper presents our solution for the Requests Sub-challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. Drawing upon the framework of self-supervised learning, we put forth an automated data augmentation technique for audio classification, accompanied by a multi-channel fusion strategy aimed at enhancing overall performance. Specifically, to tackle the issue of imbalanced classes in complaint classification, we propose an audio data augmentation method that generates appropriate augmentation strategies for the challenge dataset. Furthermore, recognizing the distinctive characteristics of the dual-channel HC-C dataset, we individually evaluate the classification performance of the left channel, right channel, channel difference, and channel sum, subsequently selecting the optimal integration approach. Our approach yields a significant improvement in performance when compared to the competitive baselines, particularly in the context of the complaint task. Moreover, our method demonstrates noteworthy cross-task transferability.
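As an illustration only, the four channel views named in the abstract (left, right, difference, sum) could be derived from a stereo recording along the following lines. This is a minimal sketch, not the authors' implementation: the function name `channel_views`, the plain-list sample representation, and the choice of left minus right for the difference are all assumptions, since the abstract does not specify them.

```python
from typing import Dict, List, Sequence


def channel_views(left: Sequence[float], right: Sequence[float]) -> Dict[str, List[float]]:
    """Build four single-channel views from a dual-channel recording.

    Hypothetical helper: the sign convention for "difference" (left - right)
    is an assumption, not taken from the paper.
    """
    if len(left) != len(right):
        raise ValueError("left and right channels must have the same length")
    return {
        "left": list(left),
        "right": list(right),
        # Sample-wise difference between the two channels.
        "difference": [l - r for l, r in zip(left, right)],
        # Sample-wise sum of the two channels.
        "sum": [l + r for l, r in zip(left, right)],
    }


# Toy two-sample stereo signal (made-up values, not challenge data).
views = channel_views([0.5, -0.25], [0.25, 0.25])
for name, samples in views.items():
    print(name, samples)
```

Each view could then be passed to the same classifier and scored on a validation split, with the best-scoring view or combination retained; that selection step is not shown because the abstract gives no details about it.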

Original language	English
Title of host publication	MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
Publisher	Association for Computing Machinery, Inc
Pages	9482-9486
Number of pages	5
ISBN (Electronic)	9798400701085
DOIs	https://doi.org/10.1145/3581783.3612849
Publication status	Published - 26 Oct 2023
Event	31st ACM International Conference on Multimedia, MM 2023 - Ottawa, Canada
Duration	29 Oct 2023 → 3 Nov 2023

Publication series

Name	MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

Conference

Conference	31st ACM International Conference on Multimedia, MM 2023
Country/Territory	Canada
City	Ottawa
Period	29/10/23 → 3/11/23

Keywords

audio classification
automatic audio augmentation
computational paralinguistics
data augmentation

Access to Document

10.1145/3581783.3612849

Cite this

Sun, Y., Xu, K., Liu, C., Dou, Y., & Qian, K. (2023). Automatic Audio Augmentation for Requests Sub-Challenge. In MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia (pp. 9482-9486). (MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3581783.3612849

@inproceedings{3e5173dc035a4daaa810d0c6a05dd197,

title = "Automatic Audio Augmentation for Requests Sub-Challenge",

abstract = "This paper presents our solution for the Requests Sub-challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. Drawing upon the framework of self-supervised learning, we put forth an automated data augmentation technique for audio classification, accompanied by a multi-channel fusion strategy aimed at enhancing overall performance. Specifically, to tackle the issue of imbalanced classes in complaint classification, we propose an audio data augmentation method that generates appropriate augmentation strategies for the challenge dataset. Furthermore, recognizing the distinctive characteristics of the dual-channel HC-C dataset, we individually evaluate the classification performance of the left channel, right channel, channel difference, and channel sum, subsequently selecting the optimal integration approach. Our approach yields a significant improvement in performance when compared to the competitive baselines, particularly in the context of the complaint task. Moreover, our method demonstrates noteworthy cross-task transferability.",

keywords = "audio classification, automatic audio augmentation, computational paralinguistics, data augmentation",

author = "Yanjie Sun and Kele Xu and Chaorun Liu and Yong Dou and Kun Qian",

note = "Publisher Copyright: {\textcopyright} 2023 ACM.; 31st ACM International Conference on Multimedia, MM 2023 ; Conference date: 29-10-2023 Through 03-11-2023",

year = "2023",

month = oct,

day = "26",

doi = "10.1145/3581783.3612849",

language = "English",

series = "MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia",

publisher = "Association for Computing Machinery, Inc",

pages = "9482--9486",

booktitle = "MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia",

}

Sun, Y, Xu, K, Liu, C, Dou, Y & Qian, K 2023, Automatic Audio Augmentation for Requests Sub-Challenge. in MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia. MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, Association for Computing Machinery, Inc, pp. 9482-9486, 31st ACM International Conference on Multimedia, MM 2023, Ottawa, Canada, 29/10/23. https://doi.org/10.1145/3581783.3612849

Automatic Audio Augmentation for Requests Sub-Challenge. / Sun, Yanjie; Xu, Kele; Liu, Chaorun et al.
MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2023. p. 9482-9486 (MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia).

TY - GEN

T1 - Automatic Audio Augmentation for Requests Sub-Challenge

AU - Sun, Yanjie

AU - Xu, Kele

AU - Liu, Chaorun

AU - Dou, Yong

AU - Qian, Kun

PY - 2023/10/26

Y1 - 2023/10/26

N2 - This paper presents our solution for the Requests Sub-challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. Drawing upon the framework of self-supervised learning, we put forth an automated data augmentation technique for audio classification, accompanied by a multi-channel fusion strategy aimed at enhancing overall performance. Specifically, to tackle the issue of imbalanced classes in complaint classification, we propose an audio data augmentation method that generates appropriate augmentation strategies for the challenge dataset. Furthermore, recognizing the distinctive characteristics of the dual-channel HC-C dataset, we individually evaluate the classification performance of the left channel, right channel, channel difference, and channel sum, subsequently selecting the optimal integration approach. Our approach yields a significant improvement in performance when compared to the competitive baselines, particularly in the context of the complaint task. Moreover, our method demonstrates noteworthy cross-task transferability.

AB - This paper presents our solution for the Requests Sub-challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. Drawing upon the framework of self-supervised learning, we put forth an automated data augmentation technique for audio classification, accompanied by a multi-channel fusion strategy aimed at enhancing overall performance. Specifically, to tackle the issue of imbalanced classes in complaint classification, we propose an audio data augmentation method that generates appropriate augmentation strategies for the challenge dataset. Furthermore, recognizing the distinctive characteristics of the dual-channel HC-C dataset, we individually evaluate the classification performance of the left channel, right channel, channel difference, and channel sum, subsequently selecting the optimal integration approach. Our approach yields a significant improvement in performance when compared to the competitive baselines, particularly in the context of the complaint task. Moreover, our method demonstrates noteworthy cross-task transferability.

KW - audio classification

KW - automatic audio augmentation

KW - computational paralinguistics

KW - data augmentation

UR - http://www.scopus.com/inward/record.url?scp=85179556301&partnerID=8YFLogxK

U2 - 10.1145/3581783.3612849

DO - 10.1145/3581783.3612849

M3 - Conference contribution

AN - SCOPUS:85179556301

T3 - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

SP - 9482

EP - 9486

BT - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

PB - Association for Computing Machinery, Inc

T2 - 31st ACM International Conference on Multimedia, MM 2023

Y2 - 29 October 2023 through 3 November 2023

ER -
