TY - GEN
T1 - Automatic Audio Augmentation for Requests Sub-Challenge
AU - Sun, Yanjie
AU - Xu, Kele
AU - Liu, Chaorun
AU - Dou, Yong
AU - Qian, Kun
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/10/26
Y1 - 2023/10/26
N2 - This paper presents our solution for the Requests Sub-challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. Drawing upon the framework of self-supervised learning, we put forth an automated data augmentation technique for audio classification, accompanied by a multi-channel fusion strategy aimed at enhancing overall performance. Specifically, to tackle the issue of imbalanced classes in complaint classification, we propose an audio data augmentation method that generates appropriate augmentation strategies for the challenge dataset. Furthermore, recognizing the distinctive characteristics of the dual-channel HC-C dataset, we individually evaluate the classification performance of the left channel, right channel, channel difference, and channel sum, subsequently selecting the optimal integration approach. Our approach yields a significant improvement in performance when compared to the competitive baselines, particularly in the context of the complaint task. Moreover, our method demonstrates noteworthy cross-task transferability.
AB - This paper presents our solution for the Requests Sub-challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge. Drawing upon the framework of self-supervised learning, we put forth an automated data augmentation technique for audio classification, accompanied by a multi-channel fusion strategy aimed at enhancing overall performance. Specifically, to tackle the issue of imbalanced classes in complaint classification, we propose an audio data augmentation method that generates appropriate augmentation strategies for the challenge dataset. Furthermore, recognizing the distinctive characteristics of the dual-channel HC-C dataset, we individually evaluate the classification performance of the left channel, right channel, channel difference, and channel sum, subsequently selecting the optimal integration approach. Our approach yields a significant improvement in performance when compared to the competitive baselines, particularly in the context of the complaint task. Moreover, our method demonstrates noteworthy cross-task transferability.
KW - audio classification
KW - automatic audio augmentation
KW - computational paralinguistics
KW - data augmentation
UR - http://www.scopus.com/inward/record.url?scp=85179556301&partnerID=8YFLogxK
U2 - 10.1145/3581783.3612849
DO - 10.1145/3581783.3612849
M3 - Conference contribution
AN - SCOPUS:85179556301
T3 - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
SP - 9482
EP - 9486
BT - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 31st ACM International Conference on Multimedia, MM 2023
Y2 - 29 October 2023 through 3 November 2023
ER -