Optimization of EVS speech/music classifier based on deep learning

Zhitong Li, Xiang Xie, Jing Wang, Volodya Grancharov, Wei Liu

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

3 Citations (Scopus)

Abstract

EVS (Enhanced Voice Services) is a multi-mode codec proposed by 3GPP (3rd Generation Partnership Project) for 4G mobile services, offering good performance and codec quality. The key technology of EVS is the flexible switching between speech and audio coding modes, which depends mostly on the speech/music classifier. In general, a music signal is more complex than a speech signal and conforms less to any known LP (Linear Prediction)-based model. Taking the internal classifier of EVS as a baseline system, this study optimizes the speech/music classifier from a neural-network perspective. The paper demonstrates the effectiveness of the optimized system on the MUSAN database. The experimental results show that the optimized system improves the performance of the classifier, especially for music classification. Subjective experiments indicate that the proposed classification architecture improves the perceived audio quality of the EVS codec.

Original language: English
Title of host publication: 2018 14th IEEE International Conference on Signal Processing Proceedings, ICSP 2018
Editors: Yuan Baozong, Ruan Qiuqi, Zhao Yao, An Gaoyun
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 260-264
Number of pages: 5
ISBN (Electronic): 9781538646724
DOI
Publication status: Published - 2 Feb 2019
Event: 14th IEEE International Conference on Signal Processing, ICSP 2018 - Beijing, China
Duration: 12 Aug 2018 to 16 Aug 2018

Publication series

Name: International Conference on Signal Processing Proceedings, ICSP
Volume: 2018-August

Conference

Conference: 14th IEEE International Conference on Signal Processing, ICSP 2018
Country/Territory: China
City: Beijing
Period: 12/08/18 to 16/08/18
