Optimization of EVS speech/music classifier based on deep learning

Zhitong Li, Xiang Xie, Jing Wang, Volodya Grancharov, Wei Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

EVS (Enhanced Voice Services) is a multi-mode codec proposed by 3GPP (3rd Generation Partnership Project) for 4G mobile services, offering good performance and codec quality. The key technology of EVS is its flexible switching between speech and audio coding modes, which depends mostly on the speech/music classifier. In general, music signals are more complex than speech signals and conform less to any known LP (Linear Prediction)-based model. Taking the internal EVS classifier as the baseline system, this study presents an optimization of the speech/music classifier from a neural-network perspective. The paper demonstrates the effectiveness of the optimized system on the MUSAN database. The experimental results show that the optimized system improves classifier performance, especially for music classification. Subjective experiments indicate that the proposed classification architecture improves the perceived audio quality of the EVS codec.

Original language: English
Title of host publication: 2018 14th IEEE International Conference on Signal Processing Proceedings, ICSP 2018
Editors: Yuan Baozong, Ruan Qiuqi, Zhao Yao, An Gaoyun
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 260-264
Number of pages: 5
ISBN (Electronic): 9781538646724
Publication status: Published - 2 Feb 2019
Event: 14th IEEE International Conference on Signal Processing, ICSP 2018 - Beijing, China
Duration: 12 Aug 2018 → 16 Aug 2018

Publication series

Name: International Conference on Signal Processing Proceedings, ICSP
Volume: 2018-August

Conference

Conference: 14th IEEE International Conference on Signal Processing, ICSP 2018
Country/Territory: China
City: Beijing
Period: 12/08/18 → 16/08/18

Keywords

  • Audio test
  • Deep Learning
  • EVS
  • Speech/Music classifier
