FI-Net: A Speech Emotion Recognition Framework with Feature Integration and Data Augmentation

Guangmin Xia, Fan Li*, Dongdi Zhao, Qian Zhang, Song Yang

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

3 citations (Scopus)

Abstract

Speech emotion recognition, an important auxiliary component of speech interaction technology, has long been an active research topic. In this work, we propose a novel framework for speech emotion recognition based on deep neural networks. The proposed framework is composed of two main modules: a local feature extractor module that uses deep recurrent layers to extract frame-level feature representations, and a global feature integration module that learns utterance-level representations for emotion recognition. Two architectures, a multi-granularity convolutional layer and a multi-scale attentive layer, are constructed for the feature integration module. Furthermore, we adopt two data augmentation approaches, noise injection and vocal tract length perturbation, both of which improve the performance and robustness of the models and reduce the influence of individual variations. The proposed models achieve recognition accuracies of 92.08% and 90.41% on the Emo-DB and CASIA datasets, respectively. In addition, ablation experiments are conducted to show the effectiveness of the proposed feature integration module and data augmentation approaches.
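
The two augmentation approaches named in the abstract can be illustrated with a minimal sketch. The abstract does not give the noise type, the signal-to-noise ratio, or the warp parameters, so everything below is an assumption made for illustration: white Gaussian noise at a chosen SNR, and the piecewise-linear frequency warp commonly used for vocal tract length perturbation. The function names `inject_noise` and `vtlp_warp` and the default `f_hi=4800.0` are hypothetical and are not taken from the paper.

```python
import numpy as np


def inject_noise(signal, snr_db, rng=None):
    """Add white Gaussian noise to `signal` at a target SNR in dB.

    Assumption: the paper's noise type and SNR range are not stated in the
    abstract; white Gaussian noise is used here purely for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    signal_power = np.mean(signal ** 2)
    # SNR(dB) = 10 * log10(P_signal / P_noise)  =>  solve for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def vtlp_warp(freqs, alpha, sample_rate, f_hi=4800.0):
    """Map frequencies (Hz) through a piecewise-linear VTLP-style warp.

    Frequencies below a boundary are scaled by `alpha`; frequencies above it
    are mapped linearly so that the Nyquist frequency stays fixed.
    Assumption: `f_hi` must lie below the Nyquist frequency, and the value
    4800 Hz is an illustrative default, not a setting from the paper.
    """
    freqs = np.asarray(freqs, dtype=np.float64)
    nyquist = sample_rate / 2.0
    scaled_hi = f_hi * min(alpha, 1.0)
    boundary = scaled_hi / alpha
    lower = freqs * alpha
    upper = nyquist - (nyquist - scaled_hi) / (nyquist - boundary) * (nyquist - freqs)
    return np.where(freqs <= boundary, lower, upper)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2.0 * np.pi * 220.0 * t)      # one second of a 220 Hz tone
    noisy = inject_noise(clean, snr_db=20.0, rng=rng)
    centres = np.linspace(0.0, 8000.0, 5)        # hypothetical filter centres
    print(noisy.shape, vtlp_warp(centres, alpha=1.1, sample_rate=16000))
```

In a sketch like this, `vtlp_warp` only remaps frequencies; applying it to the centre frequencies of a filterbank before feature extraction, and drawing `alpha` at random per utterance (for example from a narrow range around 1.0), would be the usual way to turn it into an augmentation step.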

Original language: English
Title of host publication: Proceedings - 5th International Conference on Big Data Computing and Communications, BIGCOM 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 195-203
Number of pages: 9
ISBN (electronic): 9781728140247
DOI
Publication status: Published - Aug 2019
Event: 5th International Conference on Big Data Computing and Communications, BIGCOM 2019 - Qingdao, China
Duration: 9 Aug 2019 to 11 Aug 2019

Publication series

Name: Proceedings - 5th International Conference on Big Data Computing and Communications, BIGCOM 2019

Conference

Conference: 5th International Conference on Big Data Computing and Communications, BIGCOM 2019
Country/Territory: China
City: Qingdao
Period: 9/08/19 to 11/08/19
