Multi-layer hybrid fuzzy classification based on SVM and improved PSO for speech emotion recognition

Shihan Huang; Hua Dang; Rongkun Jiang; Yue Hao; Chengbo Xue; Wei Gu

doi:10.3390/electronics10232891

Corresponding author: Wei Gu

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)

Abstract

Speech Emotion Recognition (SER) plays a significant role in the field of Human–Computer Interaction (HCI) with a wide range of applications. However, there are still some issues in practical application. One of the issues is the difference between emotional expression amongst various individuals, and another is that some indistinguishable emotions may reduce the stability of the SER system. In this paper, we propose a multi-layer hybrid fuzzy support vector machine (MLHF-SVM) model, which includes three layers: feature extraction layer, pre-classification layer, and classification layer. The MLHF-SVM model solves the above-mentioned issues by fuzzy c-means (FCM) based on identification information of human and multi-layer SVM classifiers, respectively. In addition, to overcome the weakness that FCM tends to fall into local minima, an improved natural exponential inertia weight particle swarm optimization (IEPSO) algorithm is proposed and integrated with fuzzy c-means for optimization. Moreover, in the feature extraction layer, non-personalized features and personalized features are combined to improve accuracy. In order to verify the effectiveness of the proposed model, all emotions in three popular datasets are used for simulation. The results show that this model can effectively improve the success rate of classification and the maximum value of a single emotion recognition rate is 97.67% on the EmoDB dataset.
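
The abstract names fuzzy c-means (FCM) as one building block of the model but gives no formulas. As a rough illustration only, the following Python sketch shows the textbook FCM iteration (alternating center and membership updates). It is not the authors' implementation: the IEPSO step they integrate with FCM is not reproduced here, and the function name `fuzzy_c_means`, the parameters `m`, `n_iter`, `tol`, `seed`, and the toy data are all assumptions made for this example.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook FCM on a list of equal-length tuples.

    Returns (centers, memberships). All names and defaults are
    illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ))

        # Membership update: with d2 the squared distance to a center,
        # each membership is proportional to d2 ** (-1 / (m - 1)).
        new_u = []
        for p in points:
            d2 = [max(sum((a - b) ** 2 for a, b in zip(p, c)), 1e-12)
                  for c in centers]
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(inv)
            new_u.append([v / total for v in inv])

        # Stop once no membership changes by more than tol.
        shift = max(abs(a - b)
                    for ra, rb in zip(u, new_u)
                    for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    # Toy 2-D data with two well-separated groups (made up for the demo).
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    centers, memberships = fuzzy_c_means(data, n_clusters=2)
    print(centers)
    print(memberships)
```

Because the result depends on the random initial memberships, plain FCM can settle in a poor local minimum on harder data, which is the weakness the abstract says the IEPSO step is meant to address.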

Original language	English
Article number	2891
Journal	Electronics (Switzerland)
Volume	10
Issue number	23
DOIs	https://doi.org/10.3390/electronics10232891
Publication status	Published - Dec 2021

Keywords

Fuzzy c-means
Particle swarm optimization
Speech emotion recognition
Support vector machines

Cite this

Huang, S., Dang, H., Jiang, R., Hao, Y., Xue, C., & Gu, W. (2021). Multi-layer hybrid fuzzy classification based on SVM and improved PSO for speech emotion recognition. Electronics (Switzerland), 10(23), Article 2891. https://doi.org/10.3390/electronics10232891

@article{5368bf39a83940efb9b216be7892f57d,

title = "Multi-layer hybrid fuzzy classification based on {SVM} and improved {PSO} for speech emotion recognition",

abstract = "Speech Emotion Recognition (SER) plays a significant role in the field of Human–Computer Interaction (HCI) with a wide range of applications. However, there are still some issues in practical application. One of the issues is the difference between emotional expression amongst various individuals, and another is that some indistinguishable emotions may reduce the stability of the SER system. In this paper, we propose a multi-layer hybrid fuzzy support vector machine (MLHF-SVM) model, which includes three layers: feature extraction layer, pre-classification layer, and classification layer. The MLHF-SVM model solves the above-mentioned issues by fuzzy c-means (FCM) based on identification information of human and multi-layer SVM classifiers, respectively. In addition, to overcome the weakness that FCM tends to fall into local minima, an improved natural exponential inertia weight particle swarm optimization (IEPSO) algorithm is proposed and integrated with fuzzy c-means for optimization. Moreover, in the feature extraction layer, non-personalized features and personalized features are combined to improve accuracy. In order to verify the effectiveness of the proposed model, all emotions in three popular datasets are used for simulation. The results show that this model can effectively improve the success rate of classification and the maximum value of a single emotion recognition rate is 97.67% on the EmoDB dataset.",

keywords = "Fuzzy c-means, Particle swarm optimization, Speech emotion recognition, Support vector machines",

author = "Shihan Huang and Hua Dang and Rongkun Jiang and Yue Hao and Chengbo Xue and Wei Gu",

note = "Publisher Copyright: {\textcopyright} 2021 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2021",

month = dec,

doi = "10.3390/electronics10232891",

language = "English",

volume = "10",

journal = "Electronics (Switzerland)",

issn = "2079-9292",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "23",

}

TY - JOUR

T1 - Multi-layer hybrid fuzzy classification based on SVM and improved PSO for speech emotion recognition

AU - Huang, Shihan

AU - Dang, Hua

AU - Jiang, Rongkun

AU - Hao, Yue

AU - Xue, Chengbo

AU - Gu, Wei

PY - 2021/12

Y1 - 2021/12

AB - Speech Emotion Recognition (SER) plays a significant role in the field of Human–Computer Interaction (HCI) with a wide range of applications. However, there are still some issues in practical application. One of the issues is the difference between emotional expression amongst various individuals, and another is that some indistinguishable emotions may reduce the stability of the SER system. In this paper, we propose a multi-layer hybrid fuzzy support vector machine (MLHF-SVM) model, which includes three layers: feature extraction layer, pre-classification layer, and classification layer. The MLHF-SVM model solves the above-mentioned issues by fuzzy c-means (FCM) based on identification information of human and multi-layer SVM classifiers, respectively. In addition, to overcome the weakness that FCM tends to fall into local minima, an improved natural exponential inertia weight particle swarm optimization (IEPSO) algorithm is proposed and integrated with fuzzy c-means for optimization. Moreover, in the feature extraction layer, non-personalized features and personalized features are combined to improve accuracy. In order to verify the effectiveness of the proposed model, all emotions in three popular datasets are used for simulation. The results show that this model can effectively improve the success rate of classification and the maximum value of a single emotion recognition rate is 97.67% on the EmoDB dataset.

KW - Fuzzy c-means

KW - Particle swarm optimization

KW - Speech emotion recognition

KW - Support vector machines

UR - http://www.scopus.com/inward/record.url?scp=85119606443&partnerID=8YFLogxK

U2 - 10.3390/electronics10232891

DO - 10.3390/electronics10232891

M3 - Article

AN - SCOPUS:85119606443

SN - 2079-9292

VL - 10

JO - Electronics (Switzerland)

JF - Electronics (Switzerland)

IS - 23

M1 - 2891

ER -
