Non-Intrusive Speech Quality Assessment with Multi-Task Learning Based on Tensor Network

Hanyue Liu, Miao Liu, Jing Wang, Xiang Xie, Lidong Yang

Research output: Contribution to journal › Conference article › peer-review

2 Citations (Scopus)

Abstract

Non-intrusive speech quality assessment is increasingly important in speech systems. Existing methods predominantly rely on neural networks to extract low-order features, which typically pass through a low-dimensional linear transformation to produce the network's output. However, the correlations among feature points are often overlooked. In this paper, we explore the kernel method, which maps features into a high-dimensional space through dot products, in order to better capture the relationships among all feature points. Considering the unique advantages of tensors in representing complex data, we extend the use of tensor networks and propose a novel framework that incorporates a matrix product state (MPS) layer to predict the mean opinion score (MOS). With the MPS layer, our model transforms low-order features into higher-order representations, enabling a linear transformation in a high-dimensional space without increasing the number of parameters. Furthermore, we propose a loss function that jointly assesses regression and classification biases together with the correlation with the real MOS labels. Experimental results demonstrate that the proposed model outperforms the baseline system on all evaluation metrics and surpasses state-of-the-art models on the test set.
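The idea of a linear map in a high-dimensional space with few parameters can be illustrated with a small sketch. The code below is not the authors' architecture: it assumes a cos/sin local feature map, an open-boundary MPS with a scalar output, and hypothetical names (`feature_map`, `init_mps`, `mps_forward`, `dense_forward`, `n_params`). It contracts the MPS cores with the tensor product of local feature vectors and checks the result against the explicit 2^N-dimensional weight tensor.

```python
import itertools
import math
import random


def feature_map(x):
    """Map a scalar feature to a 2-dim local vector (an assumed choice)."""
    return [math.cos(math.pi * x / 2), math.sin(math.pi * x / 2)]


def init_mps(n_sites, bond_dim, seed=0):
    """Random cores; core k is indexed as core[left_bond][physical][right_bond]."""
    rng = random.Random(seed)
    cores = []
    for k in range(n_sites):
        left = 1 if k == 0 else bond_dim
        right = 1 if k == n_sites - 1 else bond_dim
        cores.append([[[rng.gauss(0.0, 0.5) for _ in range(right)]
                       for _ in range(2)]
                      for _ in range(left)])
    return cores


def mps_forward(cores, features):
    """Contract the MPS with the product of local feature vectors, site by site."""
    vec = [1.0]  # running boundary vector over the current bond index
    for core, x in zip(cores, features):
        phi = feature_map(x)
        right = len(core[0][0])
        new_vec = [0.0] * right
        for a, left_val in enumerate(vec):
            for s in range(2):
                for b in range(right):
                    new_vec[b] += left_val * phi[s] * core[a][s][b]
        vec = new_vec
    return vec[0]  # the last bond has dimension 1, so the output is a scalar


def dense_forward(cores, features):
    """Same value via the explicit 2**N weight tensor (feasible for small N only)."""
    phis = [feature_map(x) for x in features]
    total = 0.0
    for idx in itertools.product(range(2), repeat=len(cores)):
        vec = [1.0]
        for core, s in zip(cores, idx):
            right = len(core[0][0])
            vec = [sum(vec[a] * core[a][s][b] for a in range(len(vec)))
                   for b in range(right)]
        weight = vec[0]  # one entry of the full weight tensor
        total += weight * math.prod(phi[s] for phi, s in zip(phis, idx))
    return total


def n_params(cores):
    """Total number of scalar parameters stored in the MPS cores."""
    return sum(len(c) * len(c[0]) * len(c[0][0]) for c in cores)


cores = init_mps(n_sites=8, bond_dim=2)
features = [0.1 * k for k in range(8)]
score = mps_forward(cores, features)
```

In this toy setting the MPS stores 56 parameters, while the equivalent dense weight tensor over the 2^8-dimensional feature space would need 256. The site-by-site contraction never materializes that tensor.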

Original language: English
Pages (from-to): 851-855
Number of pages: 5
Journal: Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, Korea, Republic of
Duration: 14 Apr 2024 - 19 Apr 2024

Keywords

  • matrix product state
  • multi-task learning
  • speech quality assessment
  • tensor network
