TY - JOUR
T1 - Neural Audio Coding with Deep Complex Networks
AU - Ru, Jiawei
AU - Wang, Lizhong
AU - Jia, Maoshen
AU - Wen, Liang
AU - Wang, Chunxi
AU - Zhao, Yuhao
AU - Wang, Jing
N1 - Publisher Copyright:
© Published under licence by IOP Publishing Ltd.
PY - 2024
Y1 - 2024
N2 - This paper proposes a transform-domain audio coding method based on deep complex networks. In the proposed codec, the time-frequency spectrum of the audio signal is fed to the encoder, which consists of complex convolutional blocks and a frequency-temporal modeling module. The extracted features are then quantized at a target bitrate by the vector quantizer. The decoder, which reconstructs the time-frequency spectrum of the audio from the quantized features, is symmetrical in structure to the encoder. To capture both frequency and temporal dependencies, a structure combining a complex multi-head self-attention module and a complex long short-term memory is proposed. Subjective and objective evaluation tests show the advantage of the proposed method.
AB - This paper proposes a transform-domain audio coding method based on deep complex networks. In the proposed codec, the time-frequency spectrum of the audio signal is fed to the encoder, which consists of complex convolutional blocks and a frequency-temporal modeling module. The extracted features are then quantized at a target bitrate by the vector quantizer. The decoder, which reconstructs the time-frequency spectrum of the audio from the quantized features, is symmetrical in structure to the encoder. To capture both frequency and temporal dependencies, a structure combining a complex multi-head self-attention module and a complex long short-term memory is proposed. Subjective and objective evaluation tests show the advantage of the proposed method.
UR - http://www.scopus.com/inward/record.url?scp=85194193652&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/2759/1/012005
DO - 10.1088/1742-6596/2759/1/012005
M3 - Conference article
AN - SCOPUS:85194193652
SN - 1742-6588
VL - 2759
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
IS - 1
M1 - 012005
T2 - 2024 8th International Conference on Machine Vision and Information Technology, CMVIT 2024
Y2 - 23 February 2024 through 25 February 2024
ER -