TY - GEN
T1 - LIGHTCODEC
T2 - 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
AU - Xu, Liang
AU - Wang, Jing
AU - Zhang, Jianqian
AU - Xie, Xiang
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - The audio codec is one of the core modules in audio communication for real-time transmission. With the development of neural networks, end-to-end audio codecs have emerged and demonstrated effects beyond conventional codecs. However, current neural network-based codecs have the weakness of high computational complexity, and the performance of these methods decreases rapidly after decreasing the complexity, which is not conducive to deployment under low computational resources. In this paper, a low-complexity audio codec is proposed. To realize the low complexity of the model with high quality, a structure based on frequency band division is designed, which is implemented using a within band-across band interaction (WBABI) module to learn the features across and within the subband. Further, we propose a new quantization-compensation module, which reduces the quantization error by 90%. The experimental results show that for audio with a sample rate of 24kHz, the model shows excellent performance at 3∼6kbps compared to other codecs, and the complexity is only 0.8 Giga Multiply-Add Operations per Second(GMACs).
AB - The audio codec is one of the core modules in audio communication for real-time transmission. With the development of neural networks, end-to-end audio codecs have emerged and demonstrated effects beyond conventional codecs. However, current neural network-based codecs have the weakness of high computational complexity, and the performance of these methods decreases rapidly after decreasing the complexity, which is not conducive to deployment under low computational resources. In this paper, a low-complexity audio codec is proposed. To realize the low complexity of the model with high quality, a structure based on frequency band division is designed, which is implemented using a within band-across band interaction (WBABI) module to learn the features across and within the subband. Further, we propose a new quantization-compensation module, which reduces the quantization error by 90%. The experimental results show that for audio with a sample rate of 24kHz, the model shows excellent performance at 3∼6kbps compared to other codecs, and the complexity is only 0.8 Giga Multiply-Add Operations per Second(GMACs).
KW - Audio codec
KW - Low complexity
KW - Vector quantization
UR - http://www.scopus.com/inward/record.url?scp=85195404820&partnerID=8YFLogxK
U2 - 10.1109/ICASSP48485.2024.10447532
DO - 10.1109/ICASSP48485.2024.10447532
M3 - Conference contribution
AN - SCOPUS:85195404820
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 586
EP - 590
BT - 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 14 April 2024 through 19 April 2024
ER -