TY - GEN
T1 - Noise-aware Speech Separation Based on Dual-Path RNN Model
AU - Xing, Tongkun
AU - Ding, Ding
AU - Liu, Nan
AU - Zhou, Xudong
AU - Liu, Fengming
AU - Wang, Yu
AU - Zhang, Jing
AU - Li, Guozheng
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Deep learning-based speech separation has made remarkable progress recently, especially in time-domain methods such as the dual-path recurrent neural network (DPRNN). Despite these advances, existing methods still struggle when addressing highly complex acoustic environments or meeting increasingly stringent separation quality requirements. In this paper, we propose improvements to the DPRNN framework, including optimized convolutional parallelism, an attention module for feature enhancement, and an improved residual module. Evaluations on a self-synthesized dataset and the widely adopted LibriMix benchmark show that our improved DPRNN outperforms the original model in terms of signal-to-distortion ratio improvement (SDRi) and scale-invariant signal-to-noise ratio improvement (SI-SNRi), with only a limited increase in model size. Its practical applicability is further verified in challenging environments, which will help in developing efficient and robust speech separation systems.
KW - attention
KW - convolution
KW - recurrent neural networks
KW - speech separation
UR - https://www.scopus.com/pages/publications/105030470142
U2 - 10.1109/ICSECE65727.2025.11257099
DO - 10.1109/ICSECE65727.2025.11257099
M3 - Conference contribution
AN - SCOPUS:105030470142
T3 - 2025 IEEE 3rd International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025
SP - 1071
EP - 1075
BT - 2025 IEEE 3rd International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd IEEE International Conference on Sensors, Electronics and Computer Engineering, ICSECE 2025
Y2 - 29 August 2025 through 31 August 2025
ER -