Multi-task based sound localization model

Song Tao, Qu Tianshu, Chen Jing*

*Corresponding author for this work

Research output: Contribution to conference, Paper, peer-reviewed

1 Citation (Scopus)

Abstract

For machine hearing in complex scenes (e.g., reverberation or multiple sound sources), sound localization either serves as the front end or is implicitly encoded in speech enhancement models. However, extracting binaural cues for sound localization depends on the clarity of the input speech signals, so speech enhancement (e.g., dereverberation or denoising) can benefit sound localization. Based on this idea, a multi-task based sound localization model is proposed in this study. The proposed model takes the waveform as input and simultaneously estimates the azimuth of the sound source and a time-frequency (T-F) mask. Localization experiments were performed using binaural simulation in reverberant environments, and the results show that, compared with the single-task sound localization method, the presence of the speech enhancement task can improve localization performance.
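
The abstract does not state the network architecture, the type of T-F mask, or the training objective. As a rough illustration of what "multi-task" could mean here, the sketch below combines a localization loss and a mask-estimation loss into a single objective. Everything in it is an assumption made for illustration: treating azimuth as classification over discrete angle bins, using mean squared error for the mask, and the hypothetical names `azimuth_cross_entropy`, `mask_mse`, `multi_task_loss`, and `mask_weight`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def azimuth_cross_entropy(azimuth_logits, target_bin):
    """Localization loss, ASSUMING azimuth is a class over discrete angle bins."""
    probs = softmax(azimuth_logits)
    return -math.log(probs[target_bin])


def mask_mse(predicted_mask, target_mask):
    """Enhancement loss, ASSUMING the T-F mask is regressed with mean squared error.

    Both masks are lists of time frames, each a list of per-frequency values.
    """
    squared_error = 0.0
    count = 0
    for pred_frame, target_frame in zip(predicted_mask, target_mask):
        for pred, target in zip(pred_frame, target_frame):
            squared_error += (pred - target) ** 2
            count += 1
    return squared_error / count


def multi_task_loss(azimuth_logits, target_bin, predicted_mask, target_mask,
                    mask_weight=1.0):
    """Weighted sum of the two task losses (the weighting scheme is hypothetical)."""
    localization = azimuth_cross_entropy(azimuth_logits, target_bin)
    enhancement = mask_mse(predicted_mask, target_mask)
    return localization + mask_weight * enhancement
```

Under these assumptions, setting `mask_weight=0.0` reduces the objective to the localization loss alone, which corresponds to a single-task baseline of the kind the abstract compares against.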

Original language: English
Publication status: Published - 2020
Externally published: Yes
Event: 148th Audio Engineering Society International Convention 2020 - Vienna, Virtual, Online, Austria
Duration: 2 Jun 2020 to 5 Jun 2020

Conference

Conference: 148th Audio Engineering Society International Convention 2020
Country/Territory: Austria
Vienna, Virtual, Online
Period: 2/06/20 to 5/06/20
