Deep learning for binaural sound source localization with low signal-to-noise ratio

Fengnian Zhao; Ruwei Li; Dongmei Pan

doi:10.1088/1742-6596/1828/1/012017

Corresponding author: Ruwei Li

Beijing University of Technology

Research output: Contribution to journal › Conference article › peer-review

1 Citation (Scopus)

Abstract

A novel deep learning (DL) method is proposed for binaural sound source localization at low signal-to-noise ratio (SNR). First, the binaural signals are decomposed into several frequency channels using a Gammatone filter bank. Second, four feature parameters related to the head-related transfer function (HRTF) are extracted: interaural time difference (ITD), interaural coherence (IC), interaural level difference (ILD), and interaural phase difference (IPD). Third, ITD and IC are fed to a Deep Belief Network (DBN), which determines the quadrant of the sound source and thereby narrows the localization range. Then ITD, IC, ILD, and IPD are fed to a Deep Neural Network (DNN) to obtain the azimuth within that 90-degree quadrant. Experimental results show that the proposed algorithm resolves front-back confusion and achieves superior performance, with lower complexity and higher precision, under low-SNR conditions.
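
As a rough illustration of the feature-extraction step, the sketch below estimates three of the four cues (ITD, IC, ILD) for a single frame using a normalized cross-correlation. It is not the authors' implementation: the function name `interaural_cues`, the parameter `max_lag`, the sign convention, and the synthetic noise test signal are all assumptions made for illustration, and the Gammatone decomposition, the IPD, and both networks are omitted.

```python
import math
import random


def interaural_cues(left, right, fs, max_lag):
    """Estimate ITD (seconds), IC, and ILD (dB) for one frame.

    Hypothetical helper, not taken from the paper. A positive ITD
    means the right-ear signal lags the left-ear signal.
    """
    n = len(left)
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    if e_left == 0.0 or e_right == 0.0:
        raise ValueError("both channels must have non-zero energy")
    norm = math.sqrt(e_left * e_right)

    best_lag, best_corr = 0, -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        acc = 0.0
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        corr = acc / norm
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    itd = best_lag / fs                          # lag of the correlation peak
    ic = best_corr                               # height of the correlation peak
    ild = 10.0 * math.log10(e_left / e_right)    # energy ratio in dB
    return itd, ic, ild


# Synthetic check (assumed setup): the right channel is the left channel
# delayed by 5 samples and attenuated by a factor of 0.5.
rng = random.Random(0)
fs = 16000
delay = 5
left = [rng.gauss(0.0, 1.0) for _ in range(400)]
right = [0.0] * delay + [0.5 * x for x in left[:-delay]]
itd, ic, ild = interaural_cues(left, right, fs, max_lag=16)
```

In a pipeline like the one the abstract describes, a function of this kind would presumably be applied per Gammatone channel and per frame, with the resulting cue vectors passed to the quadrant classifier and then to the azimuth estimator.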

Original language	English
Article number	012017
Journal	Journal of Physics: Conference Series
Volume	1828
Issue number	1
DOIs	https://doi.org/10.1088/1742-6596/1828/1/012017
Publication status	Published - 4 Mar 2021
Externally published	Yes
Event	2020 International Symposium on Automation, Information and Computing, ISAIC 2020 - Beijing, Virtual, China; Duration: 2 Dec 2020 → 4 Dec 2020

Cite this

Zhao, F., Li, R., & Pan, D. (2021). Deep learning for binaural sound source localization with low signal-to-noise ratio. Journal of Physics: Conference Series, 1828(1), Article 012017. https://doi.org/10.1088/1742-6596/1828/1/012017

@article{95099d1769bb446aab71553c7b49a3cd,
  title = "Deep learning for binaural sound source localization with low signal-to-noise ratio",
  abstract = "A novel deep learning (DL) method is proposed for binaural sound source localization at low signal-to-noise ratio (SNR). First, the binaural signals are decomposed into several frequency channels using a Gammatone filter bank. Second, four feature parameters related to the head-related transfer function (HRTF) are extracted: interaural time difference (ITD), interaural coherence (IC), interaural level difference (ILD), and interaural phase difference (IPD). Third, ITD and IC are fed to a Deep Belief Network (DBN), which determines the quadrant of the sound source and thereby narrows the localization range. Then ITD, IC, ILD, and IPD are fed to a Deep Neural Network (DNN) to obtain the azimuth within that 90-degree quadrant. Experimental results show that the proposed algorithm resolves front-back confusion and achieves superior performance, with lower complexity and higher precision, under low-SNR conditions.",
  author = "Fengnian Zhao and Ruwei Li and Dongmei Pan",
  note = "Publisher Copyright: {\textcopyright} 2021 Institute of Physics Publishing. All rights reserved.; 2020 International Symposium on Automation, Information and Computing, ISAIC 2020 ; Conference date: 02-12-2020 Through 04-12-2020",
  year = "2021",
  month = mar,
  day = "4",
  doi = "10.1088/1742-6596/1828/1/012017",
  language = "English",
  volume = "1828",
  journal = "Journal of Physics: Conference Series",
  issn = "1742-6588",
  publisher = "IOP Publishing Ltd.",
  number = "1",
}

TY  - JOUR
T1  - Deep learning for binaural sound source localization with low signal-to-noise ratio
AU  - Zhao, Fengnian
AU  - Li, Ruwei
AU  - Pan, Dongmei
PY  - 2021/3/4
Y1  - 2021/3/4
AB  - A novel deep learning (DL) method is proposed for binaural sound source localization at low signal-to-noise ratio (SNR). First, the binaural signals are decomposed into several frequency channels using a Gammatone filter bank. Second, four feature parameters related to the head-related transfer function (HRTF) are extracted: interaural time difference (ITD), interaural coherence (IC), interaural level difference (ILD), and interaural phase difference (IPD). Third, ITD and IC are fed to a Deep Belief Network (DBN), which determines the quadrant of the sound source and thereby narrows the localization range. Then ITD, IC, ILD, and IPD are fed to a Deep Neural Network (DNN) to obtain the azimuth within that 90-degree quadrant. Experimental results show that the proposed algorithm resolves front-back confusion and achieves superior performance, with lower complexity and higher precision, under low-SNR conditions.
UR  - http://www.scopus.com/inward/record.url?scp=85103287198&partnerID=8YFLogxK
U2  - 10.1088/1742-6596/1828/1/012017
DO  - 10.1088/1742-6596/1828/1/012017
M3  - Conference article
AN  - SCOPUS:85103287198
SN  - 1742-6588
VL  - 1828
JO  - Journal of Physics: Conference Series
JF  - Journal of Physics: Conference Series
IS  - 1
M1  - 012017
T2  - 2020 International Symposium on Automation, Information and Computing, ISAIC 2020
Y2  - 2 December 2020 through 4 December 2020
ER  -
