TY - JOUR
T1 - Multiple-Speech-Source DOA Estimation Based on Single-Source Cluster Detection
AU - Li, Lu
AU - Jia, Maoshen
AU - Wang, Jing
AU - Cao, Ruiyuan
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2023
Y1 - 2023
AB - This study proposes multiple-speech-source direction-of-arrival (DOA) estimation based on the distribution characteristic of the time-frequency (TF) point dominated by a single-source component (i.e., single-source point, SSP). By exploring the TF distribution characteristics of SSPs, we found that most are distributed in clusters in the TF domain. Hence, the concept of a single-source cluster (SSC) is given, each composed of adjacent TF points from one dominant sound source. Considering that SSCs have different shapes and sizes, an SSC detection method is designed based on point-to-cluster expansion, which is the research focus of this article. A two-dimensional Gaussian function is introduced to model the theoretical distribution of the DOAs of SSPs, and a cluster expansion rule is proposed based on hypothesis testing of the DOA of a source. Two-dimensional kernel density estimation and peak search are adopted to estimate the DOAs and the number of sources using the detected SSCs. Experimental results in both simulated and real environments show that the proposed method can achieve better DOA estimation performance than some current techniques.
KW - DOA estimation
KW - hypothesis testing
KW - single-source cluster detection
UR - http://www.scopus.com/inward/record.url?scp=85174852801&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2023.3321213
DO - 10.1109/TASLP.2023.3321213
M3 - Article
AN - SCOPUS:85174852801
SN - 2329-9290
VL - 31
SP - 3667
EP - 3680
JO - IEEE/ACM Transactions on Audio, Speech, and Language Processing
JF - IEEE/ACM Transactions on Audio, Speech, and Language Processing
ER -