TY - JOUR
T1 - Automatic Bird Sound Source Separation Based on Passive Acoustic Devices in Wild Environment
AU - Xie, Jiangjian
AU - Shi, Yuwei
AU - Ni, Dongming
AU - Milling, Manuel
AU - Liu, Shuo
AU - Zhang, Junguo
AU - Qian, Kun
AU - Schuller, Björn W.
N1 - Publisher Copyright:
IEEE
PY - 2024
Y1 - 2024
N2 - The Internet of Things (IoT)-based passive acoustic monitoring (PAM) has shown great potential in large-scale remote bird monitoring. However, field recordings often contain overlapping signals, making precise extraction of bird information challenging. To address this challenge, first, the inter-channel spatial feature is chosen as complementary information to the spectral feature to capture additional spatial correlations between the sources. Then, an end-to-end model named BACPPNet, built on Deeplabv3plus and enhanced with the polarized self-attention mechanism, estimates the spectral magnitude mask (SMM) for separating bird vocalizations. Finally, the separated bird vocalizations are recovered from the SMMs and the spectrogram of the mixed audio using the inverse short-time Fourier transform (ISTFT). We evaluate the proposed method on the generated mixed dataset. Experiments show that our method separates bird vocalizations from mixed audio with RMSE, SDR, SIR, SAR, and STOI values of 2.82, 10.00 dB, 29.90 dB, 11.08 dB, and 0.66, respectively, outperforming existing methods. Furthermore, the separated bird vocalizations exhibit the smallest drop in average classification accuracy. This indicates that our method outperforms the compared separation methods in bird sound separation while preserving the fidelity of the separated sound sources, which might help us better understand wild bird sound recordings.
KW - Acoustics
KW - Bird sound separation
KW - Birds
KW - Forestry
KW - Monitoring
KW - Recording
KW - Source separation
KW - Task analysis
KW - multi-channel audio processing
KW - polarized self-attention mechanism
UR - http://www.scopus.com/inward/record.url?scp=85182946866&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2024.3354036
DO - 10.1109/JIOT.2024.3354036
M3 - Article
AN - SCOPUS:85182946866
SN - 2327-4662
SP - 1
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
ER -