Learning Higher Representations from Bioacoustics: A Sequence-to-Sequence Deep Learning Approach for Bird Sound Classification

Yu Qiao, Kun Qian, Ziping Zhao*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

8 Citations (Scopus)

Abstract

Over the past two decades, considerable effort has been devoted to the automatic classification of bird sounds, which can enable long-term, low-energy ubiquitous computing systems that monitor nature reserves without human intervention. However, hand-crafted features require substantial domain knowledge, which inevitably makes the design process time-consuming and expensive. To this end, we propose a sequence-to-sequence deep learning approach that extracts higher representations automatically from bird sounds without any human expert knowledge. First, we transform the bird sound audio into spectrograms. Subsequently, higher representations are learnt by an autoencoder-based encoder-decoder paradigm combined with deep recurrent neural networks. Finally, two typical machine learning models, support vector machines and multi-layer perceptrons, are used to predict the classes. Experimental results demonstrate the effectiveness of the proposed method, which reaches an unweighted average recall (UAR) of 66.8% in recognising 86 species of birds.
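
The abstract reports its result as unweighted average recall (UAR), which is the mean of the per-class recalls, so each species counts equally however many recordings it has. The following is a minimal sketch of that metric in plain Python; the function name and the toy labels are illustrative assumptions, not code or data from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (every class weighted equally)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1

    # Recall of a class = correct predictions / samples of that class.
    recalls = [correct[cls] / total[cls] for cls in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not data from the paper): an imbalanced two-class set.
y_true = ["robin", "robin", "robin", "robin", "wren", "wren"]
y_pred = ["robin", "robin", "robin", "wren", "wren", "robin"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

In this toy case the per-class recalls are 0.75 and 0.5, so UAR is 0.625 while plain accuracy is about 0.667. The gap shows why UAR is the more conservative choice when some species are much rarer than others.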

Original language: English
Title of host publication: Neural Information Processing - 27th International Conference, ICONIP 2020, Proceedings
Editors: Haiqin Yang, Kitsuchart Pasupa, Andrew Chi-Sing Leung, James T. Kwok, Jonathan H. Chan, Irwin King
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 130-138
Number of pages: 9
ISBN (Print): 9783030638221
DOIs
Publication status: Published - 2020
Externally published: Yes
Event: 27th International Conference on Neural Information Processing, ICONIP 2020 - Bangkok, Thailand
Duration: 18 Nov 2020 to 22 Nov 2020

Publication series

Name: Communications in Computer and Information Science
Volume: 1333
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 27th International Conference on Neural Information Processing, ICONIP 2020
Country/Territory: Thailand
City: Bangkok
Period: 18/11/20 to 22/11/20

Keywords

  • Bioacoustics
  • Bird sound classification
  • Deep learning
  • Internet of Things
  • Sequence-to-sequence learning
