Attention-Based Deep Neural Network Combined Local and Global Features for Indoor Scene Recognition

Luefeng Chen, Wenhao Duan, Jiazhuo Li, Min Wu*, Witold Pedrycz, Kaoru Hirota

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

An original attention-based indoor scene recognition model that combines local and global features is proposed. First, multi-strategy data augmentation, which applies several different functions at varying intensities, is used to improve classification performance. Then, local features are extracted using a convolutional layer and a single self-attention mechanism, which addresses the problem of large intra-class variance. A multi-head attention mechanism then fuses the local feature information extracted from different foci to obtain a more complete global feature representation. Because multi-head attention lets the network extract features in parallel along different directions of attention, it helps the network capture global information, improves its ability to understand and represent the input data, and addresses the problem of high inter-class similarity. Finally, the extracted features are fed into a classifier to complete the classification of indoor scene images. Experiments conducted on four datasets (IndoorCVPR09, SUN397, 15-Scenes, and a self-built small-sample scientific indoor scene dataset) show that the developed algorithm effectively handles both high intra-class diversity and high inter-class similarity, and the model achieves competitive results. Preliminary application experiments carried out in our human-robot interaction (HRI) system indicate that the proposed indoor scene recognition model can be applied to complete environmental perception in HRI.
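The attention stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the convolutional feature extractor is omitted, the input is assumed to be a small set of already-extracted patch features, all weight matrices are random stand-ins for learned parameters, and every function name here (`attention`, `multi_head_attention`, `classify`, `random_matrix`) is hypothetical. It only shows how a single self-attention pass, a multi-head fusion pass, and a classifier could be chained.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v):
    """Scaled dot-product attention; returns (output, attention weights)."""
    scale = math.sqrt(len(q[0]))
    scores = [[sum(x * y for x, y in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

def random_matrix(rng, rows, cols):
    """Hypothetical stand-in for a learned weight matrix."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multi_head_attention(tokens, num_heads, rng):
    """Self-attention with num_heads heads, concatenated and re-projected."""
    d_model = len(tokens[0])
    assert d_model % num_heads == 0, "feature size must divide evenly across heads"
    d_head = d_model // num_heads
    head_outputs = []
    for _ in range(num_heads):
        # Each head gets its own query, key, and value projections.
        wq = random_matrix(rng, d_model, d_head)
        wk = random_matrix(rng, d_model, d_head)
        wv = random_matrix(rng, d_model, d_head)
        out, _ = attention(matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv))
        head_outputs.append(out)
    # Concatenate the heads token by token, then apply the output projection.
    concat = [sum((h[i] for h in head_outputs), []) for i in range(len(tokens))]
    return matmul(concat, random_matrix(rng, d_model, d_model))

def classify(tokens, num_classes, rng):
    """Mean-pool the tokens and apply a linear layer followed by softmax."""
    n = len(tokens)
    pooled = [sum(col) / n for col in zip(*tokens)]
    logits = matmul([pooled], random_matrix(rng, len(pooled), num_classes))[0]
    return softmax(logits)

rng = random.Random(0)
patches = random_matrix(rng, 6, 8)             # 6 assumed patch features, dimension 8
local = multi_head_attention(patches, 1, rng)  # single self-attention (one head)
fused = multi_head_attention(local, 4, rng)    # multi-head fusion across 4 heads
probs = classify(fused, 3, rng)                # probabilities over 3 assumed classes
```

With one head the function reduces to ordinary self-attention, so the same routine serves both stages here; a real model would train the projections and would likely add normalization and residual connections, none of which the abstract specifies.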

Original language: English
Pages (from-to): 12684-12693
Number of pages: 10
Journal: IEEE Transactions on Industrial Informatics
Volume: 20
Issue number: 11
Publication status: Published - 2024
Externally published: Yes

Keywords

  • Human-robot interaction
  • indoor scene recognition
  • local and global features
  • multihead attention
