AAFM: Adaptive Attention Fusion Mechanism for Crowd Counting

Zuodong Duan, Huimin Chen*, Jiahao Deng

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

CNN-based crowd counting methods have made great progress in recent years. However, most of them do not make full use of contextual information, which comprises high-level semantic features and low-level detail features from different receptive fields of the CNN. Rich contextual information is important for handling the scale variation problem in crowd counting, so the accuracy of previous CNN-based methods suffers. To address this, we propose an adaptive attention fusion mechanism (AAFM). AAFM makes effective use of multi-scale features from different receptive fields of the CNN by integrating a convolutional network for feature learning with an attention mechanism for multi-scale feature fusion. We use the first 13 convolutional layers of VGG-16 as the encoder module to extract base features, which are then fed into the decoder module. The decoder mainly consists of a Density Regression Branch (DRB) and a Feature Fusion Branch (FFB). DRB uses multiple convolutional layers for feature learning and multi-scale feature extraction. FFB uses attention modules to model multi-scale features and element-wise multiplication to fuse them. AAFM can therefore bring rich contextual information into the encoder-decoder framework to generate high-quality crowd density maps and accurate counts. We perform experiments on the ShanghaiTech, UCF-CC-50, and UCF-QNRF datasets, on which AAFM achieves promising results.
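
The fusion step named in the abstract (attention modules followed by element-wise multiplication) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the attention map is produced or normalized, so the sigmoid gate below is an assumption, and the names `attention_fuse` and `crowd_count` are hypothetical. The final count is taken as the sum of the density map, which is the usual convention in density-based counting.

```python
import math


def sigmoid(x):
    """Squash a raw attention score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(features, attention_logits):
    """Gate a 2D feature map with a same-shaped attention map.

    ASSUMPTION: the attention weights come from a sigmoid over raw
    scores; the abstract only states that attention modules and
    element-wise multiplication are used.
    """
    same_rows = len(features) == len(attention_logits)
    if not same_rows or any(
        len(f_row) != len(a_row)
        for f_row, a_row in zip(features, attention_logits)
    ):
        raise ValueError("feature map and attention map must have the same shape")
    return [
        [f * sigmoid(a) for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, attention_logits)
    ]


def crowd_count(density_map):
    """Estimate the count as the sum over all density-map cells."""
    return sum(sum(row) for row in density_map)


# Toy 2x2 example: zero scores give a uniform attention weight of 0.5.
features = [[0.5, 1.0], [2.0, 0.0]]
attention_logits = [[0.0, 0.0], [0.0, 0.0]]
fused = attention_fuse(features, attention_logits)
count = crowd_count(fused)
```

In a real model the two inputs would be multi-channel tensors produced by convolutional layers; plain nested lists are used here only to keep the sketch dependency-free.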

Original language: English
Article number: 9151937
Pages (from-to): 138297-138306
Number of pages: 10
Journal: IEEE Access
Volume: 8
Publication status: Published - 2020

Keywords

  • Crowd counting
  • adaptive attention fusion mechanism
  • density estimation
