AAFM: Adaptive Attention Fusion Mechanism for Crowd Counting

Zuodong Duan; Huimin Chen; Jiahao Deng

doi:10.1109/ACCESS.2020.3012818

Zuodong Duan, Huimin Chen*, Jiahao Deng

*Corresponding author for this work

School of Mechatronical Engineering, Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

CNN-based crowd counting methods have made great progress in recent years. However, most of them do not make full use of contextual information, which comprises high-level semantic features and low-level detail features from different receptive fields of the CNN. Rich contextual information is important for handling the scale variation problem in crowd counting, so neglecting it reduces the precision of previous CNN-based methods. To solve this problem, we propose an adaptive attention fusion mechanism (AAFM), which makes effective use of multi-scale features from different receptive fields of the CNN. It integrates a convolutional network for feature learning with an attention mechanism for multi-scale feature fusion. We use the first 13 convolutional layers of VGG-16 as the encoder module to extract the base features, which are then fed into the decoder module. The decoder mainly consists of a Density Regression Branch (DRB) and a Feature Fusion Branch (FFB). The DRB uses multiple convolutional layers for feature learning and multi-scale feature extraction. The FFB uses attention modules to model multi-scale features and element-wise multiplication to fuse them. AAFM can therefore bring rich contextual information into the encoder-decoder framework to generate high-quality crowd density maps and accurate counts. We perform experiments on the ShanghaiTech, UCF-CC-50, and UCF-QNRF datasets, and AAFM achieves promising results.
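The element-wise fusion step mentioned in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the abstract does not say how attention weights are produced or what shapes are involved, so the sigmoid gating, the 2-D list representation, and the names `sigmoid`, `attention_fuse`, and `crowd_count` are all assumptions made for illustration. Reading the count off as the sum of the density map follows the usual convention in density-based crowd counting.

```python
import math


def sigmoid(x):
    """Squash a raw score into (0, 1) so it can act as an attention weight."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(features, attention_logits):
    """Weight each feature value by its attention weight (element-wise product).

    Both arguments are 2-D lists of floats with identical shapes. Turning
    logits into weights with a sigmoid is an assumption of this sketch.
    """
    if len(features) != len(attention_logits):
        raise ValueError("feature and attention maps must have the same shape")
    fused = []
    for f_row, a_row in zip(features, attention_logits):
        if len(f_row) != len(a_row):
            raise ValueError("feature and attention maps must have the same shape")
        fused.append([f * sigmoid(a) for f, a in zip(f_row, a_row)])
    return fused


def crowd_count(density_map):
    """Estimate the count as the sum of all values in a density map."""
    return sum(sum(row) for row in density_map)


# Toy 2x2 example with made-up numbers (not data from the paper).
features = [[1.0, 2.0], [4.0, 0.0]]
logits = [[0.0, 0.0], [0.0, 3.0]]
fused = attention_fuse(features, logits)
print(fused)               # [[0.5, 1.0], [2.0, 0.0]]
print(crowd_count(fused))  # 3.5
```

A logit of 0 gives a weight of 0.5, so the corresponding feature is halved; large positive logits pass a feature through almost unchanged, and large negative logits suppress it.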

Original language	English
Article number	9151937
Pages (from-to)	138297-138306
Number of pages	10
Journal	IEEE Access
Volume	8
DOIs	https://doi.org/10.1109/ACCESS.2020.3012818
Publication status	Published - 2020

Keywords

Crowd counting
adaptive attention fusion mechanism
density estimation

Cite this

Duan, Z., Chen, H., & Deng, J. (2020). AAFM: Adaptive Attention Fusion Mechanism for Crowd Counting. IEEE Access, 8, 138297-138306. Article 9151937. https://doi.org/10.1109/ACCESS.2020.3012818

@article{9b7dfb9cbf3943289c815ed382320532,
title = "AAFM: Adaptive Attention Fusion Mechanism for Crowd Counting",
abstract = "CNN-based crowd counting methods have achieved great progress in recent years. However, most of these CNN-based crowd counting methods do not make full use of contextual information, which contains high-level semantic features and low-level detail features from different receptive fields of CNN. But rich contextual information is important to solve the scale variation problem of crowd counting. So the precision of previous CNN-based crowd counting methods is decreased. To solve this problem, we propose an adaptive attention fusion mechanism (AAFM). AAFM can use multi-scale features from different receptive fields of CNN effectively. It integrates the convolution network for feature learning and the attention mechanism for multi-scale features fusion. We apply the first 13 convolution layers of VGG-16 as the encoder module to extract the base features. Then, the base features are fed into the decoder module. The decoder module mainly contains Density Regression Branch (DRB) and Feature Fusion Branch (FFB). DRB uses multiple convolution layers for feature learning and multi-scale feature extraction. FFB uses attention modules for modeling multi-scale features and element-wise multiply for features fusion. Therefore, AAFM can obtain rich contextual information into the encoder-decoder framework for generating high-quality crowd density maps and accurate counting. We perform experiments on ShanghaiTech, UCF-CC-50, and UCF-QNRF datasets, and AAFM achieves promising results.",
keywords = "Crowd counting, adaptive attention fusion mechanism, density estimation",
author = "Zuodong Duan and Huimin Chen and Jiahao Deng",
note = "Publisher Copyright: {\textcopyright} 2013 IEEE.",
year = "2020",
doi = "10.1109/ACCESS.2020.3012818",
language = "English",
volume = "8",
pages = "138297--138306",
journal = "IEEE Access",
issn = "2169-3536",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY - JOUR
T1 - AAFM: Adaptive Attention Fusion Mechanism for Crowd Counting
AU - Duan, Zuodong
AU - Chen, Huimin
AU - Deng, Jiahao
PY - 2020
Y1 - 2020
AB - CNN-based crowd counting methods have achieved great progress in recent years. However, most of these CNN-based crowd counting methods do not make full use of contextual information, which contains high-level semantic features and low-level detail features from different receptive fields of CNN. But rich contextual information is important to solve the scale variation problem of crowd counting. So the precision of previous CNN-based crowd counting methods is decreased. To solve this problem, we propose an adaptive attention fusion mechanism (AAFM). AAFM can use multi-scale features from different receptive fields of CNN effectively. It integrates the convolution network for feature learning and the attention mechanism for multi-scale features fusion. We apply the first 13 convolution layers of VGG-16 as the encoder module to extract the base features. Then, the base features are fed into the decoder module. The decoder module mainly contains Density Regression Branch (DRB) and Feature Fusion Branch (FFB). DRB uses multiple convolution layers for feature learning and multi-scale feature extraction. FFB uses attention modules for modeling multi-scale features and element-wise multiply for features fusion. Therefore, AAFM can obtain rich contextual information into the encoder-decoder framework for generating high-quality crowd density maps and accurate counting. We perform experiments on ShanghaiTech, UCF-CC-50, and UCF-QNRF datasets, and AAFM achieves promising results.
KW - Crowd counting
KW - adaptive attention fusion mechanism
KW - density estimation
UR - http://www.scopus.com/inward/record.url?scp=85089583068&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2020.3012818
DO - 10.1109/ACCESS.2020.3012818
M3 - Article
AN - SCOPUS:85089583068
SN - 2169-3536
VL - 8
SP - 138297
EP - 138306
JO - IEEE Access
JF - IEEE Access
M1 - 9151937
ER -
