Abstract
Crowd counting is a challenging task owing to factors such as scale variation and noisy backgrounds. To address these, we propose a novel framework named Multi-Pathway Zooming Network (MZNet). The proposed framework recursively optimizes multi-scale features using multiple zooming pathways and progressively enhances the foreground information to improve crowd counting performance. Each zooming pathway comprises two zooming directions, zooming in and zooming out. Convolutional features at different resolutions are propagated to optimize the context information at each specific level. By sequentially integrating information from multiple observations and letting it interact, the optimized features handle scale variation effectively, which in turn improves counting performance. To address the noisy backgrounds present in many scenarios, we also introduce a new scheme that enhances the foreground information by feeding a masked input image into the network; the masked image is formed by element-wise multiplying a mask with the original image. Finally, the context information, combined with an output density map, is recursively fine-tuned in our network to boost counting performance. Extensive experiments on challenging benchmark datasets show competitive performance in both crowded and sparse scenarios.
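The masked-input step is the one operation the abstract specifies concretely: a mask multiplied element-wise with the original image. The following is a minimal NumPy sketch of that operation only, not the authors' implementation. The function name `apply_foreground_mask`, the binary mask values, and the H x W x C image layout are assumptions made for illustration; the abstract does not say how the mask is obtained or what values it takes.

```python
import numpy as np


def apply_foreground_mask(image, mask):
    """Element-wise multiply an H x W x C image by an H x W mask.

    Hypothetical helper: the name, the array layout, and the use of a
    0/1 mask are illustrative assumptions, not details from the paper.
    """
    image = np.asarray(image, dtype=np.float32)
    mask = np.asarray(mask, dtype=np.float32)
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image's height and width")
    # Add a trailing axis so the mask broadcasts across all channels.
    return image * mask[..., None]


# Toy example: a 4 x 4 RGB image of ones with a 2 x 2 foreground region.
image = np.ones((4, 4, 3), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.float32)
mask[1:3, 1:3] = 1.0

masked = apply_foreground_mask(image, mask)
```

Pixels where the mask is zero are suppressed, while pixels where it is one pass through unchanged, so a network fed `masked` sees only the regions the mask marks as foreground.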
Original language | English |
---|---|
Article number | 109585 |
Journal | Pattern Recognition |
Volume | 141 |
Publication status | Published - Sept 2023 |
Keywords
- Crowd counting
- Density estimation
- Foreground enhancement
- Multi-Pathway zooming