Multi-scale object detection by top-down and bottom-up feature pyramid network

Zhao Baojun*, Zhao Boya, Tang Linbo, Wang Wenzheng, Wu Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

25 Citations (Scopus)

Abstract

As object detection technology, especially deep neural networks, has advanced, many related tasks, such as medical applications and industrial automation, have achieved great success. However, detecting objects with multiple aspect ratios and scales remains a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation with anchor generation at multiple aspect ratios. First, to build the multi-scale feature map, several fully convolutional layers are appended after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted: the top-down flow, a deconvolution procedure, introduces context information, while the bottom-up flow, a pooling procedure, supplements sub-original information. Third, the problem of adapting to different object aspect ratios is tackled by placing anchor shapes of multiple aspect ratios on each multi-scale feature map. The proposed method is evaluated on the pattern analysis, statistical modeling and computational learning visual object classes (PASCAL VOC) dataset and reaches an accuracy of 79%, a 1.8% improvement, at a detection speed of 23 fps.
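
The record contains no implementation, but the abstract's two fusion flows can be illustrated. Below is a minimal PyTorch sketch of how a deconvolution-based top-down flow and a pooling-based bottom-up flow might link one pyramid level to its neighbors. The module name TDBUFusion, the 256-channel width, and the element-wise-sum fusion are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch, not the authors' code. Assumes three neighboring
# feature maps with the same channel count, ordered finest -> coarsest.
import torch
import torch.nn as nn

class TDBUFusion(nn.Module):
    """Links neighboring pyramid levels: a top-down deconvolution flow
    brings context from the coarser map, and a bottom-up pooling flow
    supplements sub-original detail from the finer map."""
    def __init__(self, channels: int = 256):
        super().__init__()
        # Top-down flow: deconvolution upsamples the coarser map by 2x.
        self.deconv = nn.ConvTranspose2d(channels, channels,
                                         kernel_size=2, stride=2)
        # Bottom-up flow: pooling downsamples the finer map by 2x.
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)

    def forward(self, finer, current, coarser):
        top_down = self.deconv(coarser)   # context information
        bottom_up = self.pool(finer)      # sub-original information
        # Element-wise sum is one plausible fusion operator; the paper
        # may instead use concatenation or a learned combination.
        return current + top_down + bottom_up

# Usage with dummy 256-channel maps at strides 8/16/32 of a 256x256 input.
f8, f16, f32 = (torch.randn(1, 256, s, s) for s in (32, 16, 8))
fused16 = TDBUFusion(256)(f8, f16, f32)
print(fused16.shape)  # torch.Size([1, 256, 16, 16])
```

Anchors of multiple aspect ratios (e.g. the 1:2, 1:1, 2:1 set common in anchor-based detectors; the exact ratios are not given in this record) would then be placed at every position of each fused multi-scale map.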

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: Journal of Systems Engineering and Electronics
Volume: 30
Issue number: 1
DOIs
Publication status: Published - Feb 2019

Keywords

  • convolutional neural network (CNN)
  • deconvolution
  • feature pyramid network (FPN)
  • object detection
