Improving performance and adaptivity of anchor-based detector using differentiable anchoring with efficient target generation

Zeyang Dou; Kun Gao; Xiaodian Zhang; Hong Wang; Junwei Wang

doi:10.1109/TIP.2020.3038349

Zeyang Dou, Kun Gao*, Xiaodian Zhang, Hong Wang, Junwei Wang

*Corresponding author for this work

Ministry of Education

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Most anchor-based object detection methods adopt predefined anchor boxes as regression references. However, the proper setting of anchor boxes may vary significantly across datasets, and improperly designed anchors severely limit the performance and adaptability of detectors. Recently, some works have tackled this problem by learning anchor shapes from datasets. However, all of these works explicitly or implicitly rely on predefined anchors, which limits the universality of the detectors. In this paper, we propose a simple anchor-learning scheme with an effective target generation method that removes the dependency on predefined anchors. The proposed scheme, named differentiable anchoring, simplifies the anchor-shape learning process by adding only one branch in parallel with the existing classification and bounding box regression branches. The proposed target generation method, which comprises the $L_{p}$ norm ball approximation and the optimization difficulty-based pyramid level assignment approach, generates positive samples for the new branch. Compared with existing anchor-learning approaches, the proposed method does not require any predefined anchors, while tremendously improving the performance and adaptability of detectors. The proposed method can be seamlessly integrated into Faster RCNN, RetinaNet, and SSD, improving the detection mAP by 2.8%, 2.1%, and 2.3%, respectively, on the MS COCO 2017 test-dev set. Moreover, the differentiable anchoring-based detectors can be directly applied to specific scenarios without any modification of the hyperparameters or any specialized optimization. Specifically, the differentiable anchoring-based RetinaNet achieves very competitive performance on tiny face detection and text detection tasks, which are not well handled by the conventional and guided anchoring-based RetinaNets for the MS COCO dataset.
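
The abstract names an $L_{p}$ norm ball approximation for generating positive samples but does not define it. As a rough illustration of the underlying geometry only, the sketch below marks the grid cells that fall inside an axis-aligned $L_{p}$ ball fitted to a box. This is generic $L_{p}$ ball geometry, not the authors' formulation: the function name `lp_ball_mask`, its parameters, and the choice of semi-axes w/2 and h/2 are all assumptions made for the example.

```python
import numpy as np


def lp_ball_mask(height, width, cx, cy, w, h, p=2.0):
    """Return a boolean mask of grid cells inside an axis-aligned L_p ball.

    Hypothetical helper for illustration. The ball is centred at (cx, cy)
    with semi-axes w/2 and h/2, and a cell (x, y) is inside when
        |(x - cx) / (w/2)|**p + |(y - cy) / (h/2)|**p <= 1.
    p = 1 gives a diamond, p = 2 an ellipse, and larger p moves the
    shape towards the full w-by-h box.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dx = np.abs(xs - cx) / (w / 2.0)
    dy = np.abs(ys - cy) / (h / 2.0)
    return dx ** p + dy ** p <= 1.0


# Assumed toy setting: a 9x9 feature map with an 8x8 box centred at (4, 4).
diamond = lp_ball_mask(9, 9, cx=4, cy=4, w=8, h=8, p=1)
ellipse = lp_ball_mask(9, 9, cx=4, cy=4, w=8, h=8, p=2)
print(int(diamond.sum()), int(ellipse.sum()))
```

In this toy setting, raising p enlarges the set of selected cells, so p acts as a knob between a tight central region and the whole box. How the paper actually chooses p and uses the region is not stated in the abstract.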

Original language	English
Article number	9266069
Pages (from-to)	712-724
Number of pages	13
Journal	IEEE Transactions on Image Processing
Volume	30
DOIs	https://doi.org/10.1109/TIP.2020.3038349
Publication status	Published - 2021
Externally published	Yes

Keywords

$L_{p}$ norm ball
Object detection
anchor
feature pyramid

Cite this

Dou, Z., Gao, K., Zhang, X., Wang, H., & Wang, J. (2021). Improving performance and adaptivity of anchor-based detector using differentiable anchoring with efficient target generation. IEEE Transactions on Image Processing, 30, 712-724. Article 9266069. https://doi.org/10.1109/TIP.2020.3038349

@article{b19e2eeb13c4479f8be035af0a8bd721,

title = "Improving performance and adaptivity of anchor-based detector using differentiable anchoring with efficient target generation",

abstract = "Most anchor-based object detection methods have adopted predefined anchor boxes as regression references. However, the proper setting of anchor boxes may vary significantly across different datasets, improperly designed anchors severely limit the performances and adaptabilities of detectors. Recently, some works have tackled this problem by learning anchor shapes from datasets. However, all of these works explicitly or implicitly rely on predefined anchors, limiting universalities of detectors. In this paper, we propose a simple learning anchoring scheme with an effective target generation method to cast off predefined anchor dependencies. The proposed anchoring scheme, named as differentiable anchoring, simplifies learning anchor shape process by adding only one branch in parallel with the existing classification and bounding box regression branches. The proposed target generation method, including the $L_{p}$ norm ball approximation and the optimization difficulty-based pyramid level assignment approach, generates positive samples for the new branch. Compared with existing learning anchoring-based approaches, the proposed method doesn't require any predefined anchors, while tremendously improving performances and adaptiveness of detectors. The proposed method can be seamlessly integrated to Faster RCNN, RetinaNet, and SSD, improving the detection mAP by 2.8%, 2.1% and 2.3% respectively on MS COCO 2017 test-dev set. Moreover, the differentiable anchoring-based detectors can be directly applied to specific scenarios without any modification of the hyperparameters or using a specialized optimization. Specifically, the differentiable anchoring-based RetinaNet achieves very competitive performances on tiny face detection and text detection tasks, which are not well handled by the conventional and guided anchoring based RetinaNets for the MS COCO dataset.",

keywords = "$L_{p}$ norm ball, Object detection, anchor, feature pyramid",

author = "Zeyang Dou and Kun Gao and Xiaodian Zhang and Hong Wang and Junwei Wang",

note = "Publisher Copyright: {\textcopyright} 1992-2012 IEEE.",

year = "2021",

doi = "10.1109/TIP.2020.3038349",

language = "English",

volume = "30",

pages = "712--724",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Improving performance and adaptivity of anchor-based detector using differentiable anchoring with efficient target generation

AU - Dou, Zeyang

AU - Gao, Kun

AU - Zhang, Xiaodian

AU - Wang, Hong

AU - Wang, Junwei

PY - 2021

Y1 - 2021

AB - Most anchor-based object detection methods have adopted predefined anchor boxes as regression references. However, the proper setting of anchor boxes may vary significantly across different datasets, improperly designed anchors severely limit the performances and adaptabilities of detectors. Recently, some works have tackled this problem by learning anchor shapes from datasets. However, all of these works explicitly or implicitly rely on predefined anchors, limiting universalities of detectors. In this paper, we propose a simple learning anchoring scheme with an effective target generation method to cast off predefined anchor dependencies. The proposed anchoring scheme, named as differentiable anchoring, simplifies learning anchor shape process by adding only one branch in parallel with the existing classification and bounding box regression branches. The proposed target generation method, including the $L_{p}$ norm ball approximation and the optimization difficulty-based pyramid level assignment approach, generates positive samples for the new branch. Compared with existing learning anchoring-based approaches, the proposed method doesn't require any predefined anchors, while tremendously improving performances and adaptiveness of detectors. The proposed method can be seamlessly integrated to Faster RCNN, RetinaNet, and SSD, improving the detection mAP by 2.8%, 2.1% and 2.3% respectively on MS COCO 2017 test-dev set. Moreover, the differentiable anchoring-based detectors can be directly applied to specific scenarios without any modification of the hyperparameters or using a specialized optimization. Specifically, the differentiable anchoring-based RetinaNet achieves very competitive performances on tiny face detection and text detection tasks, which are not well handled by the conventional and guided anchoring based RetinaNets for the MS COCO dataset.

KW - $L_{p}$ norm ball

KW - Object detection

KW - anchor

KW - feature pyramid

UR - http://www.scopus.com/inward/record.url?scp=85097182029&partnerID=8YFLogxK

U2 - 10.1109/TIP.2020.3038349

DO - 10.1109/TIP.2020.3038349

M3 - Article

C2 - 33226941

AN - SCOPUS:85097182029

SN - 1057-7149

VL - 30

SP - 712

EP - 724

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

M1 - 9266069

ER -
