TY - JOUR
T1 - Improving performance and adaptivity of anchor-based detector using differentiable anchoring with efficient target generation
AU - Dou, Zeyang
AU - Gao, Kun
AU - Zhang, Xiaodian
AU - Wang, Hong
AU - Wang, Junwei
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - Most anchor-based object detection methods adopt predefined anchor boxes as regression references. However, the proper anchor box settings may vary significantly across datasets, and improperly designed anchors severely limit the performance and adaptability of detectors. Recently, some works have tackled this problem by learning anchor shapes from datasets. However, all of these works explicitly or implicitly rely on predefined anchors, which limits the generality of detectors. In this paper, we propose a simple anchor-learning scheme with an effective target generation method that removes the dependence on predefined anchors. The proposed anchoring scheme, named differentiable anchoring, simplifies the process of learning anchor shapes by adding only one branch in parallel with the existing classification and bounding box regression branches. The proposed target generation method, which comprises the $L_{p}$ norm ball approximation and the optimization difficulty-based pyramid level assignment approach, generates positive samples for the new branch. Compared with existing anchor-learning approaches, the proposed method does not require any predefined anchors, while considerably improving the performance and adaptability of detectors. The proposed method can be seamlessly integrated into Faster RCNN, RetinaNet, and SSD, improving the detection mAP by 2.8%, 2.1%, and 2.3%, respectively, on the MS COCO 2017 test-dev set. Moreover, the differentiable anchoring-based detectors can be directly applied to specific scenarios without modifying the hyperparameters or using specialized optimization. Specifically, the differentiable anchoring-based RetinaNet achieves very competitive performance on tiny face detection and text detection, tasks that are not well handled by the conventional and guided anchoring-based RetinaNets for the MS COCO dataset.
AB - Most anchor-based object detection methods adopt predefined anchor boxes as regression references. However, the proper anchor box settings may vary significantly across datasets, and improperly designed anchors severely limit the performance and adaptability of detectors. Recently, some works have tackled this problem by learning anchor shapes from datasets. However, all of these works explicitly or implicitly rely on predefined anchors, which limits the generality of detectors. In this paper, we propose a simple anchor-learning scheme with an effective target generation method that removes the dependence on predefined anchors. The proposed anchoring scheme, named differentiable anchoring, simplifies the process of learning anchor shapes by adding only one branch in parallel with the existing classification and bounding box regression branches. The proposed target generation method, which comprises the $L_{p}$ norm ball approximation and the optimization difficulty-based pyramid level assignment approach, generates positive samples for the new branch. Compared with existing anchor-learning approaches, the proposed method does not require any predefined anchors, while considerably improving the performance and adaptability of detectors. The proposed method can be seamlessly integrated into Faster RCNN, RetinaNet, and SSD, improving the detection mAP by 2.8%, 2.1%, and 2.3%, respectively, on the MS COCO 2017 test-dev set. Moreover, the differentiable anchoring-based detectors can be directly applied to specific scenarios without modifying the hyperparameters or using specialized optimization. Specifically, the differentiable anchoring-based RetinaNet achieves very competitive performance on tiny face detection and text detection, tasks that are not well handled by the conventional and guided anchoring-based RetinaNets for the MS COCO dataset.
KW - $L_{p}$ norm ball
KW - Object detection
KW - anchor
KW - feature pyramid
UR - http://www.scopus.com/inward/record.url?scp=85097182029&partnerID=8YFLogxK
U2 - 10.1109/TIP.2020.3038349
DO - 10.1109/TIP.2020.3038349
M3 - Article
C2 - 33226941
AN - SCOPUS:85097182029
SN - 1057-7149
VL - 30
SP - 712
EP - 724
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9266069
ER -