TY - JOUR
T1 - Feature Alignment in Anchor-Free Object Detection
AU - Gao, Feng
AU - Cai, Yeyun
AU - Deng, Fang
AU - Yu, Chengpu
AU - Chen, Jie
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2023/8/1
Y1 - 2023/8/1
N2 - Most anchor-free methods perform object detection using dense recommendation, which assumes that a single point can simultaneously yield accurate category prediction and regression estimation. However, because the two tasks have different drivers, the valid features for classification and regression may be located in distinct areas during the training phase. This problem is called feature misalignment. To solve it, we propose a new feature alignment method based on an anchor-free object detector. First, a global receptive field adaptor (G-RFA) is designed by combining the feature pyramid network (FPN) with a global attention mechanism, and the forward features are further fine-tuned with a deformable subnet (De-Subnet) to remove the influence of redundant contextual information. Then, a new feature filter strategy with a misalignment score is proposed to guide the network to focus on sampling points with aligned features. In addition, we establish mutually independent multi-layer quality distributions to model the prior information of an object on different FPN levels. Equipped with our method, the classification and regression features are aligned, and the generated foreground weight map converges to the centers of the classification and regression heatmaps. Experimental results show that, without bells and whistles, our method achieves 49.3% AP on MS COCO test-dev under the default 2× training schedule, outperforming related methods. Furthermore, experiments on PASCAL VOC demonstrate the generalization ability of our method. Code is available at https://github.com/GFENGG/featurealign.
AB - Most anchor-free methods perform object detection using dense recommendation, which assumes that a single point can simultaneously yield accurate category prediction and regression estimation. However, because the two tasks have different drivers, the valid features for classification and regression may be located in distinct areas during the training phase. This problem is called feature misalignment. To solve it, we propose a new feature alignment method based on an anchor-free object detector. First, a global receptive field adaptor (G-RFA) is designed by combining the feature pyramid network (FPN) with a global attention mechanism, and the forward features are further fine-tuned with a deformable subnet (De-Subnet) to remove the influence of redundant contextual information. Then, a new feature filter strategy with a misalignment score is proposed to guide the network to focus on sampling points with aligned features. In addition, we establish mutually independent multi-layer quality distributions to model the prior information of an object on different FPN levels. Equipped with our method, the classification and regression features are aligned, and the generated foreground weight map converges to the centers of the classification and regression heatmaps. Experimental results show that, without bells and whistles, our method achieves 49.3% AP on MS COCO test-dev under the default 2× training schedule, outperforming related methods. Furthermore, experiments on PASCAL VOC demonstrate the generalization ability of our method. Code is available at https://github.com/GFENGG/featurealign.
KW - Object detection
KW - anchor-free models
KW - feature alignment
UR - http://www.scopus.com/inward/record.url?scp=85148440959&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2023.3241993
DO - 10.1109/TCSVT.2023.3241993
M3 - Article
AN - SCOPUS:85148440959
SN - 1051-8215
VL - 33
SP - 3799
EP - 3810
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 8
ER -