TY - JOUR
T1 - Multistage Object Detection with Group Recursive Learning
AU - Li, Jianan
AU - Liang, Xiaodan
AU - Li, Jianshu
AU - Wei, Yunchao
AU - Xu, Tingfa
AU - Feng, Jiashi
AU - Yan, Shuicheng
N1 - Publisher Copyright:
© 1999-2012 IEEE.
PY - 2018/7
Y1 - 2018/7
N2 - Most existing detection pipelines treat object proposals independently and predict bounding box locations and classification scores over them separately. However, the important semantic and spatial layout correlations among proposals are often ignored, which are actually useful for more accurate object detection. In this paper, we propose a new EM-like group recursive learning approach to iteratively refine object proposals by incorporating such context of surrounding proposals and provide an optimal spatial configuration of object detections. In addition, we propose to incorporate the weakly supervised object segmentation cues and region-based object detection into a multistage architecture in order to fully exploit the learned segmentation features for better object detection in an end-to-end way. The proposed architecture consists of three cascaded networks that, respectively, learn to perform weakly supervised object segmentation, object proposal generation, and recursive detection refinement. Combining the group recursive learning and the multistage architecture provides competitive mAPs of 78.7% and 74.9% on the PASCAL VOC2007 and VOC2012 datasets, respectively, which outperform many well-established baselines significantly.
KW - Image segmentation
KW - neural networks
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85034591384&partnerID=8YFLogxK
U2 - 10.1109/TMM.2017.2772796
DO - 10.1109/TMM.2017.2772796
M3 - Article
AN - SCOPUS:85034591384
SN - 1520-9210
VL - 20
SP - 1645
EP - 1655
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
IS - 7
ER -