TY - JOUR
T1 - Coupling-and-decoupling
T2 - A hierarchical model for occlusion-free object detection
AU - Li, Bo
AU - Song, Xi
AU - Wu, Tianfu
AU - Hu, Wenze
AU - Pei, Mingtao
PY - 2014/10
Y1 - 2014/10
AB - Handling occlusion is a very challenging problem in object detection. This paper presents a method of learning a hierarchical model for X-to-X occlusion-free object detection (e.g., car-to-car and person-to-person occlusions in our experiments). The proposed method is motivated by an intuitive coupling-and-decoupling strategy. In the learning stage, the pair of occluding X's (e.g., car pairs or person pairs) is represented directly and jointly by a hierarchical And-Or directed acyclic graph (AOG) which accounts for the statistically significant co-occurrence (i.e., coupling). The structure and the parameters of the AOG are learned using the latent structural SVM (LSSVM) framework. In detection, a dynamic programming (DP) algorithm is utilized to find the best parse trees for all sliding windows with detection scores being greater than the learned threshold. Then, the two single X's are decoupled from the declared detections of X-to-X occluding pairs together with some non-maximum suppression (NMS) post-processing. In experiments, our method is tested on both a roadside-car dataset collected by ourselves (which will be released with this paper) and two public person datasets, the MPII-2Person dataset and the TUD-Crossing dataset. Our method is compared with state-of-the-art deformable part-based methods, and obtains comparable or better detection performance.
KW - And-Or graph
KW - Deformable part-based model
KW - Latent structural SVM
KW - Object detection
KW - Occlusion modeling
UR - http://www.scopus.com/inward/record.url?scp=84902331104&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2014.04.016
DO - 10.1016/j.patcog.2014.04.016
M3 - Article
AN - SCOPUS:84902331104
SN - 0031-3203
VL - 47
SP - 3254
EP - 3264
JO - Pattern Recognition
JF - Pattern Recognition
IS - 10
ER -