TY - GEN
T1 - Coupling-and-decoupling
T2 - 11th Asian Conference on Computer Vision, ACCV 2012
AU - Li, Bo
AU - Wu, Tianfu
AU - Hu, Wenze
AU - Pei, Mingtao
PY - 2013
Y1 - 2013
AB - Handling occlusions in object detection is a long-standing problem. This paper addresses the problem of X-to-X-occlusion-free object detection (e.g. car-to-car occlusions in our experiment) by utilizing an intuitive coupling-and-decoupling strategy. In the "coupling" stage, we model the pair of occluding X's (e.g. car pairs) directly to account for the statistically strong co-occurrence (i.e. coupling). Then, we learn a hierarchical And-Or directed acyclic graph (AOG) model under the latent structural SVM (LSSVM) framework. The learned AOG consists of, from top to bottom, (i) a root Or-node representing different compositions of occluding X pairs, (ii) a set of And-nodes, each of which represents a specific composition of occluding X pairs, (iii) another set of And-nodes representing single X's decomposed from occluding X pairs, and (iv) a set of terminal nodes which represent the appearance templates for the X pairs, single X's and latent parts of the single X's, respectively. The part appearance templates can also be shared among different single X's. In detection, a dynamic programming (DP) algorithm is used and, as a natural consequence, we decouple the two single X's from the X-to-X occluding pairs. In experiments, we test our method on roadside cars that we collected from a real traffic video surveillance environment. We compare our model with the state-of-the-art deformable part-based model (DPM) and obtain better detection performance.
UR - http://www.scopus.com/inward/record.url?scp=84875887737&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-37331-2_13
DO - 10.1007/978-3-642-37331-2_13
M3 - Conference contribution
AN - SCOPUS:84875887737
SN - 9783642373305
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 164
EP - 175
BT - Computer Vision, ACCV 2012 - 11th Asian Conference on Computer Vision, Revised Selected Papers
Y2 - 5 November 2012 through 9 November 2012
ER -