TY - GEN
T1 - Reconfigurable templates for robust vehicle detection and classification
AU - Lv, Yang
AU - Yao, Benjamin
AU - Wang, Yongtian
AU - Zhu, Song Chun
PY - 2012
Y1 - 2012
N2 - In this paper, we learn a reconfigurable template for detecting vehicles and classifying their types. We adopt a popular design for the part-based model that has one coarse template covering the entire object window and several small high-resolution templates representing parts. The reconfigurable template can learn part configurations that capture the spatial correlation of features for a deformable part-based model. The template features are Histograms of Gradients (HoG). To better describe the actual dimensions and locations of "parts" (i.e., features with strong spatial correlations), we design a dictionary of rectangular primitives of various sizes, aspect ratios, and positions. A configuration is defined as a subset of non-overlapping primitives from this dictionary. Learning the optimal configuration with an SVM amounts to finding the subset of parts that minimizes the regularized hinge loss, which leads to a non-convex optimization problem. We solve this problem by replacing the hinge loss with a negative sigmoid loss that can be approximately decomposed into losses (or negative sigmoid scores) of individual parts. In our experiments, we compare our method empirically with group lasso and a state-of-the-art method [7] and demonstrate that models learned with our method outperform the others on two computer vision applications: vehicle localization and vehicle model recognition.
AB - In this paper, we learn a reconfigurable template for detecting vehicles and classifying their types. We adopt a popular design for the part-based model that has one coarse template covering the entire object window and several small high-resolution templates representing parts. The reconfigurable template can learn part configurations that capture the spatial correlation of features for a deformable part-based model. The template features are Histograms of Gradients (HoG). To better describe the actual dimensions and locations of "parts" (i.e., features with strong spatial correlations), we design a dictionary of rectangular primitives of various sizes, aspect ratios, and positions. A configuration is defined as a subset of non-overlapping primitives from this dictionary. Learning the optimal configuration with an SVM amounts to finding the subset of parts that minimizes the regularized hinge loss, which leads to a non-convex optimization problem. We solve this problem by replacing the hinge loss with a negative sigmoid loss that can be approximately decomposed into losses (or negative sigmoid scores) of individual parts. In our experiments, we compare our method empirically with group lasso and a state-of-the-art method [7] and demonstrate that models learned with our method outperform the others on two computer vision applications: vehicle localization and vehicle model recognition.
UR - https://www.scopus.com/pages/publications/84860675785
U2 - 10.1109/WACV.2012.6163016
DO - 10.1109/WACV.2012.6163016
M3 - Conference contribution
AN - SCOPUS:84860675785
SN - 9781467302333
T3 - Proceedings of IEEE Workshop on Applications of Computer Vision
SP - 321
EP - 328
BT - 2012 IEEE Workshop on the Applications of Computer Vision, WACV 2012
PB - IEEE Computer Society
T2 - 2012 IEEE Workshop on the Applications of Computer Vision, WACV 2012
Y2 - 9 January 2012 through 11 January 2012
ER -