TY - JOUR
T1 - Enhanced and hierarchical structure algorithm for data imbalance problem in semantic extraction under massive video dataset
AU - Gao, Zan
AU - Zhang, Long Fei
AU - Chen, Ming Yu
AU - Hauptmann, Alexander
AU - Zhang, Hua
AU - Cai, An Ni
PY - 2014/2
Y1 - 2014/2
N2 - Data imbalance is common in real-life datasets, especially massive video datasets. Traditional machine learning algorithms, however, assume a balanced data distribution and equal misclassification costs, so they have difficulty describing the true data distribution accurately, which results in misclassification. In this paper, the data imbalance problem in semantic extraction from massive video datasets is addressed, and an enhanced and hierarchical structure (EHS) algorithm is proposed. In the proposed algorithm, data sampling, filtering, and model training are integrated compactly in a hierarchical structure, so that model performance improves step by step and remains robust and stable as features and datasets change. Experiments on TRECVID2010 Semantic Indexing demonstrate that the proposed algorithm performs much better than traditional machine learning algorithms and remains stable and robust when different kinds of features are employed. Extended experiments on TRECVID2010 Surveillance Event Detection also show that the EHS algorithm is efficient and effective, reaching top performance in four of seven events.
AB - Data imbalance is common in real-life datasets, especially massive video datasets. Traditional machine learning algorithms, however, assume a balanced data distribution and equal misclassification costs, so they have difficulty describing the true data distribution accurately, which results in misclassification. In this paper, the data imbalance problem in semantic extraction from massive video datasets is addressed, and an enhanced and hierarchical structure (EHS) algorithm is proposed. In the proposed algorithm, data sampling, filtering, and model training are integrated compactly in a hierarchical structure, so that model performance improves step by step and remains robust and stable as features and datasets change. Experiments on TRECVID2010 Semantic Indexing demonstrate that the proposed algorithm performs much better than traditional machine learning algorithms and remains stable and robust when different kinds of features are employed. Extended experiments on TRECVID2010 Surveillance Event Detection also show that the EHS algorithm is efficient and effective, reaching top performance in four of seven events.
KW - Enhanced and hierarchical structure (EHS)
KW - Data imbalance
KW - Massive video dataset
KW - Semantic indexing
KW - Surveillance event detection
KW - TRECVID
UR - http://www.scopus.com/inward/record.url?scp=84895064446&partnerID=8YFLogxK
U2 - 10.1007/s11042-012-1071-7
DO - 10.1007/s11042-012-1071-7
M3 - Article
AN - SCOPUS:84895064446
SN - 1380-7501
VL - 68
SP - 641
EP - 657
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 3
ER -