Abstract
In this paper, we present a model for learning atomic actions for the classification of complex activities. A video sequence is first represented by a collection of visual interest points. The model automatically clusters visual words into atomic actions based on their co-occurrence and temporal proximity, using an extension of the Hierarchical Dirichlet Process (HDP) mixture model. Because the HDP is a generative model, our approach is robust to noisy interest points arising under various conditions. Based on the atomic actions learned by our model, we use both a Naive Bayes classifier and a linear SVM for activity classification. We first use a synthetic example to demonstrate the intermediate results, and then apply the model to the complex 16-class Olympic Sports dataset, showing that it outperforms other state-of-the-art methods.
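As a rough illustration of the final stage described above, the sketch below shows how videos, once summarised as histograms over learned atomic actions, could be fed to the two classifiers mentioned in the abstract. This is not the authors' implementation: the atomic-action histograms here are synthetic placeholders (the HDP learning step is assumed to have already produced them), and scikit-learn's `MultinomialNB` and `LinearSVC` stand in for the Naive Bayes and linear SVM classifiers.

```python
# Minimal sketch, assuming each video has already been reduced to a
# histogram of counts over K learned atomic actions (placeholder data below).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_videos, n_atomic_actions, n_classes = 320, 40, 16  # 16 classes, as in Olympic Sports

# Hypothetical bag-of-atomic-actions counts: one histogram per video.
X = rng.poisson(lam=3.0, size=(n_videos, n_atomic_actions))
y = rng.integers(0, n_classes, size=n_videos)  # placeholder activity labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the two classifiers named in the abstract and report test accuracy.
for name, clf in [("Naive Bayes", MultinomialNB()),
                  ("Linear SVM", LinearSVC(C=1.0, max_iter=10000))]:
    clf.fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: accuracy = {acc:.3f}")
```

With real HDP-derived histograms in place of the random counts, the same two-classifier comparison would reproduce the evaluation protocol outlined in the abstract.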
Original language | English |
---|---|
Article number | 6298410 |
Pages (from-to) | 278-283 |
Number of pages | 6 |
Journal | Proceedings - IEEE International Conference on Multimedia and Expo |
DOIs | |
Publication status | Published - 2012 |
Event | 2012 13th IEEE International Conference on Multimedia and Expo, ICME 2012, Melbourne, VIC, Australia; duration: 9 Jul 2012 → 13 Jul 2012 |
Keywords
- Activity classification
- atomic action
- temporal relation