Abstract
We designed a simple system for the 2014 TRECVID Multimedia Event Detection task [1]. Apart from the videos provided by NIST, we used only the BVLC Reference CaffeNet model file distributed alongside Caffe [2]. Our system follows the standard pipeline and consists of two parts: feature extraction and classification. Feature extraction is implemented with Caffe and classification with LIBSVM [3]. Based on the results, we think that the contribution mainly comes from the feature extraction part. We learned that the Convolutional Neural Network (CNN) is a powerful model, and we hope that an easily accessible spatio-temporal CNN model for videos will be available soon.
Original language | English |
---|---|
Publication status | Published - 2020 |
Event | 2014 TREC Video Retrieval Evaluation, TRECVID 2014 - Orlando, United States; Duration: 10 Nov 2014 → 12 Nov 2014 |
Conference
Conference | 2014 TREC Video Retrieval Evaluation, TRECVID 2014 |
---|---|
Country/Territory | United States |
City | Orlando |
Period | 10/11/14 → 12/11/14 |