TY - GEN
T1 - Video retrieval based on deep convolutional neural network
AU - Dong, Yajiao
AU - Li, Jianguo
N1 - Publisher Copyright:
Copyright © 2018 ACM
PY - 2018/4/28
Y1 - 2018/4/28
N2 - Recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. As an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. Since videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than from images. Therefore, high-level semantic features are needed to represent videos, rather than low-level hand-crafted ones. In this paper, a deep convolutional neural network is proposed to extract high-level semantic features, and a binary hash function is then integrated into this framework to achieve end-to-end optimization. In particular, our approach combines a triplet loss function, which preserves the relative similarity and differences between videos, with a classification loss function as the optimization objective. Experiments have been performed on two public datasets, and the results demonstrate the superiority of the proposed method over other state-of-the-art video retrieval methods.
AB - Recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. As an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. Since videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than from images. Therefore, high-level semantic features are needed to represent videos, rather than low-level hand-crafted ones. In this paper, a deep convolutional neural network is proposed to extract high-level semantic features, and a binary hash function is then integrated into this framework to achieve end-to-end optimization. In particular, our approach combines a triplet loss function, which preserves the relative similarity and differences between videos, with a classification loss function as the optimization objective. Experiments have been performed on two public datasets, and the results demonstrate the superiority of the proposed method over other state-of-the-art video retrieval methods.
KW - Deep convolutional neural network
KW - Hash mapping function
KW - Video retrieval
UR - http://www.scopus.com/inward/record.url?scp=85051237082&partnerID=8YFLogxK
U2 - 10.1145/3220162.3220168
DO - 10.1145/3220162.3220168
M3 - Conference contribution
AN - SCOPUS:85051237082
T3 - ACM International Conference Proceeding Series
SP - 12
EP - 16
BT - ICMSSP 2018 - 2018 3rd International Conference on Multimedia Systems and Signal Processing
PB - Association for Computing Machinery
T2 - 3rd International Conference on Multimedia Systems and Signal Processing, ICMSSP 2018
Y2 - 28 April 2018 through 30 April 2018
ER -