Video retrieval based on deep convolutional neural network

Yajiao Dong; Jianguo Li

doi:10.1145/3220162.3220168

School of Cyberspace Science and Technology

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

14 Citations (Scopus)

Abstract

Recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. As an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. As videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than that from images. Therefore, high-level semantic features to represent videos are needed rather than low-level hand-crafted methods. In this paper, a deep convolutional neural network is proposed to extract high-level semantic features and a binary hash function is then integrated into this framework to achieve an end-to-end optimization. Particularly, our approach also combines triplet loss function which preserves the relative similarity and difference of videos and classification loss function as the optimization objective. Experiments have been performed on two public datasets and the results demonstrate the superiority of our proposed method compared with other state-of-the-art video retrieval methods.
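
The abstract names the ingredients of the objective (a triplet loss, a classification loss, and a binary hash function) but not their exact formulation. The following is a minimal sketch of how such a combined objective could look, not the authors' implementation. The squared Euclidean distance, the margin, the weighting factor `weight`, the threshold-at-zero binarization, and all function names are assumptions made for illustration.

```python
import math


def binarize(features):
    """Map real-valued features to a binary hash code.

    Thresholding at zero is an assumed rule; the paper's hash function may differ.
    """
    return [1 if x >= 0 else 0 for x in features]


def hamming(code_a, code_b):
    """Number of bit positions in which two hash codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def sq_dist(u, v):
    """Squared Euclidean distance between two real-valued feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: the anchor should be closer to the
    positive (similar video) than to the negative by at least `margin`."""
    return max(0.0, margin + sq_dist(anchor, positive) - sq_dist(anchor, negative))


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def combined_loss(anchor, positive, negative, logits, label, margin=1.0, weight=1.0):
    """Assumed combination: triplet loss plus a weighted classification loss."""
    return triplet_loss(anchor, positive, negative, margin) + weight * cross_entropy(logits, label)


# Hypothetical 4-dimensional features for three videos.
anchor = [0.9, -0.8, 0.7, -0.6]
positive = [0.8, -0.9, 0.6, -0.7]
negative = [-0.7, 0.9, -0.8, 0.5]

# At retrieval time, videos would be compared by Hamming distance on the codes.
dist_to_positive = hamming(binarize(anchor), binarize(positive))
dist_to_negative = hamming(binarize(anchor), binarize(negative))
```

In this toy example the similar pair maps to identical hash codes while the dissimilar pair differs in every bit, which is the relative ordering the triplet term is meant to preserve.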

Original language	English
Title of host publication	ICMSSP 2018 - 2018 3rd International Conference on Multimedia Systems and Signal Processing
Publisher	Association for Computing Machinery
Pages	12-16
Number of pages	5
ISBN (Electronic)	9781450364577
DOI	https://doi.org/10.1145/3220162.3220168
Publication status	Published - 28 Apr 2018
Event	3rd International Conference on Multimedia Systems and Signal Processing, ICMSSP 2018 - Shenzhen, China. Duration: 28 Apr 2018 → 30 Apr 2018

Publication series

Name	ACM International Conference Proceeding Series

Conference

Conference	3rd International Conference on Multimedia Systems and Signal Processing, ICMSSP 2018
Country/Territory	China
City	Shenzhen
Period	28/04/18 → 30/04/18

Access to Document

10.1145/3220162.3220168

Other files and links

Link to publication in Scopus

Cite this

Dong, Y., & Li, J. (2018). Video retrieval based on deep convolutional neural network. In ICMSSP 2018 - 2018 3rd International Conference on Multimedia Systems and Signal Processing (pp. 12-16). (ACM International Conference Proceeding Series). Association for Computing Machinery. https://doi.org/10.1145/3220162.3220168

@inproceedings{a7aef0718e12440db9ee4b75a8361117,

title = "Video retrieval based on deep convolutional neural network",

abstract = "Recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. As an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. As videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than that from images. Therefore, high-level semantic features to represent videos are needed rather than low-level hand-crafted methods. In this paper, a deep convolutional neural network is proposed to extract high-level semantic features and a binary hash function is then integrated into this framework to achieve an end-to-end optimization. Particularly, our approach also combines triplet loss function which preserves the relative similarity and difference of videos and classification loss function as the optimization objective. Experiments have been performed on two public datasets and the results demonstrate the superiority of our proposed method compared with other state-of-the-art video retrieval methods.",

keywords = "Deep convolutional neural network, Hash mapping function, Video retrieval",

author = "Yajiao Dong and Jianguo Li",

note = "Publisher Copyright: Copyright {\textcopyright} 2018 ACM; 3rd International Conference on Multimedia Systems and Signal Processing, ICMSSP 2018 ; Conference date: 28-04-2018 Through 30-04-2018",

year = "2018",

month = apr,

day = "28",

doi = "10.1145/3220162.3220168",

language = "English",

series = "ACM International Conference Proceeding Series",

publisher = "Association for Computing Machinery",

pages = "12--16",

booktitle = "ICMSSP 2018 - 2018 3rd International Conference on Multimedia Systems and Signal Processing",

}

Dong, Y & Li, J 2018, Video retrieval based on deep convolutional neural network. in ICMSSP 2018 - 2018 3rd International Conference on Multimedia Systems and Signal Processing. ACM International Conference Proceeding Series, Association for Computing Machinery, pp. 12-16, 3rd International Conference on Multimedia Systems and Signal Processing, ICMSSP 2018, Shenzhen, China, 28/04/18. https://doi.org/10.1145/3220162.3220168

TY - GEN

T1 - Video retrieval based on deep convolutional neural network

AU - Dong, Yajiao

AU - Li, Jianguo

PY - 2018/4/28

Y1 - 2018/4/28

N2 - Recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. As an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. As videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than that from images. Therefore, high-level semantic features to represent videos are needed rather than low-level hand-crafted methods. In this paper, a deep convolutional neural network is proposed to extract high-level semantic features and a binary hash function is then integrated into this framework to achieve an end-to-end optimization. Particularly, our approach also combines triplet loss function which preserves the relative similarity and difference of videos and classification loss function as the optimization objective. Experiments have been performed on two public datasets and the results demonstrate the superiority of our proposed method compared with other state-of-the-art video retrieval methods.

AB - Recently, with the enormous growth of online videos, fast video retrieval research has received increasing attention. As an extension of image hashing techniques, traditional video hashing methods mainly depend on hand-crafted features and transform the real-valued features into binary hash codes. As videos provide far more diverse and complex visual information than images, extracting features from videos is much more challenging than that from images. Therefore, high-level semantic features to represent videos are needed rather than low-level hand-crafted methods. In this paper, a deep convolutional neural network is proposed to extract high-level semantic features and a binary hash function is then integrated into this framework to achieve an end-to-end optimization. Particularly, our approach also combines triplet loss function which preserves the relative similarity and difference of videos and classification loss function as the optimization objective. Experiments have been performed on two public datasets and the results demonstrate the superiority of our proposed method compared with other state-of-the-art video retrieval methods.

KW - Deep convolutional neural network

KW - Hash mapping function

KW - Video retrieval

UR - http://www.scopus.com/inward/record.url?scp=85051237082&partnerID=8YFLogxK

U2 - 10.1145/3220162.3220168

DO - 10.1145/3220162.3220168

M3 - Conference contribution

AN - SCOPUS:85051237082

T3 - ACM International Conference Proceeding Series

SP - 12

EP - 16

BT - ICMSSP 2018 - 2018 3rd International Conference on Multimedia Systems and Signal Processing

PB - Association for Computing Machinery

T2 - 3rd International Conference on Multimedia Systems and Signal Processing, ICMSSP 2018

Y2 - 28 April 2018 through 30 April 2018

ER -
