Compact Deep Invariant Descriptors for Video Retrieval

Yihang Lou, Yan Bai, Jie Lin, Shiqi Wang, Jie Chen, Vijay Chandrasekhar, Ling Yu Duan, Tiejun Huang, Alex Chichung Kot, Wen Gao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

37 Citations (Scopus)

Abstract

With emerging demand for large-scale video analysis, the Moving Picture Experts Group (MPEG) initiated the Compact Descriptors for Video Analysis (CDVA) standardization in 2014. In this work, we develop novel deep-learning features and incorporate them into the well-established CDVA evaluation framework to study their effectiveness in video analysis. In particular, we propose a Nested Invariance Pooling (NIP) method to obtain compact and robust Convolutional Neural Network (CNN) descriptors. The CNN descriptors are generated by applying three different pooling operations to the CNN feature maps in a nested way, yielding a rotation- and scale-invariant feature representation. In addition, the rationale, advantages, and performance of combining CNN and handcrafted descriptors are presented to better investigate the complementary effects of deep-learned and handcrafted features. Extensive experimental results show that the proposed CNN descriptors outperform both state-of-the-art CNN descriptors and the canonical handcrafted descriptors adopted in the CDVA Experimental Model (CXM), with significant mAP gains of 11.3% and 4.7%, respectively. Moreover, the combination of NIP-derived deep invariant descriptors and handcrafted descriptors not only fulfills the lowest bitrate budget of CDVA, but also significantly advances the performance of CDVA core techniques.
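The nested pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only states that three different pooling operations are applied in a nested way. The function name `nested_invariance_pool`, the input layout `(rotations, scales, channels, height, width)`, and the assignment of square-root pooling to spatial positions, average pooling to scales, and max pooling to rotations are all assumptions made for illustration.

```python
import numpy as np


def nested_invariance_pool(feats: np.ndarray) -> np.ndarray:
    """Pool CNN feature maps into one compact descriptor (illustrative only).

    feats: array of shape (R, S, C, H, W) holding feature maps computed for
    R rotated and S rescaled versions of the same input (hypothetical layout).
    Returns an L2-normalized vector of length C.
    """
    if feats.ndim != 5:
        raise ValueError("expected an array of shape (R, S, C, H, W)")

    # Level 1 (assumed): square-root pooling over spatial positions.
    spatial = np.sqrt(np.mean(feats ** 2, axis=(3, 4)))  # shape (R, S, C)

    # Level 2 (assumed): average pooling over the scale axis.
    over_scales = spatial.mean(axis=1)  # shape (R, C)

    # Level 3 (assumed): max pooling over the rotation axis.
    descriptor = over_scales.max(axis=0)  # shape (C,)

    # L2-normalize so descriptors can be compared by dot product.
    norm = np.linalg.norm(descriptor)
    return descriptor / norm if norm > 0 else descriptor


# Toy input: 4 rotations, 3 scales, 8 channels, 5x5 feature maps.
rng = np.random.default_rng(0)
toy_feats = rng.random((4, 3, 8, 5, 5))
toy_descriptor = nested_invariance_pool(toy_feats)
```

Because the outermost step takes a maximum over the rotation axis, the result in this sketch does not depend on the order in which the rotated versions are supplied, which is the sense in which pooling over a set of transformations gives invariance to them.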

Original language: English
Title of host publication: Proceedings - DCC 2017, 2017 Data Compression Conference
Editors: Ali Bilgin, Joan Serra-Sagrista, Michael W. Marcellin, James A. Storer
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 420-429
Number of pages: 10
ISBN (Electronic): 9781509067213
Publication status: Published - 8 May 2017
Externally published: Yes
Event: 2017 Data Compression Conference, DCC 2017 - Snowbird, United States
Duration: 4 Apr 2017 to 7 Apr 2017

Publication series

Name: Data Compression Conference Proceedings
Volume: Part F127767
ISSN (Print): 1068-0314

Conference

Conference: 2017 Data Compression Conference, DCC 2017
Country/Territory: United States
City: Snowbird
Period: 4/04/17 to 7/04/17

Keywords

  • Deep Neural Network
  • Invariant Descriptor
  • Nested Pooling
  • Video Retrieval
