Deep Foreground-Background Weighted Cross-modal Hashing

Guanqi Zhao, Xian Ling Mao*, Rong Cheng Tu, Wenjin Ji, Heyan Huang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Citation (Scopus)

Abstract

With the rapid growth of multi-modal data, deep cross-modal hashing algorithms offer an effective solution for cross-modal retrieval tasks, thanks to their fast retrieval speed and low storage consumption. To extract structured information from raw data efficiently, existing supervised cross-modal hashing methods generally focus on extracting global features; however, these methods ignore the difference in importance between the foreground and background information in an image. To address this issue, we propose a novel Deep Foreground-Background Weighted Cross-Modal Hashing (DFBWH) method for supervised cross-modal retrieval. Specifically, the proposed method first performs object detection on the original image and selects candidate regions as foreground entities. It then uses the semantic interactions in the textual descriptions and tag information as evaluation criteria, and uses CLIP to measure how well each candidate region matches them. Finally, under the supervision of the category labels, a hash loss function is used to obtain high-quality hash codes. Extensive experiments on two benchmark datasets demonstrate that DFBWH achieves better performance than state-of-the-art baselines.
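The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the softmax weighting, the blending factor `alpha`, and the sign-based binarisation are all assumptions made for illustration, and the region features and match scores are taken as given inputs (in the paper they would come from an object detector and CLIP).

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    scores = np.asarray(scores, dtype=float)
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()


def weighted_image_feature(region_feats, match_scores, global_feat, alpha=0.5):
    """Blend foreground region features with a global image feature.

    Assumption: each candidate region is weighted by the softmax of its
    text-matching score; `alpha` (hypothetical) balances foreground vs. global.
    """
    weights = softmax(match_scores)                      # shape (n_regions,)
    foreground = weights @ np.asarray(region_feats, dtype=float)  # shape (dim,)
    return alpha * foreground + (1.0 - alpha) * np.asarray(global_feat, dtype=float)


def to_hash_code(feature, projection):
    """Project a real-valued feature and binarise it into a {-1, +1} code."""
    return np.where(projection @ feature >= 0, 1, -1)


def hamming_distance(code_a, code_b):
    """Number of positions at which two hash codes differ."""
    return int(np.count_nonzero(np.asarray(code_a) != np.asarray(code_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    regions = rng.normal(size=(3, 8))     # 3 candidate regions, 8-dim features
    scores = np.array([2.0, 0.5, -1.0])   # hypothetical region-text match scores
    global_feat = rng.normal(size=8)
    projection = rng.normal(size=(16, 8))  # stand-in for a learned hash layer

    feature = weighted_image_feature(regions, scores, global_feat)
    code = to_hash_code(feature, projection)
    print(code, hamming_distance(code, -code))
```

In a trained system the random `projection` would be replaced by a learned hashing layer optimised with the label-supervised hash loss; retrieval then ranks database items by Hamming distance to the query code.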

Original language: English
Title of host publication: Natural Language Processing and Chinese Computing - 13th National CCF Conference, NLPCC 2024, Proceedings
Editors: Derek F. Wong, Zhongyu Wei, Muyun Yang
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 433-445
Number of pages: 13
ISBN (Print): 9789819794362
DOIs
Publication status: Published - 2025
Event: 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024 - Hangzhou, China
Duration: 1 Nov 2024 → 3 Nov 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15361 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024
Country/Territory: China
City: Hangzhou
Period: 1/11/24 → 3/11/24

Keywords

  • Cross-modal retrieval
  • Deep hashing
  • Foreground-Background Weighted
