Hashing based Efficient Inference for Image-Text Matching

Rong Cheng Tu, Lei Ji*, Huaishao Luo, Botian Shi, Heyan Huang, Nan Duan, Xian Ling Mao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

8 Citations (Scopus)

Abstract

Image-text matching has been a popular research topic which bridges vision and language through semantic understanding. Recent works mainly focus on exploring the interactions between images and sentences to improve the performance without considering inference efficiency. Specifically, for the large scale databases, it is unacceptable to perform such time-consuming mechanisms between a query (text/image) and each candidate datapoint (image/text) in the whole retrieval set during inference. To tackle this problem, we propose a novel hashing based efficient inference module called HEI, which can be plugged into the existing framework to speed up the inference step without reducing the retrieval performance. In details, HEI learns to map the original datapoints into short binary hash codes and coarsely preserve the heterologous matching relationship. Thus, in the inference phase, the proposed HEI module uses the hash codes to quickly select a few candidate datapoints from the retrieval set for a given query. Then, the image-text matching model fine ranks the candidate set to find the matching datapoint. Extensive experiments on two widely used benchmark MS-COCO and Flickr30k with four baseline methods demonstrate the efficiency and effectiveness of our proposed HEI module.

Original languageEnglish
Title of host publicationFindings of the Association for Computational Linguistics
Subtitle of host publicationACL-IJCNLP 2021
EditorsChengqing Zong, Fei Xia, Wenjie Li, Roberto Navigli
PublisherAssociation for Computational Linguistics (ACL)
Pages743-752
Number of pages10
ISBN (Electronic)9781954085541
Publication statusPublished - 2021
EventFindings of the Association for Computational Linguistics: ACL-IJCNLP 2021 - Virtual, Online
Duration: 1 Aug 20216 Aug 2021

Publication series

NameFindings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Conference

ConferenceFindings of the Association for Computational Linguistics: ACL-IJCNLP 2021
CityVirtual, Online
Period1/08/216/08/21

Fingerprint

Dive into the research topics of 'Hashing based Efficient Inference for Image-Text Matching'. Together they form a unique fingerprint.

Cite this