TY - JOUR
T1 - Source-Free Image-Text Matching via Uncertainty-Aware Learning
AU - Tian, Mengxiao
AU - Yang, Shuo
AU - Wu, Xinxiao
AU - Jia, Yunde
N1 - Publisher Copyright:
© 1994-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - When applying a trained image-text matching model to a new scenario, the performance may largely degrade due to domain shift, which makes it impractical in real-world applications. In this paper, we make the first attempt at adapting an image-text matching model well-trained on a labeled source domain to an unlabeled target domain in the absence of source data, namely, source-free image-text matching. This task is challenging since there is no direct access to the source data when learning to reduce the domain shift. To address this challenge, we propose a simple yet effective method that introduces uncertainty-aware learning to generate high-quality pseudo-pairs of image and text for target adaptation. Specifically, starting with using the pre-trained source model to retrieve several top-ranked image-text pairs from the target domain as pseudo-pairs, we then model the uncertainty of each pseudo-pair by calculating the variance of retrieved texts (resp. images) given the paired image (resp. text) as query, and finally incorporate the uncertainty into an objective function to down-weight noisy pseudo-pairs for better training, thereby enhancing adaptation. This uncertainty-aware training approach can be generally applied to all existing models. Extensive experiments on the COCO and Flickr30K datasets demonstrate the effectiveness of the proposed method.
AB - When applying a trained image-text matching model to a new scenario, the performance may largely degrade due to domain shift, which makes it impractical in real-world applications. In this paper, we make the first attempt at adapting an image-text matching model well-trained on a labeled source domain to an unlabeled target domain in the absence of source data, namely, source-free image-text matching. This task is challenging since there is no direct access to the source data when learning to reduce the domain shift. To address this challenge, we propose a simple yet effective method that introduces uncertainty-aware learning to generate high-quality pseudo-pairs of image and text for target adaptation. Specifically, starting with using the pre-trained source model to retrieve several top-ranked image-text pairs from the target domain as pseudo-pairs, we then model the uncertainty of each pseudo-pair by calculating the variance of retrieved texts (resp. images) given the paired image (resp. text) as query, and finally incorporate the uncertainty into an objective function to down-weight noisy pseudo-pairs for better training, thereby enhancing adaptation. This uncertainty-aware training approach can be generally applied to all existing models. Extensive experiments on the COCO and Flickr30K datasets demonstrate the effectiveness of the proposed method.
KW - Image-text matching
KW - source-free adaptation
KW - uncertainty-aware learning
UR - http://www.scopus.com/inward/record.url?scp=85208550404&partnerID=8YFLogxK
U2 - 10.1109/LSP.2024.3488521
DO - 10.1109/LSP.2024.3488521
M3 - Article
AN - SCOPUS:85208550404
SN - 1070-9908
VL - 31
SP - 3059
EP - 3063
JO - IEEE Signal Processing Letters
JF - IEEE Signal Processing Letters
ER -