TY - GEN
T1 - Cross-domain image retrieval with attention modeling
AU - Ji, Xin
AU - Wang, Wei
AU - Zhang, Meihui
AU - Yang, Yang
N1 - Publisher Copyright:
© 2017 Association for Computing Machinery.
PY - 2017/10/23
Y1 - 2017/10/23
N2 - With the proliferation of e-commerce websites and the ubiquity of smartphones, cross-domain image retrieval, which uses images taken by smartphones as queries to search for products on e-commerce websites, is emerging as a popular application. One challenge of this task is locating the attention of both the query and database images. In particular, database images, e.g., of fashion products, on e-commerce websites are typically displayed with other accessories, and the images taken by users contain noisy backgrounds and large variations in orientation and lighting. Consequently, their attention is difficult to locate. In this paper, we exploit the rich tag information available on e-commerce websites to locate the attention of database images. For query images, we use each candidate image in the database as the context to locate the query attention. Novel deep convolutional neural network architectures, namely TagYNet and CtxYNet, are proposed to learn the attention weights and then extract effective representations of the images. Experimental results on public datasets confirm that our approaches significantly improve over existing methods in terms of retrieval accuracy and efficiency.
AB - With the proliferation of e-commerce websites and the ubiquity of smartphones, cross-domain image retrieval, which uses images taken by smartphones as queries to search for products on e-commerce websites, is emerging as a popular application. One challenge of this task is locating the attention of both the query and database images. In particular, database images, e.g., of fashion products, on e-commerce websites are typically displayed with other accessories, and the images taken by users contain noisy backgrounds and large variations in orientation and lighting. Consequently, their attention is difficult to locate. In this paper, we exploit the rich tag information available on e-commerce websites to locate the attention of database images. For query images, we use each candidate image in the database as the context to locate the query attention. Novel deep convolutional neural network architectures, namely TagYNet and CtxYNet, are proposed to learn the attention weights and then extract effective representations of the images. Experimental results on public datasets confirm that our approaches significantly improve over existing methods in terms of retrieval accuracy and efficiency.
KW - Attention modeling
KW - Cross-domain image retrieval
KW - Deep learning
KW - Fashion product
UR - https://www.scopus.com/pages/publications/85035242030
U2 - 10.1145/3123266.3123429
DO - 10.1145/3123266.3123429
M3 - Conference contribution
AN - SCOPUS:85035242030
T3 - MM 2017 - Proceedings of the 2017 ACM Multimedia Conference
SP - 1654
EP - 1662
BT - MM 2017 - Proceedings of the 2017 ACM Multimedia Conference
PB - Association for Computing Machinery, Inc
T2 - 25th ACM International Conference on Multimedia, MM 2017
Y2 - 23 October 2017 through 27 October 2017
ER -