TY - JOUR
T1 - Boosting VLAD with weighted fusion of local descriptors for image retrieval
AU - Liu, Hao
AU - Zhao, Qingjie
AU - Zhang, Cong
AU - Mbelwa, Jimmy T.
AU - Tang, Song
AU - Zhang, Jianwei
N1 - Publisher Copyright: © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2019/5/1
Y1 - 2019/5/1
N2 - In the last decade, many efforts have been devoted to developing discriminative image representations. Among these works, the vector of locally aggregated descriptors (VLAD) has been demonstrated to be an effective one. However, most VLAD-based methods generally employ detected SIFT descriptors and therefore capture limited content information, which degrades their representation ability. In this work, we propose a novel framework to boost VLAD with weighted fusion of local descriptors (WF-VLAD), which encodes more discriminative clues and maintains higher performance. To obtain an image representation that contains sufficient detail, our approach fuses SIFT descriptors sampled densely (dense SIFT) with those detected at interest points (detected SIFT) during aggregation. Furthermore, we assign each detected SIFT descriptor a weight measured by saliency analysis, so that salient descriptors receive relatively high importance. The proposed method can include sufficient image content information and highlight the important image regions. Finally, experiments on publicly available datasets demonstrate that our approach achieves competitive performance in retrieval tasks.
AB - In the last decade, many efforts have been devoted to developing discriminative image representations. Among these works, the vector of locally aggregated descriptors (VLAD) has been demonstrated to be an effective one. However, most VLAD-based methods generally employ detected SIFT descriptors and therefore capture limited content information, which degrades their representation ability. In this work, we propose a novel framework to boost VLAD with weighted fusion of local descriptors (WF-VLAD), which encodes more discriminative clues and maintains higher performance. To obtain an image representation that contains sufficient detail, our approach fuses SIFT descriptors sampled densely (dense SIFT) with those detected at interest points (detected SIFT) during aggregation. Furthermore, we assign each detected SIFT descriptor a weight measured by saliency analysis, so that salient descriptors receive relatively high importance. The proposed method can include sufficient image content information and highlight the important image regions. Finally, experiments on publicly available datasets demonstrate that our approach achieves competitive performance in retrieval tasks.
KW - Image representation
KW - Image retrieval
KW - Saliency weighting
KW - VLAD
UR - http://www.scopus.com/inward/record.url?scp=85065701671&partnerID=8YFLogxK
U2 - 10.1007/s11042-018-6712-z
DO - 10.1007/s11042-018-6712-z
M3 - Article
AN - SCOPUS:85065701671
SN - 1380-7501
VL - 78
SP - 11835
EP - 11855
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 9
ER -