Abstract
In the last decade, considerable effort has been devoted to building discriminative image representations. Among these works, the vector of locally aggregated descriptors (VLAD) has proven to be an effective one. However, most VLAD-based methods rely only on SIFT descriptors detected at interest points and therefore capture limited content information, which deteriorates the representation ability. In this work, we propose a novel framework that boosts VLAD with weighted fusion of local descriptors (WF-VLAD), encoding more discriminative cues and achieving higher performance. To obtain an image representation that contains sufficient detail, our approach fuses SIFT descriptors sampled densely (dense SIFT) with those detected at interest points (detected SIFT) during aggregation. Furthermore, we assign each detected SIFT descriptor a weight measured by saliency analysis, so that salient descriptors receive relatively high importance. The proposed method thus captures sufficient image content information and highlights the important image regions. Finally, experiments on publicly available datasets demonstrate that our approach achieves competitive performance in retrieval tasks.
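The abstract does not give the exact WF-VLAD formulation, but the aggregation it describes can be sketched as weighted VLAD over the fused descriptor set. The sketch below is a minimal illustration: the function name, the codebook size, the unit weight assigned to dense SIFT, and the random placeholder inputs are all assumptions, not the authors' implementation.

```python
# Minimal sketch of saliency-weighted VLAD aggregation (NumPy).
# Assumptions: dense SIFT gets unit weight, detected SIFT gets a saliency weight,
# and a K=64 codebook is used; the paper's actual WF-VLAD details may differ.
import numpy as np

def weighted_vlad(descriptors, weights, centers):
    """Aggregate D-dim descriptors into a K*D vector of weighted residuals."""
    k, d = centers.shape
    # Assign each descriptor to its nearest codebook center.
    dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
    assignments = np.argmin(dists, axis=1)
    vlad = np.zeros((k, d))
    for i, c in enumerate(assignments):
        # Accumulate the residual, scaled by the descriptor's importance weight.
        vlad[c] += weights[i] * (descriptors[i] - centers[c])
    vlad = vlad.flatten()
    # Signed square-root (power-law) and L2 normalization, standard for VLAD.
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Fuse dense SIFT (unit weight, assumed) with saliency-weighted detected SIFT.
dense_sift = np.random.rand(500, 128)       # placeholder dense SIFT descriptors
detected_sift = np.random.rand(200, 128)    # placeholder detected SIFT descriptors
saliency_weights = np.random.rand(200)      # e.g. saliency value at each keypoint
centers = np.random.rand(64, 128)           # illustrative codebook

all_desc = np.vstack([dense_sift, detected_sift])
all_weights = np.concatenate([np.ones(len(dense_sift)), saliency_weights])
image_repr = weighted_vlad(all_desc, all_weights, centers)
```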
Original language | English |
---|---|
Pages (from-to) | 11835-11855 |
Number of pages | 21 |
Journal | Multimedia Tools and Applications |
Volume | 78 |
Issue number | 9 |
Publication status | Published - 1 May 2019 |
Keywords
- Image representation
- Image retrieval
- Saliency weighting
- VLAD