Learning compact visual descriptors for low bit rate mobile landmark search

Ling Yu Duan; Jie Chen; Rongrong Ji; Tiejun Huang; Wen Gao

doi:10.1609/aimag.v34i2.2469

Research output: Contribution to journal › Article › peer-review

5 Citations (Scopus)

Abstract

Along with the ever-growing computational power of mobile devices, mobile visual search has undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly on a mobile device and delivered as queries, rather than raw images, to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. The compactness originates from the bag-of-words image representation, with offline learning from geotagged photos from online photo-sharing websites including Flickr and Panoramio. The learning process involves segmenting the landmark photo collection into discrete geographical regions using a Gaussian mixture model and then boosting a ranking-sensitive vocabulary within each region, with "entropy"-based feedback on the compactness of the descriptor to refine both phases iteratively. In online search, when entering a geographical region, the code book in a mobile device is downstream adapted to generate extremely compact descriptors with promising discriminative ability. We have deployed landmark search apps to both HTC and iPhone mobile phones, accessing a million-scale image database covering typical areas such as Beijing, New York, and Barcelona. Our descriptor outperforms alternative compact descriptors (Chen et al. 2009; Chen et al. 2010; Chandrasekhar et al. 2009a; Chandrasekhar et al. 2009b) by significant margins. Beyond landmark search, this article summarizes the MPEG standardization progress of compact descriptors for visual search (CDVS) (Yuri et al. 2010; Yuri et al. 2011) toward application interoperability.
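
The online step described in the abstract (pick a geographical region from the location context, then describe the image with that region's vocabulary) can be illustrated with a minimal sketch. This is not the authors' implementation: the region parameters, the per-region word lists, and the function names `assign_region`, `compact_histogram`, and `entropy_bits` are all hypothetical, the Gaussian components are assumed isotropic, and the entropy function is only a rough proxy for descriptor compactness.

```python
import math
from collections import Counter

# Hypothetical region model (assumption): one isotropic Gaussian component
# per region, stored as (weight, mean_lat, mean_lon, variance).
REGIONS = {
    "region_a": (0.5, 39.90, 116.40, 0.01),
    "region_b": (0.5, 40.00, 116.30, 0.01),
}

# Hypothetical per-region vocabularies (assumption): the visual-word IDs
# that were kept for each region after offline learning.
REGION_WORDS = {
    "region_a": [3, 7, 11, 42],
    "region_b": [1, 7, 19],
}


def log_density(lat, lon, component):
    """Log of (weight * isotropic 2-D Gaussian density) at (lat, lon)."""
    weight, mean_lat, mean_lon, var = component
    sq_dist = (lat - mean_lat) ** 2 + (lon - mean_lon) ** 2
    return math.log(weight) - math.log(2 * math.pi * var) - sq_dist / (2 * var)


def assign_region(lat, lon):
    """Return the region whose mixture component best explains the location."""
    return max(REGIONS, key=lambda name: log_density(lat, lon, REGIONS[name]))


def compact_histogram(word_ids, region):
    """Count only the visual words kept for this region; drop all others."""
    counts = Counter(word_ids)
    return [counts[w] for w in REGION_WORDS[region]]


def entropy_bits(histogram):
    """Shannon entropy (bits) of the normalized histogram."""
    total = sum(histogram)
    if total == 0:
        return 0.0
    probs = [c / total for c in histogram if c > 0]
    return -sum(p * math.log2(p) for p in probs)
```

In this sketch a query image quantized to word IDs `[3, 3, 7, 99, 42]` near `(39.91, 116.41)` would be assigned to `region_a` and reduced to a four-bin histogram, since word `99` is not in that region's list. Only the short histogram, rather than the image, would then be sent as the query.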

Original language	English
Pages (from-to)	67-85
Number of pages	19
Journal	AI Magazine
Volume	34
Issue number	2
DOIs	https://doi.org/10.1609/aimag.v34i2.2469
Publication status	Published - 2013
Externally published	Yes

Cite this

@article{f4052d7e57d4465c88ebb91cfb426eaf,

title = "Learning compact visual descriptors for low bit rate mobile landmark search",

abstract = "Along with the ever-growing computational power of mobile devices, mobile visual search has undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly on a mobile device and delivered as queries, rather than raw images, to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. The compactness originates from the bag-of-words image representation, with offline learning from geotagged photos from online photo-sharing websites including Flickr and Panoramio. The learning process involves segmenting the landmark photo collection into discrete geographical regions using a Gaussian mixture model and then boosting a ranking-sensitive vocabulary within each region, with {"}entropy{"}-based feedback on the compactness of the descriptor to refine both phases iteratively. In online search, when entering a geographical region, the code book in a mobile device is downstream adapted to generate extremely compact descriptors with promising discriminative ability. We have deployed landmark search apps to both HTC and iPhone mobile phones, accessing a million-scale image database covering typical areas such as Beijing, New York, and Barcelona. Our descriptor outperforms alternative compact descriptors (Chen et al. 2009; Chen et al. 2010; Chandrasekhar et al. 2009a; Chandrasekhar et al. 2009b) by significant margins. Beyond landmark search, this article summarizes the MPEG standardization progress of compact descriptors for visual search (CDVS) (Yuri et al. 2010; Yuri et al. 2011) toward application interoperability.",

author = "Duan, {Ling Yu} and Jie Chen and Rongrong Ji and Tiejun Huang and Wen Gao",

year = "2013",

doi = "10.1609/aimag.v34i2.2469",

language = "English",

volume = "34",

pages = "67--85",

journal = "AI Magazine",

issn = "0738-4602",

publisher = "John Wiley and Sons Inc.",

number = "2",

}

TY - JOUR

T1 - Learning compact visual descriptors for low bit rate mobile landmark search

AU - Duan, Ling Yu

AU - Chen, Jie

AU - Ji, Rongrong

AU - Huang, Tiejun

AU - Gao, Wen

PY - 2013

Y1 - 2013

N2 - Along with the ever-growing computational power of mobile devices, mobile visual search has undergone an evolution in techniques and applications. A significant trend is low bit rate visual search, where compact visual descriptors are extracted directly on a mobile device and delivered as queries, rather than raw images, to reduce the query transmission latency. In this article, we introduce our work on low bit rate mobile landmark search, in which a compact yet discriminative landmark image descriptor is extracted by using location context such as GPS, crowd-sourced hotspot WLAN, and cell tower locations. The compactness originates from the bag-of-words image representation, with offline learning from geotagged photos from online photo-sharing websites including Flickr and Panoramio. The learning process involves segmenting the landmark photo collection into discrete geographical regions using a Gaussian mixture model and then boosting a ranking-sensitive vocabulary within each region, with "entropy"-based feedback on the compactness of the descriptor to refine both phases iteratively. In online search, when entering a geographical region, the code book in a mobile device is downstream adapted to generate extremely compact descriptors with promising discriminative ability. We have deployed landmark search apps to both HTC and iPhone mobile phones, accessing a million-scale image database covering typical areas such as Beijing, New York, and Barcelona. Our descriptor outperforms alternative compact descriptors (Chen et al. 2009; Chen et al. 2010; Chandrasekhar et al. 2009a; Chandrasekhar et al. 2009b) by significant margins. Beyond landmark search, this article summarizes the MPEG standardization progress of compact descriptors for visual search (CDVS) (Yuri et al. 2010; Yuri et al. 2011) toward application interoperability.

UR - http://www.scopus.com/inward/record.url?scp=84883096582&partnerID=8YFLogxK

U2 - 10.1609/aimag.v34i2.2469

DO - 10.1609/aimag.v34i2.2469

M3 - Article

AN - SCOPUS:84883096582

SN - 0738-4602

VL - 34

SP - 67

EP - 85

JO - AI Magazine

JF - AI Magazine

IS - 2

ER -
