Towards low bit rate mobile visual search with multiple-channel coding

Rongrong Ji; Ling Yu Duan; Jie Chen; Hongxun Yao; Yong Rui; Shih Fu Chang; Wen Gao

doi:10.1145/2072298.2072372

Rongrong Ji*, Ling Yu Duan, Jie Chen, Hongxun Yao, Yong Rui, Shih Fu Chang, Wen Gao

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

46 Citations (Scopus)

Abstract

In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Unlike previous visual search scenarios that send the query image, we make use of the ever-growing mobile computational capability to extract compact visual descriptors directly at the mobile end. Meanwhile, going beyond state-of-the-art compact descriptor extraction, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics at the reference database, to learn multiple coding channels. Therefore, we describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt with a novel robust PCA scheme, designed specifically to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons with the state of the art demonstrate significant advantages in descriptor compactness (orders of magnitude improvement) and retrieval mAP in mobile landmark, product, and CD/book cover search.

Original language	English
Title of host publication	MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops
Pages	573-582
Number of pages	10
DOIs	https://doi.org/10.1145/2072298.2072372
Publication status	Published - 2011
Externally published	Yes
Event	19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11 - Scottsdale, AZ, United States
Duration	28 Nov 2011 → 1 Dec 2011

Publication series

Name	MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops

Conference

Conference	19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11
Country/Territory	United States
City	Scottsdale, AZ
Period	28/11/11 → 1/12/11

Keywords

Compact descriptor
Contextual learning
Data compression
Mobile visual search
Wireless communication

Cite this

Ji, R., Duan, L. Y., Chen, J., Yao, H., Rui, Y., Chang, S. F., & Gao, W. (2011). Towards low bit rate mobile visual search with multiple-channel coding. In MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops (pp. 573-582). (MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops). https://doi.org/10.1145/2072298.2072372

@inproceedings{6595f57582f24fbaba9605f8ccba98c3,

title = "Towards low bit rate mobile visual search with multiple-channel coding",

abstract = "In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Different from previous visual search scenarios that send the query image, we make use of the ever growing mobile computational capability to directly extract compact visual descriptors at the mobile end. Meanwhile, stepping forward from the state-of-the-art compact descriptor extractions, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics at the reference database, to learn multiple coding channels. Therefore, we describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt based on a novel robust PCA scheme, with specific consideration to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons to the state-of-the-arts demonstrate our significant advantages in descriptor compactness (with orders of magnitudes improvement) and retrieval mAP in mobile landmark, product, and CD/book cover search.",

keywords = "Compact descriptor, Contextual learning, Data compression, Mobile visual search, Wireless communication",

author = "Rongrong Ji and Duan, {Ling Yu} and Jie Chen and Hongxun Yao and Yong Rui and Chang, {Shih Fu} and Wen Gao",

year = "2011",

doi = "10.1145/2072298.2072372",

language = "English",

isbn = "9781450306164",

series = "MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops",

pages = "573--582",

booktitle = "MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops",

note = "19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11 ; Conference date: 28-11-2011 Through 01-12-2011",

}

Ji, R, Duan, LY, Chen, J, Yao, H, Rui, Y, Chang, SF & Gao, W 2011, Towards low bit rate mobile visual search with multiple-channel coding. in MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops. MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops, pp. 573-582, 19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11, Scottsdale, AZ, United States, 28/11/11. https://doi.org/10.1145/2072298.2072372

Towards low bit rate mobile visual search with multiple-channel coding. / Ji, Rongrong; Duan, Ling Yu; Chen, Jie et al.
MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops. 2011. p. 573-582 (MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops).

TY - GEN

T1 - Towards low bit rate mobile visual search with multiple-channel coding

AU - Ji, Rongrong

AU - Duan, Ling Yu

AU - Chen, Jie

AU - Yao, Hongxun

AU - Rui, Yong

AU - Chang, Shih Fu

AU - Gao, Wen

PY - 2011

Y1 - 2011

N2 - In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Different from previous visual search scenarios that send the query image, we make use of the ever growing mobile computational capability to directly extract compact visual descriptors at the mobile end. Meanwhile, stepping forward from the state-of-the-art compact descriptor extractions, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics at the reference database, to learn multiple coding channels. Therefore, we describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt based on a novel robust PCA scheme, with specific consideration to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons to the state-of-the-arts demonstrate our significant advantages in descriptor compactness (with orders of magnitudes improvement) and retrieval mAP in mobile landmark, product, and CD/book cover search.

AB - In this paper, we propose a multiple-channel coding scheme to extract compact visual descriptors for low bit rate mobile visual search. Different from previous visual search scenarios that send the query image, we make use of the ever growing mobile computational capability to directly extract compact visual descriptors at the mobile end. Meanwhile, stepping forward from the state-of-the-art compact descriptor extractions, we exploit the rich contextual cues at the mobile end (such as GPS tags for mobile visual search and 2D barcodes or RFID tags for mobile product search), together with the visual statistics at the reference database, to learn multiple coding channels. Therefore, we describe the query with one of many forms of high-dimensional visual signature, which is subsequently mapped to one or more channels and compressed. The compression function within each channel is learnt based on a novel robust PCA scheme, with specific consideration to preserve the retrieval ranking capability of the original signature. We have deployed our scheme on both iPhone4 and HTC DESIRE 7 to search ten million landmark images in a low bit rate setting. Quantitative comparisons to the state-of-the-arts demonstrate our significant advantages in descriptor compactness (with orders of magnitudes improvement) and retrieval mAP in mobile landmark, product, and CD/book cover search.

KW - Compact descriptor

KW - Contextual learning

KW - Data compression

KW - Mobile visual search

KW - Wireless communication

UR - http://www.scopus.com/inward/record.url?scp=84455212226&partnerID=8YFLogxK

U2 - 10.1145/2072298.2072372

DO - 10.1145/2072298.2072372

M3 - Conference contribution

AN - SCOPUS:84455212226

SN - 9781450306164

T3 - MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops

SP - 573

EP - 582

BT - MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops

T2 - 19th ACM International Conference on Multimedia ACM Multimedia 2011, MM'11

Y2 - 28 November 2011 through 1 December 2011

ER -
