Viewpoint estimation for objects with convolutional neural network trained on synthetic images

Yumeng Wang, Shuyang Li, Mengyao Jia, Wei Liang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)

Abstract

In this paper, we propose a method to estimate object viewpoint from a single RGB image and address two problems in estimation: generating training data with viewpoint annotations and extracting powerful features for the estimation. We first collect 1780 high-quality 3D CAD object models from 3 categories. We then generate a synthetic RGB image dataset with viewpoint annotations, in which each image is produced by placing one model in a realistic panorama scene and rendering the model with random camera parameters. We train a CNN model on our synthetic dataset to predict the object viewpoint. The proposed method is evaluated on the PASCAL 3D+ dataset and on our synthetic dataset. The experimental results show good performance.
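
The viewpoint-annotation step described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the parameter names (`azimuth_deg`, `elevation_deg`, `tilt_deg`, `distance`), their sampling ranges, and the spherical camera placement are all assumptions, since the abstract states only that each model is rendered with random camera parameters.

```python
import math
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Viewpoint:
    azimuth_deg: float    # rotation around the vertical axis
    elevation_deg: float  # angle above the horizontal plane
    tilt_deg: float       # in-plane camera rotation
    distance: float       # camera-to-object distance (arbitrary units)


def sample_viewpoint(rng: random.Random) -> Viewpoint:
    # All ranges are illustrative assumptions, not values from the paper.
    return Viewpoint(
        azimuth_deg=rng.uniform(0.0, 360.0),
        elevation_deg=rng.uniform(-10.0, 60.0),
        tilt_deg=rng.uniform(-15.0, 15.0),
        distance=rng.uniform(2.0, 6.0),
    )


def camera_position(vp: Viewpoint) -> Tuple[float, float, float]:
    # Camera location on a sphere centred on the object, with z pointing up.
    az = math.radians(vp.azimuth_deg)
    el = math.radians(vp.elevation_deg)
    x = vp.distance * math.cos(el) * math.cos(az)
    y = vp.distance * math.cos(el) * math.sin(az)
    z = vp.distance * math.sin(el)
    return (x, y, z)
```

In a pipeline of this kind, the sampled `Viewpoint` would be stored alongside each rendered image as its annotation, so the label comes for free from the rendering step rather than from manual annotation.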

Original language: English
Title of host publication: Advances in Multimedia Information Processing – 17th Pacific-Rim Conference on Multimedia, PCM 2016, Proceedings
Editors: Enqing Chen, Yun Tie, Yihong Gong
Publisher: Springer Verlag
Pages: 169-179
Number of pages: 11
ISBN (Print): 9783319488950
Publication status: Published - 2016
Event: 17th Pacific-Rim Conference on Multimedia, PCM 2016 - Xi’an, China
Duration: 15 Sept 2016 - 16 Sept 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9917 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th Pacific-Rim Conference on Multimedia, PCM 2016
Country/Territory: China
City: Xi’an
Period: 15/09/16 - 16/09/16

Keywords

  • Convolutional neural network
  • Panorama scene rendering
  • Synthetic image
  • Viewpoint estimation
