CapsNet based on Encoder and Decoder for Object Detection

Man Luo, Xin Wang, Hongbin Ma

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The recently proposed capsule network (CapsNet) can learn the hierarchical relationships among entity features and achieve equivariance to affine transformations, which makes the capsule architecture promising for object detection. In this paper, we build on the capsule architecture to create the CapsNet-V1 models for object detection. The proposed CapsNet-V1 consists mainly of a classification net, used as an encoder to extract multi-class information, and a reconstruction net, used as a decoder to obtain masks carrying multi-object position information. In the experiments, using a randomly expanded MNIST dataset, we simultaneously evaluate the multi-object classification and reconstruction abilities of the proposed CapsNet. The results indicate that our capsule models can reconstruct object masks with accurate location information at the correct labels, which demonstrates the feasibility of using capsule networks for object detection. Further, our CapsNet can be widely applied to multi-object detection against simple backgrounds on industrial production lines.

Original language: English
Title of host publication: 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1112-1117
Number of pages: 6
ISBN (Electronic): 9781728164151
Publication status: Published - 13 Oct 2020
Event: 17th IEEE International Conference on Mechatronics and Automation, ICMA 2020 - Beijing, China
Duration: 13 Oct 2020 - 16 Oct 2020

Publication series

Name: 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020

Conference

Conference: 17th IEEE International Conference on Mechatronics and Automation, ICMA 2020
Country/Territory: China
City: Beijing
Period: 13/10/20 - 16/10/20

Keywords

  • capsule networks
  • classification encoder
  • dynamic routing algorithm
  • expanded MNIST dataset
  • reconstruction decoder
