CapsNet based on Encoder and Decoder for Object Detection

Man Luo; Xin Wang; Hongbin Ma

doi:10.1109/ICMA49215.2020.9233658

School of Automation

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The recently proposed capsule network (CapsNet) can learn hierarchical relationships among entity features and is equivariant to affine transformations, which makes the capsule architecture promising for object detection. In this paper, we build CapsNet-V1 models for object detection on the capsule architecture. The proposed CapsNet-V1 consists mainly of a classification net, used as an encoder to extract multi-class information, and a reconstruction net, used as a decoder to obtain masks carrying multi-object position information. In experiments on a randomly expanded MNIST dataset, we evaluate the multi-object classification and reconstruction abilities of the proposed CapsNet simultaneously. The results indicate that our capsule models can reconstruct object masks with accurate location information under the correct labels, which demonstrates the feasibility of using capsule networks for object detection. Further, our CapsNet can be applied to multi-object detection against simple backgrounds on industrial production lines.
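
The abstract does not say how the "randomly expanded MNIST dataset" is built. The sketch below shows one plausible reading, offered only as an illustration: small digit images are pasted at random offsets onto a larger blank canvas, and a per-label binary mask records where each digit landed, which is the kind of target a reconstruction decoder could be trained against. The function name `place_digits`, the canvas size, and the mask layout are all assumptions, not details from the paper.

```python
import random


def place_digits(digits, canvas_size=56, seed=None):
    """Paste small images at random offsets on a blank square canvas.

    digits: list of (label, image) pairs; image is a list of rows of
    pixel intensities in [0, 1]. Hypothetical helper, not the authors' code.
    Returns (canvas, masks), where masks maps each label to a binary mask
    marking where that label's nonzero pixels were placed.
    """
    rng = random.Random(seed)
    canvas = [[0.0] * canvas_size for _ in range(canvas_size)]
    masks = {}
    for label, image in digits:
        height, width = len(image), len(image[0])
        # randint is inclusive at both ends, so the image always fits.
        top = rng.randint(0, canvas_size - height)
        left = rng.randint(0, canvas_size - width)
        mask = masks.setdefault(
            label, [[0] * canvas_size for _ in range(canvas_size)]
        )
        for r in range(height):
            for c in range(width):
                value = image[r][c]
                if value > 0:
                    row, col = top + r, left + c
                    # Overlapping digits keep the brighter pixel.
                    canvas[row][col] = max(canvas[row][col], value)
                    mask[row][col] = 1
    return canvas, masks


# Two synthetic 4x4 "digits" stand in for real MNIST images.
demo_digits = [
    (3, [[1.0] * 4 for _ in range(4)]),
    (7, [[0.5] * 4 for _ in range(4)]),
]
demo_canvas, demo_masks = place_digits(demo_digits, canvas_size=16, seed=0)
```

With real data, each 28x28 MNIST image would take the place of the synthetic squares, and the canvas together with its masks would form one multi-object training sample.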

Original language	English
Title of host publication	2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	1112-1117
Number of pages	6
ISBN (Electronic)	9781728164151
DOIs	https://doi.org/10.1109/ICMA49215.2020.9233658
Publication status	Published - 13 Oct 2020
Event	17th IEEE International Conference on Mechatronics and Automation, ICMA 2020, Beijing, China (13 Oct 2020 → 16 Oct 2020)

Publication series

Name	2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020

Conference

Conference	17th IEEE International Conference on Mechatronics and Automation, ICMA 2020
Country/Territory	China
City	Beijing
Period	13 Oct 2020 → 16 Oct 2020

Keywords

capsule networks
classification encoder
dynamic routing algorithm
expanded MNIST dataset
reconstruction decoder

Cite this

Luo, M., Wang, X., & Ma, H. (2020). CapsNet based on Encoder and Decoder for Object Detection. In 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020 (pp. 1112-1117). Article 9233658 (2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICMA49215.2020.9233658

@inproceedings{74627d700c534cbfa73eb7eb106de913,
  title = "CapsNet based on Encoder and Decoder for Object Detection",
  abstract = "The recently proposed capsule network (CapsNet) can learn the hierarchy relationships of entity features and realize the equivariance to affine transformations, which makes the capsule architecture more promising for object detection. In this paper, based on capsule architecture, we create the CapsNet-V1 models for object detection. The proposed CapsNetV1 mainly consists of the classification net as encoder to extract multi-class information and the reconstruction net as decoder to obtain masks with multi-object position information. In the experiments, based on the randomly expanded MNIST dataset, we simultaneously evaluate the multi-object classification and reconstruction abilities of the proposed CapsNet. The results indicate that our capsule models can reconstruct the object masks with accurate location information at correct labels, which exactly demonstrates the feasibility of using capsule networks for object detection. Further, our CapsNet can be widely applied to the multi-object detection with simple backgrounds in the industrial production lines.",
  keywords = "capsule networks, classification encoder, dynamic routing algorithm, expanded MNIST dataset, reconstruction decoder",
  author = "Man Luo and Xin Wang and Hongbin Ma",
  note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 17th IEEE International Conference on Mechatronics and Automation, ICMA 2020 ; Conference date: 13-10-2020 Through 16-10-2020",
  year = "2020",
  month = oct,
  day = "13",
  doi = "10.1109/ICMA49215.2020.9233658",
  language = "English",
  series = "2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020",
  publisher = "Institute of Electrical and Electronics Engineers Inc.",
  pages = "1112--1117",
  booktitle = "2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020",
  address = "United States",
}

Luo, M, Wang, X & Ma, H 2020, CapsNet based on Encoder and Decoder for Object Detection. in 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020., 9233658, 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020, Institute of Electrical and Electronics Engineers Inc., pp. 1112-1117, 17th IEEE International Conference on Mechatronics and Automation, ICMA 2020, Beijing, China, 13/10/20. https://doi.org/10.1109/ICMA49215.2020.9233658

CapsNet based on Encoder and Decoder for Object Detection. / Luo, Man; Wang, Xin; Ma, Hongbin.
2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020. Institute of Electrical and Electronics Engineers Inc., 2020. p. 1112-1117 9233658 (2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020).


TY - GEN
T1 - CapsNet based on Encoder and Decoder for Object Detection
AU - Luo, Man
AU - Wang, Xin
AU - Ma, Hongbin
PY - 2020/10/13
Y1 - 2020/10/13

N2 - The recently proposed capsule network (CapsNet) can learn the hierarchy relationships of entity features and realize the equivariance to affine transformations, which makes the capsule architecture more promising for object detection. In this paper, based on capsule architecture, we create the CapsNet-V1 models for object detection. The proposed CapsNetV1 mainly consists of the classification net as encoder to extract multi-class information and the reconstruction net as decoder to obtain masks with multi-object position information. In the experiments, based on the randomly expanded MNIST dataset, we simultaneously evaluate the multi-object classification and reconstruction abilities of the proposed CapsNet. The results indicate that our capsule models can reconstruct the object masks with accurate location information at correct labels, which exactly demonstrates the feasibility of using capsule networks for object detection. Further, our CapsNet can be widely applied to the multi-object detection with simple backgrounds in the industrial production lines.

AB - The recently proposed capsule network (CapsNet) can learn the hierarchy relationships of entity features and realize the equivariance to affine transformations, which makes the capsule architecture more promising for object detection. In this paper, based on capsule architecture, we create the CapsNet-V1 models for object detection. The proposed CapsNetV1 mainly consists of the classification net as encoder to extract multi-class information and the reconstruction net as decoder to obtain masks with multi-object position information. In the experiments, based on the randomly expanded MNIST dataset, we simultaneously evaluate the multi-object classification and reconstruction abilities of the proposed CapsNet. The results indicate that our capsule models can reconstruct the object masks with accurate location information at correct labels, which exactly demonstrates the feasibility of using capsule networks for object detection. Further, our CapsNet can be widely applied to the multi-object detection with simple backgrounds in the industrial production lines.

KW - capsule networks
KW - classification encoder
KW - dynamic routing algorithm
KW - expanded MNIST dataset
KW - reconstruction decoder
UR - http://www.scopus.com/inward/record.url?scp=85096584767&partnerID=8YFLogxK
U2 - 10.1109/ICMA49215.2020.9233658
DO - 10.1109/ICMA49215.2020.9233658
M3 - Conference contribution
AN - SCOPUS:85096584767
T3 - 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020
SP - 1112
EP - 1117
BT - 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on Mechatronics and Automation, ICMA 2020
Y2 - 13 October 2020 through 16 October 2020
ER -
