TY - GEN
T1 - CapsNet based on Encoder and Decoder for Object Detection
AU - Luo, Man
AU - Wang, Xin
AU - Ma, Hongbin
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10/13
Y1 - 2020/10/13
N2 - The recently proposed capsule network (CapsNet) can learn the hierarchical relationships among entity features and achieve equivariance to affine transformations, which makes the capsule architecture promising for object detection. In this paper, we build CapsNet-V1 models for object detection on the capsule architecture. The proposed CapsNet-V1 consists mainly of a classification net, which serves as the encoder and extracts multi-class information, and a reconstruction net, which serves as the decoder and produces masks carrying multi-object position information. In the experiments, we use the randomly expanded MNIST dataset to evaluate the multi-object classification and reconstruction abilities of the proposed CapsNet simultaneously. The results indicate that our capsule models can reconstruct object masks with accurate location information for the correct labels, which demonstrates the feasibility of using capsule networks for object detection. Further, our CapsNet can be widely applied to multi-object detection against simple backgrounds on industrial production lines.
KW - capsule networks
KW - classification encoder
KW - dynamic routing algorithm
KW - expanded MNIST dataset
KW - reconstruction decoder
UR - http://www.scopus.com/inward/record.url?scp=85096584767&partnerID=8YFLogxK
U2 - 10.1109/ICMA49215.2020.9233658
DO - 10.1109/ICMA49215.2020.9233658
M3 - Conference contribution
AN - SCOPUS:85096584767
T3 - 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020
SP - 1112
EP - 1117
BT - 2020 IEEE International Conference on Mechatronics and Automation, ICMA 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on Mechatronics and Automation, ICMA 2020
Y2 - 13 October 2020 through 16 October 2020
ER -