TY - JOUR
T1 - Image-similarity-based convolutional neural network for robot visual relocalization
AU - Wang, Li
AU - Li, Ruifeng
AU - Sun, Jingwen
AU - Seah, Hock Soon
AU - Quah, Chee Kwang
AU - Zhao, Lijun
AU - Tandianus, Budianto
N1 - Publisher Copyright: © MYU K.K.
PY - 2020/4/10
Y1 - 2020/4/10
AB - Convolutional neural network (CNN)-based methods, which train an end-to-end model to regress the six-degree-of-freedom (DoF) pose of a robot from a single red–green–blue (RGB) image, have recently been developed to overcome the poor robustness of robot visual relocalization. However, the pose precision becomes low when the test image is dissimilar to the training images. In this paper, we propose a novel method, named image-similarity-based CNN, which considers the image similarity of an input image during CNN training. The more similar the input image is to the training images, the higher the precision we can achieve. Therefore, we crop the input image into several small image blocks, and the similarity between each cropped image block and the training dataset images is measured using a feature vector from a fully connected CNN layer. Finally, the most similar image is selected to regress the pose. A genetic algorithm is used to determine the crop position. Experiments are conducted on both the open-source 7-Scenes dataset and two real indoor environments. The results show that the proposed algorithm yields better results and effectively reduces large regression errors compared with existing solutions.
KW - CNN
KW - Image similarity
KW - Visual relocalization
UR - http://www.scopus.com/inward/record.url?scp=85084051492&partnerID=8YFLogxK
U2 - 10.18494/SAM.2020.2549
DO - 10.18494/SAM.2020.2549
M3 - Article
AN - SCOPUS:85084051492
SN - 0914-4935
VL - 32
SP - 1245
EP - 1259
JO - Sensors and Materials
JF - Sensors and Materials
IS - 4
ER -