Attention-based Deep Learning for Visual Servoing

Bo Wang, Yuan Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

Traditional image-based visual servoing (IBVS) relies on manually extracted features, and estimating the feature Jacobian is complicated and difficult. This paper proposes a visual servo control approach with an end-to-end structure. Exploiting the strong performance of convolutional neural networks in classification and regression tasks, an attention mechanism is introduced and a region of interest (ROI) is extracted during feature extraction to strengthen the representation of feature information, so that the mapping between image space and pose space is learned successfully. The dataset is collected by applying perspective transformations to the image. The proposed dual-stream network simultaneously processes input images representing the current task pose and the desired task pose, and the command obtained from the network output drives the robot to complete the corresponding visual servoing task. The proposed approach achieves good results in the corresponding experimental scenes.
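The abstract does not give implementation details, but the channel-attention idea it describes (reweighting CNN features to emphasise the ROI) can be illustrated with a squeeze-and-excitation style gate applied to one stream's feature map. The sketch below is a minimal NumPy illustration; the function name, the weight shapes, and the bottleneck width `R` are assumptions for exposition, not the authors' actual architecture.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative only).

    feat: feature map of shape (C, H, W) from one CNN stream.
    w1, w2: weights of two small dense layers forming a bottleneck
            (shapes (R, C) and (C, R); R is an assumed hyperparameter).
    Returns the feature map with each channel reweighted by a score in
    (0, 1), emphasising channels that respond strongly to the ROI.
    """
    squeeze = feat.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU bottleneck -> (R,)
    scores = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate -> (C,)
    return feat * scores[:, None, None]            # reweight each channel

# Toy usage with random weights (real weights would be learned end to end).
rng = np.random.default_rng(0)
C, H, W, R = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((R, C))
w2 = rng.standard_normal((C, R))
out = channel_attention(feat, w1, w2)
```

In the paper's dual-stream setting, a gate like this would be applied inside each stream before the features of the current and desired images are compared to regress the servo command.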

Original language: English
Title of host publication: Proceedings - 2020 Chinese Automation Congress, CAC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4388-4393
Number of pages: 6
ISBN (Electronic): 9781728176871
DOIs
Publication status: Published - 6 Nov 2020
Event: 2020 Chinese Automation Congress, CAC 2020 - Shanghai, China
Duration: 6 Nov 2020 - 8 Nov 2020

Publication series

Name: Proceedings - 2020 Chinese Automation Congress, CAC 2020

Conference

Conference: 2020 Chinese Automation Congress, CAC 2020
Country/Territory: China
City: Shanghai
Period: 6/11/20 - 8/11/20

Keywords

  • CNN
  • ROI
  • attention mechanism
  • image based visual servo
  • robot
