Visuomotor Policy Learning for Task Automation of Surgical Robot

Junhui Huang; Qingxin Shi; Dongsheng Xie; Yiming Ma; Xiaoming Liu; Changsheng Li; Xingguang Duan

doi:10.1109/TMRB.2024.3464090

Junhui Huang, Qingxin Shi, Dongsheng Xie, Yiming Ma, Xiaoming Liu, Changsheng Li^*, Xingguang Duan^*

^*Corresponding author for this work

School of Mechatronical Engineering

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

Abstract

With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. Recent learning-based approaches provide solutions to surgical automation but typically rely on low-dimensional observations. To further imitate the actions of surgeons in an end-to-end paradigm, this paper introduces a novel vision-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model, called ACMT, that integrates state space models, transformers, and conditional variational autoencoders (CVAE) to enhance performance and generalization. The proposed model, leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, achieves a 75-100% success rate with just 100 demonstrations on most of the tasks. This work significantly advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.

Original language	English
Pages (from-to)	1448-1457
Number of pages	10
Journal	IEEE Transactions on Medical Robotics and Bionics
Volume	6
Issue number	4
DOIs	https://doi.org/10.1109/TMRB.2024.3464090
Publication status	Published - 2024

Keywords

Surgical robots
imitation learning
surgical task automation

Access to Document

10.1109/TMRB.2024.3464090

Cite this

Huang, J., Shi, Q., Xie, D., Ma, Y., Liu, X., Li, C., & Duan, X. (2024). Visuomotor Policy Learning for Task Automation of Surgical Robot. IEEE Transactions on Medical Robotics and Bionics, 6(4), 1448-1457. https://doi.org/10.1109/TMRB.2024.3464090

@article{9149dbc4676e4b7994ba5486a3a76c1a,

title = "Visuomotor Policy Learning for Task Automation of Surgical Robot",

abstract = "With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. Recent learning-based approaches provide solutions to surgical automation but typically rely on low-dimensional observations. To further imitate the actions of surgeons in an end-to-end paradigm, this paper introduces a novel vision-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model, called ACMT, that integrates state space models, transformers, and conditional variational autoencoders (CVAE) to enhance performance and generalization. The proposed model, leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, achieves a 75-100% success rate with just 100 demonstrations on most of the tasks. This work significantly advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.",

keywords = "Surgical robots, imitation learning, surgical task automation",

author = "Junhui Huang and Qingxin Shi and Dongsheng Xie and Yiming Ma and Xiaoming Liu and Changsheng Li and Xingguang Duan",

note = "Publisher Copyright: {\textcopyright} 2024 IEEE.",

year = "2024",

doi = "10.1109/TMRB.2024.3464090",

language = "English",

volume = "6",

pages = "1448--1457",

journal = "IEEE Transactions on Medical Robotics and Bionics",

issn = "2576-3202",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "4",

}

TY - JOUR

T1 - Visuomotor Policy Learning for Task Automation of Surgical Robot

AU - Huang, Junhui

AU - Shi, Qingxin

AU - Xie, Dongsheng

AU - Ma, Yiming

AU - Liu, Xiaoming

AU - Li, Changsheng

AU - Duan, Xingguang

PY - 2024

Y1 - 2024

N2 - With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. Recent learning-based approaches provide solutions to surgical automation but typically rely on low-dimensional observations. To further imitate the actions of surgeons in an end-to-end paradigm, this paper introduces a novel vision-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model, called ACMT, that integrates state space models, transformers, and conditional variational autoencoders (CVAE) to enhance performance and generalization. The proposed model, leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, achieves a 75-100% success rate with just 100 demonstrations on most of the tasks. This work significantly advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.

AB - With the increasing adoption of robotic surgery systems, the need for automated surgical tasks has become more pressing. Recent learning-based approaches provide solutions to surgical automation but typically rely on low-dimensional observations. To further imitate the actions of surgeons in an end-to-end paradigm, this paper introduces a novel vision-based approach to automating surgical tasks using generative imitation learning for robotic systems. We develop a hybrid model, called ACMT, that integrates state space models, transformers, and conditional variational autoencoders (CVAE) to enhance performance and generalization. The proposed model, leveraging the Mamba block and multi-head cross-attention mechanisms for sequential modeling, achieves a 75-100% success rate with just 100 demonstrations on most of the tasks. This work significantly advances data-driven automation in surgical robotics, aiming to alleviate the burden on surgeons and improve surgical outcomes.

KW - Surgical robots

KW - imitation learning

KW - surgical task automation

UR - http://www.scopus.com/inward/record.url?scp=85204697236&partnerID=8YFLogxK

U2 - 10.1109/TMRB.2024.3464090

DO - 10.1109/TMRB.2024.3464090

M3 - Article

AN - SCOPUS:85204697236

SN - 2576-3202

VL - 6

SP - 1448

EP - 1457

JO - IEEE Transactions on Medical Robotics and Bionics

JF - IEEE Transactions on Medical Robotics and Bionics

IS - 4

ER -
