ACT-Net: Anchor-Context Action Detection in Surgery Videos

Luoying Hao, Yan Hu, Wenjun Lin, Qun Wang, Heng Li, Huazhu Fu, Jinming Duan*, Jiang Liu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

Recognizing and localizing fine-grained surgical actions is an essential component of a context-aware decision support system. However, most existing detection algorithms fail to classify actions accurately even when they localize them, because they do not consider the regularity of the surgical procedure across the whole video. This limitation hinders their application. Moreover, deploying predictions in clinical applications requires conveying model confidence to earn trust, which is unexplored in surgical action prediction. In this paper, to accurately detect fine-grained actions that happen at every moment, we propose an anchor-context action detection network (ACTNet), including an anchor-context detection (ACD) module and a class conditional diffusion (CCD) module, to answer the following questions: 1) where the actions happen; 2) what the actions are; 3) how confident the predictions are. Specifically, the proposed ACD module spatially and temporally highlights the regions interacting with the extracted anchor in the surgery video and outputs the action location and its class distribution based on anchor-context interactions. Considering the full distribution of action classes in videos, the CCD module adopts a denoising diffusion-based generative model conditioned on our ACD estimator to reconstruct the action predictions more accurately. Moreover, we utilize the stochastic nature of the diffusion model outputs to assess model confidence for each prediction. Our method reports state-of-the-art performance, with an improvement of 4.0% mAP over the baseline on the surgical video dataset.
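The abstract does not state how confidence is computed from the stochastic diffusion outputs. As a minimal sketch only, assuming the diffusion module is sampled several times per detection and each run yields a class-probability vector, the runs could be aggregated as follows. The function name `aggregate_samples` and the two confidence measures (vote agreement and standard deviation) are illustrative assumptions, not the authors' method.

```python
from collections import Counter
from statistics import mean, pstdev


def aggregate_samples(samples):
    """Aggregate class-probability vectors from repeated stochastic runs.

    samples: list of equal-length lists, one probability vector per run
             (hypothetical input format, not taken from the paper).
    Returns (predicted_class, agreement, spread).
    """
    if not samples:
        raise ValueError("need at least one sample")
    num_classes = len(samples[0])

    # Mean probability of each class across all runs.
    mean_probs = [mean([s[c] for s in samples]) for c in range(num_classes)]
    predicted = max(range(num_classes), key=mean_probs.__getitem__)

    # Fraction of runs whose own top class matches the aggregated prediction.
    votes = Counter(max(range(num_classes), key=s.__getitem__) for s in samples)
    agreement = votes[predicted] / len(samples)

    # Population standard deviation of the predicted class's probability.
    spread = pstdev([s[predicted] for s in samples])
    return predicted, agreement, spread


runs = [
    [0.1, 0.7, 0.2],
    [0.2, 0.6, 0.2],
    [0.5, 0.3, 0.2],
    [0.1, 0.8, 0.1],
]
predicted, agreement, spread = aggregate_samples(runs)
```

High agreement with low spread would suggest a stable prediction; low agreement would flag a detection that may merit review.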

Original language: English
Title of host publication: Medical Image Computing and Computer Assisted Intervention – MICCAI 2023 - 26th International Conference, Proceedings
Editors: Hayit Greenspan, Anant Madabhushi, Parvin Mousavi, Septimiu Salcudean, James Duncan, Tanveer Syeda-Mahmood, Russell Taylor
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 196-206
Number of pages: 11
ISBN (Print): 9783031439957
DOI
Publication status: Published - 2023
Externally published
Event: 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023 - Vancouver, Canada
Duration: 8 Oct 2023 to 12 Oct 2023

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14228 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2023
Country/Territory: Canada
City: Vancouver
Period: 8/10/23 to 12/10/23
