Abstract
Recently, the strategy of integrating an instance mask prediction head into a one-stage or two-stage object detector has become immensely popular for instance segmentation (e.g., RetinaMask or Mask R-CNN). This strategy notably improves the object detector while it learns to predict instance masks. In this paper, we introduce a Mask-aided R-CNN model with a flexible, multi-stage training protocol to address the problems of the EAD2019 Challenge (multi-class artefact detection in video endoscopy). The proposed training protocol aims to make this strategy easier to apply to both the detection and segmentation tasks, and to improve detection and segmentation performance using pixel-level labeled samples that cover only some of the categories. The protocol consists of three principal steps, the core of which is augmenting the training set with soft pixel-level labels. The Mask-aided R-CNN is derived from Mask R-CNN by pruning its mask head so that it can be trained on pixel-level labeled samples with incomplete categories. We also propose a simple yet effective ensemble method for object detectors, based on graph cliques, to further improve detection performance. The ensemble method votes on graph cliques to fuse the detection results from different detectors. It produces robust detection results, which is important for clinical application. Extensive experiments on the EAD2019 challenge dataset demonstrate the effectiveness of the proposed ensemble Mask-aided R-CNN. As a result, we won first place in the detection task of the EAD2019 Challenge.
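The clique-voting idea can be illustrated with a minimal sketch. The abstract does not specify how the graph is built or how votes are counted, so everything below is an assumption made for illustration: detections are nodes, an edge joins two detections from different detectors that share a class and overlap with IoU at or above `iou_thr`, and a maximal clique is kept only if it has at least `min_votes` members. The function names, the thresholds, and the score-weighted box averaging are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def maximal_cliques(adj):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def fuse_by_cliques(detections, iou_thr=0.5, min_votes=2):
    """Fuse detections given as (detector_id, label, box, score) tuples.

    Assumed rule (not from the paper): a clique of mutually overlapping,
    same-class boxes from different detectors counts one vote per member.
    Scores are assumed to be strictly positive.
    """
    n = len(detections)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        det_i, label_i, box_i, _ = detections[i]
        det_j, label_j, box_j, _ = detections[j]
        if det_i != det_j and label_i == label_j and iou(box_i, box_j) >= iou_thr:
            adj[i].add(j)
            adj[j].add(i)

    fused, used = [], set()
    # Largest cliques first, so the strongest agreement claims its boxes first.
    for clique in sorted(maximal_cliques(adj), key=len, reverse=True):
        members = sorted(clique - used)
        if len(members) < min_votes:
            continue
        used.update(members)
        total = sum(detections[m][3] for m in members)
        # Score-weighted average of each box coordinate.
        box = tuple(
            sum(detections[m][2][k] * detections[m][3] for m in members) / total
            for k in range(4)
        )
        fused.append((detections[members[0]][1], box, total / len(members)))
    return fused
```

With `min_votes=2`, a box reported by only one detector is discarded, while a box that several detectors agree on is merged into a single detection. That is one plausible way to obtain the robustness the abstract describes, under the assumptions stated above.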
Original language | English |
---|---|
Journal | CEUR Workshop Proceedings |
Volume | 2366 |
Publication status | Published - 2019 |
Event | 2019 Challenge on Endoscopy Artefacts Detection: Multi-Class Artefact Detection in Video Endoscopy, EAD 2019 - Venice, Italy. Duration: 8 Apr 2019 → … |
Keywords
- Ensemble
- Graph clique
- Mask-aided R-CNN
- Soft label