Self-Teaching Video Object Segmentation

Chuanwei Zhou, Chunyan Xu*, Zhen Cui, Tong Zhang, Jian Yang*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Video object segmentation (VOS) is a fundamental task underlying numerous subsequent video applications. The crucial issue in online VOS is segmenter drift: the segmenter degrades as it is incrementally updated on successive video frames under low-confidence supervision. In this work, we propose a self-teaching VOS (ST-VOS) method that enables the segmenter to learn its online adaptation as confidently as possible. When learning the segmenter at each time slice, the segment hypothesis and the segmenter update are enclosed in a self-looping optimization cycle so that each improves the other. To reduce error accumulation in the self-looping process, we introduce a metalearning strategy that learns how to perform this optimization within only a few iteration steps. To this end, the learning rates of the segmenter are adaptively derived through metaoptimization in the channel space of the convolutional kernels. Furthermore, to better launch the self-looping process, we compute an initial mask map from part detectors and motion flow, which establishes a sound foundation for subsequent refinement and makes the segmenter update more robust. Extensive experiments demonstrate that the ST idea boosts the performance of baselines, and our ST-VOS achieves encouraging performance on the DAVIS16, YouTube-Objects, DAVIS17, and SegTrackV2 data sets; in particular, it obtains a J-mean accuracy of 75.7% on the multi-instance DAVIS17 data set.
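
As a rough illustration of what "learning rates in the channel space of convolutional kernels" could look like in practice, the sketch below applies one gradient step in which every kernel channel has its own learning rate. It is not the authors' implementation: the function name `channelwise_update`, the flattened per-channel weight layout, and the assumption that the rates are indexed by output channel are all hypothetical, and the metaoptimization that would produce the rates is not shown (they are simply passed in).

```python
def channelwise_update(kernel, grad, channel_lr):
    """Apply one gradient step with a separate learning rate per channel.

    kernel, grad: lists of channels, each channel a flat list of floats
                  (assumed layout: one inner list per output channel).
    channel_lr:   one learning rate per channel (assumed to come from
                  some meta-learner, which is not modelled here).
    """
    if not (len(kernel) == len(grad) == len(channel_lr)):
        raise ValueError("kernel, grad and channel_lr need one entry per channel")

    updated = []
    for weights, grads, lr in zip(kernel, grad, channel_lr):
        if len(weights) != len(grads):
            raise ValueError("weights and gradients differ in size within a channel")
        # Every weight in this channel shares the channel's learning rate.
        updated.append([w - lr * g for w, g in zip(weights, grads)])
    return updated


# Two channels with two weights each; the second channel is frozen (rate 0).
new_kernel = channelwise_update(
    kernel=[[1.0, 2.0], [3.0, 4.0]],
    grad=[[1.0, 1.0], [1.0, 1.0]],
    channel_lr=[0.5, 0.0],
)
```

A rate of zero leaves a channel untouched, which hints at why per-channel rates can limit drift: channels whose update is judged unreliable can be adapted slowly or not at all while others adapt within a few steps.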

Original language: English
Pages (from-to): 1623-1637
Number of pages: 15
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 33
Issue number: 4
Publication status: Published - 1 Apr 2022
Externally published: Yes

Keywords

  • Metaoptimization
  • Self-teaching (ST)
  • Video object segmentation (VOS)
