Video Object Segmentation with Dynamic Query Modulation

Hantao Zhou, Runze Hu*, Xiu Li*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

By storing intermediate frame segmentations as memory for long-range context modeling, spatial-temporal memory-based methods have recently shown impressive results in semi-supervised video object segmentation (SVOS). However, these methods face two key limitations: 1) they rely on non-local pixel-level matching to read memory, which yields noisy retrieved features for segmentation; 2) they segment each object independently, without interaction between objects. These shortcomings cause memory-based methods to struggle when segmenting similar objects or multiple objects. To address these issues, we propose a query modulation method, termed QMVOS. This method summarizes object features into dynamic queries and then treats them as dynamic filters for mask prediction, thereby providing high-level descriptions and object-level perception for the model. Inter-query attention enables efficient and effective multi-object interaction. Extensive experiments demonstrate that our method brings significant improvements to memory-based SVOS methods and achieves competitive performance on standard SVOS benchmarks. The code is available at https://github.com/zht8506/QMVOS.
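
The three ideas named in the abstract (summarizing object features into queries, inter-query attention, and using queries as dynamic filters) can be illustrated with a minimal NumPy sketch. This is not the QMVOS implementation: the function names, the mask-weighted pooling, the projection-free single-head attention, and the toy tensor shapes are all assumptions made for illustration, and the memory bank and learned layers are omitted entirely.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def summarize_queries(features, masks):
    """Hypothetical query summarization via mask-weighted average pooling.

    features: (C, H, W) feature map; masks: (N, H, W) soft object masks.
    Returns one C-dimensional query per object, shape (N, C).
    """
    c = features.shape[0]
    flat_f = features.reshape(c, -1)              # (C, H*W)
    flat_m = masks.reshape(masks.shape[0], -1)    # (N, H*W)
    weights = flat_m / (flat_m.sum(axis=1, keepdims=True) + 1e-6)
    return weights @ flat_f.T                     # (N, C)

def inter_query_attention(queries):
    """Assumed form of object interaction: single-head self-attention
    across the N queries with a residual connection, no learned weights."""
    d = queries.shape[1]
    attn = softmax(queries @ queries.T / np.sqrt(d), axis=-1)  # (N, N)
    return queries + attn @ queries

def predict_masks(queries, features):
    """Apply each query as a 1x1 dynamic filter over the feature map.
    Returns per-object mask logits of shape (N, H, W)."""
    return np.einsum("nc,chw->nhw", queries, features)

# Toy example: 2 objects, 8 channels, a 4x4 feature map.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 4))
masks = np.zeros((2, 4, 4))
masks[0, :, :2] = 1.0   # object 0 occupies the left half
masks[1, :, 2:] = 1.0   # object 1 occupies the right half

queries = inter_query_attention(summarize_queries(features, masks))
logits = predict_masks(queries, features)
```

In this sketch each object is described by a single vector rather than by pixel-level matches, and the attention step lets the queries of different objects influence one another before the masks are predicted.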

Original language: English
Title of host publication: 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
Publisher: IEEE Computer Society
ISBN (Electronic): 9798350390155
Publication status: Published - 2024
Event: 2024 IEEE International Conference on Multimedia and Expo, ICME 2024 - Niagara Falls, Canada
Duration: 15 Jul 2024 to 19 Jul 2024

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2024 IEEE International Conference on Multimedia and Expo, ICME 2024
Country/Territory: Canada
City: Niagara Falls
Period: 15/07/24 to 19/07/24

Keywords

  • Memory bank
  • Object query
  • SVOS
