It Takes Two: Multi-frequency Perception with Complementary Fusion Network for Complex Scene Segmentation

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Complex scene segmentation aims to segment objects with intricate details or those concealed within the background. Despite significant advancements, a persistent challenge remains: accurately identifying object edges in backgrounds with high inherent similarity and complex structures. To address this, we identify the prevalent spectral bias in image segmentation, where networks preferentially learn low-frequency information, as a key impediment to recognizing and learning object edges, which are rich in high-frequency details. To mitigate this bias, we propose MCNet, a segmentation framework designed to promote balanced frequency learning. MCNet comprises two primary components: multi-frequency perception (MP), which independently captures high-frequency details and low-frequency structural components of objects, and complementary fusion (CF), which intelligently fuses these distinct frequency features through learnable, adaptive mechanisms. Crucially, MCNet employs a novel frequency-aware consistency adversarial loss to explicitly guide the learning across different frequency bands. MCNet effectively integrates MP and CF, enhancing the detection of high-frequency details and low-frequency structures, thereby alleviating challenges posed by spectral bias. We evaluate the proposed method on complex scene segmentation tasks, including camouflaged object detection and dichotomous image segmentation. Through extensive comparisons with 31 existing methods across 8 benchmark datasets, we demonstrate the superiority of the proposed method.
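To make the high-frequency/low-frequency distinction in the abstract concrete, the following is a minimal sketch that splits an image into two bands with a fixed FFT mask. It is not MCNet's multi-frequency perception module, which the abstract describes only at a high level and which is learned rather than fixed. The function name `split_frequencies`, the `cutoff` parameter, and the toy image are assumptions made for illustration.

```python
import numpy as np


def split_frequencies(image, cutoff=0.1):
    """Split a 2-D array into low- and high-frequency parts.

    Illustrative only: uses an ideal (hard) circular mask in the FFT domain.
    `cutoff` is a hypothetical parameter giving the low-pass radius in
    cycles per pixel (the Nyquist limit is 0.5).
    """
    h, w = image.shape
    # Per-axis sample frequencies, broadcast to an (h, w) radial grid.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    low_mask = radius <= cutoff

    spectrum = np.fft.fft2(image)
    # The mask is symmetric in frequency, so both inverse transforms are
    # real up to floating-point error; .real discards that residue.
    low = np.fft.ifft2(spectrum * low_mask).real
    high = np.fft.ifft2(spectrum * ~low_mask).real
    return low, high


# Toy input: a bright square on a dark background (sharp edges).
image = np.zeros((64, 64))
image[16:48, 16:48] = 1.0

low, high = split_frequencies(image, cutoff=0.1)
```

In this toy case the two bands sum back to the input, the low band holds the smooth overall shape of the square, and the high band is strongest along the square's boundary. That is the sense in which object edges are "rich in high-frequency details", and why a network biased toward low frequencies would tend to blur them.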

Original language: English
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication status: Accepted/In press - 2025
Externally published: Yes

Keywords

  • Complementary fusion
  • Complex scene segmentation
  • Multi-frequency perception
  • Spectral bias
