Superpixel-based Visual Feature Enhancement for Compositional Zero-Shot Learning

  • Wenlong Du
  • Xianglin Bao
  • Xiaofeng Xu*
  • Xingyu Lu
  • Ruiheng Zhang

* Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Compositional Zero-Shot Learning (CZSL) is a challenging machine learning task that recognizes new compositional concepts by leveraging learned concepts such as attribute-object combinations. Previous research depended on visual attributes derived from networks pre-trained in object categorization. These approaches are limited in capturing the subtleties of attribute distinctions and fail to account for the critical contextual interactions between attributes and visual objects. To address this problem, in this work, we draw inspiration from superpixels and introduce the Superpixel-based Visual Feature Enhancement (SVFE) model for the compositional zero-shot learning task. In the proposed approach, an innovative superpixel integration strategy is designed to meticulously disentangle and represent the visual concepts of states and objects with finer granularity. Then, we introduce a novel Fourier spectral layer that harnesses the frequency domain to capture global image features and dynamically adjusts component contributions to enhance the local detail representation. Furthermore, we propose a long-range fusion module to optimize the synergy between the local and global features, thereby fortifying the model’s acuity in discerning intricate compositional relationships. Through rigorous experiments on standard CZSL benchmark datasets, the proposed SVFE model demonstrates significant improvement over other state-of-the-art methods in both open-world and closed-world CZSL scenarios.
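The abstract describes the Fourier spectral layer only at a high level, so the following is a minimal sketch of the general idea (reweighting frequency components of a feature map so that every output position depends on the whole image), not the SVFE implementation. The function name `spectral_filter`, the `(H, W, C)` layout, and the per-frequency complex `weights` array are all assumptions made for illustration; in a trained model such weights would presumably be learned.

```python
import numpy as np


def spectral_filter(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Reweight the 2D frequency components of an (H, W, C) feature map.

    Hypothetical helper: `weights` has shape (H, W // 2 + 1, C) and stands in
    for whatever learnable parameters a real spectral layer would hold.
    """
    h, w, _ = features.shape
    # Real-input FFT over the two spatial axes; result is (H, W // 2 + 1, C).
    spectrum = np.fft.rfft2(features, axes=(0, 1))
    # Element-wise scaling: each frequency component gets its own weight.
    spectrum = spectrum * weights
    # Inverse transform back to the spatial domain at the original size.
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1))


rng = np.random.default_rng(0)
features = rng.standard_normal((8, 8, 4))

# All-ones weights leave the feature map unchanged (up to float precision).
identity = np.ones((8, 5, 4), dtype=complex)
out = spectral_filter(features, identity)

# Keeping only the zero-frequency (DC) component replaces every position
# with the per-channel spatial mean, i.e. a purely global feature.
dc_only = np.zeros((8, 5, 4), dtype=complex)
dc_only[0, 0, :] = 1.0
global_avg = spectral_filter(features, dc_only)
```

The two weight settings bracket the behaviour: all-ones weights pass local detail through untouched, while DC-only weights collapse the map to a global average. Intermediate weights trade off between the two, which is one plausible reading of "dynamically adjusts component contributions".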

Original language: English
Article number: 104414
Journal: Information Processing and Management
Volume: 63
Issue number: 2
Publication status: Published - Mar 2026
Externally published: Yes

Keywords

  • Attention fusion
  • Attribute-object combinations
  • Compositional zero-shot learning
  • Fourier spectral layer
  • Superpixel segmentation
