A novel compositional zero-shot learning approach based on hierarchical multi-scale feature fusion

  • Wenlong Du
  • Xianglin Bao
  • Wei Zhao
  • Xiaofeng Xu*
  • Xingyu Lu
  • Ruiheng Zhang
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Compositional Zero-Shot Learning (CZSL) aims to recognize novel combinations of attributes and objects composed from previously learned concepts. Most existing CZSL methods struggle to model the complex interplay between attributes and objects, particularly when compositions differ in subtle visual details or scale. Inspired by point-wise convolution, we propose a novel Hierarchical Multi-Scale Feature Fusion approach for the CZSL task. The approach incorporates a patch-aware feature selection mechanism that selects informative patches from images, enhancing the model's ability to capture fine-grained details. We then design a hierarchical multi-scale feature fusion strategy that combines visual features from multiple scales, allowing the model to integrate local and global information effectively. This fusion strategy enhances the model's ability to disentangle attributes from objects, thereby improving recognition of novel compositions. In extensive experiments on standard CZSL benchmark datasets, the proposed approach demonstrates significant improvement over other state-of-the-art methods in both open-world and closed-world CZSL scenarios. This study not only improves accuracy and robustness in the compositional zero-shot learning task but also provides solutions for complex visual tasks in image understanding and robotics, promoting the development of Artificial Intelligence in visual data and semantic understanding.
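
The abstract does not give implementation details, so the following is only a toy sketch of the two ideas it names: selecting informative patches and fusing features from several scales with a point-wise (per-dimension) weighted combination. Everything specific here is an assumption made for illustration and is not the authors' method: the L2-norm patch score, the group-wise mean pooling followed by an element-wise max, the fixed fusion weights, and the function names `select_patches`, `multi_scale_features`, and `fuse_pointwise`.

```python
import math


def select_patches(patches, k):
    """Keep the k patch vectors with the largest L2 norm.

    The L2 norm is an assumed stand-in for a learned informativeness score.
    """
    def norm(p):
        return math.sqrt(sum(x * x for x in p))
    return sorted(patches, key=norm, reverse=True)[:k]


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def multi_scale_features(patches, group_sizes):
    """Build one feature vector per scale.

    For each group size g, consecutive patches are mean-pooled in groups
    of g, and the pooled groups are reduced with an element-wise max.
    Small g keeps local detail; g == len(patches) gives a global average.
    """
    features = []
    for g in group_sizes:
        groups = [patches[i:i + g] for i in range(0, len(patches), g)]
        pooled = [mean_pool(group) for group in groups]
        features.append([max(col) for col in zip(*pooled)])
    return features


def fuse_pointwise(scale_features, weights):
    """Combine the per-scale vectors dimension by dimension.

    This mirrors what a 1x1 (point-wise) convolution does across channels:
    each output dimension is a weighted sum over scales at that position.
    The weights would be learned in practice; here they are fixed.
    """
    return [
        sum(w * f[d] for w, f in zip(weights, scale_features))
        for d in range(len(scale_features[0]))
    ]


# Toy input: four 2-dimensional patch embeddings (made-up numbers).
patches = [[1.0, 0.0], [0.0, 3.0], [0.1, 0.1], [2.0, 2.0]]
selected = select_patches(patches, k=2)
scales = multi_scale_features(selected, group_sizes=[1, 2])
fused = fuse_pointwise(scales, weights=[0.5, 0.5])
print(selected, scales, fused)
```

In a real model, the patch scores and fusion weights would be learned jointly with the attribute and object classifiers; the fixed values above only make the data flow visible.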

Original language: English
Article number: 111633
Journal: Engineering Applications of Artificial Intelligence
Volume: 159
Publication status: Published - 8 Nov 2025
Externally published: Yes

Keywords

  • Attribute-object combinations
  • Compositional zero-shot learning
  • Multi-scale feature fusion
  • Patch-aware feature selection
