SynerNet: Broad-to-precise CAM synergy for weakly supervised semantic segmentation

  • Zhonggai Wang
  • Guangyu Gao*
  • Zhuoshu Li
  • A. K. Qin
  • *Corresponding author for this work
  • Beijing Institute of Technology
  • Swinburne University of Technology

Research output: Journal contribution, article, peer-reviewed

Abstract

Weakly Supervised Semantic Segmentation (WSSS) remains highly challenging because image-level supervision typically produces class activation maps (CAMs) that are incomplete or noisy when used as pixel-level pseudo-labels. Despite the architectural efficiency of one-stage approaches, they are often hindered by tight encoder-label coupling: CAMs and segmentation predictions are derived from the same encoder and optimized jointly, leading to the propagation and reinforcement of initial CAM inaccuracies by the segmentation outputs. To circumvent this limitation, we propose SynerNet, a one-stage dual-branch framework that explicitly mandates complementary yet synergistic objectives: one branch generates broad pseudo-labels to enhance coverage, while the other produces precise pseudo-labels to sharpen localization. With such pseudo-labels, the segmentation network yields simultaneously comprehensive and accurate predictions. The broad branch (B-CAM) leverages global attention to expand foreground coverage by guiding ambiguous regions toward likely foreground, whereas the precise branch (P-CAM) emphasizes fine localization by encouraging unreliable pixels toward the background. Through cross-supervision, the two branches effectively decouple the optimization process, alleviating the risk of error reinforcement inherent in direct coupling. To further integrate their strengths, we introduce a confidence matrix derived from multi-scale ViT features, in which pixels consistently classified across layers are treated as high-confidence, while inconsistent ones are marked as uncertain. This enables a confidence-guided fusion strategy that directly adopts reliable predictions and adaptively blends uncertain regions using contributions from both branches. Such a complementary design mitigates error reinforcement and promotes mutually beneficial learning, enabling the network to generate high-fidelity pseudo-labels in a fully end-to-end manner. 
By combining branch-specific objectives with confidence-guided fusion, SynerNet produces pseudo-labels that are both complete and precise, achieves state-of-the-art performance on PASCAL VOC 2012 and COCO 2014, and demonstrates the effectiveness of one-stage co-training for high-quality weakly supervised segmentation. The code is publicly available at: https://github.com/ZhonggaiWang/DEFormer.
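The confidence-guided fusion described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the array shapes, the rule that a pixel is "high-confidence" only when every layer predicts the same class, and the fixed blending weight `alpha` (standing in for the adaptive blending the abstract mentions) are all assumptions made for illustration.

```python
import numpy as np


def confidence_mask(layer_labels: np.ndarray) -> np.ndarray:
    """Return a boolean (H, W) mask that is True where all layers agree.

    layer_labels: integer array of shape (L, H, W), one predicted label
    map per feature layer (an assumed input format).
    """
    return np.all(layer_labels == layer_labels[0], axis=0)


def fuse_pseudo_labels(b_cam, p_cam, layer_labels, alpha=0.5):
    """Fuse broad (B-CAM) and precise (P-CAM) scores into one label map.

    b_cam, p_cam: float arrays of shape (C, H, W) with per-class scores.
    alpha: hypothetical fixed weight of the broad branch; the paper
    describes adaptive blending, which this sketch does not reproduce.
    """
    confident = confidence_mask(layer_labels)
    # Uncertain pixels: blend both branches, then pick the best class.
    blended = alpha * b_cam + (1.0 - alpha) * p_cam
    blended_labels = blended.argmax(axis=0)
    # Confident pixels: adopt the label that every layer agrees on.
    fused = np.where(confident, layer_labels[0], blended_labels)
    return fused, confident


# Toy 2x2 image, 2 classes, 3 layers. Only pixel (1, 0) is inconsistent.
layer_labels = np.array([
    [[0, 1], [1, 0]],
    [[0, 1], [0, 0]],
    [[0, 1], [1, 0]],
])
b_cam = np.array([[[0.2, 0.1], [0.3, 0.9]],   # class 0 scores
                  [[0.8, 0.9], [0.7, 0.1]]])  # class 1 scores
p_cam = np.array([[[0.9, 0.2], [0.9, 0.8]],
                  [[0.1, 0.8], [0.1, 0.2]]])

fused, confident = fuse_pseudo_labels(b_cam, p_cam, layer_labels)
```

In this toy input the layers disagree only at pixel (1, 0), so that pixel alone takes its label from the blended scores; every other pixel keeps the label shared by all layers.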

Original language: English
Article number: 109024
Journal: Neural Networks
Volume: 202
DOI
Publication status: Published - Oct 2026
Externally published
