Embedded discriminative attention mechanism for weakly supervised semantic segmentation

Tong Wu; Junshi Huang; Guangyu Gao; Xiaoming Wei; Xiaolin Wei; Xuan Luo; Chi Harold Liu

doi:10.1109/CVPR46437.2021.01649

Tong Wu, Junshi Huang, Guangyu Gao^*, Xiaoming Wei, Xiaolin Wei, Xuan Luo, Chi Harold Liu

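^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

125 Citations (Scopus)

Abstract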

Weakly Supervised Semantic Segmentation (WSSS) with image-level annotation uses class activation maps from the classifier as pseudo-labels for semantic segmentation. However, such activation maps usually highlight the local discriminative regions rather than the whole object, which deviates from the requirement of semantic segmentation. To explore more comprehensive class-specific activation maps, we propose an Embedded Discriminative Attention Mechanism (EDAM) that integrates activation map generation directly into the classification network for WSSS. Specifically, a Discriminative Activation (DA) layer is designed to explicitly produce a series of normalized class-specific masks, which are then used to generate the class-specific pixel-level pseudo-labels required for segmentation. To learn the pseudo-labels, the masks are multiplied with the feature maps after the backbone to generate the discriminative activation maps, each of which encodes the specific information of the corresponding category in the input images. Given such class-specific activation maps, a Collaborative Multi-Attention (CMA) module is proposed to extract the collaborative information of each given category from images in a batch. At inference, we directly use the activation masks from the DA layer as pseudo-labels for segmentation. Based on the generated pseudo-labels, we achieve an mIoU of 70.60% on the PASCAL VOC 2012 segmentation test set, which is the new state of the art, to the best of our knowledge. Code and pre-trained models will be available online soon.
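The data flow described in the abstract (normalized class-specific masks, masks multiplied with backbone features, masks reused as pseudo-labels) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the masks are computed or normalized, so the 1x1 linear projection, the softmax over classes, the background channel at index 0, the suppression of classes absent from the image-level label, and every function and variable name below are assumptions made only for illustration.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def discriminative_activation(features, weight, bias):
    """Hypothetical DA-style layer.

    features: (B, D, H, W) backbone feature maps
    weight:   (K + 1, D) per-class projection, channel 0 = background (assumption)
    bias:     (K + 1,)
    Returns masks of shape (B, K + 1, H, W) and class-specific
    activation maps of shape (B, K + 1, D, H, W).
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    logits = np.einsum("kd,bdhw->bkhw", weight, features)
    logits = logits + bias[None, :, None, None]
    # Assumed normalization: softmax over the class axis at each pixel.
    masks = softmax(logits, axis=1)
    # Each mask gates the shared features to give one map per class.
    act_maps = masks[:, :, None, :, :] * features[:, None, :, :, :]
    return masks, act_maps

def pseudo_labels(masks, image_labels):
    """Turn masks into per-pixel labels, keeping only classes that the
    image-level label marks as present (background is always kept)."""
    batch = masks.shape[0]
    present = np.concatenate([np.ones((batch, 1)), image_labels], axis=1)
    scores = masks * present[:, :, None, None]
    return scores.argmax(axis=1)

# Toy example: 2 images, 8 feature channels, 4x4 feature map, 3 classes.
rng = np.random.default_rng(0)
B, D, H, W, K = 2, 8, 4, 4, 3
features = rng.standard_normal((B, D, H, W))
weight = rng.standard_normal((K + 1, D))
bias = np.zeros(K + 1)

masks, act_maps = discriminative_activation(features, weight, bias)
image_labels = np.array([[1, 0, 0], [0, 1, 1]])  # multi-hot image-level labels
labels = pseudo_labels(masks, image_labels)
```

In this sketch `labels` holds one class index per feature-map location. In a real pipeline the masks would presumably be upsampled to the image resolution before being used as segmentation targets, and the per-class maps in `act_maps` are the kind of input a batch-level module such as the CMA could consume.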

Original language	English
Title of host publication	Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Publisher	IEEE Computer Society
Pages	16760-16769
Number of pages	10
ISBN (Electronic)	9781665445092
DOI	https://doi.org/10.1109/CVPR46437.2021.01649
Publication status	Published - 2021
Event	2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 - Virtual, Online, United States. Duration: 19 Jun 2021 → 25 Jun 2021

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print)	1063-6919

Conference

Conference	2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Country/Territory	United States
City	Virtual, Online
Period	19/06/21 → 25/06/21

Access to document

10.1109/CVPR46437.2021.01649

Other files and links

Link to publication in Scopus

Cite this

Wu, T., Huang, J., Gao, G., Wei, X., Wei, X., Luo, X., & Liu, C. H. (2021). Embedded discriminative attention mechanism for weakly supervised semantic segmentation. In Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 (pp. 16760-16769). (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). IEEE Computer Society. https://doi.org/10.1109/CVPR46437.2021.01649

@inproceedings{eeeb8b1e449f4b46901ba4d4a646a879,
title = "Embedded discriminative attention mechanism for weakly supervised semantic segmentation",
abstract = "Weakly Supervised Semantic Segmentation (WSSS) with image-level annotation uses class activation maps from the classifier as pseudo-labels for semantic segmentation. However, such activation maps usually highlight the local discriminative regions rather than the whole object, which deviates from the requirement of semantic segmentation. To explore more comprehensive class-specific activation maps, we propose an Embedded Discriminative Attention Mechanism (EDAM) by integrating the activation map generation into the classification network directly for WSSS. Specifically, a Discriminative Activation (DA) layer is designed to explicitly produce a series of normalized class-specific masks, which are then used to generate class-specific pixel-level pseudo-labels demanded in segmentation. For learning the pseudo-labels, the masks are multiplied with the feature maps after the backbone to generate the discriminative activation maps, each of which encodes the specific information of the corresponding category in the input images. Given such class-specific activation maps, a Collaborative Multi-Attention (CMA) module is proposed to extract the collaborative information of each given category from images in a batch. In inference, we directly use the activation masks from the DA layer as pseudo-labels for segmentation. Based on the generated pseudo-labels, we achieve the mIoU of 70.60% on PASCAL VOC 2012 segmentation test-set, which is the new state-of-the-art, to our best knowledge. Code and pre-trained models are available online soon.",
author = "Tong Wu and Junshi Huang and Guangyu Gao and Xiaoming Wei and Xiaolin Wei and Xuan Luo and Liu, {Chi Harold}",
note = "Publisher Copyright: {\textcopyright} 2021 IEEE; 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021 ; Conference date: 19-06-2021 Through 25-06-2021",
year = "2021",
doi = "10.1109/CVPR46437.2021.01649",
language = "English",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
publisher = "IEEE Computer Society",
pages = "16760--16769",
booktitle = "Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021",
address = "United States",
}


TY - GEN
T1 - Embedded discriminative attention mechanism for weakly supervised semantic segmentation
AU - Wu, Tong
AU - Huang, Junshi
AU - Gao, Guangyu
AU - Wei, Xiaoming
AU - Wei, Xiaolin
AU - Luo, Xuan
AU - Liu, Chi Harold
PY - 2021
Y1 - 2021
AB - Weakly Supervised Semantic Segmentation (WSSS) with image-level annotation uses class activation maps from the classifier as pseudo-labels for semantic segmentation. However, such activation maps usually highlight the local discriminative regions rather than the whole object, which deviates from the requirement of semantic segmentation. To explore more comprehensive class-specific activation maps, we propose an Embedded Discriminative Attention Mechanism (EDAM) by integrating the activation map generation into the classification network directly for WSSS. Specifically, a Discriminative Activation (DA) layer is designed to explicitly produce a series of normalized class-specific masks, which are then used to generate class-specific pixel-level pseudo-labels demanded in segmentation. For learning the pseudo-labels, the masks are multiplied with the feature maps after the backbone to generate the discriminative activation maps, each of which encodes the specific information of the corresponding category in the input images. Given such class-specific activation maps, a Collaborative Multi-Attention (CMA) module is proposed to extract the collaborative information of each given category from images in a batch. In inference, we directly use the activation masks from the DA layer as pseudo-labels for segmentation. Based on the generated pseudo-labels, we achieve the mIoU of 70.60% on PASCAL VOC 2012 segmentation test-set, which is the new state-of-the-art, to our best knowledge. Code and pre-trained models are available online soon.
UR - http://www.scopus.com/inward/record.url?scp=85113662596&partnerID=8YFLogxK
U2 - 10.1109/CVPR46437.2021.01649
DO - 10.1109/CVPR46437.2021.01649
M3 - Conference contribution
AN - SCOPUS:85113662596
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 16760
EP - 16769
BT - Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
PB - IEEE Computer Society
T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Y2 - 19 June 2021 through 25 June 2021
ER -
