Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation

Meijie Zhang; Jianwu Li; Tianfei Zhou

doi:10.1145/3503161.3547919

^* Corresponding author: Tianfei Zhou

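School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract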

This paper solves the problem of learning image semantic segmentation using image-level supervision. The task is promising in terms of reducing annotation efforts, yet extremely challenging due to the difficulty of directly associating high-level concepts with low-level appearance. While current efforts handle each concept independently, we take a broader perspective to harvest implicit, holistic structures of semantic concepts, which express valuable prior knowledge for accurate concept grounding. This gives rise to multi-granular semantic mining, a new formalism allowing flexible specification of complex relations in the label space. In particular, we propose a heterogeneous graph neural network (Hgnn) to model the heterogeneity of multi-granular semantics within a set of input images. The Hgnn consists of two types of sub-graphs: 1) an external graph that characterizes the relations across different images to mine inter-image contexts; and 2) an internal graph, constructed for each image, that mines inter-class semantic dependencies within that image. Through heterogeneous graph learning, our Hgnn is able to gain a comprehensive understanding of object patterns, leading to more accurate semantic concept grounding. Extensive experimental results show that Hgnn outperforms the current state-of-the-art approaches on the popular PASCAL VOC 2012 and COCO 2014 benchmarks. Our code is available at: https://github.com/maeve07/HGNN.git.
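
The two sub-graph types described in the abstract can be illustrated with a toy message-passing round. This is only a minimal sketch under stated assumptions, not the authors' implementation (which is in the repository linked above): it assumes each image is represented by one feature vector per class present, that external edges link same-class nodes across images, that internal edges link all class nodes within one image, and that messages are aggregated by a plain mean and blended with a fixed weight `alpha`. All function names here are hypothetical.

```python
from typing import Dict, List

Vector = List[float]
# Assumption: one image = mapping from class label to a feature vector.
Image = Dict[str, Vector]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]


def mix(own: Vector, message: Vector, alpha: float) -> Vector:
    """Blend a node's own feature with an aggregated neighbor message."""
    return [(1 - alpha) * o + alpha * m for o, m in zip(own, message)]


def internal_step(image: Image, alpha: float = 0.5) -> Image:
    """Internal graph: each class node hears from the other classes in the same image."""
    updated = {}
    for label, feat in image.items():
        neighbors = [f for other, f in image.items() if other != label]
        updated[label] = mix(feat, mean(neighbors), alpha) if neighbors else list(feat)
    return updated


def external_step(images: List[Image], alpha: float = 0.5) -> List[Image]:
    """External graph: each class node hears from same-class nodes in other images."""
    updated = []
    for i, image in enumerate(images):
        new_image = {}
        for label, feat in image.items():
            neighbors = [other[label] for j, other in enumerate(images)
                         if j != i and label in other]
            new_image[label] = mix(feat, mean(neighbors), alpha) if neighbors else list(feat)
        updated.append(new_image)
    return updated


def heterogeneous_step(images: List[Image], alpha: float = 0.5) -> List[Image]:
    """One round: inter-image messages first, then inter-class messages per image."""
    return [internal_step(img, alpha) for img in external_step(images, alpha)]


if __name__ == "__main__":
    batch = [{"cat": [1.0, 0.0], "dog": [0.0, 1.0]}, {"cat": [3.0, 0.0]}]
    print(heterogeneous_step(batch))
```

In this toy version the order of the two steps and the mean aggregation are arbitrary choices made for brevity; a learned model would replace them with trainable message and update functions.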

Original language	English
Title of host publication	MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia
Publisher	Association for Computing Machinery, Inc
Pages	6019-6028
Number of pages	10
ISBN (Electronic)	9781450392037
DOI	https://doi.org/10.1145/3503161.3547919
Publication status	Published - 10 Oct 2022
Event	30th ACM International Conference on Multimedia, MM 2022 - Lisboa, Portugal; Duration: 10 Oct 2022 → 14 Oct 2022

Publication series

Name	MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia

Conference

Conference	30th ACM International Conference on Multimedia, MM 2022
Country/Territory	Portugal
City	Lisboa
Period	10/10/22 → 14/10/22

Cite this

Zhang, M., Li, J., & Zhou, T. (2022). Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation. In MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia (pp. 6019-6028). (MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3503161.3547919

@inproceedings{ae57b7bd5c784452acebc4fc0c067200,

title = "Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation",

abstract = "This paper solves the problem of learning image semantic segmentation using image-level supervision. The task is promising in terms of reducing annotation efforts, yet extremely challenging due to the difficulty to directly associate high-level concepts with low-level appearance. While current efforts handle each concept independently, we take a broader perspective to harvest implicit, holistic structures of semantic concepts, which express valuable prior knowledge for accurate concept grounding. This raises multi-granular semantic mining, a new formalism allowing flexible specification of complex relations in the label space. In particular, we propose a heterogeneous graph neural network (Hgnn) to model the heterogeneity of multi-granular semantics within a set of input images. The Hgnn consists of two types of sub-graphs: 1) an external graph characterizes the relations across different images to mine inter-image contexts; and for each image, 2) an internal graph is constructed to mine inter-class semantic dependencies within each individual image. Through heterogeneous graph learning, our Hgnn is able to land a comprehensive understanding of object patterns, leading to more accurate semantic concept grounding. Extensive experimental results show that Hgnn outperforms the current state-of-the-art approaches on the popular PASCAL VOC 2012 and COCO 2014 benchmarks. Our code is available at: https://github.com/maeve07/HGNN.git.",

keywords = "graph neural networks, weakly supervised semantic segmentation",

author = "Meijie Zhang and Jianwu Li and Tianfei Zhou",

note = "Publisher Copyright: {\textcopyright} 2022 ACM.; 30th ACM International Conference on Multimedia, MM 2022 ; Conference date: 10-10-2022 Through 14-10-2022",

year = "2022",

month = oct,

day = "10",

doi = "10.1145/3503161.3547919",

language = "English",

series = "MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia",

publisher = "Association for Computing Machinery, Inc",

pages = "6019--6028",

booktitle = "MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia",

}

Zhang, M, Li, J & Zhou, T 2022, Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation. In MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia. MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia, Association for Computing Machinery, Inc, pp. 6019-6028, 30th ACM International Conference on Multimedia, MM 2022, Lisboa, Portugal, 10/10/22. https://doi.org/10.1145/3503161.3547919

Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation. / Zhang, Meijie; Li, Jianwu; Zhou, Tianfei.
MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2022. p. 6019-6028 (MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia).

TY - GEN

T1 - Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation

AU - Zhang, Meijie

AU - Li, Jianwu

AU - Zhou, Tianfei

PY - 2022/10/10

Y1 - 2022/10/10

N2 - This paper solves the problem of learning image semantic segmentation using image-level supervision. The task is promising in terms of reducing annotation efforts, yet extremely challenging due to the difficulty to directly associate high-level concepts with low-level appearance. While current efforts handle each concept independently, we take a broader perspective to harvest implicit, holistic structures of semantic concepts, which express valuable prior knowledge for accurate concept grounding. This raises multi-granular semantic mining, a new formalism allowing flexible specification of complex relations in the label space. In particular, we propose a heterogeneous graph neural network (Hgnn) to model the heterogeneity of multi-granular semantics within a set of input images. The Hgnn consists of two types of sub-graphs: 1) an external graph characterizes the relations across different images to mine inter-image contexts; and for each image, 2) an internal graph is constructed to mine inter-class semantic dependencies within each individual image. Through heterogeneous graph learning, our Hgnn is able to land a comprehensive understanding of object patterns, leading to more accurate semantic concept grounding. Extensive experimental results show that Hgnn outperforms the current state-of-the-art approaches on the popular PASCAL VOC 2012 and COCO 2014 benchmarks. Our code is available at: https://github.com/maeve07/HGNN.git.

AB - This paper solves the problem of learning image semantic segmentation using image-level supervision. The task is promising in terms of reducing annotation efforts, yet extremely challenging due to the difficulty to directly associate high-level concepts with low-level appearance. While current efforts handle each concept independently, we take a broader perspective to harvest implicit, holistic structures of semantic concepts, which express valuable prior knowledge for accurate concept grounding. This raises multi-granular semantic mining, a new formalism allowing flexible specification of complex relations in the label space. In particular, we propose a heterogeneous graph neural network (Hgnn) to model the heterogeneity of multi-granular semantics within a set of input images. The Hgnn consists of two types of sub-graphs: 1) an external graph characterizes the relations across different images to mine inter-image contexts; and for each image, 2) an internal graph is constructed to mine inter-class semantic dependencies within each individual image. Through heterogeneous graph learning, our Hgnn is able to land a comprehensive understanding of object patterns, leading to more accurate semantic concept grounding. Extensive experimental results show that Hgnn outperforms the current state-of-the-art approaches on the popular PASCAL VOC 2012 and COCO 2014 benchmarks. Our code is available at: https://github.com/maeve07/HGNN.git.

KW - graph neural networks

KW - weakly supervised semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85151091560&partnerID=8YFLogxK

U2 - 10.1145/3503161.3547919

DO - 10.1145/3503161.3547919

M3 - Conference contribution

AN - SCOPUS:85151091560

T3 - MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia

SP - 6019

EP - 6028

BT - MM 2022 - Proceedings of the 30th ACM International Conference on Multimedia

PB - Association for Computing Machinery, Inc

T2 - 30th ACM International Conference on Multimedia, MM 2022

Y2 - 10 October 2022 through 14 October 2022

ER -
