Label-Efficient Few-Shot Semantic Segmentation with Unsupervised Meta-Training

Jianwu Li, Kaiyue Shi, Guo Sen Xie, Xiaofeng Liu, Jian Zhang, Tianfei Zhou*

*Corresponding author of this work

Research output: Contribution to journal › Conference article › Peer-reviewed

2 Citations (Scopus)

Abstract

The goal of this paper is to alleviate the training cost of few-shot semantic segmentation (FSS) models. Although FSS by design improves model generalization to new concepts using only a handful of test exemplars, it relies on strong supervision from a considerable amount of labeled training data for base classes. However, collecting pixel-level annotations is notoriously expensive and time-consuming, and small-scale training datasets convey low information density, which limits test-time generalization. To resolve this issue, we take a pioneering step towards label-efficient training of FSS models from fully unlabeled training data, optionally augmented with a few labeled samples to enhance performance. This motivates an approach based on a novel unsupervised meta-training paradigm. In particular, the approach first distills pre-trained unsupervised pixel embeddings into compact semantic clusters, from which a massive number of pseudo meta-tasks is constructed. To mitigate the noise in these pseudo meta-tasks, we further advocate a robust Transformer-based FSS model with a novel prototype-based cross-attention design. Extensive experiments on two standard benchmarks, i.e., PASCAL-5i and COCO-20i, show that our method achieves impressive performance without any annotations and is comparable to fully supervised competitors even when using only 20% of the annotations. Our code is available at: https://github.com/SSSKYue/UMTFSS.
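
To make the two core ideas concrete, below is a minimal sketch, not the authors' released implementation, of how pseudo meta-tasks might be built by clustering unsupervised pixel embeddings. The helper names (`pseudo_masks`, `sample_episode`), the use of scikit-learn k-means, and the episode-sampling policy are illustrative assumptions.

```python
# Minimal sketch: turn per-pixel self-supervised embeddings into pseudo
# classes via a shared k-means, then sample support/query episodes from them.
# All names and the clustering choice are assumptions, not the paper's pipeline.
import numpy as np
from sklearn.cluster import KMeans

def pseudo_masks(embeddings, n_clusters=27):
    """embeddings: (N, H*W, C) per-pixel features for N unlabeled images.
    Returns (N, H*W) pseudo-label maps from one k-means fit over all pixels."""
    N, HW, C = embeddings.shape
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(embeddings.reshape(-1, C))
    return km.labels_.reshape(N, HW)  # cluster id per pixel = pseudo class

def sample_episode(labels, rng=np.random.default_rng()):
    """Pick a pseudo class and a support/query image pair that both contain it."""
    cls = rng.integers(labels.max() + 1)
    imgs = np.where((labels == cls).any(axis=1))[0]
    s, q = rng.choice(imgs, size=2, replace=False)
    return s, q, cls  # (labels[s] == cls) serves as the binary support mask
```

Likewise, one plausible reading of the prototype-based cross-attention design is sketched below in PyTorch: query pixels attend to a small set of support prototypes rather than to every support pixel, which should damp the label noise in pseudo meta-tasks. The class name `PrototypeCrossAttention`, the two-prototype (foreground/background) scheme via masked average pooling, and all hyperparameters are assumptions rather than the paper's exact architecture.

```python
# Minimal sketch of prototype-based cross-attention (assumed design, PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeCrossAttention(nn.Module):
    """Query features cross-attend to compact support prototypes."""

    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    @staticmethod
    def masked_avg_pool(feat, mask):
        # feat: (B, C, H, W); mask: (B, 1, H, W) in {0, 1}
        mask = F.interpolate(mask, size=feat.shape[-2:], mode="nearest")
        num = (feat * mask).sum(dim=(2, 3))
        den = mask.sum(dim=(2, 3)).clamp(min=1e-6)
        return num / den  # (B, C) prototype vector

    def forward(self, q_feat, s_feat, s_mask):
        # q_feat, s_feat: (B, C, H, W); s_mask: (B, 1, H, W) binary support mask
        B, C, H, W = q_feat.shape
        fg = self.masked_avg_pool(s_feat, s_mask)        # foreground prototype
        bg = self.masked_avg_pool(s_feat, 1.0 - s_mask)  # background prototype
        protos = torch.stack([fg, bg], dim=1)            # (B, 2, C)
        q = q_feat.flatten(2).transpose(1, 2)            # (B, HW, C)
        out, _ = self.attn(q, protos, protos)            # attend to prototypes only
        out = self.norm(out + q)                         # residual + norm
        return out.transpose(1, 2).reshape(B, C, H, W)
```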

Original language: English
Pages (from-to): 3109-3117
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 38
Issue number: 4
DOI
Publication status: Published - 25 Mar 2024
Event: 38th AAAI Conference on Artificial Intelligence, AAAI 2024 - Vancouver, Canada
Duration: 20 Feb 2024 - 27 Feb 2024
