Prompt Tuning In a Compact Attribute Space

Shiyu Hou, Tianfei Zhou, Shuai Zhang, Ye Yuan*, Guoren Wang

*Corresponding author for this work

Research output: Contribution to journal; Conference article; peer-reviewed

Abstract

Prompt tuning (PT) has emerged as a key to unlocking the power of vision-language models like CLIP for various downstream tasks. Predominant approaches learn a small set of task-relevant soft prompts by solving an image-class matching problem. Nevertheless, by optimizing only with respect to class names, they struggle to learn high-performing prompts that capture the fine-grained, diverse characteristics of each class, and they tend to overfit the potentially biased distribution of base classes. In this work, we propose PTinCAS to tackle prompt tuning in a compact attribute space, driven by the premise that attributes offer detailed class interpretations and can facilitate transfer across related categories. Specifically, PTinCAS is grounded in two innovative designs. First, we create a compact attribute space by prompting large language models to generate factual descriptions of categories, which are subsequently clustered to form a concise attribute vocabulary. Second, we leverage attributes as a source of supervision in PT to transfer the commonsense knowledge inherent in attributes to soft prompts. An object-aware visual prompting mechanism is developed to highlight intended regions in the original image, guiding the model towards learning visual attributes associated with object regions rather than the background. We show that PTinCAS not only improves few-shot generalizability compared to existing PT methods, but also provides some level of inherent explainability, helping us understand why a class name is predicted based on the attributes activated in an image.
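
The first design, collapsing many generated descriptions into a concise attribute vocabulary, can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm or text encoder is used, so everything below is an assumption for illustration only: the hand-written 3-dimensional vectors stand in for text-encoder embeddings, the greedy cosine-threshold clustering is a simple substitute rather than the authors' procedure, and the names `cosine` and `build_attribute_vocabulary` and the `threshold` value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_attribute_vocabulary(descriptions, threshold=0.9):
    """Greedily group (text, vector) pairs whose vectors are similar.

    Assumption: a description joins the most similar existing cluster if
    its cosine similarity to that cluster's centroid is >= threshold;
    otherwise it starts a new cluster. The first member of each cluster
    is used as the cluster's attribute name.
    """
    clusters = []  # each cluster: {"centroid": [...], "members": [...]}
    for text, vec in descriptions:
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": list(vec), "members": [text]})
        else:
            n = len(best["members"])
            # Update the running mean of the cluster's vectors.
            best["centroid"] = [
                (c * n + x) / (n + 1) for c, x in zip(best["centroid"], vec)
            ]
            best["members"].append(text)
    vocabulary = [cluster["members"][0] for cluster in clusters]
    return vocabulary, clusters


# Toy stand-ins for LLM-generated descriptions and their embeddings.
toy_descriptions = [
    ("has whiskers", [1.0, 0.0, 0.0]),
    ("has long whiskers", [0.95, 0.05, 0.0]),
    ("has striped fur", [0.0, 1.0, 0.0]),
    ("has fur with stripes", [0.05, 0.98, 0.0]),
    ("has wings", [0.0, 0.0, 1.0]),
]

vocabulary, clusters = build_attribute_vocabulary(toy_descriptions)
print(vocabulary)
```

In this toy setting, five near-duplicate descriptions reduce to three attributes, which mirrors the idea of a compact vocabulary shared across related categories. A real pipeline would presumably embed descriptions with a pretrained text encoder and might use a different clustering method entirely.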

Original language: English
Pages (from-to): 3518-3526
Number of pages: 9
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Volume: 39
Issue number: 4
Publication status: Published - 11 Apr 2025
Event: 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States
Duration: 25 Feb 2025 to 4 Mar 2025
