TY - GEN
T1 - Attention-enhanced Relation Network for Few-shot Image Classification
AU - Li, Jinyang
AU - Tong, Jiahui
AU - Gao, Guangyu
AU - Xu, Wenbin
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/1/6
Y1 - 2023/1/6
N2 - Traditional deep learning models rely heavily on large amounts of labeled data during pre-training, yet they generalize poorly to unfamiliar categories. Few-shot learning, which aims to classify unseen classes from limited labels, has recently become an active topic in computer vision. A representative approach is to extract features from the support and query sets separately and compare their similarity via metric learning. However, convolutional neural networks often focus only on a local region and ignore global context, which severely reduces matching accuracy. In this paper, we stack lightweight attention-based blocks in the embedding module, combining an adaptive-kernel-size 2D convolutional network with a cross-channel attention mechanism to encode multi-scale features and implicitly enlarge the receptive field. The SE-relation module constructs learnable non-linear comparators that use channel information to compare relationships. Finally, we report experimental results on standard few-shot benchmarks such as mini-ImageNet and tiered-ImageNet to demonstrate the effectiveness of the approach.
AB - Traditional deep learning models rely heavily on large amounts of labeled data during pre-training, yet they generalize poorly to unfamiliar categories. Few-shot learning, which aims to classify unseen classes from limited labels, has recently become an active topic in computer vision. A representative approach is to extract features from the support and query sets separately and compare their similarity via metric learning. However, convolutional neural networks often focus only on a local region and ignore global context, which severely reduces matching accuracy. In this paper, we stack lightweight attention-based blocks in the embedding module, combining an adaptive-kernel-size 2D convolutional network with a cross-channel attention mechanism to encode multi-scale features and implicitly enlarge the receptive field. The SE-relation module constructs learnable non-linear comparators that use channel information to compare relationships. Finally, we report experimental results on standard few-shot benchmarks such as mini-ImageNet and tiered-ImageNet to demonstrate the effectiveness of the approach.
KW - attention modules
KW - few-shot learning
KW - image classification
UR - http://www.scopus.com/inward/record.url?scp=85153329821&partnerID=8YFLogxK
U2 - 10.1145/3582649.3582661
DO - 10.1145/3582649.3582661
M3 - Conference contribution
AN - SCOPUS:85153329821
T3 - ACM International Conference Proceeding Series
SP - 197
EP - 203
BT - ICIGP 2023 - Proceedings of the 6th International Conference on Image and Graphics Processing
PB - Association for Computing Machinery
T2 - 6th International Conference on Image and Graphics Processing, ICIGP 2023
Y2 - 6 January 2023 through 8 January 2023
ER -