Learning the Relation Between Similarity Loss and Clustering Loss in Self-Supervised Learning

Jidong Ge, Yuxiang Liu, Jie Gui*, Lanting Fang, Ming Lin, James Tin Yau Kwok, Liguo Huang, Bin Luo

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)

Abstract

Self-supervised learning enables networks to learn discriminative features from massive amounts of unlabeled data. Most state-of-the-art methods are based on contrastive learning and maximize the similarity between two augmentations of the same image. By exploiting the consistency of the two augmentations, the burden of manual annotation is removed. Contrastive learning exploits instance-level information to learn robust features, but the learned information is largely confined to different views of the same instance. In this paper, we leverage the similarity between two distinct images to improve representations in self-supervised learning. Compared with instance-level information, the similarity between two distinct images can provide more useful information. We also analyze the relation between the similarity loss and the feature-level cross-entropy loss. These two losses are essential for most deep learning methods, yet the relation between them has remained unclear. The similarity loss helps obtain instance-level representations, while the feature-level cross-entropy loss helps mine the similarity between distinct images. We provide theoretical analyses and experiments showing that a suitable combination of these two losses achieves state-of-the-art results. Code is available at https://github.com/guijiejie/ICCL.
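To make the loss combination described above concrete, the sketch below shows one plausible way to pair an instance-level similarity loss with a feature-level cross-entropy (clustering-style) loss. It is a minimal illustration, not the authors' implementation from the linked repository; the temperature, the weighting coefficient `lam`, and the exact form of the clustering term are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of combining an
# instance-level similarity loss with a feature-level cross-entropy loss.
# The temperature and weighting coefficient `lam` are assumed values.
import torch
import torch.nn.functional as F

def similarity_loss(p, z):
    """Negative cosine similarity between a prediction p and a
    stop-gradient target z, as in many Siamese-style SSL methods."""
    p = F.normalize(p, dim=1)
    z = F.normalize(z.detach(), dim=1)
    return -(p * z).sum(dim=1).mean()

def feature_level_cross_entropy(z1, z2, temperature=0.1):
    """Treat each feature dimension as a soft assignment over the batch
    and align the two views' assignments with cross-entropy."""
    p1 = F.softmax(z1.t() / temperature, dim=1)  # (D, N): one distribution per feature
    p2 = F.softmax(z2.t() / temperature, dim=1)
    return -(p2 * torch.log(p1 + 1e-8)).sum(dim=1).mean()

def combined_loss(p1, p2, z1, z2, lam=1.0):
    """Symmetrized similarity loss plus a weighted clustering-style term."""
    sim = 0.5 * (similarity_loss(p1, z2) + similarity_loss(p2, z1))
    clu = 0.5 * (feature_level_cross_entropy(z1, z2)
                 + feature_level_cross_entropy(z2, z1))
    return sim + lam * clu
```

Here `z1`, `z2` would be projector outputs for the two augmented views and `p1`, `p2` the corresponding predictor outputs; the relative weight `lam` controls the trade-off between instance-level and cross-image (clustering) information that the paper analyzes.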

Original language: English
Pages (from-to): 3442-3454
Number of pages: 13
Journal: IEEE Transactions on Image Processing
Volume: 32
Publication status: Published - 2023
Externally published: Yes

Keywords

  • Self-supervised learning
  • image classification
  • image representation
