Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning

Liulei Li, Tianfei Zhou, Wenguan Wang*, Lu Yang, Jianwu Li, Yi Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

33 Citations (Scopus)

Abstract

Our target is to learn visual correspondence from unlabeled videos. We develop Liir, a locality-aware inter-and intra-video reconstruction method that fills in three missing pieces, i.e., instance discrimination, location awareness, and spatial compactness, of self-supervised correspondence learning puzzle. First, instead of most existing efforts focusing on intra-video self-supervision only, we exploit cross-video affinities as extra negative samples within a unified, inter-and intra-video reconstruction scheme. This enables instance discriminative representation learning by contrasting desired intra-video pixel association against negative inter-video correspondence. Second, we merge position information into correspondence matching, and design a position shifting strategy to remove the side-effect of position encoding during inter-video affinity computation, making our Liir location-sensitive. Third, to make full use of the spatial continuity nature of video data, we impose a compactness-based constraint on correspondence matching, yielding more sparse and reliable solutions. The learned representation surpasses self-supervised state-of-the-arts on label propagation tasks including objects, semantic parts, and keypoints.

Original languageEnglish
Title of host publicationProceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
PublisherIEEE Computer Society
Pages8709-8720
Number of pages12
ISBN (Electronic)9781665469463
DOIs
Publication statusPublished - 2022
Event2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022 - New Orleans, United States
Duration: 19 Jun 202224 Jun 2022

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume2022-June
ISSN (Print)1063-6919

Conference

Conference2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Country/TerritoryUnited States
CityNew Orleans
Period19/06/2224/06/22

Keywords

  • Motion and tracking
  • Segmentation
  • grouping and shape analysis

Fingerprint

Dive into the research topics of 'Locality-Aware Inter-and Intra-Video Reconstruction for Self-Supervised Correspondence Learning'. Together they form a unique fingerprint.

Cite this