Prototypical Contrast and Reverse Prediction: Unsupervised Skeleton Based Action Recognition

Shihao Xu, Haocong Rao, Xiping Hu*, Jun Cheng*, Bin Hu*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

25 Citations (Scopus)

Abstract

We focus on unsupervised representation learning for skeleton-based action recognition. Existing unsupervised approaches usually learn action representations through motion prediction, but they cannot fully capture inherent semantic similarity. In this paper, we propose a novel framework named Prototypical Contrast and Reverse Prediction (PCRP) to address this challenge. Unlike plain motion prediction, PCRP performs reverse motion prediction with an encoder-decoder structure to extract more discriminative temporal patterns, and derives action prototypes by clustering to explore the inherent action similarity within the action encodings. Specifically, we regard action prototypes as latent variables and formulate PCRP as an expectation-maximization (EM) task. PCRP iteratively runs (1) an E-step that determines the distribution of action prototypes by clustering the action encodings from the encoder while estimating the concentration around each prototype, and (2) an M-step that optimizes the model by minimizing the proposed ProtoMAE loss, which simultaneously pulls each action encoding closer to its assigned prototype through contrastive learning and performs the reverse motion prediction task. In addition, sorting can serve as a temporal task similar to reverse prediction in the proposed framework. Extensive experiments on the N-UCLA, NTU 60, and NTU 120 datasets show that PCRP outperforms mainstream unsupervised methods and even achieves superior performance over many supervised methods. The code is available at: https://github.com/LZUSIAT/PCRP.
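The E-step and M-step described in the abstract can be illustrated with a small pure-Python sketch. This is not the authors' implementation (see the repository linked above for that). All function names are hypothetical, and the encoder and decoder networks are omitted. Plain k-means stands in for the clustering step. The concentration estimate (mean member distance to the prototype) and the unweighted sum of the two loss terms are assumptions made only for illustration.

```python
import math
import random

def reverse_target(sequence):
    """Target for reverse motion prediction: the frames in reversed time order."""
    return sequence[::-1]

def mae(pred, target):
    """Mean absolute error over all frames and coordinates."""
    diffs = [abs(p - t) for pf, tf in zip(pred, target) for p, t in zip(pf, tf)]
    return sum(diffs) / len(diffs)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def e_step(encodings, k, iters=10, seed=0):
    """E-step sketch: k-means over encodings plus a per-cluster concentration."""
    rng = random.Random(seed)
    prototypes = [list(e) for e in rng.sample(encodings, k)]
    assign = [0] * len(encodings)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: sq_dist(e, prototypes[j]))
                  for e in encodings]
        for j in range(k):
            members = [e for e, a in zip(encodings, assign) if a == j]
            if members:  # keep the old prototype if a cluster becomes empty
                prototypes[j] = [sum(col) / len(members) for col in zip(*members)]
    # Assumed concentration: mean distance of members to their prototype.
    phis = []
    for j in range(k):
        d = [math.sqrt(sq_dist(e, prototypes[j]))
             for e, a in zip(encodings, assign) if a == j]
        phis.append(sum(d) / len(d) + 1e-3 if d else 1.0)
    return prototypes, assign, phis

def proto_contrast(encoding, idx, prototypes, phis):
    """Cross-entropy over prototype similarities scaled by concentration."""
    logits = [sum(x * c for x, c in zip(encoding, p)) / phi
              for p, phi in zip(prototypes, phis)]
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_norm - logits[idx]

def m_step_loss(encoding, idx, prototypes, phis, decoded, sequence):
    """M-step objective sketch: reverse-prediction MAE plus prototype contrast."""
    return (mae(decoded, reverse_target(sequence))
            + proto_contrast(encoding, idx, prototypes, phis))
```

In a training loop, `e_step` would be rerun on fresh encodings at intervals, and `m_step_loss` would be minimized by gradient descent on the encoder and decoder parameters between those runs. The sketch only evaluates the loss and does not perform the optimization.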

Original language: English
Pages (from-to): 624-634
Number of pages: 11
Journal: IEEE Transactions on Multimedia
Volume: 25
Publication status: Published - 2023
