SM-SGE: A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification

Haocong Rao; Xiping Hu; Jun Cheng; Bin Hu

doi:10.1145/3474085.3475330

Haocong Rao, Xiping Hu*, Jun Cheng, Bin Hu

*Corresponding author for this work

School of Medical and Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Citations (Scopus)

Abstract

Person re-identification via 3D skeletons is an emerging topic with great potential in security-critical applications. Existing methods typically learn body and motion features from the body-joint trajectory, whereas they lack a systematic way to model body structure and underlying relations of body components beyond the scale of body joints. In this paper, we for the first time propose a Self-supervised Multi-scale Skeleton Graph Encoding (SM-SGE) framework that comprehensively models human body, component relations, and skeleton dynamics from unlabeled skeleton graphs of various scales to learn an effective skeleton representation for person Re-ID. Specifically, we first devise multi-scale skeleton graphs with coarse-to-fine human body partitions, which enables us to model body structure and skeleton dynamics at multiple levels. Second, to mine inherent correlations between body components in skeletal motion, we propose a multi-scale graph relation network to learn structural relations between adjacent body-component nodes and collaborative relations among nodes of different scales, so as to capture more discriminative skeleton graph features. Last, we propose a novel multi-scale skeleton reconstruction mechanism to enable our framework to encode skeleton dynamics and high-level semantics from unlabeled skeleton graphs, which encourages learning a discriminative skeleton representation for person Re-ID. Extensive experiments show that SM-SGE outperforms most state-of-the-art skeleton-based methods. We further demonstrate its effectiveness on 3D skeleton data estimated from large-scale RGB videos. Our codes are open at https://github.com/Kali-Hac/SM-SGE.
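The coarse-to-fine body partition mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a coarser-scale node is simply the centroid of the finer-scale nodes it groups. The six-joint toy skeleton, the group names, and the helper functions below are hypothetical and are not taken from the SM-SGE code base; the sketch builds node positions only and does not model graph edges or the relation network.

```python
from typing import Dict, Hashable, List, Mapping, Sequence, Tuple

Point = Tuple[float, float, float]

# Hypothetical toy skeleton with 6 joints. The groupings below are
# illustrative assumptions, not the partition used by SM-SGE.
PART_LEVEL: Dict[str, List[int]] = {
    "torso": [0, 1],
    "left_arm": [2, 3],
    "right_arm": [4, 5],
}
BODY_LEVEL: Dict[str, List[str]] = {
    "core": ["torso"],
    "arms": ["left_arm", "right_arm"],
}


def centroid(points: Sequence[Point]) -> Point:
    """Return the mean 3D position of a non-empty list of points."""
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


def coarsen(
    nodes: Mapping[Hashable, Point],
    groups: Mapping[str, Sequence[Hashable]],
) -> Dict[str, Point]:
    """Merge each group of finer nodes into one coarser node (assumed: centroid)."""
    return {
        name: centroid([nodes[member] for member in members])
        for name, members in groups.items()
    }


def multi_scale_nodes(joints: Sequence[Point]) -> List[Dict[Hashable, Point]]:
    """Return node positions at joint, part, and body scale (fine to coarse)."""
    joint_level: Dict[Hashable, Point] = dict(enumerate(joints))
    part_level = coarsen(joint_level, PART_LEVEL)
    body_level = coarsen(part_level, BODY_LEVEL)
    return [joint_level, part_level, body_level]


if __name__ == "__main__":
    toy_joints = [
        (0, 0, 0), (0, 2, 0),    # torso
        (-1, 1, 0), (-3, 1, 0),  # left arm
        (1, 1, 0), (3, 1, 0),    # right arm
    ]
    for scale in multi_scale_nodes(toy_joints):
        print(scale)
```

Applying the same routine to every frame of a skeleton sequence would give one node set per scale per frame, which is the kind of multi-level input the abstract describes.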

Original language	English
Title of host publication	MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia
Publisher	Association for Computing Machinery, Inc
Pages	1812-1820
Number of pages	9
ISBN (Electronic)	9781450386517
DOIs	https://doi.org/10.1145/3474085.3475330
Publication status	Published - 17 Oct 2021
Event	29th ACM International Conference on Multimedia, MM 2021 - Virtual, Online, China Duration: 20 Oct 2021 → 24 Oct 2021

Publication series

Name	MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

Conference

Conference	29th ACM International Conference on Multimedia, MM 2021
Country/Territory	China
City	Virtual, Online
Period	20/10/21 → 24/10/21

Keywords

multi-scale skeleton graph encoding
self-supervised representation learning
skeleton-based person re-identification

Access to Document

10.1145/3474085.3475330

Cite this

Rao, H., Hu, X., Cheng, J., & Hu, B. (2021). SM-SGE: A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia (pp. 1812-1820). (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3474085.3475330

Rao, Haocong ; Hu, Xiping ; Cheng, Jun et al. / SM-SGE : A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification. MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2021. pp. 1812-1820 (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia).

@inproceedings{20dd965ea18b40a3a0b68bd731071135,

title = "SM-SGE: A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification",

abstract = "Person re-identification via 3D skeletons is an emerging topic with great potential in security-critical applications. Existing methods typically learn body and motion features from the body-joint trajectory, whereas they lack a systematic way to model body structure and underlying relations of body components beyond the scale of body joints. In this paper, we for the first time propose a Self-supervised Multi-scale Skeleton Graph Encoding (SM-SGE) framework that comprehensively models human body, component relations, and skeleton dynamics from unlabeled skeleton graphs of various scales to learn an effective skeleton representation for person Re-ID. Specifically, we first devise multi-scale skeleton graphs with coarse-to-fine human body partitions, which enables us to model body structure and skeleton dynamics at multiple levels. Second, to mine inherent correlations between body components in skeletal motion, we propose a multi-scale graph relation network to learn structural relations between adjacent body-component nodes and collaborative relations among nodes of different scales, so as to capture more discriminative skeleton graph features. Last, we propose a novel multi-scale skeleton reconstruction mechanism to enable our framework to encode skeleton dynamics and high-level semantics from unlabeled skeleton graphs, which encourages learning a discriminative skeleton representation for person Re-ID. Extensive experiments show that SM-SGE outperforms most state-of-the-art skeleton-based methods. We further demonstrate its effectiveness on 3D skeleton data estimated from large-scale RGB videos. Our codes are open at https://github.com/Kali-Hac/SM-SGE.",

keywords = "multi-scale skeleton graph encoding, self-supervised representation learning, skeleton-based person re-identification",

author = "Haocong Rao and Xiping Hu and Jun Cheng and Bin Hu",

note = "Publisher Copyright: {\textcopyright} 2021 ACM.; 29th ACM International Conference on Multimedia, MM 2021 ; Conference date: 20-10-2021 Through 24-10-2021",

year = "2021",

month = oct,

day = "17",

doi = "10.1145/3474085.3475330",

language = "English",

series = "MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia",

publisher = "Association for Computing Machinery, Inc",

pages = "1812--1820",

booktitle = "MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia",

}

Rao, H, Hu, X, Cheng, J & Hu, B 2021, SM-SGE: A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification. in MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia, Association for Computing Machinery, Inc, pp. 1812-1820, 29th ACM International Conference on Multimedia, MM 2021, Virtual, Online, China, 20/10/21. https://doi.org/10.1145/3474085.3475330

SM-SGE: A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification. / Rao, Haocong; Hu, Xiping; Cheng, Jun et al.
MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2021. p. 1812-1820 (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia).

TY - GEN

T1 - SM-SGE

T2 - 29th ACM International Conference on Multimedia, MM 2021

AU - Rao, Haocong

AU - Hu, Xiping

AU - Cheng, Jun

AU - Hu, Bin

PY - 2021/10/17

Y1 - 2021/10/17

N2 - Person re-identification via 3D skeletons is an emerging topic with great potential in security-critical applications. Existing methods typically learn body and motion features from the body-joint trajectory, whereas they lack a systematic way to model body structure and underlying relations of body components beyond the scale of body joints. In this paper, we for the first time propose a Self-supervised Multi-scale Skeleton Graph Encoding (SM-SGE) framework that comprehensively models human body, component relations, and skeleton dynamics from unlabeled skeleton graphs of various scales to learn an effective skeleton representation for person Re-ID. Specifically, we first devise multi-scale skeleton graphs with coarse-to-fine human body partitions, which enables us to model body structure and skeleton dynamics at multiple levels. Second, to mine inherent correlations between body components in skeletal motion, we propose a multi-scale graph relation network to learn structural relations between adjacent body-component nodes and collaborative relations among nodes of different scales, so as to capture more discriminative skeleton graph features. Last, we propose a novel multi-scale skeleton reconstruction mechanism to enable our framework to encode skeleton dynamics and high-level semantics from unlabeled skeleton graphs, which encourages learning a discriminative skeleton representation for person Re-ID. Extensive experiments show that SM-SGE outperforms most state-of-the-art skeleton-based methods. We further demonstrate its effectiveness on 3D skeleton data estimated from large-scale RGB videos. Our codes are open at https://github.com/Kali-Hac/SM-SGE.

KW - multi-scale skeleton graph encoding

KW - self-supervised representation learning

KW - skeleton-based person re-identification

UR - http://www.scopus.com/inward/record.url?scp=85119381442&partnerID=8YFLogxK

U2 - 10.1145/3474085.3475330

DO - 10.1145/3474085.3475330

M3 - Conference contribution

AN - SCOPUS:85119381442

T3 - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

SP - 1812

EP - 1820

BT - MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia

PB - Association for Computing Machinery, Inc

Y2 - 20 October 2021 through 24 October 2021

ER -

Rao H, Hu X, Cheng J, Hu B. SM-SGE: A Self-Supervised Multi-Scale Skeleton Graph Encoding Framework for Person Re-Identification. In MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia. Association for Computing Machinery, Inc. 2021. p. 1812-1820. (MM 2021 - Proceedings of the 29th ACM International Conference on Multimedia). doi: 10.1145/3474085.3475330
