TY - GEN
T1 - Geometry-driven self-supervised method for 3D human pose estimation
AU - Li, Yang
AU - Li, Kan
AU - Jiang, Shuai
AU - Zhang, Ziyue
AU - Huang, Congzhentao
AU - Da Xu, Richard Yi
N1 - Publisher Copyright:
Copyright © 2020, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2020
Y1 - 2020
AB - Neural-network-based approaches for 3D human pose estimation from monocular images have attracted growing interest. However, annotating 3D poses is a labor-intensive and expensive process. In this paper, we propose a novel self-supervised approach that avoids the need for manual annotations. Unlike existing weakly/self-supervised methods, which require extra unpaired 3D ground-truth data to alleviate the depth-ambiguity problem, our method trains the network relying only on geometric knowledge, without any additional 3D pose annotations. The proposed method follows a two-stage pipeline: 2D pose estimation and 2D-to-3D pose lifting. We design a transform re-projection loss that effectively exploits multi-view consistency for training the 2D-to-3D lifting network. In addition, we use the confidences of the 2D joints to weight the losses from different views, alleviating the influence of noise caused by self-occlusion. Finally, we design a two-branch training architecture that helps preserve the scale information of the re-projected 2D poses during training, resulting in accurate 3D pose predictions. We demonstrate the effectiveness of our method on two popular 3D human pose datasets, Human3.6M and MPI-INF-3DHP. The results show that our method significantly outperforms recent weakly/self-supervised approaches.
UR - http://www.scopus.com/inward/record.url?scp=85095267609&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85095267609
T3 - AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
SP - 11442
EP - 11449
BT - AAAI 2020 - 34th AAAI Conference on Artificial Intelligence
PB - AAAI Press
T2 - 34th AAAI Conference on Artificial Intelligence, AAAI 2020
Y2 - 7 February 2020 through 12 February 2020
ER -