Two-Stage Representation Refinement Based on Convex Combination for 3-D Human Poses Estimation

Luefeng Chen, Wei Cao, Biao Zheng, Min Wu*, Witold Pedrycz, Kaoru Hirota

*Corresponding author for this work

Research output: Contribution to journalArticlepeer-review

Abstract

In the human pose estimation task, on the one hand, 3-D pose always has difficulty in dividing different 2-D poses if the view is limited; on the other hand, it is hard to reduce the lifting ambiguity because of the lack of depth information, it is an important and challenging problem. Therefore, two-stage representation refinement based on the convex combination for 3-D human pose estimation is proposed, in which the two-stage method includes a dense-spatial-temporal convolutional network and a local-to-refine network. The former is applied to determine the features between each video frame; the latter is used to get the different scales of pose details. It aims to address the difficulty of estimating 3-D human pose from 2-D image sequences. In such a way, it can better use the relations between every frame in the sequence of the pose video to produce more accurate results. Finally, we combine the above network with a block called convex combination to help refine the 3-D pose location. We test the proposed approach on both Human3.6m and MPII datasets. The result confirms that our method can achieve better performance than improved CNN supervision, a simple yet effective baseline, and coarse-to-fine volumetric prediction. Besides, a robustness test experiment is carried out for the proposed method while the input is interrupted. The result verifies that our method shows better robustness.

Original languageEnglish
Pages (from-to)6500-6508
Number of pages9
JournalIEEE Transactions on Artificial Intelligence
Volume5
Issue number12
DOIs
Publication statusPublished - 2024
Externally publishedYes

Keywords

  • 3-D pose estimation
  • kinematic constraints
  • pose sequence

Fingerprint

Dive into the research topics of 'Two-Stage Representation Refinement Based on Convex Combination for 3-D Human Poses Estimation'. Together they form a unique fingerprint.

Cite this