GCVNet: Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories

Yaohang Han; Huijun Di; Hanfeng Zheng; Jianyong Qi; Jianwei Gong

doi:10.1007/978-3-030-60633-6_15

Yaohang Han, Huijun Di^*, Hanfeng Zheng, Jianyong Qi, Jianwei Gong

^*Corresponding author for this work

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

3 Citations (Scopus)

Abstract

As a fundamental AI problem, monocular 3D pose estimation has received much attention. This paper addresses the challenge of estimating full perspective model parameters, including object pose and camera intrinsics, from a single 2D image of fine-grained object categories. To tackle this highly ill-posed problem, we propose a Geometry Constrained Voting Network (GCVNet). It is a unified end-to-end network consisting of four synergistic task-specific subnetworks: 1) a fine-grained classification subnetwork, offering fine-grained 3D shape priors; 2) a voting subnetwork, generating 2D measurements; 3) a segmentation subnetwork, providing a foreground mask for voting; and 4) a PnP subnetwork, estimating the perspective parameters via explicit geometric reasoning. The PnP subnetwork also constrains the classification subnetwork to provide proper 3D priors and the voting subnetwork to generate a group of geometrically consistent 2D measurements, rather than voting independently for each 2D measurement as in the literature. Experiments on challenging datasets demonstrate the superior performance of GCVNet.
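The abstract does not give the formulation used by the voting and PnP subnetworks, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a standard pinhole camera with a single focal length `f` and principal point `(cx, cy)`, averages per-pixel 2D votes over a foreground mask, and measures how well a candidate pose and focal length explain the resulting 2D measurements. All function names (`project`, `masked_vote`, `reprojection_error`) are hypothetical.

```python
import math

def project(point3d, R, t, f, cx, cy):
    """Pinhole projection of one 3D model point (assumed camera model).

    R is a 3x3 rotation (list of rows), t a translation, f the focal
    length in pixels, and (cx, cy) the principal point.
    """
    X = [sum(R[i][j] * point3d[j] for j in range(3)) + t[i] for i in range(3)]
    return (f * X[0] / X[2] + cx, f * X[1] / X[2] + cy)

def masked_vote(votes, mask):
    """Average the 2D keypoint votes cast by foreground pixels only."""
    kept = [v for v, m in zip(votes, mask) if m]
    n = len(kept)
    return (sum(v[0] for v in kept) / n, sum(v[1] for v in kept) / n)

def reprojection_error(points3d, points2d, R, t, f, cx, cy):
    """Mean pixel distance between projected 3D points and 2D measurements."""
    total = 0.0
    for P, p in zip(points3d, points2d):
        u, v = project(P, R, t, f, cx, cy)
        total += math.hypot(u - p[0], v - p[1])
    return total / len(points3d)

# Illustrative values only: identity rotation, object 5 units in front of the camera.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(project((1, 0, 0), identity, (0, 0, 5), 100, 50, 50))  # (70.0, 50.0)
```

A PnP solver would search for the `R`, `t`, and `f` that minimize an error of this kind. Because the error depends jointly on every 2D measurement and on the 3D points, backpropagating it through a differentiable solver is one way the measurements could be pushed toward a geometrically consistent set rather than being estimated independently.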

Original language	English
Title of host publication	Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings
Editors	Yuxin Peng, Hongbin Zha, Qingshan Liu, Huchuan Lu, Zhenan Sun, Chenglin Liu, Xilin Chen, Jian Yang
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	180-192
Number of pages	13
ISBN (Print)	9783030606329
DOI	https://doi.org/10.1007/978-3-030-60633-6_15
Publication status	Published - 2020
Event	3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020 - Nanjing, China. Duration: 16 Oct 2020 → 18 Oct 2020

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	12305 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020
Country/Territory	China
City	Nanjing
Period	16/10/20 → 18/10/20

Access to Document

10.1007/978-3-030-60633-6_15

Cite this

Han, Y., Di, H., Zheng, H., Qi, J., & Gong, J. (2020). GCVNet: Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories. In Y. Peng, H. Zha, Q. Liu, H. Lu, Z. Sun, C. Liu, X. Chen, & J. Yang (Eds.), Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings (pp. 180-192). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12305 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60633-6_15

Han, Yaohang ; Di, Huijun ; Zheng, Hanfeng et al. / GCVNet : Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories. Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings. editor / Yuxin Peng ; Hongbin Zha ; Qingshan Liu ; Huchuan Lu ; Zhenan Sun ; Chenglin Liu ; Xilin Chen ; Jian Yang. Springer Science and Business Media Deutschland GmbH, 2020. pp. 180-192 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{3a5d0e5375e849de86790b8dacf3d065,

title = "GCVNet: Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories",

abstract = "As a fundamental AI problem, monocular 3D pose estimation has received much attention. This paper addresses the challenge of estimating full perspective model parameters, including object pose and camera intrinsics, from a single 2D image of fine-grained object categories. To tackle this highly ill-posed problem, we propose a Geometry Constrained Voting Network (GCVNet). It is a unified end-to-end network consisting of four synergic task-specific subnetworks: 1) Fine-grained classification subnetwork, offering fine-grained 3D shape priors. 2) Voting subnetwork, generating 2D measurements. 3) Segmentation subnetwork, providing a foreground mask for voting. 4) PnP subnetwork, estimating the perspective parameters via explicit geometric reasoning, as well as constraining the classification subnetwork to provide proper 3D priors and the voting subnetwork to generate a group of geometric consistent 2D measurements, rather than independent voting for each 2D measurement in the literature. Experiments on challenging datasets demonstrate the superior performance of GCVNet.",

keywords = "Differentiable PnP, Geometric reasoning, Pose estimation",

author = "Yaohang Han and Huijun Di and Hanfeng Zheng and Jianyong Qi and Jianwei Gong",

note = "Publisher Copyright: {\textcopyright} 2020, Springer Nature Switzerland AG.; 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020 ; Conference date: 16-10-2020 Through 18-10-2020",

year = "2020",

doi = "10.1007/978-3-030-60633-6_15",

language = "English",

isbn = "9783030606329",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "180--192",

editor = "Yuxin Peng and Hongbin Zha and Qingshan Liu and Huchuan Lu and Zhenan Sun and Chenglin Liu and Xilin Chen and Jian Yang",

booktitle = "Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings",

address = "Germany",

}

Han, Y, Di, H, Zheng, H, Qi, J & Gong, J 2020, GCVNet: Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories. in Y Peng, H Zha, Q Liu, H Lu, Z Sun, C Liu, X Chen & J Yang (eds), Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12305 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 180-192, 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020, Nanjing, China, 16/10/20. https://doi.org/10.1007/978-3-030-60633-6_15

GCVNet: Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories. / Han, Yaohang; Di, Huijun; Zheng, Hanfeng et al.
Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings. editor / Yuxin Peng; Hongbin Zha; Qingshan Liu; Huchuan Lu; Zhenan Sun; Chenglin Liu; Xilin Chen; Jian Yang. Springer Science and Business Media Deutschland GmbH, 2020. pp. 180-192 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 12305 LNCS).

TY - GEN

T1 - GCVNet

T2 - 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020

AU - Han, Yaohang

AU - Di, Huijun

AU - Zheng, Hanfeng

AU - Qi, Jianyong

AU - Gong, Jianwei

PY - 2020

Y1 - 2020

N2 - As a fundamental AI problem, monocular 3D pose estimation has received much attention. This paper addresses the challenge of estimating full perspective model parameters, including object pose and camera intrinsics, from a single 2D image of fine-grained object categories. To tackle this highly ill-posed problem, we propose a Geometry Constrained Voting Network (GCVNet). It is a unified end-to-end network consisting of four synergic task-specific subnetworks: 1) Fine-grained classification subnetwork, offering fine-grained 3D shape priors. 2) Voting subnetwork, generating 2D measurements. 3) Segmentation subnetwork, providing a foreground mask for voting. 4) PnP subnetwork, estimating the perspective parameters via explicit geometric reasoning, as well as constraining the classification subnetwork to provide proper 3D priors and the voting subnetwork to generate a group of geometric consistent 2D measurements, rather than independent voting for each 2D measurement in the literature. Experiments on challenging datasets demonstrate the superior performance of GCVNet.

AB - As a fundamental AI problem, monocular 3D pose estimation has received much attention. This paper addresses the challenge of estimating full perspective model parameters, including object pose and camera intrinsics, from a single 2D image of fine-grained object categories. To tackle this highly ill-posed problem, we propose a Geometry Constrained Voting Network (GCVNet). It is a unified end-to-end network consisting of four synergic task-specific subnetworks: 1) Fine-grained classification subnetwork, offering fine-grained 3D shape priors. 2) Voting subnetwork, generating 2D measurements. 3) Segmentation subnetwork, providing a foreground mask for voting. 4) PnP subnetwork, estimating the perspective parameters via explicit geometric reasoning, as well as constraining the classification subnetwork to provide proper 3D priors and the voting subnetwork to generate a group of geometric consistent 2D measurements, rather than independent voting for each 2D measurement in the literature. Experiments on challenging datasets demonstrate the superior performance of GCVNet.

KW - Differentiable PnP

KW - Geometric reasoning

KW - Pose estimation

UR - http://www.scopus.com/inward/record.url?scp=85093824324&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-60633-6_15

DO - 10.1007/978-3-030-60633-6_15

M3 - Conference contribution

AN - SCOPUS:85093824324

SN - 9783030606329

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 180

EP - 192

BT - Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings

A2 - Peng, Yuxin

A2 - Zha, Hongbin

A2 - Liu, Qingshan

A2 - Lu, Huchuan

A2 - Sun, Zhenan

A2 - Liu, Chenglin

A2 - Chen, Xilin

A2 - Yang, Jian

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 16 October 2020 through 18 October 2020

ER -

Han Y, Di H, Zheng H, Qi J, Gong J. GCVNet: Geometry Constrained Voting Network to Estimate 3D Pose for Fine-Grained Object Categories. In Peng Y, Zha H, Liu Q, Lu H, Sun Z, Liu C, Chen X, Yang J, editors, Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings. Springer Science and Business Media Deutschland GmbH. 2020. p. 180-192. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-60633-6_15
