Abstract
RGB images and point clouds provide texture and geometric structure, respectively, and both are widely used for head pose estimation. However, images lack spatial information, and point cloud quality is easily degraded by sensor noise. In this paper, a novel fusion-competition framework (FCF) is proposed to overcome the limitations of a single modality. Global texture information is extracted from the image and local topology information from the point cloud, projecting the heterogeneous data into a common feature subspace. The projected texture feature, weighted by a channel attention mechanism, is embedded into each local point cloud region with distinct topological features for fusion. A scoring mechanism then creates competition among the regions holding the local-global fused features, and the final pose is predicted from the region with the highest score. According to evaluation results on public datasets and our constructed dataset, the FCF improves estimation accuracy and stability by an average of 13.6% and 12.7%, respectively, compared with nine state-of-the-art methods.
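The fusion-competition idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the sigmoid gate standing in for the channel attention module, the concatenation used for fusion, and the linear scoring and pose heads are all hypothetical simplifications of the learned components in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(texture, w_att):
    # Sigmoid gate per channel: a hypothetical stand-in for the paper's
    # learned channel attention mechanism.
    gate = 1.0 / (1.0 + np.exp(-(w_att @ texture)))
    return texture * gate

def fuse_and_compete(texture, regions, w_att, w_score, pose_heads):
    # Weight the global texture feature, then embed it into every local
    # point cloud region by concatenation (fusion step).
    t = channel_attention(texture, w_att)
    fused = [np.concatenate([t, r]) for r in regions]
    # Competition step: score each fused region and let the highest-scoring
    # region predict the final pose (here via a linear head per region).
    scores = np.array([w_score @ f for f in fused])
    best = int(np.argmax(scores))
    return pose_heads[best] @ fused[best], best

# Toy dimensions: 8-channel texture feature, 5 regions with 4-d topology
# features, 3-d pose output (e.g. yaw, pitch, roll). All weights random.
texture = rng.normal(size=8)
regions = [rng.normal(size=4) for _ in range(5)]
w_att = rng.normal(size=(8, 8))
w_score = rng.normal(size=12)
pose_heads = [rng.normal(size=(3, 12)) for _ in range(5)]

pose, winner = fuse_and_compete(texture, regions, w_att, w_score, pose_heads)
```

In the actual framework these weights would be learned end to end; the sketch only shows how a single texture feature is shared across all regions while each region competes to produce the final prediction.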
Original language | English |
---|---|
Article number | 110285 |
Journal | Pattern Recognition |
Volume | 149 |
DOIs | |
Publication status | Published - May 2024 |
Keywords
- Feature channel attention
- Feature fusion
- Head pose estimation
- Local regions competition
- Point cloud
- RGB image