UniHead: Unifying Multi-Perception for Detection Heads

Hantao Zhou, Rui Yang, Yachao Zhang, Haoran Duan, Yawen Huang, Runze Hu, Xiu Li, Yefeng Zheng

Research output: Contribution to journal › Article › peer-review

1 citation (Scopus)

Abstract

The detection head constitutes a pivotal component within object detectors, tasked with executing both classification and localization functions. Regrettably, the commonly used parallel head often lacks omni perceptual capabilities, such as deformation perception (DP), global perception (GP), and cross-task perception (CTP). Despite numerous methods attempting to enhance these abilities from a single aspect, achieving a comprehensive and unified solution remains a significant challenge. In response to this challenge, we develop an innovative detection head, termed UniHead, to unify three perceptual abilities simultaneously. More precisely, our approach: 1) introduces DP, enabling the model to adaptively sample object features; 2) proposes a dual-axial aggregation transformer (DAT) to adeptly model long-range dependencies, thereby achieving GP; and 3) devises a cross-task interaction transformer (CIT) that facilitates interaction between the classification and localization branches, thus aligning the two tasks. As a plug-and-play method, the proposed UniHead can be conveniently integrated with existing detectors. Extensive experiments on the COCO dataset demonstrate that our UniHead can bring significant improvements to many detectors. For instance, the UniHead can obtain +2.7 AP gains in RetinaNet, +2.9 AP gains in FreeAnchor, and +2.1 AP gains in GFL. The code is available at https://github.com/zht8506/UniHead.
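The abstract states that the DAT achieves global perception by modeling long-range dependencies along two spatial axes. The paper's exact layer design is not given here, so the following is only a minimal NumPy sketch of generic dual-axial (height-then-width) self-attention, the family of mechanism the DAT name suggests; all function names and the identity Q/K/V projections are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x, axis):
    # x: (H, W, C). Self-attention restricted to one spatial axis,
    # so each position attends only along its own row or column.
    if axis == 0:                       # attend along H: process columns
        x = x.transpose(1, 0, 2)        # (W, H, C)
    q = k = v = x                       # identity projections (sketch only)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(x.shape[-1]))
    out = attn @ v
    if axis == 0:
        out = out.transpose(1, 0, 2)    # back to (H, W, C)
    return out

def dual_axial_block(x):
    # Height-axis attention followed by width-axis attention gives every
    # position an indirect global receptive field at O(HW(H+W)) cost,
    # versus O((HW)^2) for full 2-D self-attention.
    return axial_attention(axial_attention(x, axis=0), axis=1)

feat = np.random.randn(8, 8, 16)        # a toy feature map
out = dual_axial_block(feat)
assert out.shape == feat.shape
```

The cost argument is the usual motivation for axial designs: two 1-D attentions connect any pair of positions through at most one intermediate location, while keeping attention matrices of size H×H and W×W instead of HW×HW.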

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: IEEE Transactions on Neural Networks and Learning Systems
DOI
Publication status: Accepted/In press - 2024
