Abstract
Category-level object pose estimation is an important task in computer vision. Existing methods that rely on fixed assumptions often struggle with drastic changes in object appearance. To address this challenge, we propose a new method for object pose estimation based on object-adaptive keypoints. In this paper, we first introduce a transformer-based keypoint prediction method that adaptively predicts keypoints from the point cloud. This method computes the similarity between keypoint features and point cloud features, allowing the keypoints to represent object geometry more effectively. Furthermore, to enhance the geometric feature construction of keypoints, we propose a graph-based keypoint feature aggregation method that considers the structural relationships between the keypoints and the point cloud, strengthening the network's understanding of geometric structure. At this stage, the keypoints still lie in the object's observed geometric space and have not yet been predicted in the normalized object coordinate space (NOCS). To improve the accuracy of keypoint prediction in NOCS, we design a NOCS voxelization method that divides NOCS into multiple voxels and accurately predicts the NOCS keypoints within these voxels. Experimental results on multiple benchmark datasets demonstrate that our proposed KeyPose method outperforms all existing methods, achieving over 20% improvement in pose accuracy on several key datasets.
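To make the first idea concrete, the sketch below shows one plausible way to compute similarity between keypoint features and point cloud features so that keypoints adapt to the observed geometry: learnable keypoint queries cross-attend to per-point features, and the attention weights pool 3D coordinates into keypoint locations. This is a minimal illustration under assumed design choices (class name, feature dimension, number of keypoints, use of `nn.MultiheadAttention`), not the authors' implementation.

```python
# Minimal sketch of similarity-based, object-adaptive keypoint prediction.
# All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn


class AdaptiveKeypointPredictor(nn.Module):
    """Predict object-adaptive keypoints from point cloud features.

    Learnable keypoint queries are compared against per-point features;
    the resulting similarity weights pool point coordinates so that each
    keypoint adapts to the observed object geometry.
    """

    def __init__(self, feat_dim: int = 128, num_keypoints: int = 16):
        super().__init__()
        # Learnable keypoint feature queries (assumed design choice).
        self.keypoint_queries = nn.Parameter(torch.randn(num_keypoints, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, point_feats: torch.Tensor, points: torch.Tensor):
        # point_feats: (B, N, C) per-point features; points: (B, N, 3) coordinates.
        B = point_feats.shape[0]
        queries = self.keypoint_queries.unsqueeze(0).expand(B, -1, -1)  # (B, K, C)

        # Cross-attention: keypoint queries attend to point cloud features.
        kp_feats, attn_weights = self.attn(queries, point_feats, point_feats)

        # Similarity weights (B, K, N) pool 3D coordinates into keypoints (B, K, 3).
        keypoints = torch.einsum('bkn,bnc->bkc', attn_weights, points)
        return keypoints, kp_feats


if __name__ == "__main__":
    model = AdaptiveKeypointPredictor()
    feats = torch.randn(2, 1024, 128)   # dummy per-point features
    pts = torch.randn(2, 1024, 3)       # dummy point cloud
    kps, kp_feats = model(feats, pts)
    print(kps.shape, kp_feats.shape)    # torch.Size([2, 16, 3]) torch.Size([2, 16, 128])
```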
Original language | English |
---|---|
Pages (from-to) | 9653-9661 |
Number of pages | 9 |
Journal | Proceedings of the AAAI Conference on Artificial Intelligence |
Volume | 39 |
Issue number | 9 |
DOIs | |
Publication status | Published - 11 Apr 2025 |
Externally published | Yes |
Event | 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States. Duration: 25 Feb 2025 → 4 Mar 2025 |