TY - GEN
T1 - Deep Interactive Full Transformer Framework for Point Cloud Registration
AU - Chen, Guangyan
AU - Wang, Meiling
AU - Zhang, Qingxiang
AU - Yuan, Li
AU - Liu, Tong
AU - Yue, Yufeng
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to capture global relationships due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, leading to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility leads to ambiguous identification of incorrect correspondences. To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on ModelNet40 and 3DMatch datasets demonstrate the superior performance of our proposed method compared to existing state-of-the-art methods. The code for our method is publicly available at https://github.com/CGuangyan-BIT/DIFT.
AB - Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to capture global relationships due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, leading to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility leads to ambiguous identification of incorrect correspondences. To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on ModelNet40 and 3DMatch datasets demonstrate the superior performance of our proposed method compared to existing state-of-the-art methods. The code for our method is publicly available at https://github.com/CGuangyan-BIT/DIFT.
UR - http://www.scopus.com/inward/record.url?scp=85168655038&partnerID=8YFLogxK
U2 - 10.1109/ICRA48891.2023.10160863
DO - 10.1109/ICRA48891.2023.10160863
M3 - Conference contribution
AN - SCOPUS:85168655038
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 2825
EP - 2832
BT - Proceedings - ICRA 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE International Conference on Robotics and Automation, ICRA 2023
Y2 - 29 May 2023 through 2 June 2023
ER -