Deep Interactive Full Transformer Framework for Point Cloud Registration

Guangyan Chen; Meiling Wang; Qingxiang Zhang; Li Yuan; Tong Liu; Yufeng Yue

doi:10.1109/ICRA48891.2023.10160863

Guangyan Chen, Meiling Wang, Qingxiang Zhang, Li Yuan, Tong Liu, Yufeng Yue*

*Corresponding author for this work

School of Automation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Point cloud registration is a crucial technology in robotics and computer vision. Despite the significant advances enabled by Transformer-based methods, limitations persist in feature distinctiveness, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) in capturing global relationships because of their local receptive fields, which leaves the extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, which leads to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility, which leads to ambiguous identification of incorrect correspondences. To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on the ModelNet40 and 3DMatch datasets demonstrate the superior performance of the proposed method compared with existing state-of-the-art methods. The code is publicly available at https://github.com/CGuangyan-BIT/DIFT.
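The idea of spatial consistency mentioned in the abstract can be illustrated with a generic check: a rigid transformation preserves pairwise distances, so two correct correspondences should keep the same distance in both point clouds. The sketch below is not the GMCCE method from the paper. It is a minimal, assumption-laden illustration in which the function name `spatial_consistency_scores`, the tolerance `tol`, and the scoring rule (fraction of compatible partners) are all hypothetical choices made for this example.

```python
import math


def spatial_consistency_scores(src, dst, tol=0.05):
    """Score each putative correspondence (src[i], dst[i]).

    The score is the fraction of the other correspondences whose pairwise
    distance is preserved within `tol`. This is a generic illustration,
    not the scoring used by the paper.
    """
    n = len(src)
    if n != len(dst):
        raise ValueError("src and dst must have the same length")
    if n < 2:
        return [0.0] * n

    scores = []
    for i in range(n):
        compatible = 0
        for j in range(n):
            if i == j:
                continue
            # A rigid motion keeps |p_i - p_j| equal to |q_i - q_j|.
            d_src = math.dist(src[i], src[j])
            d_dst = math.dist(dst[i], dst[j])
            if abs(d_src - d_dst) <= tol:
                compatible += 1
        scores.append(compatible / (n - 1))
    return scores


# Three correspondences follow a pure translation by (2, 0, 0);
# the fourth is a deliberately wrong match.
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
dst = [(2, 0, 0), (3, 0, 0), (2, 1, 0), (5, 5, 5)]
scores = spatial_consistency_scores(src, dst)
```

In this toy input the wrong match agrees with none of the others, while each correct match agrees with the other two correct ones, so low scores flag likely outliers. The loop is quadratic in the number of correspondences, which is acceptable for a sketch but would usually be vectorized in practice.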

Original language	English
Title of host publication	Proceedings - ICRA 2023
Subtitle of host publication	IEEE International Conference on Robotics and Automation
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	2825-2832
Number of pages	8
ISBN (Electronic)	9798350323658
DOIs	https://doi.org/10.1109/ICRA48891.2023.10160863
Publication status	Published - 2023
Event	2023 IEEE International Conference on Robotics and Automation, ICRA 2023 - London, United Kingdom Duration: 29 May 2023 → 2 Jun 2023

Publication series

Name	Proceedings - IEEE International Conference on Robotics and Automation
Volume	2023-May
ISSN (Print)	1050-4729

Conference

Conference	2023 IEEE International Conference on Robotics and Automation, ICRA 2023
Country/Territory	United Kingdom
City	London
Period	29/05/23 → 2/06/23

Access to Document

10.1109/ICRA48891.2023.10160863

Cite this

Chen, G., Wang, M., Zhang, Q., Yuan, L., Liu, T., & Yue, Y. (2023). Deep Interactive Full Transformer Framework for Point Cloud Registration. In Proceedings - ICRA 2023: IEEE International Conference on Robotics and Automation (pp. 2825-2832). (Proceedings - IEEE International Conference on Robotics and Automation; Vol. 2023-May). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICRA48891.2023.10160863

@inproceedings{0aea09ac6f7e4e1e8d212dd538ab4178,

title = "Deep Interactive Full Transformer Framework for Point Cloud Registration",

abstract = "Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to capture global relationships due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, leading to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility leads to ambiguous identification of incorrect correspondences. To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on ModelNet40 and 3DMatch datasets demonstrate the superior performance of our proposed method compared to existing state-of-the-art methods. The code for our method is publicly available at https://github.com/CGuangyan-BIT/DIFT.",

author = "Guangyan Chen and Meiling Wang and Qingxiang Zhang and Li Yuan and Tong Liu and Yufeng Yue",

note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 2023 IEEE International Conference on Robotics and Automation, ICRA 2023 ; Conference date: 29-05-2023 Through 02-06-2023",

year = "2023",

doi = "10.1109/ICRA48891.2023.10160863",

language = "English",

series = "Proceedings - IEEE International Conference on Robotics and Automation",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "2825--2832",

booktitle = "Proceedings - ICRA 2023",

address = "United States",

}

Chen, G, Wang, M, Zhang, Q, Yuan, L, Liu, T & Yue, Y 2023, Deep Interactive Full Transformer Framework for Point Cloud Registration. in Proceedings - ICRA 2023: IEEE International Conference on Robotics and Automation. Proceedings - IEEE International Conference on Robotics and Automation, vol. 2023-May, Institute of Electrical and Electronics Engineers Inc., pp. 2825-2832, 2023 IEEE International Conference on Robotics and Automation, ICRA 2023, London, United Kingdom, 29/05/23. https://doi.org/10.1109/ICRA48891.2023.10160863

Deep Interactive Full Transformer Framework for Point Cloud Registration. / Chen, Guangyan; Wang, Meiling; Zhang, Qingxiang et al.
Proceedings - ICRA 2023: IEEE International Conference on Robotics and Automation. Institute of Electrical and Electronics Engineers Inc., 2023. p. 2825-2832 (Proceedings - IEEE International Conference on Robotics and Automation; Vol. 2023-May).

TY - GEN

T1 - Deep Interactive Full Transformer Framework for Point Cloud Registration

AU - Chen, Guangyan

AU - Wang, Meiling

AU - Zhang, Qingxiang

AU - Yuan, Li

AU - Liu, Tong

AU - Yue, Yufeng

PY - 2023

Y1 - 2023

N2 - Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to capture global relationships due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, leading to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility leads to ambiguous identification of incorrect correspondences. To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on ModelNet40 and 3DMatch datasets demonstrate the superior performance of our proposed method compared to existing state-of-the-art methods. The code for our method is publicly available at https://github.com/CGuangyan-BIT/DIFT.

UR - http://www.scopus.com/inward/record.url?scp=85168655038&partnerID=8YFLogxK

U2 - 10.1109/ICRA48891.2023.10160863

DO - 10.1109/ICRA48891.2023.10160863

M3 - Conference contribution

AN - SCOPUS:85168655038

T3 - Proceedings - IEEE International Conference on Robotics and Automation

SP - 2825

EP - 2832

BT - Proceedings - ICRA 2023

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2023 IEEE International Conference on Robotics and Automation, ICRA 2023

Y2 - 29 May 2023 through 2 June 2023

ER -

Chen G, Wang M, Zhang Q, Yuan L, Liu T, Yue Y. Deep Interactive Full Transformer Framework for Point Cloud Registration. In Proceedings - ICRA 2023: IEEE International Conference on Robotics and Automation. Institute of Electrical and Electronics Engineers Inc. 2023. p. 2825-2832. (Proceedings - IEEE International Conference on Robotics and Automation). doi: 10.1109/ICRA48891.2023.10160863
