LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for Place Recognition

Zijie Zhou; Jingyi Xu; Guangming Xiong; Junyi Ma

doi:10.1109/LRA.2023.3346753

Corresponding author: Junyi Ma

School of Mechanical and Vehicle Engineering

Research output: Contribution to journal › Article › Peer-reviewed

2 citations (Scopus)

Abstract

Place recognition is one of the most crucial modules for autonomous vehicles to identify places that were previously visited in GPS-invalid environments. Sensor fusion is considered an effective method to overcome the weaknesses of individual sensors. In recent years, multimodal place recognition fusing information from multiple sensors has gathered increasing attention. However, most existing multimodal place recognition methods only use limited field-of-view camera images, which leads to an imbalance between features from different modalities and limits the effectiveness of sensor fusion. In this letter, we present a novel neural network named LCPR for robust multimodal place recognition, which fuses LiDAR point clouds with multi-view RGB images to generate discriminative and yaw-rotation invariant representations of the environment. A multi-scale attention-based fusion module is proposed to fully exploit the panoramic views from different modalities of the environment and their correlations. We evaluate our method on the nuScenes dataset, and the experimental results show that our method can effectively utilize multi-view camera and LiDAR data to improve the place recognition performance while maintaining strong robustness to viewpoint changes.

Original language	English
Pages (from-to)	1342-1349
Number of pages	8
Journal	IEEE Robotics and Automation Letters
Volume	9
Issue	2
DOI	https://doi.org/10.1109/LRA.2023.3346753
Publication status	Published - 1 Feb 2024

Cite this

@article{a1e712e83d1149ac9381cfe741cf673a,

title = "LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for Place Recognition",

abstract = "Place recognition is one of the most crucial modules for autonomous vehicles to identify places that were previously visited in GPS-invalid environments. Sensor fusion is considered an effective method to overcome the weaknesses of individual sensors. In recent years, multimodal place recognition fusing information from multiple sensors has gathered increasing attention. However, most existing multimodal place recognition methods only use limited field-of-view camera images, which leads to an imbalance between features from different modalities and limits the effectiveness of sensor fusion. In this letter, we present a novel neural network named LCPR for robust multimodal place recognition, which fuses LiDAR point clouds with multi-view RGB images to generate discriminative and yaw-rotation invariant representations of the environment. A multi-scale attention-based fusion module is proposed to fully exploit the panoramic views from different modalities of the environment and their correlations. We evaluate our method on the nuScenes dataset, and the experimental results show that our method can effectively utilize multi-view camera and LiDAR data to improve the place recognition performance while maintaining strong robustness to viewpoint changes.",

keywords = "Place recognition, SLAM, deep learning, sensor fusion",

author = "Zijie Zhou and Jingyi Xu and Guangming Xiong and Junyi Ma",

note = "Publisher Copyright: {\textcopyright} 2016 IEEE.",

year = "2024",

month = feb,

day = "1",

doi = "10.1109/LRA.2023.3346753",

language = "English",

volume = "9",

pages = "1342--1349",

journal = "IEEE Robotics and Automation Letters",

issn = "2377-3766",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "2",

}

TY - JOUR

T1 - LCPR: A Multi-Scale Attention-Based LiDAR-Camera Fusion Network for Place Recognition

AU - Zhou, Zijie

AU - Xu, Jingyi

AU - Xiong, Guangming

AU - Ma, Junyi

PY - 2024/2/1

Y1 - 2024/2/1

AB - Place recognition is one of the most crucial modules for autonomous vehicles to identify places that were previously visited in GPS-invalid environments. Sensor fusion is considered an effective method to overcome the weaknesses of individual sensors. In recent years, multimodal place recognition fusing information from multiple sensors has gathered increasing attention. However, most existing multimodal place recognition methods only use limited field-of-view camera images, which leads to an imbalance between features from different modalities and limits the effectiveness of sensor fusion. In this letter, we present a novel neural network named LCPR for robust multimodal place recognition, which fuses LiDAR point clouds with multi-view RGB images to generate discriminative and yaw-rotation invariant representations of the environment. A multi-scale attention-based fusion module is proposed to fully exploit the panoramic views from different modalities of the environment and their correlations. We evaluate our method on the nuScenes dataset, and the experimental results show that our method can effectively utilize multi-view camera and LiDAR data to improve the place recognition performance while maintaining strong robustness to viewpoint changes.

KW - Place recognition

KW - SLAM

KW - deep learning

KW - sensor fusion

UR - http://www.scopus.com/inward/record.url?scp=85181569033&partnerID=8YFLogxK

U2 - 10.1109/LRA.2023.3346753

DO - 10.1109/LRA.2023.3346753

M3 - Article

AN - SCOPUS:85181569033

SN - 2377-3766

VL - 9

SP - 1342

EP - 1349

JO - IEEE Robotics and Automation Letters

JF - IEEE Robotics and Automation Letters

IS - 2

ER -
