TY - JOUR
T1 - Real-Time Multi-Modal Active Vision for Object Detection on UAVs Equipped with Limited Field of View LiDAR and Camera
AU - Shi, Chuanbeibei
AU - Lai, Ganghua
AU - Yu, Yushu
AU - Bellone, Mauro
AU - Lippiello, Vincenzo
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023/10/1
Y1 - 2023/10/1
N2 - This letter addresses the challenging problem of multi-modal active vision for object detection on unmanned aerial vehicles (UAVs) equipped with a monocular camera and a limited Field of View (FoV) LiDAR. The point cloud acquired from the low-cost LiDAR is first converted into a 3-channel tensor via motion compensation, accumulation, projection, and up-sampling. The resulting 3-channel point cloud tensor and the RGB image are fused into a 6-channel tensor using an early fusion strategy for object detection based on a Gaussian YOLO network structure. To cope with limited onboard computational resources and improve real-time performance, the velocity information of the UAV is further fused with the detection results using an extended Kalman filter (EKF). A perception-aware model predictive control (MPC) scheme is designed to achieve active vision on our UAV. In our performance evaluation, the proposed pre-processing step reduces the running time of comparable methods from the literature by a factor of 10 while maintaining acceptable detection performance. Furthermore, our fusion architecture reaches 94.6 mAP on the test set, outperforming the individual sensor networks by roughly 5%. We also describe an implementation of the overall algorithm on a UAV platform and validate it in real-world experiments.
AB - This letter addresses the challenging problem of multi-modal active vision for object detection on unmanned aerial vehicles (UAVs) equipped with a monocular camera and a limited Field of View (FoV) LiDAR. The point cloud acquired from the low-cost LiDAR is first converted into a 3-channel tensor via motion compensation, accumulation, projection, and up-sampling. The resulting 3-channel point cloud tensor and the RGB image are fused into a 6-channel tensor using an early fusion strategy for object detection based on a Gaussian YOLO network structure. To cope with limited onboard computational resources and improve real-time performance, the velocity information of the UAV is further fused with the detection results using an extended Kalman filter (EKF). A perception-aware model predictive control (MPC) scheme is designed to achieve active vision on our UAV. In our performance evaluation, the proposed pre-processing step reduces the running time of comparable methods from the literature by a factor of 10 while maintaining acceptable detection performance. Furthermore, our fusion architecture reaches 94.6 mAP on the test set, outperforming the individual sensor networks by roughly 5%. We also describe an implementation of the overall algorithm on a UAV platform and validate it in real-world experiments.
KW - Aerial systems: applications
KW - perception-action coupling
KW - sensor fusion
UR - http://www.scopus.com/inward/record.url?scp=85170560250&partnerID=8YFLogxK
U2 - 10.1109/LRA.2023.3309575
DO - 10.1109/LRA.2023.3309575
M3 - Article
AN - SCOPUS:85170560250
SN - 2377-3766
VL - 8
SP - 6571
EP - 6578
JO - IEEE Robotics and Automation Letters
JF - IEEE Robotics and Automation Letters
IS - 10
ER -