TY - GEN
T1 - Core
T2 - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
AU - Wang, Binglu
AU - Zhang, Lei
AU - Wang, Zhaozhong
AU - Zhao, Yongqiang
AU - Zhou, Tianfei
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - This paper presents Core, a conceptually simple, effective and communication-efficient model for multi-agent cooperative perception. It addresses the task from a novel perspective of cooperative reconstruction, based on two key insights: 1) cooperating agents together provide a more holistic observation of the environment, and 2) the holistic observation can serve as valuable supervision to explicitly guide the model learning how to reconstruct the ideal observation based on collaboration. Core instantiates the idea with three major components: a compressor for each agent to create more compact feature representation for efficient broadcasting, a lightweight attentive collaboration component for cross-agent message aggregation, and a reconstruction module to reconstruct the observation based on aggregated feature representations. This learning-to-reconstruct idea is task-agnostic, and offers clear and reasonable supervision to inspire more effective collaboration, eventually promoting perception tasks. We validate Core on two large-scale multi-agent perception datasets, OPV2V and V2X-Sim, in two tasks, i.e., 3D object detection and semantic segmentation. Results demonstrate that Core achieves state-of-the-art performance, and is more communication-efficient.
AB - This paper presents Core, a conceptually simple, effective and communication-efficient model for multi-agent cooperative perception. It addresses the task from a novel perspective of cooperative reconstruction, based on two key insights: 1) cooperating agents together provide a more holistic observation of the environment, and 2) the holistic observation can serve as valuable supervision to explicitly guide the model learning how to reconstruct the ideal observation based on collaboration. Core instantiates the idea with three major components: a compressor for each agent to create more compact feature representation for efficient broadcasting, a lightweight attentive collaboration component for cross-agent message aggregation, and a reconstruction module to reconstruct the observation based on aggregated feature representations. This learning-to-reconstruct idea is task-agnostic, and offers clear and reasonable supervision to inspire more effective collaboration, eventually promoting perception tasks. We validate Core on two large-scale multi-agent perception datasets, OPV2V and V2X-Sim, in two tasks, i.e., 3D object detection and semantic segmentation. Results demonstrate that Core achieves state-of-the-art performance, and is more communication-efficient.
UR - http://www.scopus.com/inward/record.url?scp=85172348106&partnerID=8YFLogxK
U2 - 10.1109/ICCV51070.2023.00800
DO - 10.1109/ICCV51070.2023.00800
M3 - Conference contribution
AN - SCOPUS:85172348106
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 8676
EP - 8686
BT - Proceedings - 2023 IEEE/CVF International Conference on Computer Vision, ICCV 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 October 2023 through 6 October 2023
ER -