Abstract
Autonomous driving requires a structured understanding of the surrounding road maps and networks to navigate. However, given the flexibility of autonomous vehicles and the variations in lane curvature and shape, the online and accurate extraction of road maps with fine-grained boundaries and road networks with lane topology in a unified framework remains challenging. This paper proposes SRSU, an online road map detection and network estimation framework for structured bird's-eye view road scene understanding. Specifically, we introduce a hierarchical map representation, i.e., representing the road map as a set of ordered point sets with equivalent permutations and the road network as a directed graph, which accurately describes the fine-grained map boundaries and lane topology in a unified framework. Building upon this representation, we propose an online hierarchical map construction framework. It uses two sets of learnable hierarchical query embeddings to extract road maps with fine-grained boundaries and road networks with lane topologies, achieving a comprehensive understanding of the road scene. Furthermore, we introduce three empirical modules to enhance the accuracy of hierarchical map construction: auxiliary task prediction, multi-modal distillation, and higher-order interaction. Respectively, they enhance the model's representational capabilities and provide valuable auxiliary information for subsequent tasks, generate robust features for the final tasks, and learn the association information between different tasks. Finally, experiments on the nuScenes dataset demonstrate the proposed framework's effectiveness and highlight the superiority of the empirical modules. Code will be available at https://github.com/jiapeng789/SRSU.
Original language | English |
---|---|
Pages (from-to) | 1-13 |
Number of pages | 13 |
Journal | IEEE Transactions on Intelligent Vehicles |
DOI | |
Publication status | Accepted/In press - 2024 |