RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

Chunliang Li; Wencheng Han; Junbo Yin; Sanyuan Zhao; Jianbing Shen

doi:10.1007/978-3-031-73411-3_16

Chunliang Li, Wencheng Han, Junbo Yin, Sanyuan Zhao^*, Jianbing Shen

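^*Corresponding author for this work

School of Computer Science

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Peer-reviewed

Abstract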

Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches. This paper addresses these issues by proposing a novel unified representation, RepVF, which harmonizes the representation of various perception tasks such as 3D object detection and 3D lane detection within a single framework. RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model that significantly reduces computational redundancy and feature competition. Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks by utilizing a hierarchical structure of queries that implicitly model the relationships both between and within tasks. This approach eliminates the need for task-specific heads and parameters, fundamentally reducing the conflicts inherent in traditional multi-task learning paradigms. We validate our approach by combining labels from the OpenLane dataset with the Waymo Open dataset. Our work presents a significant advancement in the efficiency and effectiveness of multi-task perception in autonomous driving, offering a new perspective on handling multiple 3D perception tasks synchronously and in parallel. The code will be available at: https://github.com/jbji/RepVF.

Original language	English
Title of host publication	Computer Vision – ECCV 2024 - 18th European Conference, Proceedings
Editors	Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	273-292
Number of pages	20
ISBN (Print)	9783031734106
DOI	https://doi.org/10.1007/978-3-031-73411-3_16
Publication status	Published - 2025
Event	18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy. Duration: 29 Sept 2024 → 4 Oct 2024

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	15090 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	18th European Conference on Computer Vision, ECCV 2024
Country/Territory	Italy
City	Milan
Period	29/09/24 → 4/10/24

Access to Document

10.1007/978-3-031-73411-3_16

Cite this

Li, C., Han, W., Yin, J., Zhao, S., & Shen, J. (2025). RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception. In A. Leonardis, E. Ricci, S. Roth, O. Russakovsky, T. Sattler, & G. Varol (Eds.), Computer Vision – ECCV 2024 - 18th European Conference, Proceedings (pp. 273-292). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15090 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-73411-3_16

Li, Chunliang ; Han, Wencheng ; Yin, Junbo et al. / RepVF : A Unified Vector Fields Representation for Multi-task 3D Perception. Computer Vision – ECCV 2024 - 18th European Conference, Proceedings. editor / Aleš Leonardis ; Elisa Ricci ; Stefan Roth ; Olga Russakovsky ; Torsten Sattler ; Gül Varol. Springer Science and Business Media Deutschland GmbH, 2025. pp. 273-292 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{0178f94a4e374a6592f05b2f672d5445,

title = "RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception",

abstract = "Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches. This paper addresses these issues by proposing a novel unified representation, RepVF, which harmonizes the representation of various perception tasks such as 3D object detection and 3D lane detection within a single framework. RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model that significantly reduces computational redundancy and feature competition. Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks by utilizing a hierarchical structure of queries that implicitly model the relationships both between and within tasks. This approach eliminates the need for task-specific heads and parameters, fundamentally reducing the conflicts inherent in traditional multi-task learning paradigms. We validate our approach by combining labels from the OpenLane dataset with the Waymo Open dataset. Our work presents a significant advancement in the efficiency and effectiveness of multi-task perception in autonomous driving, offering a new perspective on handling multiple 3D perception tasks synchronously and in parallel. The code will be available at: https://github.com/jbji/RepVF.",

keywords = "3D Lane Detection, 3D Object Detection, Multi-task Method",

author = "Chunliang Li and Wencheng Han and Junbo Yin and Sanyuan Zhao and Jianbing Shen",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.; 18th European Conference on Computer Vision, ECCV 2024 ; Conference date: 29-09-2024 Through 04-10-2024",

year = "2025",

doi = "10.1007/978-3-031-73411-3_16",

language = "English",

isbn = "9783031734106",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "273--292",

editor = "Ale{\v s} Leonardis and Elisa Ricci and Stefan Roth and Olga Russakovsky and Torsten Sattler and G{\"u}l Varol",

booktitle = "Computer Vision -- ECCV 2024 - 18th European Conference, Proceedings",

address = "Germany",

}

Li, C, Han, W, Yin, J, Zhao, S & Shen, J 2025, RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception. in A Leonardis, E Ricci, S Roth, O Russakovsky, T Sattler & G Varol (eds), Computer Vision – ECCV 2024 - 18th European Conference, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 15090 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 273-292, 18th European Conference on Computer Vision, ECCV 2024, Milan, Italy, 29/09/24. https://doi.org/10.1007/978-3-031-73411-3_16

RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception. / Li, Chunliang; Han, Wencheng; Yin, Junbo et al.
Computer Vision – ECCV 2024 - 18th European Conference, Proceedings. ed. / Aleš Leonardis; Elisa Ricci; Stefan Roth; Olga Russakovsky; Torsten Sattler; Gül Varol. Springer Science and Business Media Deutschland GmbH, 2025. p. 273-292 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15090 LNCS).

TY - GEN

T1 - RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception

T2 - 18th European Conference on Computer Vision, ECCV 2024

AU - Li, Chunliang

AU - Han, Wencheng

AU - Yin, Junbo

AU - Zhao, Sanyuan

AU - Shen, Jianbing

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

PY - 2025

Y1 - 2025

N2 - Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches. This paper addresses these issues by proposing a novel unified representation, RepVF, which harmonizes the representation of various perception tasks such as 3D object detection and 3D lane detection within a single framework. RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model that significantly reduces computational redundancy and feature competition. Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks by utilizing a hierarchical structure of queries that implicitly model the relationships both between and within tasks. This approach eliminates the need for task-specific heads and parameters, fundamentally reducing the conflicts inherent in traditional multi-task learning paradigms. We validate our approach by combining labels from the OpenLane dataset with the Waymo Open dataset. Our work presents a significant advancement in the efficiency and effectiveness of multi-task perception in autonomous driving, offering a new perspective on handling multiple 3D perception tasks synchronously and in parallel. The code will be available at: https://github.com/jbji/RepVF.

AB - Concurrent processing of multiple autonomous driving 3D perception tasks within the same spatiotemporal scene poses a significant challenge, in particular due to the computational inefficiencies and feature competition between tasks when using traditional multi-task learning approaches. This paper addresses these issues by proposing a novel unified representation, RepVF, which harmonizes the representation of various perception tasks such as 3D object detection and 3D lane detection within a single framework. RepVF characterizes the structure of different targets in the scene through a vector field, enabling a single-head, multi-task learning model that significantly reduces computational redundancy and feature competition. Building upon RepVF, we introduce RFTR, a network designed to exploit the inherent connections between different tasks by utilizing a hierarchical structure of queries that implicitly model the relationships both between and within tasks. This approach eliminates the need for task-specific heads and parameters, fundamentally reducing the conflicts inherent in traditional multi-task learning paradigms. We validate our approach by combining labels from the OpenLane dataset with the Waymo Open dataset. Our work presents a significant advancement in the efficiency and effectiveness of multi-task perception in autonomous driving, offering a new perspective on handling multiple 3D perception tasks synchronously and in parallel. The code will be available at: https://github.com/jbji/RepVF.

KW - 3D Lane Detection

KW - 3D Object Detection

KW - Multi-task Method

UR - http://www.scopus.com/inward/record.url?scp=85210873033&partnerID=8YFLogxK

U2 - 10.1007/978-3-031-73411-3_16

DO - 10.1007/978-3-031-73411-3_16

M3 - Conference contribution

AN - SCOPUS:85210873033

SN - 9783031734106

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 273

EP - 292

BT - Computer Vision – ECCV 2024 - 18th European Conference, Proceedings

A2 - Leonardis, Aleš

A2 - Ricci, Elisa

A2 - Roth, Stefan

A2 - Russakovsky, Olga

A2 - Sattler, Torsten

A2 - Varol, Gül

PB - Springer Science and Business Media Deutschland GmbH

Y2 - 29 September 2024 through 4 October 2024

ER -

Li C, Han W, Yin J, Zhao S, Shen J. RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception. In Leonardis A, Ricci E, Roth S, Russakovsky O, Sattler T, Varol G, editors, Computer Vision – ECCV 2024 - 18th European Conference, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 273-292. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-73411-3_16
