Group Query-based Two-Stage Multi-Modal Trajectory Prediction

  • Junxia Mi
  • Junqiang Xi*
  • *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Predicting the future multi-modal trajectories of vehicles is a critical task for autonomous driving, but it is challenging to predict multi-modal trajectories under the supervision of only single-modal ground truth. Existing latent feature-based models predict multiple trajectories directly from the latent features, which makes them susceptible to mode collapse. Previous proposal-based models depend heavily on manually designed proposals and post-processing techniques, which affect prediction accuracy. To address these challenges, this paper proposes a group query-based two-stage end-to-end framework for trajectory prediction. In the first stage, a scene context encoder captures the complex interactions among surrounding agents, the road environment, and the target agent, and rough trajectory anchors are generated from the latent features. In the second stage, we improve accuracy based on the anchors from the first stage. Our improvement lies in three core designs. First, a network generates queries from the anchors produced in the first stage, with each query representing a distinct trajectory. Second, to mitigate mode collapse and ensure that different queries comprehensively extract scene information and yield diverse trajectories, we design a Transformer-based feature fusion network to model interactions across four dimensions: query-to-query attention, query-to-agent attention, query-to-lane attention, and query-to-time attention. Third, we propose a group training strategy that derives multiple group queries and simultaneously predicts multi-modal trajectories within each group. This training strategy enhances accuracy through additional supervision. With these designs, our experiments on both the Argoverse and Argoverse 2 datasets demonstrate that the proposed method significantly outperforms competitive methods.
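
The group training idea can be illustrated with a minimal sketch. The abstract does not specify the loss, so everything here is an assumption for illustration: the function name `group_wta_loss` is hypothetical, the array layout (G groups, K modes per group, T time steps) is chosen for convenience, and the winner-takes-all selection inside each group is a common choice for single-ground-truth supervision, not necessarily the one the paper uses. The point is only that each group contributes its own supervised term, so G groups yield G supervision signals from one ground-truth trajectory.

```python
import numpy as np

def group_wta_loss(pred, gt):
    """Hypothetical group-wise winner-takes-all loss (illustrative only).

    pred: array of shape (G, K, T, 2), K candidate trajectories for each
          of G query groups, T time steps, (x, y) coordinates.
    gt:   array of shape (T, 2), the single ground-truth trajectory.
    Returns (loss, best_idx), where best_idx has shape (G,).
    """
    # Euclidean error per time step, averaged over time -> shape (G, K)
    ade = np.linalg.norm(pred - gt, axis=-1).mean(axis=-1)
    # Assumption: within each group only the closest mode is penalized
    best_idx = ade.argmin(axis=1)
    best_err = ade.min(axis=1)
    # Every group adds its own term, which is the "additional supervision"
    return best_err.mean(), best_idx
```

In a setup like this, one would typically keep a single group at inference time and use the extra groups only during training, but whether the paper does so is not stated in the abstract.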

Original language: English
Journal: IEEE Transactions on Vehicular Technology
DOIs
Publication status: Accepted/In press - 2026
Externally published: Yes

Keywords

  • Trajectory prediction
  • group query
  • multi-modal prediction
  • transformer
