TY - JOUR
T1 - Edge-Enriched Graph Transformer for Multiagent Trajectory Prediction with Relative Positional Semantics
AU - Zhang, Ting
AU - Fu, Mengyin
AU - Yang, Yi
AU - Song, Wenjie
AU - Liu, Tong
N1 - Publisher Copyright: © 1963-2012 IEEE.
PY - 2024
Y1 - 2024
AB - Trajectory prediction is critical for safe and efficient autonomous driving, especially in scenarios with intricate road structures and complex interactions. To address this challenge, we propose a framework based on edge-enriched graph transformers for multimodal trajectory prediction of multiple agents. The model is novel in its interaction representation and its unified input format. First, to model the interaction, an edge-featured graph is constructed with relative coordinates and positional semantics as edge properties, where positional information such as front, rear, left, right, and conflict modes is encoded using binary codes. The edge features capturing the interaction relationship are further used for closeness recognition. Second, we achieve a unified representation of map and agent features, ensuring a consistent scale and interpretation of the heterogeneous input. Specifically, we vectorize and discretize the lanes into agent-like units and individualize the lanelets with agent-specific features. To handle the graph-like input, the edge-enriched graph transformer is first introduced for feature encoding. Finally, the dynamic, interaction, and map features are concatenated for multimodal prediction decoding. Experiments are conducted on the INTERACTION and Argoverse2 datasets. The results of the comparison and ablation experiments demonstrate the competitive performance of our model in highly interactive scenes compared with other state-of-the-art prediction methods.
KW - Edge-enriched graph
KW - trajectory prediction
KW - transformer
UR - http://www.scopus.com/inward/record.url?scp=85195373549&partnerID=8YFLogxK
U2 - 10.1109/TIM.2024.3406791
DO - 10.1109/TIM.2024.3406791
M3 - Article
AN - SCOPUS:85195373549
SN - 0018-9456
VL - 73
SP - 1
EP - 12
JO - IEEE Transactions on Instrumentation and Measurement
JF - IEEE Transactions on Instrumentation and Measurement
M1 - 2520812
ER -