UMD-Net: A Unified Multi-Task Assistive Driving Network Based on Multimodal Fusion

Wenzhuo Liu, Yicheng Qiao, Zhiwei Li, Wenshuo Wang*, Wei Zhang, Jiayin Zhu, Yanhuan Jiang*, Li Wang, Hong Wang, Huaping Liu, Kunfeng Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, researchers have studied recognition tasks related to driver state, the traffic environment, and other factors to improve the safety of autonomous driving assistance systems. However, these tasks are currently studied independently, which neglects the interconnections between the driver, the traffic environment, and the vehicle. In this paper, we propose a Unified Multi-task Assistive Driving Network Based on Multimodal Fusion (UMD-Net), the first unified model that uses multimodal data to perform four recognition tasks simultaneously: driver behavior recognition, driver emotion recognition, traffic context recognition, and vehicle behavior recognition. To strengthen the synergy between tasks, we design a position-sensitive multi-directional attention feature extraction subnetwork and a recursive dynamic feature fusion module. The former captures the key features of multi-view images by applying attention mechanisms along different directions, improving the model's generalization across tasks. The latter dynamically adjusts the fusion weights according to the multimodal features, strengthening the representation of important features in multi-task learning. We evaluated our model on the public AIDE dataset, where it achieved the best performance on all four tasks, including an accuracy of 95.31% on the traffic context recognition task, demonstrating the superiority of our approach.
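The abstract does not give the internals of the fusion module, so the following is only a minimal sketch of the general idea of input-dependent fusion weights, not the authors' recursive dynamic feature fusion module. It assumes each modality yields a feature vector of the same length, scores each modality with a hypothetical gate vector (a stand-in for learned parameters), and turns the scores into weights with a softmax. The names `dynamic_fusion` and `gate_vectors` are illustrative and do not come from the paper.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_fusion(features, gate_vectors):
    """Fuse per-modality feature vectors with input-dependent weights.

    features:     list of equal-length feature vectors, one per modality.
    gate_vectors: hypothetical gate parameters, one vector per modality
                  (an assumption standing in for learned weights).
    Returns the fused vector and the weights that produced it.
    """
    # Score each modality from its own features (dot product with its gate).
    scores = [
        sum(f * g for f, g in zip(feat, gate))
        for feat, gate in zip(features, gate_vectors)
    ]
    weights = softmax(scores)

    # Weighted sum of the modality features, dimension by dimension.
    dim = len(features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, features))
        for d in range(dim)
    ]
    return fused, weights


# Two toy "modalities" with two-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0]]
gate_vectors = [[2.0, 0.0], [0.0, 1.0]]
fused, weights = dynamic_fusion(features, gate_vectors)
```

Because the weights are computed from the features themselves, a modality whose features score higher under its gate contributes more to the fused vector. In the toy call above, the first modality receives the larger weight.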

Original language: English
Journal: IEEE Transactions on Intelligent Transportation Systems
Publication status: Accepted/In press - 2025

Keywords

  • ADAS
  • driver state recognition
  • multi-task learning
  • multimodal fusion
  • traffic environment recognition
