TY - JOUR
T1 - Equipping with Human Cognition
T2 - Driver Intention Recognition with Multimodal Information Fusion
AU - Zhang, Bo
AU - Hou, Xiaohui
AU - Wu, Wei
AU - Gan, Minggang
N1 - Publisher Copyright:
© 2025 World Scientific Publishing Company.
PY - 2024
Y1 - 2024
N2 - To address the inability of autonomous driving systems to detect and respond to danger in a timely manner in urban environments, an intention recognition system integrating driver cognitive information was developed. Using drivers' electroencephalogram (EEG), eye-movement, and vehicle-operation data, the system identifies driver intentions in a class of hazardous scenarios. First, experiments were conducted on a driver-in-the-loop simulation platform to collect data, which were then preprocessed and combined into a multimodal dataset through feature-level fusion. Models based on a multilayer perceptron (MLP), a convolutional neural network (CNN), and a transformer were then developed to predict emergency braking and steering evasion intentions. The transformer-based model with multimodal data fusion achieved the best performance, with an accuracy of 93.02%, significantly surpassing the EEG-only model (80.80%) and the model using only eye-movement and operational data (78.46%). This highlights the transformer's superior ability to capture complex spatiotemporal correlations in multimodal data. In addition, pre-extracted EEG frequency-domain features could improve model performance, though less markedly than changes in model architecture. Embedding this system into autonomous driving systems is expected to enhance their ability to recognize and respond to dangerous scenarios quickly and accurately.
AB - To address the inability of autonomous driving systems to detect and respond to danger in a timely manner in urban environments, an intention recognition system integrating driver cognitive information was developed. Using drivers' electroencephalogram (EEG), eye-movement, and vehicle-operation data, the system identifies driver intentions in a class of hazardous scenarios. First, experiments were conducted on a driver-in-the-loop simulation platform to collect data, which were then preprocessed and combined into a multimodal dataset through feature-level fusion. Models based on a multilayer perceptron (MLP), a convolutional neural network (CNN), and a transformer were then developed to predict emergency braking and steering evasion intentions. The transformer-based model with multimodal data fusion achieved the best performance, with an accuracy of 93.02%, significantly surpassing the EEG-only model (80.80%) and the model using only eye-movement and operational data (78.46%). This highlights the transformer's superior ability to capture complex spatiotemporal correlations in multimodal data. In addition, pre-extracted EEG frequency-domain features could improve model performance, though less markedly than changes in model architecture. Embedding this system into autonomous driving systems is expected to enhance their ability to recognize and respond to dangerous scenarios quickly and accurately.
KW - autonomous driving
KW - human-machine cooperation
KW - intention recognition
KW - multimodal information fusion
UR - http://www.scopus.com/inward/record.url?scp=85213424559&partnerID=8YFLogxK
U2 - 10.1142/S2301385025500864
DO - 10.1142/S2301385025500864
M3 - Article
AN - SCOPUS:85213424559
SN - 2301-3850
JO - Unmanned Systems
JF - Unmanned Systems
ER -