TY - GEN
T1 - Human activity recognition with a multibranch network based on CNN and LSTM
AU - Yuan, Ruixin
AU - Zhang, Yanmei
AU - Wang, Lizhe
AU - Li, Shengyun
N1 - Publisher Copyright: © 2024 SPIE.
PY - 2024
Y1 - 2024
N2 - With the widespread use of wearable devices, human activity recognition (HAR) holds immense potential in health monitoring and smart environments. Notably, temporal sensory sequences collected from wearable devices can accurately reflect daily activities. Nonetheless, existing CNN-based and LSTM-based methods have predominantly concentrated on feature extraction from univariate sequences, overlooking the implicit frequency information. Therefore, we first employed the Short-Time Fourier Transform (STFT) in HAR tasks to extract inherent frequency features. Concurrently, we introduced a multi-branch network that combines CNN and LSTM. The CNN component captures spatial information of different dimensions. The LSTM, on the other hand, comprises two parts: one focused on temporal relationships within a single channel and the other on relationships between channels at a specific time point. In addition, recognizing the limitations of the available datasets, particularly their insufficient coverage of daily activities, we collected a custom dataset encompassing eight distinct daily activity categories. Finally, we evaluated our proposed model against benchmark models. The results demonstrate that our network exhibits superior generalization across different datasets, achieving accuracies of 91.70%, 95.79%, and 87.81% on the PAMAP2, UCI HAR, and our own datasets, respectively.
AB - With the widespread use of wearable devices, human activity recognition (HAR) holds immense potential in health monitoring and smart environments. Notably, temporal sensory sequences collected from wearable devices can accurately reflect daily activities. Nonetheless, existing CNN-based and LSTM-based methods have predominantly concentrated on feature extraction from univariate sequences, overlooking the implicit frequency information. Therefore, we first employed the Short-Time Fourier Transform (STFT) in HAR tasks to extract inherent frequency features. Concurrently, we introduced a multi-branch network that combines CNN and LSTM. The CNN component captures spatial information of different dimensions. The LSTM, on the other hand, comprises two parts: one focused on temporal relationships within a single channel and the other on relationships between channels at a specific time point. In addition, recognizing the limitations of the available datasets, particularly their insufficient coverage of daily activities, we collected a custom dataset encompassing eight distinct daily activity categories. Finally, we evaluated our proposed model against benchmark models. The results demonstrate that our network exhibits superior generalization across different datasets, achieving accuracies of 91.70%, 95.79%, and 87.81% on the PAMAP2, UCI HAR, and our own datasets, respectively.
KW - cnn
KW - deep learning
KW - human activity recognition
KW - lstm
KW - multi-branch
UR - http://www.scopus.com/inward/record.url?scp=85191228993&partnerID=8YFLogxK
U2 - 10.1117/12.3023366
DO - 10.1117/12.3023366
M3 - Conference contribution
AN - SCOPUS:85191228993
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Third International Conference on Computer Technology, Information Engineering, and Electron Materials, CTIEEM 2023
A2 - Inoue, Atsushi
PB - SPIE
T2 - 3rd International Conference on Computer Technology, Information Engineering, and Electron Materials, CTIEEM 2023
Y2 - 17 November 2023 through 19 November 2023
ER -