TY - JOUR
T1 - Interpretable Multi-Channel Capsule Network for Human Motion Recognition
AU - Li, Peizhang
AU - Fei, Qing
AU - Chen, Zhen
AU - Liu, Xiangdong
N1 - Publisher Copyright:
© 2023 by the authors.
PY - 2023/10
Y1 - 2023/10
N2 - Recently, capsule networks have emerged as a novel neural network architecture for human motion recognition owing to their enhanced interpretability compared to traditional deep learning networks. However, the characteristic features of human motion are often distributed across distinct spatial dimensions and existing capsule networks struggle to independently extract and combine features across multiple spatial dimensions. In this paper, we propose a new multi-channel capsule network architecture that extracts feature capsules in different spatial dimensions, generates a multi-channel capsule chain with independent routing within each channel, and culminates in the aggregation of information from capsules in different channels to activate categories. The proposed structure endows the network with the capability to independently cluster interpretable features within different channels; aggregates features across channels during classification, thereby enhancing classification accuracy and robustness; and also presents the potential for mining interpretable primitives within individual channels. Experimental comparisons with several existing capsule network structures demonstrate the superior performance of the proposed architecture. Furthermore, in contrast to previous studies that vaguely discussed the interpretability of capsule networks, we include additional visual experiments that illustrate the interpretability of the proposed network structure in practical scenarios.
AB - Recently, capsule networks have emerged as a novel neural network architecture for human motion recognition owing to their enhanced interpretability compared to traditional deep learning networks. However, the characteristic features of human motion are often distributed across distinct spatial dimensions and existing capsule networks struggle to independently extract and combine features across multiple spatial dimensions. In this paper, we propose a new multi-channel capsule network architecture that extracts feature capsules in different spatial dimensions, generates a multi-channel capsule chain with independent routing within each channel, and culminates in the aggregation of information from capsules in different channels to activate categories. The proposed structure endows the network with the capability to independently cluster interpretable features within different channels; aggregates features across channels during classification, thereby enhancing classification accuracy and robustness; and also presents the potential for mining interpretable primitives within individual channels. Experimental comparisons with several existing capsule network structures demonstrate the superior performance of the proposed architecture. Furthermore, in contrast to previous studies that vaguely discussed the interpretability of capsule networks, we include additional visual experiments that illustrate the interpretability of the proposed network structure in practical scenarios.
KW - capsule networks
KW - human motion recognition
KW - interpretability
KW - multi-channel routing
UR - http://www.scopus.com/inward/record.url?scp=85175069046&partnerID=8YFLogxK
U2 - 10.3390/electronics12204313
DO - 10.3390/electronics12204313
M3 - Article
AN - SCOPUS:85175069046
SN - 2079-9292
VL - 12
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 20
M1 - 4313
ER -