Interpretable Multi-Channel Capsule Network for Human Motion Recognition

Peizhang Li; Qing Fei; Zhen Chen; Xiangdong Liu

doi:10.3390/electronics12204313

Peizhang Li, Qing Fei*, Zhen Chen, Xiangdong Liu

*Corresponding author for this work

School of Automation

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Recently, capsule networks have emerged as a novel neural network architecture for human motion recognition owing to their enhanced interpretability compared to traditional deep learning networks. However, the characteristic features of human motion are often distributed across distinct spatial dimensions and existing capsule networks struggle to independently extract and combine features across multiple spatial dimensions. In this paper, we propose a new multi-channel capsule network architecture that extracts feature capsules in different spatial dimensions, generates a multi-channel capsule chain with independent routing within each channel, and culminates in the aggregation of information from capsules in different channels to activate categories. The proposed structure endows the network with the capability to independently cluster interpretable features within different channels; aggregates features across channels during classification, thereby enhancing classification accuracy and robustness; and also presents the potential for mining interpretable primitives within individual channels. Experimental comparisons with several existing capsule network structures demonstrate the superior performance of the proposed architecture. Furthermore, in contrast to previous studies that vaguely discussed the interpretability of capsule networks, we include additional visual experiments that illustrate the interpretability of the proposed network structure in practical scenarios.

Original language	English
Article number	4313
Journal	Electronics (Switzerland)
Volume	12
Issue number	20
DOIs	https://doi.org/10.3390/electronics12204313
Publication status	Published - Oct 2023

Keywords

capsule networks
human motion recognition
interpretability
multi-channel routing

Cite this

Li, P., Fei, Q., Chen, Z., & Liu, X. (2023). Interpretable Multi-Channel Capsule Network for Human Motion Recognition. Electronics (Switzerland), 12(20), Article 4313. https://doi.org/10.3390/electronics12204313

@article{889a278858cf4d2b8fc17f53a8379668,
title = "Interpretable Multi-Channel Capsule Network for Human Motion Recognition",
abstract = "Recently, capsule networks have emerged as a novel neural network architecture for human motion recognition owing to their enhanced interpretability compared to traditional deep learning networks. However, the characteristic features of human motion are often distributed across distinct spatial dimensions and existing capsule networks struggle to independently extract and combine features across multiple spatial dimensions. In this paper, we propose a new multi-channel capsule network architecture that extracts feature capsules in different spatial dimensions, generates a multi-channel capsule chain with independent routing within each channel, and culminates in the aggregation of information from capsules in different channels to activate categories. The proposed structure endows the network with the capability to independently cluster interpretable features within different channels; aggregates features across channels during classification, thereby enhancing classification accuracy and robustness; and also presents the potential for mining interpretable primitives within individual channels. Experimental comparisons with several existing capsule network structures demonstrate the superior performance of the proposed architecture. Furthermore, in contrast to previous studies that vaguely discussed the interpretability of capsule networks, we include additional visual experiments that illustrate the interpretability of the proposed network structure in practical scenarios.",
keywords = "capsule networks, human motion recognition, interpretability, multi-channel routing",
author = "Peizhang Li and Qing Fei and Zhen Chen and Xiangdong Liu",
note = "Publisher Copyright: {\textcopyright} 2023 by the authors.",
year = "2023",
month = oct,
doi = "10.3390/electronics12204313",
language = "English",
volume = "12",
journal = "Electronics (Switzerland)",
issn = "2079-9292",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "20",
}

TY - JOUR
T1 - Interpretable Multi-Channel Capsule Network for Human Motion Recognition
AU - Li, Peizhang
AU - Fei, Qing
AU - Chen, Zhen
AU - Liu, Xiangdong
PY - 2023/10
Y1 - 2023/10
N2 - Recently, capsule networks have emerged as a novel neural network architecture for human motion recognition owing to their enhanced interpretability compared to traditional deep learning networks. However, the characteristic features of human motion are often distributed across distinct spatial dimensions and existing capsule networks struggle to independently extract and combine features across multiple spatial dimensions. In this paper, we propose a new multi-channel capsule network architecture that extracts feature capsules in different spatial dimensions, generates a multi-channel capsule chain with independent routing within each channel, and culminates in the aggregation of information from capsules in different channels to activate categories. The proposed structure endows the network with the capability to independently cluster interpretable features within different channels; aggregates features across channels during classification, thereby enhancing classification accuracy and robustness; and also presents the potential for mining interpretable primitives within individual channels. Experimental comparisons with several existing capsule network structures demonstrate the superior performance of the proposed architecture. Furthermore, in contrast to previous studies that vaguely discussed the interpretability of capsule networks, we include additional visual experiments that illustrate the interpretability of the proposed network structure in practical scenarios.
AB - Recently, capsule networks have emerged as a novel neural network architecture for human motion recognition owing to their enhanced interpretability compared to traditional deep learning networks. However, the characteristic features of human motion are often distributed across distinct spatial dimensions and existing capsule networks struggle to independently extract and combine features across multiple spatial dimensions. In this paper, we propose a new multi-channel capsule network architecture that extracts feature capsules in different spatial dimensions, generates a multi-channel capsule chain with independent routing within each channel, and culminates in the aggregation of information from capsules in different channels to activate categories. The proposed structure endows the network with the capability to independently cluster interpretable features within different channels; aggregates features across channels during classification, thereby enhancing classification accuracy and robustness; and also presents the potential for mining interpretable primitives within individual channels. Experimental comparisons with several existing capsule network structures demonstrate the superior performance of the proposed architecture. Furthermore, in contrast to previous studies that vaguely discussed the interpretability of capsule networks, we include additional visual experiments that illustrate the interpretability of the proposed network structure in practical scenarios.
KW - capsule networks
KW - human motion recognition
KW - interpretability
KW - multi-channel routing
UR - http://www.scopus.com/inward/record.url?scp=85175069046&partnerID=8YFLogxK
U2 - 10.3390/electronics12204313
DO - 10.3390/electronics12204313
M3 - Article
AN - SCOPUS:85175069046
SN - 2079-9292
VL - 12
JO - Electronics (Switzerland)
JF - Electronics (Switzerland)
IS - 20
M1 - 4313
ER -
