TY - JOUR
T1 - Kronecker CP Decomposition With Fast Multiplication for Compressing RNNs
AU - Wang, Dingheng
AU - Wu, Bijiao
AU - Zhao, Guangshe
AU - Yao, Man
AU - Chen, Hengnu
AU - Deng, Lei
AU - Yan, Tianyi
AU - Li, Guoqi
N1 - Publisher Copyright:
© 2012 IEEE.
PY - 2023/5/1
Y1 - 2023/5/1
N2 - Recurrent neural networks (RNNs) are powerful for tasks on sequential data, such as natural language processing and video recognition. However, because modern RNNs have complex topologies and high space/computation complexity, compressing them has become a hot and promising topic in recent years. Among the many compression methods, tensor decomposition, e.g., tensor train (TT), block term (BT), tensor ring (TR), and hierarchical Tucker (HT), appears to be the most promising approach because it can achieve a very high compression ratio. Nevertheless, none of these tensor decomposition formats provides both space and computation efficiency. In this article, we compress RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition, which is derived from Kronecker tensor (KT) decomposition, by proposing two fast algorithms for the multiplication between the input and the tensor-decomposed weight. Experiments on the UCF11, YouTube Celebrities Face, UCF50, TIMIT, TED-LIUM, and Spiking Heidelberg Digits datasets verify that the proposed KCP-RNNs achieve accuracy comparable to RNNs in other tensor-decomposed formats, and that a compression ratio of up to 278 219× can be obtained with low-rank KCP. More importantly, KCP-RNNs are efficient in both space and computation complexity compared with other tensor-decomposed RNNs. In addition, we find that KCP has the best potential for parallel computing to accelerate the calculations in neural networks.
AB - Recurrent neural networks (RNNs) are powerful for tasks on sequential data, such as natural language processing and video recognition. However, because modern RNNs have complex topologies and high space/computation complexity, compressing them has become a hot and promising topic in recent years. Among the many compression methods, tensor decomposition, e.g., tensor train (TT), block term (BT), tensor ring (TR), and hierarchical Tucker (HT), appears to be the most promising approach because it can achieve a very high compression ratio. Nevertheless, none of these tensor decomposition formats provides both space and computation efficiency. In this article, we compress RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition, which is derived from Kronecker tensor (KT) decomposition, by proposing two fast algorithms for the multiplication between the input and the tensor-decomposed weight. Experiments on the UCF11, YouTube Celebrities Face, UCF50, TIMIT, TED-LIUM, and Spiking Heidelberg Digits datasets verify that the proposed KCP-RNNs achieve accuracy comparable to RNNs in other tensor-decomposed formats, and that a compression ratio of up to 278 219× can be obtained with low-rank KCP. More importantly, KCP-RNNs are efficient in both space and computation complexity compared with other tensor-decomposed RNNs. In addition, we find that KCP has the best potential for parallel computing to accelerate the calculations in neural networks.
KW - Fast multiplication
KW - Kronecker CP decomposition
KW - Kronecker tensor (KT) decomposition
KW - network compression
KW - recurrent neural networks (RNNs)
UR - http://www.scopus.com/inward/record.url?scp=85115203702&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2021.3105961
DO - 10.1109/TNNLS.2021.3105961
M3 - Article
C2 - 34534089
AN - SCOPUS:85115203702
SN - 2162-237X
VL - 34
SP - 2205
EP - 2219
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 5
ER -