TY - JOUR
T1 - Label-Based Disentanglement Measure among Hidden Units of Deep Learning
AU - Zhang, Chenguang
AU - Hou, Yuexian
AU - Song, Dawei
N1 - Publisher Copyright:
© The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
N2 - The capability to disentangle underlying factors hidden in observable data, thereby obtaining their abstract representations, is considered an important ingredient in the subsequent success of deep networks across various application scenarios. Recently, numerous practical measures and learning strategies have been established for disentanglement, showcasing their potential to improve a model’s explainability, controllability, and robustness. However, when the downstream tasks are classification problems, there is still no consensus in the community on the definition or measurement of disentanglement, and its connection to generalization capacity remains unclear. Aiming at this, we explore the highly non-linear effect of a specified hidden layer on the generalization capacity from an information perspective and obtain a tight bound. Upon decomposing the bound, we find that, besides the unsupervised disentanglement measure term in the conventional sense, a new supervised disentanglement term also emerges with a non-negligible effect on generality. Consequently, a novel label-based disentanglement measure (LDM) is naturally introduced as the discrepancy between these two terms under supervised learning settings, to substitute for the commonly used unsupervised disentanglement measure. The theoretical analysis reveals an inverse relationship between the defined LDM and the generalization capacity. Finally, using LDM as a regularizer, experiments show that deep neural networks (DNNs) can effectively reduce generalization error while improving classification accuracy when noise is added to data features or labels, which strongly supports our claims.
AB - The capability to disentangle underlying factors hidden in observable data, thereby obtaining their abstract representations, is considered an important ingredient in the subsequent success of deep networks across various application scenarios. Recently, numerous practical measures and learning strategies have been established for disentanglement, showcasing their potential to improve a model’s explainability, controllability, and robustness. However, when the downstream tasks are classification problems, there is still no consensus in the community on the definition or measurement of disentanglement, and its connection to generalization capacity remains unclear. Aiming at this, we explore the highly non-linear effect of a specified hidden layer on the generalization capacity from an information perspective and obtain a tight bound. Upon decomposing the bound, we find that, besides the unsupervised disentanglement measure term in the conventional sense, a new supervised disentanglement term also emerges with a non-negligible effect on generality. Consequently, a novel label-based disentanglement measure (LDM) is naturally introduced as the discrepancy between these two terms under supervised learning settings, to substitute for the commonly used unsupervised disentanglement measure. The theoretical analysis reveals an inverse relationship between the defined LDM and the generalization capacity. Finally, using LDM as a regularizer, experiments show that deep neural networks (DNNs) can effectively reduce generalization error while improving classification accuracy when noise is added to data features or labels, which strongly supports our claims.
KW - Deep learning
KW - Disentanglement
KW - Generality
UR - http://www.scopus.com/inward/record.url?scp=85211344277&partnerID=8YFLogxK
U2 - 10.1007/s11063-024-11708-8
DO - 10.1007/s11063-024-11708-8
M3 - Article
AN - SCOPUS:85211344277
SN - 1370-4621
VL - 56
JO - Neural Processing Letters
JF - Neural Processing Letters
IS - 6
M1 - 252
ER -