TY - JOUR
T1 - Extracting salient features from convolutional discriminative filters
AU - Zhou, Yuxiang
AU - Liao, Lejian
AU - Gao, Yang
AU - Huang, Heyan
N1 - Publisher Copyright: © 2021 The Author(s)
PY - 2021/5
Y1 - 2021/5
N2 - Convolutional neural networks (CNNs) have been widely used in various tasks, largely due to their ability to efficiently extract n-gram features for text analysis and document representation. In this paper, we aim to gain insight into the CNN model's capability for text analysis. Vanilla CNNs have weaknesses in representation and feature extraction. Duplicate filters are inevitable with vanilla CNNs, which reduces the discriminative power of the representations. In addition, current pooling operations either limit the CNN to a local optimum (i.e., max pooling) or do not consider the importance of all features (i.e., mean pooling). We propose two modules for vanilla CNNs to overcome these shortcomings. The first equips the CNN with discriminative filters (distinct filters with maximized divergence) and the second provides the ability to comprehensively extract all salient features. Specifically, our model increases its discriminative power by maximizing the distance between different filters, and it uses a novel global pooling mechanism for feature extraction. Validation tests against state-of-the-art baselines on five benchmark classification datasets demonstrate the competitive performance of our proposed model. Furthermore, visualization of the upgraded filters and pooling features verifies our hypothesis that the proposed model can obtain discriminative filters and salient features.
AB - Convolutional neural networks (CNNs) have been widely used in various tasks, largely due to their ability to efficiently extract n-gram features for text analysis and document representation. In this paper, we aim to gain insight into the CNN model's capability for text analysis. Vanilla CNNs have weaknesses in representation and feature extraction. Duplicate filters are inevitable with vanilla CNNs, which reduces the discriminative power of the representations. In addition, current pooling operations either limit the CNN to a local optimum (i.e., max pooling) or do not consider the importance of all features (i.e., mean pooling). We propose two modules for vanilla CNNs to overcome these shortcomings. The first equips the CNN with discriminative filters (distinct filters with maximized divergence) and the second provides the ability to comprehensively extract all salient features. Specifically, our model increases its discriminative power by maximizing the distance between different filters, and it uses a novel global pooling mechanism for feature extraction. Validation tests against state-of-the-art baselines on five benchmark classification datasets demonstrate the competitive performance of our proposed model. Furthermore, visualization of the upgraded filters and pooling features verifies our hypothesis that the proposed model can obtain discriminative filters and salient features.
KW - Convolutional neural network (CNN)
KW - Discriminative filters
KW - Document representation
KW - Pooling mechanism
KW - Salient feature
KW - Text classification
UR - http://www.scopus.com/inward/record.url?scp=85100880513&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2020.12.084
DO - 10.1016/j.ins.2020.12.084
M3 - Article
AN - SCOPUS:85100880513
SN - 0020-0255
VL - 558
SP - 265
EP - 279
JO - Information Sciences
JF - Information Sciences
ER -