TY - JOUR
T1 - Deep unsupervised active learning via matrix sketching
AU - Li, Changsheng
AU - Li, Rongqing
AU - Yuan, Ye
AU - Wang, Guoren
AU - Xu, Dong
N1 - Publisher Copyright:
© 2021 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Most existing unsupervised active learning methods aim at minimizing the data reconstruction loss by using linear models to choose representative samples for manual labeling in an unsupervised setting. Consequently, these methods often fail to model data with complex non-linear structure. To address this issue, we propose a new deep unsupervised Active Learning method for classification tasks, inspired by the idea of Matrix Sketching and called ALMS. Specifically, ALMS leverages a deep auto-encoder to embed data into a latent space, and then describes all the embedded data with a small-sized sketch that summarizes the major characteristics of the data. In contrast to previous approaches that reconstruct the whole data matrix to select representative samples, ALMS aims to select a representative subset of samples that well approximates the sketch, which preserves the major information of the data while significantly reducing the number of network parameters. This allows our algorithm to alleviate model overfitting and readily cope with large datasets. In effect, the sketch provides a self-supervised signal to guide the learning of the model. Moreover, we propose to construct an auxiliary self-supervised task of classifying real/fake samples, in order to further improve the representation ability of the encoder. We thoroughly evaluate the performance of ALMS on both single-label and multi-label classification tasks, and the results demonstrate its superior performance over state-of-the-art methods. The code can be found at https://github.com/lrq99/ALMS.
AB - Most existing unsupervised active learning methods aim at minimizing the data reconstruction loss by using linear models to choose representative samples for manual labeling in an unsupervised setting. Consequently, these methods often fail to model data with complex non-linear structure. To address this issue, we propose a new deep unsupervised Active Learning method for classification tasks, inspired by the idea of Matrix Sketching and called ALMS. Specifically, ALMS leverages a deep auto-encoder to embed data into a latent space, and then describes all the embedded data with a small-sized sketch that summarizes the major characteristics of the data. In contrast to previous approaches that reconstruct the whole data matrix to select representative samples, ALMS aims to select a representative subset of samples that well approximates the sketch, which preserves the major information of the data while significantly reducing the number of network parameters. This allows our algorithm to alleviate model overfitting and readily cope with large datasets. In effect, the sketch provides a self-supervised signal to guide the learning of the model. Moreover, we propose to construct an auxiliary self-supervised task of classifying real/fake samples, in order to further improve the representation ability of the encoder. We thoroughly evaluate the performance of ALMS on both single-label and multi-label classification tasks, and the results demonstrate its superior performance over state-of-the-art methods. The code can be found at https://github.com/lrq99/ALMS.
KW - Data reconstruction
KW - Matrix sketching
KW - Self-supervised learning
KW - Unsupervised active learning
UR - http://www.scopus.com/inward/record.url?scp=85118666282&partnerID=8YFLogxK
U2 - 10.1109/TIP.2021.3124317
DO - 10.1109/TIP.2021.3124317
M3 - Article
C2 - 34739378
AN - SCOPUS:85118666282
SN - 1057-7149
VL - 30
SP - 9280
EP - 9293
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -