Deep unsupervised active learning via matrix sketching

Changsheng Li; Rongqing Li; Ye Yuan; Guoren Wang; Dong Xu

doi:10.1109/TIP.2021.3124317

Deep unsupervised active learning via matrix sketching

Changsheng Li^*, Rongqing Li, Ye Yuan, Guoren Wang, Dong Xu

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

8 Citations (Scopus)

Abstract

—Most existing unsupervised active learning methods aim at minimizing the data reconstruction loss by using the linear models to choose representative samples for manually labeling in an unsupervised setting. Thus these methods often fail in modelling data with complex non-linear structure. To address this issue, we propose a new deep unsupervised Active Learning method for classification tasks, inspired by the idea of Matrix Sketching, called ALMS. Specifically, ALMS leverages a deep auto-encoder to embed data into a latent space, and then describes all the embedded data with a small size sketch to summarize the major characteristics of the data. In contrast to previous approaches that reconstruct the whole data matrix for selecting the representative samples, ALMS aims to select a representative subset of samples to well approximate the sketch, which can preserve the major information of data meanwhile significantly reducing the number of network parameters. This makes our algorithm alleviate the issue of model overfitting and readily cope with large datasets. Actually, the sketch provides a type of self-supervised signal to guide the learning of the model. Moreover, we propose to construct an auxiliary self-supervised task by classifying real/fake samples, in order to further improve the representation ability of the encoder. We thoroughly evaluate the performance of ALMS on both single-label and multi-label classification tasks, and the results demonstrate its superior performance against the state-of-the-art methods. The code can be found at https://github.com/lrq99/ALMS.

Original language	English
Pages (from-to)	9280-9293
Number of pages	14
Journal	IEEE Transactions on Image Processing
Volume	30
DOIs	https://doi.org/10.1109/TIP.2021.3124317
Publication status	Published - 2021

Keywords

Data reconstruction
Matrix sketching
Self-supervised learning
Unsupervised active learning

Access to Document

10.1109/TIP.2021.3124317

Cite this

@article{9607e18cb7d7437689d3d94e7f9d00c0,

title = "Deep unsupervised active learning via matrix sketching",

abstract = "—Most existing unsupervised active learning methods aim at minimizing the data reconstruction loss by using the linear models to choose representative samples for manually labeling in an unsupervised setting. Thus these methods often fail in modelling data with complex non-linear structure. To address this issue, we propose a new deep unsupervised Active Learning method for classification tasks, inspired by the idea of Matrix Sketching, called ALMS. Specifically, ALMS leverages a deep auto-encoder to embed data into a latent space, and then describes all the embedded data with a small size sketch to summarize the major characteristics of the data. In contrast to previous approaches that reconstruct the whole data matrix for selecting the representative samples, ALMS aims to select a representative subset of samples to well approximate the sketch, which can preserve the major information of data meanwhile significantly reducing the number of network parameters. This makes our algorithm alleviate the issue of model overfitting and readily cope with large datasets. Actually, the sketch provides a type of self-supervised signal to guide the learning of the model. Moreover, we propose to construct an auxiliary self-supervised task by classifying real/fake samples, in order to further improve the representation ability of the encoder. We thoroughly evaluate the performance of ALMS on both single-label and multi-label classification tasks, and the results demonstrate its superior performance against the state-of-the-art methods. The code can be found at https://github.com/lrq99/ALMS.",

keywords = "Data reconstruction, Matrix sketching, Self-supervised learning, Unsupervised active learning",

author = "Changsheng Li and Rongqing Li and Ye Yuan and Guoren Wang and Dong Xu",

year = "2021",

doi = "10.1109/TIP.2021.3124317",

language = "English",

volume = "30",

pages = "9280--9293",

journal = "IEEE Transactions on Image Processing",

issn = "1057-7149",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY - JOUR

T1 - Deep unsupervised active learning via matrix sketching

AU - Li, Changsheng

AU - Li, Rongqing

AU - Yuan, Ye

AU - Wang, Guoren

AU - Xu, Dong

PY - 2021

Y1 - 2021

N2 - —Most existing unsupervised active learning methods aim at minimizing the data reconstruction loss by using the linear models to choose representative samples for manually labeling in an unsupervised setting. Thus these methods often fail in modelling data with complex non-linear structure. To address this issue, we propose a new deep unsupervised Active Learning method for classification tasks, inspired by the idea of Matrix Sketching, called ALMS. Specifically, ALMS leverages a deep auto-encoder to embed data into a latent space, and then describes all the embedded data with a small size sketch to summarize the major characteristics of the data. In contrast to previous approaches that reconstruct the whole data matrix for selecting the representative samples, ALMS aims to select a representative subset of samples to well approximate the sketch, which can preserve the major information of data meanwhile significantly reducing the number of network parameters. This makes our algorithm alleviate the issue of model overfitting and readily cope with large datasets. Actually, the sketch provides a type of self-supervised signal to guide the learning of the model. Moreover, we propose to construct an auxiliary self-supervised task by classifying real/fake samples, in order to further improve the representation ability of the encoder. We thoroughly evaluate the performance of ALMS on both single-label and multi-label classification tasks, and the results demonstrate its superior performance against the state-of-the-art methods. The code can be found at https://github.com/lrq99/ALMS.

AB - —Most existing unsupervised active learning methods aim at minimizing the data reconstruction loss by using the linear models to choose representative samples for manually labeling in an unsupervised setting. Thus these methods often fail in modelling data with complex non-linear structure. To address this issue, we propose a new deep unsupervised Active Learning method for classification tasks, inspired by the idea of Matrix Sketching, called ALMS. Specifically, ALMS leverages a deep auto-encoder to embed data into a latent space, and then describes all the embedded data with a small size sketch to summarize the major characteristics of the data. In contrast to previous approaches that reconstruct the whole data matrix for selecting the representative samples, ALMS aims to select a representative subset of samples to well approximate the sketch, which can preserve the major information of data meanwhile significantly reducing the number of network parameters. This makes our algorithm alleviate the issue of model overfitting and readily cope with large datasets. Actually, the sketch provides a type of self-supervised signal to guide the learning of the model. Moreover, we propose to construct an auxiliary self-supervised task by classifying real/fake samples, in order to further improve the representation ability of the encoder. We thoroughly evaluate the performance of ALMS on both single-label and multi-label classification tasks, and the results demonstrate its superior performance against the state-of-the-art methods. The code can be found at https://github.com/lrq99/ALMS.

KW - Data reconstruction

KW - Matrix sketching

KW - Self-supervised learning

KW - Unsupervised active learning

UR - http://www.scopus.com/inward/record.url?scp=85118666282&partnerID=8YFLogxK

U2 - 10.1109/TIP.2021.3124317

DO - 10.1109/TIP.2021.3124317

M3 - Article

C2 - 34739378

AN - SCOPUS:85118666282

SN - 1057-7149

VL - 30

SP - 9280

EP - 9293

JO - IEEE Transactions on Image Processing

JF - IEEE Transactions on Image Processing

ER -

Deep unsupervised active learning via matrix sketching

Abstract

Keywords

Access to Document

Other files and links

Fingerprint

Cite this