TY - JOUR
T1 - Probabilistic Semi-Supervised Learning via Sparse Graph Structure Learning
AU - Wang, Li
AU - Chan, Raymond
AU - Zeng, Tieyong
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2021/2
Y1 - 2021/2
AB - We present a probabilistic semi-supervised learning (SSL) framework based on sparse graph structure learning. Unlike existing SSL methods, which rely on either a weighted graph heuristically predefined from the input data or a graph learned under the locally linear embedding assumption, the proposed model learns a sparse weighted graph from unlabeled high-dimensional data and a small amount of labeled data while handling noise in the input. The weighted graph is derived indirectly from a unified model of density estimation and pairwise distance preservation under various distance measures: latent embeddings are treated as random variables following an unknown density to be learned, and pairwise distances are computed as expectations over that density, making the model robust to data noise. The labeled data, represented with the same distances, further guide the estimated density toward better class separation and sparser graph structure. A simple inference procedure for the embeddings of unlabeled data, based on point estimation and a kernel representation, is also presented. Extensive experiments on various data sets show promising results compared with many existing SSL methods, with significant improvements when labeled data are scarce.
KW - Graph structure learning
KW - kernel learning
KW - latent variable model
KW - semi-supervised learning (SSL)
UR - http://www.scopus.com/inward/record.url?scp=85100815447&partnerID=8YFLogxK
U2 - 10.1109/TNNLS.2020.2979607
DO - 10.1109/TNNLS.2020.2979607
M3 - Article
C2 - 32287009
AN - SCOPUS:85100815447
SN - 2162-237X
VL - 32
SP - 853
EP - 867
JO - IEEE Transactions on Neural Networks and Learning Systems
JF - IEEE Transactions on Neural Networks and Learning Systems
IS - 2
M1 - 9063663
ER -