TY - JOUR
T1 - Nonlinear dimensionality reduction for discriminative analytics of multiple datasets
AU - Chen, Jia
AU - Wang, Gang
AU - Giannakis, Georgios B.
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2019/2/1
Y1 - 2019/2/1
N2 - Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings, however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.
AB - Principal component analysis (PCA) is widely used for feature extraction and dimensionality reduction, with documented merits in diverse tasks involving high-dimensional data. PCA copes with one dataset at a time, but it is challenged when it comes to analyzing multiple datasets jointly. In certain data science settings, however, one is often interested in extracting the most discriminative information from one dataset of particular interest (a.k.a. target data) relative to the other(s) (a.k.a. background data). To this end, this paper puts forth a novel approach, termed discriminative (d) PCA, for such discriminative analytics of multiple datasets. Under certain conditions, dPCA is proved to be least-squares optimal in recovering the latent subspace vector unique to the target data relative to background data. To account for nonlinear data correlations, (linear) dPCA models for one or multiple background datasets are generalized through kernel-based learning. Interestingly, all dPCA variants admit an analytical solution obtainable with a single (generalized) eigenvalue decomposition. Finally, substantial dimensionality reduction tests using synthetic and real datasets are provided to corroborate the merits of the proposed methods.
KW - Principal component analysis
KW - discriminative analytics
KW - kernel learning
KW - multiple background datasets
UR - https://www.scopus.com/pages/publications/85051188014
U2 - 10.1109/TSP.2018.2885478
DO - 10.1109/TSP.2018.2885478
M3 - Article
AN - SCOPUS:85051188014
SN - 1053-587X
VL - 67
SP - 740
EP - 752
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
IS - 3
M1 - 8565879
ER -