TY - JOUR
T1 - Canonical correlation analysis of datasets with a common source graph
AU - Chen, Jia
AU - Wang, Gang
AU - Shen, Yanning
AU - Giannakis, Georgios B.
N1 - Publisher Copyright:
© 1991-2012 IEEE.
PY - 2018/8/15
Y1 - 2018/8/15
AB - Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA, however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data samples is smaller than the data vector dimension, the dual formulation of gCCA is also developed. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.
KW - Dimensionality reduction
KW - Laplacian regularization
KW - correlation analysis
KW - generalized eigen-decomposition
KW - signal processing over graphs
UR - https://www.scopus.com/pages/publications/85049696772
U2 - 10.1109/TSP.2018.2853130
DO - 10.1109/TSP.2018.2853130
M3 - Article
AN - SCOPUS:85049696772
SN - 1053-587X
VL - 66
SP - 4398
EP - 4405
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
IS - 16
M1 - 8408767
ER -