Abstract
Principal component analysis (PCA) has been widely used in analyzing high-dimensional data. It converts a set of observed data points of possibly correlated variables into a set of linearly uncorrelated variables via an orthogonal transformation. To handle streaming data and reduce the complexity of PCA, (subspace) online PCA iterations were proposed, which iteratively update the orthogonal transformation by taking one observed data point at a time. Existing works on the convergence of (subspace) online PCA iterations mostly focus on the case where the samples are almost surely uniformly bounded. In this paper, we analyze the convergence of a subspace online PCA iteration under a more practical assumption and obtain a nearly optimal finite-sample error bound. Our convergence rate almost matches the minimax information lower bound. We prove that the convergence is nearly global in the sense that the subspace online PCA iteration is convergent with high probability for random initial guesses. This work also leads to a simpler proof of a recent result on online PCA for the first principal component only.
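The abstract does not spell out the update rule, so the following is only a minimal sketch of what a subspace online PCA iteration can look like, assuming the common Oja-style form: add a rank-one correction built from the newest sample, then re-orthonormalize with a QR factorization. The function names, the `step_size / n` schedule, and the synthetic Gaussian data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def online_subspace_pca(samples, k, step_size, rng):
    """Oja-style subspace iteration (illustrative sketch, not the paper's exact scheme).

    samples   : array of shape (n, d), one observed data point per row
    k         : dimension of the principal subspace to estimate
    step_size : constant c in the assumed schedule eta_n = c / n
    rng       : numpy.random.Generator used for the random initial guess
    """
    d = samples.shape[1]
    # Random initial guess with orthonormal columns.
    U, _ = np.linalg.qr(rng.standard_normal((d, k)))
    for n, x in enumerate(samples, start=1):
        eta = step_size / n  # assumed decaying step size
        # Rank-one update with the current sample: U + eta * x x^T U.
        U = U + eta * np.outer(x, x @ U)
        # Re-orthonormalize so the columns stay an orthonormal basis.
        U, _ = np.linalg.qr(U)
    return U


def subspace_error(U, V):
    """Sine of the largest canonical angle between span(U) and span(V).

    Both U and V are assumed to have orthonormal columns.
    """
    # Singular values of V^T U are the cosines of the canonical angles.
    cosines = np.linalg.svd(V.T @ U, compute_uv=False)
    return float(np.sqrt(max(0.0, 1.0 - cosines.min() ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k, n = 10, 2, 5000
    # Synthetic data: diagonal covariance with two dominant directions.
    scales = np.sqrt(np.array([10.0, 8.0] + [1.0] * (d - k)))
    X = rng.standard_normal((n, d)) * scales
    U_hat = online_subspace_pca(X, k, step_size=1.0, rng=rng)
    U_true = np.eye(d)[:, :k]
    print("subspace error:", subspace_error(U_hat, U_true))
```

Each update touches a single sample and costs O(dk) for the rank-one correction plus O(dk^2) for the QR step, which is the sense in which such iterations avoid forming or decomposing the full covariance matrix. Note that the Gaussian samples in this demo are not uniformly bounded, the setting the abstract contrasts with earlier analyses.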
Original language | English |
---|---|
Pages (from-to) | 1087-1122 |
Number of pages | 36 |
Journal | Science China Mathematics |
Volume | 66 |
Issue number | 5 |
Publication status | Published - May 2023 |
Externally published | Yes |
Keywords
- 62H12
- 62H25
- 65F99
- 68W27
- finite-sample analysis
- high-dimensional data
- online algorithm
- principal component analysis
- principal component subspace
- stochastic approximation