Abstract
Many scientific datasets are high-dimensional, and their analysis usually requires retaining the most important structures of the data. The principal curve is a widely used approach for this purpose. However, many existing methods work only for data whose structure can be formulated mathematically as a curve, which is quite restrictive for real applications. A few methods overcome this problem, but they either require complicated hand-crafted rules for a specific task and adapt poorly to other tasks, or cannot obtain explicit structures of the data. To address these issues, we develop a novel principal graph and structure learning framework that captures the local information of the underlying graph structure based on reversed graph embedding. As showcases, we propose models that can learn a spanning tree or a weighted undirected ℓ1 graph, and we develop a new learning algorithm that simultaneously learns a set of principal points and a graph structure from data. The new algorithm is simple and has guaranteed convergence. We then extend the proposed framework to deal with large-scale data. Experimental results on various synthetic datasets and six real-world datasets show that the proposed method compares favorably with baselines and can uncover the underlying structure correctly.
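To make the idea of learning principal points and a graph structure at the same time more concrete, the following is a minimal sketch of one possible alternating scheme for the spanning-tree case. It is not the paper's algorithm: the abstract does not state the objective, so the quadratic objective, the softmax assignment, and the names `fit_principal_tree`, `minimum_spanning_tree`, `lam`, and `sigma` are all assumptions made for illustration.

```python
import numpy as np


def minimum_spanning_tree(dist):
    """Prim's algorithm on a dense distance matrix.

    Returns a symmetric 0/1 adjacency matrix with k - 1 undirected edges.
    """
    k = dist.shape[0]
    adj = np.zeros((k, k))
    in_tree = np.zeros(k, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()            # cheapest known edge from the tree to each node
    parent = np.zeros(k, dtype=int)  # tree endpoint of that cheapest edge
    for _ in range(k - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        adj[j, parent[j]] = adj[parent[j], j] = 1.0
        in_tree[j] = True
        closer = dist[j] < best
        parent[closer] = j
        best = np.where(closer, dist[j], best)
    return adj


def fit_principal_tree(X, k=10, lam=1.0, sigma=0.5, n_iter=20, seed=0):
    """Alternate between a tree over k principal points and the points themselves.

    Assumed objective (illustrative only):
        sum_ij R_ij * ||x_i - c_j||^2 + lam * sum_{(j,l) in tree} ||c_j - c_l||^2
    """
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()  # (k, d) initial points
    for _ in range(n_iter):
        # Graph step: minimum spanning tree over the current principal points.
        D = ((C[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        W = minimum_spanning_tree(D)
        L = np.diag(W.sum(1)) - W  # graph Laplacian of the tree

        # Assignment step: soft assignment of each sample to the points.
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)  # (n, k)
        logits = -d2 / sigma
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        R = np.exp(logits)
        R /= R.sum(axis=1, keepdims=True)

        # Point step: setting the gradient of the assumed objective to zero
        # gives the linear system (diag(R^T 1) + lam * L) C = R^T X.
        A = np.diag(R.sum(0)) + lam * L
        C = np.linalg.solve(A, R.T @ X)
    # W is the tree built in the last iteration, before the final point update.
    return C, W
```

Calling `fit_principal_tree(X, k=8)` on an `(n, d)` array returns the `(8, d)` principal points and an `8 x 8` symmetric adjacency matrix of the tree. Larger values of the assumed `lam` pull neighbouring points closer together, while smaller values of `sigma` make the assignment closer to a hard nearest-point assignment.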
Original language | English |
---|---|
Article number | 7769209 |
Pages (from-to) | 2227-2241 |
Number of pages | 15 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 39 |
Issue number | 11 |
Publication status | Published - 1 Nov 2017 |
Externally published | Yes |
Keywords
- Principal curve
- Principal graph
- Structure learning