Abstract
Cross-validation (CV) is a widely adopted approach for selecting the optimal model. However, computing the empirical cross-validation error (CVE) is expensive because the learner must be trained multiple times. In this paper, we develop a novel approximation theory of CVE and present an approximate approach to CV based on the Bouligand influence function (BIF) for kernel-based algorithms. We first represent the BIF and higher-order BIFs in Taylor expansions and approximate CV via these expansions. We then derive an upper bound on the discrepancy between the original and approximate CV. Furthermore, we provide a novel method for computing the BIF for a general distribution, and evaluate the BIF criterion on the sample distribution to approximate CV. The proposed approximate CV requires training on the full data set only once and is suitable for a wide variety of kernel-based algorithms. Experimental results demonstrate that the proposed approximate CV is sound and effective.
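To make the "train once" idea concrete, the sketch below uses a classic special case rather than the BIF-based method of the paper: for kernel ridge regression, the leave-one-out CVE can be recovered exactly from a single full-data fit through the diagonal of the hat matrix. The function names (`rbf_kernel`, `loo_cve_single_fit`, `loo_cve_retrain`), the RBF kernel, and the regularization convention (K + λI, with λ not scaled by the sample size) are assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of X and rows of Z.
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def loo_cve_single_fit(K, y, lam):
    # One full-data fit: the hat matrix H maps targets to fitted values.
    n = K.shape[0]
    H = K @ np.linalg.inv(K + lam * np.eye(n))
    resid = y - H @ y
    # Closed-form leave-one-out residuals for kernel ridge regression.
    loo_resid = resid / (1.0 - np.diag(H))
    return np.mean(loo_resid**2)

def loo_cve_retrain(K, y, lam):
    # Reference implementation: retrain n times, each time holding out one point.
    n = K.shape[0]
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        K_sub = K[np.ix_(keep, keep)]
        alpha = np.linalg.solve(K_sub + lam * np.eye(n - 1), y[keep])
        errs[i] = (y[i] - K[i, keep] @ alpha) ** 2
    return errs.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 2))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
    K = rbf_kernel(X, X, gamma=0.5)
    print(loo_cve_single_fit(K, y, 0.1), loo_cve_retrain(K, y, 0.1))
```

The two functions should agree up to floating-point error, while the single-fit version avoids the n retraining runs. This shortcut is exact only for this squared-loss setting; the approach described in the abstract targets a wider class of kernel-based algorithms, where the CVE is approximated rather than computed exactly.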
Original language | English |
---|---|
Article number | 8611136 |
Pages (from-to) | 1083-1096 |
Number of pages | 14 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 42 |
Issue number | 5 |
Publication status | Published - 1 May 2020 |
Externally published | Yes |
Keywords
- Cross-validation
- approximation
- Bouligand influence function
- kernel methods
- model selection