
An Innovative Subsampling Approach for Efficient SVM Training with Large Datasets

  • Shuo Sun
  • Wenlin Dai
  • Dianpeng Wang*

*Corresponding author for this work

  • Beijing Institute of Technology
  • Institute of Statistics and Big Data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Support vector machines (SVMs) are widely recognized for their effectiveness in classification problems, owing to their solid theoretical foundation and excellent generalization performance. Their main drawback is computational: training time increases with the size of the training dataset. To address this limitation, this article presents a novel adaptive sequential subsampling method designed to accelerate SVM training. The proposed method consists of two stages. In the first stage, a space-filling design is used to group samples into cells, and an initial pilot SVM model is trained on the centroids and corresponding labels of these cells. In the second stage, an adaptive sequential stratified sampling method, based on the distance between each cell and the hyperplane, selects informative samples that are used to improve the SVM model. Numerical studies show that the approach achieves classification accuracy comparable to or even better than that of a basic SVM, while requiring only approximately 1% of the CPU time. The algorithm is therefore a more efficient choice for large-scale data applications.
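The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the space-filling design, the sampling rule, or the solver, so every choice below is an assumption made for illustration. Cells come from a regular grid (standing in for the space-filling design), each cell takes its majority label, the second stage draws a few samples from the cells whose centroids lie closest to the current hyperplane, and the "SVM" is a linear model fitted by plain subgradient descent on the hinge loss. All function and parameter names (`build_cells`, `subsample_svm`, `width`, `cells_per_round`, `per_cell`) are hypothetical.

```python
import math
import random
from collections import defaultdict


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=50, seed=0):
    """Stand-in solver: constant-step subgradient descent on the
    L2-regularised hinge loss. Labels must be -1 or +1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            w = [wj * (1.0 - eta * lam) for wj in w]  # shrink (regulariser)
            if y[i] * score < 1.0:  # margin violated: hinge term is active
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def distance_to_hyperplane(x, w, b):
    """Euclidean distance |w.x + b| / ||w|| from x to the hyperplane."""
    norm = math.sqrt(sum(wj * wj for wj in w)) or 1.0
    return abs(sum(wj * xj for wj, xj in zip(w, x)) + b) / norm


def build_cells(X, width):
    """Assumed grouping: a regular grid with the given cell width."""
    cells = defaultdict(list)
    for i, x in enumerate(X):
        cells[tuple(math.floor(v / width) for v in x)].append(i)
    return list(cells.values())


def centroid_and_label(members, X, y):
    """Cell centroid and majority label (ties go to +1)."""
    dim = len(X[0])
    centroid = [sum(X[i][j] for i in members) / len(members) for j in range(dim)]
    label = 1 if sum(y[i] for i in members) >= 0 else -1
    return centroid, label


def subsample_svm(X, y, width=1.0, rounds=3, cells_per_round=4, per_cell=5, seed=0):
    rng = random.Random(seed)
    cells = build_cells(X, width)
    summaries = [centroid_and_label(m, X, y) for m in cells]
    cx = [c for c, _ in summaries]
    cy = [lab for _, lab in summaries]

    # Stage 1: pilot model trained on cell centroids only.
    w, b = train_linear_svm(cx, cy)

    # Stage 2: repeatedly sample from the cells nearest the hyperplane,
    # then refit on centroids plus all samples selected so far.
    chosen = set()
    for _ in range(rounds):
        ranked = sorted(range(len(cells)),
                        key=lambda k: distance_to_hyperplane(cx[k], w, b))
        for k in ranked[:cells_per_round]:
            pool = [i for i in cells[k] if i not in chosen]
            chosen.update(rng.sample(pool, min(per_cell, len(pool))))
        idx = sorted(chosen)
        w, b = train_linear_svm(cx + [X[i] for i in idx],
                                cy + [y[i] for i in idx])
    return w, b, sorted(chosen)


if __name__ == "__main__":
    gen = random.Random(1)
    X = [[gen.gauss(2.0 * s, 0.5), gen.gauss(2.0 * s, 0.5)]
         for s in (-1, 1) for _ in range(200)]
    y = [s for s in (-1, 1) for _ in range(200)]
    w, b, chosen = subsample_svm(X, y)
    print(len(chosen), "of", len(X), "raw samples used")
```

Keeping the centroids in every refit is a design choice of this sketch: it guarantees that both classes stay represented even when the sampled cells all fall on one side of the boundary. A practical implementation would replace the stand-in solver with a proper SVM library and the grid with whichever space-filling design the paper uses.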

Original language: English
Title of host publication: 2025 4th International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1798-1807
Number of pages: 10
ISBN (Electronic): 9798331565817
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 2025 4th International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2025 - Chongqing, China
Duration: 21 Nov 2025 to 23 Nov 2025

Publication series

Name: 2025 4th International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2025

Conference

Conference: 2025 4th International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2025
Country/Territory: China
City: Chongqing
Period: 21/11/25 to 23/11/25

Keywords

  • Adaptive subsampling
  • Distance-based
  • Space-filling
  • Support vector machines
