Probabilistic Variance-Reduced Shuffling Gradient Descent Algorithm for Nonconvex Optimization

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Nonconvex finite-sum optimization arises widely in signal processing and machine learning tasks. Well-known stochastic gradient algorithms generate unbiased stochastic gradient estimates by sampling uniformly with replacement, which can be difficult to implement in practice. This paper explores a sampling-without-replacement shuffling scheme for generating stochastic gradients. To control the variance of these gradients, it further develops a novel variance reduction step within the shuffling gradient descent algorithm. Specifically, the paper proposes a probabilistic variance-reduced shuffling gradient descent algorithm that reduces the number of gradient oracle calls required by the variance reduction step. The proposed algorithm achieves a better convergence rate than existing stochastic gradient algorithms for nonconvex optimization. Finally, numerical results demonstrate the efficiency of the proposed algorithm.
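The abstract does not spell out the update rule, but the keywords (Bernoulli distribution, random reshuffling, variance reduction) suggest an SVRG-style scheme whose snapshot refresh is gated by a Bernoulli draw so that full-gradient oracle calls are only occasionally incurred. The minimal Python sketch below illustrates that idea under stated assumptions; the function names, the refresh probability `p`, and the control-variate form are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def prob_vr_shuffling_gd(grad_i, x0, n, epochs, lr=0.01, p=0.5, seed=0):
    """Hedged sketch of probabilistic variance-reduced shuffling GD.

    grad_i(x, i) returns the gradient of the i-th component function.
    Once per epoch, a Bernoulli(p) draw decides whether to refresh the
    anchor point and its full gradient; skipping the refresh saves n
    gradient oracle calls. Each epoch then processes the components in
    a fresh random permutation (sampling without replacement).
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    anchor = x0.copy()
    full_grad = np.mean([grad_i(anchor, i) for i in range(n)], axis=0)

    for _ in range(epochs):
        # Bernoulli-triggered variance-reduction (snapshot) step.
        if rng.random() < p:
            anchor = x.copy()
            full_grad = np.mean([grad_i(anchor, i) for i in range(n)], axis=0)

        # One shuffled pass over the n component functions.
        for i in rng.permutation(n):
            # SVRG-style control variate built from the anchor point.
            g = grad_i(x, i) - grad_i(anchor, i) + full_grad
            x -= lr * g
    return x
```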

Original language: English
Title of host publication: Proceedings of the 44th Chinese Control Conference, CCC 2025
Editors: Jian Sun, Hongpeng Yin
Publisher: IEEE Computer Society
Pages: 2124-2129
Number of pages: 6
ISBN (Electronic): 9789887581611
DOIs
Publication status: Published - 2025
Externally published: Yes
Event: 44th Chinese Control Conference, CCC 2025 - Chongqing, China
Duration: 28 Jul 2025 - 30 Jul 2025

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 44th Chinese Control Conference, CCC 2025
Country/Territory: China
City: Chongqing
Period: 28/07/25 - 30/07/25

Keywords

  • Bernoulli distribution
  • Nonconvex optimization
  • Random reshuffling
  • Variance reduction
