TY - JOUR
T1 - Variance-reduced reshuffling gradient descent for nonconvex optimization
T2 - Centralized and distributed algorithms
AU - Jiang, Xia
AU - Zeng, Xianlin
AU - Xie, Lihua
AU - Sun, Jian
AU - Chen, Jie
N1 - Publisher Copyright:
© 2024 Elsevier Ltd
PY - 2025/1
Y1 - 2025/1
N2 - Nonconvex finite-sum optimization plays a crucial role in signal processing and machine learning, fueling the development of numerous centralized and distributed stochastic algorithms. However, existing stochastic optimization algorithms often suffer from high stochastic gradient variance due to the use of random sampling with replacement. To address this issue, this paper introduces an explicit variance-reduction step and proposes variance-reduced reshuffling gradient algorithms with a sampling-without-replacement scheme. Specifically, this paper proves that the proposed centralized variance-reduced reshuffling gradient algorithm (VR-RG) with constant step sizes converges to a stationary point for nonconvex optimization under the Kurdyka–Łojasiewicz condition. Moreover, for nonconvex optimization over connected multi-agent networks, the proposed distributed variance-reduced reshuffling gradient algorithm (DVR-RG) converges to a neighborhood of stationary points, where the neighborhood can be made arbitrarily small under mild conditions. Notably, the proposed DVR-RG requires only one communication round at each epoch. Finally, numerical simulations demonstrate the efficiency of the proposed algorithms.
KW - Multi-agent networks
KW - Nonconvex optimization
KW - Reshuffling gradient descent
KW - Sampling without replacement
KW - Variance reduction
UR - https://www.scopus.com/pages/publications/85205150348
U2 - 10.1016/j.automatica.2024.111954
DO - 10.1016/j.automatica.2024.111954
M3 - Article
AN - SCOPUS:85205150348
SN - 0005-1098
VL - 171
JO - Automatica
JF - Automatica
M1 - 111954
ER -