TY - GEN
T1 - ADAPID
T2 - 47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022
AU - Weng, Boxi
AU - Sun, Jian
AU - Sadeghi, Alireza
AU - Wang, Gang
N1 - Publisher Copyright: © 2022 IEEE
PY - 2022
Y1 - 2022
N2 - Deep neural networks (DNNs) have well-documented merits in learning nonlinear functions in high-dimensional spaces. Stochastic gradient descent (SGD)-type optimization algorithms are the 'workhorse' for training DNNs. Nonetheless, such algorithms often suffer from slow convergence, sizable fluctuations, and abundant local solutions, to name a few. In this context, the present paper draws ideas from adaptive control of dynamical systems, and develops an adaptive proportional-integral-derivative (AdaPID) solver for fast, stable, and effective training of DNNs. AdaPID relies on second-order moment estimates of gradients to adaptively adjust the PID coefficients. Numerical tests corroborate the merits of AdaPID on several tasks such as image generation using generative adversarial networks (GANs) and image classification using convolutional neural networks (CNNs) as well as long short-term memory networks (LSTMs).
KW - Deep neural network
KW - PID control
KW - adaptive control
KW - adaptive learning rate
KW - stochastic optimization
UR - http://www.scopus.com/inward/record.url?scp=85131250607&partnerID=8YFLogxK
U2 - 10.1109/ICASSP43922.2022.9746279
DO - 10.1109/ICASSP43922.2022.9746279
M3 - Conference contribution
AN - SCOPUS:85131250607
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 3943
EP - 3947
BT - 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2022 through 27 May 2022
ER -