ACMo: Angle-Calibrated Moment Methods for Stochastic Optimization

Xunpeng Huang, Runxin Xu*, Hao Zhou, Zhe Wang*, Zhengyang Liu, Lei Li

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Stochastic gradient descent (SGD) is widely used for its outstanding generalization ability and simplicity. Adaptive gradient methods have been proposed to further accelerate optimization. In this paper, we revisit existing adaptive gradient methods from a new perspective, which leads to a refreshed understanding of the role of second moments in stochastic optimization. Based on this, we propose the Angle-Calibrated Moment method (ACMo), a novel stochastic optimization method that enjoys the benefits of second moments while performing only first-moment updates. Theoretical analysis shows that ACMo achieves the same convergence rate as mainstream adaptive methods. Experiments on a variety of CV and NLP tasks demonstrate that ACMo converges comparably to state-of-the-art Adam-type optimizers, and even generalizes better in most cases. The code is available at https://github.com/Xunpeng746/ACMo.
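
The abstract describes the method only at a high level (second-moment benefits from first-moment-only updates). As an illustration, the following is a minimal NumPy sketch of what an angle-calibrated momentum step might look like: the momentum buffer is corrected so that the combined direction keeps a non-obtuse angle with the current gradient. The calibration rule here (projecting out the momentum component that opposes the gradient) is an assumption for illustration, and `acmo_like_step` with all of its parameters is a hypothetical name; this is not the paper's exact update rule. See the linked repository for the reference implementation.

```python
import numpy as np

def acmo_like_step(params, grad, momentum, lr=1e-2, beta=0.9, eps=1e-8):
    """One step of an angle-calibrated momentum update (illustrative sketch only).

    The momentum buffer is rescaled so that its inner product with the current
    gradient stays non-negative, keeping the search direction at an acute angle
    to the gradient. This is an assumed calibration, not the exact ACMo update.
    """
    dot = float(np.dot(momentum, grad))
    if dot < 0.0:
        # Remove the component of the momentum that points against the
        # current gradient (the "angle calibration" assumed here).
        momentum = momentum - (dot / (float(np.dot(grad, grad)) + eps)) * grad
    # Standard first-moment accumulation on the calibrated buffer;
    # no second-moment statistics are stored.
    momentum = beta * momentum + grad
    params = params - lr * momentum
    return params, momentum

# Usage: minimize f(x) = ||x||^2 / 2, whose gradient at x is x itself.
x = np.ones(3)
m = np.zeros(3)
for _ in range(100):
    x, m = acmo_like_step(params=x, grad=x, momentum=m)
print(x)  # converges toward the zero vector
```

Note that, as in the abstract's claim, this sketch keeps only a single first-moment buffer per parameter, so its memory footprint matches momentum SGD rather than Adam-type methods that also track second moments.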

Original language: English
Title of host publication: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Publisher: Association for the Advancement of Artificial Intelligence
Pages: 7857-7864
Number of pages: 8
ISBN (Electronic): 9781713835974
Publication status: Published - 2021
Event: 35th AAAI Conference on Artificial Intelligence, AAAI 2021 - Virtual, Online
Duration: 2 Feb 2021 - 9 Feb 2021

Publication series

Name: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
Volume: 9A

Conference

Conference: 35th AAAI Conference on Artificial Intelligence, AAAI 2021
City: Virtual, Online
Period: 2/02/21 - 9/02/21
