Adaptive Period Control for Communication Efficient and Fast Convergent Federated Learning

Jude Tchaye-Kondi, Yanlong Zhai*, Jun Shen, Akbar Telikani, Liehuang Zhu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Federated learning is particularly challenging in IoT environments, where edge and cloud nodes differ widely in computation capacity and network bandwidth. The main scalability barrier in machine learning frameworks based on distributed stochastic gradient descent is the communication overhead of frequent model parameter exchanges between workers and the central server. One way to reduce this overhead is periodic averaging with a constant period, in which workers send model parameters to the server only after several iterations of local updates. However, investigations have shown that the optimal communication period for balancing communication and convergence is not constant. Although some studies have explored the effectiveness of federated learning with a constant period, dynamically adjusting the period for optimal convergence remains under-explored. To address this, we investigate the impact of the period on global model convergence and propose an adaptive period control mechanism (AdaPC). This mechanism adaptively adjusts the aggregation period of the federated learning framework to achieve fast convergence with minimal communication. Our theoretical and empirical findings show that the proposed solution achieves faster convergence, lower final training loss, and lower communication overhead than the constant period averaging strategy and other existing solutions.
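The constant-period averaging baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the AdaPC mechanism, whose period-update rule is not given here. The function name `periodic_averaging`, the quadratic per-worker loss, and all parameter names are assumptions made only for illustration.

```python
def periodic_averaging(targets, period, total_steps, lr=0.1, w0=0.0):
    """Toy periodic averaging with a constant period (illustrative only).

    Assumption: worker i minimizes f_i(w) = 0.5 * (w - targets[i]) ** 2,
    so its gradient is (w - targets[i]). Workers take plain gradient steps
    locally and synchronize with the server every `period` steps.
    Returns the final global model and the number of communication rounds.
    """
    local = [w0] * len(targets)   # one scalar model per worker
    global_w = w0
    rounds = 0
    for step in range(1, total_steps + 1):
        # Local update on every worker, no communication.
        local = [w - lr * (w - c) for w, c in zip(local, targets)]
        if step % period == 0:
            # Communication round: the server averages the worker models
            # and broadcasts the result back to all workers.
            global_w = sum(local) / len(local)
            local = [global_w] * len(local)
            rounds += 1
    return global_w, rounds


# Same number of local steps, different communication periods.
w_every_step, rounds_every_step = periodic_averaging([1.0, 3.0, 5.0], period=1, total_steps=200)
w_every_five, rounds_every_five = periodic_averaging([1.0, 3.0, 5.0], period=5, total_steps=200)
```

In this toy setting a longer period cuts the number of communication rounds in proportion, while both runs approach the same minimizer (the mean of the targets). On realistic, heterogeneous objectives a longer period generally costs convergence accuracy, which is the trade-off an adaptive period is meant to manage.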

Original language: English
Pages (from-to): 12572-12586
Number of pages: 15
Journal: IEEE Transactions on Mobile Computing
Volume: 23
Issue number: 12
Publication status: Published - 2024

Keywords

  • Adaptive communication
  • Internet of Things
  • distributed SGD
  • edge AI
  • federated learning
  • sparse averaging
