云边协同大模型块粒度重训方法

Translated title of the contribution: Cloud-Edge Collaborative Retraining of Foundation Models at the Block Granularity

Qing Long Zhang, Rui Han*, Chi Liu

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Foundation models deployed in dynamic edge environments encounter continuously evolving input data distributions and must be retrained to maintain high accuracy. However, existing retraining techniques can only train fixed, compressed models within the constraints of device resources and retraining windows, which considerably lowers accuracy because of these small models’ limited generalization ability. To address this issue, this paper proposes BlockTrainer, an edge-cloud collaborative retraining approach for foundation models at the block granularity. BlockTrainer first introduces a model retraining scaling law to evaluate the accuracy contributions of different blocks in a foundation model according to its latest input data at the edge. Based on this evaluation, it generates the optimal retraining solution under resource constraints and dynamically converts the most accuracy-relevant parts of the model into retrainable small models at the edge, thereby constructing a collaborative training system between large and small models. Comparative experiments on real edge-cloud platforms show that BlockTrainer improves the retraining accuracy of foundation models by 81.24% at the same resource consumption and supports retraining models of up to 33 billion parameters.
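The step of choosing which blocks to retrain under a resource budget can be pictured as a small selection problem. The sketch below is only an illustration and is not taken from the paper: it assumes each block already has an estimated accuracy gain and an integer retraining cost, and it substitutes a standard 0/1 knapsack dynamic program for whatever optimization BlockTrainer actually uses. The names `Block`, `select_blocks`, `gain`, and `cost`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    gain: float  # hypothetical estimate of the block's accuracy contribution
    cost: int    # hypothetical retraining cost in integer units (e.g. memory)


def select_blocks(blocks, budget):
    """Pick the subset of blocks with maximum total gain and total cost <= budget.

    This is a plain 0/1 knapsack dynamic program, used here as a stand-in
    for the paper's (unspecified) optimization step.
    """
    # best[c] holds (total_gain, chosen_names) achievable with cost at most c.
    best = [(0.0, ())] * (budget + 1)
    for block in blocks:
        # Iterate costs downward so each block is used at most once.
        for c in range(budget, block.cost - 1, -1):
            prev_gain, prev_names = best[c - block.cost]
            candidate = prev_gain + block.gain
            if candidate > best[c][0]:
                best[c] = (candidate, prev_names + (block.name,))
    return best[budget]


# Made-up example values, purely for illustration.
blocks = [
    Block("b0", 0.10, 4),
    Block("b1", 0.30, 5),
    Block("b2", 0.25, 3),
    Block("b3", 0.05, 2),
]
gain, chosen = select_blocks(blocks, budget=8)
print(chosen, round(gain, 2))
```

With these made-up numbers, the budget of 8 units admits blocks `b1` and `b2` together, which no other feasible subset beats. A real system would derive the gains from the retraining scaling law described above and the costs from measured device resources.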

Original language: Chinese (Traditional)
Pages (from-to): 287-300
Number of pages: 14
Journal: Tien Tzu Hsueh Pao/Acta Electronica Sinica
Volume: 53
Issue number: 2
Publication status: Published - 25 Feb 2025
Externally published: Yes
