Deep Model Fusion: A Survey

  • Weishi Li
  • Yong Peng
  • Miao Zhang
  • Liang Ding
  • Han Hu
  • Li Shen*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Deep model fusion (or merging) is an emerging technique that integrates the parameters or predictions of multiple deep learning (DL) models into a unified framework. It combines the abilities of different models to compensate for the biases and errors of any individual model, improving overall performance. However, deep model fusion, especially on large-scale DL models such as large language models (LLMs) and foundation models, faces several challenges, including high computational cost and interference between heterogeneous models. To better understand the field, we present a comprehensive survey of recent progress. We group existing model fusion methods into four categories: 1) 'weight average' (WA) averages the parameters of multiple models to obtain results closer to the optimal solution; 2) because direct averaging often yields suboptimal results, 'mode connectivity' connects networks via paths of nonincreasing loss in weight space before fusion; along these paths, the initial models are transformed into forms with consistent functions that fuse better; 3) similarly, for models that fuse poorly when merged directly, 'alignment' matches corresponding units across models before merging them, thus fully exploiting the correspondences between the models; and 4) in addition to these parameter-fusion methods, 'ensemble learning' fuses the outputs of multiple models at inference time to improve the accuracy and robustness of networks. Finally, we analyze the challenges of deep model fusion and outline possible directions for future research.
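
As a rough illustration of the difference between parameter fusion (category 1) and output fusion (category 4), the sketch below averages the weights of three toy models and, separately, averages their predictions. It is a minimal example under stated assumptions, not a method from the survey: the models are hypothetical linear softmax classifiers stored as NumPy dictionaries, and the names `weight_average`, `predict`, and `ensemble_predict` are invented for this example.

```python
import numpy as np


def weight_average(models):
    """Parameter fusion: average each parameter tensor across models.

    Assumes every model has the same keys and the same tensor shapes.
    """
    keys = models[0].keys()
    return {k: np.mean([m[k] for m in models], axis=0) for k in keys}


def predict(model, x):
    """Toy linear classifier (an assumption): softmax(x @ W + b)."""
    logits = x @ model["W"] + model["b"]
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=-1, keepdims=True)


def ensemble_predict(models, x):
    """Output fusion: average the predicted probabilities at inference time."""
    return np.mean([predict(m, x) for m in models], axis=0)


# Three hypothetical models with random parameters (4 features, 3 classes).
rng = np.random.default_rng(0)
models = [
    {"W": rng.normal(size=(4, 3)), "b": rng.normal(size=3)} for _ in range(3)
]
x = rng.normal(size=(5, 4))  # batch of 5 inputs

fused = weight_average(models)          # one model, one forward pass
wa_probs = predict(fused, x)
ens_probs = ensemble_predict(models, x)  # three forward passes, then average
```

Weight averaging produces a single model that costs one forward pass at inference, while the ensemble keeps all models and pays for each of them. The two results generally differ for nonlinear models, which is one reason the survey treats mode connectivity and alignment as preparatory steps for parameter fusion.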

Original language: English
Journal: IEEE Transactions on Neural Networks and Learning Systems
Publication status: Accepted/In press - 2025

Keywords

  • Deep learning (DL)
  • federated learning (FL)
  • large language models (LLMs)
  • model aggregation
  • model fusion
  • survey
