TY - GEN
T1 - Fractal Augmented Pre-training and Gaussian Virtual Feature Calibration for Tackling Data Heterogeneity in Federated Learning
AU - Zheng, Yan
AU - Zhai, Yanlong
AU - Liu, Yanglin
AU - Li, You
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Federated learning (FL) enables collaborative model training across multiple clients while preserving privacy. In practice, heterogeneous and imbalanced data distributions significantly degrade model performance. Although prior work has addressed this issue by adding regularization terms, employing specific server aggregation strategies, and using deep generative models to augment the training data, efficient approaches are still lacking that derive intrinsic representations of the local data to improve the global model without compromising client privacy. Through careful observation and analysis, we found that pre-training and calibrating the global model with virtual data and virtual features generated from the client data distribution can improve model generalization. In this work, we propose Virtual Data Augmented Federated Learning (FedVDA) to resolve this problem. Specifically, FedVDA combines unsupervised pre-training on Augmented Fractal (AF) virtual images with Gaussian Mixture Model (GMM) virtual feature calibration. By integrating color tone transformations into the fractal-generated virtual data, we narrow the gap between the virtual and client data distributions. Multi-modal feature modeling using per-client variances allows the server to efficiently calibrate the classifier with balanced sampled virtual features, reducing both computational and communication overhead. Compared to other data augmentation methods, our method calibrates model features directly, significantly improving performance in scenarios with data heterogeneity and imbalance while minimizing additional computational and communication costs. Our experiments demonstrate that FedVDA outperforms existing federated learning methods and integrates seamlessly with other algorithms.
AB - Federated learning (FL) enables collaborative model training across multiple clients while preserving privacy. In practice, heterogeneous and imbalanced data distributions significantly degrade model performance. Although prior work has addressed this issue by adding regularization terms, employing specific server aggregation strategies, and using deep generative models to augment the training data, efficient approaches are still lacking that derive intrinsic representations of the local data to improve the global model without compromising client privacy. Through careful observation and analysis, we found that pre-training and calibrating the global model with virtual data and virtual features generated from the client data distribution can improve model generalization. In this work, we propose Virtual Data Augmented Federated Learning (FedVDA) to resolve this problem. Specifically, FedVDA combines unsupervised pre-training on Augmented Fractal (AF) virtual images with Gaussian Mixture Model (GMM) virtual feature calibration. By integrating color tone transformations into the fractal-generated virtual data, we narrow the gap between the virtual and client data distributions. Multi-modal feature modeling using per-client variances allows the server to efficiently calibrate the classifier with balanced sampled virtual features, reducing both computational and communication overhead. Compared to other data augmentation methods, our method calibrates model features directly, significantly improving performance in scenarios with data heterogeneity and imbalance while minimizing additional computational and communication costs. Our experiments demonstrate that FedVDA outperforms existing federated learning methods and integrates seamlessly with other algorithms.
KW - Data Heterogeneity
KW - Federated Learning
KW - Gaussian Mixture Model
KW - Virtual Data Augmentation
KW - Virtual Feature Model Calibration
UR - http://www.scopus.com/inward/record.url?scp=85204999912&partnerID=8YFLogxK
U2 - 10.1109/IJCNN60899.2024.10650808
DO - 10.1109/IJCNN60899.2024.10650808
M3 - Conference contribution
AN - SCOPUS:85204999912
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2024 International Joint Conference on Neural Networks, IJCNN 2024 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 International Joint Conference on Neural Networks, IJCNN 2024
Y2 - 30 June 2024 through 5 July 2024
ER -