TY - GEN
T1 - BDVFL
T2 - 23rd IEEE International Conference on Data Mining, ICDM 2023
AU - Wang, Shuo
AU - Gai, Keke
AU - Yu, Jing
AU - Zhu, Liehuang
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Vertical Federated Learning (VFL) effectively addresses the issue of data isolation, which makes data mining secure. Most VFL implementations rely on a single server or third party for training, so training terminates if that server or third party fails. In addition, the accuracy of a model trained by VFL depends on the quality of each client's local features; nevertheless, that quality is difficult to verify. The features owned by a client may be irrelevant to the model, or the intermediate results submitted by a client may be inaccurate, and in either case the model's accuracy will be seriously affected. To solve the single point of failure and model accuracy issues in VFL, this paper first proposes a Blockchain-based Decentralized VFL (BDVFL) training model. With the integration of blockchain and the VFL training process, the nodes within the blockchain are categorized into non-training and training nodes. Our method focuses on the scenario in which all training nodes possess labeled data and actively engage in the VFL training procedure. Specifically, first, each client uses its local features and initial model to carry out forward activation and generate intermediate results. Second, a training node is chosen at random and combined with the intermediate results from all clients to formulate the loss function. Finally, each client updates its local model using the gradient. To protect the raw features, a blinding factor is used to safeguard the intermediate results submitted by each client, so that the training nodes cannot infer the local features from the intermediate results. To mitigate the interference of irrelevant training outcomes from clients on the model's accuracy, we propose a verifiable aggregation method to assess the validity of the intermediate results submitted by the clients. We have conducted both theoretical and experimental analyses, and the results demonstrate the effectiveness of the proposed method.
KW - Blinding factor
KW - Blockchain
KW - Decentralized
KW - Verifiable aggregation
KW - Vertical federated learning
UR - http://www.scopus.com/inward/record.url?scp=85185396477&partnerID=8YFLogxK
U2 - 10.1109/ICDM58522.2023.00072
DO - 10.1109/ICDM58522.2023.00072
M3 - Conference contribution
AN - SCOPUS:85185396477
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 628
EP - 637
BT - Proceedings - 23rd IEEE International Conference on Data Mining, ICDM 2023
A2 - Chen, Guihai
A2 - Khan, Latifur
A2 - Gao, Xiaofeng
A2 - Qiu, Meikang
A2 - Pedrycz, Witold
A2 - Wu, Xindong
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 December 2023 through 4 December 2023
ER -