TY - JOUR
T1 - FedVisual
T2 - Heterogeneity-Aware Model Aggregation for Federated Learning in Visual-Based Vehicular Crowdsensing
AU - Zhang, Wenjun
AU - Liu, Xiaoli
AU - Zhang, Ruoyi
AU - Zhu, Chao
AU - Tarkoma, Sasu
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - With the advancement of assisted and autonomous driving technologies, vehicles are being outfitted with an ever-increasing number of sensors. Among these, visible light sensors, or dash-cameras, produce visual data rich in information. Analyzing this visual data through crowdsensing allows for low-cost and timely perception of urban road conditions, such as identifying dangerous driving behaviors and locating parking spaces. However, uploading such massive visual data to the cloud for centralized processing can lead to significant bandwidth challenges and also raise privacy concerns among vehicle owners. Federated learning (FL), in which vehicles serve as both data generators and computing nodes, presents a promising solution to these challenges. Nevertheless, urban roads are complex, and vehicles in different locations encounter completely different scenes, resulting in non-independently and identically distributed (non-i.i.d.) data. Additionally, the diversity in dash-camera and onboard computation resources may lead to differences in the performance of locally trained models. Indiscriminately aggregating local models from all vehicles can degrade the global model's performance. To overcome these challenges, we introduce FedVisual, a model aggregation approach for FL in vehicular visual crowdsensing. FedVisual leverages a deep Q-network (DQN) to select appropriate local models, accounting for heterogeneity in both visual data contents and vehicle specifications. By leveraging historical training experience, an effective model selection strategy can be obtained without complex mathematical modeling. Through extensive simulations on our self-collected driving videos, FedVisual reduces model aggregation latency by up to 3.8% while improving the model's performance by up to 3.2% compared to reference works.
AB - With the advancement of assisted and autonomous driving technologies, vehicles are being outfitted with an ever-increasing number of sensors. Among these, visible light sensors, or dash-cameras, produce visual data rich in information. Analyzing this visual data through crowdsensing allows for low-cost and timely perception of urban road conditions, such as identifying dangerous driving behaviors and locating parking spaces. However, uploading such massive visual data to the cloud for centralized processing can lead to significant bandwidth challenges and also raise privacy concerns among vehicle owners. Federated learning (FL), in which vehicles serve as both data generators and computing nodes, presents a promising solution to these challenges. Nevertheless, urban roads are complex, and vehicles in different locations encounter completely different scenes, resulting in non-independently and identically distributed (non-i.i.d.) data. Additionally, the diversity in dash-camera and onboard computation resources may lead to differences in the performance of locally trained models. Indiscriminately aggregating local models from all vehicles can degrade the global model's performance. To overcome these challenges, we introduce FedVisual, a model aggregation approach for FL in vehicular visual crowdsensing. FedVisual leverages a deep Q-network (DQN) to select appropriate local models, accounting for heterogeneity in both visual data contents and vehicle specifications. By leveraging historical training experience, an effective model selection strategy can be obtained without complex mathematical modeling. Through extensive simulations on our self-collected driving videos, FedVisual reduces model aggregation latency by up to 3.8% while improving the model's performance by up to 3.2% compared to reference works.
KW - Autonomous Internet of Things (IoT) systems
KW - deep Q-network (DQN)
KW - federated learning (FL)
UR - http://www.scopus.com/inward/record.url?scp=85204111822&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2024.3456751
DO - 10.1109/JIOT.2024.3456751
M3 - Article
AN - SCOPUS:85204111822
SN - 2327-4662
VL - 11
SP - 36191
EP - 36202
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 22
ER -