TY - JOUR
T1 - Adversarial attacks on video quality assessment models
AU - Hu, Zongyao
AU - Liu, Lixiong
AU - Sang, Qingbing
AU - Wang, Chongwen
N1 - Publisher Copyright: © 2024 Elsevier B.V.
PY - 2024/6/7
Y1 - 2024/6/7
N2 - Most current video quality assessment (VQA) algorithms achieve excellent performance by using deep neural networks (DNNs). However, DNNs are vulnerable to adversarial attacks, which serve as an efficient surrogate for validating model robustness, and adversarial attack methods against VQA models are lacking. To this end, we propose a spatiotemporal attack network, consisting of a spatial subnetwork and a temporal subnetwork, that generates adversarial examples for evaluating the robustness of VQA models. The proposed network, dubbed the Space-Time Quality Attack Network (STQA-Net), first computes the just noticeable difference (JND) maps of a video sequence as the input of the spatial subnetwork. The spatial subnetwork encodes the computed maps as spatial features and feeds them to the temporal subnetwork. Then, the spatial features are fused with the output of the temporal subnetwork, and the fused features are decoded into attack weight maps. A visual constraint is used to control the visibility of perturbations and to guide the generation of perturbation maps, which are formed by multiplying the JND maps by the attack weight maps. Finally, the generated perturbation maps are added to the original video to form an adversarial example. We also explore a two-branch network design that generates two opposite examples in a targeted attack scenario. The proposed attack methods are thoroughly tested against six state-of-the-art VQA algorithms on three VQA databases. The experimental results show that the proposed attack methods are effective for testing the robustness of VQA models.
AB - Most current video quality assessment (VQA) algorithms achieve excellent performance by using deep neural networks (DNNs). However, DNNs are vulnerable to adversarial attacks, which serve as an efficient surrogate for validating model robustness, and adversarial attack methods against VQA models are lacking. To this end, we propose a spatiotemporal attack network, consisting of a spatial subnetwork and a temporal subnetwork, that generates adversarial examples for evaluating the robustness of VQA models. The proposed network, dubbed the Space-Time Quality Attack Network (STQA-Net), first computes the just noticeable difference (JND) maps of a video sequence as the input of the spatial subnetwork. The spatial subnetwork encodes the computed maps as spatial features and feeds them to the temporal subnetwork. Then, the spatial features are fused with the output of the temporal subnetwork, and the fused features are decoded into attack weight maps. A visual constraint is used to control the visibility of perturbations and to guide the generation of perturbation maps, which are formed by multiplying the JND maps by the attack weight maps. Finally, the generated perturbation maps are added to the original video to form an adversarial example. We also explore a two-branch network design that generates two opposite examples in a targeted attack scenario. The proposed attack methods are thoroughly tested against six state-of-the-art VQA algorithms on three VQA databases. The experimental results show that the proposed attack methods are effective for testing the robustness of VQA models.
KW - Adversarial attack
KW - Spatiotemporal attack
KW - Video quality assessment
KW - Visual constraint
UR - http://www.scopus.com/inward/record.url?scp=85188803104&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2024.111655
DO - 10.1016/j.knosys.2024.111655
M3 - Article
AN - SCOPUS:85188803104
SN - 0950-7051
VL - 293
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 111655
ER -