TY - JOUR
T1 - Self-enhancing defense for protecting against model stealing attacks on deep learning systems
AU - Zhang, Chenlong
AU - Luo, Senlin
AU - Li, Jiawei
AU - Pan, Limin
AU - Lu, Chuan
N1 - Publisher Copyright:
© 2025 Elsevier Ltd
PY - 2025/4/15
Y1 - 2025/4/15
N2 - Defending against model stealing (MS) is crucial for safeguarding intellectual property and the security of deep learning applications. Current countermeasures, however, have notable shortcomings. First, defense strategies that rely on distribution-based classification often fail to identify attack samples that are semantically and visually similar to benign data, which reduces their effectiveness. Second, how to leverage query samples of unknown origin to strengthen a defense during deployment remains an unresolved yet critical issue. This paper presents SED (Self-Enhancing Model Stealing Defense Method), a novel defense against model stealing. SED incorporates a deep hashing model and introduces a Penalty-Weighted Hamming (PWH) distance for sample segmentation, overcoming the drawbacks of traditional distribution-based classification. SED then applies dynamic temperature scaling and label flipping to carry out the defense. Moreover, SED maintains an archive of historical query samples and uses a greedy algorithm to build a database of malicious samples, thereby strengthening the defense against future queries similar to those catalogued. Experimental results confirm that SED substantially reduces the accuracy of attackers' substitute models and effectively exploits historical data for self-enhancement.
AB - Defending against model stealing (MS) is crucial for safeguarding intellectual property and the security of deep learning applications. Current countermeasures, however, have notable shortcomings. First, defense strategies that rely on distribution-based classification often fail to identify attack samples that are semantically and visually similar to benign data, which reduces their effectiveness. Second, how to leverage query samples of unknown origin to strengthen a defense during deployment remains an unresolved yet critical issue. This paper presents SED (Self-Enhancing Model Stealing Defense Method), a novel defense against model stealing. SED incorporates a deep hashing model and introduces a Penalty-Weighted Hamming (PWH) distance for sample segmentation, overcoming the drawbacks of traditional distribution-based classification. SED then applies dynamic temperature scaling and label flipping to carry out the defense. Moreover, SED maintains an archive of historical query samples and uses a greedy algorithm to build a database of malicious samples, thereby strengthening the defense against future queries similar to those catalogued. Experimental results confirm that SED substantially reduces the accuracy of attackers' substitute models and effectively exploits historical data for self-enhancement.
KW - Deep hash
KW - Model stealing attack
KW - Model stealing defense
KW - Security and privacy
KW - Self-enhancing method
UR - http://www.scopus.com/inward/record.url?scp=85214833111&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2025.126438
DO - 10.1016/j.eswa.2025.126438
M3 - Article
AN - SCOPUS:85214833111
SN - 0957-4174
VL - 269
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 126438
ER -