Abstract
Large Language Models (LLMs) have become integral in advancing content understanding and generation, leading to the proliferation of Embedding as a Service (EaaS) within cloud computing platforms. EaaS leverages LLMs to offer scalable, on-demand linguistic embeddings, enhancing various cloud-based applications. However, the proprietary nature of EaaS makes it a target for model extraction attacks, where the timing of such infringements often remains elusive. This paper introduces TimeMarker, a novel framework that enhances temporal traceability in cloud computing environments by embedding distinct watermarks at different sub-periods, marking the first attempt to identify the timing of model extraction attacks. TimeMarker employs an adaptive watermark strength method based on information entropy and frequency domain transformations to refine the detection accuracy of model extraction attacks within cloud infrastructures. The granularity of time frame identification for theft improves as more sub-periods are used. Furthermore, our approach investigates single sub-period theft and extends to multi-sub-period theft scenarios where attackers steal data across many sub-periods to train their models in cloud settings. Validated across five widely used datasets, TimeMarker is capable of detecting model extraction over various sub-periods and assessing its impact on the accuracy and robustness of large models deployed in the cloud. The results demonstrate that TimeMarker effectively identifies different periods of extraction attacks, enhancing EaaS security within cloud computing and extending traditional watermarking to offer copyright protection for LLMs.
| Original language | English |
|---|---|
| Article number | 103983 |
| Journal | Computer Standards and Interfaces |
| Volume | 94 |
| DOIs | |
| Publication status | Published - Aug 2025 |
Keywords
- Cloud computing security
- Copyright protection
- Large language model
- Watermark technology
Fingerprint
Dive into the research topics of 'Periodic watermarking for copyright protection of large language models in cloud computing security'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver